Midwest Music and Audio Day 2019

Indiana University, Bloomington, IN, June 27, 2019


The Midwest Music and Audio Day is an informal workshop where Midwest researchers and industry folks interested in audio, musical or otherwise, get together to present, discuss, listen, and enjoy the cross-pollination that comes from such meetings. The whole event is confined to a single day to make it easier for people to participate. We are low-key and friendly, as Midwest folks are supposed to be. We try to give everyone who wants to present a forum to do so, though presenting is not required. Registration is free but required through this website.


Photos from the event are available here.

When and Where


June 27, 2019

Workshop Venue (Luddy Hall)

Dorsey Learning Hall (Room 1106), Luddy Hall

700 N. Woodlawn Ave., Bloomington, IN 47408


  1. Please indicate that you need a parking pass during registration
  2. You can park at one of the "Visitor" spaces on the 2nd floor in the Fee Lane Parking Garage
  3. Luddy Hall is within walking distance of the garage (less than 5 minutes)
  4. You'll get a pass from the organizers and use it to get out of the garage for free
  5. Call Minje if you're lost! (217-419-8372)


There are many lodging options in Bloomington.


  • If you are flying in, the closest airport is Indianapolis International Airport (IND). From there, you'll need to rent a car to drive down to the Bloomington campus (less than an hour if traffic is light), or take a shuttle (reservation site, $23 one-way). Uber or Lyft work, too, but the fare could be unexpectedly high during rush hour.

Attending Organizations

Northwestern University

Carnegie Mellon University

University of Rochester

Indiana University

Swarthmore College




Keynote Talk

"Toward Human-Computer Collaborative Music Making"


Music performance is central to all music activities, and it is often a highly collaborative effort among multiple musicians: they harmonize their pitch, coordinate their timing, and reinforce their expressiveness to make music that strikes the hearts of the audience. Can humans collaborate with machines to play music together? Can this collaboration be as natural as that among human musicians themselves? This is a dream of many musicians and researchers, and it is a dream that we aim to fulfill. In this talk, I will first review existing work on human-computer collaborative music making over the past decades, including automatic music accompaniment systems and music dialogue systems. I will then argue that the key to natural collaboration between humans and machines is to empower machines with three core musicianship skills: perception, performance, and theory/composition, where the perception and performance skills need to be audiovisual. I will present our recent work on audiovisual music processing and ongoing work on music generation and human-computer interactive improvisation. Finally, I will outline future directions for integrating these components toward human-computer collaborative music making.


Zhiyao Duan is an assistant professor in Electrical and Computer Engineering, Computer Science, and Data Science at the University of Rochester. He received his B.S. in Automation and M.S. in Control Science and Engineering from Tsinghua University, China, in 2004 and 2008, respectively, and received his Ph.D. in Computer Science from Northwestern University in 2013. His research interest is in the broad area of computer audition, i.e., designing computational systems that are capable of understanding sounds, including music, speech, and environmental sounds. He is also interested in the connections between computer audition and computer vision, natural language processing, and augmented and virtual reality. He co-presented a tutorial on Automatic Music Transcription at ISMIR 2015. He received a best paper award at the 2017 Sound and Music Computing (SMC) conference, a best paper nomination at the 2017 International Society for Music Information Retrieval (ISMIR) conference, and a CAREER award from the National Science Foundation (NSF). His research is funded by NSF, NIH, and University of Rochester internal awards on AR/VR and health analytics. He has served as a session chair and PC member for international conferences such as ISMIR, ICASSP, and ACM Multimedia, and as a regular reviewer for international journals such as IEEE TASLP, TMM, TIP, TKDE, and THMS. He is a member of the IEEE.

Call For Participation

We invite submissions (talks, posters, demos, and artworks) across a variety of categories for research related to music and audio. If you want to present your work, please submit an abstract describing it on the registration page. The abstract should be less than 250 words.

Research areas of particular interest include (but are not limited to):

  • Music information retrieval
  • Music recommendation
  • Instrumentation identification
  • AI for music generation
  • Speech modeling and synthesis
  • Auditory modeling
  • Environmental sound recognition
  • Audio fingerprinting
  • Song lyrics transcription
  • Voice morphing
  • Audio source separation
  • 3D audio
  • Auditory display and sonification
  • Sound visualization
  • Language detection
  • Speech recognition
  • Speech generation
  • Interface design for music or audio application
  • Digital musical instrument design
  • Controllers and interfaces for musical expression
  • Interactive sound art and installations

Free Registration

Please submit your registration form and abstract here.

Important Dates

  • Abstract submission/registration deadline: June 15, 2019
  • MMAD - June 27, 2019


Opening Remarks 9:00-9:10

Morning Session (Music Composition and Tools) 9:10-12:10

    • 9:10-9:30 Roger Dannenberg: Using Structure in Automatic Music Composition
    • 9:30-9:50 Yujia Yan: What's expected for music generated from a model?
    • 9:50-10:10 Sam Goree: Exploring virtual keyboard design
    • 10:10-10:30 Christodoulos Benetatos: BachDuet: A Human-Machine Duet Improvisation System

Coffee break: 10:30-10:50

    • 10:50-11:10 Yucong Jiang: Score Following of Piano by Modeling Latent Timbral Parameters
    • 11:10-12:10 Keynote Talk—Toward Human-Computer Collaborative Music Making, Zhiyao Duan

Lunch 12:10 - 1:40

Early Afternoon Session 1:40 - 3:00

    • 1:40-2:00 Don Byrd: Music Information Research (not just Retrieval): A Byrd's Eye View
    • 2:00-2:20 Jon Dunn: The Audio-Visual Metadata Platform
    • 2:20-2:40 Kahyun Choi: A Trend Analysis on Concreteness of Popular Song Lyrics
    • 2:40-3:00 Minje Kim: Deep Autotuner

Coffee break: 3:00-3:20

After Coffee Session (Audio Processing) 3:20-5:00

    • 3:20-3:40 Donald Williamson: The UCAN method for automatic assessment of speech quality
    • 3:40-4:00 Ethan Manilow: Libraries and Datasets to Power the Next Generation of Source Separation Research
    • 4:00-4:20 Yi Shen: Listener Preference on the Local Criterion for Ideal Binary-Masked Speech
    • 4:20-4:40 Khandokar Md. Nayem: Incorporating Intra-spectral Dependencies With A Recurrent Output Layer For Improved Speech Enhancement
    • 4:40-5:00 Bongjun Kim: A Human-in-the-loop system for labeling sound events in audio recordings

Closing Remarks 5:00-5:05

Hosted by School of Informatics, Computing, and Engineering at Indiana University


Christopher Raphael

Donald Williamson

Minje Kim

Kahyun Choi

(If you have any questions, contact Minje: minje@indiana.edu or 217-419-8372)

Organized by

School of Informatics, Computing, and Engineering at Indiana University

Sponsored by

Department of Intelligent Systems Engineering