Instructor: Juan Pablo Bello
Office hours (online, book here): Tuesdays 9:30-11am, 15-minute slots
Email: jpbello@nyu.edu
Office: 370 Jay St. Floor 3, 323
Course Assistant: Julia Wilkins
Office hours: Wednesdays 10:30-11:30am, in person
370 Jay St. Floor 3 Room 324
Email: jw3596@nyu.edu
Course Information
Course number and section: CS-GY 6933 I, ECE-GY 9173 I
Title: Machine Listening
Lectures: Tuesdays 6-8:30pm. 6 MetroTech Center, Room 775
Credits: 3
Recommended skills: prior knowledge of linear algebra, basics of machine learning, programming in Python
Desirable but not required: basics of signal processing and acoustics
Prerequisite courses: CS-GY 6923 Machine Learning or equivalent
Course overview: This course provides a comprehensive introduction to the field of machine listening, i.e., the development of computational methods for making sense of the world through sound. It combines knowledge and methods from machine learning, signal processing and acoustics towards the automatic detection, classification, localization, separation and retrieval of everyday sounds. Through lectures and assignments you will learn about (a) audio signal processing and machine learning fundamentals; (b) classical and deep ML approaches to sound event classification and detection; (c) spatial audio analysis and sound event localization; and (d) audio representation learning.
Goals: Students will acquire fundamental knowledge of sound, audio signal processing, and advanced applied machine learning techniques for audio analysis. They will read and discuss the literature, learn to define and evaluate tasks, use open data and code benchmarks, and gain hands-on experience with the implementation and application of various techniques via assignments and a final project.
Download the syllabus here.
Course Requirements
The grade for this course will be determined according to the following:
40% Assignments
4 Assignments (each includes a code lab + in-class quiz)
10% each
Announced in class (Tues) → due the following Thursday at midnight EST
You are allowed two one-day extensions to use during the semester
30% Midterm exam
Pen and paper, no laptop/phone. You’re allowed a 1-page cheat sheet with notes.
Choose 3 out of 4 essay questions (10% each); the exam will include pseudo-code
30% Project
Groups of 3
Choose 1 Task from the last 3 years of the DCASE Challenge
10% Project proposal presentation: detailed review of the task (definition, data, metrics) and at least 2 different “state of the art” approaches from the literature in great technical detail (submit slides)
20% Final project presentations: implement and evaluate one of the reviewed approaches from scratch, including a new variation and an ablation study (submit code, report and slides)
Extra credit – Class participation: attendance (on time), questions, discussions (in-person and on Ed), general interest
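As a rough illustration of how the weights above combine, here is a minimal Python sketch. It assumes every component is scored on a 0-100 scale; the function name `course_grade` and the example scores are hypothetical. It does not model the class participation extra credit or the quiz pass requirement, since the syllabus does not state how those are computed.

```python
# Percentage weights taken from the Course Requirements list above.
ASSIGNMENT_WEIGHT = 10      # each of the 4 assignments
MIDTERM_WEIGHT = 30
PROPOSAL_WEIGHT = 10        # project proposal presentation
FINAL_PROJECT_WEIGHT = 20   # final project presentation


def course_grade(assignments, midterm, proposal, final_project):
    """Return the weighted course grade on a 0-100 scale.

    All inputs are assumed to be scores on a 0-100 scale (an assumption
    made for this sketch, not something the syllabus specifies).
    """
    if len(assignments) != 4:
        raise ValueError("expected exactly 4 assignment scores")
    total = (
        ASSIGNMENT_WEIGHT * sum(assignments)
        + MIDTERM_WEIGHT * midterm
        + PROPOSAL_WEIGHT * proposal
        + FINAL_PROJECT_WEIGHT * final_project
    )
    return total / 100


# Hypothetical scores, for illustration only.
example = course_grade([90, 80, 100, 70], midterm=85, proposal=90, final_project=80)
print(example)  # 84.5
```

The weights are kept as integer percentages so that they visibly sum to 100 and the arithmetic stays exact for whole-number scores.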
Additional Guidance on Assignments:
Language: Python >= 3.9
For assignments 1 and 2 you will work in local Jupyter Notebooks, and for assignments 3 and 4 you will use Google Colab to help with computational resources.
All assignments will be released as Jupyter Notebooks, available for download from the course GitHub: https://github.com/juliawilkins/nyu-csgy6933-ML26 (for Colab, you will need to upload the .ipynb).
A Jupyter Notebook startup guide is available in the course repository.
Assignments will have a tutorial section and a core problem section. The tutorial section will be used to guide you through more Python-based concepts (e.g. loading audio in Python), and the problem section will be the bulk of the assignment where you will be solving different problems related to concepts discussed in class.
Submission process: Complete code → evaluate the notebook → submit evaluated .ipynb file in Brightspace.
Each assignment will specify whether to use built-in functions or write things from scratch.
The submission of the code and results will be followed by an in-class quiz to assess your understanding of the assignment. You need to pass the quiz to pass the assignment.
Note: Please don’t rely on ChatGPT/Gemini etc. to write your code! You will learn by doing the work, and failure to do it yourself will affect your performance in the quizzes and mid-term exam.
⚠️ For Assignments 3-4, you may use Google Colab for compute purposes. It’s no secret that Colab has many built-in AI features (e.g. autocomplete and Gemini). Please turn these off via Settings -> AI Assistance. You will need to understand the code and concepts deeply to succeed in the in-class quizzes and exams, so I highly recommend not relying on these features to complete the assignments. ⚠️
Resources and Course Communication
Textbook: no textbook required. The course uses structure and content from:
Virtanen, T., Plumbley, M., and Ellis, D. “Computational Analysis of Sound Scenes and Events”. Springer (2018)
In Bobst online: https://search.library.nyu.edu/permalink/01NYU_INST/1d6v258/alma9998754449407871
Other suggested readings and references will be provided throughout the semester
Brightspace: for announcements, grades and course materials -- https://brightspace.nyu.edu/d2l/home/544820
GitHub to get assignment code: https://github.com/juliawilkins/nyu-csgy6933-ML26
EdStem Discussion platform for course questions on material/assignments
Join from Brightspace (Content → Course Tools) or here: https://edstem.org/us/courses/93879/discussion
Code of Conduct
General principles:
Be nice, respectful and tolerant
If you need special assistance, it is my job to accommodate you, not your job to figure it out. Please let me know asap
If issues out of your control arise, please communicate them in a timely fashion
If you are uncomfortable, please speak up. You will make us better
We are all people: fallible, slow to respond, with a life outside the classroom. Please be patient.
Attendance and Tardiness: all students are expected to be in class on time. Delays and unjustified absences will lower your class participation grade.
Late Assignments: You are allowed two one-day extensions to use during the semester, on any two assignments. No other late work will be accepted.
Don’t cheat (you will be cheating yourself): we have a zero-tolerance policy on cheating. Coursework evaluations will be set to 0 for any copied or (nearly) identical submissions, or for any submitted work suspected not to be your own.
Generative AI Use: You are discouraged from using chat bots or any other generative AI system to write the code or written parts of your assignments and projects (this includes auto-complete and built-in AI via Gemini for the assignments using Colab notebooks). Much of the learning in this course is achieved through coursework, and use of generative AI will impede your learning and affect other parts of the course. Furthermore, we will use in-class evaluations such as quizzes on the assignments, the mid-term exam, and Q&As after presentations to assess your in-depth understanding of the materials. You will also be expected to answer detailed questions about your project work during the proposal and final presentations.