OBJECTIVE:
Our objective for this project is to develop a software algorithm capable of automatically transcribing musical audio recordings into sheet music or MIDI sequences. This will require a multidisciplinary approach, integrating principles from digital signal processing and music theory. The primary goal is to accurately analyze and notate the pitch, timing, and dynamics of musical performances, enabling the creation of detailed and precise musical scores from audio recordings.
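As a rough illustration of what notating pitch involves, the sketch below maps a detected frequency to the nearest MIDI note number using the standard equal-temperament relation (A4 = 440 Hz = MIDI note 69). The function names and the 440 Hz tuning reference are assumptions made for this example; they are not part of a finished design, and detecting the frequency in the first place is a separate problem.

```python
import math

A4_HZ = 440.0   # assumed tuning reference for this sketch
A4_MIDI = 69    # MIDI note number assigned to A4

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def frequency_to_midi(freq_hz: float) -> int:
    """Return the MIDI note number closest to a frequency in hertz."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # Each semitone is a factor of 2**(1/12), so 12 * log2(ratio) counts semitones from A4.
    return round(A4_MIDI + 12 * math.log2(freq_hz / A4_HZ))


def midi_to_name(midi_note: int) -> str:
    """Return a note name such as 'C4' (middle C is MIDI note 60)."""
    octave = midi_note // 12 - 1
    return f"{NOTE_NAMES[midi_note % 12]}{octave}"
```

For example, midi_to_name(frequency_to_midi(261.63)) gives "C4", which is middle C.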
BACKGROUND:
This project is intended to shorten the path from a musician's idea to written notation. The market for transcription software, such as speech-to-text tools, is growing continuously, but it still needs high-quality products that improve on what is currently available. Any musician who has come up with an idea and forgotten it before being able to write it down would see great value in a tool like this.
METHODOLOGY:
To obtain these results, we must take several steps. First, we must collect data, such as audio recordings, their sheet music, and transcription rules, to guide our software toward accurate transcriptions. This step feeds directly into the next: choosing and training a machine learning algorithm that receives an audio file and its corresponding sheet music and learns the relationship between the two. Coding and debugging will be a major part of this step. Once the algorithm is trained, we will test it by feeding it audio files without sheet music and examining the transcriptions it produces.
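The data-collection and testing steps above could be organized roughly as follows. This is a minimal sketch, assuming each recording and its transcription share a file name (for example song1.wav and song1.mid); the function names, the naming convention, and the 20% test fraction are hypothetical choices for illustration only.

```python
import random
from pathlib import Path


def pair_recordings(audio_paths, score_paths):
    """Match each audio file with the score that has the same file stem."""
    scores_by_stem = {Path(p).stem: p for p in score_paths}
    pairs = []
    for audio in sorted(audio_paths):
        score = scores_by_stem.get(Path(audio).stem)
        if score is not None:  # recordings without a score cannot be used for training
            pairs.append((audio, score))
    return pairs


def split_pairs(pairs, test_fraction=0.2, seed=0):
    """Shuffle the pairs and hold some out so testing uses unseen recordings."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = max(1, int(len(shuffled) * test_fraction)) if shuffled else 0
    return shuffled[n_test:], shuffled[:n_test]  # (training pairs, test pairs)
```

Holding the test pairs out of training matters because the final step only tells us something if the algorithm has never seen those recordings before.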
EXPECTED RESULTS:
Real-time transcription of audio to MIDI signals.
Accurate analysis and notation of the pitch, timing, and dynamics of musical performances.
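One way we might put a number on "accurate" is to compare the notes our software produces against a reference transcription. The sketch below is only an assumption about how that could be done: it represents each note as an (onset in seconds, MIDI pitch) pair, counts an estimated note as correct when a reference note of the same pitch starts within a tolerance (50 ms here, a value chosen for illustration), and uses simple greedy matching. It ignores note length and dynamics.

```python
def score_transcription(reference, estimated, onset_tolerance=0.05):
    """Return (precision, recall, f1) for notes given as (onset_seconds, midi_pitch)."""
    unmatched = list(reference)  # reference notes not yet claimed by an estimate
    hits = 0
    for onset, pitch in estimated:
        for ref in unmatched:
            same_pitch = ref[1] == pitch
            close_in_time = abs(ref[0] - onset) <= onset_tolerance
            if same_pitch and close_in_time:
                unmatched.remove(ref)  # each reference note can be matched once
                hits += 1
                break
    precision = hits / len(estimated) if estimated else 0.0
    recall = hits / len(reference) if reference else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1
```

Precision tells us how many of the transcribed notes were right, and recall tells us how many of the played notes were captured; reporting both helps separate "invented extra notes" from "missed notes".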
COSTS:
This project should not be costly. Because we are doing it as part of a class, no salaries are needed for those working on it. The main expenses will be computing and data collection. Computing costs include the electricity to run our machines as well as any software or hardware needed to run our code, and they will vary depending on which type of machine learning we decide to use. Data collection costs can also vary: we can train our program on public recordings and transcriptions, but this may be limiting, and we may have to license music to get enough training data.