Week of 2/14/19
(Mostly research)
WORDS AND MELODY ARE INTERTWINED IN PERCEPTION OF SUNG WORDS:
- Purpose: determine whether words and melodies in songs are processed interactively or independently; determine the influence of attention on the processing of words and melodies in songs
- Conclusion: lyrics and tunes are intertwined in sung-word recognition; variations in melody and/or lyrics each elicited a cerebral response
A MELODY-CONDITIONED LYRICS LANGUAGE MODEL:
- Purpose: create a large database of lyric-melody aligned content with syllable-note alignments and word/sentence/paragraph boundaries, ultimately to train a language model that generates lyrics from an input melody
- Previously there had not been a sufficient amount of data to analyze the relationship between lyrics and melody
- Writing lyrics from melody requires accounting for word boundaries and rests in melody
- Current computer-aided lyric writing software still requires the user to interpret the source melody and determine constraints such as syllable count and rhyme position
- Goal: create a program that automatically identifies the constraints otherwise identified by the user, and build an independent model that generates coherent lyrics
- Discourse structure of lyrics (sentence/paragraph boundaries) is determined by melody rests and context words
- Lyrical generation requires more than just the analysis of syllable stresses and beats
- Melody-lyric database: dataset of digital music scores, each specifying a melody score augmented with syllable information (word/sentence/paragraph boundaries) for each note, aligned with raw lyrical text files (see Figure 2 from paper below)
- Phenomena discussed in paper:
- Words, sentences, and paragraphs rarely go beyond a long melody rest
- Boundaries of larger components generally align with longer melody rests
- Recurrent Neural Network Language Model (RNNLM): a neural network that maintains a hidden state summarizing all previous inputs, used to predict the next word
- Score data is useful to analyze relationship between phonological aspects of lyric and melody, but the lyrical text files are needed to analyze the discourse components (need structural information of the lyrics)
- Word segmentation and pronunciation: accomplished with a morphological analyzer
- Alignment of lyrics and score: Needleman-Wunsch algorithm (global alignment of two sequences)
- **Method for data creation is general enough to be applied to MusicXML** (I just need to get the program working; will be focusing on that next week)
- Boundaries of Lyrics: Notable Findings
- Positions of lyrics segment boundaries are biased to melody rest positions (less likely to appear at note positions)
- Probability of boundary occurrence depends on the duration of a rest (see Figure 4 below)
- Short rest = word boundary
- Long rest = block boundary
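The rest-duration findings above can be sketched as a simple mapping from rest length to the boundary it most likely marks. This is my own illustration; the thresholds (in beats) are guesses, not values from the paper, which models this probabilistically rather than with hard cutoffs:

```python
def boundary_type(rest_beats):
    """Map a melody rest length (in beats) to the lyric boundary it
    most likely marks. Thresholds are illustrative guesses, not the
    paper's; the paper models boundary probability as a function of
    rest duration rather than using hard cutoffs."""
    if rest_beats == 0:
        return "none"      # boundaries rarely fall at note positions
    elif rest_beats < 2:
        return "word"      # short rest -> word boundary
    else:
        return "block"     # long rest -> block (paragraph) boundary

print(boundary_type(1))  # word
print(boundary_type(8))  # block
```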
- Probabilistic approach: modeling lyrics using a rigid set of rules would be extremely difficult; better to use a melody-conditioned language model
- Need a language model that generates coherent lyrics whose discourse segments fit a given melody (segment boundaries follow the rest-duration distribution above, in both alignment and size)
- Melody-conditioned RNNLM: a standard RNNLM conditioned on a featurized representation of the input melody
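The lyric-score alignment step above uses the Needleman-Wunsch algorithm. A minimal textbook sketch of global alignment between two sequences (the scoring parameters here are my own defaults, not the paper's):

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of sequences a and b via dynamic programming.
    Returns (alignment score, list of aligned pairs; None marks a gap)."""
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + s,   # align a[i-1], b[j-1]
                              score[i - 1][j] + gap,     # gap in b
                              score[i][j - 1] + gap)     # gap in a
    # Traceback to recover one optimal alignment
    i, j, pairs = n, m, []
    while i > 0 or j > 0:
        s = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + s:
            pairs.append((a[i - 1], b[j - 1])); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            pairs.append((a[i - 1], None)); i -= 1
        else:
            pairs.append((None, b[j - 1])); j -= 1
    return score[n][m], pairs[::-1]

best, aligned = needleman_wunsch("GAT", "GCAT")
print(best)  # 2 (G/G, gap/C, A/A, T/T)
```

The same machinery aligns syllable sequences from the lyric text files to note sequences in the scores, with gaps absorbing rests or unmatched syllables.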
NEURAL NETWORK LANGUAGE MODELING:
To Do:
- Learn more about neural network language modeling & machine learning
- Figure out how to get Matlab working and apply the Japanese article's findings to MusicXML
- 1 sentence demo description
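For the MusicXML to-do above: rests (the boundary cues from the paper) can be pulled straight out of a MusicXML file with the standard library. The fragment below is a tiny hand-written example, not from a real score; element names (`note`, `rest`, `duration`) follow the MusicXML standard:

```python
import xml.etree.ElementTree as ET

# Tiny hand-written MusicXML fragment: two pitched notes around a rest.
SAMPLE = """<score-partwise>
  <part id="P1">
    <measure number="1">
      <note><pitch><step>C</step><octave>4</octave></pitch><duration>4</duration></note>
      <note><rest/><duration>8</duration></note>
      <note><pitch><step>D</step><octave>4</octave></pitch><duration>4</duration></note>
    </measure>
  </part>
</score-partwise>"""

def rest_durations(xml_text):
    """Return the durations (in MusicXML 'divisions') of every rest,
    in score order. A note is a rest iff it contains a <rest/> child."""
    root = ET.fromstring(xml_text)
    return [int(n.findtext("duration"))
            for n in root.iter("note") if n.find("rest") is not None]

print(rest_durations(SAMPLE))  # [8]
```

Feeding these durations into a rest-to-boundary model would reproduce the paper's pipeline on MusicXML input.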