Week 4 (01/31):
- familiarizing myself with NLTK
- found a particularly interesting function with parse trees & parts of grammar
- NLTK has very specific ways of defining parts of speech (not just Subj, Verb, Object of the Preposition, etc)
- decided to focus on mapping the relationship of the mood of a song vs. the use of parts of speech
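One way to make the mood-versus-parts-of-speech idea concrete is to turn a tagged lyric line into relative tag frequencies. This is a minimal sketch, assuming the line has already been tagged into (word, tag) pairs of the kind NLTK's `pos_tag` returns. The sample line, its tags, and the helper name `tag_frequencies` are hypothetical and hand-written for illustration.

```python
from collections import Counter

def tag_frequencies(tagged_tokens):
    """Return each tag's share of all tokens in one tagged line or song."""
    counts = Counter(tag for _word, tag in tagged_tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: n / total for tag, n in counts.items()}

# Hand-tagged example line (hypothetical; Penn Treebank-style tags).
sample = [("I", "PRP"), ("love", "VBP"), ("the", "DT"), ("sunny", "JJ"), ("days", "NNS")]
freqs = tag_frequencies(sample)
```

Using shares rather than raw counts keeps long and short songs comparable.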
To-Do:
- feedback: could look into how much repetition/semantic composition/roadmap goes into different subcategories of hiphop (probably rap, specifically)
- narrowing down the 3-5 emotions/categories of rap to focus on (e.g. happy, sad, angry, etc.)
- beginning to look into categorizing certain songs into certain categories (text-coding songs manually or find a database that matches my categories)
- future: try running through a rap song (may have to beware of slang and non-traditional sentences)
Week 5 (02/07):
- trying NLTK on a sample of different rap lines/sentences
- results in a parse tree, but contrary to what I expected, NLTK parses the lines into plenty of grammatical types that I need to understand better
- reading through the different types of grammar
- decided to start with happy/groovy, heartbreak, informative/message as the first three categories
To-Do:
- still trying to find a dataset of rap lyrics that have been pre-text-coded
- if I can't find a dataset of these, i will begin to find 1 song / category and will aim to have 5 songs / category to test
- (5 because many songs have slang)
Week 6 (02/14):
- couldn't find a suitable rap lyrics database for the categories/emotions I want [happy/groovy, heartbreak, informative/message], so starting to build my personal "5 songs / category" database
- difficulty: finding songs that are very clearly 1 category and don't bleed into another section
- another difficulty: songs have to have lots of recognizable/"grammatically correct" words, as opposed to rapper-created slang words or the unusual vocabulary charted at https://pudding.cool/projects/vocabulary/
- thinking about limiting the database to the last 20-30 years for a smaller population size
To-Do:
- continuing to pick the ideal 15 songs & then observing if there is indeed a connection between certain parts of grammar and emotion
- planning on dividing the types of grammar into 2 sections:
- 1. nltk dictionary
- 2. parts of speech we generally learn in school
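The two-section split above could be sketched as a mapping from fine-grained tags to school-style categories. This assumes the NLTK tags in question are Penn Treebank tags (NN, VBD, JJ, and so on). The grouping below and the names `SCHOOL_CATEGORIES`, `school_category`, and `coarse_counts` are illustrative assumptions, not a fixed standard.

```python
from collections import Counter

# Prefix-based mapping from Penn Treebank tags to school-style categories.
# The grouping is an illustrative assumption.
SCHOOL_CATEGORIES = [
    ("NN", "noun"),          # NN, NNS, NNP, NNPS
    ("VB", "verb"),          # VB, VBD, VBG, VBN, VBP, VBZ
    ("JJ", "adjective"),     # JJ, JJR, JJS
    ("RB", "adverb"),        # RB, RBR, RBS
    ("PRP", "pronoun"),      # PRP, PRP$
    ("IN", "preposition"),
    ("CC", "conjunction"),
    ("DT", "determiner"),
    ("UH", "interjection"),
]

def school_category(tag):
    """Collapse one fine-grained tag into a coarse category."""
    for prefix, name in SCHOOL_CATEGORIES:
        if tag.startswith(prefix):
            return name
    return "other"

def coarse_counts(tagged_tokens):
    """Count coarse categories over (word, tag) pairs."""
    return Counter(school_category(tag) for _word, tag in tagged_tokens)

# Hand-tagged example (hypothetical).
sample = [("She", "PRP"), ("quickly", "RB"), ("wrote", "VBD"), ("two", "CD"), ("verses", "NNS")]
counts = coarse_counts(sample)
```

Keeping both views side by side makes it possible to check whether a pattern shows up only at the fine-grained level or also at the school level.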
Week 7 (02/21):
- finalizing what I want to show for the 3/4 demo: showing three different graphs correlating the parts of speech and the different types of emotions
- comparing and contrasting what parts of speech show up in the top 5 for each
To-Do:
- run through Cameron's recommendations (just lyrics) to begin finding the final product's correlations for the graphs mentioned above
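The top-5 comparison planned for the demo could be sketched like this. The counts are made-up placeholders, not measurements from any song, and `top_tags` is a hypothetical helper name.

```python
from collections import Counter

def top_tags(tag_counts, n=5):
    """Return the n most frequent tags, most frequent first."""
    return [tag for tag, _count in tag_counts.most_common(n)]

# Hypothetical tag counts per category; not real data.
counts_by_category = {
    "happy/groovy": Counter({"NN": 40, "VB": 25, "JJ": 20, "PRP": 18, "IN": 12, "RB": 5}),
    "heartbreak": Counter({"PRP": 45, "VB": 30, "NN": 22, "RB": 15, "IN": 10, "JJ": 4}),
}

top = {category: top_tags(counts) for category, counts in counts_by_category.items()}

# Tags in both top-5 lists, and tags that only appear for one category.
shared = set(top["happy/groovy"]) & set(top["heartbreak"])
only_happy = set(top["happy/groovy"]) - set(top["heartbreak"])
```

The set difference is the interesting part for the graphs: tags that rank highly for one emotion but not another are candidates for a real correlation.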
Week 8 (02/28):
- pretty simple week: just collecting data on Excel (beginning with Cameron's songs)
- mapping the 3 types of emotions to the frequencies of each part of grammar
To-Do:
- continuing to gather data
- may write a script to read a nltk parse tree and record frequencies in a csv/excel file
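A script like the one mentioned above might look roughly like this. It is a sketch that assumes the parse tree is available as a bracketed string, the form NLTK prints for its trees, and it uses only the standard library. The example tree is hand-written, and the function names and CSV column names are hypothetical.

```python
import csv
import io
import re
from collections import Counter

def label_counts(bracketed_tree):
    """Count node labels in a bracketed parse string like '(S (NP ...) (VP ...))'."""
    # Each label directly follows an opening parenthesis.
    return Counter(re.findall(r"\(\s*([^\s()]+)", bracketed_tree))

def write_counts(counts, out_file):
    """Write label counts as CSV rows, sorted by label."""
    writer = csv.writer(out_file)
    writer.writerow(["label", "count"])
    for label, count in sorted(counts.items()):
        writer.writerow([label, count])

# Hand-written example tree (hypothetical).
tree_text = "(S (NP (PRP I)) (VP (VBP miss) (NP (DT the) (JJ old) (NNS days))))"
counts = label_counts(tree_text)

# An in-memory buffer keeps the sketch self-contained; a real run could
# pass open("counts.csv", "w", newline="") instead.
buffer = io.StringIO()
write_counts(counts, buffer)
csv_text = buffer.getvalue()
```

The resulting CSV opens directly in Excel, so it would fit the existing data-collection workflow.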
Week 9 (03/07): missed the meeting, was sick
Week 10 (03/14):
- preparing for 3/4 demos
- created statistics for 2 songs in each category: happy, sad, and informative
- mapped graphs of nouns, verbs, adjective phrases
Week 11 (03/28): 3/4 demos
Week 12 (04/04):
- thinking about what to do for the final presentation
- looking into the paper and the statistics that I could compile to produce meaningful results
- also still creating a script in order to create parse trees on any new data
Week 13 (04/11):
- worked on the paper outline, see https://docs.google.com/document/d/1j1CLgV7Q_WkNy6avCEOrTWJAo3BoJIHkqI3Q4Cu4ngk/edit
- filling in the script to turn the command-line prompts into an interactive script that produces a parse tree and outputs statistics
- emotion, categories field
- will be filling in py journal when the script is complete
- to-do: combine script (lyric analysis) with rob & cameron's lyric analysis tools
Final Demo (04/19):
- https://github.com/guess-hwu/lyrics-nltk
- Future project ideas:
- Integrate Rob Firstman's use of the genius library and JSON format to query songs
- Expand the emotional genres. More than just happy, sad, informative.
- Collect data, find the ideal margins for each emotional genre.
- Test how accurate these margins are! Working backwards from a song's statistics, do the margins correctly identify its emotional genre?
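The backwards test in the last idea could be sketched as follows. It assumes a "margin" means a (low, high) range of tag frequency per emotional genre, which is one possible reading. All margins, songs, and function names below are hypothetical placeholders, not results.

```python
def predict_genre(freqs, margins):
    """Pick the genre whose margins contain the most of the song's tag frequencies."""
    def score(genre):
        return sum(
            low <= freqs.get(tag, 0.0) <= high
            for tag, (low, high) in margins[genre].items()
        )
    # On a tie, max() keeps the genre listed first in `margins`.
    return max(margins, key=score)

def accuracy(labeled_songs, margins):
    """Share of songs whose predicted genre matches the hand-assigned one."""
    hits = sum(predict_genre(freqs, margins) == genre for genre, freqs in labeled_songs)
    return hits / len(labeled_songs)

# Hypothetical margins: genre -> tag -> (low, high) share of tokens.
margins = {
    "happy": {"JJ": (0.15, 0.30), "PRP": (0.05, 0.15)},
    "sad": {"JJ": (0.00, 0.10), "PRP": (0.20, 0.40)},
}

# Hypothetical hand-labeled songs: (genre, tag frequencies).
songs = [
    ("happy", {"JJ": 0.20, "PRP": 0.10}),
    ("sad", {"JJ": 0.05, "PRP": 0.30}),
    ("sad", {"JJ": 0.18, "PRP": 0.12}),  # sits inside the happy margins
]
result = accuracy(songs, margins)
```

A low accuracy here would suggest the margins overlap too much or that the chosen tags do not separate the genres.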
Write up (04/28):