https://drive.google.com/drive/folders/1vQ1zedLuHoap0GZ7tmygCKYNxWzNU6FA
Markov Music Machine Learning Statistics Project Time Stamp
This serves as undeniable proof and hard evidence that I, Benjamin Davis, came up with this idea and made it publicly available on Nov 3, 2023 (using Google Sites' publication-date feature).
(The Google Drive link also shows the publicly visible last-modification dates of the associated PDF documents.)
Since this relates to AI, which is evolving by leaps and bounds and changing the landscape of many industries and fields, including music, I decided to take this bare-minimum precaution to protect myself and my idea from being taken outright without any credit whatsoever. Even if this makes no money, if someone does eventually bring it to fruition and makes a lot of profit from it, I at least deserve credit for coming up with it and putting it out there first, even if no academic institution or journal ever cared.
AI/Machine learning is quickly taking off, so Ben is laying claim to this idea on Nov. 3, 2023:
I cannot predict the future, and I will probably eventually forget that this page even exists.
Benjamin Davis's Idea:
Use Markov chains to model musical chord progressions, the fundamental structure behind music theory and composition and the way all tonal music is organized.
Ultimately, the lead sheet or chord structure determines the harmonic content of a song or piece, and to interpret or analyze it mathematically one can use Markov chains, with successively higher-order conditioning on the preceding chords to model the next one in the sequence.
This is very analogous to Claude Shannon's model of language from his seminal paper "A Mathematical Theory of Communication," and it is roughly how the mechanics of ChatGPT and other large language models actually work: they ultimately get very good at predicting the next word in a sentence or phrase. Better prediction basically comes from higher-order conditioning, accounting for more history before making the next decision/classification/guess of what comes next. The limit on how well one can perform this task more or less comes from the underlying entropy rate of the sequence. Regardless, with the superfluous amounts of data we have nowadays in the age of computing and information, technology not so readily available in Shannon's day, we can get very good at predicting the next word, which gives rise to ChatGPT-style large language models that seem to intelligibly and comprehensibly intuit language and text-based communicative expression.
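A minimal sketch of this higher-order conditioning idea, applied to chord symbols rather than words (the toy corpus, function names, and model order here are illustrative assumptions, not a finished method):

```python
from collections import Counter, defaultdict

def fit_ngram(sequences, order=2):
    """Count (history -> next chord) transitions of the given order."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(order, len(seq)):
            counts[tuple(seq[i - order:i])][seq[i]] += 1
    return counts

def predict_next(counts, history):
    """Return the most frequent chord following the given history."""
    nxt = counts.get(tuple(history))
    return nxt.most_common(1)[0][0] if nxt else None

# Toy corpus of Roman-numeral chord progressions (hypothetical data).
corpus = [["I", "V", "vi", "IV", "I", "V", "vi", "IV"],
          ["vi", "IV", "I", "V", "vi", "IV", "I", "V"]]
model = fit_ngram(corpus, order=2)
print(predict_next(model, ["V", "vi"]))  # -> "IV"
```

Raising `order` conditions on more history, exactly the higher-order chaining described above; the entropy rate of the chord sequence bounds how well any such predictor can do.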
Supposedly one can do the same thing with musical chord progressions, rather than MIDI, sheet music, short-time Fourier transform spectrogram images, fast wavelet transforms, or raw audio data. Indeed, the insight is that chord progressions better capture many musical and psychoacoustic characteristics, because the underlying music theory inherent in designing a chord progression is integral to compositional practice. Likewise, instead of going note by note, one should approach the fundamental structure by running Markov chains on the chord progressions, which are the key underlying "backbone" skeleton structure.
Anyways, this site was made permanent and publicly available on November 3, 2023, mainly to serve as a time-stamped "proof" or "record" of when this idea was established.
Ben was classically trained, has perfect pitch, and inherently analyzes all music this way anyway, automatically, by brain and ear; perhaps this methodology could be made more technologically available by compounding it with big data to analyze the kinds of patterns that arise.
Any updates to this project will not be contained on this site as this purely serves as a time-stamp permanent record.
A couple of math details:
In terms of the time domain of the Markov chain:
You could quantize time to the measure/bar or the beat and use repeated self-transitions when the same chord is held for a long duration.
Because most music is in 4/4, 3/4, or 6/8, you will not have to deal with continuous time; a chord transition ultimately happens on some specific beat boundary.
In terms of the state space:
You could do a rough hierarchical stratification along the following lines:
Ultimately you have A, B, C, D, E, F, G plus sharps and flats, i.e., all the white and black keys on the piano.
Each white and black key has an associated major and minor chord, along with their inversions.
There are also diminished and augmented chords.
There are also 7th and 9th chords, as well as idiomatic chords like the French augmented sixth and various suspended/jazz harmony flavors.
Among 7th chords there are dominant, major, minor, minor-major, half-diminished, and fully diminished varieties.
At the end of the day, the big-picture point is that there is a finite number of distinct chords recognized within music-theory convention and standardization (we are not going to call every subset of the 12 keys (7 white + 5 black) a chord).
With big data, the histogram/clustering of observed chords will let the law of large numbers apply to each state (you will indeed find multiple instances of each chord in the actual music out there).
So basically we have a discrete-time, discrete-state Markov chain, so to say; a small sketch follows below.
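As a concrete sketch of that discrete-time, discrete-state setup: a simplified state space and a transition-matrix estimator in which a chord held across several beats produces self-transitions. The state catalog and the toy beat sequence are illustrative assumptions, not the full music-theory inventory:

```python
import numpy as np

# Simplified state space: 12 roots x a few triad qualities (a fuller catalog
# would add 7ths, 9ths, augmented sixths, suspensions, inversions, etc.).
ROOTS = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
QUALITIES = ["maj", "min", "dim", "aug"]
STATES = [f"{r}:{q}" for r in ROOTS for q in QUALITIES]
INDEX = {s: i for i, s in enumerate(STATES)}

def transition_matrix(beat_chords):
    """Estimate a first-order transition matrix from one chord label per beat.
    Holding a chord across beats yields self-transitions, as described above."""
    counts = np.zeros((len(STATES), len(STATES)))
    for a, b in zip(beat_chords, beat_chords[1:]):
        counts[INDEX[a], INDEX[b]] += 1
    rows = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, rows, out=np.zeros_like(counts), where=rows > 0)

# Hypothetical beat-quantized sequence in 4/4: C held a full bar, then G, Am, F.
beats = ["C:maj"] * 4 + ["G:maj"] * 2 + ["A:min"] * 2 + ["F:maj"] * 4
P = transition_matrix(beats)
print(P[INDEX["C:maj"], INDEX["C:maj"]])  # 0.75: self-transition while C is held
```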
In short:
Drawing on all my music composition and theory expertise: the key to actually understanding music, particularly its structure, lies directly in the chord progression. This is the backbone skeleton that holds up the rest of the embellishments and phrases of a song, and it is indeed how musicians actually comprehend and interpret music.
As an example, one can quickly dish out pop/rock music with 1564 (I-V-vi-IV) or 6415 (vi-IV-I-V) in major, or 4536 (IV-V-iii-vi) in major for Japanese pop/rock. This significant insight is crucial for processing more complex music like romantic-era Tchaikovsky, Chopin, Liszt, and Rachmaninoff, or jazz.
Rather than exhausting all combinations, sequences, and degrees of freedom within the space of all music (e.g., a fully connected neural network layer over raw input quickly becomes computationally intractable, the curse of dimensionality), one should guide one's attention to chord progressions.
Indeed, hypothetically, the space of all classifiers or regressors on raw music data (in the VC-dimension sense) would include, as a pre-processing step, first computing chord progressions from the ground data; alternatively, perhaps the features that light up in a generic CNN would end up indicating that one should look at chord progressions anyway.
But as a human who has extensively studied music, I am giving the upfront hint that this is the key feature one should examine for AI applied to music. It might take sheer enormous amounts of computational power to arrive at this without human insight into how music composition fundamentally works, even after consulting the psychoacoustics or music-cognition literature, which does not necessarily consider music theory and compositional practice.
Supposedly, the insightful human here is saying to at least have a separate parallel channel in the neural network architecture where chord-progression processing is done (analogous to the parallel image-processing channels given by convolutional filters or color channels); jumping directly to this point could save enormous amounts of computational resources (a hypothetical sketch of such a two-channel architecture follows below). Music is a different animal from text: audio is inherently higher dimensional (with more degrees of freedom in formant/frequency space), it takes more resources to process, store, and compute upon, and frankly it is at this point less abundant than the text available on the web.
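One hypothetical way to wire this up, sketched in PyTorch; the module names, layer sizes, and fusion scheme are all illustrative assumptions rather than a prescribed design:

```python
import torch
import torch.nn as nn

class TwoChannelMusicNet(nn.Module):
    """Sketch: a spectrogram branch in parallel with a dedicated
    chord-progression branch, fused before a shared output head."""
    def __init__(self, n_chord_states=48, n_classes=10):
        super().__init__()
        # Branch 1: generic convolutional stack over spectrogram frames.
        self.audio_branch = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)), nn.Flatten(),
            nn.Linear(16 * 4 * 4, 64))
        # Branch 2: recurrent model over one-hot chord-progression frames.
        self.chord_branch = nn.GRU(input_size=n_chord_states,
                                   hidden_size=64, batch_first=True)
        self.head = nn.Linear(64 + 64, n_classes)

    def forward(self, spectrogram, chord_onehots):
        a = self.audio_branch(spectrogram)       # (batch, 64)
        _, h = self.chord_branch(chord_onehots)  # h: (1, batch, 64)
        return self.head(torch.cat([a, h[-1]], dim=1))

# Smoke test with random tensors (all shapes illustrative).
net = TwoChannelMusicNet()
out = net(torch.randn(2, 1, 128, 128), torch.randn(2, 32, 48))
print(out.shape)  # torch.Size([2, 10])
```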
Indeed, chord progressions will yield the features that explain how the interwoven structure, the canvas fabric, of music works, analogous to how the Einstein field equations and their associated tensors provide the metric describing the underlying curvature of spacetime.
Anyways, if this call is heeded, I deserve credit. If not, well I suppose composers will keep their jobs for a while longer.
BIOST579_Homework1_BenjaminDavis
BIOST579 Homework 1 | Benjamin Davis | BLDBLD@UW.EDU
Project Proposal: “Mathematical-Markov-Modelling-Music-Machine Learning”
Brief relevant personal background: Ben grew up writing music for the Seattle Symphony, completed a music Bachelor's long ago, and is now a second-year UW electrical engineering PhD student; after successful completion of BIOST 579 and the STAT qualifying exam, Ben will also earn a concurrent UW Statistics ScM.
Background academic context:
In the realm of artificial intelligence applied to comprehending musical structure (as opposed to the signal processing of sound), there are currently basically only two philosophies, nearly mutually exclusive and sharply contrasting in ideology:
A. Music-to-engineering approach:
TL;DR oversimplified: collect a bunch of MIDI data, extract the melody line, and run a deep recurrent predictive time-series neural network on it, of the kind standardly used for autocompleting text sentences in context, and hope that with enough data and brute-force computational power some mysterious, incomprehensible pattern will emerge from the rubble.
B. Engineering-to-music approach:
TL;DR oversimplified: flip a coin and write down the corresponding note, ultimately generating melodic phrases algorithmically using either fractals or Monte Carlo simulation.
Here I propose an entirely new method of structural interpretation and analysis of music using machine learning and mathematical modeling of stochastic processes and Markov Chains.
Music theory background:
Artificial intelligence researchers are already barking up the wrong tree by throwing out and ignoring the overarching polyphonic harmonic content, and they furthermore lose the forest for the trees in unnecessarily close examination of single monophonic melody lines, as illustrated by the previous examples. To get a more nuanced, in-depth understanding of overarching musical structure, one should instead examine the chord progressions, the sequences of chords coming one after another in succession, rather than scrounging for meaning in inconsequentially mundane melody lines or Shakespeare quotes when there ultimately is not anything significant to find there. This is an extremely well-understood concept throughout the music theory curriculum, which is why musical practitioners undergo extensive training in Roman-numeral harmonic analysis. It is exemplified by Jazz Fake Book lead sheets for standards, which ultimately include just the chord-progression guitar symbols and a single melody line with its corresponding lyrics (as opposed to the full orchestration). This illustrates the key idea that musical structure can ultimately be boiled down to its underlying chord progression, which even somewhat dictates the melody (e.g., a melody may not sound the most "tonally pleasing" when harmonized in a different key than the original one it was intended for).
Connection to biostatistics:
If music can ultimately be structurally analyzed via the fundamental sequence of chords the orchestra or band plays in succession, then many interesting insights can be gleaned through data analysis of these chord progressions and examination of the types of sequences that occur in practice. Similar to Claude Shannon's original model of increasingly-higher-order-memory text generation from information theory, each chord in a musical progression is not generated independently and identically distributed; there is some "memory trail" tailing behind, indicating a stochastic process with correlation between successive measurements. Naturally, some of the methodology to be used for this project can also be used for DNA sequence analysis.
Data Set:
Ben has perfect pitch and has been collecting his own data in his free time through laborious individual analysis, tabulating by hand (by ear and by visual analysis of notated manuscript scores) the chord progressions he observes across a variety of compositions from throughout music history, including Chopin, Liszt, Rachmaninoff, Tchaikovsky, Brahms, Strauss, Mozart, Beethoven, Ben himself, etc. This data set does not exist anywhere else, and since it was self-collected using public knowledge from public-domain scores/manuscripts, it can ultimately be used for any purpose we so desire.
Scientific inquiry goals/specific questions of interest:
For the moment, this is still framed as an exploratory data analysis project; we are curious to see what kinds of interesting patterns emerge in a statistical analysis of the collected chord-progression data. Worth noting, both the data set and the analysis will be genuinely novel (nobody has attempted anything remotely similar). Hopefully, we can estimate the Markov transition probability matrix (TPM) for chord states and interstitial harmonic interval-jump states, normalized to the underlying key signature of each musical section; we can separate these TPMs by composer and genre and provide a further breakdown of specific patterns (frequency of diminished chords, parity switching from major to minor, etc.). A sketch of this estimation follows below.
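A minimal sketch of per-composer TPM estimation with key normalization, under the assumption that each record carries a composer label, a key root, and a hand-tabulated (root, quality) chord sequence; all names and the toy record are hypothetical:

```python
import numpy as np

PITCHES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def normalize_to_c(chords, key_root):
    """Transpose (root, quality) chord labels so the piece's key maps to C."""
    shift = -PITCHES.index(key_root)
    return [(PITCHES[(PITCHES.index(r) + shift) % 12], q) for r, q in chords]

def tpm_by_composer(pieces, states):
    """pieces: list of (composer, key_root, [(root, quality), ...]) records."""
    idx = {s: i for i, s in enumerate(states)}
    tpms = {}
    for composer, key_root, chords in pieces:
        m = tpms.setdefault(composer, np.zeros((len(states), len(states))))
        seq = normalize_to_c(chords, key_root)
        for a, b in zip(seq, seq[1:]):
            m[idx[a], idx[b]] += 1
    for composer, m in tpms.items():  # row-normalize counts into probabilities
        rows = m.sum(axis=1, keepdims=True)
        tpms[composer] = np.divide(m, rows, out=np.zeros_like(m), where=rows > 0)
    return tpms

# Hypothetical record: a G-major excerpt (I-V-vi-IV) normalized to C major.
states = [(r, q) for r in PITCHES for q in ("maj", "min", "dim")]
pieces = [("Chopin", "G", [("G", "maj"), ("D", "maj"), ("E", "min"), ("C", "maj")])]
print(tpm_by_composer(pieces, states)["Chopin"].sum())  # 3 occupied rows -> 3.0
```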
This is, very metaphorically, analogous to protein folding in structural biology:
Beyond first-order prediction of the next chord in sequence (the "tomorrow" state, so to say), second-order section analysis and ternary-form analysis are also of interest.
For example, within the first tonal area, the second tonal area, or the development section, specific happenings can be analyzed in a manner similar to "contextual bandits": in a given context, the system behaves a certain way.
There are various degrees of intertwined curling and folding within protein structure, and likewise within the harmonic structure of musical chord progressions.
In computational genomics, sequence analysis, NLP, and structured data generally, this is handled with one-hot encoding to capture relationships between distant features and happenings along the sequence:
Here the underlying key context (change/modulation) is supposedly a feature worth examining; in this statistical analysis, all sequences from Chopin, Liszt, Mozart, etc. were "normalized" to a C-major anchor key when analyzing which chord is currently being played (a one-hot encoding sketch follows below).
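A tiny sketch of that one-hot encoding, turning a key-normalized progression into a binary image a CNN could consume (the Roman-numeral vocabulary here is an illustrative assumption):

```python
import numpy as np

def one_hot_sequence(chords, vocab):
    """Encode a chord progression as a (time x vocabulary) binary image,
    analogous to one-hot encoding ATGC characters in a DNA sequence."""
    idx = {c: i for i, c in enumerate(vocab)}
    img = np.zeros((len(chords), len(vocab)), dtype=np.uint8)
    for t, chord in enumerate(chords):
        img[t, idx[chord]] = 1
    return img

# Hypothetical vocabulary of key-normalized Roman-numeral chord states.
vocab = ["I", "ii", "iii", "IV", "V", "vi", "vii°"]
print(one_hot_sequence(["I", "V", "vi", "IV"], vocab))
# 4x7 binary matrix: one lit "pixel" per time step, ready for a CNN.
```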
In a nutshell, if someone ends up running a deep convolutional neural network on actual chord progressions using one-hot encoding (analogous to ATGC DNA/RNA sequences being converted to images with a pixel lit up for each character or letter), then I, Benjamin Davis, came up with this idea no later than November 3, 2023.
Supposedly, Benjamin Davis came up with this idea and actually generated some basic data sets, tediously, by hand, using perfect pitch plus many years of classical music-theory training.
Supposedly, Ben used his extensive knowledge of classical music theory to identify this as the key underlying "feature space" to analyze if one wants to truly understand the structure of music, as opposed to its other aspects:
It is not the raw MIDI, the raw WAV or MP3 file, the direct frequency-domain short-time Fourier transform spectrogram, or the sheet-music image that is of interest. You can throw all that unprocessed information into a machine learning algorithm, and it may not know what to do with it or make sense of the underlying patterns that truly define the music, especially in a way humans can understand. The true backbone structure of music lies in the chord progressions, which I am proclaiming and identifying here.
I argue the key processing step for intuitive comprehension of music using artificial intelligence lies in the chord progressions.
That being said, there is only one of me to go around hand-analyzing the chord progressions of various pieces. Supposedly this could be automated using digital signal processing: frequency-domain FFT analysis of sound files, conversion to MIDI, and then analysis of the chords observed in the MIDI sequences, i.e., which polyphonic notes coincide and what kind or quality of "chord" each such instance comprises (a rough sketch follows below).
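A rough sketch of one step of that pipeline, assuming clean audio: fold an FFT magnitude spectrum into 12 pitch-class (chroma) bins and match against triad templates. The windowing, the template set, and the synthetic test tone are all simplifying assumptions:

```python
import numpy as np

def chroma_from_frame(frame, sr):
    """Fold an FFT magnitude spectrum into 12 pitch-class bins (chroma)."""
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    chroma = np.zeros(12)
    for f, mag in zip(freqs[1:], spectrum[1:]):  # skip the DC bin
        midi = 69 + 12 * np.log2(f / 440.0)      # map frequency to MIDI pitch
        chroma[int(round(midi)) % 12] += mag
    return chroma

def guess_triad(chroma):
    """Match chroma against major/minor triad templates for every root."""
    best, best_score = None, -np.inf
    for root in range(12):
        for name, intervals in (("maj", (0, 4, 7)), ("min", (0, 3, 7))):
            score = sum(chroma[(root + i) % 12] for i in intervals)
            if score > best_score:
                best, best_score = (root, name), score
    return best

# Synthesize one frame of a C-major triad (C4, E4, G4) and classify it.
sr, t = 22050, np.arange(4096) / 22050
frame = sum(np.sin(2 * np.pi * f * t) for f in (261.63, 329.63, 392.0))
print(guess_triad(chroma_from_frame(frame, sr)))  # expect (0, 'maj') = C major
```

A production pipeline would of course need beat tracking, smoothing across frames, and a richer chord vocabulary, but this is the core DSP idea.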
That being said, most pop music is not very sophisticated, so you may not find any interesting patterns there; it is past the 1700s classical music of Mozart and Beethoven, into the Romantic-era 1800s and jazz-era 1900s, that music became sophisticated enough to actually merit this kind of harmonic structural analysis rather than motivic analysis.
In a nutshell, whether the data set is hand-generated by humans (via supervised labeling or clustering) or generated automatically using signal processing on MIDI, sheet music, or Fourier-transformed audio data, the idea is to analyze music directly in chord-progression space, e.g., with Markov models or one-hot-encoded convolutional neural network architectures applied to those progressions. This will yield insightful results from a music-theory standpoint and can aid other tasks like automatic generation or classification. Additionally, it will yield "explainable AI"-type results which humans, especially those versed in music theory, can readily understand and interpret.
Regardless, on November 3, 2023, I have officially proclaimed this idea outright and published it publicly, before some startup steals it, makes a ton of money, and gives me no credit.