Song 1 Analysis: Daddyphatsnaps - "Flashy" - Jhon
This first image is a waveform taken from the software Audacity. At the beginning of the track, from 5 seconds through 18 seconds, the waveform stays constant. By the 19-second mark, however, its amplitude gradually starts to increase because of the inclusion of the vocals: when the artist starts rapping, the song gets louder. There is also a transient at the 19-second mark from the hi-hats that play as the artist begins rapping. Then, by the 1 minute and 13 second mark, the waveform gradually starts to decrease in size. I also noticed that the song has a high dynamic range, as with my headphones alone I can hear every sound in the song very clearly, from the beat and the drums all the way to the rapper's voice.
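To make this waveform reading more concrete, below is a minimal sketch of how a similar amplitude-over-time view could be reproduced outside Audacity, in Python with the soundfile and matplotlib libraries; the file name "flashy.wav" is just a placeholder, not the actual project file.

# Minimal sketch: plot a track's waveform (amplitude over time),
# similar to the Audacity view. "flashy.wav" is a placeholder path.
import numpy as np
import soundfile as sf
import matplotlib.pyplot as plt

data, rate = sf.read("flashy.wav")   # data: samples, rate: samples per second
if data.ndim > 1:
    data = data.mean(axis=1)         # mix stereo down to mono for a single plot

t = np.arange(len(data)) / rate      # time axis in seconds
plt.plot(t, data, linewidth=0.3)
plt.xlabel("Time (s)")
plt.ylabel("Amplitude")
plt.title("Waveform")
plt.show()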
The spectrogram of this song shows that for the first 15 seconds its energy sits at around 4.9 kHz. Then, when the song reaches the 30-second mark, it shoots up to about 15.5 kHz. These spectrograms were taken using this website: https://academo.org/demos/spectrum-analyzer/
This waveform picture was also taken from Audacity. It shows that the track's waveform is fairly flat for about the first 8 seconds, then the amplitude starts to grow, signalling that the song is about to get louder and more instruments are about to enter. By the 17-second mark there is a transient as the violins come into the track; the transient then dies down as the drums and percussion make their way in. I also noticed that the stereo width is fairly balanced and that the song has a high dynamic range, as the mix is clear and I can hear all the instruments as well as the singer's voice.
For the first spectrogram photo of the song, I noticed that the energy stayed at around 4.5 kHz for a few seconds, then shot up to about 14.35 kHz within the span of 6 seconds.
For the second spectrogram, I noticed that from the 20-second to the 30-second mark, about 10 seconds, the song consistently sits at around 14.5 kHz. The spectrograms for this song were taken using this website: https://academo.org/demos/spectrum-analyzer/
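For reference, the following is a minimal sketch of how a comparable spectrogram could be generated in Python with SciPy instead of the website, assuming the mono data and sample rate from the waveform sketch earlier.

# Minimal sketch: generate a spectrogram like the academo.org analyzer,
# assuming a mono signal `data` at sample rate `rate` (see previous sketch).
import numpy as np
from scipy import signal
import matplotlib.pyplot as plt

f, t, Sxx = signal.spectrogram(data, fs=rate, nperseg=2048)
plt.pcolormesh(t, f, 10 * np.log10(Sxx + 1e-12), shading="auto")  # dB scale
plt.xlabel("Time (s)")
plt.ylabel("Frequency (Hz)")
plt.title("Spectrogram")
plt.show()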
To demonstrate the loudness of this track I have used the iZotope Insight History module, which displays the integrated loudness information over time (Picture 2), together with the actual Pro Tools waveform (Picture 1). The beginning is much quieter than the middle of the song (around 1 min), with only a few high-amplitude peaks from the bass track. The dynamic range widens progressively from the beginning. A few seconds after the song starts, the voice and the drums appear in the mix, and because of that the high frequencies from the hi-hat begin to show up. Around the 1-minute mark, the trumpets and trombones come in; these brass instruments produce sounds with a high amount of harmonics and overtones.
In Picture 2 we can visually see whenever something is higher or lower in amplitude than the target average level, which is shown by the dotted red line across the horizontal axis. Looking at Picture 3 below, it is really useful to have a good frequency analysis to understand where we need to bring levels up or down with DAW automation. As this is a screenshot of the beginning of the song, the high frequencies have obviously not yet appeared. This is a salsa-infused, high-tempo song, and the timbre of the instruments creates the specific overtones and harmonics the genre requires.
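As an aside, integrated loudness like the History module displays can also be approximated in code. Below is a minimal sketch using the pyloudnorm library (a BS.1770-style meter); the file name is a placeholder, and this is not iZotope's own algorithm.

# Minimal sketch: measure integrated loudness (LUFS) in code, analogous to
# what Insight's History module displays. "spanish_joint.wav" is a placeholder.
import soundfile as sf
import pyloudnorm as pyln

data, rate = sf.read("spanish_joint.wav")
meter = pyln.Meter(rate)                       # BS.1770 loudness meter
loudness = meter.integrated_loudness(data)     # whole-track integrated LUFS
print(f"Integrated loudness: {loudness:.1f} LUFS")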
As we can see in Picture 1, it shows a very safe mix in all 4 frequency bands. The mix leans towards the left side, which is quite rare, but it adds a great taste to the overall mix. Picture 2 displays the same 4 frequency bands in another way; as the white line doesn't move outside, it shows a safe mix too. Even so, the bass still tends to peak and step out, since bass frequencies have a long wavelength. The RMS level of this song is roughly around -7 dBFS to -12 dBFS, which reflects a fairly loud mix.
Picture 2 displays a spectrogram analyser: a real-time 3D spectrogram that generates a comprehensive topographical map of the audio. It shows a well-tamed low end and high end, with a low dynamic range that is quite rare, especially for drum tracks. Since the bass track and drum track take up the vast majority of the lower frequencies, those are represented by waveforms with high amplitudes, while the high frequencies appear as shorter, sharper waveforms at the other end. The guitar solo is clearly visible in this picture as a quick rise in the frequencies. The pitch and distortion of the guitars add warmth to the song at the beginning, and the acoustic envelope of the guitar creates a nice vibe in the song. The RMS level of this song is roughly around -16 dBFS to -18 dBFS, which reflects a mix that isn't especially loud.
This displays a sound field analysis. In this polar sample we can tell it is not a stereo mix, as the meter on the right-hand side stays near the +1 mark. If this were a stereo mix, the detail would be spread much more towards the 45-degree lines. So essentially this song is mixed in mono, which is pretty unique. However, since the drum kit has overheads, the main lead guitar part with its effect and the added keys in the interlude section sound a bit wider, giving the mix a slight stereo feel and a nice blend of the tones and timbres of the instruments. The small dots around the straight line show the high frequencies, the overtones and harmonics.
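A rough numeric version of this mono/stereo check is sketched below: the correlation between the left and right channels plays the role of the meter, with values near +1 meaning an essentially mono mix and values towards 0 or below meaning a wider stereo image. The file name is a placeholder.

# Minimal sketch: a rough stereo-width check like the polar/correlation meter.
# Correlation near +1 means left and right are nearly identical (mono-like).
import numpy as np
import soundfile as sf

data, rate = sf.read("track.wav")   # placeholder path; expects a stereo file
left, right = data[:, 0], data[:, 1]
corr = np.corrcoef(left, right)[0, 1]
print(f"L/R correlation: {corr:+.3f}")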
Live Capture
MP3 30-Second Clip (0:20 to 0:50):
https://drive.google.com/file/d/1iM5N-QhUlwPdmM1sjnR9IXFnT3rOn7J-/view?usp=sharing
The above-inserted waveform image demonstrates the amplitude change across a 30-second period of an outdoor natural bird-sound capture, recorded with an iPhone 12 microphone. Within this waveform screenshot, it can be seen that various sounds are being picked up at once, which can be identified by the large movement of the sound wave from one peak to another in a short period of time.
The above image demonstrates the different pitches captured across the 30 seconds of the birds' recording, and it shows the ratio between the loudest and quietest sounds captured, which is known as the dynamic range of the recording. Because it is a bird recording, there are various large pitch shifts between each section of the waveform. The different-sized waves within each pitched note indicate the velocity of each bird sound being picked up, where the velocity of each sound can be identified by the rate at which the recorded waveform changes position.
The above EQ image demonstrates the different frequencies at which the combination of sounds that makes up the birds' recording is captured. From the graphical representation of the EQ, it can be seen that there is a high amplitude of sound at around the 20 dB level between 20 Hz and 400 Hz, which is abnormal for a bird recording. Referring back to the recording and the preceding information, it is clear that these lower frequencies are background noise. On the other hand, the higher-end frequencies between 2 kHz and 6 kHz can be identified as the birds' sounds themselves, peaking at around 40 dB.
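As an illustration, an average-spectrum view like this EQ image could be approximated with a Fourier transform. Below is a minimal sketch assuming the mono data and sample rate conventions from the earlier waveform sketch.

# Minimal sketch: an average spectrum plot like the EQ analyzer view,
# assuming a mono signal `data` at sample rate `rate`.
import numpy as np
import matplotlib.pyplot as plt

spectrum = np.abs(np.fft.rfft(data))            # magnitude spectrum
freqs = np.fft.rfftfreq(len(data), d=1 / rate)  # matching frequency axis
plt.semilogx(freqs[1:], 20 * np.log10(spectrum[1:] + 1e-12))  # dB vs log Hz
plt.xlabel("Frequency (Hz)")
plt.ylabel("Level (dB)")
plt.title("Average spectrum")
plt.show()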
YouTube Track
The above-inserted waveform image demonstrates the amplitude change across a 30-second period of the track Heather by Conan Gray. The song contains a build-up of instruments into a chorus section at bar 35. Within this waveform screenshot, it can be seen that various sounds are being picked up at once, which can be identified by the large movement of the sound wave from one peak to another in a short period of time. In this section of the track, it can also be seen that the left and right audio signals are nearly identical.
The above image demonstrates the different pitches captured across 30 seconds of the track Heather by Conan Gray, depicting various instruments such as vocals, drums, guitars, pianos, and percussion. It also shows the ratio between the loudest and quietest sounds captured, which is known as the dynamic range of the track. This screenshot captures only the left and right master audio outputs. In an actual studio recording session, we could get a clearer understanding of which instruments play at which times by soloing different parts; the soloing feature would allow us as audio engineers to easily identify different instruments, their musical timbres, and the pitches at which they play. In addition, different harmonic layers can be identified across the instruments: the harmonic sections of this track are the single melodies accompanied by one or more harmonic parts or instruments. Because this is a master-send recording, there are various large pitch shifts between each section of the waveform. The different-sized waves within each pitched note indicate the velocity of each sound being picked up, where the velocity of each sound can be identified by the rate at which the recorded waveform changes position.
The above EQ image demonstrates the different frequencies at which the combination of sounds that makes up the track Heather by Conan Gray is captured. The graphical representation of the EQ shows that the overall recording has a generally high level, with the majority of the audio frequencies sitting above 50 dB. It can also be seen that there is a high amplitude at the lower end, around the 15 dB level between 20 Hz and 200 Hz. In this track, the boost between 20 Hz and 200 Hz can be attributed to lower-end-sounding instruments such as drums, keys, and bass instruments. Referring back to the recording and the preceding information, the higher-end frequencies between 6 kHz and 20 kHz can be identified as slight background noise. On the other hand, the mid-range frequencies between 300 Hz and 3 kHz represent the general spectrum space used by most of the instruments.
John’s Comparison:
When comparing my first song to Shiron's first song, I noticed that at the beginning both of our amplitudes were small. After a few seconds the sizes started to change; however, the amplitude of Spanish Joint stays at a constant size until the end of the recording, whereas mine gets a lot bigger with many peaks, due to the fact that the track is a rap song. When the rap parts and the hi-hats kick in, the amplitude increases immediately, as the rap and the drums get louder and are pushed to the front of the mix while the other sounds sit behind them. However, when comparing my first song to Pedro's first recording, the amplitude of his live bird recording isn't as big as either mine or Shiron's, as its waveform stays at a fairly consistent level. Its dynamic range isn't very high, considering it was recorded outdoors where other sounds were captured as well, such as the wind and people's voices in the background. Other than that, there was nothing else that I could find.
Shiron’s Comparison:
I would like to start my comparison with John's first song, "Flashy" by Daddyphatsnaps. This is a rap song with an 8-bar introduction that sits at a low amplitude with sudden high-amplitude waveforms. When the verse starts after the introduction, the waveform rises notably to a high amplitude, and it shows a high dynamic range as well.
"Spanish Joint" by D'Angelo, which is my first waveform and frequency analysis for this project, raises its amplitude gently and gradually, as it starts with a Brazilian guitar lick at a very low amplitude. John's first song shows a high dynamic range about 8 bars after the song starts. That song has a deep low end with lower frequencies, and the wavelengths of those low frequencies are long. As a result, we can hear those frequencies better than the high frequencies from a long distance, while the velocity of the sound depends on the temperature and the properties of the medium.
Pedro's waveforms are comparatively lower, as they sit at a low amplitude with sudden changes. The analyses done by me and John are quite different from this one, as his is a real-life recording. The pitch and frequencies of the birds are quite high compared to a song, and the timbre of the birds' sound produces different harmonics and overtones, which can clearly be seen in his second screenshot. The recording's energy is concentrated in the low frequencies, as shown in the third picture, the EQ graph.
Pedro's Comparison
To conclude this project, I will compare my teammates' selected track analyses to mine and evaluate the differences and similarities between the tracks. In order to achieve my desired HD grade, I will also need to ensure that my comparison uses all the correct terminology to thoroughly describe the tracks' features.
First, I will compare my track to Jhon's track, "Flashy" by Daddyphatsnaps. At the beginning of Jhon's chosen track there are mainly mid-to-high frequencies to be heard, due to the instruments being played. The waveform and its amplitude then gradually increase throughout the song because of changes in rhythmic features and the addition of instruments and vocals across the different sections. In terms of the vocals within the track, there is a very consistent dynamic range, with a timbre sitting between 500 Hz and 2 kHz.
Based on what can be evaluated from the original track, it can be seen that the drum section uses compression with a high attack setting. In comparison to my track, what can be heard is that both mine and Jhon's tracks contain a wide range of frequencies, owing to the variation in pitches, dynamic ranges, and timbres of the different instruments used and the overtones that can be heard.
Based on the spectrograms that have been inserted into the analysis section of the tracks, there are also clear visual representations of the different frequency analyses throughout the tracks. It can also be seen that the effects and filtering techniques used in both tracks help ensure a clear and coherent final outcome.
For my second track analysis, I will be comparing Shiron's track, "Spanish Joint" by D'Angelo. Based on the visual representation of the audio spectrum inserted into the analysis section of Shiron's first chosen track, it can be seen that the song's structure is quite similar to my chosen track, with a gradual amplitude change throughout. When listening to the track and analysing the most common frequencies used, it becomes clear that some of the high frequencies heard come from the hi-hats of the drum pattern.
The timbre of the different instruments has an overall effect on the shape of the audio wave that is produced. Furthermore, the frequency and dynamic range of the various instruments being used can result in the production of overtones. This compares to my track in the way that there are very loud frequencies, resulting from high-velocity notes being played, which can be seen in the EQ images inserted into both analysis sections. In terms of the images inserted by Shiron, image 3 demonstrates the use of different frequency bands that help identify the different pitches produced by different instruments.
Glossary of Terms
Waveform: This is simply an image that shows a recording of an audio source; it shows amplitude changes over a period of time. Waveforms give producers a visual idea of what's being recorded, letting them judge whether changes need to be made to the recorded audio (Techterms, n.d.).
Wavelength: In simple terms, this is the size of a wave measured from one peak to the next; it is the distance from the crest of one wave to the crest of the following wave. To find the wavelength of any sound, the formula is wavelength = speed / frequency (Acoustics Today, n.d.). For example, at a speed of sound of 343 m/s, a 343 Hz tone has a wavelength of exactly 1 metre.
Spectrogram: Spectrograms are essentially audio pictures; however, they are not waveforms, as spectrograms cannot be altered or changed. The brighter the colour gets, the more the energy is concentrated around those specific frequencies. This gives us an understanding of the audio's shape and structure (Towards Data Science, 2021).
Amplitude: By definition, this term is the magnitude of a wave's change over a period of time. In an audio setting, it is the change in the volume of a recording from one point to another. An example of this is the volume knob on a radio: by changing the volume, the amplitude is increased or decreased by some amount. This shows you how loud or soft a sound is (BACKTRACKS, n.d.).
Timbre: Timbre, also called timber, is the quality of the auditory sensations produced by the tone of a sound wave. The timbre of a sound depends on its waveform, which varies with the number of overtones, or harmonics, that are present, their frequencies, and their relative intensities (Britannica, n.d.).
Dynamic Range: This term is the ratio of the strongest to the weakest part of a sound, measured in dB. Dynamics are essential when it comes to making music in general, as there needs to be a combination of melody, harmony, and rhythm for the sound to be compelling to listen to and soothing to the ear. However, if a song has too much dynamic range, there will be no balance between the loud and quiet sounds (Levine, 2021).
Frequency: The number of waves that pass a fixed location in a unit of time is referred to as the frequency, measured in hertz. An audio frequency is a periodic vibration whose frequency falls within the hearing range of a typical human being (Britannica, n.d.).
Velocity of Sound: The velocity of sound, also known as the speed of sound, is the distance travelled by a sound wave per unit of time as it propagates through an elastic medium (Wikipedia, 2022).
Harmonics: A harmonic is a wave with a frequency that is a positive integer multiple of the fundamental frequency, the frequency of the original periodic signal, such as a sinusoidal wave. The original signal is also called the 1st harmonic; the other harmonics are known as higher harmonics. As all harmonics are periodic at the fundamental frequency, the sum of harmonics is also periodic at that frequency. The set of harmonics forms a harmonic series (Wikipedia, 2022).
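As a quick worked example of this definition, the following sketch lists the first five harmonics of an assumed 110 Hz fundamental:

# Minimal sketch: harmonics are integer multiples of the fundamental.
fundamental = 110.0  # Hz (the 1st harmonic); example value
harmonics = [n * fundamental for n in range(1, 6)]
print(harmonics)  # [110.0, 220.0, 330.0, 440.0, 550.0]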
Acoustic Envelope: An envelope describes the evolution of a sound over time. The most common type is the ADSR envelope (Attack, Decay, Sustain and Release), mainly used to control the loudness of a sound or song. The Attack stage determines how quickly a sound reaches full volume, or its peak, before the Decay stage begins. The Decay stage determines how long the sound drops from its peak until it reaches the Sustain stage. The Sustain stage determines the sound's volume for its entire hold time between the Decay and Release stages. The Release stage determines the speed at which the sound ends; the release time can be either long or short (MasterClass, 2022).
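To illustrate the four stages, here is a minimal sketch of a linear ADSR envelope generator; all the stage times and the sustain level are example values, not taken from any track in this project.

# Minimal sketch: a linear ADSR envelope. Times in seconds, sustain 0..1.
import numpy as np

def adsr(attack, decay, sustain, release, hold, rate=44100):
    a = np.linspace(0, 1, int(attack * rate))          # rise to peak
    d = np.linspace(1, sustain, int(decay * rate))     # fall to sustain level
    s = np.full(int(hold * rate), sustain)             # hold while note is on
    r = np.linspace(sustain, 0, int(release * rate))   # fade out after release
    return np.concatenate([a, d, s, r])

env = adsr(attack=0.01, decay=0.1, sustain=0.7, release=0.3, hold=0.5)
# Multiply `env` sample-by-sample with a tone of the same length to shape it.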
Pitch: Pitch is simply defined as the perceived highness or lowness of the vibration that an instrument makes. Pitch and timbre together define how a note sounds, and when pitch is combined with duration, melodies are made. There are 5 types of pitch. Relative Pitch allows a person to identify a note in a scale if they have another note as a reference point, while Perfect (or Absolute) Pitch is when a person can easily identify any musical note just from hearing it, without needing any reference whatsoever. A Sharp Pitch is a pitch that is too high for a specific note; it can be corrected by re-tuning your instrument or adjusting your technique. A Flat Pitch, on the other hand, is too low for a specific note, the complete opposite of a Sharp Pitch. The last type is a Diatonic Pitch, which is part of a major or minor scale; for example, in a C major scale, the notes C, D, E, F, G, A and B are all diatonic pitches (MasterClass, 2022).
Distortion: Distortion refers to any deviation from the original, desired waveform. There are three main types. Fuzz was made famous by rock bands like the Rolling Stones and legendary guitarist Jimi Hendrix; it gave their guitars a buzzing tone reminiscent of reed-based instruments like the saxophone. Exciters are frequency-based distortion units that allow producers to 'excite' certain frequencies by creating upper harmonics that weren't present in the original signal, making the sound brighter and more detailed. The third and final type is saturation, which occurs when a piece of circuitry is driven almost to the point of distortion; it is mainly used to add subtle harmonics and warmth to a sound (Splice, 2020).
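As a simple illustration of saturation specifically, the sketch below soft-clips a signal with a tanh curve; the drive value is an example and this is not any particular plugin's algorithm.

# Minimal sketch: saturation-style distortion via soft clipping with tanh.
# `drive` pushes the signal harder into the curve, adding harmonics/warmth.
import numpy as np

def saturate(samples, drive=4.0):
    return np.tanh(drive * samples) / np.tanh(drive)  # normalized soft clip

tone = np.sin(2 * np.pi * 220 * np.arange(44100) / 44100)  # 220 Hz test tone
warm = saturate(tone)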
Overtones: In acoustics, an overtone is a tone sounding above the fundamental tone. When a string or air column vibrates as a whole, it produces the fundamental, or first harmonic; if it vibrates in sections, it produces overtones, or harmonics. The listener normally hears the fundamental pitch clearly; with concentration, the overtones may be heard (Britannica, n.d.).
RMS: This stands for Root Mean Square, a metering measurement of the average loudness of an audio track within a rough window of around 300 milliseconds. The displayed value is an average of the audio signal, which gives a more accurate picture of the perceived loudness of the track for the average listener. Utilizing metering tools to visualize the average RMS of a track helps with avoiding distortion (Modrall, 2021).
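To show how the roughly 300 ms window works in practice, here is a minimal sketch of a windowed RMS measurement in dBFS, assuming a mono signal `data` at sample rate `rate` with samples between -1 and 1.

# Minimal sketch: RMS over ~300 ms windows, reported in dBFS.
import numpy as np

window = int(0.3 * rate)                       # ~300 ms window
frames = len(data) // window
rms = np.array([
    np.sqrt(np.mean(data[i * window:(i + 1) * window] ** 2))
    for i in range(frames)
])
rms_db = 20 * np.log10(rms + 1e-12)            # convert to dBFS
print(f"Average RMS: {rms_db.mean():.1f} dBFS")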
Filtering: This is essentially using tools called filters to shape the tone of a song into what we want. Filters can remove certain frequencies, or isolate certain frequencies so that they can be boosted. The two most common filters are Low-Pass Filters and High-Pass Filters, or LPF and HPF for short. LPFs are mainly used to isolate bass, create warmth in a song by removing harsher high frequencies, preserve the fundamental frequencies of a sound while removing harmonics, and, last but not least, create low-shelf filters. HPFs, on the other hand, are used to remove rumble and any other sounds below the lowest fundamental frequency of a sound, to remove basslines and kicks when sampling and mixing, and to create tension before a bass drop arrives so that there is more impact when the low end returns (Producer Hive, n.d.).
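As a small illustration, here is a minimal sketch of an HPF and an LPF built with SciPy Butterworth filters; the 80 Hz and 8 kHz cutoffs are example values only.

# Minimal sketch: simple high-pass and low-pass filtering with SciPy.
# e.g. an 80 Hz HPF to remove rumble, an 8 kHz LPF to tame harsh highs.
from scipy import signal

def highpass(data, rate, cutoff=80.0, order=4):
    sos = signal.butter(order, cutoff, btype="highpass", fs=rate, output="sos")
    return signal.sosfilt(sos, data)

def lowpass(data, rate, cutoff=8000.0, order=4):
    sos = signal.butter(order, cutoff, btype="lowpass", fs=rate, output="sos")
    return signal.sosfilt(sos, data)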
Frequency analysis: Normally, sound and vibration phenomena occur with specific frequency characteristics. Multiple frequency components coexist in complex patterns. Determining the respective levels of these frequency components is called frequency analysis. (Support Room, n.d.)
Throughout this project, the main aim was to understand and evaluate the musicality of two of our chosen tracks. We were required to use discovery and learning techniques to clearly explain and analyse our findings when describing the tracks' features. I really enjoyed working on this group project with Jhon and Shiron, as there were many things we were able to learn from each other's prior knowledge. Despite not being physically present during weeks 1, 2, and 3 at the beginning of the term, I was still able to fully understand the project brief and quickly come up with a selected track and evaluation. It was a fantastic experience to be able to discuss our findings with my group members and think about how we were describing our tracks while including all the glossary terminology learnt in class.
I personally believe it was a great learning curve to have the opportunity to not only learn how to closely define our track but also conduct in-depth research of the different glossary terms used throughout the descriptive practice. At the beginning of my analysis process, I personally found it quite difficult to understand all the different glossary terms, but with further research, I was then successfully able to apply these terms in order to analyse my chosen track.
My next steps, upon receiving feedback from my peers and my lecturer, will be to understand what could be done better for future project submissions and to apply the feedback received. However, what I personally think I could do better is to collect further extensive research on each of the keywords and try and identify multiple tracks that present the same or different features. This will allow me to go into an in-depth analysis process and support me when writing up my comparison section.
Overall, in terms of the transferable skills learned and applied within this project, I personally feel that the ones best reflected have been critical thinking, cognitive outsourcing, and time management. I believe these skills were reflected in this project because I was required to collect extensive research on all the glossary terms used and then apply what I had learnt to my two chosen tracks in a very short period of time. However, I definitely think it was an incredible learning experience, and I am looking forward to project two.
Acoustics Today. (n.d.). The World Through Sound: Wavelength. Retrieved from: https://acousticstoday.org/wavelength/
BACKTRACKS. (n.d.). Amplitude Definition. Retrieved from: https://backtracks.fm/resources/podcast-dictionary/amplitude
Britannica. (n.d.). Timbre. Retrieved from: https://www.britannica.com/science/timbre
Britannica. (n.d.). Overtone. Retrieved from: https://www.britannica.com/science/overtone
Britannica. (n.d.). Frequency. Retrieved from: https://www.britannica.com/science/frequency-physics
MasterClass. (2022). ADSR Envelopes Explained: 4 Stages of an ADSR Envelope. Retrieved from: https://www.masterclass.com/articles/adsr-envelope-explained#4-stages-of-an-adsr-envelope
MasterClass. (2022). Pitch in Music Explained: 5 Examples of Pitch in Music. Retrieved from: https://www.masterclass.com/articles/pitch-in-music-explained#how-is-pitch-measured
Modrall, L. (2021). RMS Level for Mastering: Achieving the Perfect Loudness. Retrieved from: https://emastered.com/blog/rms-level-for-mastering
Producer Hive. (n.d.). Audio Filter Types (Explained Simply). Retrieved from: https://producerhive.com/music-production-recording-tips/audio-filter-types/
Splice. (2020). What is distortion? | Distortion’s main types and use cases in music. Retrieved from: https://splice.com/blog/effects-101-distortion/
Support Room. (n.d.). Frequency Analyzers. Retrieved from: https://rion-sv.com/support/st_frequency_en.aspx
Towards Data Science. (2021). Learning From Audio Spectrograms. Retrieved from: https://towardsdatascience.com/learning-from-audio-spectrograms-37df29dba98c
Wikipedia. (2022). Harmonic. Retrieved from: https://en.wikipedia.org/wiki/Harmonic
Wikipedia. (2022). Speed of Sound. Retrieved from: https://en.wikipedia.org/wiki/Speed_of_sound