Companion Audio Examples and Descriptions: All files are WAV files at a 44.1 kHz sampling rate and 16-bit depth*
Best experienced with headphones! Please allow a minute or two for the audio files to load.
A compressor reduces a sound's dynamic range, bringing the louder parts of the audio closer in amplitude to the quieter parts. Without make-up gain, this yields audio that is quieter overall and more constant in amplitude from beginning to end. There are both digital and analog compressors, and the compressors used to generate these examples have four variable settings: threshold, ratio, attack time, and release time. Threshold determines the amplitude at which the compressor begins compressing: the lower the threshold, the less amplitude is needed for the compressor to engage. Ratio determines how much the compressor reduces the amplitude of audio above the threshold. A ratio of 2:1 means that for every 2 decibels the original audio would rise above the threshold, the compressor allows it to rise only 1 decibel above. Attack time is how quickly the compressor engages when audio crosses the threshold, and is often measured in milliseconds. Release time is how quickly the compressor disengages after the audio dips back below the threshold, and is also often measured in milliseconds.
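The four settings described above can be sketched in a few lines of Python. This is a minimal illustration of a generic feed-forward compressor, not the compressors used to generate these examples; the function names, the sample-by-sample level detector, and the one-pole attack/release smoothing are all assumptions made for the sketch.

```python
import numpy as np

def compressed_level_db(level_db, threshold_db, ratio):
    """Static curve: output level (dB) for a given input level (dB)."""
    level_db = np.asarray(level_db, dtype=float)
    # Amount by which the input exceeds the threshold (0 when below it).
    over_db = np.maximum(level_db - threshold_db, 0.0)
    # Above the threshold, only 1/ratio of the excess is let through.
    return level_db - over_db * (1.0 - 1.0 / ratio)

def smooth_gain_reduction(reduction_db, sample_rate, attack_ms, release_ms):
    """One-pole smoothing of the gain reduction (assumed design, not from the source)."""
    attack_coeff = np.exp(-1.0 / (sample_rate * attack_ms / 1000.0))
    release_coeff = np.exp(-1.0 / (sample_rate * release_ms / 1000.0))
    smoothed = np.zeros(len(reduction_db))
    state = 0.0
    for i, target in enumerate(reduction_db):
        # Rising reduction means the compressor is engaging (attack);
        # falling reduction means it is letting go (release).
        coeff = attack_coeff if target > state else release_coeff
        state = coeff * state + (1.0 - coeff) * target
        smoothed[i] = state
    return smoothed

def compress(signal, sample_rate, threshold_db, ratio, attack_ms, release_ms):
    """Apply the sketch compressor to a mono signal, with no make-up gain."""
    signal = np.asarray(signal, dtype=float)
    level_db = 20.0 * np.log10(np.abs(signal) + 1e-10)  # avoid log(0)
    reduction_db = level_db - compressed_level_db(level_db, threshold_db, ratio)
    smoothed_db = smooth_gain_reduction(reduction_db, sample_rate, attack_ms, release_ms)
    return signal * 10.0 ** (-smoothed_db / 20.0)

# With a 2:1 ratio and a -15 dB threshold, an input 2 dB over the
# threshold (-13 dB) comes out 1 dB over it (-14 dB).
print(compressed_level_db(-13.0, threshold_db=-15.0, ratio=2.0))
```

The static curve alone captures threshold and ratio; attack and release only affect how quickly the gain reduction follows that curve over time.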
Comp4c is a digital compressor effect with four variable "knobs": threshold, ratio, attack time, and release time. The knob ranges are: Threshold (-30 to 0), Ratio (1:1 to 5:1), Attack Time (1ms to 40ms), Release Time (1ms to 40ms). Each audio example below that uses the comp4c effect was produced by a model trained on the full range of the knobs, but each example is output audio from a single setting. For ease of comparison, every example using the comp4c effect was generated with the following settings:
Threshold = -15, Ratio = 3, Attack Time = 20.5ms, Release Time = 20.5ms
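As a quick arithmetic check, these settings coincide with the midpoint of each comp4c knob range. The text does not say the settings were chosen this way, so this is an observation, not a stated rule. The dictionary and key names below are hypothetical, chosen only for this sketch.

```python
# Knob ranges as listed for comp4c (low, high); key names are illustrative.
knob_ranges = {
    "threshold": (-30.0, 0.0),
    "ratio": (1.0, 5.0),
    "attack_ms": (1.0, 40.0),
    "release_ms": (1.0, 40.0),
}

# Midpoint of each range.
midpoints = {name: (low + high) / 2.0 for name, (low, high) in knob_ranges.items()}

print(midpoints)
# {'threshold': -15.0, 'ratio': 3.0, 'attack_ms': 20.5, 'release_ms': 20.5}
```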
Comp_one is also a digital compressor effect with the same four "knobs" (threshold, ratio, attack time, and release time), but they are fixed rather than variable. The knobs are locked to a single setting with the expectation that learning one setting would be an easier task than learning the full range of the comp4c effect, and would thus lead to higher-quality results. Each audio example below that uses the comp_one effect was generated with the following settings:
Threshold = -25, Ratio = 4, Attack Time = 5ms, Release Time = 20ms
The Leslie speaker is a combined amplifier and loudspeaker that projects the signal from an instrument and modifies the sound by rotating a baffle chamber ("drum") in front of the loudspeakers. A similar effect is provided by a rotating system of horns in front of the treble driver. The Leslie cabinet used to generate the dataset for these examples contained two speakers: a horn and a woofer. The horn speaker is responsible for amplifying the high frequencies, and the woofer is responsible for amplifying the low frequencies. A musician can control the rotation speed using either a pedal or an external switch that alternates between a slow and a fast setting. The slow rotational setting is commonly referred to as the "chorale" or "chorus" effect, and the fast rotational setting is referred to as the "tremolo" effect.
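To give a rough feel for the difference between the slow and fast settings, the sketch below applies a periodic volume change at the rotation rate. This is a deliberately crude approximation: a real Leslie cabinet also produces Doppler pitch variation and treats the horn and woofer separately, none of which is modeled here. The function name, the modulation depth, and the rotation rates of 0.8 Hz and 6.5 Hz are illustrative assumptions, not measurements of the cabinet used for the dataset.

```python
import numpy as np

def rotary_amplitude_modulation(signal, sample_rate, rotation_hz, depth=0.5):
    """Scale the signal by a gain that dips once per rotation (crude sketch)."""
    signal = np.asarray(signal, dtype=float)
    t = np.arange(len(signal)) / sample_rate
    # Low-frequency oscillator in the range 0..1.
    lfo = 0.5 * (1.0 + np.sin(2.0 * np.pi * rotation_hz * t))
    # Gain swings between 1 - depth and 1.
    gain = 1.0 - depth * lfo
    return signal * gain

sample_rate = 44100
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2.0 * np.pi * 440.0 * t)  # one second of a 440 Hz test tone

# Assumed rates for illustration only.
slow = rotary_amplitude_modulation(tone, sample_rate, rotation_hz=0.8)  # "chorale"
fast = rotary_amplitude_modulation(tone, sample_rate, rotation_hz=6.5)  # "tremolo"
```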
[Audio examples: 14 pairs of clips, each pair labeled "Predicted" and "Target".]