Exploring Quality and Generalizability for Parameterized Neural Audio Effects
Companion Audio Examples and Descriptions: All files are wav files at 44.1kHz sampling rate, 16-bit depth*
Best Experienced With Headphones! Please allow a minute or two for the audio files to load
Effects
Compressors:
A compressor reduces a sound's dynamic range, bringing the louder parts of the audio closer in amplitude to the quieter parts. Without make-up gain, the result is quieter overall and more constant in amplitude from beginning to end. Compressors can be digital or analog; the compressors used to generate these examples have four variable settings: threshold, ratio, attack time, and release time. Threshold sets the amplitude at which the compressor begins compressing: the lower the threshold, the less amplitude is needed for the compressor to engage. Ratio determines how much the compressor reduces the amplitude of audio above the threshold. A ratio of 2:1 means that for every 2 decibels the original audio would rise above the threshold, the compressor allows it to rise only 1 decibel above. Attack time is how quickly the compressor engages once the audio crosses the threshold, and release time is how quickly it disengages after the audio dips back below the threshold; both are usually measured in milliseconds.
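The four settings described above can be illustrated with a minimal feed-forward compressor sketch. This is not the implementation behind comp4c or comp_one; it assumes the threshold is expressed in decibels relative to full scale, that the level is detected per sample from the absolute value, and that attack and release are modelled as one-pole smoothing of the gain reduction. All function names are hypothetical.

```python
import math

def compressed_level_db(level_db, threshold_db, ratio):
    """Static curve: output level (dB) for a given input level (dB)."""
    if level_db <= threshold_db:
        return level_db  # below the threshold the signal is left unchanged
    # Above the threshold, the overshoot is divided by the ratio.
    return threshold_db + (level_db - threshold_db) / ratio

def smoothing_coeff(time_ms, sample_rate):
    """One-pole coefficient for an attack or release time (assumed model)."""
    return math.exp(-1.0 / (time_ms * 1e-3 * sample_rate))

def compress(samples, sample_rate, threshold_db, ratio, attack_ms, release_ms):
    """Apply the static curve with attack/release smoothing, no make-up gain."""
    attack = smoothing_coeff(attack_ms, sample_rate)
    release = smoothing_coeff(release_ms, sample_rate)
    gain_reduction_db = 0.0  # smoothed gain reduction, always >= 0
    out = []
    for x in samples:
        level_db = 20.0 * math.log10(max(abs(x), 1e-9))
        target = level_db - compressed_level_db(level_db, threshold_db, ratio)
        # Engage with the attack time, disengage with the release time.
        coeff = attack if target > gain_reduction_db else release
        gain_reduction_db = coeff * gain_reduction_db + (1.0 - coeff) * target
        out.append(x * 10.0 ** (-gain_reduction_db / 20.0))
    return out

# The 2:1 example from the text: 2 dB over a -10 dB threshold becomes 1 dB over.
print(compressed_level_db(-8.0, -10.0, 2.0))  # -9.0
```

With a long attack time the first samples of a loud passage pass almost unchanged, and the gain reduction settles toward the value given by the static curve; a signal that stays below the threshold is returned unmodified.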
Comp4c:
Comp4c is a digital compressor effect with four variable "knobs": threshold, ratio, attack time, and release time. The knob ranges are: Threshold (-30 to 0), Ratio (1:1 to 5:1), Attack Time (1ms to 40ms), Release Time (1ms to 40ms). Each comp4c audio example below was produced by a model trained on the full range of the knobs, but the example itself is the output for a single setting. For ease of comparison, every comp4c example was generated with the following settings:
Threshold = -15, Ratio = 3, Attack Time = 20.5ms, Release Time = 20.5ms
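These comparison settings are the midpoint of each comp4c knob range. The short sketch below checks that arithmetic. The normalized knob position in [0, 1] and the names `KNOB_RANGES` and `knob_value` are assumptions made for illustration, not something the original describes.

```python
# Knob ranges as listed for comp4c (the text gives no unit for the threshold).
KNOB_RANGES = {
    "threshold": (-30.0, 0.0),
    "ratio": (1.0, 5.0),
    "attack_ms": (1.0, 40.0),
    "release_ms": (1.0, 40.0),
}

def knob_value(name, position):
    """Map a hypothetical normalized position in [0, 1] to a knob value."""
    lo, hi = KNOB_RANGES[name]
    return lo + position * (hi - lo)

# Position 0.5 on every knob reproduces the comparison settings.
midpoint = {name: knob_value(name, 0.5) for name in KNOB_RANGES}
print(midpoint)
# {'threshold': -15.0, 'ratio': 3.0, 'attack_ms': 20.5, 'release_ms': 20.5}
```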
Comp_one:
Comp_one is also a digital compressor effect with the same four "knobs" (threshold, ratio, attack time, and release time), but they are not variable. The knobs are locked to a single setting, with the expectation that learning this one setting is an easier task than learning the full range of the comp4c effect, and thus leads to higher quality results. Each audio example below using the comp_one effect was generated with the following settings:
Threshold = -25, Ratio = 4, Attack Time = 5ms, Release Time = 20ms
Leslie Cabinet:
The Leslie speaker is a combined amplifier and loudspeaker that projects the signal from an instrument and modifies the sound by rotating a baffle chamber ("drum") in front of the loudspeakers. A similar effect is provided by a rotating system of horns in front of the treble driver. The Leslie cabinet used to generate the dataset for these examples contained two speakers, a horn and a woofer. The horn is responsible for the high frequencies, and the woofer is responsible for the low frequencies. A musician can control the rotation speed using either a pedal or an external switch that alternates between a slow and a fast setting. The slow rotational setting is commonly referred to as the "chorale" or "chorus" effect, and the fast rotational setting is referred to as the "tremolo" effect.
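As a rough illustration of the horn/woofer split and the two rotation speeds, the sketch below divides a signal into a low band and a high band and applies a slow or fast amplitude modulation to each. It is a deliberately simplified stand-in: a real rotating speaker also produces Doppler pitch shifts and room reflections, which are omitted here. The crossover frequency, the rotation rates, the modulation depth, and all names are assumptions, not measurements of the cabinet used for the dataset.

```python
import math

# Placeholder rotation rates in Hz (assumed values for illustration only).
CHORUS_HZ = 0.8
TREMOLO_HZ = 6.7

def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    """Simple one-pole low-pass filter used as a crude crossover."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    y, out = 0.0, []
    for x in samples:
        y = (1.0 - a) * x + a * y
        out.append(y)
    return out

def rotary_sketch(samples, sample_rate, rotation_hz,
                  crossover_hz=800.0, depth=0.5):
    """Return (horn, woofer) bands, each amplitude-modulated at rotation_hz."""
    woofer_band = one_pole_lowpass(samples, crossover_hz, sample_rate)
    horn_band = [x - w for x, w in zip(samples, woofer_band)]
    horn, woofer = [], []
    for n, (h, w) in enumerate(zip(horn_band, woofer_band)):
        phase = 2.0 * math.pi * rotation_hz * n / sample_rate
        # Gain swings between 1 - depth and 1 once per rotation.
        gain = 1.0 - depth * 0.5 * (1.0 + math.sin(phase))
        horn.append(gain * h)
        woofer.append(gain * w)
    return horn, woofer
```

Calling `rotary_sketch(signal, 44100, CHORUS_HZ)` gives the slow setting and `rotary_sketch(signal, 44100, TREMOLO_HZ)` the fast one. With `depth=0.0` the two bands sum back to the input, which is a quick sanity check on the band split.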
Raw Audio Samples - Inputs
Example #1
Example #2
Acapella Example
Speed Results: Baseline, Frozen Layers, Removed Skip Connection
Effect and Settings
Baseline: Comp4c
Predicted
Target
Frozen Layers: Comp4c
Predicted
Target
No Skip Connections: Comp4c
Predicted
Target
Example #1
Example #2
Accuracy Results: Baseline, Comp_one Effect, 10,000 Epochs
Effects
Baseline: Comp4c
Predicted
Target
Comp_one Effect: Comp_one
Predicted
Target
10,000 Epochs: Comp_one
Predicted
Target
Example #1
Example #2
Acapella Audio
Effect and Settings
44.1kHz Sampling Rate: Comp_one
Predicted
Target
16kHz Sampling Rate: Comp_one
Predicted
Target
Acapella Example
Dataset Manipulation: Guitar Note Trained Model - Comp_one Effect
Guitar Note
Predicted
Target
Full Instrumentation
Predicted
Target
Leslie Cabinet Effects: Chorus and Tremolo
Horn Chorus
Predicted
Target
Woofer Chorus
Predicted
Target
Horn Tremolo
Predicted
Target
Woofer Tremolo
Predicted
Target