Main Objective
The purpose of the echo function is to let the user enhance audio tracks with a delay effect that repeats sound segments, adding depth and texture to the music. The user can control the timing and intensity of the echoes, as well as the number of repetitions, which allows for dynamic transitions and atmosphere building during live performances or mixes. The script takes a WAV file as input and applies the echo effect to it, giving the user freedom to choose exactly which sounds to manipulate.
Script Algorithm for Initial Iteration
Note: This iteration does NOT include live-audio manipulation.
Error checking:
The echo2.py script begins by checking that the path to the WAV file chosen by the user exists, and stops the program if it does not. Further checks throughout the script make sure that the given file can be processed. For example, the waveform plot is only supported for mono input, not stereo, so the plotting function checks that the input is a mono file.
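The two checks described above can be sketched as follows. This is a minimal illustration using only the standard library, not the code from echo2.py; the function name validate_wav and its require_mono parameter are hypothetical.

```python
import os
import wave


def validate_wav(path, require_mono=True):
    """Return the channel count of a WAV file, raising if it is unusable.

    Hypothetical helper: the name and signature are assumptions.
    """
    # Stop early if the path does not point at an existing file.
    if not os.path.isfile(path):
        raise FileNotFoundError(f"File not found: {path}")

    # wave.open raises wave.Error if the file is not valid PCM WAV data.
    with wave.open(path, "rb") as wav_file:
        channels = wav_file.getnchannels()

    # The plotting step only supports single-channel input.
    if require_mono and channels != 1:
        raise ValueError(f"Expected a mono file, got {channels} channels")
    return channels
```

Raising exceptions instead of printing and exiting keeps the helper reusable; a caller can catch the error and stop the program with a message of its choice.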
Waveform plotting:
The script then processes the input file by checking that the input is a mono file, reading the audio data, converting it to a numerical signal, and generating a time axis based on the sample rate. It then plots the waveform of the audio signal according to the plotting function that follows. Image 1 below is an example of an outputted waveform for a mono WAV file that the program provides before applying the echo effect.
Image 1: Waveform of original WAV audio file
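As a rough sketch of the plotting step, the time axis can be derived by dividing each sample index by the sample rate. The function names below are hypothetical, and the use of NumPy and Matplotlib is an assumption about the tooling rather than a description of the actual script.

```python
import numpy as np


def time_axis(num_samples, sample_rate):
    """Return the time in seconds of each sample index."""
    return np.arange(num_samples) / sample_rate


def plot_waveform(signal, sample_rate, title="Waveform"):
    """Plot amplitude against time for a one-dimensional (mono) signal."""
    # Imported here so that time_axis works without Matplotlib installed.
    import matplotlib.pyplot as plt

    signal = np.asarray(signal)
    if signal.ndim != 1:
        raise ValueError("Only mono (1-D) signals can be plotted")

    plt.figure(figsize=(10, 4))
    plt.plot(time_axis(len(signal), sample_rate), signal)
    plt.title(title)
    plt.xlabel("Time (s)")
    plt.ylabel("Amplitude")
    plt.tight_layout()
    plt.show()
```

For a file sampled at 44100 Hz, sample 44100 lands at exactly 1.0 s on the horizontal axis.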
Adding echo with convolution:
The script then adds an echo to the signal using convolution. We achieve this by first normalizing the audio to prevent clipping, which is done by dividing the signal by its maximum amplitude so that the audio values lie in the range [-1, 1]. An impulse response is then generated from the delay and feedback values specified by the user and convolved with the original signal to create the echo effect. Finally, the function writes the modified signal to a new WAV file and plots the waveform of the signal with the echo effect. As Image 2 below shows, the signal is also scaled back to its original amplitude.
Image 2: Waveform of WAV audio file with echo after using convolution
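The convolution step might look like the sketch below. Two details are assumptions, since the description does not spell them out: the impulse response is taken to be the direct sound plus a single echo scaled by the feedback value, and "scaled back to its original amplitude" is interpreted as matching the original peak. The function name add_echo is hypothetical.

```python
import numpy as np


def add_echo(signal, sample_rate, delay_s, feedback):
    """Add one echo to a mono signal by convolving with an impulse response."""
    signal = np.asarray(signal, dtype=np.float64)
    peak = np.max(np.abs(signal))
    if peak == 0:
        # A silent signal has nothing to echo; avoid dividing by zero.
        return signal.copy()

    # Normalize so that every sample lies in [-1, 1].
    normalized = signal / peak

    # Convert the delay from seconds to a whole number of samples.
    delay_samples = int(round(delay_s * sample_rate))

    # Assumed impulse response: 1 at index 0 (direct sound) and `feedback`
    # at index `delay_samples` (the echo).
    impulse = np.zeros(delay_samples + 1)
    impulse[0] = 1.0
    impulse[delay_samples] += feedback

    # Full convolution: the output is delay_samples longer than the input.
    echoed = np.convolve(normalized, impulse)

    # Assumed rescaling: make the loudest output sample match the input peak.
    echoed_peak = np.max(np.abs(echoed))
    if echoed_peak > 0:
        echoed = echoed / echoed_peak * peak
    return echoed
```

With a sample rate of 44100 Hz and a delay of 0.5 s, the echo arrives 22050 samples after the original sound, and the output is 22050 samples longer than the input.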
Script Algorithm for Final Iteration
Error Checking:
At the start of the script, the program ensures the selected WAV file exists on the system by checking its file path. This prevents errors later in the processing pipeline by verifying input validity upfront. If the file is found, a "File found!" message is displayed, confirming readiness for processing. If the file is missing, an error message "File not found!" is shown, halting further operations until the issue is resolved.
WAV File Processing:
The program reads the contents of the selected WAV file, extracting critical audio properties such as the signal data, sample rate, number of channels, and sample width. While stereo files are accepted, they cannot be plotted due to their dual-channel structure. Mono files, however, can be fully utilized for visual analysis and processing.
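Reading these properties can be sketched with the standard wave module. The function name read_wav is hypothetical, and the restriction to 16-bit samples is an assumption made to keep the example short.

```python
import wave

import numpy as np


def read_wav(path):
    """Read a 16-bit PCM WAV file and return its samples and properties."""
    with wave.open(path, "rb") as wav_file:
        channels = wav_file.getnchannels()
        sample_width = wav_file.getsampwidth()  # bytes per sample
        sample_rate = wav_file.getframerate()
        frames = wav_file.readframes(wav_file.getnframes())

    if sample_width != 2:
        raise ValueError("This sketch only handles 16-bit samples")

    # WAV stores samples as little-endian integers, interleaved by channel.
    signal = np.frombuffer(frames, dtype="<i2")
    if channels > 1:
        # One row per frame, one column per channel.
        signal = signal.reshape(-1, channels)
    return signal, sample_rate, channels, sample_width
```

A mono file yields a one-dimensional array that can be plotted directly, while a stereo file yields an array with two columns, one per channel.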
Echo Effect using Convolution:
An echo effect is applied to the audio signal using a convolution process. The effect is controlled by two user-defined parameters: delay, which specifies the time between the original sound and its echo, and feedback, which determines the intensity of the echo. The function normalizes the audio, applies the convolution, and ensures that the output signal has the same length as the input. It supports both mono and stereo files. This function is shown below in Image 3.
Image 3: Screenshot of section that utilizes convolution to create echo effect
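For readers without access to the screenshot, the following is an independent sketch of a function with the properties described above, not a transcription of Image 3. It assumes a single-echo impulse response, 16-bit samples, and truncation of the convolution tail to keep the output length equal to the input; the name echo_same_length is hypothetical.

```python
import numpy as np


def echo_same_length(signal, sample_rate, delay_s, feedback):
    """Apply a single echo to 16-bit mono or stereo audio, keeping its length."""
    samples = np.asarray(signal, dtype=np.float64)
    is_mono = samples.ndim == 1
    if is_mono:
        samples = samples[:, np.newaxis]  # treat mono as one column

    # One shared peak for all channels preserves the stereo balance.
    peak = np.max(np.abs(samples)) if samples.size else 0.0
    if peak == 0:
        return np.asarray(signal, dtype=np.int16).copy()

    delay_samples = int(round(delay_s * sample_rate))
    impulse = np.zeros(delay_samples + 1)
    impulse[0] = 1.0                    # direct sound
    impulse[delay_samples] += feedback  # assumed: one echo tap

    num_frames = samples.shape[0]
    out = np.empty_like(samples)
    for ch in range(samples.shape[1]):
        # Convolve each channel separately, then drop the tail (assumption).
        out[:, ch] = np.convolve(samples[:, ch] / peak, impulse)[:num_frames]

    # Restore the original scale and clip to the 16-bit range.
    out = np.clip(np.rint(out * peak), -32768, 32767).astype(np.int16)
    return out[:, 0] if is_mono else out
```

Truncating the tail means an echo that would start after the end of the clip is lost; padding the input with silence before processing would be one way to keep it.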
Audio Playback:
The processed audio is converted into a format compatible with the "simpleaudio" library so that it can be played back directly. Users can hear the applied echo effect from within the program, without exporting or converting the file first.
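A minimal sketch of this conversion is shown below. It assumes 16-bit output, and the helper names to_int16_buffer and play are hypothetical; simpleaudio is a third-party package, so it is imported only when playback is requested.

```python
import numpy as np


def to_int16_buffer(signal):
    """Round and clip a signal to the 16-bit range as a contiguous array."""
    values = np.rint(np.asarray(signal, dtype=np.float64))
    clipped = np.clip(values, -32768, 32767)
    return np.ascontiguousarray(clipped.astype(np.int16))


def play(signal, sample_rate):
    """Start playback of mono (N,) or stereo (N, 2) audio."""
    import simpleaudio as sa  # third-party, imported lazily

    buffer = to_int16_buffer(signal)
    channels = 1 if buffer.ndim == 1 else buffer.shape[1]
    # Arguments: audio data, number of channels, bytes per sample, sample rate.
    return sa.play_buffer(buffer, channels, 2, sample_rate)
```

play_buffer returns immediately with a play object; calling wait_done() on it blocks until the audio has finished.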
GUI Application:
The program incorporates a graphical interface built with Tkinter. It includes a "Load WAV File" button that opens a file browser for selecting the input file, and two sliders that set the delay and feedback parameters, each ranging from 0 to 1. The GUI removes the need to modify the code directly, providing an intuitive way to control and experiment with audio effects. The GUI is shown below in Image 4.
Image 4: GUI Interface for second iteration
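A GUI with this layout could be assembled roughly as follows. This is a sketch, not the project's code: EchoSettings, make_slider_callback, and build_gui are hypothetical names, the default values of 0.5 are placeholders, and the slider step of 0.01 is an assumption.

```python
from dataclasses import dataclass


@dataclass
class EchoSettings:
    """Shared state that the widgets write to and the audio code reads."""
    path: str = ""
    delay: float = 0.5
    feedback: float = 0.5


def make_slider_callback(settings, name):
    """Return a callback that stores a slider value on `settings`."""
    def on_change(value):
        # Tkinter passes the slider value as a string.
        setattr(settings, name, float(value))
    return on_change


def build_gui(settings):
    """Create the window; call mainloop() on the returned root to run it."""
    # Imported here so the helpers above can be used without a display.
    import tkinter as tk
    from tkinter import filedialog

    root = tk.Tk()
    root.title("Echo Effect")

    def load_file():
        chosen = filedialog.askopenfilename(filetypes=[("WAV files", "*.wav")])
        if chosen:
            settings.path = chosen

    tk.Button(root, text="Load WAV File", command=load_file).pack(padx=10, pady=5)

    for name, label in (("delay", "Delay"), ("feedback", "Feedback")):
        scale = tk.Scale(root, label=label, from_=0.0, to=1.0, resolution=0.01,
                         orient=tk.HORIZONTAL,
                         command=make_slider_callback(settings, name))
        scale.set(getattr(settings, name))
        scale.pack(fill=tk.X, padx=10, pady=5)
    return root
```

Running build_gui(EchoSettings()).mainloop() opens the window. Keeping the slider values in a plain settings object means the audio code never has to touch the widgets.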
Continuous Audio Playback:
A separate thread ensures the audio is played continuously while users make real-time adjustments to the delay and feedback sliders. The program automatically updates the audio with the modified settings, providing immediate feedback on the changes. This enhances the interactivity and responsiveness of the application, making it ideal for experimenting with audio effects.
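One way to structure such a thread is sketched below. The class PlaybackLoop and its render and play arguments are hypothetical: render stands for any function that applies the echo with the given parameters, and play for any function that blocks until the audio has finished. Like the behavior described above, this sketch re-renders the whole signal on every pass.

```python
import threading


class PlaybackLoop:
    """Repeatedly render and play audio using the latest slider values."""

    def __init__(self, render, play, delay=0.5, feedback=0.5):
        self._render = render  # hypothetical: (delay, feedback) -> audio
        self._play = play      # hypothetical: blocks until playback ends
        self._params = (delay, feedback)
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def set_params(self, delay, feedback):
        """Called from the GUI thread whenever a slider moves."""
        with self._lock:
            self._params = (delay, feedback)

    def start(self):
        self._thread.start()

    def stop(self):
        """Ask the loop to finish after the current pass and wait for it."""
        self._stop.set()
        self._thread.join()

    def _run(self):
        while not self._stop.is_set():
            # Copy the parameters under the lock, then work without it.
            with self._lock:
                delay, feedback = self._params
            self._play(self._render(delay, feedback))
```

New slider values take effect at the start of the next pass, so a change made during playback is heard only after the current pass ends and the signal has been rendered again.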
Challenges and Results
The function updates the terminal with the new delay and feedback values whenever the sliders are adjusted, allowing for real-time feedback on the changes. The audio plays continuously regardless of the length of the signal, so users can hear the effect of their adjustments when moving the sliders. However, there is a noticeable delay between adjusting the sliders and hearing the changes, especially when working with longer audio signals. For example, with a 7-second audio clip, there was about a 30-second delay before the effect was audible. This lag could be due to the time it takes for the convolution operation to process the entire signal, especially since the current implementation processes the full audio file before playback continues.
The delay is likely caused by convolving the entire signal in one pass, which is computationally intensive. In addition, audio playback and convolution appear to run in the same thread, so playback may be interrupted while the signal is reprocessed. One potential way to reduce latency is to process the audio in smaller blocks, so that playback can be updated while the rest of the signal is still being processed. A more efficient algorithm, such as fast convolution based on the Fast Fourier Transform (FFT), could also speed up the processing. Another option is to move the convolution into its own thread, which could enable smoother and more responsive playback.
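The FFT-based option can be illustrated with a short sketch. It has not been measured against the project's code, so any speedup is an expectation rather than a result; the function name fft_convolve is hypothetical.

```python
import numpy as np


def fft_convolve(signal, impulse):
    """Linear convolution of two 1-D arrays computed with the FFT."""
    signal = np.asarray(signal, dtype=np.float64)
    impulse = np.asarray(impulse, dtype=np.float64)
    out_len = len(signal) + len(impulse) - 1

    # Zero-pad to a power of two of at least out_len, so the circular
    # convolution computed by the FFT equals the linear convolution.
    fft_len = 1 << (out_len - 1).bit_length()

    # Multiplying the spectra is equivalent to convolving in time.
    spectrum = np.fft.rfft(signal, fft_len) * np.fft.rfft(impulse, fft_len)
    return np.fft.irfft(spectrum, fft_len)[:out_len]
```

Direct convolution of N samples with an impulse response of length M costs on the order of N times M operations, while the FFT approach costs on the order of (N + M) log(N + M). The gap grows with the delay, because a longer delay means a longer impulse response. SciPy provides the same operation as scipy.signal.fftconvolve.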
DSP Techniques Utilized
This program uses several DSP techniques to modify audio and add an echo effect. First, the signal is normalized so that the audio amplitudes are scaled between -1 and 1, preventing clipping during processing. The main technique is convolution, which applies an impulse response based on user-defined delay and feedback values to simulate an echo. For stereo audio, each channel is processed separately to maintain the original stereo balance, and the channels are merged back together after the effect is applied. The program also uses the sample rate to convert the delay into the correct number of samples, ensuring accurate timing of the echo. Finally, the processed signal is scaled back to its original amplitude, clipped to fit the 16-bit audio range, and converted for playback. Together, these techniques let the user adjust the effect while the audio plays, providing a hands-on demonstration of DSP concepts.
All images used on this page are our own.