The focus of our project is sound rendering from various directions -- the opposite of sound localization. Sound rendering is an important tool for adding realism to virtual experiences such as games, movies, and music. Most audio files today have two channels, one for each ear, which creates a fuller, more immersive listening experience. Ever wanted to simulate the experience of an invisible bee flying around your head? With our project, you can.
We created two simulations; both take an input signal and an angle and return a two-channel sound that simulates the signal playing from that angle, but they do so in very different ways. The first applies purely theoretical principles to the input signal, while the second manipulates the input signal using recorded impulse-response data.
We first discuss the theoretical simulation in the Theoretical Audio Models section. There we examine the geometry of the system in two dimensions, then get more specific by incorporating individual head-related transfer functions (HRTFs).
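To make the purely geometric idea concrete, the sketch below delays and attenuates a mono signal differently per ear based only on the source azimuth. This is a minimal illustration rather than our exact model: the function name, the head radius, the Woodworth-style delay formula, and the simple head-shadow gain are all illustrative assumptions.

```python
import numpy as np

SPEED_OF_SOUND = 343.0   # m/s
HEAD_RADIUS = 0.0875     # m, rough average head radius (assumption)

def render_from_angle(signal, azimuth_deg, fs=44100):
    """Return an (N, 2) stereo array: a crude 2D geometric binaural model."""
    az = np.deg2rad(azimuth_deg)              # 0 = straight ahead, +90 = right
    # Interaural time difference (Woodworth-style approximation).
    itd = HEAD_RADIUS * (abs(az) + np.sin(abs(az))) / SPEED_OF_SOUND
    delay_samples = int(round(itd * fs))
    # Crude head shadow: the ear facing away from the source is quieter.
    far_gain = 1.0 - 0.5 * abs(np.sin(az))
    near = np.concatenate([signal, np.zeros(delay_samples)])            # arrives first
    far = far_gain * np.concatenate([np.zeros(delay_samples), signal])  # arrives later
    if azimuth_deg >= 0:          # source on the right
        left, right = far, near
    else:                         # source on the left
        left, right = near, far
    return np.stack([left, right], axis=1)
```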
Next, in the Simulation with Data section, we discuss how we used convolution and experimental data to create a simulation. We recorded impulse responses at various angles around a person wearing microphones near their ears, and we linearly interpolate between the measured angles to estimate impulse responses at angles we did not record. In a second simulation, we convolve the input with a series of impulse responses to create the illusion of sound moving from one angle to another, using Hamming windows to smooth the transition between angles.
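A minimal sketch of this data-driven pipeline is shown below, under assumptions: `hrirs` stands for a dictionary mapping measured azimuths (in degrees) to (N, 2) impulse responses recorded near each ear, and the overlapping Hamming-windowed segments with overlap-add are one way to realize the crossfade, not necessarily our exact implementation. It shows the three steps named above: interpolating between measured angles, convolving with the interpolated responses, and windowing a moving source.

```python
import numpy as np
from scipy.signal import fftconvolve

def interpolate_ir(hrirs, angle):
    """Linearly interpolate between the two nearest measured impulse responses."""
    angles = np.array(sorted(hrirs.keys()), dtype=float)
    angle = angle % 360.0
    hi_idx = np.searchsorted(angles, angle) % len(angles)   # first measured angle >= angle
    lo_idx = (hi_idx - 1) % len(angles)
    lo, hi = angles[lo_idx], angles[hi_idx]
    span = (hi - lo) % 360.0
    if span == 0:
        return hrirs[lo]
    w = ((angle - lo) % 360.0) / span
    return (1 - w) * hrirs[lo] + w * hrirs[hi]

def render_static(signal, hrirs, angle):
    """Convolve a mono signal with the (interpolated) left/right impulse responses."""
    ir = interpolate_ir(hrirs, angle)
    left = fftconvolve(signal, ir[:, 0])
    right = fftconvolve(signal, ir[:, 1])
    return np.stack([left, right], axis=1)

def render_moving(signal, hrirs, start_angle, end_angle, n_segments=32):
    """Sweep a sound between two angles: split the signal into 50%-overlapping
    Hamming-windowed segments, render each at an intermediate angle, and
    overlap-add so adjacent angles crossfade rather than jump."""
    hop = len(signal) // n_segments
    win = np.hamming(2 * hop)
    out = None
    for i, ang in enumerate(np.linspace(start_angle, end_angle, n_segments)):
        seg = signal[i * hop : i * hop + 2 * hop]
        seg = seg * win[: len(seg)]
        rendered = render_static(seg, hrirs, ang)
        if out is None:
            out = np.zeros((len(signal) + len(rendered), 2))
        out[i * hop : i * hop + len(rendered)] += rendered
    return out
```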
For both simulations, we conducted user testing to validate (or invalidate) our results. We go over this in detail in both simulation sections.
User testing showed that neither simulation rendered sound from different directions accurately or reliably, but the Hamming window did improve the listening experience by smoothing the sound so that it seemed to move rather than jump from angle to angle.
The simple geometrical method did not yield the expected results: its tuning was rough and it did not account for various other acoustic variables. Using measured HRTFs would produce more accurate attenuation, and headphone quality also matters for reproducing that attenuation with sufficient fidelity.
While the simulation with experimental data produced more dynamic results, it was not consistently accurate. We had trouble taking data in a controlled environment, and our data contained a considerable number of outliers that hurt our ability to simulate sound from those directions. Better microphones and a more controlled recording environment would improve the quality of the simulation, although the transition from angle to angle would probably still not be as smooth as we would like. Future work includes creating a more gradual transition between angles; the Hamming window helped but was not sufficient on its own.
On the whole, although the theory of simulating 3D audio is relatively well understood, the practice of generating convincing sound requires precise measurements and models, and we were unable to produce accurate and consistent results.