Abstract:
Reproduction of high-quality spatial sound has gained considerable importance with recent technological developments in the fields of virtual and augmented reality. Recently, the reproduction of binaural signals in the Spherical-Harmonics (SH) domain has been proposed. This is performed by using SH representations of the sound-field and the Head-Related Transfer Function (HRTF). These processes offer the flexibility to control the reproduced binaural signals by manipulating the sound-field or the HRTFs using algorithms that operate directly in the SH domain. However, in most practical cases, the binaural reproduction is order-limited, which introduces a truncation error that has a detrimental effect on the perception of the reproduced signals, mainly due to the truncation of the HRTF. A recent study showed that pre-processing of the HRTF by ear-alignment reduces its effective SH order, which may be beneficial for alleviating this effect. In this paper, a method to incorporate the pre-processed ear-aligned HRTF into the binaural reproduction process is presented. The method uses an Ambisonics representation of the sound-field formulated at the two ears, which builds on the Binaural B-Format and is denoted here as Bilateral Ambisonics. The proposed method leads to a significant reduction in errors due to the limited-order reproduction, which yields a substantial improvement in perceived binaural reproduction quality even with SH orders as low as first order.
The following audio excerpts are the ones used in the perceptual evaluation described in Sec. V of the paper.
The evaluation was performed using a MUltiple Stimuli with Hidden Reference and Anchor (MUSHRA) listening experiment. The experiment comprised two separate tests, one for each audio source signal (Castanets, Speech), with seven test signals in each test: 2 reproduction methods (Basic+MagLS, Bilateral) X 3 orders (N=1, 2, 4) + a hidden reference (high-order basic reproduction with N=41).
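As a rough illustration of the test design described above, the following Python sketch enumerates the seven test signals of one test (2 methods × 3 orders + the hidden reference). The function name `mushra_conditions` and the dictionary keys are hypothetical and are not taken from the original Matlab experiment. The `sh_coeffs` field uses the standard count of (N+1)^2 SH coefficients for an order-N representation; whether a given method needs one such set or more is not covered here.

```python
from itertools import product


def mushra_conditions(methods, orders, reference_order=41):
    """Enumerate the test signals of one MUSHRA test (hypothetical helper).

    Returns one entry per method/order pair, followed by the hidden reference.
    """
    conditions = [
        # (n + 1) ** 2 is the number of SH coefficients up to order n
        {"method": m, "order": n, "sh_coeffs": (n + 1) ** 2}
        for m, n in product(methods, orders)
    ]
    # Hidden reference: high-order reproduction
    conditions.append(
        {
            "method": "Reference",
            "order": reference_order,
            "sh_coeffs": (reference_order + 1) ** 2,
        }
    )
    return conditions


# Conditions of the main experiment, as listed in the text above
main_test = mushra_conditions(["Basic+MagLS", "Bilateral"], [1, 2, 4])

for c in main_test:
    print(f"{c['method']:<12} N={c['order']:<3} SH coefficients={c['sh_coeffs']}")
```

The same helper could be called with orders `[1, 2, 6]` to list the conditions of a test that uses a different set of orders.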
Scene geometry:
*Note that the experiment was originally developed in MATLAB. The interface presented here makes it possible to listen to the signals used in the experiment, but it differs from the interface used in the experiment itself.
Test 1: Castanets
Reference: high-order standard reproduction with N=41:
Basic+MagLS Ambisonics Reproduction | Bilateral Ambisonics Reproduction
N=1:
N=2:
N=4:
Test 2: Speech
Reference: high-order standard reproduction with N=41:
Basic+MagLS Ambisonics Reproduction | Bilateral Ambisonics Reproduction
N=1:
N=2:
N=4:
White Noise Bursts
Reference: high-order standard reproduction with N=41:
Basic+MagLS Ambisonics Reproduction | Bilateral Ambisonics Reproduction
N=1:
N=2:
N=4:
Blaster Fire
Reference: high-order standard reproduction with N=41:
Basic+MagLS Ambisonics Reproduction | Bilateral Ambisonics Reproduction
N=1:
N=2:
N=4:
Jingle Stick
Reference: high-order standard reproduction with N=41:
N=1:
N=2:
N=4:
Guitar
Reference: high-order standard reproduction with N=41:
N=1:
N=2:
N=4:
The following audio excerpts were used in a preliminary experiment that compared Bilateral Ambisonics reproduction to Basic reproduction with Eq (no MagLS).
The experiment comprised three separate tests, one for each audio source signal (Castanets, Speech, Guitar), with seven test signals in each test: 2 reproduction methods (Standard, Bilateral) X 3 orders (N=1, 2, 6) + a hidden reference (high-order standard reproduction with N=41).
Scene geometry:
Test 1: Castanets
Reference: high-order standard reproduction with N=41:
Basic+Eq Ambisonics Reproduction | Bilateral Ambisonics Reproduction
N=1:
N=2:
N=6:
Test 2: Speech
Reference: high-order standard reproduction with N=41:
Basic+Eq Ambisonics Reproduction | Bilateral Ambisonics Reproduction
N=1:
N=2:
N=6:
Test 3: Guitar
Reference: high-order standard reproduction with N=41:
Basic+Eq Ambisonics Reproduction | Bilateral Ambisonics Reproduction
N=1:
N=2:
N=6: