Spheringer:
A Choir Sample Library and Sampler Pipeline Experiment in FOA Format
(2022-23)
Master's Thesis, Music Technology, NYU
KEYWORDS:
Ambisonics Recording; Sample Library; JUCE (C++); Kontakt KSP; Live Head-tracking
Spheringer is a choir sample library prototype and production pipeline experiment that tests the compatibility of Ambisonics audio with the JUCE sampler framework, exploring the unique characteristics of choral art while remaining compatible with a traditional production workflow.
Packaged as VST and AU plugins for mainstream DAWs with a user-friendly interface, Spheringer provides high-quality reproduction of musical and non-musical samples recorded in native first-order Ambisonic format from multiple altitudinal perspectives.
The Spheringer sample library also doubles as an open-source stereo/FOA dataset for applications in DSP and MIR, such as formant analysis of the singing voice.
An array of three first-order Ambisonics microphones (Sennheiser AMBEO) is positioned equidistant from the singer(s) at 0° azimuth and three different elevation angles (-30°, 0°, and +30°, respectively). By positioning additional microphones above and below the sound source, this array is able to capture a full spherical wavefront of direct sound through interpolation of samples.
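One way such interpolation could be realized is a crossfade between the two FOA layers adjacent to a target elevation. The sketch below is illustrative only: the linear gain law, the 0°/+30° layer angles, and the function name are assumptions, not necessarily the exact method used in the thesis.

// Illustrative sketch: crossfade two first-order Ambisonics (4-channel) layers
// recorded at adjacent elevations to approximate an in-between perspective.
// The linear gain law and the layer angles below are assumptions for demonstration.
#include <juce_audio_basics/juce_audio_basics.h>

void interpolateElevationLayers (const juce::AudioBuffer<float>& lowerLayer,  // e.g. recorded at 0 deg
                                 const juce::AudioBuffer<float>& upperLayer,  // e.g. recorded at +30 deg
                                 juce::AudioBuffer<float>& output,            // 4-channel FOA result
                                 float targetElevationDeg,                    // hypothetical target elevation
                                 float lowerDeg = 0.0f,
                                 float upperDeg = 30.0f)
{
    // Normalised position of the target between the two recorded layers (0..1).
    const float t = juce::jlimit (0.0f, 1.0f,
                                  (targetElevationDeg - lowerDeg) / (upperDeg - lowerDeg));

    const int numChannels = 4;                        // B-format W, X, Y, Z
    const int numSamples  = output.getNumSamples();

    for (int ch = 0; ch < numChannels; ++ch)
    {
        // Weighted sum of the same B-format channel from both layers.
        output.copyFrom  (ch, 0, lowerLayer, ch, 0, numSamples);
        output.applyGain (ch, 0, numSamples, 1.0f - t);
        output.addFrom   (ch, 0, upperLayer, ch, 0, numSamples, t);
    }
}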
Three choral singers are recorded both individually and together as a trio, performing the following elements:
Pitched Notes: Spanning all four voice parts and three octaves (G2-E5), naturally transitioning through the chest and mixed registers. 9 syllables, 2 articulations (legato/staccato), and 2 dynamic levels (piano/forte) are recorded for each note.
Ambience Singing: A performance technique prevalent in the contemporary choral repertoire where singers vocally recreate natural-element foley such as wind, rain, and animal sounds. To knit together a dense, vibrant ecosystem soundscape, the singers rotate positions around the main vertical FOA array and improvise multiple takes for post-production layering.
Two sampler frameworks are developed with JUCE (C++) using:
(1) Built-in sampler functions in JUCE, from the juce::Synthesiser and juce::SamplerSound classes. Sample sounds for all notes within the pitch range are generated from a reference (base-note) sound via simple transposition (see the sketch after this list).
(2) Original code for a direct audio callback. Audio files are read into buffers with pre-allocated memory and stored in a mapping container keyed by their corresponding MIDI note numbers. Once a MIDI note is triggered, the plugin looks up its audio file and plays it back at the DAW sample rate.
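A minimal sketch of approach (1), assuming a single base-note WAV on disk mapped across the full MIDI range; the file path, root note, polyphony, and envelope times below are illustrative, not the thesis's actual values.

// Minimal sketch of the built-in-sampler approach (Framework 1).
// The file path, root note, voice count, and envelope times are illustrative.
#include <memory>
#include <juce_audio_basics/juce_audio_basics.h>
#include <juce_audio_formats/juce_audio_formats.h>

juce::Synthesiser synth;

void setUpSampler (double sampleRate)
{
    juce::AudioFormatManager formatManager;
    formatManager.registerBasicFormats();

    // Hypothetical base-note recording used as the reference sample.
    juce::File baseNote ("/samples/BaseNote_C4.wav");
    std::unique_ptr<juce::AudioFormatReader> reader (formatManager.createReaderFor (baseNote));

    if (reader != nullptr)
    {
        juce::BigInteger midiNotes;
        midiNotes.setRange (0, 128, true);   // respond to every MIDI note

        // All other pitches are derived from this single sample by transposition.
        synth.addSound (new juce::SamplerSound ("base", *reader, midiNotes,
                                                60 /* root: middle C */,
                                                0.01, 0.1, 10.0));
    }

    for (int i = 0; i < 8; ++i)              // polyphony
        synth.addVoice (new juce::SamplerVoice());

    synth.setCurrentPlaybackSampleRate (sampleRate);
}

// Called from the plugin's processBlock():
void renderAudio (juce::AudioBuffer<float>& buffer, juce::MidiBuffer& midiMessages)
{
    buffer.clear();
    synth.renderNextBlock (buffer, midiMessages, 0, buffer.getNumSamples());
}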
The graphical user interface (GUI) consists of a MIDI keyboard visualization for easy mouse input and knobs for volume and ADSR parameters.
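A minimal sketch of such an editor, assuming a juce::MidiKeyboardState owned by the processor and plain rotary sliders for the volume and ADSR controls; parameter attachment is omitted, and the class and member names are illustrative.

// Minimal GUI sketch: on-screen keyboard plus rotary knobs.
// Hooking the knobs up to actual plugin parameters is omitted here.
#include <juce_audio_utils/juce_audio_utils.h>

class SpheringerEditorSketch : public juce::Component
{
public:
    explicit SpheringerEditorSketch (juce::MidiKeyboardState& state)
        : keyboard (state, juce::MidiKeyboardComponent::horizontalKeyboard)
    {
        addAndMakeVisible (keyboard);

        juce::Slider* knobs[] = { &volume, &attack, &decay, &sustain, &release };
        for (auto* knob : knobs)
        {
            knob->setSliderStyle (juce::Slider::RotaryVerticalDrag);
            knob->setTextBoxStyle (juce::Slider::TextBoxBelow, false, 60, 18);
            addAndMakeVisible (*knob);
        }

        setSize (600, 300);
    }

    void resized() override
    {
        auto area = getLocalBounds();
        keyboard.setBounds (area.removeFromBottom (100));   // clickable keyboard along the bottom

        auto knobRow = area.removeFromTop (150);
        const int w = knobRow.getWidth() / 5;
        juce::Slider* knobs[] = { &volume, &attack, &decay, &sustain, &release };
        for (auto* knob : knobs)
            knob->setBounds (knobRow.removeFromLeft (w).reduced (8));
    }

private:
    juce::MidiKeyboardComponent keyboard;
    juce::Slider volume, attack, decay, sustain, release;
};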
Both JUCE-based plugins are tested in REAPER, working in conjunction with existing commercial and open-source sound field plugins.
Framework (1) is computationally efficient and robust with minimal runtime issues, as built-in sampler functions are optimized for memory allocation and buffering tasks. However, the juce::Synthesiser and juce::SamplerSound methods are not compatible with FOA data as they cannot correctly de-interleave the multichannel audio input, eventually down-folding a four-channel input to two channels. The juce::Synthesiser method has additional errors with pitch transposition, as the actual MIDI transposition range exceeds the range defined within the given functions. Nevertheless, this framework performs well as a stereo sampler.
As Framework (2) only uses fundamental audio processing and playback functions, the resulting “sampler” can indeed load imported audio into pre-allocated memory and preserve the Ambisonic format throughout the runtime. Unsurprisingly, this framework is less stable and more computationally taxing, posing limits on the maximum length of files. Prioritizing audio format preservation while maintaining robustness and runtime efficiency, the current framework recalls and processes only a single audio file at a time, and it is not advisable to trigger more than one MIDI event simultaneously. Nevertheless, it serves well as a proof of concept; for optimized memory allocation and real-time signal processing, loop buffering and real-time threads can be added to the current basic structure. Future JUCE versions may also update their default sampler functions to accommodate multichannel audio streams.
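A minimal sketch of the Framework (2) structure, assuming FOA WAV files preloaded into a std::map keyed by MIDI note number and a single active one-shot voice, mirroring the one-file-at-a-time behavior described above; the struct and member names are illustrative.

// Sketch of Framework (2): preloaded 4-channel buffers keyed by MIDI note,
// played back directly in the audio callback at the DAW sample rate.
// Single-voice, one-shot playback only; names are illustrative.
#include <map>
#include <memory>
#include <juce_audio_formats/juce_audio_formats.h>

struct FoaSampleBank
{
    std::map<int, juce::AudioBuffer<float>> samples;   // MIDI note -> 4-channel FOA buffer
    const juce::AudioBuffer<float>* active = nullptr;  // currently sounding sample
    int playhead = 0;

    // Read one FOA file into pre-allocated memory and register it under a MIDI note.
    void load (int midiNote, const juce::File& file, juce::AudioFormatManager& fm)
    {
        if (auto reader = std::unique_ptr<juce::AudioFormatReader> (fm.createReaderFor (file)))
        {
            juce::AudioBuffer<float> buf ((int) reader->numChannels,
                                          (int) reader->lengthInSamples);
            reader->read (&buf, 0, (int) reader->lengthInSamples, 0, true, true);
            samples[midiNote] = std::move (buf);
        }
    }

    // Called from the plugin's processBlock().
    void processBlock (juce::AudioBuffer<float>& output, juce::MidiBuffer& midi)
    {
        for (const auto metadata : midi)
        {
            const auto msg = metadata.getMessage();
            if (msg.isNoteOn())
            {
                auto it = samples.find (msg.getNoteNumber());
                if (it != samples.end())
                {
                    active   = &it->second;   // start (or restart) this sample
                    playhead = 0;
                }
            }
        }

        output.clear();
        if (active == nullptr)
            return;

        const int numToCopy = juce::jmin (output.getNumSamples(),
                                          active->getNumSamples() - playhead);
        const int numCh     = juce::jmin (output.getNumChannels(), active->getNumChannels());

        for (int ch = 0; ch < numCh; ++ch)
            output.copyFrom (ch, 0, *active, ch, playhead, numToCopy);

        playhead += numToCopy;
        if (playhead >= active->getNumSamples())
            active = nullptr;                 // sample finished
    }
};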