
Audio Adaptive Playout


Download

Demos: original (S.Vega), slow (x1.25), fast (x0.75), loss (20%), concealed.

Infos: contact authors

Partners

University of Bergamo (UniBG)

Polytechnic of Turin (PoliTO)

Developers

Antonio Servetti (PoliTO)

Eric Bonfadini (UniBG)

Other Resources

PoliTO Survey of Adaptive Playout techniques and application to concealment.

Audio Adaptive Playout

Adaptive playout is the ability to make audio play out for a longer or shorter time than its original duration. This is also known as time-scale modification or time stretching. It serves two main purposes:

- buffer management
- loss concealment

Waveform Similarity Overlap-and-Add (WSOLA)

Several techniques perform time-scale modification, working either in the frequency domain or in the sample domain.

If you simply stretch audio, for example by playing the samples back at a different rate, the pitch changes noticeably. Instead, typical waveform periods are cut or copied and pasted, so the duration changes while the pitch stays the same. One of the most advanced techniques for doing so is WSOLA: waveform similarity overlap-and-add.

In this example, WSOLA takes a segment, known as the template, and searches backwards or forwards for a best match. When the best match is found, it is mixed with the template. Finally, a longer or shorter segment is created by taking the mix followed by the rest of the packet, from the best match to the end.
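The template, search and mix steps described above can be sketched in a few lines of Python. This is only a minimal illustration, not the authors' implementation: the function names, the normalized cross-correlation used as similarity measure, the linear cross-fade and the `min_lag`/`max_lag` search bounds are all assumptions made for the example.

```python
import math


def similarity(a, b):
    """Normalized cross-correlation of two equal-length sample lists."""
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den if den > 0.0 else 0.0


def wsola_packet(x, t, size, min_lag, max_lag, lengthen=True):
    """Lengthen or shorten one packet x (a list of samples) by one best-match lag.

    The template is x[t:t+size]. The search runs backwards to lengthen and
    forwards to shorten, over lags in [min_lag, max_lag] (hypothetical
    parameters; at least one candidate position must fit inside the packet).
    """
    template = x[t:t + size]
    if lengthen:
        candidates = range(max(0, t - max_lag), t - min_lag + 1)
    else:
        candidates = range(t + min_lag, min(len(x) - size, t + max_lag) + 1)
    if len(candidates) == 0:
        raise ValueError("search range does not fit inside the packet")
    # Best match: the candidate segment most similar to the template.
    m = max(candidates, key=lambda c: similarity(template, x[c:c + size]))
    # Cross-fade: the mix starts as the template and ends as the best match.
    mix = []
    for i in range(size):
        w = (i + 1) / (size + 1)
        mix.append((1.0 - w) * template[i] + w * x[m + i])
    # Samples before the template, the mix, then the rest from the best match on.
    return x[:t] + mix + x[m + size:]


# Toy packet: a sine with a 40-sample period (assumed test signal).
period = 40
packet = [math.sin(2.0 * math.pi * n / period) for n in range(400)]
longer = wsola_packet(packet, t=200, size=40, min_lag=20, max_lag=60, lengthen=True)
shorter = wsola_packet(packet, t=200, size=40, min_lag=20, max_lag=60, lengthen=False)
print(len(packet), len(longer), len(shorter))  # 400 440 360
```

On this periodic test signal the best match lies exactly one period away from the template, so the packet grows or shrinks by one period (40 samples) and the waveform stays continuous across the mix.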

Buffer Management

Adaptive playout can be used to avoid buffer underflows and overflows by lengthening or shortening audio packets. The pre-roll period can also be shortened: playout can start earlier and, because it is slowed down, the buffer level keeps increasing until an optimal level is reached.
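The pre-roll argument can be checked with a little arithmetic. The sketch below assumes packets of fixed duration that arrive at a steady rate and are all played out stretched by the same factor; the function name, the 20 ms packet duration and the buffer levels are illustrative assumptions (the 1.25 factor matches the "slow" demo).

```python
def preroll_catch_up(start_ms, target_ms, packet_ms=20.0, stretch=1.25):
    """Count packets and wall-clock time until the buffer reaches target_ms.

    While one stretched packet plays (packet_ms * stretch of wall-clock time),
    the same amount of audio arrives from the network, but only packet_ms of
    buffered audio is consumed, so the level rises by the difference.
    """
    if stretch <= 1.0:
        raise ValueError("the buffer only fills while playout is slowed down")
    level = start_ms
    elapsed = 0.0
    packets = 0
    while level < target_ms:
        play_time = packet_ms * stretch
        elapsed += play_time
        level += play_time - packet_ms  # audio arrived minus audio consumed
        packets += 1
    return packets, elapsed


# Start playing with only 40 ms buffered and aim for 100 ms.
result = preroll_catch_up(start_ms=40.0, target_ms=100.0)
print(result)  # (12, 300.0)
```

Under these assumptions playout starts 60 ms earlier than it would with a full pre-roll, and the buffer reaches its target after 12 stretched packets, that is 300 ms of slowed-down playout.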

New mobile receivers will include so-called adaptive jitter buffer management (AJBM), which is being standardized in 3GPP. An adaptive control logic decides the stretching factor based on network statistics (inter-arrival delays), buffer fullness, and the loss pattern.
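As a rough illustration of such a control logic, the sketch below maps the three inputs to a stretching factor. It is not the 3GPP algorithm: the function name, the proportional rule, and every constant (`safety`, `loss_weight_ms`, `gain`, the 0.75 to 1.25 clamp) are hypothetical choices made for this example.

```python
import statistics


def choose_stretch(inter_arrival_ms, buffer_ms, loss_rate=0.0, packet_ms=20.0,
                   safety=2.0, loss_weight_ms=100.0, gain=0.005,
                   lo=0.75, hi=1.25):
    """Pick a stretching factor (>1 slows playout down, <1 speeds it up)."""
    # Network statistics: jitter estimated as the spread of inter-arrival delays.
    jitter_ms = statistics.pstdev(inter_arrival_ms)
    # Assumed target buffer level: grows with jitter and with the loss rate.
    target_ms = packet_ms + safety * jitter_ms + loss_weight_ms * loss_rate
    # Buffer fullness: below target -> lengthen, above target -> shorten.
    error_ms = target_ms - buffer_ms
    return max(lo, min(hi, 1.0 + gain * error_ms))


steady = choose_stretch([20, 20, 20, 20], buffer_ms=20.0)
too_full = choose_stretch([20, 20, 20, 20], buffer_ms=200.0)
jittery = choose_stretch([10, 30, 10, 30], buffer_ms=0.0)
print(steady, too_full, jittery)
```

With steady arrivals and a buffer at its target the factor stays at 1.0; an overfull buffer drives the factor down to the lower clamp, and a jittery network with an empty buffer pushes it above 1.0.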

The adaptation can depend on packet contents. For example, unvoiced or noise-like segments can be stretched by a large amount; voiced or tonal segments involve a trade-off between quality and stretching; transients cannot be stretched.
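One way to picture content-dependent adaptation is to give each segment class a cap on how much it may be lengthened and to share the required extra duration in proportion to those caps. The sketch below assumes the segments are already classified; the class labels, the cap values and the function name are hypothetical and only mirror the ordering described above (unvoiced most, voiced less, transients not at all).

```python
# Assumed caps: maximum extra duration as a fraction of the packet duration.
MAX_EXTRA = {"unvoiced": 1.0, "voiced": 0.3, "transient": 0.0}


def allocate_stretch(classes, extra_ms, packet_ms=20.0):
    """Split extra_ms of lengthening across packets, proportionally to their caps."""
    caps = [MAX_EXTRA[c] * packet_ms for c in classes]
    total = sum(caps)
    if extra_ms > total:
        raise ValueError("requested lengthening exceeds what these packets allow")
    if total == 0.0:
        return [0.0 for _ in classes]
    # Proportional sharing never exceeds a cap as long as extra_ms <= total.
    return [extra_ms * cap / total for cap in caps]


shares = allocate_stretch(["unvoiced", "voiced", "transient"], extra_ms=13.0)
print(shares)
```

Here the unvoiced packet absorbs most of the 13 ms, the voiced packet a smaller share, and the transient is left untouched.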

The adaptation can also vary over time: lengthening can be done at the beginning of talkspurts and shortening at the end. This is known as virtual buffering, as there is no buffer (hence no delay) except during talkspurts.

Loss Concealment

Loss concealment is done by stretching neighboring segments to literally fill the gap. The more packets are stretched, the smaller the stretching factor each one needs and the higher the quality.

In practice, packets are stretched slightly more than strictly needed, so that the overlap region can be mixed and discontinuities are avoided.

The quality of the concealment is very high compared with other methods.
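The relation between the number of stretched packets and the stretching factor can be made concrete with a small calculation. The sketch assumes equal-length packets, an even split of the extra duration among the neighbors, and a single overlap region where the stretched audio from the two sides of the gap meets; the function name and the 20 ms and 5 ms values are illustrative assumptions.

```python
def concealment_stretch(lost_packets, neighbours, packet_ms=20.0, overlap_ms=5.0):
    """Stretching factor each neighbouring packet needs to cover a loss.

    The neighbours must cover the gap plus a little extra audio (overlap_ms)
    that is mixed where the stretched segments meet.
    """
    if neighbours < 1:
        raise ValueError("at least one neighbouring packet is needed")
    extra_ms = lost_packets * packet_ms + overlap_ms
    return 1.0 + extra_ms / (neighbours * packet_ms)


two = concealment_stretch(lost_packets=1, neighbours=2)
four = concealment_stretch(lost_packets=1, neighbours=4)
print(two, four)  # 1.625 1.3125
```

Spreading one lost packet over four neighbors instead of two brings the factor down from 1.625 to 1.3125, which is the sense in which stretching more packets allows a smaller factor per packet.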

Created: 3rd April 2007. Updated: 29th June 2007.
