A New Dimension in Audio Compression and Transmission
Prof. Les Atlas, demonstrating the new audio codec at the IEEE ICASSP 2001 Conference
Mark Vinton at ICASSP 2001
An audio coding technique for variable, bandwidth-constrained channels such as the
Internet must do two things: sound good at low data rates and adapt gracefully
to changes in available bandwidth.
Professor Les Atlas
and Mark Vinton of the Department of Electrical
Engineering at the University of
Washington have designed an audio coding algorithm that is superior on both
counts. It is inherently scalable: the encoded data stream can vary from 11 to
96 kilobits/second (for an original at the compact disc rate of 2 stereo
channels at 44,100 samples per second) without the need for recoding. This means
that variable Internet or wireless channel conditions can be matched without the
need for additional computation. Moreover, it is compact; in preliminary
subjective tests our algorithm, coded at 32 kilobits/second/channel,
outperformed full bandwidth MPEG-1 Layer 3 (MP3) at almost twice the data rate.
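The idea of matching the channel without recoding can be illustrated with a toy sketch. It assumes, hypothetically, that each frame is encoded once with its most important bytes first, so that a sender can simply cut every frame to the byte budget of the current channel rate. The 20 ms frame length, the function names, and the byte-ordering assumption are illustrative only and are not taken from our paper; this is not the actual algorithm.

```python
# Toy illustration of a scalable (embedded) stream. NOT the actual codec.
# Assumption: each frame is stored once, most important bytes first, so
# dropping the tail of a frame lowers quality but never requires re-encoding.

FRAME_MS = 20  # hypothetical frame length in milliseconds


def bytes_per_frame(rate_bps: int, frame_ms: int = FRAME_MS) -> int:
    """Whole bytes that fit in one frame at the given channel rate."""
    return rate_bps * frame_ms // 8000  # bits/sec * ms / 1000 / 8


def truncate_stream(frames: list[bytes], rate_bps: int,
                    frame_ms: int = FRAME_MS) -> list[bytes]:
    """Cut every frame to the byte budget; no other computation is done."""
    budget = bytes_per_frame(rate_bps, frame_ms)
    return [frame[:budget] for frame in frames]


# One second of placeholder frames encoded once at the top rate (96 kb/sec).
full_frames = [bytes(bytes_per_frame(96_000)) for _ in range(1000 // FRAME_MS)]

for rate in (96_000, 32_000, 11_000):
    sent = truncate_stream(full_frames, rate)
    sent_bits = 8 * sum(len(frame) for frame in sent)
    print(f"channel {rate} b/s -> {len(sent[0])} bytes/frame, {sent_bits} bits sent")
```

The same stored frames serve every channel rate; only the number of bytes forwarded per frame changes.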
This new technique is based upon engineering abstractions of a recently discovered
"modulation dimension" in our auditory system. With support from the
Office of Naval Research, we are part of a
Multi-University Research Initiative: The
Center for Acoustics and Auditory Research. The past 4 years of this
multi-university collaboration have been a major factor in the results we report here.
If you have a technical background and/or want to
go directly to audio samples of our new techniques' performance, please
click here to skip down to our recent publication and samples.
If you are interested in research collaborations, please
click here to skip to contact information for Prof. Atlas.
For licensing information, please
click here to skip down to our University's licensing office.
Less Technical Background
We are engineering researchers and are thus
providing possibilities which will be available to the public 2-5 years from
now. While we feel that we've opened up a new technical approach to audio and
perhaps video transmission, none of our work is expected to have an impact on the
public in the next year or two. There still are unanswered research questions
and a large amount of engineering and development needed before any of our ideas
make it to market.
Given that disclaimer, we feel that our research
could impact how the public gets their music in the future.
Examples of Possible Applications
Internet radio broadcast and music delivery:
Our algorithm provides several potential enhancements
over technologies currently used for Internet broadcast:
Uninterrupted service: Due to the inherent fine-grain scalability provided by our
new approach, channel capacity and congestion are potentially much less
likely to prevent audio reception. For example, if
all your neighbors who share your cable modem system decided to listen to
music at the same time, with current technology your music would
completely stop. With our new technique, your listening experience would
instead only degrade slightly.
Complete use of available bandwidth: As the coded data stream scales without
further computation, the best possible match to the channel capacity can
be achieved and hence the best possible audio quality can be provided.
What this means is that your music will sound better and high-quality surround
sound delivery might also be possible through the Internet or other
wireless channels. People who have telephone modem connections should also
get better music experiences, with less connection delay and almost CD quality.
One single broadcast data stream services all users regardless of their
Internet connection (28.8 Kb/sec telephone modem through DSL and
better) without sacrificing audio quality for high-bandwidth users.
This flexibility means that digital music broadcasters will need less complex
systems and will potentially be able to have more music stations with less
Progressive transmission: The algorithm also offers the possibility of shorter
delays between connecting to media and the start of audio reproduction. As
you may have noticed, selecting Internet radio stations involves a
significant buffer delay before you can hear the music. With
our new technique, the delay should be substantially reduced and music
could start almost instantly after selecting an Internet radio station.
Remote server personal media storage:
People can compress and store their audio media on a remote Internet-based server and
access their audio files at home, at work or anywhere in the world regardless
of available Internet connection speeds. This idea has become more exciting
with the advent of mobile wireless devices, such as cell phones, that can
access the Internet. The flexibility and efficiency of
our new technique, if taken to its research potential, means that everyone,
from the audiophile to the casual listener to the potential customer who's
shopping for the next song to buy, can store or listen to the quality of the
music that they want.
Integration with new business models for music delivery:
How can music listeners, music distribution
companies, and artists all benefit from this kind of new technology? First of
all, it will very likely be possible for all existing MP3-encoded music to be
converted to our new approach. Also, since virtually all MP3 players are
software or firmware programmable, they can easily be upgraded to work with
our new technology. What we offer is a new approach with
inherent quality advantages for the public with, if needed, potential built-in
capabilities for extremely accurate and efficient audio fingerprinting.
Technical Information and Samples
Our just-published paper is available in the Adobe Acrobat (pdf) format:
M.S. Vinton and L. E. Atlas,
“A Scalable and Progressive Audio CODEC,”
Proceedings IEEE ICASSP '01,
Salt Lake City, 2001.
Please beware: These
samples of our new process, which have been converted to .wav's, are very large.
If you try to download these samples with a telephone modem you will wait for
hours! In the near future we hope to be able to
share decoders with our research partners. With our decoder, downloads should
be very quick, as expected at our low bit rates.
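As a rough check on the warning above, the following sketch estimates how long the four sample files would take over a telephone modem and how much smaller the coded stream would be. The file sizes and the 28.8 kb/sec modem speed are the figures given on this page; treating 1 MB as 10^6 bytes, ignoring protocol overhead, and assuming 16-bit samples (the compact disc standard) are assumptions made for this estimate.

```python
# Back-of-the-envelope estimate; the constants below are stated assumptions.
MODEM_BPS = 28_800                      # 28.8 kb/sec telephone modem
CODED_BPS = 32_000                      # 32 kilobits/sec coded stereo stream
SAMPLE_SIZES_MB = [6.9, 5.4, 5.6, 5.8]  # listed .wav sizes, 1 MB taken as 10**6 bytes


def download_seconds(size_bytes: float, link_bps: float) -> float:
    """Ideal transfer time, ignoring protocol overhead and retransmissions."""
    return size_bytes * 8 / link_bps


def pcm_bits_per_second(sample_rate: int = 44_100, channels: int = 2,
                        bits_per_sample: int = 16) -> int:
    """Bit rate of uncompressed PCM audio (compact disc parameters by default)."""
    return sample_rate * channels * bits_per_sample


total_bytes = sum(SAMPLE_SIZES_MB) * 1_000_000
wav_hours = download_seconds(total_bytes, MODEM_BPS) / 3600
size_ratio = pcm_bits_per_second() / CODED_BPS

print(f"All four .wav files over the modem: about {wav_hours:.1f} hours")
print(f"Uncompressed PCM is about {size_ratio:.0f} times larger than the coded stream")
```

Under these assumptions the four .wav files together need roughly two hours on a 28.8 kb/sec modem, while the same passages in coded form would be a small fraction of that size.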
To load all files below you will
need to right-click your mouse on them and "Save Target As..." to some
directory on your computer. If you are on a Microsoft Windows platform, it should
then be easy to click on them and listen. These files, which have been
losslessly coded as .wav's after passing through our new coder and decoder at
the speeds below, are not intended for streaming audio.
These samples were deliberately
chosen to illustrate the range of quality we see with our CODEC right now for
original sample rates of 44,100 samples/sec. For example, the 32 K Dire Straits
passage sounds very good, yet the 32 K Tracy Chapman passage has significant
artifacts in the artist's voice. We think we know why there still are some
artifacts. The principle behind the correction of them is one topic of our
32 Kilobits/sec stereo wav files:
Dire Straits passage (6.9 MB) (Please right-click and "Save Target As...")
Neil Finn passage (5.4 MB) (Please right-click and "Save Target As...")
Richard Marx passage (5.6 MB) (Please right-click and "Save Target As...")
Tracy Chapman passage (5.8 MB) (Please right-click and "Save Target As...")
Important: This CODEC evaluation code and these very short
music samples are to be used only for non-commercial evaluation, research, and
instruction under your supervision. We're not saying anything will always work
right, so you'll have to consider carefully how you use this CODEC evaluation
code and accept any risk involved. The CODEC evaluation code and samples
should not be redistributed. Contact us if you have any questions.
For Technical Information
For more technical information, please contact:
Professor and Associate Chair for Research
Department of Electrical Engineering
University of Washington
Seattle, WA 98195-2500
For Licensing Information
There is a patent pending on our new process. For
licensing information, please contact:
Information Products Licensing
Software & Copyright Ventures/University of Washington
Box 352143/Fluke Hall
Phone: 206.616.3451 FAX: 206.616.3322