
Audio Codec

A New Dimension in Audio Compression and Transmission

(Patent Issued)

Prof. Les Atlas, demonstrating the new audio codec at the IEEE ICASSP 2001 Conference

Mark Vinton at ICASSP 2001

An audio coding technique for variable, bandwidth-constrained channels such as the Internet must do two things: sound good at low data rates and adapt gracefully to changes in available bandwidth. Professor Les Atlas and Mark Vinton of the Department of Electrical Engineering at the University of Washington have designed an audio coding algorithm that is superior on both counts. It is inherently scalable: the encoded data stream can vary between 11 and 96 kilobits/second (for an original in the compact disc format of 2 stereo channels at 44,100 samples per second) without the need for recoding. This means that variable Internet or wireless channel conditions can be matched without additional computation. Moreover, it is compact: in preliminary subjective tests our algorithm, coded at 32 kilobits/second/channel, outperformed full-bandwidth MPEG-1 Layer 3 (MP3) at almost twice the data rate.
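To put the quoted range in perspective, the following rough calculation compares it with the raw data rate of the uncompressed original. It is a back-of-the-envelope sketch only: the 16 bits per sample is an assumption (the standard compact disc word length, not stated above), and a kilobit is taken as 1000 bits.

```python
SAMPLE_RATE_HZ = 44_100   # samples per second per channel, from the text
CHANNELS = 2              # stereo, from the text
BITS_PER_SAMPLE = 16      # assumption: standard compact disc word length


def pcm_kbps() -> float:
    """Raw (uncompressed) stereo data rate in kilobits/second."""
    return SAMPLE_RATE_HZ * CHANNELS * BITS_PER_SAMPLE / 1000


def compression_ratio(coded_kbps: float) -> float:
    """How many times smaller the coded stream is than the raw audio."""
    return pcm_kbps() / coded_kbps


if __name__ == "__main__":
    print(f"raw stereo audio: {pcm_kbps():.1f} kb/s")
    for rate in (11, 96):  # the two ends of the range quoted in the text
        print(f"{rate} kb/s -> roughly {compression_ratio(rate):.1f}:1")
```

Under these assumptions the two ends of the scalable range correspond to very different compression ratios, which is why being able to move between them without recoding matters.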

This new technique is based upon engineering abstractions of a recently discovered "modulation dimension" in our auditory system. With support from the Office of Naval Research, we are part of a Multi-University Research Initiative: The Center for Acoustics and Auditory Research. The past 4 years of this multi-university collaboration were a major factor in the results we report here.

If you have a technical background and/or want to go directly to audio samples of our new technique's performance, please click here to skip down to our recent publication and samples.

If you are interested in research collaborations, please click here to skip to contact information for Prof. Atlas.

For licensing information, please click here to skip down to our University's licensing office.

Less Technical Background

We are engineering researchers and are thus providing possibilities which will be available to the public 2-5 years from now. While we feel that we've opened up a new technical approach to audio and perhaps video transmission, none of our work is expected to have an impact for the public in the next year or two. There are still unanswered research questions and a large amount of engineering and development needed before any of our ideas make it to market.

Given that disclaimer, we feel that our research could impact how the public gets their music in the future.

Examples of Possible Applications

Internet radio broadcast and music delivery:

Our algorithm provides several potential enhancements over technologies currently used for Internet broadcast:

  • Uninterrupted service: Due to the inherent fine-grained scalability provided by our new approach, channel capacity and congestion are potentially much less likely to prevent audio reception. For example, if all your neighbors who share your cable modem system decided to listen to music at the same time, with current technology your music would completely stop. With our new technique, your listening experience would instead only degrade slightly.
  • Complete use of available bandwidth: As the coded data stream scales without further computation, the best possible match to the channel capacity can be achieved and hence the best possible audio quality can be provided. This means that your music will sound better, and high-quality surround sound delivery might also be possible through the Internet or wireless channels. People who have telephone modem connections should also get better music experiences, with less connection delay and almost CD quality.
  • Flexibility: One single broadcast data stream services all users regardless of their Internet connection (28.8 Kb/sec telephone modem through DSL and better) without sacrificing audio quality for high-bandwidth users. This flexibility means that digital music broadcasters will need less complex systems and will potentially be able to have more music stations with less overall cost.
  • Progressive transmission: The algorithm also offers the possibility of shorter delays between connecting to media and the start of audio reproduction. As you may have noticed, selecting Internet radio stations involves a significant buffer delay before you can hear the music. With our new technique, the delay should be substantially reduced and music could start almost instantly after selecting an Internet radio station.
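The idea shared by these points, that a single coded stream can be matched to a listener's channel without recoding, can be illustrated with a small sketch. This is not the authors' bitstream format: it assumes a hypothetical "embedded" frame whose bytes are ordered from most to least important, so that a server or relay can simply drop the tail. The names `bytes_per_frame` and `truncate_frame` and the 100 ms frame length are illustrative assumptions.

```python
MAX_KBPS = 96  # upper end of the range quoted in the text


def bytes_per_frame(rate_kbps: int, frame_ms: int) -> int:
    """Byte budget for one frame (kilobits/second x milliseconds = bits)."""
    return (rate_kbps * frame_ms) // 8


def truncate_frame(frame: bytes, channel_kbps: int, frame_ms: int) -> bytes:
    """Keep only the leading bytes of an embedded frame that fit the channel.

    No re-encoding takes place: the output is a prefix of the input.
    Hypothetical layout: earlier bytes matter more than later ones.
    """
    rate = min(channel_kbps, MAX_KBPS)  # never send more than was coded
    return frame[:bytes_per_frame(rate, frame_ms)]


if __name__ == "__main__":
    frame_ms = 100                               # hypothetical frame length
    full = bytes(bytes_per_frame(MAX_KBPS, frame_ms))  # stand-in for coded data
    for kbps in (28, 32, 64, 200):
        sent = truncate_frame(full, kbps, frame_ms)
        print(f"channel {kbps} kb/s -> send {len(sent)} of {len(full)} bytes")
```

Because each listener receives a prefix of the same coded frame, one broadcast stream could in principle serve a modem user and a DSL user at once, and a congested link would lower quality rather than stop playback.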

Remote server personal media storage:

People can compress and store their audio media on a remote Internet-based server and access their audio files at home, at work or anywhere in the world regardless of available Internet connection speeds. This idea has become more exciting with the advent of mobile wireless devices, such as cell phones, that can access the Internet. The flexibility and efficiency of our new technique, if taken to its research potential, mean that everyone, from the audiophile to the casual listener to the potential customer who's shopping for the next song to buy, can store or listen to music at the quality that they want.

Integration with new business models for music delivery:

How can music listeners, music distribution companies, and artists all benefit from this kind of new technology? First of all, it will very likely be possible for all existing MP3-encoded music to be converted to our new approach. Also, since virtually all MP3 players are software- or firmware-programmable, they can easily be upgraded to work with our new technology. What we offer is a new approach with inherent quality advantages for the public and, if needed, potential built-in capabilities for extremely accurate and efficient audio fingerprinting.

Technical Information and Samples

Our just-published paper is available in the Adobe Acrobat (pdf) format:

M.S. Vinton and L. E. Atlas, “A Scalable and Progressive Audio CODEC,” Proceedings IEEE ICASSP '01, Salt Lake City, 2001.

Audio Samples

Please beware: These samples of our new process, which have been converted to .wav files, are very large. If you try to download these samples with a telephone modem you will wait for hours! In the near future we hope to be able to share decoders with our research partners. Our decoder will provide for, as expected with our low bit rates, very quick downloads.

To load any of the files below you will need to right-click your mouse on it and "Save Target As..." to some directory on your computer. If you are on a Microsoft Windows platform it should then be easy to click on the files and listen. These files, which have been losslessly coded as .wav files after passing through our new coder and decoder at the speeds below, are not intended for streaming audio.

These samples were deliberately chosen to illustrate the range of quality we see with our CODEC right now for original sample rates of 44,100 samples/sec. For example, the 32 K Dire Straits passage sounds very good, yet the 32 K Tracy Chapman passage has significant artifacts in the artist's voice. We think we know why there are still some artifacts. The principle behind correcting them is one topic of our research.

32 Kilobits/sec stereo wav files:

Dire Straits passage (6.9 MB) (Please right-click and "Save Target As...")

Neil Finn passage (5.4 MB) (Please right-click and "Save Target As...")

Richard Marx passage (5.6 MB) (Please right-click and "Save Target As...")

Tracy Chapman passage (5.8 MB) (Please right-click and "Save Target As...")
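As a rough check on the download warning above, the following sketch estimates how long the four decoded .wav files would take over a 28.8 Kb/sec telephone modem, and how that compares with the coded audio itself. Several figures are assumptions rather than facts from this page: the files are taken to be 16-bit stereo at 44,100 samples/sec, 1 MB is taken as 1,000,000 bytes with headers ignored, and the coded rate is taken as 32 kilobits/sec in total for the stereo pair.

```python
PCM_BYTES_PER_SEC = 44_100 * 2 * 2   # assumption: 16-bit (2-byte) stereo samples
MODEM_BITS_PER_SEC = 28_800          # telephone modem rate mentioned in the text
CODED_BITS_PER_SEC = 32_000          # assumption: 32 kb/s total, stereo

WAV_SIZES_MB = [6.9, 5.4, 5.6, 5.8]  # the four file sizes listed above


def wav_seconds(size_mb: float) -> float:
    """Approximate playing time of a decoded .wav file of the given size."""
    return size_mb * 1_000_000 / PCM_BYTES_PER_SEC


def modem_minutes(num_bytes: float) -> float:
    """Approximate transfer time over the modem, ignoring protocol overhead."""
    return num_bytes * 8 / MODEM_BITS_PER_SEC / 60


if __name__ == "__main__":
    total_wav_bytes = sum(WAV_SIZES_MB) * 1_000_000
    total_seconds = sum(wav_seconds(mb) for mb in WAV_SIZES_MB)
    coded_bytes = total_seconds * CODED_BITS_PER_SEC / 8
    print(f"total playing time: about {total_seconds:.0f} s")
    print(f".wav download: about {modem_minutes(total_wav_bytes):.0f} min")
    print(f"coded download: about {modem_minutes(coded_bytes):.1f} min")
```

Under these assumptions the decoded files take far longer to fetch than they take to play, while the coded data would transfer in roughly its own playing time.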

Important: This CODEC evaluation code and these very short music samples are to be used only for non-commercial evaluation, research, and instruction under your supervision. We're not saying anything will always work right, so you'll have to consider carefully how you use this CODEC evaluation code and accept any risk involved. The CODEC evaluation code and samples should not be redistributed. Contact us if you have any questions.

For Technical Information

For more technical information, please contact:

Les Atlas
Professor and Associate Chair for Research
Department of Electrical Engineering
University of Washington
Box 352500
Seattle, WA 98195-2500

For Licensing Information

There is a patent pending on our new process. For licensing information, please contact:

Dana Bostrom
Information Products Licensing
Software & Copyright Ventures/University of Washington
Box 352143/Fluke Hall
Phone: 206.616.3451 FAX: 206.616.3322