
Lossy encoding

Lossy compression is any compression that discards information. Compressing and then uncompressing a file results in something similar, but not identical, to the original file. This is no good for things that a computer must interpret exactly, like executable programs or most computer-readable data, but is often just fine for things that a human interprets (like photographs or sounds). The trick is to remove little bits of information in places where the loss can't be perceived.

Lossy audio compression works using a psychoacoustic model. That is, by modeling how your ears (and your brain) hear sound, it is possible to find places to remove information that you wouldn't have perceived anyway. A full treatment of these techniques is beyond the scope of this document, but here are two simple examples:

Though humans can technically hear tones up to about 20 kHz in pitch, many adults hear little above roughly 15 kHz, especially when other sounds are present. CD-quality audio, however, is sampled at 44.1 kHz and so can store frequencies up to about 22 kHz. By filtering out tones above the 15 kHz cutoff, you reduce the amount of information that has to be stored, with little or no effect on the perceived sound quality. (And even listeners who can hear such tones wouldn't have heard them anyway on cheap computer speakers unable to produce such frequencies.)
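To make the first example concrete, here is a minimal Python sketch of the idea. It is only an illustration under my own assumptions, not how any particular encoder is implemented: it uses a slow, naive DFT, the 1 kHz and 18 kHz test tones are arbitrary choices, and all of the function and variable names are made up for this example. The 15 kHz cutoff is the one mentioned in the text.

```python
import cmath
import math

SAMPLE_RATE = 44100   # CD sample rate in Hz
N = 441               # 10 ms of audio, so DFT bins are SAMPLE_RATE / N = 100 Hz apart
CUTOFF_HZ = 15000     # cutoff from the text; real encoders choose their own


def dft(x):
    """Naive discrete Fourier transform (O(N^2), fine for a small demo)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Naive inverse DFT, returning only the real part of each sample."""
    n = len(X)
    return [(sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]


def bin_frequency(k, n=N, rate=SAMPLE_RATE):
    """Frequency in Hz of DFT bin k (bins above n/2 mirror the lower ones)."""
    return min(k, n - k) * rate / n


# Test signal: an audible 1 kHz tone plus a quieter 18 kHz tone (arbitrary choices).
low = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(N)]
high = [0.5 * math.sin(2 * math.pi * 18000 * t / SAMPLE_RATE) for t in range(N)]
mixed = [a + b for a, b in zip(low, high)]

# Zero every frequency bin above the cutoff, then convert back to samples.
spectrum = dft(mixed)
filtered_spectrum = [c if bin_frequency(k) <= CUTOFF_HZ else 0
                     for k, c in enumerate(spectrum)]
filtered = idft(filtered_spectrum)

kept = sum(1 for k in range(N) if bin_frequency(k) <= CUTOFF_HZ)
max_error = max(abs(f - l) for f, l in zip(filtered, low))
print(f"frequency bins kept: {kept} of {N}")
print(f"largest difference from the 1 kHz tone alone: {max_error:.2e}")
```

After filtering, what remains is (to within rounding error) the 1 kHz tone by itself, and roughly a third of the frequency bins no longer need to be stored at all. A real encoder does much more than this, but the saving comes from the same place.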

Similarly, if a piece of music contains a loud bass drum hit (as most rock and roll does, a couple of times each second), the ear is too busy reacting to the percussive hit of the drum to register any other sounds at all for a few milliseconds. By simply omitting the samples immediately after such sounds, less information needs to be stored while still maintaining the same perceived sound quality. (Note: this is merely an example. I am not aware of any encoder which actually does this.)

Using sophisticated techniques such as these, lossy audio compression formats such as Ogg Vorbis and MP3 can achieve results that most listeners cannot tell apart from the original CD-quality sound, yet are a mere 10 to 20% of the size.

And what's even better, being more aggressive with these techniques can result in files which are less than 5% of the original size but still sound quite good on normal equipment (think FM radio quality).
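To put those percentages in more familiar terms, the short Python sketch below converts them into bitrates and file sizes. It assumes standard CD audio (44.1 kHz, 16 bits per sample, two channels) as the "original"; the function names are made up for this example, and the figures are plain arithmetic rather than measurements of any particular encoder.

```python
SAMPLE_RATE_HZ = 44100   # samples per second per channel (CD audio)
BITS_PER_SAMPLE = 16
CHANNELS = 2             # stereo

# Bitrate of uncompressed CD audio in bits per second.
cd_bits_per_second = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS


def compressed_kbps(fraction):
    """Bitrate in kbit/s of a file that is `fraction` of the original size."""
    return cd_bits_per_second * fraction / 1000


def megabytes_per_minute(fraction=1.0):
    """Storage needed for one minute of audio, in megabytes (10^6 bytes)."""
    return cd_bits_per_second * fraction * 60 / 8 / 1_000_000


for label, fraction in [("uncompressed CD audio", 1.0),
                        ("20% of original", 0.20),
                        ("10% of original", 0.10),
                        ("5% of original", 0.05)]:
    print(f"{label:>22}: {compressed_kbps(fraction):7.1f} kbit/s, "
          f"{megabytes_per_minute(fraction):5.2f} MB per minute")
```

So 10 to 20% of the original corresponds to roughly 140 to 280 kbit/s, and 5% to about 70 kbit/s, or around half a megabyte per minute of music instead of more than ten.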
