Sunday, November 8, 2009

Music

I was asked recently to explain a little bit about music files, specifically MP3 files, with regard to quality.


The general idea was that the original files in the WAV format were over 100 MB in size, but after converting to MP3, the file size was reduced to roughly 3% of the original size. How can MP3 possibly be of any useful quality if it is so small?

As I know the inquisitive nature of the requester, I began by explaining some of the basics of the MP3 digital music file as we use them today:

*** Please be aware, this is not an 'official guide to audio files'. The info I am about to present has omissions and inaccuracies compared to the true nature of the audio file. I am trying to present an easily understood theory. ***


Firstly, today's digital music is saved, as with all 'computer' files, in binary on your hard drive. However, since most operating systems report file sizes in those magical B's we see so often (KB, MB, GB and so on), I chose to work with MB, as was used in the original request. I explained the basic idea that music files have to be 'stored' as individual pieces of data, with each piece able to produce its own sound.

So, a full song is made up of a collection of these pieces of data, played one after the other in the correct order. This should have you thinking about two things: how many pieces do you need, and how big are they?

The term Sample Rate is used to explain how many pieces there are in a song. Most common for CD quality MP3 music is 44.1 kHz. What this means is that 44100 times per second a Sample of sound is stored or played back. As you may have guessed, the larger the number, the smoother the audio will play back.

It is important, however, to keep in mind that, depending on the source, you can only increase that number so far before it starts to store empty sound. I explained using a highway scenario:

Pretend your job is to take a picture of one spot on a highway so that you can count cars. Let's pretend that 1 car passes every 2 seconds, and let's say that you take your picture every 1 second. The result would be that half of your pictures do not have cars in them. You would end up spending a lot of time and effort looking at empty data, which is kinda wasteful.
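For those who like to see it in numbers, here is a minimal sketch of the highway story. The function name and the 60 second observation window are my own assumptions for illustration, and it assumes cars pass at exactly 0, 2, 4... seconds:

```python
# Toy model of the highway story: one car passes the spot every 2 seconds.
CAR_INTERVAL_S = 2  # taken from the story above


def photos_with_cars(photo_interval_s, duration_s=60):
    """Return (photos_taken, photos_showing_a_car) over duration_s seconds.

    Hypothetical helper: a photo 'shows a car' only if it is taken at a
    moment when a car is passing (0 s, 2 s, 4 s, ...).
    """
    photo_times = range(0, duration_s, photo_interval_s)
    with_car = sum(1 for t in photo_times if t % CAR_INTERVAL_S == 0)
    return len(photo_times), with_car


total, hits = photos_with_cars(photo_interval_s=1)
print(f"{hits} of {total} photos show a car")  # half of them are empty
```

Taking a picture every 2 seconds instead would catch the same cars with half the pictures, which is the whole point of not sampling more often than the source needs.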

So, to answer our first question: for CD quality MP3 sound, we are looking at 44100 pieces of data for every second of audio playback.

If the thought of so many pieces is daunting to you, please make sure you are sitting down now. What we have to do next is apply a size to the pieces. We call this Bitrate. Bitrate is measured in bits per second. CD quality MP3 sound runs at 128000 bits per second. This means that our pieces above are approx. 2.9 bits each (128000 bits divided by 44100 pieces).

Those afraid of math, jump ahead a bit...

A three minute thirty second song (210 seconds) is made up of 9261000 pieces (Samples) of sound. If each piece is worth 2.9 bits, then the whole song is worth 26856900 bits; that's 3357112 bytes, or 3.2 MB.
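If you would rather let the computer do the math, the calculation above can be sketched like this. The variable names are mine, and I am assuming 1 MB = 1024 x 1024 bytes, which is what gives the 3.2 MB figure:

```python
# Rough MP3 size estimate, following the numbers used in this post.
SAMPLE_RATE_HZ = 44_100       # pieces (Samples) per second
BITRATE_BPS = 128_000         # bits per second
DURATION_S = 3 * 60 + 30      # three minutes thirty seconds = 210 s

samples = SAMPLE_RATE_HZ * DURATION_S

# The post rounds the size of each piece to one decimal place (2.9 bits).
bits_per_sample = round(BITRATE_BPS / SAMPLE_RATE_HZ, 1)

total_bits = samples * bits_per_sample
total_bytes = total_bits / 8
total_mb = total_bytes / (1024 * 1024)   # assumption: 1 MB = 1,048,576 bytes

# Shortcut without the rounding step: bitrate times duration.
direct_bytes = BITRATE_BPS * DURATION_S // 8

print(f"samples:      {samples}")
print(f"bits/sample:  {bits_per_sample}")
print(f"size:         {int(total_bytes)} bytes, about {total_mb:.1f} MB")
print(f"direct route: {direct_bytes} bytes")
```

The direct route (bitrate multiplied by duration) lands within a few thousand bytes of the piece-by-piece figure; the small gap comes only from rounding each piece to 2.9 bits.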

OK, now you have a rough idea of how an MP3 file size is determined. I will mention that there are further components to it all, but I will not try to explain them here; they are far too complicated for the question at hand.

Now that I have walked you around in circles for a little while, I can explain that the WAV format (as with a few others) is what we call an uncompressed format. What this means is that the file is saved in its truest form with no regard for file size (KB, MB...) and is played back by simply reading the file bit for bit. MP3 is one of many compressed formats. This means that a specific mathematical algorithm has been applied to the file as a means to reduce its file size. MP3 file compression also crops the sound a little bit for us, often eliminating sound that falls outside of the human audible sound range.
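To put a number on 'no regard for file size', here is a rough sketch comparing the two formats for the same 210 second song. Be warned that the 16 bits per Sample and the 2 channels (stereo) are my assumptions for typical CD-style audio, not figures from the original request, and the small WAV file header is ignored:

```python
# Rough uncompressed (WAV-style) size versus the 128000 bits/second MP3.
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16    # assumption: CD-style audio
CHANNELS = 2            # assumption: stereo
DURATION_S = 210        # three minutes thirty seconds
MP3_BITRATE_BPS = 128_000

wav_bytes = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS * DURATION_S // 8
mp3_bytes = MP3_BITRATE_BPS * DURATION_S // 8
ratio = mp3_bytes / wav_bytes

print(f"uncompressed: {wav_bytes} bytes")
print(f"MP3:          {mp3_bytes} bytes")
print(f"MP3 is about {ratio:.0%} of the uncompressed size")
```

The exact percentage depends on the settings of the original file and the bitrate you pick for the MP3, so treat this as a ballpark only.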

There are other types of compression out there, and I used the theory behind one of them to explain compression in really simple terms. Let's say you have the number 000000100000000000000000011111111111111.00000. You could store it as you see it (like WAV), or you could break it down: the leading 0's and the 0's after the decimal are not needed, so cut them; and the same digit appears a number of times in a row, so the compressed version you store is: 1(18-0's)(14-1's). Uncompressed is 45 characters; compressed is only 17 characters.

Again, the above is NOT really how compression works, but merely an explanation of how it could work.
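For the curious, the toy scheme described above (a simple form of what is usually called run-length encoding) can be sketched in a few lines. The function name is hypothetical, the output format just mimics the notation used in this post, and this has nothing to do with how MP3 actually works:

```python
from itertools import groupby


def toy_compress(number_text):
    """Drop unneeded zeros, then replace repeated digits with (count-digit's)."""
    integer_part, _, fraction_part = number_text.partition(".")
    integer_part = integer_part.lstrip("0")    # leading 0's are not needed
    fraction_part = fraction_part.rstrip("0")  # trailing 0's after the decimal
    significant = integer_part
    if fraction_part:
        significant += "." + fraction_part

    pieces = []
    for digit, run in groupby(significant):
        count = len(list(run))
        # A single digit is stored as-is; a repeated digit is stored as a count.
        pieces.append(digit if count == 1 else f"({count}-{digit}'s)")
    return "".join(pieces)


# The number from the example: 6 zeros, a 1, 18 zeros, 14 ones, ".", 5 zeros.
original = "0" * 6 + "1" + "0" * 18 + "1" * 14 + "." + "0" * 5
compressed = toy_compress(original)

print(len(original), original)
print(len(compressed), compressed)
```

Running it gives back the same 45 and 17 character counts as the hand-worked example.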

There you have it. If you were able to keep up with the rambling above and connect all the dots, you understand it all.

MP3 files use compression and cropping to reduce file size. As long as you keep the settings (Sample Rate and Bitrate) comparable to the original, only a minimal reduction in sound quality is to be expected.

For advanced audiophiles and regular listeners expecting truly high quality sound, MP3 does have its losses, but for the average MP3 device, using regular earphones for example, little to no issues are to be expected.

I like to use the MP3 format at 44.1 kHz and 320 kbps.