Broadcast engineers who have been in the business for more than 5 or 10 years will agree that the landscape and technical requirements have changed drastically. One must adopt a “lifelong learner” attitude to stay in the business. At the top of the list is converting the audio plant from analog to digital. Add to that the requirement to stream the program feed to listeners over the internet.
Let’s take a moment to “Look Under the Hood” at this stream of 1’s and 0’s.
When analog audio is digitized, the resulting file will be one of two types: lossless or lossy.
It doesn’t take a “rocket scientist” to understand the difference between the two. Lossless simply means the file, once created and then stored or transmitted, stays the same without any change to the data. It keeps all the audio quality of the original source.
Lossy, on the other hand, will by necessity discard some of the data to aid in storage and transmission. What and how much data is discarded is a function of a compression algorithm based on a psychoacoustic model, which looks at how the human ear and brain perceive sound waves.
The most common lossless files are WAV (Waveform Audio File Format) and AIFF (Audio Interchange File Format). Both of these are uncompressed formats and are essentially the same quality; they just store the data a bit differently.
There are also some lossless files that are compressed. Wait…didn’t we just say that compressed files discard some of the data? This type of compression is purely mathematical: it removes redundancy in how the data is stored rather than removing any of the audio itself. If you have ever created a Word document or spreadsheet and used the zip tool to reduce the size of the file, then you have created a “compressed lossless” file. Once it is “unzipped” the file is the same as the original. It is akin to a secretary or court reporter using shorthand to take notes in a meeting.
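The zip analogy can be tried directly. The short Python sketch below uses the standard zlib module (the same family of compression used by zip tools) on a made-up block of repetitive bytes standing in for audio data. The sample data is an assumption for illustration only; FLAC and ALAC use audio-specific methods, but the round-trip idea is the same.

```python
import zlib

# Stand-in for raw audio data: a repetitive byte pattern (hypothetical, not real PCM).
original = bytes(range(256)) * 400

# Compress at the highest zlib level, then restore.
compressed = zlib.compress(original, 9)
restored = zlib.decompress(compressed)

print(len(original), "bytes before compression")
print(len(compressed), "bytes after compression")
print("identical after round trip:", restored == original)
```

The compressed copy is smaller, yet the restored copy matches the original byte for byte, which is exactly what “lossless” means.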
Two compressed lossless file formats are FLAC and ALAC. FLAC (Free Lossless Audio Codec) is an audio compression codec primarily authored by Josh Coalson. A digital audio recording compressed by FLAC can be decompressed into an identical copy of the original audio data.
ALAC (Apple Lossless Audio Codec) is similar to FLAC. It’s a compressed lossless file made by Apple. Its compression isn’t quite as efficient as FLAC, so your files may be a bit bigger, but it’s fully supported by iTunes and iOS (while FLAC is not).
Normally a lossless audio file used in the broadcast and consumer audio arena is labeled “CD Quality Audio”. The analog to digital conversion will have a sample rate of 44.1 kHz with a bit depth of 16 bits. This creates a bit rate for stereo of 1.411 Mbps. Be careful: people sometimes confuse bit rate and bit depth. They are not the same thing.
Bit depth is the number of bits used to describe each sample. Bit rate refers to the number of bits that are processed over a certain amount of time. In audio, this is usually stated as kilobits per second or Kbps.
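The CD-quality figure can be checked with simple arithmetic: sample rate times bit depth times the number of channels. Here is a minimal Python sketch; the function name is just an illustrative choice.

```python
def pcm_bit_rate(sample_rate_hz, bit_depth, channels):
    """Bit rate of uncompressed PCM audio, in bits per second."""
    return sample_rate_hz * bit_depth * channels

# CD quality: 44.1 kHz sample rate, 16-bit depth, 2 channels (stereo).
cd_stereo = pcm_bit_rate(44_100, 16, 2)

print(cd_stereo)              # 1411200 bits per second
print(cd_stereo / 1_000_000)  # 1.4112 Mbps
```

Notice that the bit depth (16) is only one of the three inputs to the bit rate, which is why the two terms should not be used interchangeably.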
By the way, there is something called “High-Res” audio used in the lossless streaming world. High-res audio is: “Lossless audio that is capable of reproducing the full range of sound from recordings that have been mastered from better than CD quality music sources.”
Therefore any file with a sample rate greater than 44.1 kHz and a bit depth greater than 16 bits is considered High-res.
Let’s move across the hall to the world of lossy audio.
Since lossless audio takes up a lot of storage space and is impractical to transmit over many internet connections, the industry has developed ways to reduce the amount of data without destroying the audio file. This procedure takes into account how the ear and brain receive and process sound. Certain sounds can be removed (saving data) and the brain can recreate or ignore the missing parts. One must be careful…remove too much data and the reproduced sound file will no longer sound like the original.
The first thing you need to learn about lossy audio is how the quality of the stream is stated. With lossless streams, the bit rate is determined by simply multiplying the sample rate times the bit depth times the number of channels. Once you compress a file this is no longer the case.
Quality in a compressed stream is simply listed as the amount of data processed over a period of one second, normally stated in Kbps (kilobits per second).
The most popular form of compression is MP3, which is short for MPEG Audio Layer III.
This form of compression can shrink a song by a factor of 10 to 12 while maintaining close to CD quality.
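To put that factor in terms of bit rate, the sketch below divides the CD-quality stereo rate (44.1 kHz, 16 bits, 2 channels) by the compression factor. The function name is an illustrative choice, and the 128 Kbps comparison is a commonly used MP3 rate rather than a figure from any particular encoder.

```python
# CD-quality stereo bit rate in Kbps: 44,100 samples/s x 16 bits x 2 channels.
CD_BIT_RATE_KBPS = 44_100 * 16 * 2 / 1000  # 1411.2 Kbps

def compressed_rate_kbps(ratio):
    """Bit rate that results from shrinking CD-quality audio by `ratio`."""
    return CD_BIT_RATE_KBPS / ratio

for ratio in (10, 12):
    print(ratio, round(compressed_rate_kbps(ratio), 1))  # 10 -> 141.1, 12 -> 117.6

# Going the other way: the factor implied by a 128 Kbps stream.
print(round(CD_BIT_RATE_KBPS / 128, 1))  # 11.0
```

A factor of 10 to 12 therefore corresponds to roughly 118 to 141 Kbps, which brackets the familiar 128 Kbps MP3.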
Specifically, MP3 is classified as a perceptual audio codec. Such codecs are based on perceptual models of the human auditory system. These models describe which elements in an audio signal can or cannot be perceived by the human ear, regardless of whether or not the listener has a highly trained ear.
Another form of compression is AAC (Advanced Audio Coding), an audio coding standard for lossy digital audio compression. AAC was designed to be the successor of the MP3 format. The Fraunhofer Institute for Integrated Circuits in Germany was a key contributor to the development of both.
The first version of the AAC codec was standardized in 1997 as part of MPEG-2. The MPEG-2 AAC codec was extended in the MPEG-4 standard by adding the perceptual noise substitution (PNS), spectral band replication (SBR), and parametric stereo (PS) tools.
Further improvements have created MPEG-4 “High Efficiency Profile” (HE-AAC). It typically uses 48 to 64 Kbps for stereo.
The High Efficiency AAC version 2 Profile applies a parametric approach to coding the stereo signal, achieving a further reduction in bit rate. Instead of transmitting two channels, the PS encoder produces a mono downmix, which is HE-AAC encoded, and extracts a small set of parameters from the stereo signal. These parameters enable reconstruction of the stereo signal at the decoder side.
HE-AAC v2 is typically run at no more than 64 Kbps, with most services using 48 Kbps or 56 Kbps.
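The “mono downmix plus parameters” idea can be illustrated with a deliberately simplified toy. The Python sketch below is not the actual PS tool: it assumes a single frequency band, keeps only one level parameter per channel, and all function names and sample values are hypothetical. The real encoder works per frequency band and per frame and also tracks how similar the two channels are.

```python
import math

def encode(left, right):
    """Toy encoder: mono downmix plus one level parameter per channel."""
    mono = [(l + r) / 2 for l, r in zip(left, right)]
    energy_mono = sum(m * m for m in mono)
    gain_left = math.sqrt(sum(l * l for l in left) / energy_mono)
    gain_right = math.sqrt(sum(r * r for r in right) / energy_mono)
    return mono, (gain_left, gain_right)

def decode(mono, params):
    """Toy decoder: rebuild two channels by scaling the mono signal."""
    gain_left, gain_right = params
    return [gain_left * m for m in mono], [gain_right * m for m in mono]

# Hypothetical frame: the right channel is the left at one quarter of the level.
left = [0.8, 0.4, -0.6, -0.2]
right = [0.2, 0.1, -0.15, -0.05]

mono, params = encode(left, right)
rec_left, rec_right = decode(mono, params)
print(params)
print(rec_left, rec_right)
```

In this special case, where the channels differ only in level, the decoder recovers both channels from one channel of audio plus two numbers. Real music is never that tidy, which is why the actual tool sends more parameters and why the result is an approximation of the original stereo image.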
There are a number of streaming audio services used on the web, each with its own protocol and bit rate.
Amazon, where possible, encodes its MP3 files using variable bit rates (VBR) for optimal audio quality and file sizes, aiming at an average of 256 Kbps. Using a variable bit rate allows the encoder to allocate a higher bit rate to the more complex sections of a music file while using a lower bit rate for the less complex sections. The average of these rates is then calculated to produce an average bit rate for the entire file that represents the overall sound quality. Some of the content is encoded using a constant bit rate (CBR) of 256 Kbps. This content will have the same excellent audio quality at a slightly larger file size.
*File Size: A typical 3-minute song takes up approximately 5 MB of storage space.
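Both the averaging and the file-size footnote come down to simple arithmetic. The Python sketch below uses a made-up 3-minute song split into three sections. The section lengths, bit rates and function names are assumptions for illustration, and a real VBR encoder chooses its rate frame by frame rather than in one-minute blocks.

```python
def average_bit_rate_kbps(sections):
    """Duration-weighted average of (seconds, Kbps) sections."""
    total_seconds = sum(seconds for seconds, _ in sections)
    total_kbits = sum(seconds * kbps for seconds, kbps in sections)
    return total_kbits / total_seconds

def file_size_mb(bit_rate_kbps, seconds):
    """Audio data size in megabytes (1 MB = 1,000,000 bytes)."""
    return bit_rate_kbps * 1000 * seconds / 8 / 1_000_000

# Hypothetical 3-minute song: simple intro, complex middle, moderate ending.
sections = [(60, 192), (60, 320), (60, 256)]

avg = average_bit_rate_kbps(sections)
print(avg)                     # 256.0
print(file_size_mb(avg, 180))  # 5.76
```

At an average of 256 Kbps, three minutes of audio works out to just under 6 MB, in the same ballpark as the approximate figure quoted above.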
YouTube streams video and audio separately and the web player/app combines them on the fly. Because of this, the audio bit rate is not directly affected by video quality as it was in the past.
The audio you hear during a YouTube video will usually be 126 Kbps AAC in an MP4 container or anywhere from 50-165 Kbps Opus in a WebM container. Changing video resolution (360p, 720p, etc.) in the video settings will probably not impact the audio stream, but it is likely that your connection performance will.
iTunes now offers AAC at 256 Kbps, though some older tracks may still only be available at 128 Kbps.
In our next installment we will see how all these numbers and protocols work in the OTA (over the air) broadcast world.