What does a “Bass Clipper” do?

Bass Clippers — an overview

By Cornelius Gould

Whenever audio is “clipped,” it is literally distorted. This distortion is very similar to what is used for that big LOUD rock guitar sound. The key to broadcast processing is to do this without the distortion being audible. Clipping in broadcast audio gives the audio more “impact” and, in most cases, also boosts perceived loudness.

Bass clippers probably came to prominence with the introduction of Bob Orban’s Optimod 8100 audio processor.

In the 8100, the purpose of the bass clipper is to more or less allow the bass processor to run at a more natural rate. That rate means the attack time is somewhat slow, and slow attack times mean you sometimes get large peak excursions that must be dealt with to control modulation. The cleverness of the 8100 is this: the peak is allowed to happen, but it is “chopped” off by the bass clipper. This provides instantaneous compression of the bass audio without losing bass “impact.” This is cool, but that’s not all!

The action of the bass clipper would be quite audible if left in that configuration. To some listeners, it may sound as if they have blown a woofer, and an annoying rattle would be heard whenever bass thumps were in the program audio. The solution is to follow the bass clipper with a low-pass filter. This effectively removes the distortion products that produce the “rattling sound,” leaving only clean, tightly controlled bass.
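
To make the clip-then-filter idea concrete, here is a minimal sketch in Python (using NumPy and SciPy). The clip threshold and low-pass cutoff are illustrative values chosen for the example, not settings from the 8100 or any other particular processor.

```python
import numpy as np
from scipy.signal import butter, lfilter

def bass_clip(bass, fs, threshold=0.8, cutoff_hz=200.0):
    """Hard-clip the bass band, then low-pass away the clip harmonics (the "rattle")."""
    clipped = np.clip(bass, -threshold, threshold)   # instantaneous peak control
    b, a = butter(4, cutoff_hz / (fs / 2.0))         # 4th-order low-pass after the clipper
    return lfilter(b, a, clipped)                    # clean, tightly controlled bass
```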

Since the bass clipper reduces bass peaks to a known, manageable level, the 8100’s overall program clipper can be set to provide more loudness without much concern for loud bass peaks that can cause annoying distortion on singers’ voices or other similar sounds.

Since the time of the 8100, many manufacturers have provided a host of user controls to adjust the bass clipping network to suit individual tastes. In the 8100, the bass clippers weren’t user adjustable, so they “were what they were”: adjusted for minimum audible effect and left at that. The difference now is that end users are given the ability to “drive the bass” at a much “hotter” level into the bass clipper. In some instances, the user is also given control over how the low-pass filter following the bass clipper works.

Manipulating this filter alters the “texture” you get from the bass clipper. Other schemes involve changing from hard to soft clipping, with the goal of finding the appropriate “texture” for the bass to further tailor your on-air sound.

 

 


Audio Processing for HD Radio – Pt2

PROCESSING GUIDE (Radio Guide Magazine)

Audio Processing for HD Radio –

Part 2

By Cornelius Gould

Getting the best sounding audio from digital transmission requires learning some new techniques. Cornelius Gould continues laying the groundwork for understanding the new generation of audio processors.

We are living in a unique time in communications history. For better or worse, mass entertainment continues to change from what was to what will be – and most of these changes revolve around the word “digital.”

PERCEPTUAL CODING

Virtually all “digital” entertainment media use bit-reduced or perceptual coding; such delivery to the general public will be a fact of life for radio and TV, as well as those new forms of mass communication that have yet to be invented.

It has been shown in studies that the use of perceptual coding is “acceptable” to the vast majority of the population simply because very few people pay attention to things they are not supposed to hear. And that is the trick behind how these systems work!

A basic understanding of how this all works is key to knowing the best way to pre-condition your program audio for HD Radio and web streaming. To do this, we will look at some of the common components used in today’s audio bit-rate reduction schemes, such as HE-AAC and MP3.

Of course, a complete discussion of these processes is quite complex and beyond the scope of this series. Our purpose here is not to be an “all-encompassing guide to codecs” so much as to help anyone new to audio processing for coded audio understand how best to use their processing tools to gain impressive results from HD Radio, web streams, and podcasts.

WHY PERCEPTUAL CODING IS NEEDED

Since it is impossible to broadcast all of the digital information available in linear digital audio due to strict bandwidth restrictions, much of the data has to be selectively discarded in order to make it “fit” in a manner that is as transparent as possible to the majority of “listening ears.”

There are, of course, definite limitations as to how far you can take this data reduction idea. Anyone who has listened to a dialup quality Internet stream can attest to this!

Since the exact nature of the CODEC used for HD Radio is unknown due to the proprietary nature of the system, we will need to focus on a coding scheme that best mirrors the performance of HD Radio. This is why I decided to focus on the HE-AAC codec, since its performance does indeed come closest to what we get from HD Radio. For simplicity’s sake, I will have to describe this all as what appears to be a step-by-step process. It is far from it.

HUFFMAN CODING

The first and most basic step of almost all bit reduction schemes is something called Huffman coding. Welcome to our Department of Redundancy Department.

Originally developed by an MIT student in 1952, Huffman coding looks for repetitive information and replaces it with a much smaller, simpler “description.” It is the most common means of data reduction for generic computer data, and we all use Huffman coding every day in the form of “.zip” files.

As an example, consider a digital audio file that contains a 60 Hertz hum component at –20 dB. This hum in our recording never changes. It is just there in the background as a constant.

Now this hum can take up an awful lot of repetitive data just to reproduce it digitally, especially if it is a really long file. What the Huffman coding scheme for digital audio basically brings to the table is the ability to replace all that with a simple descriptor.

HUFFMAN IN ACTION

What this descriptor does is tell the decoder: “In the background of this entire file, there is a 60 Hertz waveform; you need to re-create it.” It would also relay to the decoder that this 60 Hertz hum is -20 dB down, and to keep generating the “hum audio” until told to stop.

The decoder would then generate the appropriate hum at the prescribed specifications as part of the background of our recording. The encoder does not necessarily send the actual “hum audio” data.

All that is sent in our case is a description of the “hum audio” for the decoder to regenerate locally. Bam! – An awful lot of the data is removed and we have now made the audio file much smaller.
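
For the curious, the underlying principle – frequently repeated information gets the shortest description – can be sketched in a few lines of Python. This is a generic, illustrative Huffman coder for a string of bytes, not the actual table-building used inside any audio codec.

```python
import heapq
from collections import Counter

def huffman_codes(data: bytes) -> dict:
    """Return a {byte: bitstring} table; frequent bytes get the shortest codes."""
    # Each heap entry: [total_count, tie_breaker, [[symbol, code], ...]]
    heap = [[count, i, [[sym, ""]]] for i, (sym, count) in enumerate(Counter(data).items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        lo = heapq.heappop(heap)           # the two least-frequent groups...
        hi = heapq.heappop(heap)
        for pair in lo[2]:
            pair[1] = "0" + pair[1]        # ...each get one more bit prepended
        for pair in hi[2]:
            pair[1] = "1" + pair[1]
        heapq.heappush(heap, [lo[0] + hi[0], lo[1], lo[2] + hi[2]])
    return {sym: code for sym, code in heap[0][2]}

# A signal dominated by one constantly repeated value (our steady hum) codes very compactly:
table = huffman_codes(b"\x3c" * 10_000 + bytes(range(16)))
print(table[0x3c])                           # the repeated byte gets a single-bit code
print(max(len(code) for code in table.values()))  # the rare bytes get much longer codes
```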

LOOKING AT IT VISUALLY

Whenever I talk to anyone about bit-reduced audio, I inevitably fall back to describing bit-reduced video.

This is because our western society is so visually oriented there are lots of descriptive words for things we see, but very few to describe what we hear. In fact, most of us in western lands will typically notice strange things visually before we notice things aurally.

I can use this to advantage as an illustration of bit-reduced audio vs. video, as there are a lot of parallels with video bit-reduction. Visually, I can describe an entire series of phenomena (including the exact visual parallels with audio) in a couple of paragraphs, and the majority of individuals will understand what I am talking about. On the other hand, I could write an entire book on just one coding phenomenon using audio terms, and very few would have a clue as to what I am talking about even after reading the entire book!

Therefore I will describe how Huffman coding is used for video in a visual way and we can go on to draw parallels with audio from there.

A DAILY DISPLAY

A form of Huffman coding can be seen every day on digital cable services, including the popular “dish” type satellite services.

In these systems, the need to give the consumer more and more channels has resulted in removing more and more data from all the existing channels to “squeeze” additional things into the existing bandwidth. The side effects of doing this are subtle to most people, as they happen outside their normal realm of perception.

[Image: A JPEG picture with some artifacts from data reduction]

This is also true for audio coding. It can be heard, but you need to know where and how to listen in order to hear the side effects. More on that later. For now, to see Huffman coding in action on video, it is simply a matter of knowing where to look. These systems are called perceptual coders for the way they use a kind of sleight-of-hand trickery to accomplish their goal.

DO NOT LOOK BEHIND THE CURTAIN

In video services, its action can be observed best by watching background images. Most of what we see on TV is static, with only a small portion of the TV screen containing actual changing (moving) images. Since our brains are wired to pay attention to moving things, we will typically only concentrate on the moving parts of the picture.

For example, a person is talking on the TV screen. We, as humans, immediately focus on the fact that his mouth is moving, then read the person’s facial expressions and listen to the words to tell the rest of the story. Very few people even notice what is happening around the actor on the screen.

With this in mind, Huffman coding schemes for video are quite interesting. It all happens by managing background images. The Huffman coding part of a bit-reduced video encoding scheme communicates with the decoder in this way: “Here is the data that makes up the background of this scene. Paint it once, and keep repeating it until I send an update for it…”

The description conversation continues: “If the change in video is small, then I will only send the data that makes up that specific moving image, and the portions of the background that need to be updated.” From there, the decoder makes the appropriate changes.

NOW YOU SEE IT

There are limitations here. For example, think about what happens to video when the entire screen has to change (update) rapidly, such as when the camera is panning around quickly or is shaking a lot. Just try to make out any kind of detail in the picture. You typically cannot. It is usually a jumble of blocky images and jagged pixel squares for any bright objects that zip across the screen.

If you were to compare this to the original, you would most likely notice that the original does not have these annoying artifacts. It is highly likely that you could easily see all the images clearly on the original, even though there is a lot of camera movement. Now please remember this as we go along with our discussion on digital audio bit reduction.

ARTIFACTS

I highlighted the word “artifacts” above. In the bit-reduced audio/visual world, “artifacts” refers to the side effects incurred from throwing away so many bits that what is left can no longer be reproduced in a transparent manner.

The same thing can be observed when taking images from your digital camera and, for example, manipulating these images to make the file size smaller for use on web site pages, or for e-mail.

The JPEG (.jpg) image format is the still-picture equivalent of perceptual audio/video coding. You can remove an amazing amount of digital data from a picture and have it still look the same – or at least very close to the original. Remove too much data, and strange-looking things start to happen with color transitions and sharpness.

[Image: When large changes happen, the artifacts become very apparent]

These “strange things” are the still-frame version of the audio/video artifacts I will be referring back to repeatedly in this series.

As you have seen, the two JPEG compressed images show a visual parallel to audio artifacts. The first picture shows the typical quality of an image ready to post on a website. The second is the same picture at the same size, but with more data compression, which results in lots of visual artifacts. The more data you force the bit-reducing algorithm to throw away, the harder it is for the decoder to hide what it is removing from the original.
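
You can recreate that experiment yourself with almost any image tool. Here is a quick sketch using the Python Pillow library; the file names are placeholders, and the two quality settings are simply a “reasonable” value and a deliberately excessive one.

```python
from PIL import Image   # the Pillow imaging library

img = Image.open("photo.jpg")                 # any reasonably detailed photo (placeholder name)
img.save("web_ready.jpg", quality=85)         # typical web quality: artifacts are hard to spot
img.save("over_compressed.jpg", quality=5)    # far too much data discarded: blocking and color smear
```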

Audio can also contain just as much “artifacting,” resulting in many strange sounds and noises that were not part of the original recording.

A KEY CONCEPT

Now, in the descriptions above I mentioned something very important. I stated that if individuals directly compared an unencoded copy of the program material with the encoded results, the changes would be more obvious.

With that in mind, let us return to our audio recording. In that example, what we described is an unrealistic picture to present to a Huffman encoder. Using Huffman coding alone on any audio material would not work very well.

The reason it would not work is because Huffman coding alone would not remove enough information to make any appreciable reduction in the amount of data needed to reproduce this audio file. This is why we cannot simply “zip” our audio on the fly to a decoder to “unzip it back to normal.”

Why is that? Because there are a lot of other things going on in the recording, such as the principal audio we are really trying to catch in the recording, room noises, etc. All of these other sounds are random in nature and do not lend themselves easily to Huffman coding technology. Other techniques will have to be exploited to make Huffman Encoding a more effective tool.

As we will see as we move along in our series, the audio must be broken up into smaller pieces, which allows the use of other data reduction tools (and in some cases, cascading these tools) to effectively remove enough data to create a smaller file while leaving a trail of vitally necessary “digital bread crumbs” behind for the decoder to reconstruct something that sounds pretty close to the original audio.

IT IS ALL UP TO THE CODEC

What we basically learned this time around is that what makes perceptual coding possible is the relationship between the bit-reducing COder and the DECoder (CODEC).

The encoder’s job is to decide what information to throw away, what information to simplify, and what information to keep. The decoder’s job is to take this information and present it to the end user in a way that makes the processes used by the encoder as inaudible as technically possible.

As we move deeper into what is going on between the encoder and decoder, it should become a bit clearer how to (and how not to) process your audio for bit reduced media.

 

Added note for my website: You can train your ears to hear some of these coding artifacts by going here:

http://www.cgould.com/listening-coding-artifacts/



Audio Processing for HD Radio – 1

Audio Processing for HD Radio –
Getting the best out of bit-rate reduced transmission systems.
(Originally in Radio Guide’s Processing Guide magazine)
Part 1 – Changes
By Cornelius Gould

2005 will be remembered as the year digital transmission exploded onto the scene. Many engineers worked tight schedules modifying existing transmitter sites – or, in some cases, completely rebuilding them – in order to implement this new technology.

As with any new enhancement to the broadcast service, there is always some learning curve to overcome to get the most reliable results from this new service.

With HD Radio technology, this learning curve is pretty intense, as it involves a completely separate transmission system that generates completely separate radio waves around your main AM or FM signal.

SOMETHING TO LEARN

Unlike previous advancements to the broadcast medium, this time around we are faced with a brand new technology whose inner workings are shrouded in mystery. Some broadcast engineers are faced with a scenario where their best efforts fall short due to some mysterious process within their digital transmitters – and the resulting audio might not be very nice sounding.

Others are following the guidelines presented by audio processing manufacturers and are having good results, but they still want to have a better grasp as to what is going on “under the hood” to better understand the HD Radio beast they have to handle.

While driving around my section of the country with an HD Radio receiver, I find it interesting to listen to how HD Radio is being implemented by broadcasters. One thing that jumps out at me is how many systems still need work in many basic areas.

THE OTHER DELAY

The most common problem is a lack of diversity delay on the analog channel. This is most apparent when a listener with an HD Radio drives in and out of the optimum reception conditions for the digital carriers.

Essentially what happens is they find themselves jumping back by as much as eight seconds in time when the HD carriers are decoded, only to shoot ahead as many seconds when the radio rolls back to analog service. As a result, the listener can totally miss entire sentences in a conversation.

Other issues involve audio quality and audio consistency.

I have heard stations with as much as a 12 dB level difference between the digital and analog services. Another annoyance comes from stations whose digital transmitters are fed with the clipped, pre-emphasized FM analog processed audio.

Even if de-emphasis is turned on (to make it flat again), the resulting audio heard from the digital service will still be very unpleasant to the ear.

FOCUS ON THE AUDIO

The above examples show there is a lot that needs to be learned by the broadcast engineering community about this technology. If this system is to be successful, proper adjustment and implementation is essential.

Of course, I am aware there are many people out there who feel HD Radio should not have been allowed to be used due to its use of the spectrum within what has been traditionally considered the “guard bands” of AM and FM signals. While there is considerable debate as to the validity of many aspects of this new service, I will not be addressing these issues.

The point of this series is not to convince anyone to change their opinions on the validity of this system one way or another, but rather to help point broadcasters in the right direction to get the best audio performance from what we have to work with today. Since my specialty is audio processing, naturally my focus is in that area.

I will pick apart what is going on (as best as anyone outside the iron gates of Ibiquity can), and with these tips my hope is that you will be able to get the best audio performance from your digital transmission system by understanding both the audio processing and the system in general.

NOW FOR SOMETHING COMPLETELY DIFFERENT

Broadcasting with HD Radio technology is an entirely different beast. From a technology point of view, it shares almost nothing in common with the legacy broadcasting technology with which we have become accustomed.

The biggest difference – and the hardest concept for many broadcasters to grasp – is that HD Radio is not a “linear” transmission system.

Analog broadcasting can be thought of as a linear process. That is, every sound that leaves the audio processor and enters the transmitter will be sent over the air with very little change.

On the other hand, HD Radio is not a linear process. Only a portion of the audio you feed into the HD Radio system actually makes it to your listeners. The art of deleting large amounts of audio data while preventing the human ear from “hearing” it – for the most part – is called “Perceptual Coding.”

In fact, most of the audio data is thrown away and, through some neat ear trickery and the proper use of technology, very few people will ever know!

LINEAR AUDIO

What we are describing here is not a difference between digital audio and analog audio. Digital audio can be linear too.

Analog audio is given its name because the entire process works by literally copying sound waveforms electrically onto some medium – a literal copy of the sound image onto the medium of choice.

For our discussion here, this medium is a radio wave. As the sounds from the mouths of your announcers strike a microphone, their voice is instantly turned into an electrical signal which travels through your audio chain to change the radio signal directly in proportion to the sound at the studio microphone.

In the case of digital audio, the sounds of your announcers are still picked up by a microphone and the electrical signals are turned into digital data. This is done in a device known as an Analog to Digital converter. (The reverse happens in a Digital to Analog converter.)

BIG BANDWIDTH

There is one major problem with the basic concept of such a linear digital transmission system: assuming the process is meant to be of “CD Quality” – whatever that is – we find the system takes an enormous amount of data to accomplish its task when compared to the analog system.

By way of comparison, the analog system can create the same sound quality as digital with only 20 thousand Hertz of electrical space. Digital, on the other hand, requires almost 1.5 million Hertz of electrical bandwidth to reproduce the same kind of audio. (While this watered-down explanation is not entirely technically accurate, it is meant to get the point across to as many readers as possible.)

If the quality is the same, and digital is not as efficient as analog, why even bother with digital?

WHY DIGITAL

The advantage digital audio has over analog is that the process of converting audio into digital bits is inherently immune to noises present in any transmission or storage medium. In other words, for all its disadvantages, the main thing you gain is the ability to make endless copies of the data and still have it sound as good as the original.

Please note that this benefit assumes you are not changing the data in any way during the copying process. This is an important factor to remember for reasons that will become apparent very soon.

There is no way to broadcast full linear CD Quality audio to listeners with the transmission systems in use for the past 80 or 90 years. Remember: it takes about 1.5 million Hertz of electrical space to reproduce linear digital audio; the most electrical bandwidth any Digital Audio Broadcast (DAB) service in existence has to work with is about 256 thousand Hertz of space.
For broadcasters using IBOC (HD Radio), the space available is even less: about 96 thousand Hertz of space. And this is assuming there is only one digital program service; there is even less space available if the secondary (or tertiary) channels are used in “Multicasting.”
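
The arithmetic behind that squeeze is simple enough to check yourself. Expressed in bits per second (rather than the “Hertz of space” shorthand used above), and using the approximate figures quoted in this article:

```python
# Linear "CD quality" PCM: 44,100 samples/sec x 16 bits/sample x 2 channels
linear_bps = 44_100 * 16 * 2       # = 1,411,200 -- the "almost 1.5 million" figure

dab_bps      = 256_000             # roughly the most any DAB service has to work with
hd_radio_bps = 96_000              # approximate space for a single HD Radio program

print(linear_bps / hd_radio_bps)   # ~14.7 -- the "fit it into about 1/16" squeeze
print(linear_bps / dab_bps)        # ~5.5
```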

[Image: The digital audio needs to “fit” in about 1/16 its “normal” bandwidth]

So, how do you squeeze 1.5 million Hertz of data into 96 thousand Hertz of space?

PUTTING 1.5 MILLION HERTZ IN A 96 THOUSAND HERTZ BAG

Digital Audio Broadcast services have to use methods to permanently, and destructively, discard most of the digital audio data in order to make it all “fit” within the tight spectrum constraints.

The method of throwing away this “excess” data is commonly called “bit reduction” – where varying amounts of digital bits of data are discarded to make what is left fit within signal bandwidth constraints.

Now, remember what I said before: perfect copies of digital audio data contain no noise or errors so long as there is no change in the digital data across many copies. Bit-reduced audio involves major changes to the digital data right from the first copy and, as a result, the decoded audio data has very little resemblance to the original source.

The trick is for the decoded bit reduced audio to be perceived to be a “good enough” (if not close to a perfect) copy of the original. Audio processing becomes extremely important in this area as having optimum audio performance can enhance what is left of the audio and can even make or break the entire process.

SPECIAL PROCESSING NEEDS

Over the past nine or so years, this is the area in which I have been working. How can an audio processor enhance this process for the better? What new processes can be developed specifically for this new technology?

As my wife and friends can tell you, I am obsessed with these questions. What I could not have realized back then is how much of what I learned over the years of doing this is paying off now in such a major way.

I got involved with mixing audio processing with bit-rate-reduced perceptual coding technology back in 1996 when a friend and I decided to start up a 24/7 Internet radio station. Of course, the big thing that stuck out at me was the quality of the coded audio.

It was not good, of course, and I set out to see just how far I could take improving the audio quality. What started out sounding like gravelly, telephone-grade programming rapidly evolved into something that sounded more like an AM radio broadcast within a month of intense audio processing work.

INTO THE CODEC JUNGLE

Along the way I became intrigued by these perceptual audio CODECs and how you can use audio processing to get the most out of their performance.

It also did not hurt that this interest took hold when I started to work for Telos Systems – one of the leaders in the handling of coded audio for broadcast applications. If I ever needed to know why certain CODECs behaved the way they did, the answers were in a thick, deeply technical reference book somewhere in their library!

Since that time, with every new CODEC that is released, I anxiously jump on board to see what it can do – and then, immediately after that, what I can do with it audio performance-wise.

NEW CONCEPTS

I had not done much research work with AM or FM audio processing in quite some time. With my normal day job and dealings with small non-commercial stations I still spend lots of time adjusting what I call “legacy broadcast audio processing” on a regular basis.

But, by far, most of my research fun comes from learning how I can make perceptual CODECs “play” at peak performance through the use of external audio processing. To make bitrate reduced perceptual CODECs work at their best level, I find it necessary to research as much as possible about the technology in question.

This is also the same sort of information the broadcast engineer in the field needs to understand to make HD Radio technology play at its best.

After all, just how good would your ability to adjust your legacy AM or FM station to sound its best be if you did not understand certain fundamental things, such as the internal design of the transmitter, the choice of transmission line and antenna, and the way all of that can affect your audio processing efforts?

HELP ON HAND

The major audio processing manufacturers have been doing a great job of staying ahead of the curve for you. Each of them has come up with decent presets that will work acceptably right out of the box, but you and I know that the best results come from hand-tailoring your processing to your facility and market.

Doing this with bit reduction CODECs requires some knowledge of what is going on under the digital radio transmitter hood. As this series progresses, my goal is to shed some light on this and point you in the right direction to learn more as you need it.

For example, while the exact nature of the CODEC used for HD Radio is a complete mystery to anyone outside Ibiquity corporate circles, a reasonable guess by many (including me) is that it is either the HE-AAC CODEC or some derivative closely related to it.

WHERE WE ARE HEADED

During my audio processing experimentation with both HE-AAC / aacplus technology and HD Radio, I find the results to be extremely similar – close enough that I can test ideas at home in my workshop using aacplus and implement them the next day through the HD Radio system with virtually identical results.

With that correlation in mind, I plan to base our discussions around making HE-AAC / aacplus sound its best with audio processing. To start this series, we need to look at how the HE-AAC bit reducing CODEC operates.

In a previous article (Radio Guide, September 2003), I touched upon the basics of perceptual coding, although it was somewhat outside the scope of that series of articles. If you want to go back into the archives, the article was titled “The Rock and The Pin” – a simplified discussion of perceptual audio coding, but it will make a nice foundation as we start this series.


 


Audio Processing U

Audio Processing University – Lessons for your ears!

We audio processing maniacs refer to all kinds of “distortion” that only we seem to be able to hear. You will hear us toss around all kinds of weird terms to describe these seemingly mythical sounds we hear when evaluating audio processing systems.

This page is designed to give you an audio class for your ears, because it is all about how to listen for these sounds.

Listening to these files will also help you with some of the audio concepts described in my Audio Processing 101 page.

We’ve taken a small piece of audio from Joan Armatrading’s song “Lover’s Speak” to use for today’s lesson. I will use grossly exaggerated audio to show what we are hearing when we use terms such as “Harmonic Distortion,” “Pumping,” etc.

(When your browser asks what to do with the files below, just save them to your disk for playback.)


Our first audio clip demonstrates our sample song as it is supposed to be heard.

Normal Audio
This next clip demonstrates Harmonic Distortion.

Harmonic Distortion Demonstration

 

We are now up to an audio clip which demonstrates “Intermod Distortion.” Notice how the piano and vocal audio in this clip sound “jagged” due to bass note activity.

Intermod Distortion Demonstration

 

This last clip demonstrates the phenomenon known as “pumping.” Notice how the piano and voice “jump down” in level when the kick drum hits…

Pumping Demonstration

Hope this helps!

 

C. Gould

 

 

Audio Clip used in the demonstration – Joan Armatrading – “Lovers Speak” from the album “Lover’s Speak” (c) 2003 by Denon Records.

 


Audio Processing 101

Audio Processing 101

By Cornelius Gould

What Is Audio Processing?
To help with understanding some of the terms used here, I suggest you check out Audio Processing University part One.

Audio processing has its roots in the early days of radio, stemming from the desire to automatically control peak levels in a broadcast chain. It came about as a way to assist radio console operators. In those days, the console operators literally controlled the modulation of a broadcast station by manually turning the levels on their console up and down. If the levels were to peak above 100% on the board, the transmitter would operate illegally. So, the operators had to act quickly when this situation arose to keep distortion and illegal over-modulation to a minimum.

This is where the automatic peak limiter came into play. Physically, the peak limiter would be a device with audio inputs and outputs. The unit operates by monitoring the levels feeding into it and making level corrections any time the levels exceed a pre-set reference point (or threshold). Usually this reference point was set to 100%.

If program levels remain below 100%, the limiter assumes a “unity gain” state. That is, the audio levels on the input equal the output. If the audio exceeds 100%, the limiter reduces its gain by the amount of the overshoot. That is, if the input is, say, 175%, the limiter pulls the level down by 75 points so the output stays at 100%.

The action of these peak limiters is usually quite fast – the time it takes for these units to act is on the order of milliseconds. After the peak condition passes, the limiter quickly recovers to unity gain, usually within two seconds or less.
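
For those who like to see the behavior in code, here is a bare-bones sketch of such a peak limiter: unity gain below the threshold, an essentially instantaneous gain reduction on overshoot, and a gradual recovery back toward unity. The release constant is illustrative, not taken from any particular unit.

```python
import numpy as np

def peak_limiter(x, fs, threshold=1.0, release_s=2.0):
    """Bare-bones peak limiter: the output never exceeds `threshold` (our 100% point)."""
    x = np.asarray(x, dtype=float)
    release = np.exp(-1.0 / (release_s * fs))    # per-sample recovery factor
    gain = 1.0
    y = np.zeros_like(x)
    for n, sample in enumerate(x):
        if abs(sample) * gain > threshold:       # e.g. input hits 175% of reference
            gain = threshold / abs(sample)       # pull the output back to 100% instantly
        else:
            gain = 1.0 - (1.0 - gain) * release  # drift back toward unity gain (~2 s constant)
        y[n] = sample * gain
    return y
```
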
The Birth of Modern Audio Processing

With the advent of Rock and Roll radio, someone discovered that setting these limiters to “kick in” sooner than 100% caused the music to sound “bigger.” This happens because of the “squeezing effect” of the limiter compressing most of the natural level variations of the program material down to one uniform level. This, of course, was an abuse of the intention of a transmission limiter, but those who dared to break the rules discovered that they appeared to sound much louder than competitors who used the limiter units as they were intended.

The following two clips demonstrate this effect:

Audio which has no processing on it

Audio that is processed to show the squeezing effect.

(The processing on the latter file is done with wideband processing – remember this as you read along…)

OK, back to our lesson…

This “squeezing effect” also causes the sound of the music to change dramatically, so a rock song played through one of these radio stations after heavy limiting would sound nothing like it does when you play it on your home stereo. It had a “much bigger” sound than normal.

As these loudness wars heated up, stations began connecting these units back to back, creating (at that time) stations that sounded incredibly loud when compared to the ones that “played by the rules.”

The main problem with this approach to loudness is that it is still totally dependent on the DJ at the studio keeping the board running as close to 100% as possible. While Mr. DJ could no longer accidentally over-modulate the transmitter, the loudness effect hinged totally on how much studio level was being sent to the limiter.

A Better Mousetrap

Realizing this limitation, manufacturers created units called Automatic Gain Control (AGC) devices. The AGCs were basically slower versions of the transmission limiters, and their specific function was to control the average level of the program signal. The compression ratio is also much looser than a transmission limiter’s. In a transmission limiter, the compression ratio would be as close to infinity-to-one as possible, meaning the output level should never exceed the 100% reference regardless of the level of signal present at its input. In an AGC, the typical ratio used was about ten to one (10:1), meaning that for every ten dB increase in input level, the output would only rise 1 dB.
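
The difference between the two ratios is easiest to see as plain numbers. The snippet below is just the textbook static compression curve, not any particular manufacturer’s detector, and the threshold and levels are arbitrary examples.

```python
def compressed_level_db(input_db, threshold_db=0.0, ratio=10.0):
    """Static compression curve: above threshold, output rises 1 dB per `ratio` dB of input."""
    if input_db <= threshold_db:
        return input_db                        # unity gain below the threshold
    return threshold_db + (input_db - threshold_db) / ratio

print(compressed_level_db(10.0, ratio=10.0))   # AGC at 10:1 -> only 1 dB over threshold
print(compressed_level_db(10.0, ratio=1e9))    # limiter, effectively infinity:1 -> ~0 dB over
```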

What the AGC units accomplished was a consistent operating level into the limiter unit. This allowed the best of both worlds for most broadcasters. The AGC would keep overall levels consistent over a wide range of erroneous operating levels from the studio, while the transmission limiter was allowed to be used more for what it was intended (to protect the transmitter from over-modulation). If users wanted to “abuse” the limiters to gain more loudness, the effect was now much more consistent.

This method of creating loudness has its limitations. It is a wideband processing chain, operating across the entire audio spectrum with a single AGC / limiter. The side effect is that any dominant material concentrated in a specific frequency range will cause a reduction in audio level for the entire audio spectrum.

If a song starts off a cappella, the wideband processing chain will adjust its output level to bring the voice to “100%” modulation.

But when the rest of the instrumentation resumes, the voice is suddenly pushed way back behind the instruments.

If the instrumentation stops, and the vocals are once again the only sound present in the program, the vocals will once again be adjusted to 100%.  This obvious “up and down” action on the vocals by the backing instruments is usually referred to as “pumping”.

Slowing down the recovery time on the AGCs and limiters solved this problem to a degree, but the effect of loudness was lost as the “smashing” effect was diminished.

Some improvements came with the creation of the “freeze gate.” Freeze gates are usually implemented in the AGC section of a broadcast chain. The gate monitors the input levels of the AGC and, whenever the audio level falls below a user-determined level, the unit holds its gain state until the audio crosses the threshold again. This can be considered a “dual speed” AGC system, where the AGC recovers faster on louder material and slows to a virtual freeze on low-level material.
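
Sketched in code, the freeze gate idea is simply an AGC that stops updating its gain whenever the input falls below the gate threshold. All of the constants below are made up for illustration; a real unit would use smarter level detection.

```python
import numpy as np

def agc_with_freeze_gate(x, fs, target=0.5, gate=0.05, db_per_s=2.0):
    """Slow AGC whose gain ride is frozen while the input sits below `gate`."""
    x = np.asarray(x, dtype=float)
    step = 10.0 ** ((db_per_s / fs) / 20.0)   # per-sample gain ratio for the chosen ride speed
    gain = 1.0
    y = np.zeros_like(x)
    for n, s in enumerate(x):
        if abs(s) >= gate:                    # real program audio: keep riding gain
            if abs(s) * gain > target:
                gain /= step                  # ease the level down
            else:
                gain *= step                  # ease the level up
        # below the gate: hold ("freeze") the current gain until audio returns
        y[n] = s * gain
    return y
```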

…Then came the 80’s

The 80’s brought on a new style of music, featuring heavier bass percussion than before. This is where the wideband audio processing chain started to show its limitations. Bass elements, such as kick drums and synthesized beat tracks, would cause pumping in the wideband chain.

In this situation, each beat of bass percussion would “punch holes” in the upper frequency areas of the audio spectrum, and certain songs would have totally unnatural artifacts introduced by the wideband chain. One perfect example from this era was Kim Carnes’ “Bette Davis Eyes.” Every time there was a bass drum beat, Kim’s voice and the synthesized keyboard would just about disappear for the duration of the bass drum sound, then suddenly “jump back” in between beats. This was one of the songs that showed the broadcast community that something better was necessary.

Multiband Processing

Multiband processing actually has its roots in the early to mid 70’s, and the concept behind it was simple:

Split the audio into multiple frequency bands – say Highs, Mids, and Lows (bass) – and feed the output of each band to its own AGC unit. The audio could then be put back together as a single source. Doing this gives two improvements:

1) The audio was now spectrally more consistent from cut to cut, as the multiband AGC, by its very nature, tends to “re-equalize” the audio to achieve more-or-less the same amount of treble, mid, and bass out of wildly varying recording styles.

2) Since there was no longer a real problem with bass notes modulating the entire audio spectrum, the units could be driven harder to achieve much greater loudness levels with fewer side-effects than was possible with the wideband chain.
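
For readers who think in code, the band split itself can be sketched quite simply. This is a toy three-way split built from Butterworth filters; the crossover points are illustrative, and real processors use carefully matched crossover filters so the bands sum back together cleanly, which this sketch does not attempt.

```python
import numpy as np
from scipy.signal import butter, lfilter

def split_three_bands(x, fs, low_hz=200.0, high_hz=3000.0):
    """Split audio into Lows / Mids / Highs so each band can feed its own AGC."""
    nyq = fs / 2.0
    b_lo, a_lo = butter(4, low_hz / nyq, btype="lowpass")
    b_md, a_md = butter(4, [low_hz / nyq, high_hz / nyq], btype="bandpass")
    b_hi, a_hi = butter(4, high_hz / nyq, btype="highpass")
    lows  = lfilter(b_lo, a_lo, x)
    mids  = lfilter(b_md, a_md, x)
    highs = lfilter(b_hi, a_hi, x)
    return lows, mids, highs

# Each band would then get its own gain control before being summed back together,
# for example:  processed = agc(lows) + agc(mids) + agc(highs)
```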

Early multiband AGC systems were used ahead of wideband limiters.  This worked fine for a while, but as the loudness wars continued to heat up, and people were forced to rely more and more on the wideband limiter, the problems of the wideband chain were re-introduced via the wideband limiter.

The next obvious step was to replace that wideband limiter with a multiband one, giving us the basis behind modern audio processing as we see it today.

The following two clips are processed to the extreme, but they should show the difference in the two methods of processing quite clearly…

Heavily processed wideband audio processing

 Heavily Processed multiband audio processing

 

Today’s Systems

Today’s audio processing chains are a mixture of units each used for a specific purpose.


[Image: The typical modern audio processing “chain”]

Virtually every processing chain starts with some kind of gain riding AGC (element #1 in our typical chain).

The Wideband AGC

 

[Image: Wideband AGC]

The job of the gain riding AGC is simply to control the levels of the programming leaving the main control studio. Doing this ensures that the rest of the chain will always see ideal operating levels, even if the jock is “pegging” the console level meters.

The designers of these devices – and ultimately YOU – must decide: how exactly do we handle the correction once the threshold of “too much level” is crossed?

The obvious answer to this question is “as slow as possible so it cannot be heard”.

What exactly does “slow” mean?

The slower you make the AGC, the longer it takes to recover from a loud segment after the operator re-adjusts the “board levels”.  But it also means that the audio is corrupted the least by AGC activity.

Speeding up the AGC means that the unit will recover more quickly from “board operator” errors, but at the same time it will produce unpleasant “artifacts.” These artifacts will make themselves known as “pumping” on the percussion elements of your music.

This brings us to the second element in the chain.

The Multiband Leveler

A leveler can be a slightly faster version of the AGC, or it can be one of the quite elegantly designed RMS detector types, where only the relative power level of the audio is sensed rather than the instantaneous peak level.

The leveler is usually used for two purposes at once.

1)    To provide a “signature sound” for your station by automatically adjusting the equalization of your programming to your taste.

2)    Since it is multiband, it can aid in gain riding by operating faster than the AGC ahead of it. The multiband nature of this stage will not cause much noticeable pumping on percussion elements, assuming it too is not running “too fast.”

 

The operating range of the multiband is usually less than that of the AGC, so as not to cause too much of an upset in spectral balance; this limits any changes in the timbre of some programming elements to an “acceptable” point. Some people and manufacturers will give their multiband more range than others. In any case, the usefulness of the multiband for adding any significant “pep” to gain riding is typically limited.

Stage three: The (dynamic) limiter

We are referring to “the limiter” as a faster version of the leveler with the appropriate compression ratio to make it a “limiter.” We make this distinction since many manufacturers call the clipping section of an audio processor a limiter.

A limiter is generally used as a tight peak level control device. It is usually tighter than the leveler, but it cannot limit brief transients, so clipping limiters are used after this stage to clip off the transients.

These days, limiters are made multiband (two or more bands) to mask, to some degree, most of the effects of the extreme waveform distortion a limiter is supposed to produce.

The Final Limiter

This sounds so… “final,” doesn’t it? Anyway, this stage removes any loose peaks left in the audio from the limiter stage (there will be lots of them!). It does this by “chopping,” or “clipping,” the peaks from the audio waveform.

This stage is also used to increase apparent loudness, since most peaks can be removed without much of the audience noticing. Peaks can easily exceed the 100% average-modulation reference point by 10% (usually more). So if you have peaks exceeding the maximum average modulation level by 10-20%, you would have to reduce overall modulation by that much to prevent illegal operation.

Clipping these peaks from the audio makes it possible to turn the audio levels back up by that same 10-20% – all without introducing any artifacts that the listening audience will notice.

The more clipping used, the louder the station appears to sound. This also means that the louder the station becomes, the more the clipping becomes noticeable as harmonic distortion.

This effect is purely aesthetic, and it requires the user to decide how much distortion he or she is willing to live with to achieve a certain amount of perceived loudness. This is because the final limiter is a distortion device – similar in concept to a guitar distortion pedal. The more distortion, the louder and bigger it seems. The problem is that the more distortion we can hear, the harder the audio is to listen to for long periods of time. The more distortion, the shorter that time frame becomes.

Cleverly designed final limiters incorporate distortion-canceled clipping, which buys “extra” range by masking most of the objectionable distortion. The net result (if done properly) is more perceived loudness with less obvious distortion. This process is not transparent, and the distortion masking techniques all have a sound unique to the specific type of distortion control used. The differences between the “Brand X” processor and “Brand Y” lie mostly in the “sound” of their final clipping limiters, and how they interact with the music that makes up the format of the station using the processor.


The 75 Microsecond Pre-Emphasis Curve

[Chart: The U.S. 75 microsecond pre-emphasis curve used for U.S. FM broadcasts]

What is Pre-Emphasis?

Pre-emphasis is an early form of noise reduction. The high audio frequencies are boosted by some amount and transmitted over a delivery system. At the receive point, the high frequencies are attenuated (de-emphasized) by an amount proportional to the boost. By doing this, noises introduced by the transmission system are also reduced by that amount.

This type of noise reduction is used in analog tape and phonograph recordings, as well as radio broadcasts. This is why high frequency material (such as “esses” and cymbals) is sometimes difficult to handle through these mediums.
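
For the technically inclined, the 75 microsecond figure is the time constant of a simple first-order network, which puts the corner of the curve near 2.1 kHz. Here is one common way to approximate the pre-emphasis / de-emphasis pair digitally – a matched single pole/zero sketch, not the exact analog curve. The two functions are exact inverses, so running them in series comes out flat.

```python
import numpy as np
from scipy.signal import lfilter

TAU = 75e-6   # the 75 microsecond time constant; corner near 1/(2*pi*TAU) ~ 2.1 kHz

def preemphasize(x, fs):
    """Boost the highs, approximating the single-zero 75 us curve."""
    p = np.exp(-1.0 / (fs * TAU))           # matched digital pole/zero location
    return lfilter([1.0, -p], [1.0 - p], x)

def deemphasize(x, fs):
    """The matching receiver-side roll-off, exactly undoing preemphasize()."""
    p = np.exp(-1.0 / (fs * TAU))
    return lfilter([1.0 - p], [1.0, -p], x)
```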

The chart above shows the mandated limits for broadcasting on the radio bands here in the United States. Here is what pre-emphasized audio sounds like:

This is the same audio clip with De-Emphasis applied, just like what happens through your FM radio.

These clips are also heavily processed so you can hear some of the effects of heavy clipping on the pre-emphasized audio.
