Technical Note BC-05 - Broadcast Audio
FITTING THE BANDWIDTH
A Discussion on the Nuances of Broadcast Limiting

BACKGROUND

Most FM radio station operators are aware of the finite allocation of bandwidth available to each station on the US broadcast band (88-108MHz). Filtering and limiting are employed to meet that requirement. In the early days of FM broadcast, good technical limiting did not emphasize the loudness of the signal; rather, it addressed the FCC requirement to remain within the allocated channel by preventing carrier modulation from deviating more than 75KHz to either side of the assigned center frequency, a total swing of 150KHz.

A LITTLE HISTORY OF LIMITERS

Early limiters achieved this goal, for the most part, by responding to the shortest peaks and lowering the overall level (like a fast volume control) to a point where the loudest peaks would just hit 100% modulation, then slowly ramping the gain back up during softer passages. This worked okay in the 1960s, when most FMs played movie soundtracks and orchestral music, which sounded best with minimal processing. However, with the arrival of the 1970s and pop music taking on a more aggressive sound, the old technology began to show its age. GMs in competitive markets wanted more loudness, so the first step was to shorten the time these limiters took to recover from a peak transient. The reasoning was that if the gain rose rapidly after each peak, the average loudness would increase. On the downside, this method of increasing loudness brought a new artifact: "gain pumping." In extreme cases, the program material would lose all sonic impact and turn to lifeless mush, the sonic equivalent of putting the music through a strainer.

It wasn't until the early 1980s that systems like the CBS Volumax were eclipsed by more modern methods employing "clipping" to increase loudness without the attendant "pumping" effects. Such new-generation devices included products by Harris (MSP-100), Dorrough (DAP-610), Orban (Optimod 8100) and others. Variations on where in the air chain clipping was employed soon followed. Aggressive designs employed actual clippers following the stereo subcarrier + main mix; this came to be called the "composite clipper."

THE CRUX OF THE MATTER

Today, with formats all sounding alike, the only thing left to distinguish competitors in a given market is either loudness or a certain firmness to the sound. The choice depends on the decisions made by the GM and the Director of Engineering. Loudness will never be last in line, as long as the format is pop/rock-oriented.

So what is loudness anyway? One might define it as the magnitude of a sound, but that's too broad, so let's get a little deeper. The perception of loudness depends on the complex system of human hearing. It's composed of a mechanical system (the eardrum, cochlea, nerves) and a psychological system (the brain, memory, emotions). Because of the response time for signals to be processed by the ear-brain system, it has been established that the system of human hearing is average-responding -- that is to say that very short term, high amplitude spikes are not perceived as strongly as longer-duration sounds. Upon this characteristic is based one of the fundamental cornerstones of broadcast audio processing: emphasizing the average RMS power level of the program signal.

So how do we accomplish this? Much program material has rather "spiky" peaks and a rather high peak-to-average ratio. If we try to mash it all down with compression, the end result sounds disgustingly mashed together. That's because compression, by changing the volume instantaneously, changes the rise angle of the signal. The rise angle, or rise time, of a transient is the single determining factor in what we perceive as "punch" and "kick" and "dynamic."
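As a rough illustration of what "peak-to-average ratio" means in numbers, the following Python sketch computes the ratio of peak to RMS level (often called the crest factor) in dB. The "spiky" test signal is a hypothetical construction made up for this example, not a measurement from any station.

```python
import math

def crest_factor_db(samples):
    """Peak-to-RMS ratio of a block of samples, in dB."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(peak / rms)

# One cycle of a pure sine: the peak-to-RMS ratio is sqrt(2), about 3 dB.
n = 1000
sine = [math.sin(2 * math.pi * i / n) for i in range(n)]

# Hypothetical "spiky" signal: the same sine at a quarter of full scale,
# plus a single full-scale spike one sample wide.
spiky = [0.25 * s for s in sine]
spiky[100] = 1.0

print(f"sine : {crest_factor_db(sine):.1f} dB")
print(f"spiky: {crest_factor_db(spiky):.1f} dB")
```

A single narrow spike adds almost nothing to the RMS level but sets the peak, which is why the spiky signal shows a much higher ratio than the plain sine.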

MY "WAVE VELOCITY FACTOR" THEORY

In studying the launch angles of various versions of a signal passed through different processing chains, I soon noticed an analogy to velocity, whether it be of a piston/diaphragm in a transducer, or a swinging hammer. In the case of the hammer, it is easy to visualize that a hammer moving rapidly can do a lot more concussion damage than one moving very slowly, even if it only moves a short distance. A 1/4" stroke with tremendous velocity has more impact than a 12" stroke that takes several seconds to complete. The stroke length could be compared to the carrier deviation, or signal amplitude of the program. Much of the impact we sense in music percussion comes from this factor which I call "velocity" -- the speed with which the oncoming wavefront arrives and completes its rising edge, that is, the zero to ninety degree portion of the rotating vector. I chose to use the term "velocity factor" here, but I must first differentiate it from transmission line velocity factor, which is a different use of the phrase. The concept I am discussing here has nothing to do with transmission lines or the speed of a traveling wave. Within the context of this special definition, it has to do with the characteristic rise time within one cycle of a wave. Let's illustrate what I mean by this:

Here is an unprocessed sine wave. For the sake of argument, let's say it's over-modulating the transmitter, but it's got the full dynamic range of the original "program." Its launch angle is pretty steep, about 25 degrees less than 90 (noting the angle between the red and the blue arrows). Note that angle. Its power factor works out to 68% (in theory, it maxes out at 70.7%, but this measurement was a fraction of a dB below the instrument's limit, hence the lower reading).

Here is the same waveform, compressed by a non-clipping limiter. Notice the different launch vectors. The signal processed by compression has a shallower launch vector, about 45 degrees. I derive my "velocity factor" based on these vectors. Let's set some arbitrary standards. A 90-degree vector shall equal 100% velocity. A 0-degree vector (zero amplitude) shall equal a velocity of zero. Therefore, a 45-degree vector equals a velocity of 50%. A square wave would have infinitely fast rise time, so its velocity would be 100%.

So our unprocessed sine wave appears to have a launch angle about 25 degrees short of 90 (roughly 65 degrees), which puts it at roughly 70% velocity. Our compressed but unclipped sine wave has a launch angle of about 45 degrees, which puts it at a velocity of 50%. Notice that this discussion conspicuously omits mention of peak amplitudes. While they weigh into the RMS power factor from a reference point based on an arbitrary amplitude, they have little meaning when we start to compare various "100% modulation" waves of different launch angles.
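The arbitrary scale set out above (90 degrees is 100%, 0 degrees is zero, linear in between) can be written as a one-line mapping. This is only a sketch of that scale; the function name is made up for the example, and the unprocessed wave's angle is taken as 25 degrees short of 90, as described for the first illustration.

```python
def velocity_factor(launch_angle_deg):
    """Map a launch angle (0 to 90 degrees) onto the 0 to 100% velocity scale."""
    if not 0.0 <= launch_angle_deg <= 90.0:
        raise ValueError("launch angle must be between 0 and 90 degrees")
    return 100.0 * launch_angle_deg / 90.0

unprocessed = velocity_factor(90 - 25)  # about 72%, i.e. roughly 70%
compressed = velocity_factor(45)        # 50%

print(f"unprocessed: {unprocessed:.0f}%  compressed: {compressed:.0f}%")
```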

The compressed wave has much less velocity. Even the RMS power factor is barely more than half that of the uncompressed wave, about 36%. So how do we overcome this?

Let's look at what happens to the launch vector angle when clipping is applied instead of compression alone:

Here's that exaggerated clipped wave. In practice, we wouldn't clip this much, but it makes it easier to illustrate what's happening to the launch vector. Notice how, despite the amplitude being just half that of the uncompressed wave, the launch vector is nearly identical to that of the uncompressed wave? The velocity factor is nearly 70%. Even the RMS power factor has increased substantially, despite the fact that some of the weighting is still based on amplitude. The RMS power is 44%. Now if this wave represented the rising edge of a snare drum hit, you could see how the "hammer" can do as much "damage" in this compressed and clipped version as with the longer "throw" of the uncompressed version. There is not nearly as much "impact" in the version that was compressed without clipping; that "hammer" moves at half the speed. Since the ear perceives the launch angle of the signal as the leading cue for dynamics, we can say that the "dynamic range" of the signal processed by clipping is essentially the same as the raw signal, despite the fact that it's clipped and half the peak amplitude. This is great news, since the peak amplitude is what we need to limit to maximize use of the allocated channel bandwidth in an FM transmission.
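The RMS figures quoted for the three sine waves can be checked numerically. The sketch below assumes (this is an assumption about the illustrations, not something the note states) that the compressed wave is the full-scale sine at half gain and that the clipped wave is the full-scale sine hard-clipped at half amplitude. As a stand-in for the launch vector it also compares the slope at the zero crossing, which does not depend on how a plot is scaled.

```python
import math

N = 100_000  # samples in one cycle

def sine_cycle(amplitude):
    return [amplitude * math.sin(2 * math.pi * i / N) for i in range(N)]

def hard_clip(samples, threshold):
    return [max(-threshold, min(threshold, s)) for s in samples]

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def launch_slope(samples):
    """Rise per sample immediately after the zero crossing that starts the cycle."""
    return samples[1] - samples[0]

raw = sine_cycle(1.0)          # unprocessed, full amplitude
compressed = sine_cycle(0.5)   # gain cut in half, shape unchanged
clipped = hard_clip(raw, 0.5)  # full-amplitude wave clipped at half amplitude

for name, wave in [("raw", raw), ("compressed", compressed), ("clipped", clipped)]:
    relative_slope = launch_slope(wave) / launch_slope(raw)
    print(f"{name:10s} rms={rms(wave):.3f}  relative launch slope={relative_slope:.2f}")
```

Under those assumptions the RMS values come out near 0.707, 0.354 and 0.442, in line with the theoretical 70.7% and the quoted 36% and 44%. The clipped wave keeps the full launch slope of the raw wave, while the compressed wave rises at half that rate.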

Since our hearing relies on the speed of the wavefront to interpret dynamics and impact, we can often discard some of the amplitude and the mind will fill in the rest subconsciously. This is the primary rule of psychoacoustics as applied to well-implemented broadcast processing. Of course, it is largely an illusion, but a very effective one.

Now in reality, such extreme clipping would be impractical, but I chose the sine waves to make illustration easier to follow. In the real world there are two parts to a program signal:

1. The useful part -- the part containing the most audible information
2. The part containing "theoretical peaks" -- the part that often takes an extra 7-9dB of dynamic range but contributes little to the audible impact of the sound.

Let me qualify #2 a little bit: As an audiophile and "purist," I know my contemporaries would have me burned at the stake for making such a statement. However, in the world of broadcasting, it is a constant tug of war between noise floor (a form of distortion, believe it or not) and faithful reproduction of the original material. There are times when clipping to raise the average above the noise floor is the lesser of two evils and results in effectively lower distortion. That being said, let's go on:

Given the two characteristics of the typical musical waveform as mentioned above, we can conclude one fact if we seek to maximize the loudness: we must find a way to discard the "theoretical peaks" --those short spikes which add little loudness but trigger massive limiting action -- while doing as little damage as possible to the listenability of the program material.

If we have to clip, then how much clipping is acceptable? That depends on two factors:

1. The quality expected of the final product
2. The type of program material (some material is more forgiving than others)

I did some careful measuring of some of the commercial FM stations to see what the real world was doing. I made my measurements using an oscilloscope connected to the last IF stage ratio detector on a laboratory FM tuner. The local rock station was clipping for durations as long as 5 mS. Depending on the material, the result ranged from barely noticeable to slightly plagued with intermod distortion when heavy bass percussion was present. Other stations had varying degrees, but none clipped for less than 1 mS during percussive transients. One millisecond is nearly impossible to hear, except under classical musical program conditions, on very high resolution hi-fi equipment.
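The measurements above were made on an oscilloscope. For anyone working from a digital recording of the tuner output instead, a minimal sketch of the same idea is to look for the longest run of consecutive samples pinned at the clip level. The function name, the 48 kHz sample rate and the test data below are all assumptions made for the example.

```python
def longest_clip_ms(samples, sample_rate_hz, clip_level, tolerance=1e-3):
    """Longest run of consecutive samples pinned at the clip level, in milliseconds."""
    longest = run = 0
    for s in samples:
        if abs(s) >= clip_level - tolerance:
            run += 1
            longest = max(longest, run)
        else:
            run = 0
    return 1000.0 * longest / sample_rate_hz

# Hypothetical capture at 48 kHz: one 1 ms flat top and one 5 ms flat top.
capture = [0.2] * 100 + [1.0] * 48 + [0.2] * 100 + [-1.0] * 240 + [0.1] * 50
print(longest_clip_ms(capture, 48_000, clip_level=1.0))  # 5.0
```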

It so happens that most of the "theoretical peaks" we want to eliminate are also under 1 mS in duration. This fortunate combination of characteristics makes possible a limiter design which retains a high degree of dynamic range, leaving the rise times of the program material passing through it largely intact. This type of clipping can be applied nicely to a broad-band limiter system.

Let's look at some real-world program waveforms.

Here's 5 seconds of typical program audio. It's a selection containing bright brass leads, string backup and typical jazz drum set percussion. Because it's recorded rather "bright" and because of the nature of brass sound, it has a high peak-to-average ratio and not much RMS power: just 7% as averaged over 5 seconds. Note the tall, thin spikes that appear in abundance. This is how it appears and how it leaves the studio mixing board.

Now let's see what broadcast processing does to "beef up" the sound:

Here's the same program material as processed through a good air chain. Note that the peaks are still there, but the average level has come up considerably. The average RMS power over the entire duration is 23%. The sound is fuller, with more bass punch than the original, and you can still hear the kick of the snare drums. No gain pumping is evident.

Let's magnify the horizontal so we can see 20mS of time in detail:

Here we see the raw signal off the mix board, just at the onset of a dynamic accent in the music. Note the high peak-to-average ratio and the thin, needle-like spikes of considerable amplitude in this sample. The spikes actually about equal the amplitude of the more substantial "body" of the signal. Strict limiting of such a signal would push the "body" or average level (which equates to loudness by nature of the ear-brain mechanism discussed earlier) down so low as to make the on-air sound seem weak, anemic, uncompetitive.

What we need to do to correct this situation is to remove the short-term spikes, or lower their amplitude without using a super-fast limiter (the latter of which would reduce the launch angle of the transients, thus removing much of the sonic impact). This is where the clipper comes into play. Let's look at the magnified output of our air chain:

Here, we are looking at the output of a tuner, tuned to our station. "But it doesn't look clipped!" Perhaps not, but remember, we are using clipping judiciously and the signal undergoes much harmonic filtering prior to broadcast, so clipping artifacts are very minimal. However, the end result is a more powerful average signal. The low frequency component saw a slight boost from multiband processing, and the spikes are still present, but shifted to points within the now-dominant average RMS power of the program -- the part with the most information. Notice also that the symmetry is restored. The spikes have been reduced without altering the timbre of the music.

LIMITER BALLISTICS PLAY A ROLE

Now that we've covered the seldom-observed phenomenon of vectors as they relate to perception of sonic impact, let's briefly touch upon the dynamics of the time constants of the gain control circuits in the limiter itself. As with clipping, the rule is to use timing in moderation. Too fast an attack time results in "ducking" or "suck-out," while too slow a response time results in gross clipping, or worse, overmodulation. The key here is "reasonably fast" but not instantaneous.

There are two parameters to limiter ballistics: attack time and decay or recovery time. Let's talk about the latter for a moment. This is one of the characteristics which determine the quality of the listening experience. Usually, slower is better, and again, a balance must be found between maintaining average loudness and retaining a sense of "openness" and "contrast" without obvious gain pumping or "breathing," or worse, mashing the dynamics to mush. Part of the quality that makes music sound "transparent" and "open" is the brief, but barely noticed, silence, or near silence in between percussive attacks. The silence may be only 20mS long, but it provides the contrast that pleases the listener and helps convey the sense of dynamics.

The attack and recovery times required will be affected by the amount of clipping employed and by where in the chain the clipper resides. Usually, the clipper allows greater leeway in the attack timing and a longer recovery time constant, because clipping ultimately controls overshoot. The sense element of the loop, which tells the gain stage how much to lower the gain, is affected by this: the less the overshoot, the less the gain needs to be reduced, and hence the less the undesirable effects of a longer recovery time will be heard. The clipper gives us a little more grace on the ballistics timings.
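Attack and recovery behavior is easier to reason about with a concrete model. The sketch below is a generic digital peak detector with separate attack and release time constants, plus the gain a limiter would derive from it. It is a textbook-style approximation written for this note's discussion, not a model of any particular limiter, and the sample rate and time constants are assumed values.

```python
import math

def envelope(samples, sample_rate_hz, attack_s, release_s):
    """Peak envelope follower with separate attack and release time constants."""
    attack = math.exp(-1.0 / (attack_s * sample_rate_hz))
    release = math.exp(-1.0 / (release_s * sample_rate_hz))
    env, out = 0.0, []
    for s in samples:
        level = abs(s)
        # Rising level: follow at the attack rate. Falling level: at the release rate.
        coef = attack if level > env else release
        env = coef * env + (1.0 - coef) * level
        out.append(env)
    return out

def limiter_gain(env, threshold):
    """Gain that holds the detected envelope at or below the threshold."""
    return 1.0 if env <= threshold else threshold / env

# Assumed values: 48 kHz sampling, 2 ms attack, 10 ms release.
FS = 48_000
burst = [1.0] * 96 + [0.0] * 480   # 2 ms of full level, then 10 ms of silence
env = envelope(burst, FS, attack_s=0.002, release_s=0.010)
print(f"after 2 ms attack : {env[95]:.3f}")
print(f"after 10 ms release: {env[-1]:.3f}")
```

After one attack time constant the detector has covered about 63% of the step, and after one release time constant it has fallen to about 37% of where it started. A longer release therefore holds the gain down longer after each peak, which is the trade-off between loudness and "openness" described above.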

MULTIBAND PROCESSING

I just love it when people brag about multi-band processing. Some of the worst audio I have heard came from poorly-adjusted multi-band limiters. It's so easy to tune everything for fast recovery, so that not only is the whole audio spectrum constricted, but so is the audio within individual octaves, and finally massive baseband clipping gets applied to that mush, making the multi-band limiter the ultimate sonic Cuisinart. The rule here is to do it with subtlety. Instead of maxing out the compression range of every band to 15dB or more, and applying super-fast recovery times in the midrange and treble limiters, try using a maximum compression of 6dB and slow the recovery in the midrange down considerably. You might want to try this with the highs too. You'd be surprised how nicely the percussion comes back. Instead of just the reverberation of a snare drum hit, you'll even hear the snare drum "pop" itself.

Multiband compression should be used to level the spectrum, not pound it into mush. The idea is that the work of the protection limiter that follows is made easier by virtue of the fact that all energy spectra are presented to it in a balance, from low bass to high treble, rather than all bass and no midrange, causing pumping effects, for instance. In addition, radio is somewhat of an egalitarian medium; it's designed to make all program material sound consistent in loudness and spectral balance. Multiband limiting is essential to the latter effect. Some early experiments I conducted with using minimal processing on a pop music format were monitored in a traveling automobile. From this, I was able to quickly realize that processing is essential to getting a "hardy" sound on the radio. This is a comparative world. When you share space on the dial, you are constantly being compared to the "neighbors." So it's beneficial to compete with the neighbors.

I want to take multiband processing to the next step by applying waveform shaping to each band individually, for minimum intermodulation distortion and maximum loudness. I have been unable to find any commercial equipment applying separate shaping within divided bands -- only the summing of compressors into one wideband clipper, usually composed of diodes, which have a soft threshold.

HOW IT'S IMPLEMENTED

The design of a broadcast limiter must be carefully planned. The ballistics of the limiter gating circuitry have to be fast enough to respond to the average levels completely and totally, but not so fast that a single record click causes a gross "ducking" effect. A response time of 1-3 mS is generally good for most program content. So the limiter reacts after the first millisecond or so of the transient passing through the chain. What's left are the spikes. This whole thing passes through the clipper stage next. The clipping threshold is set just a bit higher than the limited audio, but low enough to chop off all the spikes that didn't get touched by the limiter variable-gain amplifier before it.

Every limiter has a gate voltage sampling or sense input. This circuit "sniffs" the signal and determines, by comparing it against a reference voltage, whether to increase or decrease the gain of the amplifier. In order for the benefit of the clipper to make any real advantage known, the sense circuit must sample the output after the clipper. If it sampled before the clipper, it would trigger on those "theoretical peaks" and the output would be much reduced (as much as 9dB!). By sampling the post-clipping signal, false triggering is eliminated and the overall average gain is greatly increased.
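To get a feel for why the tap point matters, the following sketch compares what a smoothed peak detector would read before and after a hard clipper on the same spiky signal. It is a deliberate simplification: it is open-loop (it does not simulate the gain stage reacting to the detector), and the sample rate, time constants and test signal are all hypothetical values chosen for the example.

```python
import math

def envelope_peak(samples, sample_rate_hz, attack_s, release_s):
    """Highest reading reached by a peak detector with attack/release smoothing."""
    attack = math.exp(-1.0 / (attack_s * sample_rate_hz))
    release = math.exp(-1.0 / (release_s * sample_rate_hz))
    env = peak = 0.0
    for s in samples:
        level = abs(s)
        coef = attack if level > env else release
        env = coef * env + (1.0 - coef) * level
        peak = max(peak, env)
    return peak

def hard_clip(samples, ceiling):
    return [max(-ceiling, min(ceiling, s)) for s in samples]

FS = 48_000      # assumed sample rate
CEILING = 1.0    # clipper threshold, standing in for 100% modulation

# Hypothetical program: a 1 kHz "body" at half scale, with a 0.25 ms spike
# reaching 1.4 (about 9 dB above the body) every 10 ms, for 100 ms in total.
program = []
for i in range(FS // 10):
    body = 0.5 * math.sin(2 * math.pi * 1000 * i / FS)
    in_spike = (i % 480) < 12
    program.append(1.4 if in_spike else body)

sense_pre = envelope_peak(program, FS, attack_s=0.002, release_s=0.2)
sense_post = envelope_peak(hard_clip(program, CEILING), FS, attack_s=0.002, release_s=0.2)
extra_db = 20 * math.log10(sense_pre / sense_post)
print(f"detector reads {extra_db:.1f} dB higher when it samples before the clipper")
```

The detector tapped ahead of the clipper always reads higher, because the unclipped spikes charge it further. In a closed loop that higher reading would translate into more gain reduction applied to the body of the program.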

Many limiters use diode clippers, which have a soft threshold of 2-3dB. A hard clip threshold gives an even tighter aperture, allowing modulation levels to hug 100%, rather than a fuzzy range from 96% to 100%. Diodes begin to add intermod distortion, which I heard on virtually every modern commercial FM station with a Hot AC or AOR format, beginning at about 96% modulation. That window of non-linearity is 4%. While soft clipping is ostensibly done to produce lower order harmonics, the higher order harmonics of "hard clippers" can be dealt with by downline filtering, with less audible intermod distortion and about 0.5dB more usable loudness -- roughly 5% more usable modulation. This principle applies to both monoband and multiband processing. However, if applying waveshaping to individual bands, some degree of low order clipping, or soft clipping, may add useful "fatness" to the bass register. I use a form of diode clipper processing in my new bass processor, which is not the last line of protection, so it's okay and beneficial in a narrow-band application like shaping the bass for increased RMS power. There is a limit to how much a waveform can be clipped before its fundamental becomes degraded. Soft clipping "rounds" the edges and steepens the slopes of the wave, which tends to add "whack" to the sound, especially kick drums, and is more beneficial than hard clipping in this case, when applied in moderation. Too much clipping would reduce real fundamentals and the result would be hollow sound when real bass is present. It's not advisable to go much beyond 20% clipping when processing bass like this. Even so, the duration should be limited to 100mS, lest continuous bass notes be degraded.
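The difference between a hard and a soft threshold can be sketched with two transfer functions. The soft curve here is a tanh shape with a knee at 75% of the ceiling (about 2.5 dB down), chosen only to mimic a 2-3 dB soft threshold; it is an assumption for illustration and not a model of a real diode. Two small helpers convert between modulation ratios and dB.

```python
import math

def hard_clip(x, ceiling=1.0):
    """Perfectly linear up to the ceiling, flat above it."""
    return max(-ceiling, min(ceiling, x))

def soft_clip(x, ceiling=1.0, knee=0.75):
    """Linear below the knee, then a tanh curve that approaches the ceiling."""
    mag = abs(x)
    if mag <= knee:
        return x
    span = ceiling - knee
    shaped = knee + span * math.tanh((mag - knee) / span)
    return math.copysign(shaped, x)

def ratio_to_db(ratio):
    return 20.0 * math.log10(ratio)

def db_to_ratio(db):
    return 10.0 ** (db / 20.0)

# At 96% of the ceiling the hard clipper is still linear; the soft one is not.
print(f"hard: {hard_clip(0.96):.3f}  soft: {soft_clip(0.96):.3f}")
print(f"96% -> 100% is {ratio_to_db(100 / 96):.2f} dB")
print(f"0.5 dB is a ratio of {db_to_ratio(0.5):.3f}")
```

With this particular curve a 96% input is already being bent, while the hard clipper passes it untouched. The conversions show that a 96% to 100% window is about 0.35 dB, and that 0.5 dB corresponds to a ratio of about 1.06.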

Finally, I use diode bridges to "clamp" the spikes of intense high frequency program material within the preemphasis circuit itself. You can see that a multi-pronged approach emerges here as the method behind effective and clean broadcast audio. (It should be noted that since making this HF clamp modification and doing away with the Sibilance Limiter, the high end has opened up remarkably. When monitored on an audio spectrum analyzer connected to a tuner, one can observe a full 4dB more output in the 15KHz region than on any of the brightest-sounding FM stations in the area. Sonically speaking, this resulted in a sense that the treble was not as compressed as with typical broadcast processing. Indeed, an A-B comparison of the CD player against the tuner audio revealed that the treble was in fact very much intact, despite being safely limited to 100% modulation on the loudest peaks.)

AN ACTUAL EXAMPLE

The typical path an audio signal follows in my station is fairly long and complex. It leaves the mixing board and hits a Spectro Acoustics 210 10-band graphic EQ, where the extreme low bass gets a boost and the upper midrange gets a modest 3dB boost for "presence", then, depending on program content, the very high end might get anywhere from 3dB to 9dB of boost.

Next, the signal is fed to my compander circuit, which levels the broadband volume over several seconds' time so it doesn't "gain ride". This circuit also has a noise reduction function, or downward expander. From there, it feeds into a new bass processor, which consists of a state-variable filter that splits the audio into <90Hz and >90Hz bands, the lower of which is fed to a bass "soft clipper" to increase the apparent loudness of bass, and to take care of some really excessive program material to prevent "pumping" of the main limiter. It is both a slow-attack variable gain bass amplifier and a soft clipper. The high pass band is fed to the summer, along with the bass processor output. The high pass band receives some special attention with a clamp circuit that begins to become effective at 5KHz and above. It consists of another soft clamp circuit composed of zener diodes capacitively coupled in the feedback circuit of the highpass amplifier. Everything sums back together and is fed to the main transmitter.
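The split-process-sum structure of the bass processor can be sketched digitally. The example below is not the state-variable filter described above: it swaps in a simpler one-pole low-pass at 90 Hz and derives the high band by subtraction, so that the two bands always sum back to the input when neither is processed. The function name and the 48 kHz sample rate are assumptions.

```python
import math

def split_bands(samples, sample_rate_hz, crossover_hz=90.0):
    """Split into a one-pole low band and its complement (high = input - low)."""
    a = 1.0 - math.exp(-2.0 * math.pi * crossover_hz / sample_rate_hz)
    low_state = 0.0
    low, high = [], []
    for x in samples:
        low_state += a * (x - low_state)   # one-pole low-pass
        low.append(low_state)
        high.append(x - low_state)         # whatever the low band did not take
    return low, high

# A steady (DC) input ends up entirely in the low band.
low, high = split_bands([1.0] * 48_000, 48_000)
print(f"low band settles to {low[-1]:.3f}, high band to {high[-1]:.3f}")
```

In a processor built this way the bass shaping (for example a soft clipper) would be applied to the low list only, and the result summed with the untouched high list.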

At the transmitter, the signal meets a 2-pole low pass <15KHz filter, followed by a preemphasis circuit that is accurate to within 0.25dB of the ideal FCC 75uS preemphasis curve. Within the circuit is yet another HF clamp, to further control excessive levels that might develop above 8KHz after preemphasis. Finally, the output of the circuit gets one more <15KHz filter stage to level off anything above 15KHz so as to prevent meaningless but detrimental triggering of the main limiter.
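For reference, the ideal 75 microsecond preemphasis curve is a single-zero response whose boost in dB follows directly from the time constant. The short sketch below evaluates it at a few frequencies; the function name is made up for the example.

```python
import math

TAU = 75e-6  # preemphasis time constant in seconds

def preemphasis_db(freq_hz, tau=TAU):
    """Boost of an ideal first-order preemphasis network, in dB."""
    return 10.0 * math.log10(1.0 + (2.0 * math.pi * freq_hz * tau) ** 2)

corner_hz = 1.0 / (2.0 * math.pi * TAU)   # about 2.1 kHz, where the boost is 3 dB
for f in (1_000, corner_hz, 8_000, 15_000):
    print(f"{f:8.0f} Hz: +{preemphasis_db(f):.2f} dB")
```

The boost reaches roughly 17 dB at 15KHz, which is why bright program material needs the HF clamps and filtering described here before it reaches the main limiter.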

Now, it finally makes it to the limiter itself! And here we have the gain stage, followed by a clipper, with the gain stage's sample loop reading the post-clipping levels. Recently, I also added yet another HF clamp circuit--this one in the feedback of the gain stage, to further soften any "hard" clipping that might occur should the program contain bell trees miked up-close, or should the deejay decide to jingle the car keys vigorously in front of a wideband studio condenser mic--both things that cause gross distortion on even the best air chains.

Finally, it's on to the stereo generator, which begins with a pair of extremely sharp cutoff low pass filters which are flat to 15KHz, 3dB down at 15.75KHz and 90dB down at 19KHz. At the end of the filters is a final, hard clipper to take care of any passband ripple that may occur due to aging and extremes of environmental temperature, and a simple one-pole LPF to smooth any slight final clipping that might occur. I am dead set against clipping the composite signal, because it will produce drastic crosstalk problems and unchecked harmonic output that can interfere with SCA and possibly 1st adjacent channels. I believe that no action should be taken that would compromise stereophonic channel separation; it should be practical to broadcast two separate programs in left and right channel with no crosstalk at all times.

It is possible to provide all the volume benefits by performing the clipping prior to the matrix and subcarrier modulator. With this simple modification alone, I was able to achieve loudness at 100% modulation that would require overmodulation to 120% to equal without this kind of final clipping. Good broadcast audio receives multiple levels of subtle massaging. Remember that each of the processes discussed above, moderately applied, can result in greatly increased listener coverage. Such processing can, as I've demonstrated in DX test situations, provide perfectly listenable audio without irritation, even when the reception is barely adequate to receive in stereo, where the signal-to-noise ratio is as low as 20dB. Effective processing can mask that noise. And that is a major goal of broadcast audio.
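A short numerical check shows why clipping the left and right channels ahead of the matrix is enough to hold the composite within 100%. The sketch assumes a common textbook scaling of the multiplex signal (45% for the sum, 45% for the difference on the 38 kHz subcarrier, 10% pilot) and ideal, unfiltered signals; the sample rate and test tones are arbitrary choices for the example.

```python
import math

FS = 192_000       # assumed sample rate, high enough to carry the 38 kHz subcarrier
PILOT_HZ = 19_000

def hard_clip(x, ceiling=1.0):
    return max(-ceiling, min(ceiling, x))

def composite(left, right, n):
    """One sample of the stereo multiplex: sum, difference on 38 kHz, and pilot."""
    phase = 2 * math.pi * PILOT_HZ * n / FS
    main = 0.45 * (left + right)
    sub = 0.45 * (left - right) * math.sin(2 * phase)
    pilot = 0.10 * math.sin(phase)
    return main + sub + pilot

# Overdriven test tones, hard-clipped to 100% before the matrix.
peak = 0.0
for n in range(FS // 10):   # 100 ms
    t = n / FS
    left = hard_clip(1.5 * math.sin(2 * math.pi * 1000 * t))
    right = hard_clip(1.5 * math.sin(2 * math.pi * 3700 * t + 1.0))
    peak = max(peak, abs(composite(left, right, n)))

print(f"composite peak: {100 * peak:.1f}% of full modulation")
```

With left and right each held to 100%, the sum and difference terms together can never exceed 90%, so the composite including the pilot stays at or below 100% without any clipping of the composite itself. In a real stereo generator the filters that follow the clipper can overshoot slightly, which this idealized sketch ignores.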

END
