Technical | Jun 15, 2026 | 6 min read

How to mix screamed vocals so the words come through

Tensormix Team

Screamed vocals spike 10-15 dB above their body and pile overtones into the same 2-4 kHz zone as distorted guitars. A three-part approach to the onset, the harshness, and the consonants keeps the words legible instead of buried.

A screamed syllable is built like a small explosion: a sharp consonant onset, then a sustained vowel saturated with overtones from the open-throat technique, then a fast decay into breath. The onset can spike 10 to 15 dB above the body of the scream, and the overtone cloud sitting on top of the vowel lives in the same 2 to 4 kHz region as the upper midrange of a wall of distorted guitars. That collision is why screams that sound furious in solo turn into texture in the mix, and why turning the vocal up usually makes the problem worse rather than better. The fix is a chain that handles the onset, the overtones, and the consonants as three separate jobs.

1. High-pass before anything else

Insert a high-pass filter as the first plugin on the channel. Use an 18 to 24 dB per octave slope with the cutoff between 80 and 120 Hz for most screams, pushed to 150 Hz if the vocalist cups the mic or the room is live. Growls with intentional chest weight can take a lower cutoff, around 60 to 80 Hz. Screamed vocals carry far more low-frequency energy than sung ones because of the physical force behind them, and that energy is mostly handling noise, plosive body, and rumble that will load up every compressor downstream. Cutting it here is the difference between a chain that responds to the performance and one that responds to the mic stand.
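To get a feel for what those slopes remove, here is a minimal sketch that evaluates an idealised Butterworth high-pass. The Butterworth shape is an assumption (plugin filters vary), and the function name is made up for illustration. Each filter order contributes roughly 6 dB per octave, so order 3 stands in for 18 dB per octave and order 4 for 24.

```python
import math

def butterworth_hpf_db(freq_hz: float, cutoff_hz: float, order: int) -> float:
    """Gain in dB of an ideal Butterworth high-pass at freq_hz (hypothetical helper)."""
    ratio = cutoff_hz / freq_hz
    magnitude = 1.0 / math.sqrt(1.0 + ratio ** (2 * order))
    return 20.0 * math.log10(magnitude)

# Cutoff at 100 Hz, order 4 (about 24 dB per octave).
for freq in (40.0, 50.0, 100.0, 200.0):
    print(f"{freq:>5.0f} Hz: {butterworth_hpf_db(freq, 100.0, 4):6.1f} dB")
```

One octave below the cutoff the filter is already about 24 dB down, while one octave above it is essentially untouched, which is why the cutoff can sit this high without thinning the scream itself.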

2. Subtractive sweep through the low mids

Drop a parametric EQ in after the HPF. Boost a narrow band, Q around 3, by 6 to 8 dB, and sweep slowly from 200 Hz up to about 1 kHz. The frequencies where the boost goes from informative to ugly are the resonances you want to cut. Two zones come up repeatedly: 300 to 500 Hz for boxy, papery build-up, and 700 to 900 Hz for nasal bite. On growls the mud usually centres on 400 to 600 Hz. Cut 3 to 6 dB at each offender with a medium Q (1 to 1.5). Henrik Udd's Dayseeker session is a good public reference for how surgical this can get; the Nail The Mix breakdown shows him making a precise cut around 234 Hz in FabFilter Pro-Q 3 specifically to clear boxiness before any dynamics processing.
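The difference between the narrow sweep boost and the wider corrective cut can be sketched with a standard peaking biquad. The coefficients below follow the widely used Audio EQ Cookbook formulas; the 48 kHz sample rate, the 400 Hz centre, and the function names are illustrative assumptions, not settings from any particular plugin.

```python
import cmath
import math

def peaking_biquad(f0_hz, gain_db, q, fs_hz=48000.0):
    """Peaking-EQ biquad coefficients (Audio EQ Cookbook form)."""
    a_lin = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    alpha = math.sin(w0) / (2.0 * q)
    b = (1.0 + alpha * a_lin, -2.0 * math.cos(w0), 1.0 - alpha * a_lin)
    a = (1.0 + alpha / a_lin, -2.0 * math.cos(w0), 1.0 - alpha / a_lin)
    return b, a

def response_db(b, a, freq_hz, fs_hz=48000.0):
    """Magnitude of the biquad at freq_hz, in dB."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq_hz / fs_hz)
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20.0 * math.log10(abs(num / den))

sweep = peaking_biquad(400.0, 7.0, 3.0)   # narrow search boost
cut = peaking_biquad(400.0, -4.0, 1.2)    # wider corrective cut
for freq in (200.0, 300.0, 400.0, 500.0, 800.0):
    print(f"{freq:>5.0f} Hz  sweep {response_db(*sweep, freq):+5.1f} dB"
          f"  cut {response_db(*cut, freq):+5.1f} dB")
```

The narrow boost isolates one resonance at a time while you search; the wider cut then removes the offender together with its immediate neighbours, so the correction does not sound like a notch.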

3. Fast-attack compression on the onsets

The peak onset of a scream is over within a few milliseconds, so this stage needs a compressor fast enough to catch it. An 1176-style FET compressor (Universal Audio's original, Waves CLA-76, IK Multimedia Black 76) is built for exactly this: its attack runs in microseconds, far quicker than most compressors. Use a 4:1 ratio (step up to 8:1 if the source is wild), a fast attack in the 50 to 200 microsecond range, and a release around 60 to 120 ms, aiming for 4 to 8 dB of gain reduction on the loudest syllable starts. Note that the 1176's attack and release knobs run backwards: turning them clockwise toward 7 makes them faster, not slower. The fast attack clamps the spike; the medium release lets the body of the scream breathe back out before the next syllable hits.
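The relationship between ratio, threshold, and gain reduction is simple arithmetic, sketched below with an idealised hard-knee curve. This is an assumption for illustration: a real 1176 has no threshold knob (you drive the input into a fixed threshold) and its knee is not perfectly hard, so read the numbers as a guide to how hard to push, not as meter readings.

```python
def gain_reduction_db(level_db: float, threshold_db: float, ratio: float) -> float:
    """Gain reduction (positive dB) of an ideal hard-knee compressor."""
    overshoot = level_db - threshold_db
    if overshoot <= 0.0:
        return 0.0
    return overshoot * (1.0 - 1.0 / ratio)

def threshold_for_target(peak_db: float, target_gr_db: float, ratio: float) -> float:
    """Threshold that yields target_gr_db of reduction on a peak at peak_db."""
    return peak_db - target_gr_db / (1.0 - 1.0 / ratio)

# A syllable onset peaking at -6 dBFS, aiming for 6 dB of reduction.
for ratio in (4.0, 8.0):
    thr = threshold_for_target(-6.0, 6.0, ratio)
    print(f"{ratio:.0f}:1 -> threshold {thr:.1f} dBFS, "
          f"check {gain_reduction_db(-6.0, thr, ratio):.1f} dB")
```

At 4:1 the onset has to overshoot the threshold by 8 dB to produce 6 dB of reduction; at 8:1 an overshoot of just under 7 dB does the same job, which is why the higher ratio suits wilder sources.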

4. A second, slower compressor for the body

After the peak compressor, insert a second one with a slower attack (15 to 30 ms), a gentler ratio (2:1 to 3:1), and a longer release (150 to 250 ms). An LA-2A-style optical compressor or any program-dependent VCA works. Aim for 2 to 4 dB of gain reduction. This one ignores the transients, which have already been handled, and rides the overall arc of the phrase. This is the step that keeps a long chorus scream sitting at a consistent level instead of ducking under the guitars every time the singer backs off the mic.
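Why a slow attack "ignores" the onset can be shown with a one-pole envelope follower, the kind of level detector many compressor designs are built around. The sketch below is a simplified model, not any specific unit: the input is a hypothetical, already-rectified level (a 3 ms spike at 1.0 followed by a 200 ms body at 0.3), and the time constants are picked from the ranges discussed for the two compressor stages.

```python
import math

def envelope(levels, fs_hz, attack_ms, release_ms):
    """One-pole attack/release envelope follower over rectified levels."""
    att = math.exp(-1.0 / (attack_ms * 0.001 * fs_hz))
    rel = math.exp(-1.0 / (release_ms * 0.001 * fs_hz))
    env = 0.0
    out = []
    for level in levels:
        coeff = att if level > env else rel
        env = coeff * env + (1.0 - coeff) * level
        out.append(env)
    return out

fs = 48000
burst = [1.0] * (fs * 3 // 1000)     # 3 ms consonant spike
body = [0.3] * (fs * 200 // 1000)    # 200 ms sustained vowel
signal = burst + body

fast = envelope(signal, fs, attack_ms=0.1, release_ms=90.0)    # peak stage
slow = envelope(signal, fs, attack_ms=20.0, release_ms=200.0)  # body stage
print(f"fast detector peak: {max(fast):.3f}")
print(f"slow detector at end of spike: {slow[len(burst) - 1]:.3f}")
print(f"slow detector peak: {max(slow):.3f}")
```

The fast detector tracks the spike almost completely, while the slow one has barely started to rise by the time the spike is over and then settles on the body level. That is the division of labour between the two compressors.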

5. Consonant-band presence boost

Hard consonants (t, k, d, the attack of s) carry most of the intelligibility, and they live in the 2 to 3.5 kHz band. Screamed delivery physically softens these consonants, and distorted guitars sit right on top of them. A 2 to 4 dB boost at 2 to 3.5 kHz, medium-wide Q (0.7 to 1.2), brings the consonants back. The alternative, which often sounds cleaner, is to cut 2 to 3 dB at 2.5 to 3 kHz on the guitar bus with the vocal playing; you make space rather than adding energy.

6. De-essing aimed at scream artefacts, not sibilance

The harshness on a screamed vocal is an overtone spike in the 3 to 5 kHz region generated by the open-throat technique itself, not the 6 to 9 kHz "s" energy that a pop de-esser is designed for. Move the de-esser detector down to 3 to 5 kHz, target 3 to 6 dB of reduction, and use a wideband detection mode so it catches broad tonal spikes rather than just sibilant transients. On growls, drop the detector lower again, to 2.5 to 4 kHz. Waves Sibilance is built around an Organic ReSynthesis engine that isolates the offending content without dulling broader high-frequency tone, which is what makes wideband detection viable here without losing the air on top.
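How much the detector placement matters can be estimated by modelling the sidechain as a simple band-pass filter. This is a stand-in under stated assumptions: a single cookbook-style band-pass biquad at 48 kHz, with the centre and Q derived from the band edges. It is not a description of how Waves Sibilance or any other de-esser detects internally, and the function name is hypothetical.

```python
import cmath
import math

def detector_response_db(freq_hz, band_lo_hz, band_hi_hz, fs_hz=48000.0):
    """Sensitivity (dB) of a band-pass sidechain to a tone at freq_hz."""
    f0 = math.sqrt(band_lo_hz * band_hi_hz)   # geometric centre of the band
    q = f0 / (band_hi_hz - band_lo_hz)
    w0 = 2.0 * math.pi * f0 / fs_hz
    alpha = math.sin(w0) / (2.0 * q)
    b = (alpha, 0.0, -alpha)                  # band-pass, 0 dB at the centre
    a = (1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha)
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq_hz / fs_hz)
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20.0 * math.log10(abs(num / den))

# A 4 kHz overtone spike as seen by a pop-style and a scream-style detector.
print(f"6-9 kHz detector: {detector_response_db(4000.0, 6000.0, 9000.0):+.1f} dB")
print(f"3-5 kHz detector: {detector_response_db(4000.0, 3000.0, 5000.0):+.1f} dB")
```

In this model the pop-style detector sees the 4 kHz spike at well under half its true level, so it would need a far lower threshold to react and would then over-process real sibilance. The lower detector sees the spike at close to full level.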

7. Parallel distortion blended underneath the dry

Send the dry vocal to a parallel channel. Insert a saturator (Soundtoys Decapitator on a mild setting, SansAmp, or any amp sim with a clear midrange voice). High-pass that channel at 200 to 300 Hz so it does not double the low end. Blend it back 8 to 15 dB below the dry signal: in the full mix the parallel should not be audible as a separate layer, only noticeable as a loss of weight when you mute it. The job is to add consistent mid-range harmonic density that shares texture with the distorted guitars, which is what lets the vocal occupy the same band without fighting them. This is most useful on growls, whose fundamental sits low and tends to disappear under down-tuned guitars and bass.
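The blend level translates into a linear gain on the parallel channel. The sketch below uses a plain tanh soft clipper as a stand-in for the saturator (an assumption, far simpler than Decapitator or an amp sim) and leaves out the high-pass on the parallel channel for brevity; the function names and the drive value are illustrative.

```python
import math

def db_to_gain(db: float) -> float:
    """Convert a level in dB to a linear amplitude multiplier."""
    return 10.0 ** (db / 20.0)

def saturate(x: float, drive: float = 4.0) -> float:
    """tanh soft clipper, scaled so a full-scale input stays at full scale."""
    return math.tanh(drive * x) / math.tanh(drive)

def parallel_blend(dry, blend_db: float = -12.0, drive: float = 4.0):
    """Sum the dry samples with a saturated copy mixed in at blend_db."""
    wet_gain = db_to_gain(blend_db)
    return [x + wet_gain * saturate(x, drive) for x in dry]

for blend in (-8.0, -12.0, -15.0):
    peak = parallel_blend([1.0], blend_db=blend)[0]
    print(f"blend {blend:+.0f} dB -> wet gain {db_to_gain(blend):.3f}, "
          f"peak rise {20.0 * math.log10(peak):.2f} dB")
```

Even at the loud end of the range the parallel adds only a few dB at the peaks, but because the saturator lifts quieter samples proportionally more than loud ones, it contributes more density to the body of the scream than to its onsets.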

8. Separate reverb sends for screams and cleans

Running screams through the same reverb send as the clean vocals smears them backward in the mix. Build a dedicated scream send: short room or hall, pre-delay 10 to 20 ms, decay under 1.2 seconds, HPF the return at 150 to 200 Hz. The clean vocal send can be longer and richer (1.5 to 3 seconds decay, 25 to 40 ms pre-delay, plate or hall). Henrik Udd's Architects work is the obvious reference for this split; a Nail The Mix article on his approach to the band describes the principle plainly: "A core principle in Henrik's approach is treating clean vocals and screams as two separate beasts."

Henrik Udd mixing vocals for Architects' "Gone With The Wind" on Nail The Mix

Where this chain tends to break down

The single most common failure is reaching for more compression or more level when the issue is masking at 2 to 3 kHz. If the scream sounds clear in solo and disappears against the guitars, the chain is fine and the guitar bus needs a 2 to 3 dB cut at 2.5 kHz. If the scream sounds harsh in the mix, the de-esser is almost always sitting too high; pull the detector down to 3 to 5 kHz and try again before touching anything else.

Every other move in this chain is standard vocal practice ported to a louder source. The 3 to 5 kHz detector placement is the one decision that separates a screamed-vocal chain from a sung one, and it is the difference between a vocal you can hear words in and a vocal that just sounds angry.