Published: 2026-06-05 · Reading time: 16 min read · Author: Nikolay Sapunov, CEO at Fora Soft
Why this matters
Every audio delivery spec hands you a number, and the companion article on per-platform targets tells you what those numbers are. This article answers the next question every operations lead, product manager, or content engineer hits: how does the number get applied, and what do I actually do to my file? Confuse the two families and you make expensive mistakes — re-rendering a file that only needed a tag, crushing dynamics to "sound loud" on a platform that just turns you down, or shipping a Dolby stream with a wrong dialnorm value that makes every living room reach for the remote. This is the practical layer: dialnorm, Replay Gain, Sound Check, and the test you run before you press send.
Two families, one job
Loudness normalization has exactly one job: make different pieces of content play back at a consistent perceived level so the listener never reaches for the volume control. There are two fundamentally different ways to do that job, and almost every loudness mistake in production comes from confusing them.
The first way is to change the audio itself — process the signal so the file, after the change, measures at the target. This is what a mastering engineer does with a limiter, what an FFmpeg transcode does, and what a streaming service does in real time when it plays your track quieter. The bytes that reach the listener are different from the bytes you started with.
The second way is to leave the audio alone and attach a number — a small piece of metadata that says "this content measures X; please apply a gain of Y at playback." The audio bytes are untouched. A volume change happens, but it happens in the player, in the digital domain, just before the sound leaves the device. This is dialnorm, ReplayGain, Apple Sound Check, and the Opus output gain.
The mental model that keeps you out of trouble: processing rewrites the file; metadata rewrites the volume knob. The loudness measurement — the gated, whole-programme LUFS figure from ITU-R BS.1770-5 — is the same in both worlds. What differs is what you do with it. (If the term LUFS, the loudness number on a logarithmic scale where closer to zero means louder, is new, start with the loudness primer.)
Figure 1. The two families of loudness normalization. Metadata leaves your bytes alone and moves the volume knob; processing changes the signal the listener hears.
Dialnorm: the volume knob baked into Dolby
If you ship anything through Dolby — broadcast television, an OTT catalogue using E-AC-3, a next-generation stream using AC-4 — you will meet dialnorm, and it is the single most misunderstood field in the audio pipeline.
Dialnorm, short for dialogue normalization, is a metadata value carried inside every Dolby Digital (AC-3) and Dolby Digital Plus (E-AC-3) stream. It does not change a single audio sample. Instead it tells the decoder, in plain numbers, how loud the dialogue in this programme is, and the decoder uses that number to turn the whole programme up or down so that dialogue lands at a consistent reference level across every channel and every programme the viewer flips through.
The field is an integer from 1 to 31. Each step is one decibel. A value of 31 means "the dialogue in this content already sits at the reference level — apply no attenuation." A value of 1 means "the dialogue is very hot — turn this content down by 30 dB." The general formula the decoder applies:
gain_dB = dialnorm_value − 31   (always zero or negative, so the decoder only ever attenuates)
Plug in a real number. Dolby's most common production value is a dialnorm of 27, which corresponds to dialogue measured at −27 LKFS (LKFS and LUFS are the same scale). The decoder computes 27 − 31 = −4 dB, so it turns the programme down by 4 dB at playback. A programme whose dialogue actually sits at −24 LKFS should carry a dialnorm of 24, and the decoder turns it down by 7 dB. The whole point is that after the decoder applies its attenuation, every programme's dialogue lands at the same place, so the viewer is not blasted when a loud commercial follows a quiet drama. This is the technical machinery behind the US CALM Act and ATSC A/85.
The catch — and the source of countless support tickets — is that dialnorm is only correct if it matches the actual dialogue loudness of the content. Dolby's recommendation is to measure the average dialogue level and set dialnorm to that measured value. Set it wrong and the decoder applies the wrong gain: a stream whose dialogue is really at −24 LKFS but carries a default dialnorm of 31 will play 7 dB louder than its neighbours, because the decoder was told to apply no attenuation when it should have applied −7 dB. Many production failures trace to an encoder left at a factory-default dialnorm that nobody measured.
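The dialnorm arithmetic above can be sketched in a few lines of Python. This is a minimal illustration of the formula in this article, not Dolby's decoder code; the function names are hypothetical, and the rounding and clamping in `dialnorm_for_dialogue` are assumptions about how you would map a measurement onto the 1-to-31 field.

```python
def dialnorm_gain_db(dialnorm: int) -> int:
    """Gain in dB a decoder applies for a dialnorm value (dialnorm - 31)."""
    if not 1 <= dialnorm <= 31:
        raise ValueError("dialnorm must be an integer from 1 to 31")
    return dialnorm - 31  # always <= 0: attenuation only


def dialnorm_for_dialogue(measured_lkfs: float) -> int:
    """Dialnorm value matching a measured dialogue loudness.

    Assumption: round to the nearest whole dB and clamp into 1..31.
    """
    return min(31, max(1, round(-measured_lkfs)))


def playback_error_db(measured_lkfs: float, dialnorm: int) -> float:
    """Distance of decoded dialogue from the -31 reference.

    Positive means the programme plays louder than correctly tagged content.
    """
    return measured_lkfs + dialnorm_gain_db(dialnorm) + 31


if __name__ == "__main__":
    print(dialnorm_gain_db(27))           # gain for the common value 27
    print(playback_error_db(-24.0, 31))   # untouched factory default
    print(playback_error_db(-24.0, 24))   # correctly measured and tagged
```

Running a check like `playback_error_db` against every encoder preset is a cheap way to catch a factory-default value before it reaches air.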
AC-4 takes dialnorm further
Dolby AC-4, standardized as ETSI TS 103 190, keeps the loudness-metadata principle and extends it. AC-4 carries loudness metadata in its bitstream so the decoder can hit a target loudness on any playback device, and it adds dialogue enhancement: a separate metadata layer that lets the viewer turn dialogue up relative to the rest of the mix during decoding, for clarity in a noisy room or for accessibility. Where AC-3 dialnorm is a single global gain, AC-4 separates the dialogue object enough that the decoder can boost it independently. The principle is the same — metadata instructs the decoder — but the control is finer. The AC-4 deep dive covers this in full.
Replay Gain: the same idea for music files
Replay Gain is the music world's version of dialnorm, and it predates the modern LUFS era. The problem it solves is the one everyone has hit: shuffle a playlist and one track blasts while the next whispers, because loudness has more to do with the year a record was mastered than with the music. Replay Gain stores, inside each audio file, the gain a player should apply to bring that track to a standard loudness.
The current specification, Replay Gain 2.0, measures loudness with ITU-R BS.1770 (the same algorithm everything else uses) and references a level of −18 LUFS, chosen for backward compatibility with the original 2001 specification. The gain is a simple subtraction:
replay_gain_dB = reference_level − measured_loudness
= −18 LUFS − measured_loudness
A track measured at −12 LUFS gets a stored gain of −18 − (−12) = −6 dB, so a ReplayGain-aware player turns it down 6 dB. A quiet jazz recording at −24 LUFS gets +6 dB and is turned up. The audio file is never rewritten; the value lives as a tag (a TXXX user-text frame named REPLAYGAIN_TRACK_GAIN in ID3v2 for MP3, a Vorbis comment for FLAC, an APEv2 item for WavPack).
Replay Gain stores two gains, and the distinction matters. Track gain makes every track equally loud — good for shuffle and noisy environments. Album gain calculates one gain for the whole album as a single concatenated programme, so the intentionally quiet ballad stays quieter than the hard-rock track that follows it, exactly as the mastering engineer intended. The spec also stores a peak value per track and per album so the player can predict clipping: if applying a positive gain would push the signal past digital full scale, the player either reduces the gain or limits the peaks.
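As a rough sketch of the gain formula and the clipping prediction described above: the snippet below computes the stored gain and then caps it using the stored peak. It assumes the peak is a linear sample value where 1.0 is digital full scale, and it models only the "reduce the gain" option, not a limiter. The function names are hypothetical.

```python
import math

REFERENCE_LUFS = -18.0  # ReplayGain 2.0 reference level


def replay_gain_db(measured_lufs: float) -> float:
    """Stored gain: reference level minus measured loudness."""
    return REFERENCE_LUFS - measured_lufs


def clip_safe_gain_db(gain_db: float, peak: float) -> float:
    """Cap a gain so that peak * 10**(gain/20) stays at or below 1.0.

    Assumption: `peak` is linear (1.0 = full scale), as read from the tag.
    """
    if peak <= 0.0:
        return gain_db  # silent track: nothing can clip
    headroom_db = -20.0 * math.log10(peak)
    return min(gain_db, headroom_db)


if __name__ == "__main__":
    print(replay_gain_db(-12.0))          # loud master, turned down
    print(replay_gain_db(-24.0))          # quiet recording, turned up
    print(clip_safe_gain_db(6.0, 0.9))    # boost limited by available headroom
```

Attenuation never needs the cap; only a positive gain on a track whose peak is already near full scale gets reduced.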
Opus is the odd one out
One wrinkle matters for anyone shipping Opus, the default codec for WebRTC and a growing presence in streaming. Opus follows EBU R128, so its loudness reference is −23 LUFS, not −18. RFC 7845 defines an output gain field in the Opus header that players should apply by default, and the R128_TRACK_GAIN tag is computed against −23 LUFS, 5 dB below ReplayGain's −18. If you tag an Opus file with a tool that assumes −18, the level will be wrong by 5 dB. Use a tool that knows Opus is special.
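To make the 5 dB offset concrete, here is a small sketch that computes an R128 gain, converts a ReplayGain-referenced gain to the Opus reference, and encodes the result as the Q7.8 fixed-point integer RFC 7845 uses for these fields (gain in dB multiplied by 256). It assumes the header output gain is zero, so the tag carries the whole correction; the helper names are hypothetical.

```python
REPLAYGAIN_REF_LUFS = -18.0
R128_REF_LUFS = -23.0


def r128_gain_db(measured_lufs: float) -> float:
    """Gain that brings a track to the -23 LUFS reference."""
    return R128_REF_LUFS - measured_lufs


def replaygain_to_r128_db(replaygain_db: float) -> float:
    """Shift a gain computed against -18 LUFS to the -23 LUFS reference."""
    return replaygain_db + (R128_REF_LUFS - REPLAYGAIN_REF_LUFS)


def to_q7_8(gain_db: float) -> int:
    """Encode a dB gain as a signed 16-bit Q7.8 integer (dB * 256)."""
    value = round(gain_db * 256)
    if not -32768 <= value <= 32767:
        raise ValueError("gain does not fit in a signed 16-bit Q7.8 field")
    return value


if __name__ == "__main__":
    gain = r128_gain_db(-12.0)
    print(gain, to_q7_8(gain))           # dB value and the integer to store
    print(replaygain_to_r128_db(-6.0))   # same track, starting from a ReplayGain tag
```

A track measured at −12 LUFS needs −6 dB under ReplayGain but −11 dB under R128, which is exactly the 5 dB a format-unaware tagger gets wrong.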
Apple Sound Check vs Spotify: metadata vs real-time
Apple Sound Check and Spotify's loudness normalization aim at almost the same target, but they belong to opposite families — and that difference changes what you do as a producer.
Apple Sound Check is metadata. Apple analyzes each track's loudness, stores the result, and at playback applies a global gain to bring the track toward roughly −16 LUFS. The file is never altered; the gain is applied in the digital domain at decode time, lossless with modern 32-bit processing. This is conceptually identical to ReplayGain — a stored number, a volume nudge — just Apple's own implementation. A modern, hyper-loud master is turned down; an older quiet recording is nudged up; the dynamics inside each track are preserved.
Spotify normalizes in real time. It does not rely on a tag travelling with the file. When you play a track, Spotify measures it and applies a live gain to hit its target of −14 LUFS, with three user-selectable modes: Normal (−14), Loud (−11, which adds a limiter to protect quiet masters being pushed up), and Quiet (−19). Because the normalization is computed by the service, it works regardless of what tags your file carries.
The practical consequence is the rule every music producer should internalize: you cannot out-loud the normalizer. Master your single to −8 LUFS to "sound loud on Spotify" and Spotify simply turns it down 6 dB to hit −14 — you traded away dynamic range and gained nothing in playback level. One master at the music-streaming cluster target, around −14 LUFS with a true peak at −1 dBTP, plays correctly almost everywhere; Apple's −16 just means a touch less turn-down. The targets themselves, platform by platform, live in the per-platform targets article.
Figure 2. Which mechanism applies where. Find your destination on the left; the right tells you whether you tag, re-render, or just measure.
Normalize vs attenuate: which direction can the platform move?
A subtle but money-saving detail: not every destination can move your level in both directions. Some only ever attenuate — turn loud content down — and never boost quiet content up.
Spotify, with album-aware boosting and its Loud mode limiter, will turn a quiet master up. YouTube and, in most cases, Apple's playback normalization only turn content down; a master quieter than the target is left where it is, so it simply plays quiet. ReplayGain moves in either direction, bounded by clipping headroom on the way up. Dialnorm only attenuates: its 1-to-31 range maps to a gain between −30 dB and 0 dB, so the decoder can never boost.
Why it matters: if your destination is turn-down-only, mastering quiet "to be safe" means your content plays quieter than everyone else's and loses perceived punch. Master to the target, not below it, and let the platform attenuate from there only if it needs to. Attenuation is always safe; relying on a boost that may never come is not.
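The direction rule, and the "you cannot out-loud the normalizer" arithmetic from the previous section, can be sketched as one small function. It is a simplified model: it ignores limiters, album-aware logic, and peak headroom, and the `can_boost` flag is a hypothetical stand-in for whatever a given platform actually does.

```python
def playback_gain_db(master_lufs: float, target_lufs: float, can_boost: bool) -> float:
    """Gain a normalizer applies to move a master toward its target.

    A turn-down-only destination leaves quiet masters untouched.
    """
    gain = target_lufs - master_lufs
    if gain > 0.0 and not can_boost:
        return 0.0
    return gain


def playback_level_lufs(master_lufs: float, target_lufs: float, can_boost: bool) -> float:
    """Loudness the listener actually hears after normalization."""
    return master_lufs + playback_gain_db(master_lufs, target_lufs, can_boost)


if __name__ == "__main__":
    # A -8 LUFS master and a -14 LUFS master end up at the same level.
    print(playback_level_lufs(-8.0, -14.0, can_boost=False))
    print(playback_level_lufs(-14.0, -14.0, can_boost=False))
    # A "safe" -18 LUFS master stays quiet where nothing boosts it.
    print(playback_level_lufs(-18.0, -14.0, can_boost=False))
    print(playback_level_lufs(-18.0, -14.0, can_boost=True))
```

The first two calls return the same playback level, which is the whole argument against crushing a master to chase loudness.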
What happens when you break a target
Three failure modes recur, and each maps to a family.
Metadata mismatch (dialnorm wrong). The audio is fine; the number lies. The decoder applies the wrong gain, and the programme is too loud or too quiet relative to its neighbours. The fix is free: measure the dialogue, correct the dialnorm field, re-mux. No re-encode of the audio is needed because the samples were never the problem.
Over-compression to chase loudness. You crushed dynamic range to push the integrated loudness up, but the destination normalizes to a fixed target anyway. You end up at the same playback level as a clean master, minus the dynamics, plus likely true-peak overshoots on the listener's earbuds. Pure loss. The fix is to re-master with sane dynamics and let normalization do its job.
True-peak clipping after a boost. A quiet master gets boosted (by ReplayGain, by Spotify's Loud mode, by a player's pre-amp) and the peaks exceed full scale, clipping on the DAC. This is why the Replay Gain spec stores peak metadata and why streaming specs ask for a −1 dBTP ceiling — headroom so an upward gain doesn't clip.
How to test a master before delivery
You never have to guess. The same BS.1770 measurement every platform uses is available for free on your own machine through FFmpeg, so you can verify integrated loudness and true peak before you ship. The reliable method is two passes: measure, then act on the measurement.
```bash
# Pass 1: measure only. Prints integrated loudness (I), true peak (TP),
# loudness range (LRA) and threshold as JSON. Read the numbers; change nothing.
ffmpeg -i master.wav -af loudnorm=I=-14:TP=-1:LRA=11:print_format=json -f null -
```
Read the input_i (integrated LUFS) and input_tp (true peak, dBTP) values from the JSON the filter prints. If input_i is already at your destination's target and input_tp is below the ceiling, you are done — ship the file untouched and, where the destination uses metadata, attach the right tag instead of re-rendering. If you genuinely need to re-render to a target (a delivery spec that requires the file itself to measure at −23 LUFS, say), run a second pass that feeds the measured numbers back and applies one linear gain so the dynamics survive:
```bash
# Pass 2: apply a single linear gain using the measured values from pass 1.
# linear=true keeps it a clean gain change, not a moment-to-moment compressor.
ffmpeg -i master.wav -af loudnorm=I=-23:TP=-1:LRA=11:measured_I=-18.2:measured_TP=-3.4:measured_LRA=9.1:measured_thresh=-28.7:linear=true -ar 48000 out.wav
```
The decision rule sits underneath both commands: measure first, then choose the family. If the destination reads metadata (Dolby, Apple, a Replay Gain–aware library), the file can stay as it is and you set the tag. If the destination needs the bytes themselves at a level (a broadcast deliverable, a fixed-target transcode), you re-render with a linear gain. You almost never need to crush dynamics, and you should never set a metadata value you haven't measured.
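If you want to automate the hand-off between the two passes, something like the following sketch could work. It assumes you have captured FFmpeg's log output from pass 1 as text (the loudnorm JSON is written to the log, which FFmpeg sends to stderr) and that the JSON block is the last flat `{...}` object in it. The function names are hypothetical; only the `input_*` keys and the loudnorm options already shown in the commands above are used.

```python
import json


def parse_loudnorm_json(log_text: str) -> dict:
    """Extract the measured values from loudnorm's JSON report.

    Assumption: the report is the last flat {...} object in the log.
    """
    start = log_text.rindex("{")
    end = log_text.rindex("}") + 1
    raw = json.loads(log_text[start:end])
    # loudnorm prints its numbers as strings, so convert the ones we need.
    keys = ("input_i", "input_tp", "input_lra", "input_thresh")
    return {key: float(raw[key]) for key in keys}


def pass2_filter(measured: dict, target_i: float,
                 target_tp: float = -1.0, target_lra: float = 11.0) -> str:
    """Build the -af argument for pass 2 from the pass 1 measurements."""
    return (
        f"loudnorm=I={target_i}:TP={target_tp}:LRA={target_lra}"
        f":measured_I={measured['input_i']}"
        f":measured_TP={measured['input_tp']}"
        f":measured_LRA={measured['input_lra']}"
        f":measured_thresh={measured['input_thresh']}"
        ":linear=true"
    )
```

In practice you would obtain `log_text` by running the pass 1 command through `subprocess.run(..., capture_output=True, text=True)` and reading its `stderr`, then pass the returned string to FFmpeg's `-af` option for pass 2.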
Where Fora Soft fits in
In OTT and Internet TV systems we build, per-destination loudness handling is wired into the transcode and packaging step, not left to the upload. A single mezzanine master is measured once, then routed: a dialnorm value is computed and written for the Dolby renditions, a linear-gain transcode produces the broadcast deliverable, and the streaming renditions are left at the master level for the service to normalize. In video conferencing and telemedicine products we ship, the same discipline applies to the Opus path, where the −23 LUFS reference and header output gain have to be respected so voice levels stay consistent across participants. Getting the mechanism right per destination is the difference between a catalogue that plays evenly and one that generates volume-complaint tickets.
What to read next
- LUFS Targets per Platform in 2026 — the numbers each mechanism aims at.
- True Peak, dBTP and the Inter-Sample Peak Problem — why a boost can clip.
- Loudness Normalization: EBU R128, ITU-R BS.1770, ATSC A/85 — the measurement standard underneath everything here.
Call to action
- Talk to an audio engineer: book a 30-minute scoping call to talk through your dialnorm, ReplayGain, and normalization plan.
- See our case studies — 250+ shipped projects across video streaming, WebRTC, OTT, telemedicine, e-learning, surveillance, and AR/VR.
- Download the Loudness in practice — cheat sheet — One-page reference: dialnorm range and attenuation math, ReplayGain (-18 LUFS) vs Opus (-23 LUFS) references, Apple Sound Check vs Spotify (metadata vs real-time), the normalize-vs-attenuate rule, and the FFmpeg loudnorm two-pass test….
References
- ITU-R BS.1770-5, "Algorithms to measure audio programme loudness and true-peak audio level" (November 2023). The K-weighted, gated LUFS/LKFS measurement and the dBTP true-peak definition that every mechanism in this article reads. Tier 1. https://www.itu.int/rec/R-REC-BS.1770
- IETF RFC 7845, "Ogg Encapsulation for the Opus Audio Codec" (April 2016). Defines the Opus header output gain (Q7.8 fixed point, applied by default) and the R128_TRACK_GAIN/R128_ALBUM_GAIN tags computed against −23 LUFS. Tier 1. https://datatracker.ietf.org/doc/html/rfc7845
- ETSI TS 103 190-1 / 103 190-2, "Digital Audio Compression (AC-4) Standard" (current revisions to 2025). AC-4 loudness metadata and dialogue-enhancement metadata structures. Tier 1. https://www.etsi.org/deliver/etsi_ts/103100_103199/10319002/
- EBU R128 v5.0, "Loudness normalisation and permitted maximum level of audio signals" (November 2023). The −23 LUFS reference Opus and broadcast use, and the streaming supplement. Tier 1. https://tech.ebu.ch/publications/r128
- ATSC A/85 (with Corrigendum No. 1, Feb 2021), "Techniques for Establishing and Maintaining Audio Loudness for Digital Television." The Anchor Element (dialogue) approach that dialnorm serves, and the −24 LKFS US target. Tier 1. https://www.atsc.org/atsc-documents/a85-techniques-for-establishing-and-maintaining-audio-loudness-for-digital-television/
- Revised ReplayGain specification (ReplayGain 2.0), Hydrogenaudio Knowledgebase (rev. January 2026). The −18 LUFS reference, the gain = reference − measured formula, track vs album gain, peak metadata, and the ID3v2/Vorbis/APEv2 tag formats. Tier 6 (community spec); cross-checked against ITU-R BS.1770-5. https://wiki.hydrogenaudio.org/index.php?title=ReplayGain_2.0_specification
- "Dialnorm," Wikipedia / HandWiki (orientation), corroborated by Dolby encoding documentation. Dialnorm range 1–31 mapping to −30…0 dB and decode-time attenuation. Tier 6 for orientation; the normative source is the Dolby/ATSC framing in refs 3 and 5. https://en.wikipedia.org/wiki/Dialnorm
- Dolby Professional, "Dolby AC-4: Audio Delivery for Next-Generation Entertainment Services" (white paper). First-party deployment context for AC-4 loudness metadata and dialogue enhancement. Tier 4. https://professional.dolby.com/siteassets/technologies/dolby_atmos_ac-4_whitepaper.pdf
- MeterPlugs, "Apple Switches to LUFS, Enables Sound Check by Default" (2022). Apple Sound Check as metadata-based normalization toward −16 LUFS, applied as a global gain at playback. Tier 4. https://www.meterplugs.com/blog/2022/03/23/apple-switch-to-lufs.html
- Spotify for Artists, "Loudness normalization on Spotify." Real-time normalization, −14 LUFS Normal target, Loud (−11, with limiter) and Quiet (−19) modes, −1 dBTP upload guidance. Tier 4. https://support.spotify.com/us/artists/article/loudness-normalization/
- kylophone, "FFmpeg loudnorm" reference, and the FFmpeg loudnorm filter documentation. Two-pass measure-then-apply workflow, the JSON measurement output, 192 kHz upsampling for true-peak detection, and linear=true for a single clean gain. Tier 4. http://k.ylo.ph/2016/04/04/loudnorm.html


