Why this matters
If you build or operate a product that delivers video or audio — streaming, OTT/Internet TV, e-learning, telemedicine, or AR/VR — a client or a product manager will eventually say the word "Atmos," and what they mean by it depends entirely on which costume they have seen. The marketing person means "the immersive thing Apple Music advertises"; the broadcast engineer means "AC-4 or E-AC-3 with object metadata"; the cinema person means "64 speakers." This article gives a non-technical reader the full mental model — what Atmos stores, how it is carried on each platform, what it costs in bitrate, and what loudness target applies — so you can scope an immersive-audio feature, write a sane storage and CDN budget, and talk to audio engineers without guessing. It builds on the channel-vs-object-vs-scene taxonomy but assumes you have not read it; every term is defined before it is used.
Atmos is object-based audio with a channel bed
Before the four costumes, the body underneath them. Dolby Atmos is object-based audio, which means the file does not say "play this signal out of the left-rear speaker." It says "here is the sound of a helicopter, and here is its position in the room — up, behind you, drifting left." The sound is called an audio object, and the position-over-time is metadata: data about the sound. A piece of software called the renderer reads the position and works out, at playback time, which of your actual speakers should play how much of that object to put it in the right place. That is the whole trick, and it is why one Atmos master can play correctly on headphones, a soundbar, a 5.1 system, or a 64-speaker cinema with no separate mix for each.
But Atmos is not pure objects. It is a hybrid, and the hybrid is the part most people get wrong. An Atmos mix has two layers. The first is a bed: a set of ordinary channel-based audio — fixed speaker feeds — used for ambience, music, and anything that does not need to move with precision. Beds are written in a three-number notation we will decode shortly, often 7.1.2 in the home renderer. The second layer is the objects: individual sounds with positional metadata that the renderer places dynamically. A door slam, a passing car, a single line of dialogue — these become objects.
The Dolby Atmos Renderer, the software studios use to author the mix, accepts up to 128 inputs in total, where each channel of a bed counts as one input and each object counts as one input. Within that budget the renderer supports up to 118 object paths, with the remainder going to the bed. That single constraint — 128 inputs, of which up to 118 can be roaming objects — turns up again and again, because every Atmos costume is built from the same 128-input master.
Figure 1. The anatomy of an Atmos mix: a channel-based bed plus up to 118 positional objects, 128 inputs in total, feeding one master.
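The 128-input budget is easy to sanity-check in a few lines. The sketch below is only an illustration of the arithmetic described above, not part of any Dolby tool; the function name `check_input_budget` and its return shape are hypothetical.

```python
MAX_INPUTS = 128   # total renderer inputs (bed channels + objects), as quoted above
MAX_OBJECTS = 118  # object cap quoted above


def check_input_budget(bed_channels: int, object_count: int) -> dict:
    """Report whether a bed plus a number of objects fits the 128-input budget."""
    total = bed_channels + object_count
    return {
        "total_inputs": total,
        "remaining_inputs": MAX_INPUTS - total,
        "fits": total <= MAX_INPUTS and object_count <= MAX_OBJECTS,
    }


# A 7.1.2 bed is 7 + 1 + 2 = 10 channels, which leaves room for 118 objects.
print(check_input_budget(bed_channels=10, object_count=118))
```

With a 10-channel bed the budget is exactly full at 118 objects, which is why the two numbers always travel together.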
A useful analogy holds the whole article together. A channel-based mix is a stack of pre-addressed envelopes — "left-rear speaker" written on the outside, the sound sealed inside, delivered perfectly only if your room has a left-rear speaker in the right spot. An object-based mix is a stack of GPS addresses — "this sound lives at these coordinates" — and the renderer is the local driver who knows your neighbourhood, your exact speakers, and delivers each sound to the right spot regardless of how your room is laid out. The bed is the handful of envelopes you keep for the background; the objects are the GPS-addressed packages that need to land precisely.
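To make the "local driver" idea concrete, here is a deliberately simplified sketch of what a renderer does with one object. It is not Dolby's rendering algorithm: it works in a flat circle (no height), assumes speaker and object positions are given as azimuth angles in degrees, and uses plain constant-power panning between the two speakers that bracket the object. The function name `pan_object` and the example layouts are assumptions made for illustration.

```python
import math


def pan_object(object_azimuth_deg, speaker_azimuths_deg):
    """Split one object between the two speakers that bracket its azimuth.

    Returns {speaker_azimuth: gain}, with azimuths normalised to 0-360.
    Gains follow a constant-power (sin/cos) law, so the squared gains
    sum to 1 whatever the layout.
    """
    speakers = sorted({a % 360 for a in speaker_azimuths_deg})
    if len(speakers) < 2:
        raise ValueError("need at least two distinct speaker positions")
    target = object_azimuth_deg % 360
    gains = {s: 0.0 for s in speakers}
    for i, left in enumerate(speakers):
        right = speakers[(i + 1) % len(speakers)]
        span = (right - left) % 360      # arc from this speaker to the next one
        offset = (target - left) % 360   # how far along that arc the object sits
        if offset <= span:
            fraction = offset / span
            gains[left] = math.cos(fraction * math.pi / 2)
            gains[right] = math.sin(fraction * math.pi / 2)
            return gains
    raise AssertionError("unreachable: the arcs cover the full circle")


# The same object position, rendered against two different rooms.
helicopter_azimuth = 70  # hypothetical position metadata, in degrees
print(pan_object(helicopter_azimuth, [0, 30, 110, 250, 330]))  # five ear-level speakers
print(pan_object(helicopter_azimuth, [30, 330]))               # a stereo pair
```

The point of the sketch is the shape of the problem: the position metadata never changes, only the speaker list does, and the gains are worked out at playback time.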
Costume 1: Atmos in the cinema
The cinema is where Atmos started in 2012, with the release of Brave, and it is the costume closest to the raw master. Understanding why Dolby built it the way they did explains the rest of the format. Before Atmos, cinema sound was channel-based: a fixed number of speaker feeds (5.1, then 7.1) that every auditorium wired up identically. The problem was that a small screening room and a thousand-seat hall got the same small number of feeds, so the big room simply ran more speakers off each feed — the back-left "channel" might drive a dozen physical speakers all playing the same thing. A sound could not travel smoothly across that wall of speakers, because they were all one channel. Atmos broke each speaker out into its own addressable zone and let the mix describe sounds as objects that the room places, so a sound can travel smoothly from the screen to the back of a 64-speaker hall. That is the origin of the object idea, and every downstream costume inherits it. A cinema Atmos package carries up to 128 audio tracks — a 9.1 bed plus up to 118 audio objects — and the auditorium can drive up to 64 independent speaker feeds, where every speaker is its own addressable zone rather than part of a wired-together surround array. The same single package, shipped inside the Digital Cinema Package (the DCP, the file a projection booth actually receives), adapts to whatever the room has, "from 5.1 and 7.1 up to 64 channels."
The adaptation happens live. The in-theatre Dolby Atmos cinema processor reads the bed and the object paths (up to 118 of them) and renders them, in real time, against the known positions of that specific cinema's speakers. That is why the identical Atmos master sounds correct in a room with 24 speakers and a room with 64: the creative intent ("the helicopter is up and behind, moving left") is preserved, and each cinema solves the placement problem for its own geometry.
Two cautions matter here for anyone reasoning about Atmos in a product. First, Dolby describes the cinema package as carrying its audio losslessly. That is true of the DCP, and apart from TrueHD on disc (covered below) it is the only place "lossless Atmos" routinely applies. The streaming costumes in the next section are lossy. Do not carry the word "lossless" from the cinema description over to a streaming pipeline. Second, the cinema's 118-object figure and the home renderer's 118-object figure are the same architectural number, 128 inputs minus a 10-channel bed (9.1 in the cinema, 7.1.2 in the home renderer), not two independent facts. State it once and reuse it.
Costume 2: Atmos at home and in streaming
Shipping 128 separate channels of audio to a living room is not practical; at 48 kHz and 24 bits per sample, uncompressed, one channel costs:
48,000 samples/s × 24 bits/sample = 1,152,000 bits/s = 1.152 Mbps per channel
128 channels × 1.152 Mbps = 147.5 Mbps
About 147 Mbps for audio alone is absurd for home delivery — it would dwarf the video. So Atmos for the home does two things: it spatially codes the scene down to a handful of elements, then carries those elements inside a normal codec.
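The same arithmetic as a small script, in case you want to rerun it with a different sample rate, bit depth, or channel count. The function name `pcm_bitrate_bps` is just an illustrative helper.

```python
def pcm_bitrate_bps(channels: int, sample_rate_hz: int = 48_000, bit_depth: int = 24) -> int:
    """Uncompressed PCM bitrate in bits per second."""
    return channels * sample_rate_hz * bit_depth


one_channel = pcm_bitrate_bps(1)
studio_master = pcm_bitrate_bps(128)

print(f"1 channel:    {one_channel / 1e6:.3f} Mbps")    # 1.152 Mbps
print(f"128 channels: {studio_master / 1e6:.1f} Mbps")  # 147.5 Mbps
```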
Spatial coding: 128 elements down to ~16
Spatial coding is the algorithm that groups the 128 beds and objects into 12, 14, or 16 perceptually distinct elements, often called clusters: objects that sit close together in space, or that the ear cannot separate anyway, are merged into one carried element, and the metadata records how to expand them again. This is the single most important number to remember about delivery Atmos: the studio authors with up to 128 elements, but only ~12–16 elements are actually transmitted. You are not streaming 128 channels. A vendor or a colleague who says "Atmos is 128 channels of audio over the wire" has confused the studio format with the delivery format.
Figure 2. The home-delivery chain: 128 studio inputs spatially coded to ~16 elements, then carried in E-AC-3 JOC or AC-4, a reduction of roughly two orders of magnitude against uncompressed PCM.
The carriage codec: Dolby Digital Plus with Joint Object Coding
The most common way streaming Atmos reaches a home in 2026 is Dolby Digital Plus — technically Enhanced AC-3, or E-AC-3 — carrying a feature called Joint Object Coding (JOC). The controlling standard for E-AC-3 is ETSI TS 102 366 (current published revision V1.4.1, 2017-09), where Enhanced AC-3 lives in Annex E; the US equivalent is ATSC A/52, also Annex E. E-AC-3's data-rate envelope runs from 32 kbit/s up to 6.144 Mbit/s.
The clever part of JOC is backward compatibility. A DD+ JOC stream is a normal 5.1 (or 7.1) Dolby Digital Plus core plus a side payload — the JOC data and the Object Audio Metadata (OAMD) — that lets a modern decoder reconstruct the object scene. A device that has never heard of Atmos simply decodes the 5.1 core and ignores the side data. So one stream serves both an old soundbar and a new Atmos receiver. This is exactly the "graceful degradation" property a streaming product wants: one rendition, two audiences.
The practical bitrates are where studio-versus-delivery thinking pays off. Dolby's own encoder defines minimums of 384 kbps for a 12-element spatial coding and 448 kbps for 16 elements. In real-world premium video streaming the figure observed is usually higher, commonly 640 to 768 kbps, but those are observed numbers, not a mandated value. Compare 768 kbps against the naive 147.5 Mbps from the worked example above and you have roughly a 190-to-1 reduction (147,456 kbps ÷ 768 kbps = 192): that gap is the entire reason "studio Atmos" and "streaming Atmos" must never be treated as the same bitstream. (A side benefit: DD+ JOC peaks well under HDMI ARC's roughly 1 Mbps return path, which is why Atmos from a TV app can reach a soundbar over plain ARC without the newer eARC connector.)
The next-generation carriage codec: Dolby AC-4
The successor codec is Dolby AC-4, governed by ETSI TS 103 190-1 (the core codec) and ETSI TS 103 190-2 (immersive and personalized audio, current revision V1.3.1, 2025-07), with its reference renderer in ETSI TS 103 448. AC-4 carries immersive/object Atmos at roughly half the bitrate of DD+ for comparable quality, using a more advanced object-coding scheme Dolby calls A-JOC (Advanced Joint Object Coding).
Dolby's white paper puts numbers on the efficiency: an immersive 7.1.4 mix is characterized as "good" at about 192 kbps and "excellent" at about 320 kbps, and a two-channel Immersive Stereo (IMS) variant is near-transparent at 256 kbps, good at 112 kbps, and acceptable from as low as 64 kbps. Treat these as Dolby's own quality characterizations rather than hard spec limits. As of 2026, AC-4 on streaming is emerging rather than universal — most subscription video-on-demand Atmos still ships as DD+ JOC, and AC-4 deployments on streaming services are early — so for a product targeting the broadest device base today, DD+ JOC remains the safe default and AC-4 the forward-looking option.
The lossless exception: TrueHD on disc
There is one consumer place where Atmos is genuinely lossless: Dolby TrueHD on Blu-ray and 4K UHD Blu-ray. TrueHD uses Meridian Lossless Packing (MLP) and embeds the Atmos object metadata inside the lossless stream. This is disc-only; no streaming service delivers TrueHD Atmos, because the bitrate would be many megabits per second. If a requirement says "lossless Atmos," it means physical media, not a stream.
| Carriage | Spec | Where it ships | Typical bitrate | Lossy / lossless |
|---|---|---|---|---|
| E-AC-3 (DD+) JOC | ETSI TS 102 366, Annex E | Netflix, Disney+, Prime Video, Apple Music | 384–768 kbps | Lossy |
| Dolby AC-4 (A-JOC) | ETSI TS 103 190-2 | Emerging streaming; ATSC 3.0 broadcast | ~192–320 kbps (7.1.4) | Lossy |
| Dolby TrueHD | MLP | Blu-ray / 4K UHD Blu-ray only | several Mbps | Lossless |
Table 1. The three Atmos carriage paths a product person meets, with the controlling spec and where each one actually appears.
Costume 3: Atmos in music
Atmos crossed from film into music when Apple launched Spatial Audio on Apple Music in 2021, and the music costume has its own rules — different from video in two ways that catch people out: a different delivery split and, critically, a different loudness target.
On the delivery side, Apple Music delivers Atmos via DD+ JOC, while trade reporting indicates Amazon Music and Tidal lean on AC-4 Immersive Stereo (IMS) for headphone-optimised playback. The per-platform codec split is reported by industry press rather than published in a single vendor spec sheet, so treat the exact "who uses what" as needing confirmation before you bake it into a contract; the safe, durable statement is that music Atmos rides the same two carriage codecs as video Atmos.
The loudness rule is the one to memorise. Atmos music must not exceed −18 LKFS integrated loudness, measured per ITU-R BS.1770-4, with a true peak no higher than −1 dBTP, and albums are measured per track rather than aggregated. (LKFS and LUFS are the same scale for our purposes; broadcast and ATSC documents say LKFS, streaming and music documents tend to say LUFS.) That −18 LUFS target is deliberately quieter than a typical stereo master, which often sits around −10 to −14 LUFS after loudness-war mastering. The consequence is real and worth flagging to anyone shipping music: an Atmos mix played next to its stereo counterpart can sound softer, and engineers who do not account for this complain that the Atmos version feels "dull." It is not dull — it is correctly mastered to a quieter, more dynamic target.
Pitfall — using the wrong loudness target. Atmos music targets −18 LUFS integrated (per-track). Streaming video Atmos uses a different reference entirely: Netflix and Dolby specify roughly −27 LKFS dialogue-gated for home/streaming mixes (a dialogue-gated measurement, not the same method as integrated music loudness). These are different numbers measured in different ways for different content. Applying the music −18 to a film mix, or the video −27 to a music master, produces a deliverable that fails QC. Match the target to the domain.
One more music-specific detail catches engineers out. Because most Atmos music is heard on headphones, Apple's delivery spec asks the mixer to set a Binaural Render Mode — Off, Near, Mid, or Far — for each bed channel and each object, controlling how "distant" that element sounds in the binaural render, and to keep binaural peak-limiting indicated at no more than about 3 dB. This is the music world quietly acknowledging that the binaural render (Costume 4) is the actual delivery for most listeners, so the mixer is given control over how each object virtualises to headphones rather than leaving it to a generic downmix. For a product handling music ingest, the lesson is that an Atmos music master carries headphone-rendering intent inside it; stripping or ignoring that metadata changes how the track sounds on AirPods.
The production master for music, and for everything else, is the same file format, covered after Costume 4.
Costume 4: Apple Spatial Audio (the binaural render)
The fourth costume is the one most consumers actually hear: Apple Spatial Audio, the head-tracked Atmos you get on AirPods. The distinction to hold firmly is that the Atmos master is object-based and speaker-agnostic, while what reaches your ears on headphones is a binaural render — a two-channel signal engineered so your two ears perceive full 3D placement. Binaural audio is not a production format; it is the output of a renderer.
Apple produces that render on-device, in real time. According to Apple's delivery documentation, the path is to downmix the Atmos master to 7.1.4, then virtualise that 7.1.4 to binaural — meaning Apple uses its own renderer rather than Dolby's binaural engine. Two extra features ride on top. Head tracking uses the headphone's motion sensors so that when you turn your head, the sound field stays anchored to the device, which is the same head-rotation idea that makes scene-based audio attractive for VR — but here it is applied to an object render. Personalised Spatial Audio captures the shape of your head and ears with the iPhone's TrueDepth camera (iOS 16 and later) to build a personalised model of how your anatomy colours sound from each direction — a head-related transfer function, covered in our ambisonics, HRTF, and binaural rendering article.
Figure 3. Apple Spatial Audio: the Atmos master is downmixed to 7.1.4, then virtualised to a head-tracked, optionally personalised binaural pair for headphones.
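The head-tracking idea reduces to one subtraction. The sketch below is not Apple's implementation; it assumes a single rotation axis (yaw), angles in degrees with positive values to the listener's right, and hypothetical function names. It shows why a sound anchored to the device appears to move the opposite way when you turn your head.

```python
def wrap_degrees(angle: float) -> float:
    """Wrap any angle into the range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def azimuth_relative_to_head(source_azimuth_deg: float, head_yaw_deg: float) -> float:
    """Direction the binaural renderer should use once the head has turned.

    The source stays fixed in the room, so its direction relative to the
    head is the room direction minus the head's rotation.
    """
    return wrap_degrees(source_azimuth_deg - head_yaw_deg)


# Dialogue anchored straight ahead (0 degrees); the listener turns 30 degrees right.
print(azimuth_relative_to_head(0.0, 30.0))  # -30.0: the voice now sits to the left
```

A real renderer also tracks pitch and roll and updates continuously from the motion sensors, but the principle is the same compensation applied in three dimensions.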
The thing all four share: the ADM BWF master
Cinema, home, music, and binaural are four deliveries of one source. That source is the Dolby Atmos master, and it exists in two equivalent forms. The first is a Dolby-specific trio of files — a .atmos metadata header, a .atmos.audio PCM payload, and a .atmos.metadata file holding the object positions and automation. The second, increasingly the interchange standard, is a single ADM BWF .wav file: a Broadcast Wave File whose embedded metadata follows the Audio Definition Model, the vocabulary specified in ITU-R BS.2076 (the in-force revision is BS.2076-2), with the EBU's Tech 3364 as a companion. Dolby publishes a "Dolby Atmos Master ADM Profile" that pins down exactly how an Atmos master uses the ADM.
The technical baseline of that master is simple to remember: 24-bit linear PCM at 48 kHz, with 24 fps timecode. Every costume in this article is rendered from that one file. When an engineer says "send me the ADM," they want this — the speaker-agnostic, object-based source, not a 5.1 downmix and not a binaural render.
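For an ingest pipeline, a cheap first check on a file that claims to be an ADM BWF is to walk its chunks. The sketch below relies on two details that come from the BW64/ADM file-format recommendations rather than from this article: the ADM XML is carried in an `axml` chunk and the track mapping in a `chna` chunk. It only inspects chunk headers and the `fmt ` chunk, it does not resolve 64-bit sizes from a `ds64` chunk, and it does not validate the ADM itself. The function names are hypothetical.

```python
import io
import struct


def scan_wave_chunks(stream):
    """Return (chunk_ids, fmt_info) for a RIFF/RF64/BW64 WAVE stream."""
    header = stream.read(12)
    if len(header) < 12:
        raise ValueError("file too short to be a WAVE container")
    container, _size, form = struct.unpack("<4sI4s", header)
    if container not in (b"RIFF", b"RF64", b"BW64") or form != b"WAVE":
        raise ValueError("not a WAVE container")

    chunk_ids, fmt_info = [], None
    while True:
        chunk_header = stream.read(8)
        if len(chunk_header) < 8:
            break  # end of file
        chunk_id, size = struct.unpack("<4sI", chunk_header)
        chunk_ids.append(chunk_id.decode("ascii", "replace"))
        if chunk_id == b"fmt ":
            body = stream.read(size)
            if len(body) < 16:
                raise ValueError("truncated fmt chunk")
            _tag, channels, rate, _byte_rate, _align, bits = struct.unpack("<HHIIHH", body[:16])
            fmt_info = {"channels": channels, "sample_rate": rate, "bit_depth": bits}
            stream.seek(size & 1, io.SEEK_CUR)  # chunks are padded to even length
        else:
            stream.seek(size + (size & 1), io.SEEK_CUR)
    return chunk_ids, fmt_info


def describe_master(stream):
    """Summarise the properties an Atmos ADM BWF is expected to have."""
    chunk_ids, fmt_info = scan_wave_chunks(stream)
    return {
        "has_axml": "axml" in chunk_ids,
        "has_chna": "chna" in chunk_ids,
        "is_24bit_48k": bool(fmt_info)
        and fmt_info["bit_depth"] == 24
        and fmt_info["sample_rate"] == 48_000,
    }
```

Typical use would be `describe_master(open(path, "rb"))`; a plain stereo or 5.1 WAV will usually come back with `has_axml` and `has_chna` both false, which is a quick signal that you were sent a downmix rather than the master.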
This is also where the "real Atmos versus fake Atmos" question lives. Apple's music delivery rules are explicit that a stereo mix merely placed in the sound field with added ambience or reverb is not allowed — a genuine Atmos master requires authored object-based mixing, where a human (or a deliberate process) has placed sounds as objects with intent. An automated stereo-to-Atmos upmix that just sprinkles reverb around a flat mix is not a true Atmos master, and platforms that police it will reject it. For a product decision, the takeaway is that "we support Atmos" should mean "we ingest and deliver authored object masters," not "we run an upmixer."
How loudness works across the Atmos costumes
Loudness deserves its own short section because it is the most common place Atmos deliverables fail QC, and because the numbers differ by domain. Three references matter:
- The music reference is −18 LUFS integrated, −1 dBTP true peak, measured per ITU-R BS.1770-4, per track.
- The streaming and home video reference is roughly −27 LKFS dialogue-gated (the value Dolby's professional encoder defaults its dialogue-normalisation, or "dialnorm," to for movie soundtracks, and that Netflix mandates for Atmos home mixes), which tends to sit near −24 LKFS when measured as full-program integrated loudness.
- The dialnorm value itself is metadata: it tells the decoder how loud the dialogue was authored to be, so the playback device can normalise different programs to a consistent level without re-encoding the audio.
The product-level lesson is that "what LUFS should Atmos be?" has no single answer — it depends on whether you are shipping music or video, and which measurement method (integrated versus dialogue-gated) the platform specifies. Get this wrong and the asset is rejected; get the measurement method wrong and you can hit the right number by the wrong route and still fail.
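A QC gate for these targets can be sketched as a lookup plus two comparisons. The numbers for music and video come from the section above; everything else is an assumption for illustration: the function name `check_loudness`, the idea of treating the video figure as a target with a tolerance, and the default tolerance of 2 LU, which you should replace with whatever your platform's delivery spec states.

```python
TARGETS = {
    # domain: measurement method, level in LKFS/LUFS, optional true-peak ceiling in dBTP
    "music": {"method": "integrated", "level": -18.0, "true_peak_max": -1.0},
    "video": {"method": "dialogue-gated", "level": -27.0, "true_peak_max": None},
}


def check_loudness(domain, measured_level, method, true_peak=None, tolerance_lu=2.0):
    """Return a list of QC problems; an empty list means the asset passes.

    tolerance_lu is a hypothetical default, not a value taken from any spec.
    """
    target = TARGETS[domain]
    problems = []
    if method != target["method"]:
        problems.append(f"measured as {method}, but {domain} expects {target['method']}")
    if domain == "music":
        # Music is a ceiling: the track must not exceed the target.
        if measured_level > target["level"]:
            problems.append(f"{measured_level} exceeds the {target['level']} ceiling")
    elif abs(measured_level - target["level"]) > tolerance_lu:
        # Video is treated here as a target with a tolerance window.
        problems.append(f"{measured_level} is outside {target['level']} +/- {tolerance_lu}")
    ceiling = target["true_peak_max"]
    if ceiling is not None and true_peak is not None and true_peak > ceiling:
        problems.append(f"true peak {true_peak} dBTP exceeds {ceiling} dBTP")
    return problems


print(check_loudness("music", -18.5, "integrated", true_peak=-1.2))  # passes: []
print(check_loudness("video", -18.0, "integrated"))                  # wrong method and wrong level
```

The second call is the failure mode described above: a film mix measured and levelled as if it were a music master fails on both the method and the number.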
Decoding the speaker-layout notation
The three-number notation appears throughout Atmos documentation, and decoding it once removes most of the confusion. The pattern is floor . LFE . height:
- The first number counts the full-range, ear-level speakers around the listener.
- The second number counts the Low-Frequency Effects channel — the "LFE," the deep-bass channel your subwoofer plays. It is "point one" because it carries only roughly 20–120 Hz, about a tenth of a full channel's bandwidth, not because there is exactly one of it.
- The third number counts the overhead (ceiling) speakers — the height layer that makes Atmos "immersive."
So 5.1.2 is five ear-level speakers, one subwoofer, two overhead; 7.1.4 (Dolby's home reference layout) is seven ear-level, one sub, four overhead — twelve speakers; and 9.1.6, the largest named home layout Dolby publishes setup guides for, is sixteen. The same Atmos stream auto-adapts across all of them, because the renderer places the objects against whatever layout it finds. Note the domain split clearly: home layouts top out at the named 9.1.6 (and most consumer receivers cap at 7.1.4, an 11-channel hardware limit, not a format limit), while cinema scales to 64 independent feeds. Cinema and home are different worlds; do not quote a cinema speaker count when scoping a home or streaming feature.
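The notation is regular enough to parse mechanically, which is handy when device capability strings arrive as text. A minimal sketch, with `Layout` and `parse_layout` as illustrative names; a two-number layout such as 5.1 is treated as having no height speakers.

```python
from typing import NamedTuple


class Layout(NamedTuple):
    ear_level: int  # full-range speakers around the listener
    lfe: int        # low-frequency effects (subwoofer) channels
    height: int     # overhead speakers

    @property
    def total(self) -> int:
        return self.ear_level + self.lfe + self.height


def parse_layout(text: str) -> Layout:
    """Parse 'floor.LFE.height' notation such as '7.1.4' (or '5.1' with no height)."""
    parts = [int(p) for p in text.strip().split(".")]
    if len(parts) == 2:
        parts.append(0)
    if len(parts) != 3:
        raise ValueError(f"unrecognised layout: {text!r}")
    return Layout(*parts)


for name in ("5.1", "5.1.2", "7.1.4", "9.1.6"):
    layout = parse_layout(name)
    print(name, layout, "total speakers:", layout.total)
```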
The production pipeline, end to end
It helps to walk the chain once in order, because the four costumes are really four exits from a single road. The road starts in a digital audio workstation — Pro Tools, Nuendo, Logic, or similar — where a mixer assigns each sound either to a bed channel or to an object. Assigning to an object means giving that sound a position and, usually, automation: a path it travels over time. The DAW feeds the Dolby Atmos Renderer, which is the piece of software that monitors the mix (so the engineer can hear it on whatever local speaker layout or on binaural headphones) and, when the mix is finished, writes out the master — the ADM BWF .wav or the .atmos file trio described above. Nothing downstream sees the DAW session; everything downstream sees the master.
From that one master, each exit applies a different transform. The cinema exit packages the bed and objects into a DCP that the auditorium's processor renders live. The streaming exit runs spatial coding to reduce the scene to ~12–16 elements and then encodes them as E-AC-3 JOC or AC-4, producing the few-hundred-kbps stream a CDN actually serves. The music exit applies the −18 LUFS loudness pass and delivers the ADM BWF to Apple, Tidal, or Amazon, who encode it for their apps. The binaural exit — which can happen on the consumer's device rather than in the studio — downmixes to 7.1.4 and virtualises to a head-tracked stereo pair. The discipline that keeps all four consistent is that they all derive from the same authored master; an organisation that re-mixes per platform, rather than re-rendering from one master, multiplies its cost and its chances of drift between versions.
A second pipeline detail worth knowing is dialnorm, the dialogue-normalisation metadata mentioned in the loudness section. It is not audio; it is a number the encoder embeds that tells every playback device how loud the program's dialogue was authored to be. The device then attenuates so that a quiet documentary and a loud action film play back at a consistent dialogue level without anyone re-encoding the audio. For a streaming product, dialnorm is the lever that lets a catalogue of titles, mastered by different studios, present a uniform loudness to the viewer — set it wrong and viewers reach for the volume control on every title change.
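How a decoder turns dialnorm into an attenuation can be sketched in one line, under an assumption this article does not itself state: in the AC-3 family, dialnorm is conventionally a value from −1 to −31 dBFS and the decoder attenuates so that dialogue lands at −31 dBFS. Treat the reference level and the function name below as assumptions and confirm them against the codec specification before relying on them.

```python
REFERENCE_DIALOGUE_LEVEL_DB = -31  # assumed AC-3 family normalisation point


def decoder_attenuation_db(dialnorm_db: int) -> int:
    """Attenuation (in dB) a decoder applies for a given dialnorm value."""
    if not -31 <= dialnorm_db <= -1:
        raise ValueError("dialnorm is expected to lie between -31 and -1")
    return dialnorm_db - REFERENCE_DIALOGUE_LEVEL_DB


# A film authored with dialogue at -27 is turned down by 4 dB;
# a quieter programme authored at -31 is left alone.
print(decoder_attenuation_db(-27))  # 4
print(decoder_attenuation_db(-31))  # 0
```

This is why a wrong dialnorm is audible: the audio is untouched, but the device applies the wrong amount of attenuation to it.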
A storage and CDN budget for Atmos
Product people need a number, so here is how to build one. The headline is that Atmos is cheap on the wire relative to video, because spatial coding has already done the heavy lifting. Take a premium film delivered with Atmos at 768 kbps and a two-hour runtime:
768 kbps × 7,200 s = 5,529,600 kbits = 5,529.6 Mbit
5,529.6 Mbit ÷ 8 = 691.2 MB per title for the Atmos audio rendition
Under 700 MB of audio for a two-hour film, against perhaps 8–20 GB for the 4K video of the same title — the Atmos track is a single-digit percentage of the total. If you also store the backward-compatible fallbacks, the math stays modest: a 5.1 E-AC-3 fallback at 384 kbps adds about 346 MB, and a stereo AAC fallback at 128 kbps adds about 115 MB. The three audio renditions together come to roughly 1.15 GB, still small beside the video. Switch the carriage to AC-4 at 320 kbps for the immersive track and the Atmos rendition alone drops to about 288 MB — the "half the bitrate" claim made concrete.
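The storage figures above come from one formula, bitrate times duration divided by eight. A small sketch that reproduces them, using decimal megabytes (1 MB = 1,000,000 bytes) as the worked example does; the rendition labels and the helper name are illustrative.

```python
def rendition_size_mb(bitrate_kbps: float, duration_s: float) -> float:
    """Size of one audio rendition in decimal megabytes."""
    return bitrate_kbps * duration_s / 8 / 1000


RUNTIME_S = 2 * 60 * 60  # two-hour film

renditions_kbps = {
    "Atmos (E-AC-3 JOC)": 768,
    "5.1 fallback (E-AC-3)": 384,
    "Stereo fallback (AAC)": 128,
}

sizes = {name: rendition_size_mb(kbps, RUNTIME_S) for name, kbps in renditions_kbps.items()}
for name, size in sizes.items():
    print(f"{name}: {size:.1f} MB")
print(f"All three: {sum(sizes.values()) / 1000:.2f} GB")
print(f"Atmos as AC-4 at 320 kbps: {rendition_size_mb(320, RUNTIME_S):.0f} MB")
```

Swap in your own ladder of bitrates and runtimes to get a per-title figure, then multiply by catalogue size for the storage line of the budget.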
Pitfall: budgeting Atmos like it is 128 channels. The naive mistake is to size storage and egress as though you were carrying the studio's 128 inputs. As the worked example at the top of the streaming section showed, that would be ~147 Mbps, or about 133 GB for a two-hour title (147.456 Mbps × 7,200 s ÷ 8), roughly two orders of magnitude too high. Always budget the delivery bitrate (a few hundred kbps), never the studio element count.
The CDN egress story follows directly: at 768 kbps, one hour of Atmos streamed to one viewer moves about 346 MB; at the AC-4 320 kbps figure, about 144 MB. Multiply by your viewer count and your average session length and you have the audio share of egress, which is almost always dwarfed by the video share. The honest product takeaway is that adding Atmos to an existing video service is a small bandwidth cost; the real costs are authoring (you need genuine object masters, not upmixes), QC (loudness and the no-fake-Atmos rule), and device testing (making sure the JOC fallback actually degrades gracefully on the long tail of older hardware).
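The egress estimate can be sketched the same way. The viewer count and session length below are hypothetical placeholders, not figures from any real service; only the two bitrates come from the text.

```python
def audio_egress_gb(bitrate_kbps: float, viewers: int, avg_session_hours: float) -> float:
    """Audio egress in decimal gigabytes for a number of viewing sessions."""
    total_kbits = bitrate_kbps * 3600 * avg_session_hours * viewers
    return total_kbits / 8 / 1_000_000  # kbit -> kB -> GB


VIEWERS = 10_000         # hypothetical number of sessions
SESSION_HOURS = 1.5      # hypothetical average session length

for label, kbps in (("E-AC-3 JOC at 768 kbps", 768), ("AC-4 at 320 kbps", 320)):
    print(f"{label}: {audio_egress_gb(kbps, VIEWERS, SESSION_HOURS):,.0f} GB")
```

Running the same function with your video ladder's bitrates next to it makes the proportion obvious: the audio line is a rounding error beside the video line.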
Where Atmos sits next to its rivals
For completeness, two adjacent formats come up whenever Atmos does. DTS:X is Atmos's main rival: also object-based, also layout-flexible, competing on ecosystem and licensing rather than on a different underlying philosophy; for a product decision, treat DTS:X and Atmos as the same category. MPEG-H 3D Audio, specified as ISO/IEC 23008-3, is the open ISO immersive standard that carries channel, object, and scene-based audio in one bitstream (up to 64 loudspeaker channels and 128 codec core channels) and is the broadcast-world alternative: the sole audio system of South Korea's ATSC 3.0 service, and one of two options (alongside Dolby AC-4) in the US ATSC 3.0 audio standard A/342. Atmos is not MPEG-H; when a broadcast target requires MPEG-H, that is a different authoring and carriage path, covered in MPEG-H 3D Audio: the ISO immersive standard.
Where Fora Soft fits in
Fora Soft has built video products since 2005 across streaming, OTT/Internet TV, video conferencing, e-learning, telemedicine, video surveillance, and AR/VR, and immersive audio is the part of those pipelines that users notice first when it is wrong. In OTT and streaming work we wire up Atmos delivery as E-AC-3 JOC — the broadly compatible default — with AC-4 where a target device base supports it, and we keep channel-based stereo and 5.1 fallbacks so a title plays correctly from a phone to a home cinema. We treat the ADM BWF master as the single source of truth and render the costumes from it, rather than upmixing stereo. In AR/VR projects, where the sound field must track the headset, we lean on the head-tracked rendering ideas that Apple's Spatial Audio popularised. The point of knowing all four costumes is practical: it is how you scope an Atmos feature that ships on the devices, and at the bitrates and loudness targets, your users and platforms actually require.
What to read next
- Channel-Based vs Object-Based vs Scene-Based Audio
- MPEG-H 3D Audio: the ISO immersive standard
- Atmos and immersive audio in streaming: Netflix, Disney+, Apple TV+, Tidal
Call to action
- Talk to an audio engineer: book a 30-minute scoping call to talk through your Dolby Atmos for streaming plan.
- See our case studies — 250+ shipped projects across video streaming, WebRTC, OTT, telemedicine, e-learning, surveillance, and AR/VR.
- Download the Dolby Atmos: Delivery Cheat Sheet — One page: the four Atmos costumes (cinema / streaming / music / binaural), the carriage codecs (E-AC-3 JOC vs AC-4 vs TrueHD) with bitrates, the loudness targets (-18 LUFS music vs ~-27 LKFS video), speaker-layout notation decoded, and….
References
- ETSI TS 102 366 V1.4.1 (2017-09), "Digital Audio Compression (AC-3, Enhanced AC-3) Standard." Controlling spec for E-AC-3 (Dolby Digital Plus); Enhanced AC-3 in Annex E; basis for the DD+ JOC streaming carriage and the 32 kbit/s–6.144 Mbit/s data-rate envelope. Tier 1 (official standard). https://www.etsi.org/deliver/etsi_ts/102300_102399/102366/01.04.01_60/ts_102366v010401p.pdf
- ETSI TS 103 190-2 V1.3.1 (2025-07), "Digital Audio Compression (AC-4) Standard; Part 2: Immersive and personalized audio." Controlling spec for AC-4 immersive audio and Advanced Joint Object Coding (A-JOC); current 2025 revision; reference renderer in ETSI TS 103 448. Tier 1. https://www.etsi.org/deliver/etsi_ts/103100_103199/10319002/01.03.01_60/ts_10319002v010301p.pdf
- ETSI TS 103 190-1, "Digital Audio Compression (AC-4) Standard; Part 1." AC-4 core codec specification. Tier 1. https://www.etsi.org/standards-search?search=103%20190-1
- Recommendation ITU-R BS.2076-2, "Audio Definition Model." The ADM metadata model underlying the Dolby Atmos master ADM/BWF deliverable (the speaker-agnostic interchange master). Tier 1. https://www.itu.int/rec/R-REC-BS.2076/en
- Recommendation ITU-R BS.1770-4, "Algorithms to measure audio programme loudness and true-peak audio level." Loudness-measurement method underlying the −18 LKFS music target, the −1 dBTP true-peak limit, and the dialogue-gated video reference. Tier 1. https://www.itu.int/rec/R-REC-BS.1770/en
- ISO/IEC 23008-3, "MPEG-H 3D Audio" (Part 3). The open ISO immersive standard used for comparison; carries channel/object/HOA in one bitstream, up to 64 loudspeaker channels and 128 codec core channels. Tier 1 (catalogue/abstract; normative text paywalled). https://www.iso.org/standard/83525.html
- Dolby Laboratories, "Dolby Atmos Renderer Guide." Source for the 128-input renderer limit (beds + objects), the up-to-118-object cap, the 7.1.2 home bed, and the Atmos master file formats. Tier 4 (first-party vendor). https://professional.dolby.com/siteassets/content-creation/dolby-atmos/dolby_atmos_renderer_guide.pdf
- Dolby Laboratories, "Dolby Atmos Cinema Sound." Source for cinema figures: up to 64 independent speaker feeds, a 9.1 bed plus up to 118 objects (up to 128 tracks), single-DCP real-time rendering, and the cinema "lossless" carriage. Tier 4. https://professional.dolby.com/cinema/dolby-atmos/
- Dolby Laboratories, "Dolby AC-4: Audio Delivery for Next-Generation Entertainment Services" (white paper). Source for AC-4 immersive 7.1.4 bitrates (~192–320 kbps) and Immersive Stereo (64–256 kbps), and the "~half the bitrate of DD+" efficiency claim. Tier 4 (vendor characterization, not a hard spec limit). https://professional.dolby.com/siteassets/technologies/dolby_atmos_ac-4_whitepaper.pdf
- Dolby Laboratories, "Dolby Atmos encoding" (Hybrik documentation). Source for spatial coding to 12/14/16 elements, the DD+ JOC minimum data rates (384 kbps for 12 elements, 448 kbps for 16), and the JOC backward-compatibility model. Tier 4. https://docs.hybrik.com/tutorials/dolby_atmos/
- Apple, "Delivering Dolby Atmos audio" (Apple Music Provider Support). Source for the music master format (ADM BWF, 24-bit/48 kHz), the −18 LKFS / −1 dBTP music loudness target, the 7.1.4-downmix-then-binaural render path, and the no-upmix ("not allowed") rule. Tier 4. https://itunespartner.apple.com/music/support/5216-delivering-dolby-atmos-audio
- Netflix, "Dolby Atmos Home Mix Deliverable Requirements v2.3." Source for the ~−27 LKFS dialogue-gated home/streaming loudness reference. Tier 4. https://partnerhelp.netflixstudios.com/hc/en-us/articles/115001539991
Note on source hierarchy (per our research standard): where a vendor description and a controlling standard disagree, the article follows the standard. Codec-carriage facts are anchored to the ETSI and ITU-R standards above; Dolby and Apple documents are cited for deployment specifics (renderer limits, encoder minimums, platform loudness targets) that the standards do not fix. The per-platform music codec split (Apple = DD+ JOC; Amazon/Tidal = AC-4 IMS) and AC-4's streaming rollout status are reported by trade press and are flagged in-text as needing primary confirmation.


