Why This Matters
If you ship video to more than one platform (iOS and Android, Safari and Chrome, Apple TV and Roku, a browser and a smart TV), you used to pay a tax. You encoded once, packaged twice, encrypted twice, stored twice, and watched your CDN cache hit ratio fall because the same content existed in two formats that carried identical frames but were not byte-identical. CMAF is the format that ended that tax, and the saving is not small: a 50% drop in storage, a roughly 2× improvement in CDN cache hit ratio, a single multi-DRM workflow instead of three, and a single packaging pipeline instead of two. The mechanics matter because three things still go wrong in production: the encryption scheme you pick (cenc vs cbcs), the codec brand you declare (which devices reject cmf2?), and the segment-vs-fragment-vs-chunk distinction (Apple's HLS spec uses "segment" to mean what CMAF calls "fragment"). Getting any of them wrong silently reintroduces the dual-stack tax. This article makes every layer of CMAF readable: the box hierarchy, the brand vocabulary, the encryption modes, the relationship to HLS and DASH manifests, and the production stack as it actually ships in 2026.
What CMAF is, in one paragraph
CMAF is a packaging format — not a codec, not a protocol, not a delivery mechanism. It tells you how to lay out video and audio bytes on disk so that the same file can be referenced by both an HLS playlist and a DASH manifest, decrypted by FairPlay, Widevine, and PlayReady from the same key, and decoded by any player that already understands fragmented MP4. Concretely, CMAF is a constrained profile of ISO Base Media File Format (ISO BMFF, the same standard that defines .mp4), with a fixed packaging order — initialization segment containing a moov box, followed by media segments each containing one or more moof + mdat pairs — and a vocabulary of brands and media profiles that declare which codec, encryption mode, and HDR characteristics are inside. The standardization story is short: Apple and Microsoft proposed CMAF jointly to MPEG in early 2016, the first edition published as ISO/IEC 23000-19 in 2018, the second edition in 2020, and the third edition in February 2024 — that third edition is what production deployments built in 2026 should be targeting.
The history, in four milestones
CMAF exists because of a problem the streaming industry created for itself in 2009 and lived with for almost a decade.
The first milestone is the 2009 publication of HLS by Apple. HLS was built on MPEG-2 Transport Stream — the same .ts segment format used by satellite and cable broadcast — because in 2009 every Apple device decoded MPEG-TS natively in hardware. The trade-off was that nobody else used MPEG-TS for internet streaming. Browsers, Android, Smart TVs, and the rest of the device population converged on fragmented MP4 (fMP4), which was the format that DASH adopted when MPEG published ISO/IEC 23009-1 in 2012. From 2012 onward, any service that wanted to reach both Apple and the rest of the world had to package the same content twice: once as MPEG-TS for the HLS playlist and once as fMP4 for the DASH manifest. Storage doubled. CDN cache efficiency halved. Encryption doubled — Apple's FairPlay encrypted MPEG-TS, while Widevine and PlayReady encrypted fMP4 — meaning multi-DRM was a three-pipeline operation rather than a one-pipeline operation.
The second milestone is June 2016, when Apple announced at WWDC that HLS would, for the first time, support fragmented MP4 segments alongside MPEG-TS. The announcement was the structural prerequisite for CMAF: it meant a single fMP4 file could in principle be referenced from both an HLS playlist and a DASH manifest. Four months earlier — in February 2016 — Apple and Microsoft had jointly submitted a proposal to MPEG to formalize the shared format. Akamai and a handful of other vendors joined the proposal during 2016. The MPEG working group accepted it onto a standardization track that year.
The third milestone is January 2018, when MPEG published the first edition of ISO/IEC 23000-19 — Common Media Application Format for segmented media. The 2018 edition defined the CMAF track, segment, fragment, and chunk; the structural brands cmfc and cmf2; the first set of media profiles for H.264/AVC, HEVC, and AAC; and the requirement that CMAF media be carried in 'isom' / 'iso6'-compatible fragmented MP4. The 2020 second edition added profiles for HEVC HDR (HDR10, HDR10+, HLG) and Dolby Atmos audio. The 2024 third edition — published February 2024 — is the current baseline; it adds profiles for AV1, VVC (Versatile Video Coding, H.266), and tightens the structural constraints on the moof box. Amendment 1 to the 2024 edition, published July 2024, adds Low Complexity Enhancement Video Coding (LCEVC) and related profiles.
The fourth milestone is the production-stack maturation between 2019 and 2026. The major packagers (Shaka Packager, Bento4, AWS Elemental MediaPackage, Unified Streaming's Unified Packager, Wowza Streaming Engine, Bitmovin Live, Mux, Norsk) all shipped CMAF-emitting modes between 2019 and 2022 and made CMAF the default between 2023 and 2025. The major players (Apple's native HLS player on iOS / tvOS 10+, hls.js, Shaka Player, dash.js, Bitmovin Player, ExoPlayer, Video.js, THEOplayer) all support CMAF natively. The major DRMs (FairPlay, Widevine, PlayReady) all support cbcs encryption, which means one encrypted CMAF stream covers all three. As of the Bitmovin Video Developer Report 2025/26, CMAF is the dominant packaging format among the 700+ surveyed streaming engineers — the dual-stack tax is, for new deployments, gone.
The cost of not using CMAF — and the saving when you do
The single most concrete way to understand CMAF is to count the bytes. Consider a representative VOD library: 1,000 hours of content, encoded at five renditions (1080p, 720p, 540p, 360p, 240p), four-second segments, average video bitrate 2 Mbps across the ladder.
Without CMAF — packaging each rendition as both MPEG-TS for HLS and fMP4 for DASH — the storage math is:
1,000 hours × 3,600 s/hour × 2 Mbit/s / 8 = 900 GB per rendition
5 renditions × 900 GB = 4,500 GB per packaging stack
2 packaging stacks (HLS + DASH) = 9,000 GB total at origin
With CMAF — one set of fMP4 files referenced by both an HLS playlist and a DASH manifest — the math collapses to one stack:
1,000 hours × 3,600 s × 2 Mbit/s / 8 = 900 GB per rendition
5 renditions × 900 GB = 4,500 GB
1 packaging stack (CMAF) = 4,500 GB total at origin
The saving is exactly 50% of origin storage, with the same content reach. The CDN economics are even better. A CDN bills you for cache misses (origin egress) and stores popular files at the edge. Without CMAF, segment 02.ts (HLS) and segment 02.m4s (DASH) are two separate cache objects even though they carry the same frames — the cache wastes shelf space on duplicates, and the hit ratio drops. With CMAF, both manifests reference the same .m4s object, the cache stores it once, and the hit ratio roughly doubles for the same edge footprint. The compounded effect — half the storage, double the cache efficiency, half the encoding-and-packaging pipeline cost — is what makes CMAF the default in 2026 and not a curiosity.
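The storage arithmetic above is easy to reproduce for your own library. The following is a minimal sketch, assuming decimal units (1 GB = 1,000 MB) and a single average bitrate per rendition, as in the worked numbers; the function name and parameters are illustrative, not part of any tool.

```python
def origin_storage_gb(hours, avg_mbps, renditions, packaging_stacks):
    """Origin storage in decimal gigabytes for a VOD library."""
    seconds = hours * 3600
    megabytes_per_rendition = seconds * avg_mbps / 8  # Mbit -> MB
    gb_per_rendition = megabytes_per_rendition / 1000  # MB -> GB
    return gb_per_rendition * renditions * packaging_stacks


# Same inputs as the example: 1,000 hours, 2 Mbit/s average, 5 renditions.
dual_stack = origin_storage_gb(1000, 2, 5, packaging_stacks=2)  # HLS + DASH
cmaf = origin_storage_gb(1000, 2, 5, packaging_stacks=1)        # CMAF only
saving = 1 - cmaf / dual_stack

print(dual_stack, cmaf, saving)  # 9000.0 4500.0 0.5
```

Swapping in your own hours, ladder size, and average bitrate gives a first-order estimate; real libraries also carry audio, subtitles, and manifests, which this sketch ignores.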
The encryption saving is independent. Without CMAF, you encrypted MPEG-TS for FairPlay (AES-128-CBC at the time), fMP4 for Widevine (AES-128-CTR, the cenc scheme), and fMP4 again for PlayReady (also cenc). That meant three encrypted variants. With CMAF and Common Encryption's cbcs scheme — supported by FairPlay since 2017, by PlayReady 4.0+ since 2018, and by Widevine since L1 firmware in 2018 — you encrypt the CMAF file once under one content key, and all three DRMs decrypt it. One encrypted copy, three DRMs, full device coverage.
The CMAF object model — track, fragment, chunk, segment
The CMAF specification defines a small hierarchy of objects, and getting the names right is the single most important thing for production engineers to do. The same word — "segment" — means different things in HLS, DASH, and CMAF, and the confusion is the source of half the production bugs in this space.
CMAF track
A CMAF track is a single continuous media stream (one video angle, one audio channel mix, one subtitle stream) carried inside a CMAF file. It corresponds one-to-one with a trak box inside ISO BMFF. Every CMAF presentation is a collection of tracks: typically one video track per rendition (1080p, 720p, etc.), one audio track per language and channel layout (English stereo, Spanish 5.1), and zero or more subtitle tracks (English WebVTT, Spanish WebVTT). A CMAF track is encoded with one codec at one fixed configuration; you switch quality by switching tracks, not by changing parameters inside a track. Every CMAF track starts with a CMAF header (the initialization segment, whose moov box contains the mvhd, trak, mdia, and codec-specific configuration boxes) that the player reads once at startup and reuses for the duration of playback.
CMAF chunk
A CMAF chunk is the smallest decodable unit. It is exactly one moof box (the movie fragment, containing sample timing and offsets) followed by exactly one mdat box (the media data, containing the compressed frames). The pair moof + mdat is the atomic unit a player's source buffer accepts. A typical chunk holds 200–500 ms of media — at 30 frames per second, that is 6–15 video frames per chunk; at 60 fps, 12–30 frames per chunk. The CMAF 2024 specification tightens the ISO BMFF allowance from "one or more chunks per fragment" to exactly one chunk per fragment in the strict cmf2 brand, which simplifies player parsing and is the production default in 2026.
CMAF fragment
A CMAF fragment is one or more consecutive CMAF chunks whose first sample is independently decodable, so a player can start decoding at any fragment boundary. In ordinary VOD, a fragment is the same thing as a segment (the file the player downloads). In low-latency live streaming, a segment is built from one or more fragments, each of which is built from one or more chunks; this is the mechanism that lets LL-HLS and LL-DASH publish partial segments before the segment is finished. The hierarchy from top to bottom is: track contains segments; segment contains fragments; fragment contains chunks; chunk contains frames.
CMAF segment
A CMAF segment is the file the player actually downloads. It begins with a styp box (segment type, declaring which CMAF brands apply), then one or more moof + mdat pairs (the fragments and chunks), and ends when the segment file ends. Segments are typically 2–6 seconds long; 2 seconds is the modern default for live, 4 or 6 seconds is common for VOD. The HLS manifest references a CMAF segment as one #EXTINF entry; the DASH manifest references it via $Number$ or $Time$ in a SegmentTemplate. Same file, two references.
The terminology gets worse before it gets better. Apple's HLS specification uses the word "segment" to mean what CMAF calls a "fragment". In MPEG-TS HLS, each .ts file plays the role that a CMAF segment plays in fMP4-based HLS, although a .ts file is not itself a CMAF object. In some configurations, a segment-level element in a DASH manifest refers to a fragment of a segment rather than to a whole segment. The single most useful mental model is: frames pack into chunks; chunks pack into fragments; fragments pack into segments; segments are what the player downloads. Everything else is naming history.
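The object model is easiest to see by walking the boxes of a segment. The sketch below is a minimal top-level ISO BMFF box walker that counts moof + mdat pairs, that is, chunks. It runs against a synthetic segment built in the script (the payloads are zero-filled placeholders, not valid media), and the helper names `iter_boxes`, `count_chunks`, and `make_box` are illustrative assumptions.

```python
import struct


def iter_boxes(data, offset=0, end=None):
    """Yield (type, payload_start, box_end) for each top-level box."""
    end = len(data) if end is None else end
    while offset + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:  # 64-bit "largesize" follows the type field
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:  # box runs to the end of the data
            size = end - offset
        if size < header or offset + size > end:
            raise ValueError(f"malformed box at offset {offset}")
        yield box_type.decode("latin-1"), offset + header, offset + size
        offset += size


def count_chunks(segment_bytes):
    """Count moof boxes that are immediately followed by an mdat box."""
    types = [t for t, _, _ in iter_boxes(segment_bytes)]
    return sum(1 for a, b in zip(types, types[1:]) if (a, b) == ("moof", "mdat"))


def make_box(box_type, payload=b""):
    """Build a box with a 32-bit size header (test helper only)."""
    return struct.pack(">I4s", 8 + len(payload), box_type.encode("ascii")) + payload


# Synthetic segment: styp, then two moof + mdat pairs (two chunks).
segment = (
    make_box("styp", b"cmf2" + bytes(4) + b"cmf2" + b"cmfc")
    + make_box("moof", bytes(16)) + make_box("mdat", bytes(100))
    + make_box("moof", bytes(16)) + make_box("mdat", bytes(100))
)

box_types = [t for t, _, _ in iter_boxes(segment)]
print(box_types)              # ['styp', 'moof', 'mdat', 'moof', 'mdat']
print(count_chunks(segment))  # 2
```

Pointing `iter_boxes` at the bytes of a real .m4s file should show the same styp, moof, mdat layout; the walker deliberately does not descend into moof, so it says nothing about sample timing.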
CMAF brands and media profiles — the labels that determine compatibility
A CMAF file's styp (segment type) box and ftyp (file type) box carry one or more four-character brand codes that tell the parser what subset of ISO BMFF and which codec the file uses. Brands are the contract: a player that understands brand X will accept any file declaring brand X; a player that does not understand brand X will reject the file. Two layers of brands exist — structural brands that say which version of the CMAF specification applies, and media profile brands that say which codec and HDR characteristics are inside.
Structural brands: cmfc and cmf2
The two structural brands are cmfc (the base CMAF brand) and cmf2 (a tighter restriction). A file declaring cmfc complies with all the base CMAF constraints — packaging order, box presence, sample-grouping rules — but allows multiple chunks per fragment, multiple tracks per file, and some legacy ISO BMFF features. A file declaring cmf2 additionally enforces exactly one chunk per fragment, exactly one track per file, and a stricter moof box; it is the brand most modern players target because the strictness makes parsing deterministic. The 2026 production default is cmf2 for new content; older content tagged cmfc remains widely supported.
Video media profiles
A media profile brand declares the codec, profile, level, bit depth, and HDR transfer function. The CMAF third edition (2024) registers the following:
| Brand | Codec | Profile | HDR | Typical use |
|---|---|---|---|---|
| cfhd | H.264 / AVC | High | SDR | Web, mobile, smart TV; the safest default |
| chd1 | HEVC | Main10 | SDR | 4K SDR on iOS, tvOS, Smart TVs |
| clg1 | HEVC | Main10 | HLG | Broadcast-grade HDR live |
| cud1 | HEVC | Main10 | HDR10 | UHD HDR10 movies on iOS, tvOS, Smart TVs |
| chh1 | HEVC | Main10 | HDR10+ | Dynamic HDR (Samsung-led) |
| cdm1 | HEVC | Main10 | Dolby Vision | Dolby Vision profile 5 / 8 |
| cvvc | VVC (H.266) | Main10 | SDR/HDR | Future-proof, 2026 still emerging |
| cav1 | AV1 | Main | SDR/HDR | Web, YouTube, Netflix |
Audio media profiles
| Brand | Codec | Channel layout | Typical use |
|---|---|---|---|
| caac | AAC-LC | Mono / stereo / 5.1 / 7.1 | Web, mobile, smart TV (default) |
| chea | HE-AAC v2 | Mono / stereo | Low-bitrate mobile |
| cmac | xHE-AAC | Mono / stereo / 5.1 / 7.1 | Adaptive loudness, modern mobile |
| cac3 | Dolby AC-3 | 5.1 | Broadcast legacy |
| cec3 | Dolby Digital Plus (E-AC-3) | 5.1 / 7.1 | Premium VOD |
| caca | Dolby Atmos (E-AC-3 JOC) | Object-based | Premium VOD |
| cusc | xHE-AAC USAC | Stereo/multichannel | Newest, low-bitrate |
Subtitle and metadata profiles
CMAF also defines profiles for IMSC1 text subtitles (im1t, im1i), WebVTT (wvtt), and event-message tracks (emsg) used for ad insertion markers and metadata. A modern CMAF presentation typically declares one video brand, one or two audio brands, and one subtitle brand in its styp and ftyp boxes.
A worked example
A 4K HDR10 movie packaged as CMAF for a 2026 deployment might declare, in its ftyp box:
ftyp:
major_brand = 'cmf2'
minor_version = 0
compatible_brands = ['cmf2', 'cmfc', 'isom', 'iso6', 'cud1', 'caca']
This says: "I am a strict-CMAF (cmf2) file, also compatible with the base CMAF brand (cmfc), with the older ISO BMFF brands (isom, iso6), carrying HEVC Main10 HDR10 video (cud1) and Dolby Atmos audio (caca)." A player that understands cmf2, cud1, and caca decodes the file; a player missing any of those rejects it. The brand declaration is the contract that makes CMAF an explicit-compatibility format rather than a try-and-pray-it-works format.
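A brand declaration like the one above can be read back with a few lines of code, which is a cheap audit step after packaging. The sketch below parses an ftyp or styp box and reports which expected brands are missing. It runs on a synthetic ftyp for a hypothetical H.264 + AAC file (not the HDR10 example above), and `parse_brands` and `missing_brands` are illustrative names, not functions of any packager.

```python
import struct


def parse_brands(box_bytes):
    """Return (major_brand, minor_version, compatible_brands) from ftyp/styp."""
    size, box_type = struct.unpack_from(">I4s", box_bytes, 0)
    if box_type not in (b"ftyp", b"styp"):
        raise ValueError(f"not a brand box: {box_type!r}")
    major = box_bytes[8:12].decode("ascii")
    (minor_version,) = struct.unpack_from(">I", box_bytes, 12)
    compatible = [
        box_bytes[i:i + 4].decode("ascii") for i in range(16, size, 4)
    ]
    return major, minor_version, compatible


def missing_brands(declared, expected):
    """Brands the packaging job was expected to declare but did not."""
    return sorted(set(expected) - set(declared))


# Synthetic ftyp: 8-byte header, major brand, minor version, six brands.
brands = [b"cmf2", b"cmfc", b"isom", b"iso6", b"cfhd", b"caac"]
ftyp = struct.pack(">I4s4sI", 16 + 4 * len(brands), b"ftyp", b"cmf2", 0)
ftyp += b"".join(brands)

major, minor, compatible = parse_brands(ftyp)
print(major, minor, compatible)
print(missing_brands(compatible, ["cmf2", "cfhd", "cec3"]))  # ['cec3']
```

In a pipeline, the `expected` list would come from the encoding job's own description of what it produced, so that a forgotten codec or audio brand fails the build instead of failing on a device.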
Common Encryption — one key, three DRMs
CMAF's encryption layer is Common Encryption (CENC), defined by ISO/IEC 23001-7. Common Encryption decouples the encryption operation from the DRM that holds the keys. Content is encrypted once at packaging time using AES-128 in one of two modes; each DRM (FairPlay, Widevine, PlayReady) provides license-server logic that hands the same content key to the player, and the player decrypts with the same algorithm regardless of which DRM minted the license. One encrypted file, three DRMs, every modern device.
Two encryption schemes are in production use:
cenc (AES-128-CTR). Counter mode. Every protected byte range of a sample is encrypted in full, with no skip pattern; the IV is computed from a per-sample counter. It was the default as of 2018. Supported by Widevine and PlayReady; not supported by FairPlay. If you encrypt with cenc, Apple devices cannot play your content. cenc survives in legacy non-Apple deployments but is no longer the recommended default for new content.
cbcs (AES-128-CBC with subsample pattern). Cipher Block Chaining with a pattern that encrypts only some of the bytes — typically 1 of every 10 blocks for video (1:9 pattern) — leaving the rest in the clear so hardware decoders can parse the stream without decrypting frames they don't need. The pattern dramatically reduces decryption load on mobile chips. Supported by FairPlay (since 2017), PlayReady 4.0+ (since 2018), and Widevine L1 (since 2018). The 2026 default; if you encrypt with cbcs, your single CMAF file covers iOS, Android, smart TVs, and the open web.
Older content sometimes uses a third scheme — cbc1 (AES-128-CBC without the subsample pattern) — but cbc1 was never widely supported and is effectively retired in 2026.
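The 1:9 pattern described for cbcs is simple enough to sketch without any cryptography. The function below only computes which byte ranges of a protected region would be encrypted under a pattern of 1 encrypted block followed by 9 clear blocks, using 16-byte AES blocks. It is an illustration under stated assumptions (a trailing partial block is left in the clear, and the region starts on a pattern boundary), not an implementation of ISO/IEC 23001-7; `encrypted_ranges` is a made-up name.

```python
BLOCK = 16  # AES block size in bytes


def encrypted_ranges(protected_len, crypt_blocks=1, skip_blocks=9):
    """Byte ranges (start, end) that a crypt:skip pattern would encrypt."""
    ranges = []
    full_blocks = protected_len // BLOCK  # a trailing partial block stays clear
    period = crypt_blocks + skip_blocks
    block = 0
    while block < full_blocks:
        run = min(crypt_blocks, full_blocks - block)
        ranges.append((block * BLOCK, (block + run) * BLOCK))
        block += period
    return ranges


ranges = encrypted_ranges(400)  # a 400-byte protected region = 25 blocks
encrypted_bytes = sum(end - start for start, end in ranges)

print(ranges)                 # [(0, 16), (160, 176), (320, 336)]
print(encrypted_bytes / 400)  # 0.12
```

For a large region the encrypted share tends toward 10%, which is where the reduced decryption load on mobile hardware comes from.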
The encryption metadata sits in the CMAF file's moov box (key system declarations) and inside each moof (per-fragment initialization vectors). The Content Encryption Key (CEK) is identified by a Key ID (KID, a 16-byte UUID). At license-request time the player sends the KID to each DRM's license server; the license server returns a license containing the CEK protected under the DRM's own root-of-trust (FairPlay binds it to the Apple device's secure enclave; Widevine to the Trusty TEE or Strongbox; PlayReady to the Microsoft Hardware DRM bound key). The player's media stack decrypts each sample as it arrives at the source buffer. Every DRM sees the same CMAF file; only the license-server protocol differs.
The wire-level effect is dramatic. Before CMAF + CENC + cbcs, a multi-DRM deployment needed three encoded streams (or, more commonly, three encrypted variants of one stream), three license-server integrations, three CDN paths, three packaging pipelines, and three test plans. After CMAF + cbcs, the same library uses one stream, three license servers (still — there is no way around three license servers; the keys are the same but the licensing protocols differ), one CDN path, one packaging pipeline, and one test plan. The encryption operation moved from per-DRM to per-content; the licensing remains per-DRM.
A common production mistake is to assume that because a DRM system supports cbcs, every device running that DRM can play it. Devices below iOS 11 (released September 2017) cannot play cbcs-encrypted CMAF; pre-2018 PlayReady devices cannot either; some smart TVs shipped with Widevine L3 firmware that decoded only cenc. The 2026 rule is: if your audience is on devices released in or after 2019, cbcs covers everyone; if you still have to reach a long tail of pre-2018 hardware, you may need both a cenc variant and a cbcs variant of the same content. The decision tree is straightforward: audit your target device list, find the oldest device you must support, and check whether that device supports cbcs. If it does, ship cbcs only; if it does not, ship a cenc variant alongside the cbcs one.
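The decision tree can be written down as a small audit function. This is a sketch only: the `Device` records and their capability flags below are hypothetical placeholders, and a real audit would fill them from device testing or vendor documentation, not from this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    supports_cbcs: bool
    supports_cenc: bool


def plan_encryption(devices):
    """Return (schemes_to_package, unreachable_device_names)."""
    schemes, unreachable = set(), []
    for device in devices:
        if device.supports_cbcs:
            schemes.add("cbcs")   # preferred: one variant for all DRMs
        elif device.supports_cenc:
            schemes.add("cenc")   # legacy fallback variant
        else:
            unreachable.append(device.name)
    return schemes, unreachable


# Hypothetical audit lists; the names and flags are illustrative only.
modern = [
    Device("phone-2021", supports_cbcs=True, supports_cenc=True),
    Device("tv-2020", supports_cbcs=True, supports_cenc=True),
]
with_long_tail = modern + [
    Device("tv-2016", supports_cbcs=False, supports_cenc=True),
]

print(plan_encryption(modern))          # ({'cbcs'}, [])
print(sorted(plan_encryption(with_long_tail)[0]))  # ['cbcs', 'cenc']
```

If the result contains both schemes, that is the signal that a second encrypted variant, with its storage and cache cost, is the price of reaching the long tail.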
The dual-manifest pattern — one set of files, two manifests
The architectural payoff of CMAF is that one set of .m4s segment files on disk feeds both an HLS playlist (.m3u8) and a DASH manifest (.mpd). The packager writes the files once. The manifests are small text files generated alongside; their only job is to point at the segments and tell the player which segments belong together.
A skeleton HLS playlist referencing CMAF segments looks like:
#EXTM3U
#EXT-X-VERSION:6
#EXT-X-TARGETDURATION:4
#EXT-X-INDEPENDENT-SEGMENTS
#EXT-X-MAP:URI="init.mp4"
#EXTINF:4.0,
segment-1.m4s
#EXTINF:4.0,
segment-2.m4s
#EXTINF:4.0,
segment-3.m4s
#EXT-X-ENDLIST
The corresponding DASH manifest references the same init.mp4 and the same segment-N.m4s files:
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011"
profiles="urn:mpeg:dash:profile:isoff-live:2011,urn:mpeg:dash:profile:cmaf:2019"
type="static" mediaPresentationDuration="PT12S">
<Period>
<AdaptationSet mimeType="video/mp4" segmentAlignment="true" startWithSAP="1">
<Representation id="720p" bandwidth="2500000" codecs="avc1.4d401f">
<SegmentTemplate
initialization="init.mp4"
media="segment-$Number$.m4s"
startNumber="1"
duration="4"
timescale="1"/>
</Representation>
</AdaptationSet>
</Period>
</MPD>
The HLS player sees three segments in its playlist, fetches init.mp4 plus each .m4s file in order, decrypts under FairPlay using the cbcs scheme declared in the file's moov, and renders. The DASH player sees the same three segments via the SegmentTemplate, fetches the same init.mp4 and the same .m4s files in order, decrypts under Widevine or PlayReady using the same cbcs scheme and the same content key, and renders. The bytes on disk are identical. The CDN cache stores each .m4s exactly once. The origin egress is roughly half what it would be with the two-stack model.
A subtle detail in the DASH manifest is the profiles attribute carrying urn:mpeg:dash:profile:cmaf:2019. That is the DASH profile URN that declares the content is CMAF, and it tells the player it can rely on CMAF's structural constraints when parsing. The encryption scheme (cbcs rather than the older cenc) is signaled separately, in the manifest's ContentProtection descriptors and in the protection boxes of the initialization segment. The corresponding HLS convention is the #EXT-X-VERSION:6 line (the minimum HLS version that supports fMP4) and the #EXT-X-MAP reference to the initialization segment. Both manifests are small (kilobytes), light to regenerate, and live alongside the segment files; the cost is in the segments themselves, which are now packaged once.
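Because both manifests are plain text pointing at the same files, they can be generated from one segment list. The sketch below builds a minimal HLS media playlist and a minimal DASH MPD for the same three segments, then checks that both resolve to identical URIs. It is an illustration, not a packager: the function names are assumptions, the playlist includes the #EXT-X-TARGETDURATION tag that HLS requires, and the MPD declares the isoff-live profile, which is the profile that pairs with SegmentTemplate addressing.

```python
import math
import xml.etree.ElementTree as ET


def hls_playlist(init_uri, segment_uris, segment_seconds):
    """Minimal VOD media playlist referencing fMP4 / CMAF segments."""
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:6",
        f"#EXT-X-TARGETDURATION:{math.ceil(segment_seconds)}",
        "#EXT-X-INDEPENDENT-SEGMENTS",
        f'#EXT-X-MAP:URI="{init_uri}"',
    ]
    for uri in segment_uris:
        lines += [f"#EXTINF:{segment_seconds:.1f},", uri]
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"


def dash_mpd(init_uri, media_template, segment_count, segment_seconds):
    """Minimal static MPD using $Number$ addressing for the same files."""
    timescale = 1000
    duration = int(round(segment_seconds * timescale))
    total = segment_count * segment_seconds
    return (
        '<MPD xmlns="urn:mpeg:dash:schema:mpd:2011"\n'
        '     profiles="urn:mpeg:dash:profile:isoff-live:2011,'
        'urn:mpeg:dash:profile:cmaf:2019"\n'
        f'     type="static" mediaPresentationDuration="PT{total:g}S">\n'
        "  <Period>\n"
        '    <AdaptationSet mimeType="video/mp4" segmentAlignment="true"'
        ' startWithSAP="1">\n'
        '      <Representation id="720p" bandwidth="2500000"'
        ' codecs="avc1.4d401f">\n'
        f'        <SegmentTemplate initialization="{init_uri}"'
        f' media="{media_template}" startNumber="1"'
        f' duration="{duration}" timescale="{timescale}"/>\n'
        "      </Representation>\n"
        "    </AdaptationSet>\n"
        "  </Period>\n"
        "</MPD>\n"
    )


def dash_segment_uris(mpd_xml, count):
    """Expand the $Number$ template into concrete segment URIs."""
    ns = {"d": "urn:mpeg:dash:schema:mpd:2011"}
    template = ET.fromstring(mpd_xml).find(".//d:SegmentTemplate", ns)
    start = int(template.get("startNumber", "1"))
    media = template.get("media")
    return [media.replace("$Number$", str(start + i)) for i in range(count)]


segments = [f"segment-{n}.m4s" for n in range(1, 4)]
playlist = hls_playlist("init.mp4", segments, 4.0)
mpd = dash_mpd("init.mp4", "segment-$Number$.m4s", len(segments), 4.0)

hls_uris = [l for l in playlist.splitlines() if l and not l.startswith("#")]
print(hls_uris == dash_segment_uris(mpd, len(segments)))  # True
```

The final comparison is the property that matters for caching: if the two URI lists ever diverge, the CDN is storing two objects where it should store one.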
Where CMAF fits inside the broader streaming stack
CMAF is one of three layers in a modern streaming pipeline: the codec layer (H.264, HEVC, AV1, VVC) compresses frames; the packaging layer (CMAF) wraps the compressed bytes in addressable files; the delivery layer (HLS, DASH, LL-HLS, LL-DASH) serves those files to players via HTTP. CMAF sits in the middle. It does not care which codec produced the bytes (you can put any of the registered codecs inside a CMAF chunk) and it does not care which delivery protocol references the files (you can serve the same CMAF files via HLS, DASH, LL-HLS, or LL-DASH). What CMAF cares about is the order of boxes inside the file, the brand declarations that say what's inside, and the encryption scheme that wraps the samples.
The relationship to LL-HLS and LL-DASH is direct. Both low-latency protocols depend on the CMAF chunk being decodable independently of the rest of its parent segment. That is, the moment the encoder produces a 333 ms slice of video, the packager flushes a complete moof + mdat pair to disk; the origin reads that pair the instant it appears; the chunked-transfer or HTTP/2 stream framing pushes the bytes to the player; the player's source buffer accepts the chunk and the decoder starts producing frames roughly 333 ms behind the camera. Without CMAF chunks, low-latency HTTP streaming is impossible — you would have to wait for the whole 2–6 second segment to finish. CMAF is the packaging substrate that makes the LL-HLS and LL-DASH protocols' wire-level tricks (#EXT-X-PART, @availabilityTimeOffset) work. See LL-HLS in depth and LL-DASH in depth for how the protocols on top read the manifest signals; this article is about the substrate underneath.
The relationship to Media over QUIC (MoQ) is forward-looking. MoQ is a new transport protocol — built on QUIC — for the next generation of low-latency live streaming, and the working group has explicitly designed it to carry CMAF chunks as its payload. The IETF draft draft-wilaw-moq-cmafpackaging-01 (and the LOCMAF "Low Overhead CMAF for MoQ" proposal) define how a CMAF chunk maps onto a MoQ object. The point is that the packaging format does not change; only the transport above it does. CMAF is the lingua franca that makes investments in MoQ continuous with investments in LL-HLS and LL-DASH — you keep the encoders, the packagers, and the storage; you swap the delivery protocol when you need lower latency.
The relationship to MPEG-TS — the legacy HLS segment format — is one of careful retirement. MPEG-TS HLS is still required for a small population of devices: older Roku models, some Smart TVs released before 2018 (Vizio early models, certain LG webOS 3.x, certain Samsung Tizen 3.x), and a few embedded set-top boxes. A 2026 deployment that targets those devices ships a small MPEG-TS HLS fallback rendition alongside the main CMAF stack. For everything released since 2019, CMAF is the only packaging format you ship.
The packagers — what actually emits CMAF in 2026
Five packaging implementations dominate production:
Shaka Packager (Google, open source) is the most widely deployed CMAF packager. It emits cmf2-strict CMAF by default, supports cenc and cbcs encryption, integrates with the major DRM key-management services for Widevine, FairPlay, and PlayReady (EZDRM, Axinom, BuyDRM, PallyCon, and others), and runs in CI pipelines as a command-line tool or in a long-running live mode. It is the reference implementation behind hundreds of production deployments at Google, YouTube, and large operators.
Bento4 (Axiomatic Systems, open source) is the second-most-common implementation. Bento4 ships the mp4dash and mp4hls tools that produce DASH + CMAF and HLS + CMAF outputs from the same input, plus a rich library for inspecting fragmented MP4 files. Bento4 is the go-to packager for engineers who want a CLI-friendly toolkit and tight control over brand declarations.
AWS Elemental MediaPackage v2 is the dominant managed CMAF packager in cloud deployments. It accepts a single live ingest (typically RTMP, SRT, or RIST) and emits CMAF segments served simultaneously as LL-HLS and LL-DASH from one chunked-CMAF origin behind CloudFront. The v2 release (2023) made CMAF the default and removed the older dual-stack mode.
Unified Streaming's Unified Packager / Unified Origin runs as a long-running daemon that re-multiplexes incoming fMP4 (or MPEG-TS) into CMAF on the fly, generating HLS and DASH manifests dynamically per request. The "just-in-time packaging" model means you store one CMAF-friendly source on disk and let the origin emit any manifest variant — HLS, DASH, with or without low-latency, with any DRM, with any audio language combination — at request time. Used by a large fraction of European operators.
Bitmovin Live, Mux Live, and Norsk (id3as) are the major managed live-encoder-and-packager-as-a-service offerings. Each accepts an ingest (RTMP, SRT, WHIP), runs encoding to a configurable bitrate ladder, packages as CMAF with low-latency chunks, and serves LL-HLS and LL-DASH from a managed origin. The differentiators among the three are the ladder strategy, the API ergonomics, and the included analytics; the CMAF emission is functionally equivalent.
Shaka Packager and Bento4 cover the open-source bench; the three managed services cover the SaaS bench. Picking between them is a build-vs-buy question on top of which CMAF emission is, in 2026, table stakes.
Where Fora Soft fits in
We have been building video streaming, WebRTC, OTT, telemedicine, e-learning, video surveillance, and AR/VR software since 2005, and the CMAF transition is one we have walked clients through end to end — from the pre-2019 "package twice, encrypt twice" architecture to the 2026 "package once, encrypt once, serve to everything" baseline. Where we add value is in the migration: auditing an existing dual-stack origin, identifying which content can drop the MPEG-TS or cenc-only legs, planning a cbcs-only multi-DRM cutover, and getting the CDN cache-key configuration right so the new single-CMAF objects actually get the cache hit ratio you paid for. We do not sell encoders or packagers; we ship the application around them — the players, the back-end origin orchestration, the DRM integrations, the QoE dashboards — and we have been writing CMAF-aware streaming services since the format went GA.
Common pitfalls — the failure modes that still appear in 2026
The format is mature but the deployment is not always. Five failure modes still appear regularly in production audits.
Pitfall 1: Encrypting with cenc when the audience includes Apple. This is the single most common mistake. An engineer reads the Common Encryption spec, finds the older cenc scheme listed first, and picks it as the default. Apple devices refuse to play. The fix is to pick cbcs for all new content; the audit step is to open the initialization segment's moov box, find the schm (scheme type) box inside the protected sample entry, confirm it reads cbcs, and check that the tenc (track encryption) and pssh (Protection System Specific Header) boxes are present.
Pitfall 2: Declaring the wrong codec brand. A packager emits a 4K HEVC HDR10 stream but writes cmfc instead of cmf2 in the ftyp box, and forgets the cud1 brand. Some players accept the file (they read the codec from inside the moov box); some reject it (they trust only the brand declaration). The fix is to write every applicable brand into the compatible_brands list: base CMAF (cmfc or cmf2), the codec profile (cfhd / chd1 / cud1 / cav1 / etc.), and the audio profile (caac / cec3 / caca).
Pitfall 3: Mixing chunk durations across renditions. The 1080p rendition uses 333 ms chunks, the 720p uses 500 ms chunks, the 540p uses 200 ms chunks. The player's adaptive-bitrate algorithm fails because the chunks no longer align in wall-clock time. The fix is to use one chunk duration across the entire ladder. The production default in 2026 is 333 ms because it lines up neatly with 30 fps (10 frames per chunk) and 60 fps (20 frames per chunk).
Pitfall 4: Forgetting that the CMAF presentation must be segment-aligned across renditions. The 1080p segment 42 begins at wall-clock 02:48.000, but the 720p segment 42 begins at 02:48.040 because the encoder cadence drifted. The player's ABR switch produces a 40 ms visible glitch. The fix is to ensure the upstream encoder produces aligned GOPs across renditions, so that every rendition starts a new IDR frame at the same wall-clock moment, and only then to declare segmentAlignment="true" in the DASH manifest.
Pitfall 5: Caching .m4s segments without normalizing the cache key. The HLS playlist URL contains a session token (?session=abc123), and the CDN treats segment-42.m4s?session=abc123 and segment-42.m4s?session=xyz789 as two different cache objects. The same segment is stored once per viewer, and the cache hit ratio collapses to nothing. The fix is to either (1) strip the session token from the cache key at the CDN edge, leaving only the path as the cache key, or (2) put authentication on the playlist URL only and let the segment URLs be unsigned. The CMAF saving is real only when the CDN actually caches each unique object once.
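The cache-key fix in the last pitfall amounts to a pure function from request URL to cache key. The sketch below shows the idea using only the Python standard library. The parameter names in `PER_VIEWER_PARAMS` and the example hostname are assumptions for illustration; a real deployment would express the same rule in its CDN's own configuration language.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical per-viewer query parameters that must not affect caching.
PER_VIEWER_PARAMS = frozenset({"session", "token"})


def cache_key(url, drop=PER_VIEWER_PARAMS):
    """Return the URL with per-viewer parameters removed and the rest sorted."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in drop
    )
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )


viewer_a = "https://cdn.example.com/v/segment-42.m4s?session=abc123"
viewer_b = "https://cdn.example.com/v/segment-42.m4s?session=xyz789"

print(cache_key(viewer_a))
# https://cdn.example.com/v/segment-42.m4s
print(cache_key(viewer_a) == cache_key(viewer_b))  # True
```

Sorting the surviving parameters is a small extra step so that two URLs that differ only in parameter order also map to one cache object.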
Production reality — adoption, throughput, and the next two years
The Bitmovin Video Developer Report 2025/26 — the most recent industry survey at the time of this writing — reports that CMAF is now the most common packaging format in the surveyed population of streaming engineers. Adoption rose from roughly 32% in 2020 (the year cbcs reached production-stable support across all three DRMs) to roughly 78% by the 2024/25 report; the 2025/26 report shows that figure approaching 90% among new deployments. The remaining 10% is a mix of long-tail legacy MPEG-TS-only HLS deployments, internal-network broadcast environments that ship MPEG-TS for tooling reasons, and a small number of pre-2018 packagers that have not yet been migrated.
The throughput characteristics matter for capacity planning. A CMAF-packaged 1080p stream at 4 Mbps live, with 333 ms chunks, runs at 3 chunks per second through the packager and origin; a 5-rendition ladder produces 15 chunks per second; a 10-thousand-viewer event with a smart-edge CDN footprint serves those 15 chunks/second at roughly 99% cache hits at the edge, meaning the origin sees only the fresh chunks plus the rare miss. The wire load is the same as the pre-CMAF dual-stack origin saw for one stack. The difference is the missing second stack and its missing duplicate cache pressure.
Looking forward, the two changes worth tracking are the codec layer and the transport layer. The codec layer is shifting toward AV1 (currently the YouTube and Netflix default for new content) and VVC/H.266 (the long-promised HEVC successor, now shipping in select Asian markets); both have CMAF brands defined and ready, and the packaging story does not change as the codec changes. The transport layer is the more interesting axis: Media over QUIC is the working group's attempt to put CMAF chunks onto a true streaming transport rather than HTTP, and the Cloudflare 330-city MoQ relay network plus the Bitmovin + Cloudflare MoQ integration are early production hints of where the next major shift goes. Throughout that shift, CMAF stays. The packaging format is the substrate that lets every protocol on top — HLS, DASH, LL-HLS, LL-DASH, MoQ, HESP — share the same files.
What to read next
- HLS in depth: m3u8, segments, multi-variant playlists
- MPEG-DASH in depth: MPD, periods, adaptation sets, representations
- LL-DASH and low-latency CMAF: chunked encoding in practice
References
- ISO/IEC 23000-19:2024 — Information technology — Multimedia application format (MPEG-A) — Part 19: Common media application format (CMAF) for segmented media. Third edition, February 2024. The current baseline CMAF specification. Catalogue page. Tier 1 (official spec).
- ISO/IEC 23000-19:2024/Amd 1:2024 — Amendment 1: Low complexity enhancement video Coding (LCEVC) and other technologies. Published July 2024. Adds LCEVC media profile to CMAF. Catalogue page. Tier 1.
- ISO/IEC 23001-7:2023 — Information technology — MPEG systems technologies — Part 7: Common encryption in ISO base media file format files. Fourth edition. Defines cenc, cbcs, and cbc1 schemes. Tier 1.
- ISO/IEC 23009-1:2022 — Dynamic adaptive streaming over HTTP (DASH) — Part 1: Media presentation description and segment formats. Fifth edition, August 2022. Defines the DASH manifest, the urn:mpeg:dash:profile:cmaf:2019 profile URN, and the CMAF integration into DASH. Tier 1.
- Apple HLS Authoring Specification, revision 2025-09. The normative Apple document that specifies fMP4 / CMAF requirements for HLS, including the requirement to use cbcs for FairPlay-protected content. Tier 1.
- draft-pantos-hls-rfc8216bis-15 — The HLS specification revision under IETF process, May 2025. Adds CMAF-aware tags and clarifies fragment / segment terminology. Subject to revision before final publication as an updated RFC. Tier 1 (IETF Internet-Draft).
- DASH-IF Implementation Guidelines: Content Protection and Multi-DRM v1.5 (2025). The de facto implementation profile for CMAF + CENC with FairPlay, Widevine, and PlayReady from one stream. Tier 1 (industry forum).
- CTA-WAVE Content Specification (CTA-5001) — Web Application Video Ecosystem Content Specification. The CTA specification that pins CMAF brand declarations and codec profiles for smart TVs and connected devices. Tier 1.
- Shaka Packager v3 source tree (Google, open source) — the reference implementation of CMAF emission used across Google's production and many third-party deployments. Tier 2 (reference implementation).
- Bento4 v1.6 source tree (Axiomatic Systems, open source) — the alternate reference implementation for CMAF emission and inspection. Tier 2.
- Akamai — "CMAF: What It Is and Why It May Change Your OTT Future" (Will Law, 2016, archived blog post) — the original industry post that introduced CMAF to the broader streaming community. Tier 3 (first-party engineering blog from a co-editor of the format).
- Bitmovin Video Developer Report 2025/26 (report page) — the most recent annual survey of 700+ streaming engineers, the source of the CMAF adoption figures cited in this article. Tier 4 (industry survey).
In any disagreement between sources, this article followed the ISO/IEC 23000-19:2024 third edition over older vendor blog posts; in particular, the chunk-per-fragment constraint cited in the "CMAF chunk" section follows the 2024 third edition and overrides 2018-era blog posts that describe the looser pre-2024 constraint.


