Why This Matters
Almost every product-side argument about streaming protocols — "can we hit one-second latency?", "why does the bill keep climbing?", "will this play on smart TVs?" — comes back to which protocol the team picked, and which protocol the team picked comes back to a single three-axis decision they may not have written down. A product manager who knows the family tree can read an architecture diagram, spot the hybrid stack, and predict the cost shape without asking the engineering team to translate. An engineer who knows the family tree can defend a protocol choice with one sentence — "we use LL-HLS because we need three-second latency on iOS without a per-viewer compute cost" — instead of a forty-minute slide. This article is the map of the territory before any of the deep-dive articles in Block 4. Read it before you read the HLS, DASH, CMAF, WebRTC, WHEP, HESP, or MoQ articles; everything in those pages slots into the tree you build here.
The three choices that make a delivery protocol
A delivery protocol is the rule book for moving a video stream from a server (or a peer) to a player over the public internet. The rule book always answers three questions, and the answers to those three questions are what put a protocol on one branch of the family tree or another. Memorise the three questions and you can place every named protocol in the right place on the tree without having read its specification.
The first question is which transport layer the bytes ride on. The internet offers three serious options. TCP, the connection-oriented transport, was designed for files; it guarantees that every byte arrives in order, retransmits whatever was lost, and waits as long as it needs to, which is great for an .iso download and painful for a live stream because a five-hundred-millisecond retransmit means a five-hundred-millisecond freeze in the picture. UDP, the connectionless transport, sends datagrams and forgets about them; loss is up to the application to detect and decide on. QUIC, the new transport published as RFC 9000 in 2021, runs on top of UDP but adds an encrypted handshake, multiplexed streams that do not block each other, and connection migration; for streaming purposes you can think of QUIC as "TCP's reliability when you want it, UDP's speed when you don't, in one protocol".
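To make the TCP freeze concrete, here is a toy Python model of five packets sent 20 ms apart with one loss. It is a sketch under stated assumptions, not a network simulator: the `Packet` class, the 500 ms retransmit delay, and the idea that a retransmitted copy lands exactly that long after the original would have are all illustrative.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int
    arrival_ms: float   # when the first copy would reach the receiver
    lost: bool = False  # True if that first copy is dropped in transit


def tcp_delivery_times(packets, retransmit_ms):
    """In-order delivery: a lost packet arrives late and everything behind it waits."""
    times = {}
    ready_at = 0.0
    for p in sorted(packets, key=lambda p: p.seq):
        arrival = p.arrival_ms + (retransmit_ms if p.lost else 0.0)
        # A packet cannot be handed to the player before its predecessors.
        ready_at = max(ready_at, arrival)
        times[p.seq] = ready_at
    return times


def udp_delivery_times(packets):
    """Datagram delivery: lost packets never show up, the rest are handed up on arrival."""
    return {p.seq: p.arrival_ms for p in packets if not p.lost}


packets = [Packet(seq=i, arrival_ms=20.0 * i) for i in range(5)]
packets[1].lost = True  # hypothetical loss of the second packet

tcp = tcp_delivery_times(packets, retransmit_ms=500.0)
udp = udp_delivery_times(packets)
print(tcp)
print(udp)
```

Under these assumptions the TCP-style receiver releases packets 1 to 4 together at 520 ms, while the UDP-style receiver delivers packets 2 to 4 on schedule and leaves the gap for the application to conceal.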
The second question is what shape the data takes on the wire. The two dominant shapes are segments (a finished file of two to six seconds of video, fetched whole over HTTP) and RTP packets (small real-time packets that ship as soon as the encoder produces them, typically 10–60 ms of audio or one slice of a video frame at a time). A third shape, chunks, sits in between: a chunk is a piece of a segment that the player can start fetching before the rest of the segment exists. Most low-latency HTTP streaming in 2026 is chunked.
The third question is who initiates the conversation. HTTP pull means the player asks the server for the next piece whenever it wants it; the server is dumb, the player is in control, and everything in front of the server (cache, CDN, edge) is a normal web cache. HTTP push is similar but the server hints or pushes the next piece before the player asks. Peer signalling (WebRTC, with WHIP and WHEP as its HTTP signalling layers) is the opposite: the server and the client (or two clients) negotiate a session up front, then media flows over a kept-open channel, and there is no cache in the middle. MoQ also keeps a session open, but its model is publish/subscribe through relays rather than peer negotiation.
Every named protocol you see in 2026 is a specific combination of one transport, one data shape, and one delivery model. Once you can name the three choices a protocol made, you can predict its latency floor, its scale economics, and its device-support story without reading any spec.
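The three-question idea can be written down as a small data structure. The Python sketch below is one possible encoding, not a formal taxonomy: the enum names and the `branch` rules are assumptions made for illustration, and the branch labels anticipate the family tree described later in this article.

```python
from dataclasses import dataclass
from enum import Enum


class Transport(Enum):
    TCP = "TCP"
    UDP = "UDP"
    QUIC = "QUIC"


class Shape(Enum):
    SEGMENT = "segment"
    CHUNK = "chunk"
    RTP_PACKET = "RTP packet"
    OBJECT = "object on a QUIC stream"


class Model(Enum):
    HTTP_PULL = "HTTP pull"
    HTTP_PUSH = "HTTP push"
    PEER_SIGNALLING = "peer signalling"
    PUBLISH_SUBSCRIBE = "publish/subscribe"


@dataclass(frozen=True)
class Protocol:
    name: str
    transport: Transport
    shape: Shape
    model: Model


def branch(p: Protocol) -> str:
    """Place a protocol on a branch using only its three choices."""
    if p.model in (Model.HTTP_PULL, Model.HTTP_PUSH):
        if p.shape is Shape.SEGMENT:
            return "HTTP-segmented"
        if p.shape is Shape.CHUNK:
            return "HTTP-chunked low-latency"
    if p.transport is Transport.UDP and p.shape is Shape.RTP_PACKET:
        return "UDP real-time"
    if p.transport is Transport.QUIC:
        return "QUIC real-time"
    return "unplaced"


hls = Protocol("HLS", Transport.TCP, Shape.SEGMENT, Model.HTTP_PULL)
ll_hls = Protocol("LL-HLS", Transport.TCP, Shape.CHUNK, Model.HTTP_PULL)
webrtc = Protocol("WebRTC", Transport.UDP, Shape.RTP_PACKET, Model.PEER_SIGNALLING)
moq = Protocol("MoQ", Transport.QUIC, Shape.OBJECT, Model.PUBLISH_SUBSCRIBE)

for p in (hls, ll_hls, webrtc, moq):
    print(p.name, "->", branch(p))
```

The point of the exercise is that `branch` never looks at the protocol's name, only at the three answers.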
The family tree
The picture below is the map. Read it top to bottom: the root splits into HTTP-based delivery (the left side of the tree) and real-time delivery (the right side). HTTP-based delivery splits again into segmented (HLS, DASH) and low-latency chunked (LL-HLS, LL-DASH, the chunked-CMAF profile of DASH). Real-time delivery splits into UDP-based (WebRTC, WHEP) and QUIC-based (MoQ, HESP). RTMP sits to the side as a legacy contribution protocol that no longer delivers to viewers.
Each branch corresponds to a coherent set of trade-offs.
Branch 1 — HTTP-segmented (HLS, DASH)
The oldest live branch, born when Apple shipped HLS in 2009 (later codified as RFC 8216) and MPEG published the first edition of DASH in 2012 (now ISO/IEC 23009-1:2022). The data shape is a segment: a two- to six-second finished file, usually fragmented MP4 these days. The transport is TCP, framed by HTTP/1.1 or HTTP/2. The delivery model is HTTP pull: the player downloads a manifest (.m3u8 for HLS, .mpd for DASH), reads which segments exist, and asks for them one at a time.
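The pull loop starts with reading the manifest. As a rough illustration, the Python sketch below extracts the segment list from an HLS media playlist. The sample playlist and its file names are hypothetical, and the parser handles only the `#EXT-X-MEDIA-SEQUENCE` and `#EXTINF` tags; a real player has to handle many more.

```python
SAMPLE_PLAYLIST = """#EXTM3U
#EXT-X-VERSION:7
#EXT-X-TARGETDURATION:6
#EXT-X-MEDIA-SEQUENCE:100
#EXT-X-MAP:URI="init.mp4"
#EXTINF:6.000,
seg100.m4s
#EXTINF:6.000,
seg101.m4s
#EXTINF:6.000,
seg102.m4s
"""


def parse_media_playlist(text):
    """Return (media_sequence, [(uri, duration_seconds), ...]) for a media playlist."""
    media_sequence = 0
    segments = []
    pending_duration = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#EXT-X-MEDIA-SEQUENCE:"):
            media_sequence = int(line.split(":", 1)[1])
        elif line.startswith("#EXTINF:"):
            # "#EXTINF:<duration>,<optional title>" announces the next URI line.
            pending_duration = float(line.split(":", 1)[1].split(",", 1)[0])
        elif line and not line.startswith("#") and pending_duration is not None:
            segments.append((line, pending_duration))
            pending_duration = None
    return media_sequence, segments


first_sequence, segments = parse_media_playlist(SAMPLE_PLAYLIST)
for offset, (uri, duration) in enumerate(segments):
    print(first_sequence + offset, uri, duration)
```

A live player would re-download the playlist periodically and request only the URIs it has not fetched yet.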
The numbers that fall out of this branch are predictable. Glass-to-glass latency lives between fifteen and forty-five seconds for a typical six-second segment ladder, because the player must wait for a segment to finish encoding, then wait for it to upload, then start playing it after a one-or-two-segment buffer. The scale economics are exceptional — every segment is a static file, every cache in front of it serves it for free after the first miss, the CDN bill is "bandwidth out" and nothing else. Device support is universal: every browser, every smart TV, every set-top box, every Roku, every iOS device speaks at least one of HLS or DASH natively or with a small JavaScript player. This branch carries roughly 80% of live delivery hours in 2026 according to the Bitmovin Video Developer Report 2025 and Conviva's 2026 State of Streaming, and essentially 100% of long-form VOD.
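The latency range above follows from simple arithmetic. The Python sketch below is a back-of-the-envelope model, not a measurement: the function name, the one-second transfer overhead, and the buffer depths are assumptions chosen to illustrate the waiting steps named in the paragraph.

```python
def segmented_latency_s(segment_s, buffered_segments, transfer_overhead_s):
    """Rough glass-to-glass estimate for segment-based HTTP delivery."""
    encode_wait = segment_s                        # the segment must finish encoding first
    player_buffer = buffered_segments * segment_s  # segments held before playback starts
    return encode_wait + transfer_overhead_s + player_buffer


# Six-second segments, an assumed 1 s for upload and CDN hops.
two_segment_buffer = segmented_latency_s(6.0, buffered_segments=2, transfer_overhead_s=1.0)
three_segment_buffer = segmented_latency_s(6.0, buffered_segments=3, transfer_overhead_s=1.0)
print(two_segment_buffer, three_segment_buffer)
```

With these inputs the estimate is 19 s for a two-segment buffer and 25 s for a three-segment buffer; shortening the segment is the only lever that moves every term at once.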
Branch 2 — HTTP-chunked low-latency (LL-HLS, LL-DASH)
The 2020-and-after branch, born when Apple added the LL-HLS extensions to the HLS Authoring Specification in 2019 and DASH-IF published the low-latency CMAF profile of DASH in 2020. The data shape is still a segment, but each segment is sliced into chunks of 200–500 ms (HLS uses EXT-X-PART parts; DASH uses chunked transfer encoding of CMAF chunks), and the player can fetch a chunk before the rest of the segment is encoded. The transport remains TCP framed by HTTP/1.1 or HTTP/2 — Apple removed the original HTTP/2-push requirement from LL-HLS in the September 2023 revision of the HLS Authoring Specification, so articles describing HTTP/2 push as required are out of date. The delivery model is HTTP pull with a few extra primitives: blocking playlist reload, preload hints, rendition reports.
Glass-to-glass latency drops to two to five seconds. The CDN story stays good (the chunks are still cacheable HTTP responses), though origin shielding and tiered caching need to be tuned for the new request rate. Device support is now broad but not universal: native LL-HLS works on iOS 14+ and tvOS 14+; LL-DASH needs a JavaScript player (Shaka Player or dash.js) on browsers and smart TVs. This branch carries roughly 10% of live delivery hours in 2026 and is the default new-deployment choice when latency below ten seconds is required and per-viewer cost must stay flat.
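Blocking playlist reload is the easiest of the LL-HLS primitives to show in code: the player asks for a playlist version that does not exist yet, and the server holds the response until it does. The Python sketch below builds such a request URL with the `_HLS_msn` and `_HLS_part` query parameters. The playlist URL and the sequence numbers are made up for illustration, and the helper name is not part of any library.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def blocking_reload_url(playlist_url, next_msn, next_part):
    """Add LL-HLS blocking-reload parameters to a media playlist URL."""
    parts = urlsplit(playlist_url)
    query = dict(parse_qsl(parts.query))   # keep any parameters already present
    query["_HLS_msn"] = str(next_msn)      # media sequence number the player waits for
    query["_HLS_part"] = str(next_part)    # part index inside that segment
    return urlunsplit(parts._replace(query=urlencode(query)))


url = blocking_reload_url("https://example.com/live/video.m3u8", next_msn=273, next_part=2)
print(url)
```

Because every player waiting for the same part requests the same URL, a CDN can collapse those requests into one origin fetch, which is why this branch keeps the HTTP cost shape.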
Branch 3 — UDP real-time (WebRTC delivery, WHEP)
The real-time branch on the right side of the tree. WebRTC, first published as a W3C Recommendation in January 2021 and built on the RFC 8825–8866 family, was designed for one-to-one calls; the industry has stretched it for one-to-many delivery by re-routing each viewer through a Selective Forwarding Unit (SFU). WHEP (WebRTC-HTTP Egress Protocol, currently draft-ietf-wish-whep-04) is a thin HTTP signalling layer on top of WebRTC that makes browser viewers as easy to attach as a <video> tag.
The data shape is RTP packets. The transport is UDP with DTLS-SRTP encryption (RFC 5764). The delivery model is peer signalling: the viewer's player and the SFU exchange an SDP offer/answer, ICE gathers candidate addresses and probes them with STUN (falling back to TURN relays when no direct path works), then media flows over a kept-open peer connection. Glass-to-glass latency is 200–500 ms. The scale story is the opposite of HTTP delivery: every viewer is a real WebRTC peer at the SFU, the SFU back-end pays per-viewer for CPU and bandwidth, and the cost curve grows linearly with concurrent viewers, not with bytes shipped. This branch carries roughly 7% of live delivery hours in 2026, concentrated in interactive use cases: live auctions, sports betting overlays, two-way classrooms, telemedicine, esports, and video conferencing fed back to a public audience.
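The WHEP half of the signalling is small enough to sketch. As the draft describes it, the viewer sends its SDP offer in an HTTP POST with content type `application/sdp` and receives the SDP answer in the response. The Python snippet below only constructs that request and never sends it. The endpoint URL, the token, and the truncated SDP string are placeholders, and the helper name is an assumption; details may change while WHEP is still an Internet-Draft.

```python
import urllib.request


def build_whep_offer_request(endpoint_url, sdp_offer, bearer_token=None):
    """Build (but do not send) the HTTP POST that carries a viewer's SDP offer."""
    headers = {"Content-Type": "application/sdp"}
    if bearer_token:
        headers["Authorization"] = f"Bearer {bearer_token}"
    return urllib.request.Request(
        endpoint_url,
        data=sdp_offer.encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Placeholder values: a real offer comes from the player's WebRTC stack.
request = build_whep_offer_request(
    "https://example.com/whep/stream1",
    "v=0\r\n",
    bearer_token="example-token",
)
print(request.get_method(), request.full_url)
```

Everything after the answer (ICE checks, DTLS handshake, SRTP media) happens on the peer connection rather than over HTTP, which is where the per-viewer cost on the SFU comes from.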
Branch 4 — QUIC real-time (Media over QUIC, HESP)
The new branch, still under construction. Media over QUIC (MoQ) is the protocol the IETF moq working group has been designing since 2022. The active document in 2026 is draft-ietf-moq-transport-17 (January 2026). The data shape is QUIC streams (an MoQ "object" is the unit), the transport is QUIC over UDP (RFC 9000), and the delivery model is publish/subscribe: a producer publishes to a relay, subscribers attach to the relay, and the relay multicasts inside a single QUIC connection. Cloudflare runs a public MoQ relay in 330+ cities in 2026; Meta and Twitch operate internal relays; an NAB 2026 interop demo connected eleven independent vendor stacks across the same network. HESP (draft-theo-hesp family) is a smaller, vendor-led alternative that ships sub-second latency over HTTP/2 and HTTP/3 with two synchronised streams (an "initialisation" stream and an "incremental" stream); it is in production at THEO Technologies customers but has not converged with MoQ.
Glass-to-glass latency on this branch lives between 200 ms and 1 s, and — uniquely — the protocol scales like HTTP (relays are cacheable, fan-out is many-to-many at the relay) while delivering like WebRTC (sub-second, no segments). That is why every major streaming team is tracking it. Production share in 2026 is small (≤ 3% of live hours), but the curve is the steepest of any branch.
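The publish/subscribe model is easier to picture with a toy. The Python class below is an in-memory illustration of relay fan-out only. It is not the MoQ wire protocol and not the API of any MoQ library; `ToyRelay`, the track name, and the byte payloads are all hypothetical.

```python
class ToyRelay:
    """In-memory stand-in for a relay: one publisher, many subscribers per track."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, track, callback):
        """Register a callback that receives every object published on a track."""
        self._subscribers.setdefault(track, []).append(callback)

    def publish(self, track, obj):
        """Forward one object to every subscriber of the track; return the fan-out count."""
        callbacks = self._subscribers.get(track, [])
        for callback in callbacks:
            callback(obj)
        return len(callbacks)


relay = ToyRelay()
viewer_a, viewer_b = [], []
relay.subscribe("camera-1/video", viewer_a.append)
relay.subscribe("camera-1/video", viewer_b.append)

fan_out = relay.publish("camera-1/video", b"object-0")
print(fan_out, viewer_a, viewer_b)
```

The producer sends each object once and the relay does the copying, which is the property that lets this branch scale more like a CDN than like an SFU.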
The outlier — RTMP
RTMP, the protocol Macromedia introduced in 2002 and Adobe last published as a specification in 2012, is on the tree only as a dashed branch off to the side. RTMP for distribution (server-to-player) is dead: browsers dropped Flash in 2020, and almost no modern player plays RTMP. RTMP for ingest (encoder-to-server) is the undying default: OBS, vMix, every consumer streaming appliance, YouTube Live, Twitch, Facebook Live, and every CDN's live-streaming endpoint still accept RTMP because every encoder still speaks it. We cover that asymmetry in RTMP in 2026: dead protocol, undying default.
The protocol cheat sheet
Eight protocols, the three choices each one made, the latency floor each one buys you, the cost shape, and the dominant 2026 use case. Read across a row to know what a protocol is for; read down a column to compare them on one axis.
| Protocol | Transport | Data shape | Delivery model | Glass-to-glass latency | Cost shape | 2026 share of live hours | Canonical use |
|---|---|---|---|---|---|---|---|
| HLS | TCP / HTTP/1.1, HTTP/2 | Segments (fMP4 or MPEG-TS) | HTTP pull | 15–45 s | Bandwidth-only, CDN-cached | ~45% | VOD and standard live |
| DASH | TCP / HTTP/1.1, HTTP/2 | Segments (fMP4) | HTTP pull | 15–45 s | Bandwidth-only, CDN-cached | ~35% | Non-Apple VOD and live |
| LL-HLS | TCP / HTTP/1.1, HTTP/2 | Chunked parts inside segments | HTTP pull with blocking reload, preload hints | 2–5 s | Bandwidth + tuned origin | ~6% | iOS live, news, sports |
| LL-DASH | TCP / HTTP/1.1, HTTP/2 | Chunked CMAF | HTTP pull with chunked transfer | 2–5 s | Bandwidth + tuned origin | ~4% | Non-Apple low-latency live |
| WebRTC delivery (incl. WHEP) | UDP + DTLS/SRTP | RTP packets | Peer signalling (SDP, ICE) | 200–500 ms | Per-viewer compute and bandwidth | ~7% | Auctions, betting, two-way live |
| HESP | HTTP/2 over TCP, or HTTP/3 over QUIC | Two parallel HTTP streams | HTTP pull, init + incremental | 0.4–1.0 s | Bandwidth + light origin compute | ~1% | Sports, betting, in production |
| MoQ | QUIC over UDP | QUIC streams of objects | Publish/subscribe via relay | 0.2–1.0 s | Bandwidth-cacheable at relay | ~2% and growing | Interactive broadcast, MUSH |
| RTMP (distribution) | TCP | AMF stream | Push | 2–10 s | Dying ecosystem | <0.1% | Legacy embeds only |
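Reading down the latency column can also be done in code. The Python sketch below copies the latency ranges and cost shapes from the table into a list and filters it by a latency budget. The structure and the `protocols_within` helper are illustrative assumptions, and the filter looks only at the upper bound of each range.

```python
# (name, latency_low_s, latency_high_s, cost_shape), copied from the cheat sheet.
CHEAT_SHEET = [
    ("HLS", 15.0, 45.0, "Bandwidth-only, CDN-cached"),
    ("DASH", 15.0, 45.0, "Bandwidth-only, CDN-cached"),
    ("LL-HLS", 2.0, 5.0, "Bandwidth + tuned origin"),
    ("LL-DASH", 2.0, 5.0, "Bandwidth + tuned origin"),
    ("WebRTC delivery (incl. WHEP)", 0.2, 0.5, "Per-viewer compute and bandwidth"),
    ("HESP", 0.4, 1.0, "Bandwidth + light origin compute"),
    ("MoQ", 0.2, 1.0, "Bandwidth-cacheable at relay"),
    ("RTMP (distribution)", 2.0, 10.0, "Dying ecosystem"),
]


def protocols_within(budget_s):
    """Names of protocols whose worst-case table latency fits the budget."""
    return [name for name, _low, high, _cost in CHEAT_SHEET if high <= budget_s]


sub_second = protocols_within(1.0)
under_five = protocols_within(5.0)
print(sub_second)
print(under_five)
```

A one-second budget leaves WebRTC delivery, HESP, and MoQ; relaxing it to five seconds adds LL-HLS and LL-DASH, at which point the cost-shape column becomes the deciding factor.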
A common pitfall: "we picked HLS"
The most frequent mistake we see in product reviews is the sentence "we picked HLS" used as if it answered the question of how the product delivers video. It almost never does. A real production stack ships at least two delivery protocols and often three. A typical 2026 OTT live-events product runs LL-HLS to iOS and Apple TV viewers, LL-DASH to everyone else, and a tiny WebRTC overlay for the host's monitor return and the live moderator chat backchannel — three protocols, one product. A typical 2026 video-conferencing product runs WebRTC for the interactive participants and LL-HLS for the read-only audience joining via a public link — two protocols, one product. A typical 2026 surveillance product runs WebRTC for the live operator view and HLS for the recording playback — two protocols, one product.
Asking "what protocol do you use?" produces a wrong-shaped answer. Asking "what protocol does each viewer class use, what is the latency budget for each class, and where does the protocol switch happen?" is the right shape. We unpack the architectures explicitly in The "switching protocols" reality: hybrid stacks.
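The per-viewer-class question maps naturally onto a routing function. The Python sketch below encodes the OTT live-events example from this section (LL-HLS for Apple devices, LL-DASH for everyone else, WebRTC for the host's monitor return). The class and platform labels and the sample audience are hypothetical, and a real router would also consider player capabilities and fallbacks.

```python
from collections import Counter

APPLE_PLATFORMS = {"ios", "tvos"}


def protocol_for(viewer_class, platform):
    """Pick a delivery protocol for one viewer in the OTT live-events example."""
    if viewer_class == "host_monitor":
        return "WebRTC"      # sub-second return feed for the host
    if platform in APPLE_PLATFORMS:
        return "LL-HLS"      # native low-latency playback on Apple devices
    return "LL-DASH"         # JavaScript-player path for everyone else


# Hypothetical audience sample: (viewer_class, platform).
viewers = [
    ("audience", "ios"),
    ("audience", "android"),
    ("audience", "web"),
    ("host_monitor", "web"),
    ("audience", "tvos"),
]

mix = Counter(protocol_for(viewer_class, platform) for viewer_class, platform in viewers)
print(dict(mix))
```

The function makes the switch point explicit: the protocol is decided per viewer, at session start, from the viewer class and platform.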
The corollary pitfall is the one-protocol-fits-all decision matrix that you sometimes see in vendor decks: a single column labelled "our recommendation" pointing at HLS or WebRTC for every row. Real recommendations are matrix-valued. The matrix is in Picking a delivery protocol in 2026: a decision tree.
Where Fora Soft fits in
Fora Soft has shipped products on every branch of this family tree since 2005 — HLS and DASH backbones for OTT and telemedicine, LL-HLS for live e-learning, WebRTC SFUs for video conferencing and AR/VR collaboration, and MoQ relays for early-stage interactive broadcast trials. Our delivery teams maintain reference architectures for each branch and a hybrid-stack playbook that connects them, and we routinely re-platform clients from a single-protocol stack onto a two- or three-protocol stack when the product's latency targets diverge across viewer classes. The branch a project lives on is rarely the same branch the project will live on after its first hundred thousand concurrent viewers.
What to read next
- HLS in depth: m3u8, segments, multi-variant playlists — the segmented branch in detail.
- LL-HLS in depth: parts, preload hints, blocking reload, rendition reports — the chunked branch in detail.
- Media over QUIC (MoQ) in depth: the 2026 turning point — the new branch.
Call to action
- Talk to a streaming engineer — bring your latency target, your viewer-class breakdown, and your CDN bill. We'll map your product onto the family tree in a 30-minute scoping call.
- See our case studies — OTT, telemedicine, AR/VR, video conferencing, surveillance, e-learning. One example per branch.
- Download the Delivery protocol family-tree cheat sheet (PDF) — one page, eight protocols, the three choices each one made, the latency floor and cost shape, printable for a wall.
References
- IETF RFC 8216 — HTTP Live Streaming (Pantos & May, August 2017). The canonical HLS specification. https://www.rfc-editor.org/rfc/rfc8216 — primary source for HLS segment-based delivery; supplements (LL-HLS) live in the Apple authoring spec.
- Apple HLS Authoring Specification, revision 2025-09. The living document Apple maintains for HLS conformance in the Apple ecosystem; contains the LL-HLS extensions (EXT-X-PART, preload hints, blocking reload, rendition reports). HTTP/2 push was removed from the LL-HLS section in revision 2023-09. https://developer.apple.com/documentation/http_live_streaming
- ISO/IEC 23009-1:2022 — Dynamic adaptive streaming over HTTP (DASH) — Part 1: Media presentation description and segment formats. The MPEG-DASH base specification, fifth edition. https://www.iso.org/standard/83314.html — paywalled; DASH-IF Implementation Guidelines mirror the normative content.
- ISO/IEC 23000-19:2024 — Common Media Application Format (CMAF) for segmented media. The packaging format under both HLS and DASH. https://www.iso.org/standard/85673.html
- IETF RFC 9000 — QUIC: A UDP-Based Multiplexed and Secure Transport (Iyengar & Thomson, May 2021). The QUIC transport that MoQ builds on. https://www.rfc-editor.org/rfc/rfc9000
- IETF RFC 9114 — HTTP/3 (Bishop, June 2022). HTTP over QUIC. https://www.rfc-editor.org/rfc/rfc9114
- IETF RFC 9725 — WebRTC-HTTP Ingest Protocol (WHIP) (Murillo & Gouaillard, March 2025). https://www.rfc-editor.org/rfc/rfc9725 — sibling protocol to WHEP; relevant for the family-tree's real-time branch.
- draft-ietf-wish-whep-04 — WebRTC-HTTP Egress Protocol (WHEP). The current Internet-Draft as of January 2026; subject to change before RFC publication. https://datatracker.ietf.org/doc/draft-ietf-wish-whep/
- draft-ietf-moq-transport-17 — Media over QUIC Transport (January 2026). The active MoQ document; subject to revision before RFC publication. https://datatracker.ietf.org/doc/draft-ietf-moq-transport/
- W3C WebRTC 1.0: Real-Time Communication Between Browsers — Recommendation, 26 January 2021. https://www.w3.org/TR/webrtc/ — the W3C side of the WebRTC standard, paired with the IETF RFC 8825–8866 RTCWEB family.
- IETF RFC 5764 — Datagram Transport Layer Security (DTLS) Extension to Establish Keys for Secure Real-time Transport Protocol (SRTP) (May 2010). https://www.rfc-editor.org/rfc/rfc5764 — the encryption layer for the WebRTC branch.
- Bitmovin Video Developer Report 2025. https://bitmovin.com/video-developer-report-2025 — industry survey of protocol adoption; cited for the 80/10/7/3 share split in 2026.
- Conviva 2026 State of Streaming. https://www.conviva.com/state-of-streaming — cross-checks the Bitmovin share split with viewer-side telemetry.
- Cloudflare blog, "Media over QUIC: an early relay deployment in 330+ cities" (2025-11). https://blog.cloudflare.com/media-over-quic-relay-deployment — first-party engineering blog from a moq-transport co-editor team; cited for the production-share number for MoQ.
- DASH-IF Implementation Guidelines: Low-Latency Live Streaming (v1.2, 2024). https://dashif-documents.azurewebsites.net/Guidelines-LowLatency/master/Guidelines-LowLatency.html — the implementation profile for LL-DASH chunked-CMAF.
Note on hierarchy: in any disagreement between sources above, this article followed the standards documents (RFC 8216, Apple HLS Authoring Spec, ISO/IEC 23009-1, RFC 9000, RFC 9725, draft-ietf-moq-transport-17) over vendor and analyst sources. The HTTP/2-push detail for LL-HLS, in particular, follows the Apple spec's 2023-09 revision and contradicts older blog posts.


