Chunked Transfer Encoding (CTE)

Chunked Transfer Encoding is an HTTP/1.1 mechanism that lets a server stream a response body as a sequence of length-prefixed chunks, terminated by a zero-length chunk. Each chunk begins with its size in hexadecimal, so the sender never has to declare a Content-Length up front. Originally designed so dynamic web pages could start sending bytes before their final size was known, CTE became the technical foundation of low-latency HTTP streaming. The packager writes each CMAF chunk to the response body as soon as it is produced; the player parses chunks as they arrive instead of waiting for the full segment to land.
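The framing described above can be illustrated with a minimal Python sketch. The function names encode_chunked and decode_chunked are hypothetical, chosen for this example only; the decoder assumes the whole body is already in memory and ignores trailers, which a real HTTP stack would not.

```python
from typing import Iterable, List


def encode_chunked(chunks: Iterable[bytes]) -> bytes:
    """Frame each non-empty payload as an HTTP/1.1 chunk, then add the terminator."""
    out = bytearray()
    for payload in chunks:
        if not payload:
            continue  # a zero-length chunk would end the body early
        out += f"{len(payload):x}".encode("ascii") + b"\r\n"  # size in hex
        out += payload + b"\r\n"
    out += b"0\r\n\r\n"  # last-chunk followed by an empty trailer section
    return bytes(out)


def decode_chunked(body: bytes) -> List[bytes]:
    """Split a complete chunked body back into its payloads."""
    payloads = []
    pos = 0
    while True:
        line_end = body.index(b"\r\n", pos)
        # Drop any chunk extension after ';' before parsing the hex size.
        size = int(body[pos:line_end].split(b";", 1)[0], 16)
        pos = line_end + 2
        if size == 0:
            return payloads  # trailers, if any, are ignored in this sketch
        payloads.append(body[pos:pos + size])
        if body[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("chunk data not followed by CRLF")
        pos += size + 2
```

Because every chunk carries its own length, the sender can flush each one as soon as it exists, which is the property low-latency streaming relies on.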

In LL-HLS and LL-DASH the pattern is as follows. The player requests a segment that is still being produced. The origin holds the HTTP response open, the packager appends CMAF chunks to it as the encoder emits frames, the CDN forwards the bytes as they arrive (as chunked transfer encoding over HTTP/1.1; HTTP/2 has no chunked encoding and carries the same open-ended body in DATA frames), and the player appends each chunk to an MSE SourceBuffer the moment it arrives. The latency contributed by segment delivery thus tracks the chunk duration (200–400 ms) rather than the full segment duration (2–6 s).
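On the player side, parsing chunks as they arrive mostly means finding box boundaries in a byte stream that is delivered in arbitrary pieces. The sketch below is one way to do that in Python, assuming a CMAF chunk is carried as ISO BMFF boxes (typically a moof followed by an mdat); the class name BoxSplitter is hypothetical, and boxes with size 0 (extending to end of stream) are deliberately not handled.

```python
import struct
from typing import List, Tuple


class BoxSplitter:
    """Accumulates bytes and returns complete top-level ISO BMFF boxes."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, data: bytes) -> List[Tuple[str, bytes]]:
        """Add newly arrived bytes; return (type, whole_box) for each completed box."""
        self._buf += data
        boxes = []
        while len(self._buf) >= 8:
            # Box header: 32-bit big-endian size, then a 4-character type.
            size, box_type = struct.unpack_from(">I4s", self._buf, 0)
            header = 8
            if size == 1:
                # A 64-bit "largesize" follows the 8-byte header.
                if len(self._buf) < 16:
                    break
                (size,) = struct.unpack_from(">Q", self._buf, 8)
                header = 16
            elif size == 0:
                raise ValueError("open-ended box is not handled in this sketch")
            if size < header:
                raise ValueError("box size smaller than its header")
            if len(self._buf) < size:
                break  # wait for more bytes before emitting this box
            boxes.append((box_type.decode("latin-1"), bytes(self._buf[:size])))
            del self._buf[:size]
        return boxes
```

A player loop would call feed() on every network read and hand the bytes to its decoder pipeline once an mdat completes, rather than waiting for the response to end.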

CTE has subtle interactions with CDNs. Caches must be willing to start storing a partial response before knowing its size, and must serve subsequent requests for the same URL out of that partial cache entry while it is still filling. Not every CDN supports this. (The LL-HLS feature Apple calls "blocking playlist reload" is a related but separate requirement: it applies to playlist requests, which the server holds open until the requested segment or part is listed.) Configuration mistakes here are a common cause of LL-HLS deployments running at HLS-like latencies.
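The caching requirement can be pictured as a cache entry that readers may follow while it is still being written. The following is a minimal in-memory Python sketch under that assumption; PartialEntry and its methods are hypothetical names, and a real CDN would add eviction, error handling, and disk storage.

```python
import threading
from typing import Iterator


class PartialEntry:
    """In-memory cache entry that can be read while it is still being filled."""

    def __init__(self) -> None:
        self._data = bytearray()
        self._complete = False
        self._cond = threading.Condition()

    def append(self, chunk: bytes) -> None:
        """Called by the fill path each time bytes arrive from the origin."""
        with self._cond:
            self._data += chunk
            self._cond.notify_all()

    def finish(self) -> None:
        """Called once the origin has ended the response."""
        with self._cond:
            self._complete = True
            self._cond.notify_all()

    def stream(self) -> Iterator[bytes]:
        """Yield everything stored so far, then follow the entry until complete."""
        offset = 0
        while True:
            with self._cond:
                # Block only when this reader has caught up with the writer.
                while offset == len(self._data) and not self._complete:
                    self._cond.wait()
                piece = bytes(self._data[offset:])
                done = self._complete
            offset += len(piece)
            if piece:
                yield piece
            if done:
                return
```

Each client request for the same URL gets its own stream() iterator, so a late joiner first receives the bytes already cached and then continues at live speed, without a second request going to the origin.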

Related terms

HTTP/2

LL-HLS (Low-Latency HLS)

LL-DASH (Low-Latency DASH)

CMAF chunk