Delivery Observability: Who Is Buffering and Where

Why this matters

The most expensive delivery failures are almost never total outages — those are obvious and everyone scrambles. The expensive ones are brownouts: one delivery network degrading in one region for one internet provider while your overall numbers look perfectly healthy, quietly costing you the viewers in that slice. If you cannot see which viewers are suffering and where the fault is, you find out from a wave of cancellations and angry posts hours later, when the damage is done and the evidence is gone. This article is for the founder, product manager, or streaming engineer who has to keep a stream healthy across many networks, regions, and devices, and wants to know exactly what to measure, which standards make the measurement possible, and how a dashboard catches a brownout before the audience does. By the end you will be able to explain why server logs and player experience must be stitched together, name the two standards that stitch them, and describe the one dashboard move — slicing by dimension — that turns a lying average into an early warning.

A one-minute refresher: the path a stream travels

This article builds on how a CDN delivers video, multi-CDN architecture and orchestration, and origin and origin shielding; here is the one picture you need in front of you.

A streaming platform chops video into segments — short files of a few seconds each — and lists them in a manifest, a small playlist the player reads to know what to fetch next. Those files are served by a content delivery network (the worldwide fleet of caching servers, or CDN, that keeps copies of your video near viewers). Most requests are answered by a nearby edge server — the cache close to the viewer, the corner store that saves a drive to the warehouse — and only the misses travel back to your origin, the authoritative source. The single number that says whether this is cheap is the offload ratio: the share of requests the edge answers from its own cache without bothering the origin (covered in CDN cost engineering).

So a viewer's bytes travel a chain: player → edge → (sometimes an origin shield) → origin, and that chain runs across many CDNs, many regions, and many internet providers at once. Delivery observability is the discipline of seeing health along that whole chain in real time — and the central question it answers is the title of this article: when a viewer buffers, where in the chain did it happen, and how many others are in the same boat?

The core problem: two halves of the truth that don't talk

Picture the same five seconds of a stream from two vantage points. From the server's side, the CDN writes a log line for every request: the URL, the HTTP status code, whether it was a cache hit or miss, the time to first byte, the bytes sent. From the player's side, the viewer's app knows something the server cannot see: how full its buffer is, whether the picture froze, how long the video took to start. Both are true. Neither is complete.

Here is the trap that makes naive delivery monitoring fail. A CDN log line that reads 200 OK means "I delivered the bytes successfully." It does not mean the viewer was happy. If those bytes arrived slowly — fast enough to be a success, too slow to refill the buffer before it ran dry — the viewer was staring at a spinner while the server logged a clean success. Watch only the server logs and your dashboard is green while your audience is rebuffering. Watch only the player and you know someone is suffering but not which network or region to blame. The art of delivery observability is stitching the two halves together and then slicing the result by where the viewer actually is.

[Figure: two columns of the same moment. On the left, a CDN server log shows a 200 OK success; on the right, a player shows a frozen, rebuffering screen. The gap between them is labeled the correlation problem.]

Figure 1. The two halves of the truth. The CDN log says the request succeeded; the player says the viewer is buffering. Both are correct. Delivery observability is the practice of connecting the server's record of what happened to the player's record of what was felt, and neither side knows about the other by default.

There is a boundary worth naming up front, because it keeps this article honest. The formal quality-of-experience metrics — exactly how startup time and rebuffering ratio are defined and measured inside the player — belong to the streaming layer; we link out to video QoE metrics and to player QoE instrumentation rather than re-derive them. This article owns the delivery-side view: the CDN logs, the edge and origin error rates, the per-region picture, and the dashboard that catches a brownout. The operator's wider analytics view — turning all of this into business decisions — is the job of the OTT analytics map in Block 9.

What the player knows: Common Media Client Data (CMCD)

The first standard that bridges the gap lets the player tell the CDN what it is experiencing, one request at a time. It is called Common Media Client Data (CMCD), published by the Consumer Technology Association as CTA-5004 (first edition September 2020). The idea is small and powerful: every time the player asks the CDN for a segment, it attaches a little structured note to the request — either as a custom HTTP header or a query argument — describing its own state.

The standard's own introduction states the thesis of this entire article better than any paraphrase: "Session identification allows thousands of individual server log lines to be interpreted as a single user session, leading to a clearer picture of end-user quality of service… Buffer starvation flags allow performance problems across a multi-CDN delivery surface to be identified in real-time" (CTA-5004, §1). Read that twice. The standard was designed to answer "who is buffering and where" from the server logs.

Four of its fields do almost all the work for observability. The session ID (sid) is a unique identifier — a GUID — that the player puts on every request in a playback session, including manifests, init files, captions, and even DRM key requests. With sid in the log, the thousands of scattered request lines from one viewer collapse into one readable session you can replay end to end. The buffer starvation flag (bs) is the headline signal: the player includes it when its buffer ran dry since the previous request, meaning the viewer was rebuffering and playback stalled. That single flag, riding along in the request and copied into the CDN log, is the literal answer to "who is buffering." The buffer length (bl) reports how many milliseconds of video are queued up — a buffer steadily draining toward zero is a viewer about to stall, the leading indicator before bs fires. And measured throughput (mtp) is the player's own estimate, in kilobits per second, of how fast the bytes are actually arriving — the number it uses to decide which quality rung to request next.

The fields are split across four headers — CMCD-Request, CMCD-Object, CMCD-Status, and CMCD-Session — grouped by how often each value changes, which helps HTTP header compression. The server side is deliberately easy: CTA-5004 says a server that receives sid should propagate it into its access logs (§4), so making CDN logs session-aware is often a configuration change, not a rebuild.
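
To make the four fields concrete, here is a minimal Python sketch that pulls them out of a request URL carrying CMCD as a query argument. The URL, the host `cdn.example.com`, and the function name `parse_cmcd` are illustrative assumptions, not part of CTA-5004, and the parser is deliberately simplified: it assumes no value contains a comma, so a production system should use a full structured-field parser.

```python
from urllib.parse import parse_qs, urlsplit


def parse_cmcd(url: str) -> dict:
    """Extract CMCD keys from the CMCD query argument of a request URL."""
    # parse_qs percent-decodes the value, leaving e.g. 'bl=1200,bs,sid="..."'.
    raw = parse_qs(urlsplit(url).query).get("CMCD", [""])[0]
    fields = {}
    for pair in raw.split(","):  # simplification: assumes no commas inside values
        if not pair:
            continue
        key, has_value, value = pair.partition("=")
        if not has_value:
            fields[key] = True  # a bare key such as bs means boolean true
        elif value.startswith('"') and value.endswith('"'):
            fields[key] = value[1:-1]  # quoted string, such as sid
        elif value.lstrip("-").isdigit():
            fields[key] = int(value)  # integers, such as bl and mtp
        else:
            fields[key] = value  # tokens are kept as plain text
    return fields


# Hypothetical request for one media segment (example.com is a placeholder).
EXAMPLE_URL = (
    "https://cdn.example.com/live/seg_42.m4s"
    "?CMCD=bl%3D1200%2Cbs%2Cmtp%3D3400"
    "%2Csid%3D%226e2fb550-c457-11e9-bb97-0800200c9a66%22"
)

fields = parse_cmcd(EXAMPLE_URL)
viewer_is_stalling = fields.get("bs", False)
print(fields["sid"], viewer_is_stalling, fields["bl"], fields["mtp"])
```

Grouping parsed log lines by `sid` is what turns scattered requests into one session, and filtering on `bs` is what turns that session list into a list of stalled viewers.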

[Figure: a single segment request travels from a player to a CDN edge, carrying a CMCD note with session ID, buffer-starvation flag, buffer length, and measured throughput, which is then written into the CDN access log line.]

Figure 2. CMCD: the player annotates every request. The player stamps each request with a session ID, a buffer-starvation flag, its buffer length, and its measured throughput (CTA-5004). The CDN copies these into its access log, so an otherwise anonymous "200 OK" becomes a line that knows whose session it belongs to and whether that viewer was stalled.

One current, dated detail to flag for any team adopting this now. A second edition, CMCD version 2 (CTA-5004-A), was published in February 2026. It adds new keys, event-mode reporting — the player can report a discrete event such as a stall or an error as it happens, instead of waiting to piggyback on the next segment request — and tighter structured-field encoding. Event mode matters for observability because it shortens the delay between "the viewer stalled" and "your dashboard knows." Player and CDN support for v2 is still rolling out in 2026, so confirm both ends speak the same version before you rely on the new keys.

What the server knows: Common Media Server Data (CMSD)

CMCD is the player talking to the network. The companion standard runs the other direction: the network talking back. It is Common Media Server Data (CMSD), published as CTA-5006 (November 2022), and it lets every server in the chain — origin, mid-tier shield, and edge — stamp data onto each response.

Two of its fields are gold for finding where a problem lives. The intermediary identifier (n) lets each server in the chain name itself, so the response arrives carrying a breadcrumb trail of exactly which CDN and which edge handled it; the standard lists "identifying intermediaries in the media distribution chain, for the purposes of investigating delivery and resolving availability issues" as its very first use case. The duress flag (du) is a server raising its hand: it is included when the server is under stress — CPU, memory, disk, or network — and its explicit intent, in the standard's words, is that "the client will use this signal to move away to an alternate server if possible." A rising count of du flags from one CDN's edges in one region is a brownout announcing itself.

Alongside those, CMSD carries the network's own performance read: estimated throughput (etp) and round-trip time (rtt) per hop, plus a maximum suggested bitrate (mb) the server can use to ask players to ease off during congestion. Because each intermediary appends its own entry, a single response can show you the throughput and round-trip time at every hop from origin to edge — the difference between "the viewer's last mile is slow" and "our origin is slow," which are opposite fixes.
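
A rough Python sketch of reading that per-hop trail follows. It assumes the dynamic CMSD header arrives as a comma-separated list with one entry per intermediary, each entry a quoted identifier followed by semicolon-separated keys; that layout, the server names, and the numbers are assumptions for illustration, so check the exact syntax against CTA-5006 before relying on it.

```python
def parse_cmsd_dynamic(header: str) -> list:
    """Split a per-hop CMSD header value into one dict per intermediary."""
    hops = []
    for entry in header.split(","):  # simplification: no commas inside values
        parts = [p.strip() for p in entry.split(";") if p.strip()]
        if not parts:
            continue
        hop = {"id": parts[0].strip('"')}
        for param in parts[1:]:
            key, has_value, value = param.partition("=")
            if not has_value:
                hop[key] = True  # a bare key such as du means the flag is set
            elif value.isdigit():
                hop[key] = int(value)  # numeric keys such as etp and rtt
            else:
                hop[key] = value.strip('"')
        hops.append(hop)
    return hops


# Hypothetical response header: entries are appended hop by hop, so the
# first entry is nearest the origin and the last is the edge.
EXAMPLE_HEADER = '"shield-1";etp=900;rtt=12, "edge-fra-7";etp=48;rtt=140;du'

hops = parse_cmsd_dynamic(EXAMPLE_HEADER)
under_duress = [h["id"] for h in hops if h.get("du")]
slowest = max(hops, key=lambda h: h.get("rtt", 0))
print(under_duress, slowest["id"])
```

In this made-up response the edge is both under duress and the slowest hop, which points at the edge or the last mile rather than the origin.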

The HTTP standards underneath add two more self-describing signals you get without any media-specific work. Cache-Status (IETF RFC 9211) is a standard response header in which each cache reports whether it was a hit or a miss and, on a miss, why it had to go forward — so your offload ratio is readable straight from the headers, and a sudden rise in misses is an early brownout tell. Proxy-Status (IETF RFC 9209) lets an intermediary state how it handled a response and, on failure, pinpoint the stage and the cause. Together with the HTTP status codes themselves (IETF RFC 9110), these mean the delivery chain can describe its own health in standard, vendor-neutral terms.
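
As a sketch of how the offload ratio can be read from these headers, the Python below parses a simplified Cache-Status value and computes the share of responses that the cache nearest the viewer served as a hit. The cache names are hypothetical, and the parser handles only the `hit` and `fwd` parameters; it is not a complete structured-field implementation.

```python
def parse_cache_status(header: str) -> list:
    """Parse a Cache-Status value into one dict per cache in the chain."""
    caches = []
    for member in header.split(","):  # simplification: no commas inside names
        parts = [p.strip() for p in member.split(";") if p.strip()]
        if not parts:
            continue
        cache = {"cache": parts[0].strip('"'), "hit": False, "fwd": None}
        for param in parts[1:]:
            key, _, value = param.partition("=")
            if key == "hit":
                cache["hit"] = True
            elif key == "fwd":
                cache["fwd"] = value  # why the request went forward
        caches.append(cache)
    return caches


def edge_hit_ratio(headers) -> float:
    """Share of responses where the cache closest to the viewer reported a hit."""
    # RFC 9211 lists caches from closest-to-origin to closest-to-user,
    # so the last member is the edge.
    edges = [chain[-1] for chain in map(parse_cache_status, headers) if chain]
    return sum(c["hit"] for c in edges) / len(edges) if edges else 0.0


# Hypothetical header values from four responses.
SAMPLE_HEADERS = [
    "OriginShield; hit; ttl=1100, EdgeFRA; hit; ttl=545",
    "OriginShield; hit, EdgeFRA; fwd=uri-miss",
    "EdgeFRA; hit",
    "EdgeFRA; fwd=stale",
]

print(edge_hit_ratio(SAMPLE_HEADERS))
```

Computed per region and per CDN instead of globally, the same ratio becomes the early brownout signal described above.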

Reading the chain: the standard error vocabulary

Before the dashboard, learn to read the raw signals, because they are standardized and they are blunt about what went wrong. HTTP status codes (IETF RFC 9110, §15) sort every response into families, and a handful matter for delivery. A 404 on a segment means the file the manifest promised is not where it should be — a packaging or origin-sync fault. A 403 usually means a signed or tokenized URL expired or was rejected — an access-control problem, not a network one (see edge caching, cache keys, and tokenized URLs). A 503 is an edge saying it is overloaded or unavailable; 502 and 504 are gateway and timeout errors that point upstream, toward the origin or the link to it. Watching the ratio of each family per region and per CDN — not the raw count — is how you tell a local hiccup from a spreading failure.
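
The ratio-per-slice idea can be sketched in a few lines of Python. The record layout (region, CDN, status), the sample data, and the names `FAULT_HINT` and `error_ratios` are assumptions made for this example; the fault hints simply restate the readings given in the paragraph above.

```python
from collections import Counter

# Where each delivery error usually points, as described in the text.
FAULT_HINT = {
    403: "token or access control",
    404: "packaging or origin sync",
    502: "upstream: origin or the link to it",
    503: "edge overloaded or unavailable",
    504: "upstream: origin or the link to it",
}


def error_ratios(records) -> dict:
    """Map (region, cdn, status) to that status's share of the slice's requests."""
    totals, errors = Counter(), Counter()
    for region, cdn, status in records:
        totals[(region, cdn)] += 1
        if status in FAULT_HINT:
            errors[(region, cdn, status)] += 1
    return {key: n / totals[key[:2]] for key, n in errors.items()}


# Synthetic log records: (region, cdn, HTTP status).
records = (
    [("eu-west", "cdn-a", 200)] * 95
    + [("eu-west", "cdn-a", 503)] * 5
    + [("us-east", "cdn-a", 200)] * 99
    + [("us-east", "cdn-a", 404)] * 1
)

for (region, cdn, status), ratio in sorted(error_ratios(records).items()):
    print(region, cdn, status, f"{ratio:.1%}", FAULT_HINT[status])
```

Dividing by the slice's own request count, not the platform total, is what keeps a small region's 5% error rate from vanishing next to a large region's traffic.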

The cache signal is just as important and easy to miss. Every response can carry whether it was served from cache, and the share that were hits is your offload ratio in real time. A live brownout often shows here first: misses climb, more traffic falls through to the origin, latency rises, and only then do players start to stall. If you are watching the cache-hit ratio per region, you see the cause before you see the symptom.

| Signal | Where it comes from | What a healthy value looks like | What a brownout looks like |
| --- | --- | --- | --- |
| `bs` buffer-starvation flag | Player, via CMCD (CTA-5004) | Rare, scattered across sessions | Clusters in one region / CDN / ISP slice |
| `du` duress flag | Server, via CMSD (CTA-5006) | Absent | Appears and rises from one CDN's edges |
| HTTP 5xx ratio | CDN logs (RFC 9110) | Near zero | Rises in one slice while global stays low |
| Cache-hit / offload ratio | Cache-Status header (RFC 9211) | High and stable (often >90%) | Drops; misses fall through to origin |
| Edge time-to-first-byte | CDN logs / RUM | Low and flat | Climbs in the affected slice first |
| `rtt` round-trip time | Server, via CMSD per hop | Stable per region | Rises on the last hop (last-mile) or upstream |

Table 1. The delivery-observability signal set: what each signal is, where it comes from, and how its shape changes when a network degrades. No single row is sufficient — a brownout is the pattern across several, all lighting up in the same slice.

The dashboard: four golden signals, sliced by dimension

You could drown in the data above. The discipline that keeps it usable comes from site-reliability practice: the four golden signals, codified in Google's Site Reliability Engineering book. If you can watch only four things about a user-facing service, watch errors (the share of requests that fail), latency (how long a request takes — here, time to first byte and startup time), traffic (how much is flowing — requests per second and throughput), and saturation (how full the system is — cache-hit ratio, origin load). Any meaningful delivery incident shows up in at least one of these before it shows up anywhere else.

But the golden signals on their own still hide brownouts, because a brownout is a local failure drowned in a global average. The move that makes delivery observability work is to compute every signal not once but per dimension: per region, per CDN, per internet provider (identified by its network number, the ASN), and per device class. A rebuffering rate of 0.6% across your whole audience is excellent — and it can hide an 8% rebuffering rate for one ISP in one region on one CDN, which is a five-alarm fire for those viewers. The average is the liar; the slice is the truth.
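
A minimal Python sketch of the slicing move, using synthetic sessions chosen to reproduce the 0.6% and 8% figures above. The field names, the region and CDN labels, the documentation-range ASNs, and the "five times the global rate" threshold are all assumptions made for illustration.

```python
from collections import defaultdict


def rebuffer_rates(sessions, dims):
    """Share of sessions that stalled, grouped by the given dimensions."""
    counts = defaultdict(lambda: [0, 0])  # slice key -> [stalled, total]
    for s in sessions:
        key = tuple(s[d] for d in dims)
        counts[key][0] += int(s["stalled"])
        counts[key][1] += 1
    return {key: stalled / total for key, (stalled, total) in counts.items()}


def make(n, stalled, region, cdn, asn):
    """Build n synthetic sessions, the first `stalled` of which rebuffered."""
    return [
        {"region": region, "cdn": cdn, "asn": asn, "stalled": i < stalled}
        for i in range(n)
    ]


# 975 healthy-slice sessions (4 stalled) plus 25 in a degraded slice (2 stalled).
sessions = make(975, 4, "eu-west", "cdn-a", "AS64500") + make(
    25, 2, "eu-west", "cdn-b", "AS64501"
)

global_rate = rebuffer_rates(sessions, dims=())[()]  # no dimensions: one bucket
by_slice = rebuffer_rates(sessions, dims=("region", "cdn", "asn"))
hot = {k: r for k, r in by_slice.items() if r > 5 * global_rate}

print(f"global {global_rate:.1%}")
for key, rate in hot.items():
    print(key, f"{rate:.1%}")
```

With empty `dims` the function returns the single global average; with all three dimensions it exposes the one slice running at 8%. A real alert would also require a minimum session count per slice, since a rate computed from a handful of sessions is mostly noise.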

[Figure: a delivery-observability dashboard laid out as a grid, with the four golden signals across the top and the slicing dimensions of region, CDN, ISP, and device down the side. One cell is lit up to show a localized brownout that the global average hides.]

Figure 3. The dashboard that catches brownouts. Watch the four golden signals (errors, latency, traffic, saturation), but compute each one per region, per CDN, per ISP, and per device. A single healthy global average (top-left) can hide a localized brownout (the lit cell); slicing by dimension is what surfaces it.

This is also where delivery observability hands off to action. When one slice lights up, the same per-region, per-CDN quality read feeds the real-user-measurement-based selection that multi-CDN orchestration uses to steer viewers off the failing network and onto a healthy one — the loop covered in multi-CDN architecture and orchestration. Observability is the eyes; multi-CDN switching and bitrate shedding are the hands.

A worked example: the blast radius of a brownout

Numbers make the stakes concrete, so size a brownout out loud. Take a platform with 500,000 concurrent viewers spread across CDNs and regions. One slice — a single internet provider in one region, served by one CDN — carries 3% of that audience:

Affected slice = total concurrency × slice share
               = 500,000 × 0.03
               = 15,000 viewers in the slice

On a normal night the buffer-starvation rate in that slice sits at 0.5%, the same as everywhere else. A peering link between that CDN and that ISP starts to congest, and over ten minutes the slice's bs rate climbs to 8%:

Rebuffering before = 15,000 × 0.005 =    75 viewers stalling
Rebuffering after  = 15,000 × 0.08  = 1,200 viewers stalling

Now look at what the global average does while that happens. The other 485,000 viewers are fine at 0.5%, so the platform-wide rebuffering rate moves from about 0.5% to:

Global after = (485,000 × 0.005 + 15,000 × 0.08) ÷ 500,000
             = (2,425 + 1,200) ÷ 500,000
             = 3,625 ÷ 500,000
             ≈ 0.73%

A jump from 0.5% to 0.73% platform-wide is the kind of wobble a global dashboard shrugs off as noise — yet underneath it, 1,200 real viewers are stalling and on the edge of quitting, all in one slice. Sliced by region × CDN × ISP, that same event is an 8% spike that trips an alert in minutes. The arithmetic is the whole argument for dimensional monitoring: the signal you need is invisible in the aggregate and screaming in the slice.
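
The same arithmetic as a small Python function, so the blast radius can be re-run with your own concurrency, slice share, and rates. The function name and parameter names are illustrative; the inputs are the numbers from this example.

```python
def blast_radius(concurrent, slice_share, base_rate, slice_rate):
    """Size a brownout: viewers in the slice, stalls before and after, global rate."""
    slice_viewers = concurrent * slice_share
    stalled_before = slice_viewers * base_rate
    stalled_after = slice_viewers * slice_rate
    healthy_stalls = (concurrent - slice_viewers) * base_rate
    global_after = (healthy_stalls + stalled_after) / concurrent
    return slice_viewers, stalled_before, stalled_after, global_after


viewers, before, after, global_after = blast_radius(
    concurrent=500_000, slice_share=0.03, base_rate=0.005, slice_rate=0.08
)

print(f"slice viewers:  {viewers:,.0f}")
print(f"stalled before: {before:,.0f}")
print(f"stalled after:  {after:,.0f}")
print(f"global rate:    {global_after:.3%}")
print(f"slice rate:     {0.08:.1%}")
```

The stall count in the slice grows sixteenfold while the global rate moves by less than a quarter of a percentage point, which is the gap a per-slice alert is meant to close.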

There is a sampling angle worth one line of math, too. Logging every field of every request from 500,000 viewers is a torrent, so teams sample. But sample too sparsely and a small slice goes dark: if your worst-affected ISP-region slice is 1% of traffic and you sample 1% of requests, you are down to a ten-thousandth of the stream, and a brownout there may not cross a detection threshold for many minutes. The fix is to sample the common path lightly and keep error and stall events at or near 100% — a viewer who stalled is exactly the data point you cannot afford to throw away.
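
One way to express that policy is a per-record decision that always keeps stalls and errors and samples everything else. The Python below is a sketch under assumptions: the record fields (`bs`, `status`), the 1% healthy-path rate, and treating any status of 400 or above as an error are choices made for this example, not a standard.

```python
import random


def sample_decision(record, healthy_rate=0.01, rng=random.random):
    """Return (keep, weight) for one log record.

    Stall and error records are always kept with weight 1. Healthy records
    are kept with probability healthy_rate and carry weight 1 / healthy_rate,
    so totals and ratios can still be estimated from the kept records.
    """
    is_stall = bool(record.get("bs"))
    is_error = record.get("status", 200) >= 400
    if is_stall or is_error:
        return True, 1.0
    return rng() < healthy_rate, 1.0 / healthy_rate


# A stalled viewer and a 503 are always kept; a clean 200 is usually dropped.
print(sample_decision({"status": 200, "bs": True}))
print(sample_decision({"status": 503}))
print(sample_decision({"status": 200}))
```

Carrying the weight alongside each kept record matters: without it, a dashboard built on the sampled stream would overstate the error and stall rates, because those records were never thinned.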

Real-time, not tomorrow morning

One more thing separates observability that catches a brownout from observability that merely explains one after the fact: latency of the data itself. Traditional log delivery batches access logs and hands them over minutes or hours later — fine for billing, useless for a live event where the brownout is over before the log file lands. Modern CDNs answer this with real-time log streaming: AWS CloudFront real-time logs feed a stream within seconds, Fastly streams logs over syslog or HTTPS as they happen, and Akamai DataStream 2 delivers raw logs every 30 to 60 seconds while monitoring delivery health, latency, offload, and errors in near real time. The point is not the vendor; it is the requirement. For a live brownout, your telemetry has to arrive in seconds, not by morning — pair real-time CDN logs with player beacons so both halves of the truth land on the dashboard while you can still act, the same urgency that drives the live-event readiness work in the previous article.

Common mistakes that hide the problem

Most delivery-observability failures are a handful of predictable errors, each one a way of looking at the wrong thing.

The first and biggest is watching averages instead of slices — a green global dashboard while one ISP-region-CDN slice burns, exactly the arithmetic above. The second is watching the origin but not the edge: the origin is calm because the shield is absorbing everything, yet the viewers live at the edge, and the edge is where their stalls begin. The third is flying without CMCD, so the CDN logs are anonymous 200 OKs that can never be tied to a session or a stalled viewer — you have traffic data but no quality data. The fourth is trusting synthetic monitoring alone: a probe in a data center that fetches a segment every minute confirms the file exists, but it is not a real viewer on a real phone on a congested home network, and it will report all-clear straight through a brownout that only real-user measurement can see. The fifth is alerting on machine symptoms, not viewer impact — paging someone because origin CPU hit 70% (harmless) while not paging when rebuffering tripled in a region (an emergency). The sixth is batch-only logs, telemetry that arrives too late to act on during the very live events that need it most. And the quiet seventh is the vanity dashboard: a beautiful wall of graphs nobody watches, with no alert wired to the one number — rebuffering per slice — that actually predicts churn.

Where Fora Soft fits in

Delivery observability is a scale problem before it is a dashboard problem: at a few thousand viewers you can read the logs; at a few million across many CDNs, regions, and networks, the only way to find the one slice that is failing is to instrument the stream so it describes its own health. Fora Soft has built video streaming, OTT and Internet-TV, live-event, WebRTC, and video-surveillance software since 2005, across 625+ shipped projects for 400+ clients, and that experience runs straight through this layer — wiring CMCD into players so sessions and stalls reach the CDN logs, consuming CMSD and the standard Cache-Status and Proxy-Status signals so the chain self-describes, building the per-region, per-CDN, per-ISP dashboards that surface a brownout the global average hides, and connecting that signal to the multi-CDN switching that acts on it. When a platform has to keep millions of streams healthy across a delivery surface no one person can watch, that instrumentation-first engineering is the capability we bring.

Call to action

Talk to a streaming engineer: book a 30-minute scoping call to talk through your per-region QoE plan.
See our case studies — 250+ shipped projects across video streaming, WebRTC, OTT, telemedicine, e-learning, surveillance, and AR/VR.
Download the Delivery Observability Checklist & Brownout Runbook — A one-page worksheet to instrument a streaming platform so it describes its own health: the client signals to capture via CMCD (session ID, buffer-starvation flag, buffer length, measured throughput), the server signals via CMSD….

References

CTA-5004 — Web Application Video Ecosystem: Common Media Client Data (CMCD) — Consumer Technology Association (CTA-WAVE), September 2020. §1 states the observability thesis directly: session identification lets "thousands of individual server log lines [be] interpreted as a single user session," and "buffer starvation flags allow performance problems across a multi-CDN delivery surface to be identified in real-time." §3.3 defines the reserved keys used here — sid (session ID, GUID), bs (buffer starvation, boolean), bl (buffer length, ms), mtp (measured throughput, kbps); §4 requires servers to propagate sid to access logs. Tier 1 (standard). https://cdn.cta.tech/cta/media/media/resources/standards/pdfs/cta-5004-final.pdf (accessed 2026-06-16)
CTA-5004-A — Common Media Client Data, Version 2 (CMCD v2) — Consumer Technology Association (CTA-WAVE), February 2026. The second edition: adds new keys, event-mode reporting (report a stall or error as a discrete event rather than on the next request), and structured-field encoding. The dated, vendor-dependent detail to re-verify — player and CDN v2 support is still rolling out in 2026. Tier 1 (standard). https://shop.cta.tech/products/cta-5004-a (accessed 2026-06-16) — confirm both player and CDN speak v2 before relying on new keys.
CTA-5006 — Common Media Server Data (CMSD) — Consumer Technology Association (CTA-WAVE), November 2022. The response-direction companion to CMCD. Defines n (intermediary identifier — use case 1 is "investigating delivery and resolving availability issues"), du (duress — "the client will use this signal to move away to an alternate server"), etp (estimated throughput), rtt (round-trip time), and mb (max suggested bitrate), split across CMSD-Static (persistent) and CMSD-Dynamic (per-hop, each intermediary appends). Tier 1 (standard). https://cdn.cta.tech/cta/media/media/resources/standards/pdfs/cta-5006-final.pdf (accessed 2026-06-16)
RFC 9110 — HTTP Semantics — IETF, June 2022. §15 defines the status-code families used as the delivery error vocabulary here (the 4xx client / 5xx server split; 403, 404, 502, 503, 504). The controlling source for what each delivery error means. Tier 1 (standard). https://www.rfc-editor.org/rfc/rfc9110 (accessed 2026-06-16)
RFC 9211 — The Cache-Status HTTP Response Header Field — IETF, June 2022. The standard header by which each cache reports hit / miss / forward-reason, preserving prior values so "the entire chain of caches handling the request" is debuggable — i.e., the offload ratio and an early brownout signal read straight from the headers. Tier 1 (standard). https://www.rfc-editor.org/rfc/rfc9211 (accessed 2026-06-16)
RFC 9209 — The Proxy-Status HTTP Response Header Field — IETF, June 2022. Lets an intermediary convey how it handled a response and, on failure, pinpoint the stage and cause — the standardized "which hop failed and why" signal for the delivery chain. Tier 1 (standard). https://www.rfc-editor.org/rfc/rfc9209 (accessed 2026-06-16)
RFC 8216 — HTTP Live Streaming (HLS) — IETF (R. Pantos, Ed.), August 2017. The format under most delivery: the manifest/segment model that defines what each log line refers to (a manifest reload vs a media segment vs an init segment), and the object types CMCD/CMSD label. Tier 1 (format specification). https://www.rfc-editor.org/rfc/rfc8216 (accessed 2026-06-16)
Site Reliability Engineering — "Monitoring Distributed Systems" (the Four Golden Signals) — Beyer, Jones, Petoff, Murphy (eds.), Google / O'Reilly, 2016. The latency / traffic / errors / saturation framing applied here to a delivery dashboard. Tier 5 (industry/institutional). https://sre.google/sre-book/monitoring-distributed-systems/ (accessed 2026-06-16)
Amazon CloudFront Developer Guide — Real-time logs — Amazon Web Services, 2026. First-party documentation that CloudFront delivers real-time logs to a stream within seconds of a request — the real-time-vs-batch requirement for catching a live brownout. Tier 4 (first-party vendor engineering). https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/real-time-logs.html (accessed 2026-06-16) — vendor capability; re-verify on the 90-day re-baseline.
Akamai DataStream 2 — product documentation — Akamai Technologies, 2026. First-party reference for near-real-time CDN log streaming (raw logs every 30–60 s) that monitors delivery health, latency, offload, and errors — an example of the real-time telemetry this article requires, named with a date because vendor features change. Tier 4 (first-party vendor engineering). https://techdocs.akamai.com/datastream2/docs (accessed 2026-06-16) — vendor capability; re-verify on the 90-day re-baseline.

Source note (per §4.3.2): the client-to-CDN session and buffer-starvation signaling trace to CMCD (CTA-5004 §1, §3.3, §4) and its 2026 v2 (CTA-5004-A); the server-to-client throughput, round-trip-time, and duress signaling to CMSD (CTA-5006); the cache and error vocabulary to IETF RFC 9110 (status codes), RFC 9211 (Cache-Status), and RFC 9209 (Proxy-Status); the HLS object model to RFC 8216. The four-golden-signals dashboard framing is cited to Google's SRE book (orientation, tier 5), and the real-time-log requirement to first-party CloudFront and Akamai documentation (tier 4), dated because vendor features change. Where popular "just watch your CDN dashboard" advice conflicts with the per-slice reality, the article follows the standards-grounded approach — instrument with CMCD/CMSD and slice by dimension — and says why the average misleads.

Related glossary terms

Origin

Buffer

Rebuffering

Multi-CDN

Segment

Manifest

Bitrate

Cache-hit ratio