Latency Budget Calculator: Add Up the Seconds Before You Pick a Protocol

Why this matters

Almost every decision in a live or interactive streaming product flows downstream from one number: the latency budget. Give engineers 20 seconds and they ship classic HLS over any CDN at the lowest cost per viewer. Give them 3 seconds and they ship low-latency HLS or DASH with a tuned player and a CDN that supports chunked transfer. Give them 500 milliseconds and they ship WebRTC and accept the bill that comes with it. Product managers, founders, and operators meet the latency budget through three questions: what latency does my use case actually need, which protocol family does that latency force me into, and where in the chain is a vendor's number hiding the seconds it does not mention. This calculator answers all three in under a minute, using the same component ranges and protocol floors we documented, with standards citations, in our companion article on latency, glass-to-glass, end-to-end.

Use the calculator. The rest of this article explains every term it adds up and every protocol floor it compares against.

[Figure: end-to-end pipeline from camera lens to viewer display, with the seven latency contributors labelled in order and the player buffer drawn as the largest band.]

Figure 1. The seven contributors to glass-to-glass latency, in pipeline order. The packager and the player buffer are the two bands that swing the total; the other five sum to under a second in almost every configuration.

Glass-to-glass latency is an addition problem

Streaming latency is the elapsed time between the moment light from the scene enters the camera lens and the moment the resulting frame is rendered on the viewer's screen. The industry calls this glass-to-glass, because the journey starts at one piece of glass (the camera lens) and ends at another (the display). The Streaming Video Technology Alliance, Mux, Cloudflare, and the Apple engineers who wrote the HLS Authoring Specification all use the same definition.

There is no magic in the total. You start at the camera, finish at the display, and sum the time each stage holds the frame before passing it on. Seven stages are worth naming, and the calculator adds them up for you. The realistic 2026 production ranges below come from measurements published by Bitmovin, Mux, Wowza, Cloudflare, and AWS Elemental, cross-checked against the controlling standards documents.

#	Stage	Typical 2026 range	What it is
1	Capture & ingest buffer	10 – 200 ms	Sensor exposure, image-signal-processor pipeline, audio capture, USB or HDMI grabber
2	Encoder	50 – 400 ms	Compresses raw video; lookahead and B-frame depth dominate
3	Packager (segmenter)	50 ms – 6 s	Cuts the bitstream into segments or CMAF chunks; the biggest controllable variable
4	Contribution network	20 – 250 ms	Round-trip from encoder to origin plus protocol recovery window
5	CDN / SFU forwarding	20 – 200 ms	Origin-to-edge propagation, or SFU forwarding for WebRTC
6	Player / jitter buffer	0.1 – 30 s	The safety reserve ahead of the playhead; usually the largest term
7	Decoder + display	30 – 100 ms	Hardware decode, frame queue, panel refresh

The sum of "everything except the packager and the player buffer" is rarely more than one second. Almost every multi-second number you see on a streaming spec sheet comes from those two terms — and both are protocol choices, not technology limits. That is the lesson of this calculator in one sentence.
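The claim about the two dominant stages can be checked against the table itself. The sketch below copies the seven ranges from the table and sums the low and high ends, with and without the packager and player buffer. The dictionary and function names are illustrative assumptions, not the calculator's actual code.

```python
# Stage ranges in milliseconds, copied from the table above: (low, high).
STAGE_RANGES_MS = {
    "capture": (10, 200),
    "encoder": (50, 400),
    "packager": (50, 6_000),
    "contribution": (20, 250),
    "cdn_or_sfu": (20, 200),
    "player_buffer": (100, 30_000),
    "decode_display": (30, 100),
}

# The two stages the article identifies as dominant.
DOMINANT = {"packager", "player_buffer"}


def budget_range_ms(stages):
    """Sum the low ends and the high ends of the given stages."""
    low = sum(STAGE_RANGES_MS[s][0] for s in stages)
    high = sum(STAGE_RANGES_MS[s][1] for s in stages)
    return low, high


all_stages = list(STAGE_RANGES_MS)
other_five = [s for s in all_stages if s not in DOMINANT]

total_low, total_high = budget_range_ms(all_stages)
rest_low, rest_high = budget_range_ms(other_five)

print(f"All seven stages:  {total_low} to {total_high} ms")
print(f"Other five stages: {rest_low} to {rest_high} ms")
```

With every one of the other five stages at its worst-case end simultaneously, they total 1,150 ms, which is why their sum rarely exceeds a second in practice; the packager and player buffer alone can reach 36 seconds.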

Stage 1 — Capture and ingest buffer

A camera does not produce a digital frame the instant light hits the lens. The sensor exposes, the image-signal processor debayers and white-balances, and the device copies the result into memory the encoder can read. A modern broadcast camera adds 20 to 80 milliseconds; a webcam or phone camera adds 50 to 200; an action camera with on-sensor compression sits under 20. The calculator defaults this to 80 milliseconds and lets you override it.

Stage 2 — Encoder

The encoder turns raw video into a compressed bitstream, and two settings dominate its latency. Lookahead is how many frames the encoder examines before compressing the current one — higher lookahead means better quality at the same bitrate, but more delay. B-frame depth counts frames predicted from both past and future, so the encoder must wait for a future frame before emitting a B-frame.

The calculator computes the encoder term from those two settings plus a 30-millisecond fixed pipeline. Worked through step by step: at 30 frames per second, one frame is 1000 ÷ 30 = 33.3 milliseconds. A broadcast profile with 10 lookahead frames and 2 B-frames adds (10 + 2) × 33.3 = 400 milliseconds, plus the 30-millisecond pipeline, for 430 milliseconds total. A low-latency profile with zero lookahead and zero B-frames adds only the 30-millisecond pipeline. Changing the frame rate changes every frame-based term: at 60 frames per second a frame is only 16.7 milliseconds, halving the cost of the same lookahead.
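The encoder arithmetic above fits in one small function. This is a minimal sketch of the formula as the paragraph describes it; the function and parameter names are assumptions for illustration, and the 30-millisecond default is the fixed pipeline figure quoted above.

```python
def encoder_latency_ms(fps, lookahead_frames, b_frames, pipeline_ms=30.0):
    """Frames the encoder must wait for, times the frame time, plus a fixed pipeline."""
    frame_ms = 1000.0 / fps
    return (lookahead_frames + b_frames) * frame_ms + pipeline_ms


broadcast_30 = encoder_latency_ms(fps=30, lookahead_frames=10, b_frames=2)
broadcast_60 = encoder_latency_ms(fps=60, lookahead_frames=10, b_frames=2)
low_latency = encoder_latency_ms(fps=30, lookahead_frames=0, b_frames=0)

print(round(broadcast_30))  # 430
print(round(broadcast_60))  # 230
print(round(low_latency))   # 30
```

Doubling the frame rate halves only the frame-based part (400 ms becomes 200 ms); the fixed pipeline stays at 30 ms, so the total drops from 430 to 230 rather than to 215.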

Stage 3 — Packager

The packager cuts the encoder's bitstream into the units the delivery protocol expects: HLS or DASH segments of 2 to 10 seconds; CMAF chunks of 200 to 500 milliseconds; or, for WebRTC, individual Real-time Transport Protocol packets that are not packaged at all. This is the largest controllable contributor and the one engineers fight over.

A classic HLS configuration uses 6-second segments, and the packager cannot emit a segment until the encoder has produced six full seconds of video — so it adds up to 6 seconds before a byte leaves. Low-latency HLS and low-latency DASH emit partial segments or chunked CMAF of roughly 200 to 500 milliseconds, forwarding data as soon as a chunk is ready. RFC 8216 defined only full-segment HLS; the partial-segment extension lives in Apple's HLS Authoring Specification and in draft-pantos-hls-rfc8216bis-22 (May 2026) via EXT-X-PART-INF and PART-HOLD-BACK. ISO/IEC 23009-1:2022 and the DASH-IF Low-Latency Modes guideline define the analogous chunked-CMAF mechanism for DASH. WebRTC has no packager in this sense, which is the single largest reason it reaches sub-second latency where HLS cannot.

Stage 4 — Contribution network

The contribution leg is the hop from encoder to origin — what RTMP, SRT, RIST, WHIP, and WebTransport do when they push a stream into the platform. A healthy contribution leg adds the network round-trip plus the protocol's acknowledgement window, typically 50 to 150 milliseconds across the public internet, lower on a dedicated link. SRT and RIST add a tuneable error-recovery window of 50 to 250 milliseconds on top. See push vs pull, contribution vs distribution for the full picture.

Stage 5 — CDN or SFU forwarding

The content delivery network — the chain of cache servers that bring the stream close to each viewer — is rarely the bottleneck. A well-configured CDN with origin shielding and tiered caching adds 20 to 60 milliseconds for a warm edge and 100 to 200 for a cold-edge miss. The bigger CDN problem for low latency is not propagation time but chunked transfer support: a CDN that cannot forward a partial HTTP response as the origin emits it cannot deliver low-latency HLS or DASH at all. For WebRTC, this term is the SFU forwarding delay, usually 20 to 40 milliseconds. See CDN for the streaming engineer.

Stage 6 — Player or jitter buffer

This is the term that swallows everything else. The player buffer is the safety reserve the player keeps ahead of the playhead, so it can ride out the next network burp or cache miss. Long buffer means safe playback; short buffer means current playback — you cannot have both. Classic HLS recommends three segments of buffer: 18 seconds with 6-second segments. Apple's HOLD-BACK attribute defaults to three times the target duration; LL-HLS uses PART-HOLD-BACK, defined as at least twice and ideally three times the Part Target Duration, so 333-millisecond parts give a 0.7-to-1-second floor. WebRTC's equivalent is the jitter buffer — typically 50 to 200 milliseconds, two orders of magnitude smaller, and the second-largest reason WebRTC reaches sub-second latency.
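The hold-back rules in this section reduce to two multiplications. The sketch below reproduces the 18-second and 0.7-to-1-second figures quoted above; the function names are hypothetical, and the multipliers are the ones stated in the text rather than values read from any player.

```python
def classic_hold_back_s(target_duration_s, segments=3):
    """Classic HLS: hold back a whole number of segments (three by default)."""
    return segments * target_duration_s


def part_hold_back_range_s(part_target_s):
    """LL-HLS PART-HOLD-BACK: from the 2x minimum to the 3x recommendation."""
    return 2 * part_target_s, 3 * part_target_s


classic = classic_hold_back_s(6.0)
ll_min, ll_recommended = part_hold_back_range_s(0.333)

print(f"Classic HLS hold-back: {classic:.0f} s")
print(f"LL-HLS hold-back:      {ll_min:.2f} to {ll_recommended:.2f} s")
```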

Stage 7 — Decoder and display

The decoder pulls compressed bytes from the buffer, decompresses them into raw frames, and queues them for the display. Hardware decoders add 1 to 3 frames of pipeline; the display draws at its refresh rate — 16.7 milliseconds per refresh at 60 Hz, 8.3 at 120 Hz. A phone or laptop totals 30 to 60 milliseconds; a smart TV in default mode 60 to 100, which game mode reduces and movie mode does not.
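One rough way to model this stage is decoder pipeline depth in frames plus one display refresh interval. That model is an assumption made here for illustration (the article does not state the calculator's formula), and it treats a full refresh interval as the wait for the next panel refresh.

```python
def decode_display_ms(fps, decoder_frames, refresh_hz):
    """Decoder pipeline depth (in frames) plus one display refresh interval."""
    frame_ms = 1000.0 / fps          # duration of one video frame
    refresh_ms = 1000.0 / refresh_hz  # duration of one panel refresh
    return decoder_frames * frame_ms + refresh_ms


# One frame of decoder pipeline, 30 fps content, 60 Hz panel.
print(round(decode_display_ms(fps=30, decoder_frames=1, refresh_hz=60)))   # 50
# Same pipeline depth, 60 fps content, 120 Hz panel.
print(round(decode_display_ms(fps=60, decoder_frames=1, refresh_hz=120)))  # 25
```

Under these assumptions the stage scales with both the content frame rate and the panel refresh rate, which is consistent with phones and laptops landing at the low end of the range.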

A worked example the calculator reproduces

Pick a configuration and add the seven terms. Take a 1080p feed travelling from a stadium to a home viewer, first as classic HLS, then as WebRTC, at 30 frames per second.

Classic HLS, 6-second segments
  Capture & ISP             =     80 ms
  Encoder, 10 LA + 2 B      =    430 ms
  Packager, 6-s segment     =  6,000 ms   ← waits for one segment
  Contribution (SRT)        =    100 ms
  CDN warm edge             =     40 ms
  Player buffer, 3 × 6 s    = 18,000 ms   ← three-segment safety
  Decode + 60 Hz display    =     50 ms
                            ──────────
  Glass-to-glass total      ≈ 24.7 s

WebRTC, SFU, default jitter buffer
  Capture & ISP             =     80 ms
  Encoder, no lookahead     =     30 ms
  No packager (RTP)         =      0 ms
  Contribution (RTP/SRTP)   =     80 ms
  SFU forward               =     30 ms
  Jitter buffer             =    100 ms
  Decode + 60 Hz display    =     30 ms
                            ──────────
  Glass-to-glass total      ≈ 0.35 s

The roughly 70-fold gap between these two totals has almost nothing to do with the network and almost everything to do with two decisions: how the publisher packages the stream, and how much buffer the player holds. The encoder, contribution, CDN, and decode terms are within a few hundred milliseconds of each other in both stacks. Drop the calculator's preset to "Classic HLS" and then to "WebRTC" and watch the same two bands — packager and player buffer — expand and collapse.
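The two budgets above can be summed and compared in a few lines. The sketch below copies the per-stage values from the worked example; the dictionary keys are illustrative names, with "forwarding" standing for the CDN or SFU term and "buffer" for the player or jitter buffer.

```python
# Per-stage values in milliseconds, copied from the two worked examples above.
CLASSIC_HLS_MS = {
    "capture": 80,
    "encoder": 430,
    "packager": 6_000,
    "contribution": 100,
    "forwarding": 40,
    "buffer": 18_000,
    "decode_display": 50,
}
WEBRTC_MS = {
    "capture": 80,
    "encoder": 30,
    "packager": 0,
    "contribution": 80,
    "forwarding": 30,
    "buffer": 100,
    "decode_display": 30,
}

hls_total = sum(CLASSIC_HLS_MS.values())
webrtc_total = sum(WEBRTC_MS.values())
gap_ms = hls_total - webrtc_total

# How much of the gap comes from the packager and the buffer alone.
dominant_gap_ms = sum(
    CLASSIC_HLS_MS[stage] - WEBRTC_MS[stage] for stage in ("packager", "buffer")
)

print(f"Classic HLS: {hls_total / 1000:.2f} s")            # 24.70 s
print(f"WebRTC:      {webrtc_total / 1000:.2f} s")         # 0.35 s
print(f"Ratio:       {hls_total / webrtc_total:.1f}x")     # 70.6x
print(f"Packager + buffer share of gap: {dominant_gap_ms / gap_ms:.0%}")  # 98%
```

Of the 24,350 ms separating the two stacks, 23,900 ms comes from the packager and buffer terms; the other five stages together account for the remaining 450 ms.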

[Figure: stacked horizontal bars comparing classic HLS and WebRTC latency budgets with the seven contributors coloured, showing the packager and player buffer bands dominating the HLS total.]

Figure 2. The same camera and viewer, two protocol choices, a 70-fold difference. The packager and player buffer terms own almost the entire gap.

What "low latency" actually means

The industry uses "low latency" to mean three different tiers, and the calculator labels your total against them automatically.

Reduced latency is 5 to 10 seconds: shorter segments, smaller buffer, no encoder lookahead, no change to the protocol family. Most OTT live sport and live news ship here.

Low latency is 2 to 5 seconds: the chunked-CMAF, partial-segment regime (LL-HLS, LL-DASH, HESP), requiring a CDN that supports chunked transfer end to end. Most live betting and live shopping ship here.

Ultra-low or real-time latency is under one second: the WebRTC regime, plus Media over QUIC (still drafting as draft-ietf-moq-transport in 2026), where a viewer can talk back, click, vote, or trade on what they see, typically at 2 to 10 times the cost per viewer of LL-HLS.
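The three tiers can be expressed as a simple classifier. The sketch below is one possible labelling, with assumptions stated plainly: the text leaves 1 to 2 seconds unassigned, so this version folds that gap into "low latency"; it assigns exactly 5 seconds to "low latency"; and the "standard" label for anything over 10 seconds is a name chosen here, not one the article defines.

```python
def latency_class(glass_to_glass_s):
    """Label a glass-to-glass total against the three tiers described above."""
    if glass_to_glass_s < 1.0:
        return "ultra-low / real-time"
    if glass_to_glass_s <= 5.0:
        # Assumption: the undefined 1-2 s gap is treated as low latency.
        return "low latency"
    if glass_to_glass_s <= 10.0:
        return "reduced latency"
    # Assumption: anything slower gets a generic "standard" label.
    return "standard"


for total_s in (0.35, 3.0, 7.0, 24.7):
    print(f"{total_s:>5} s -> {latency_class(total_s)}")
```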

Common pitfall. Vendors quote the lowest latency tier their stack can hit, not the tier their typical customer ships. A platform's "Low Latency" mode may be 2 to 5 seconds on the receiver side, but if the contributing streamer's RTMP leg adds another 2 to 5 seconds on top, the real glass-to-glass for a chat watcher is 4 to 10 seconds, or about 7 at the midpoint of both ranges. When you read a number, ask which legs it includes: the calculator's contribution-network term is exactly the leg vendors most often omit.

The protocol-family fit test

The calculator's bottom table maps your computed total against the realistic 2026 production floor of each protocol family, taken from the protocol latency table in our companion article. A green row means your budget sits at or above that family's floor, so the family can plausibly deliver it.

Protocol family	Glass-to-glass floor	Required conditions
WebRTC (W3C CR + RFC 8825–8866)	0.2 s	SFU or P2P; tuned jitter buffer
Media over QUIC (`draft-ietf-moq-transport`)	0.2 s	MoQ relay and player; work in progress as of May 2026
HESP (`draft-theo-hesp`)	0.4 s	HESP-aware origin and player
LL-DASH (DASH-IF Low-Latency Modes)	1.5 s	Chunked CMAF; chunked transfer in CDN; tuned player
LL-HLS (Apple HLS Authoring Spec)	1.5 s	Partial segments; chunked transfer; tuned player
Classic DASH (ISO/IEC 23009-1)	12 s	2–4 s segments; any CDN
HLS (RFC 8216, 6-s segments)	18 s	Defaults only; any CDN

Read it three ways. First, every family whose floor is above one second pays for it with player-buffer safety, not engineering inability. Second, real deployments run 1.5 to 3 times the floor — the floor is the lab demo. Third, if your target latency is 800 milliseconds, the table immediately rules out every HTTP-segmented family and points you at WebRTC, MoQ, or HESP, which is exactly the decision the calculator is built to make obvious.
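The fit test amounts to comparing a target against each floor in the table. The sketch below copies the floors from the table above; the function names are hypothetical, and the 1.5x to 3x multiplier is the rule of thumb stated in the preceding paragraph, not a measured value.

```python
# Glass-to-glass floors in seconds, copied from the fit table above.
PROTOCOL_FLOORS_S = {
    "WebRTC": 0.2,
    "Media over QUIC": 0.2,
    "HESP": 0.4,
    "LL-DASH": 1.5,
    "LL-HLS": 1.5,
    "Classic DASH": 12.0,
    "HLS": 18.0,
}


def families_that_fit(target_s):
    """Families whose floor is at or below the target latency."""
    return [name for name, floor in PROTOCOL_FLOORS_S.items() if floor <= target_s]


def realistic_range_s(floor_s):
    """Real deployments run roughly 1.5 to 3 times the lab floor."""
    return 1.5 * floor_s, 3.0 * floor_s


print(families_that_fit(0.8))   # ['WebRTC', 'Media over QUIC', 'HESP']
print(realistic_range_s(1.5))   # (2.25, 4.5)
```

Applying the multiplier to the LL-HLS and LL-DASH floor gives 2.25 to 4.5 seconds, so a 3-second target is plausible for those families only with a well-tuned deployment.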

Where Fora Soft fits in

We have built latency-sensitive systems across every vertical where the seconds matter: WebRTC conferencing and live shopping where sub-second interactivity is the product, OTT and e-learning where reduced-latency HLS keeps cost per viewer sane, telemedicine where a surgeon's remote view cannot lag, and surveillance where an operator acts on what the camera sees now. In each case the first conversation is the same one this calculator runs — agree the budget, find the dominant term, choose the protocol that target forces. We have shipped 239+ projects since 2005, and the latency budget is on the whiteboard in the first week of nearly all of them.

Call to action

Talk to a streaming engineer: book a 30-minute scoping call, bring your target latency, and we will map it to a protocol stack and a cost per viewer.
See our case studies: shipped projects across WebRTC, OTT, telemedicine, e-learning, surveillance, and AR/VR where latency was the product.
Download the Latency Budget Cheat Sheet: a single-page printable with the seven contributors and their 2026 ranges, the encoder frame-time formula, the three latency classes, and the 2026 protocol-family floor table.

References

RFC 8216 — HTTP Live Streaming. R. Pantos, W. May, IETF, August 2017. Tier 1. https://datatracker.ietf.org/doc/html/rfc8216 — classic HLS playlist format, EXT-X-TARGETDURATION, three-segment hold-back, 18-second player-buffer floor.
draft-pantos-hls-rfc8216bis-22 — HTTP Live Streaming 2nd Edition. R. Pantos et al., IETF, May 2026. Tier 1. Internet-Draft, subject to revision before RFC publication. https://datatracker.ietf.org/doc/html/draft-pantos-hls-rfc8216bis-22 — EXT-X-SERVER-CONTROL, HOLD-BACK, PART-HOLD-BACK, LL-HLS partial-segment extensions.
HLS Authoring Specification for Apple devices, revision 2025-09. Apple Inc. Tier 1. https://developer.apple.com/documentation/http-live-streaming/hls-authoring-specification-for-apple-devices Covers the PART-HOLD-BACK floor of 2–3× Part Target Duration and the removal of HTTP/2 push from LL-HLS in 2020. Where popular articles still describe HTTP/2 push as required for LL-HLS, this calculator follows the Apple specification, which removed it.
ISO/IEC 23009-1:2022 — Dynamic adaptive streaming over HTTP (DASH) Part 1. ISO/IEC, 2022. Tier 1. https://www.iso.org/standard/83314.html — MPD@suggestedPresentationDelay, segment availability timing, classic-DASH player-buffer floor. Normative text is paywalled; the presentation-delay role is mirrored in the open DASH-IF guidelines below.
DASH-IF Low-Latency Modes for DASH Implementation Guideline. DASH Industry Forum, 2024. Tier 1. https://dashif.org/docs/CR-Low-Latency-Live-r8.pdf Covers chunked-CMAF emission, the ProducerReferenceTime (prft) box, and the LL-DASH 1.5-second floor.
W3C WebRTC 1.0, Candidate Recommendation. W3C. Tier 1. https://www.w3.org/TR/webrtc/ — peer-connection media path; jitter-buffer behaviour underlying the WebRTC sub-second floor. RTCRtpReceiver.jitterBufferTarget is defined in the W3C WebRTC-Extensions draft.
draft-ietf-moq-transport — Media over QUIC Transport. IETF MoQ Working Group, 2026. Tier 1. Internet-Draft, subject to revision. https://datatracker.ietf.org/doc/html/draft-ietf-moq-transport — the emerging real-time transport whose floor the fit table lists at 0.2 s; flagged as work in progress.
Mux: "What is latency, and how do you measure it?" Mux engineering blog. Tier 3. https://www.mux.com/articles — glass-to-glass definition and production latency ranges cross-checked against the standards above.
Cloudflare Stream: low-latency live streaming documentation. Tier 4 (production deployer). https://developers.cloudflare.com/stream/ — production CMAF chunk and edge-forwarding figures used to calibrate the CDN-stage range.
Bitmovin Video Developer Report 2025/2026. Tier 4. https://bitmovin.com/video-developer-report/ — adoption context for low-latency protocols and typical deployed latency tiers.