Startup time measures everything that happens before the first frame is rendered: DNS resolution, TLS handshake, manifest fetch, manifest parse, license acquisition (for DRM content), initial segment fetch, decoder setup, first decode and first render. Each stage adds milliseconds. A well-tuned web player on a warm connection starts in 1.0–1.5 seconds; a cold start with DRM, fresh DNS and a large manifest can take 4–6 seconds.
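As a rough illustration of how the stages add up, the sketch below sums per-stage durations and reports the slowest stage. It assumes the stages run strictly one after another, which real players do not always do. The names `StageTimings`, `startupTimeMs` and `slowestStage` are hypothetical, and the sample numbers are invented for illustration, not measurements.

```typescript
// Stages named in the text, in pipeline order. All values are milliseconds.
type Stage =
  | "dns" | "tls" | "manifestFetch" | "manifestParse" | "license"
  | "segmentFetch" | "decoderSetup" | "firstDecode" | "firstRender";

const STAGES: Stage[] = [
  "dns", "tls", "manifestFetch", "manifestParse", "license",
  "segmentFetch", "decoderSetup", "firstDecode", "firstRender",
];

// A stage that is skipped (for example, license on non-DRM content) is omitted.
type StageTimings = Partial<Record<Stage, number>>;

// Total startup time, assuming the stages run sequentially with no overlap.
function startupTimeMs(timings: StageTimings): number {
  let total = 0;
  for (const stage of STAGES) {
    total += timings[stage] ?? 0;
  }
  return total;
}

// The stage with the largest duration; undefined if no timings were recorded.
function slowestStage(timings: StageTimings): Stage | undefined {
  let worst: Stage | undefined = undefined;
  let worstMs = -1;
  for (const stage of STAGES) {
    const ms = timings[stage];
    if (ms !== undefined && ms > worstMs) {
      worst = stage;
      worstMs = ms;
    }
  }
  return worst;
}

// Illustrative numbers only: a warm session (DNS and TLS already paid, no DRM).
const warm: StageTimings = {
  dns: 0, tls: 0, manifestFetch: 150, manifestParse: 20,
  segmentFetch: 600, decoderSetup: 150, firstDecode: 50, firstRender: 30,
};

// Illustrative numbers only: a cold start with DRM and a large manifest.
const cold: StageTimings = {
  dns: 100, tls: 200, manifestFetch: 400, manifestParse: 100, license: 800,
  segmentFetch: 1800, decoderSetup: 400, firstDecode: 100, firstRender: 100,
};
```

With these made-up inputs, `startupTimeMs(warm)` is 1000 ms and `startupTimeMs(cold)` is 4000 ms, and `slowestStage` points at the first segment fetch in both cases. In practice the per-stage values would come from the player's own instrumentation.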

Industry benchmarks (Mux, Conviva, Bitmovin) rate desktop startup under 1.5 s as excellent, 1.5–3 s as good and 3–5 s as acceptable; above 5 s, engagement loss becomes measurable. Mobile is generally 0.5–1 s slower because of higher RTT and weaker decoders. Live streams typically start more slowly than VOD because the player must catch up to the live edge in addition to negotiating a starting rendition.
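The desktop tiers above can be expressed as a small classifier, which is handy for bucketing startup samples in a dashboard. This is a sketch: the function name `classifyDesktopStartup` and the tier label "poor" are hypothetical, and the handling of the exact boundaries (1.5 s counts as good, 3 s and 5 s count as acceptable) is an assumption, since the text does not say which side each boundary falls on.

```typescript
type StartupTier = "excellent" | "good" | "acceptable" | "poor";

// Map a desktop startup time in seconds to the benchmark tiers from the text.
// Boundary handling is an assumption: [0, 1.5) excellent, [1.5, 3) good,
// [3, 5] acceptable, above 5 poor.
function classifyDesktopStartup(seconds: number): StartupTier {
  if (!isFinite(seconds) || seconds < 0) {
    throw new RangeError("startup time must be a non-negative finite number");
  }
  if (seconds < 1.5) return "excellent";
  if (seconds < 3) return "good";
  if (seconds <= 5) return "acceptable";
  return "poor";
}
```

For example, a 1.2 s startup classifies as "excellent" and a 5.1 s startup as "poor". Mobile samples would need their own thresholds, which this sketch does not attempt to define.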

Reducing startup time is a per-stage optimisation problem. DNS prefetch on the previous page cuts resolution cost. Reusing an existing HTTPS connection to the CDN edge skips the TLS handshake on warm sessions. Manifest size matters: a 5 KB manifest downloads and parses faster than a 50 KB one. License acquisition can be parallelised with the first segment fetch, so the slower of the two sets the delay rather than their sum. The starting rendition matters too: set it too high and the first segment takes too long to download; set it too low and viewers see visibly low-bitrate video for the first few seconds. Most operators target a starting rendition under 1.5 Mbps and upshift aggressively once the buffer has built a cushion.
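The starting-rendition rule can be sketched as follows. The 1.5 Mbps cap comes from the text; everything else is an assumption made for illustration: the `Rendition` shape, the function names, the 10-second buffer threshold and the 0.8 throughput safety factor. It is not the ABR algorithm of any particular player.

```typescript
interface Rendition {
  id: string;         // hypothetical label, e.g. "720p"
  bitrateBps: number; // average encoded bitrate in bits per second
}

// Cap taken from the text: start on a rendition under 1.5 Mbps.
const START_CAP_BPS = 1500000;

// Pick the highest-bitrate rendition strictly below the cap.
// If every rendition is at or above the cap, fall back to the lowest one.
function pickStartingRendition(
  renditions: Rendition[],
  capBps: number = START_CAP_BPS,
): Rendition {
  if (renditions.length === 0) {
    throw new Error("rendition ladder is empty");
  }
  const sorted = renditions.slice().sort((a, b) => a.bitrateBps - b.bitrateBps);
  let choice = sorted[0];
  for (const r of sorted) {
    if (r.bitrateBps < capBps) choice = r;
  }
  return choice;
}

// Assumed upshift rule: once the buffer holds at least minBufferSec seconds,
// jump to the highest rendition that fits within safety * measured throughput.
// This function only ever upshifts; it never returns a lower rendition.
function pickAfterStartup(
  renditions: Rendition[],
  current: Rendition,
  bufferSec: number,
  throughputBps: number,
  minBufferSec: number = 10, // hypothetical threshold
  safety: number = 0.8,      // hypothetical safety factor
): Rendition {
  if (bufferSec < minBufferSec) return current;
  const budgetBps = throughputBps * safety;
  let choice = current;
  for (const r of renditions) {
    if (r.bitrateBps <= budgetBps && r.bitrateBps > choice.bitrateBps) {
      choice = r;
    }
  }
  return choice;
}

// Illustrative ladder; the ids and bitrates are invented for this example.
const ladder: Rendition[] = [
  { id: "240p", bitrateBps: 400000 },
  { id: "360p", bitrateBps: 800000 },
  { id: "540p", bitrateBps: 1200000 },
  { id: "720p", bitrateBps: 2500000 },
  { id: "1080p", bitrateBps: 5000000 },
];
```

With this ladder, `pickStartingRendition(ladder)` selects the 1.2 Mbps rendition. Given 12 seconds of buffer and a measured throughput of 4 Mbps, `pickAfterStartup` moves up to the 2.5 Mbps rendition; with only 3 seconds of buffer it stays where it is.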