Why This Matters
Cache keys do not get the airtime they deserve in streaming literature. The headline goes to the CDN brand, the protocol — HLS, DASH, CMAF — and the multi-CDN strategy; the cache key, the tiny string that decides whether a million viewers share one cached segment or shred it into a million private copies, sits in a configuration file nobody opens after the first deployment. Yet every production incident a streaming team has reported to us in the past three years that started with the phrase "the origin is on fire" has traced back to a cache key that did one of three things: it forwarded an authentication token, it forwarded a query parameter the player appends, or it inherited a default that made sense for an e-commerce site and is dangerously wrong for a live stream. This article gives a product manager the model needed to challenge the engineering plan ("are we sure the cache key does not include the user token?"), gives the architect the per-vendor knob names to set, and gives the operator the seven-item checklist to walk through before a launch. By the end you should be able to read a CloudFront cache policy, a Cloudflare Cache Rule, a Fastly VCL block, or an Akamai property and tell, in thirty seconds, whether the streaming hit ratio is going to survive launch day.
What a Cache Key Is, Carefully
Start with a definition the rest of the article will refine. A cache key is the deterministic identifier — a short string the cache builds from selected parts of every request — that the cache uses to look up a stored response. When a viewer's player asks for a video segment, the CDN does not search its cache by URL alone; it builds the cache key from the URL plus a chosen set of headers, cookies, and query-string parameters, hashes the result, and looks up that hash in its storage. If the key matches an existing entry, the cache serves the stored response. If it does not, the cache requests the content from the next tier — the shield, the regional cache, or, last of all, the origin — stores the response under the new key, and serves it.
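The build-hash-lookup sequence above can be sketched in a few lines of Python. This is an illustration only, not any vendor's implementation: the function name `build_cache_key`, its parameters, and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
from urllib.parse import urlsplit, parse_qsl, urlencode


def build_cache_key(url, headers, key_headers=(), key_query_params=()):
    """Build a cache key from the URL plus explicitly selected request parts."""
    parts = urlsplit(url)
    # Keep only the query parameters that were opted into the key.
    kept_query = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name in key_query_params
    ]
    # Header names are case-insensitive, so compare them lowercased.
    lowered = {name.lower(): value for name, value in headers.items()}
    kept_headers = [(h.lower(), lowered.get(h.lower(), "")) for h in key_headers]
    material = "|".join([
        parts.scheme,
        parts.netloc.lower(),
        parts.path,
        urlencode(sorted(kept_query)),
        urlencode(sorted(kept_headers)),
    ])
    # Hash the assembled material; the digest is what the cache looks up.
    return hashlib.sha256(material.encode("utf-8")).hexdigest()
```

With nothing opted in, two viewers with different tokens and User-Agent strings produce the same key for the same segment. Passing `key_query_params=("token",)` gives each viewer a private key, which is the fragmentation the rest of the article is about.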
The non-technical analogy is the library catalogue card. The card does not contain the book; it contains the descriptors the librarian needs to find the book. A library that files every book by title alone puts every Shakespeare into one slot. A library that files every book by title, author, edition, language, binding, and the date it last left the building puts every Shakespeare into its own private slot, and leaves a librarian who finds nothing on the shelf when ten people ask for Hamlet at the same time. Cache keys behave the same way. The fewer descriptors in the key, the more requests land on the same cached object and the higher the hit ratio. The more descriptors, the more requests carve out their own private object (what the literature calls cache fragmentation) and the lower the hit ratio drops.
What changes in streaming is the stakes. A static website might serve a hundred unique URLs; if the key fragments by a factor of ten, the site ends up with a thousand cached copies and still works. A live stream serves the same handful of segment URLs to every viewer; if the key fragments by even a small factor (by language preference, by Accept-Encoding variant, by an innocuous-looking session ID), the cache cannot hold all the copies, the hit ratio collapses, and the requests fall through to the origin. The origin saturates, the CDN bill jumps, and the player rebuffers in regions nobody planned for. The cache key is the single point in the whole streaming stack where one configuration line drives the gap between "healthy" and "on fire".
The Default Cache Key on Every Major CDN
Defaults matter, because the engineer who never opens the configuration ships whatever the CDN gave them on day one. The defaults look similar across vendors and differ in details that matter.
Cloudflare's default cache key is {scheme}://{host}{path}?{sorted_query_string}. Cloudflare's documentation states this explicitly, and a sort step inside the cache rules sorts the query string alphabetically by default so that two URLs with the same parameters in different orders share the same cache entry. Headers and cookies are not in the default key. A separate property called Query String Sort exists in the Caching app dashboard and can be toggled, but the underlying alphabetical sort in the cache key is on by default.
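To see why a sort step helps, here is a minimal Python sketch of query-string sorting. It illustrates the idea only and is not Cloudflare's implementation; the function name `sort_query_string` is hypothetical, and the sketch drops any URL fragment.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def sort_query_string(url):
    """Return the URL with its query parameters in alphabetical order."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    # Sorting the (name, value) pairs makes parameter order irrelevant.
    sorted_query = urlencode(sorted(pairs))
    return urlunsplit((parts.scheme, parts.netloc, parts.path, sorted_query, ""))
```

Two requests that differ only in parameter order, such as `?b=2&a=1` and `?a=1&b=2`, normalise to the same string and therefore share one cache entry.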
AWS CloudFront's default cache key is the URL plus whatever the active cache policy includes. The cache policy explicitly names which query strings, headers, and cookies enter the key — Cache Policy QueryStringsConfig, HeadersConfig, and CookiesConfig. CloudFront ships several managed policies: Managed-CachingOptimized includes no headers, no cookies, and no query strings in the cache key; Managed-CachingDisabled disables caching entirely. A second policy, the Origin Request Policy, controls what CloudFront forwards to the origin without making it part of the cache key — a separation that matters for streaming and that we return to below.
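As a rough sketch, a cache policy that keeps headers, cookies, and query strings out of the key could look like the Python dictionary below. The field names follow the CloudFront cache-policy structure as we understand it and should be checked against the API reference before use. The policy name, the TTL values, and the helper `keyed_components` are assumptions for illustration.

```python
# Shape of a cache policy with a clean key, written as plain data.
STREAMING_CACHE_POLICY = {
    "Name": "streaming-segments-example",  # hypothetical policy name
    "MinTTL": 1,                           # illustrative TTLs, in seconds
    "DefaultTTL": 86400,
    "MaxTTL": 31536000,
    "ParametersInCacheKeyAndForwardedToOrigin": {
        "EnableAcceptEncodingGzip": True,
        "EnableAcceptEncodingBrotli": True,
        "HeadersConfig": {"HeaderBehavior": "none"},
        "CookiesConfig": {"CookieBehavior": "none"},
        "QueryStringsConfig": {"QueryStringBehavior": "none"},
    },
}


def keyed_components(policy):
    """List which request components the policy places in the cache key."""
    params = policy["ParametersInCacheKeyAndForwardedToOrigin"]
    keyed = []
    if params["HeadersConfig"]["HeaderBehavior"] != "none":
        keyed.append("headers")
    if params["CookiesConfig"]["CookieBehavior"] != "none":
        keyed.append("cookies")
    if params["QueryStringsConfig"]["QueryStringBehavior"] != "none":
        keyed.append("query strings")
    return keyed
```

An empty result from `keyed_components` means the key is the URL path alone, apart from the compression handling. Anything the origin still needs, such as an auth header, would then be forwarded through the Origin Request Policy rather than through this structure.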
Fastly's default cache key is the URL plus the Host header, built inside the vcl_hash subroutine. Fastly's VCL best-practices document warns engineers to avoid modifying that hash unless they have to, and recommends the Vary header as a more flexible primary technique for cache-key variation. A req.hash_always_miss flag exists for forcing a miss without disabling request collapsing — useful for debugging, dangerous in production.
Akamai's default cache key is the URL; cache key query parameters and cache ID modifications are configured per property in Property Manager. Akamai's documentation calls the construct the cache ID, which is conceptually the same thing as the cache key but with a different vendor vocabulary; the Cache ID Modification behavior is what lets you add or remove components from it.
Google Media CDN's default cache key is the URL, and Media CDN's caching configuration controls which headers and query parameters get included.
A vendor-comparison summary, with the defaults stated identically:
| CDN | Default cache key | What's NOT in the key by default | Vary header respected? |
|---|---|---|---|
| Cloudflare | scheme + host + path + sorted query string | Headers, cookies | Honoured for selected header values only |
| AWS CloudFront | URL + policy-selected params | Whatever the cache policy excludes | Honoured for Accept-Encoding, Accept-Language, Origin (others ignored unless configured) |
| Fastly | URL + Host | Headers, cookies | Honoured by default; produces a secondary cache key per the spec |
| Akamai | URL | Headers, cookies, query strings (unless configured) | Ignored by default; explicit Remove Vary Header behaviour exists |
| Google Media CDN | URL | Headers, cookies, query strings (unless configured) | Honoured per configuration |
The Vary Header, the Standard's Cache Key Half-Twin
The HTTP specification carries its own opinion on cache keys, and a streaming engineer who skips it will end up arguing the wrong side of a misunderstanding with a CDN's support team. The relevant document is IETF RFC 9111, HTTP Caching, published June 2022, which obsoletes RFC 7234.
The mechanism is the Vary response header. When an origin returns a response with Vary: Accept-Language, the cache reads that as a contract: this response was selected based on the client's Accept-Language request header, and the cache MUST NOT serve this stored response to another request whose Accept-Language does not match the original. RFC 9111 §4.1 describes how the key is extended: the primary key is still the URL plus method, and the Vary-named request headers form a per-stored-response secondary cache key. A response carrying Vary: * always fails to match and forces revalidation every time; the spec is explicit on this.
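The matching rule can be sketched as a small Python function. It is a simplified reading of RFC 9111 §4.1, not a complete implementation: the only header-value normalisation it does is trimming surrounding whitespace, and the function name and parameters are hypothetical.

```python
def vary_matches(stored_vary, stored_request_headers, new_request_headers):
    """Return True if a stored response may be reused for the new request."""
    names = [n.strip().lower() for n in stored_vary.split(",") if n.strip()]
    # A Vary value containing "*" never matches.
    if "*" in names:
        return False
    old = {k.lower(): v.strip() for k, v in stored_request_headers.items()}
    new = {k.lower(): v.strip() for k, v in new_request_headers.items()}
    # Every header named by Vary must agree; absent in both counts as agreement.
    return all(old.get(name) == new.get(name) for name in names)
```

A response stored with `Vary: Accept-Encoding` is reusable by any request with the same Accept-Encoding value. Once User-Agent joins the list, each distinct browser string needs its own stored copy.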
For streaming the implication is large and counter-intuitive. If the origin packager sets Vary: Accept-Encoding, Cookie, User-Agent on every manifest and segment, every viewer with a different User-Agent (every Safari, every Chrome version, every Roku model) gets a private cached copy. A single misconfigured Vary header can fragment the cache by a factor of fifty across browser versions. The defensive setting on the origin is to keep Vary short, listed by exact header name only, and never list User-Agent, Cookie, or * on streaming responses. The Apple HLS Authoring Specification's general guidance on caching mirrors this: keep segment cache keys simple; segments are immutable.
The CDN's behaviour on Vary differs. Cloudflare honours Vary only for a small allow-list of headers (Accept-Encoding being the canonical example). Akamai ignores most Vary headers and recommends removing them at the edge unless they are Vary: Accept-Encoding. Fastly respects Vary by default and builds a secondary cache key per the spec. The implication: an origin that depends on Vary working consistently will see different cache hit ratios on different CDNs for the same content. The defensive practice — and the one we ship by default — is to design the cache key with explicit CDN cache-rule configuration and treat Vary as a belt-and-braces backup rather than the primary mechanism.
Seven Cache-Key Mistakes That Break Streaming
The seven mistakes below produce more than nine out of ten of the cache-key incidents we have helped streaming teams recover from. They are in order of frequency.
Mistake 1: The session token in the query string
A common authentication pattern in early streaming stacks signs URLs by appending a token to the manifest and segment URLs: …/segment_4172.ts?token=eyJh…. The token is unique per viewer, sometimes per session, and the CDN's default cache key includes the query string. The result is one cached copy per viewer per segment; the hit ratio collapses to near zero.
The fix is to move the token off the query string and into either a signed-cookie scheme (Cloudflare Signed Cookies, CloudFront Signed Cookies, Cloud CDN Signed Cookies) or a URL-prefix signature scheme that signs a path component the CDN strips from the cache key. AWS, Cloudflare, Google, Bunny, and Akamai all publish this exact recommendation in their token-auth documentation. Google's Media CDN dual-token authentication is one canonical example: the master URL signature is short-lived and viewer-specific; the sub-paths under it are signed with a longer-lived shared token that is stripped from the cache key. The result is per-viewer access control with a globally shared cache.
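The effect of keeping the token out of the key can be sketched as follows. The sketch assumes the edge has already validated the token before the lookup, and it shows only how the key is derived. The parameter name `token` comes from the example URL above; the function name and the `UNKEYED_PARAMS` constant are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Query parameters that must never enter the cache key (assumed names).
UNKEYED_PARAMS = frozenset({"token"})


def cache_key_url(url, unkeyed=UNKEYED_PARAMS):
    """Return the URL used as cache-key material, without per-viewer parameters."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in unkeyed
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
```

Every viewer's signed URL for a given segment now reduces to the same key material, so access control stays per viewer while the cached object is shared.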
Mistake 2: Authorization header included in the cache key
A second flavour of the same problem: the origin or the CDN policy includes the Authorization request header in the cache key. The Authorization header carries a JWT or an opaque token that varies per viewer. The cache fragments by viewer, the hit ratio collapses. The fix is to exclude the Authorization header from the cache key while still forwarding it to the origin for validation. CloudFront's Origin Request Policy is exactly the right tool for this: the header rides through to the origin for the auth check; the cache key stays clean.
Mistake 3: The Vary trap
The origin packager returns Vary: User-Agent on the manifest. The CDN respects Vary and shards the cache by every distinct User-Agent string. Mobile Safari version bumps, Chrome version bumps, Roku model strings — each spawns its own cached object. Hit ratios drop, the origin starts seeing the long tail. The fix is to constrain Vary at the origin to the small allow-list — Vary: Accept-Encoding and nothing else, unless there is a concrete reason — and to use Akamai's Remove Vary Header behavior or its Cloudflare/Fastly equivalent as a belt-and-braces filter at the edge.
Mistake 4: Host-vs-host fragmentation
A streaming workload is served behind two CNAMEs — cdn1.example.com and cdn2.example.com — pointing at the same CDN. The Host header is in the default key for every CDN. The cache holds the same segment twice, once per hostname. If the workload uses three hostnames, three copies, and so on. The fix is to either route all traffic through one canonical host, or to use the CDN's host-normalisation feature (Cloudflare's "Resolved host" cache-key option, CloudFront's cache key based on path only, Fastly's req.http.Host rewrite in vcl_hash) to collapse multiple hostnames into the same key bucket.
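Host normalisation amounts to mapping every alias to one canonical name before the key is built. The Python sketch below shows the idea under stated assumptions: the canonical hostname `stream.example.com`, the alias set, and both function names are hypothetical, and a real deployment would use the vendor feature named above.

```python
from urllib.parse import urlsplit

CANONICAL_HOST = "stream.example.com"  # hypothetical canonical hostname
HOST_ALIASES = {"cdn1.example.com", "cdn2.example.com"}


def key_host(url):
    """Map any known alias to the canonical host used in the cache key."""
    host = (urlsplit(url).hostname or "").lower()
    return CANONICAL_HOST if host in HOST_ALIASES else host


def host_normalised_key(url):
    """Build key material from the normalised host plus the path."""
    return f"{key_host(url)}{urlsplit(url).path}"
```

Requests arriving on either CNAME now resolve to one key, so the cache holds a single copy of each segment instead of one per hostname.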
Mistake 5: Percent-encoding and case mismatches
/segment%204172.ts and /segment 4172.ts and /Segment_4172.ts are three different keys at the cache level, even when they point at the same file on the origin. Players sometimes percent-encode characters differently across versions; an Android device that uppercases a path component will not share a cache with an iOS device that lowercases it. The fix is to enforce a canonical URL format at the edge — lowercase the path, normalise the encoding — using Cloudflare Transform Rules, CloudFront Functions, Akamai's Modify Outgoing Request behaviour, or a Fastly VCL set req.url line in vcl_recv. The work is small; the impact on hit ratio is measurable on every multi-platform launch.
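A canonicalisation step of this kind can be sketched in Python as decode, lowercase, re-encode. Two assumptions apply: lowercasing is safe only when the origin treats paths case-insensitively, and the sketch does not protect an encoded slash (`%2F`), which it would turn into a path separator. The function name `canonical_path` is hypothetical.

```python
from urllib.parse import quote, unquote


def canonical_path(path):
    """Return one canonical spelling for a request path."""
    decoded = unquote(path)    # "%20" and a literal space become the same character
    lowered = decoded.lower()  # assumes origin paths are case-insensitive
    # Re-encode consistently, leaving "/" as the path separator.
    return quote(lowered, safe="/")
```

The encoded form, the literal-space form, and a differently cased form of the same path all reduce to one spelling. Paths that differ in their actual characters, such as an underscore in place of a space, stay distinct.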
Mistake 6: Cookies in the cache key
A site cookie tracks the viewer's last-watched episode and is set on every request, including segment fetches. The CDN's cache policy includes Cookie in the cache key. Every viewer gets a private cached segment, and the hit ratio dies. The fix is to keep cookies out of the cache key for the streaming subdomain. CloudFront's CookiesConfig.CookieBehavior: none is one example; Cloudflare excludes cookies by default and you only have to avoid adding them via a Cache Rule. The general principle: cookies belong on the application origin's cache, not the streaming CDN's cache.
Mistake 7: Segment-and-manifest crosswiring
Manifests are mutable; segments are not. A reasonable cache configuration treats them differently: short TTL on the .m3u8 and .mpd, long TTL on the .ts and .m4s. A common mistake is to apply the same cache-key recipe to both. Either the manifest inherits the long-lived caching of a segment and updates do not propagate, or the segments inherit a short-TTL recipe and the cache evicts them in seconds, sending every retry to the origin. The fix is two separate cache rules (one for .m3u8 / .mpd / .dash with a 1–5 second TTL and serve-stale enabled, one for .ts / .m4s / .mp4 segments with a 1-year max-age and the immutable directive) and a verification step that confirms both rules match the right URL patterns before launch.
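The two-rule split can be sketched as a function that picks a Cache-Control value by file extension. The extension sets, the 2-second manifest TTL (inside the 1–5 second range), and the use of `stale-while-revalidate` to stand in for serve-stale are assumptions for illustration; the function name is hypothetical.

```python
import posixpath

MANIFEST_EXTENSIONS = {".m3u8", ".mpd"}
SEGMENT_EXTENSIONS = {".ts", ".m4s", ".mp4"}


def cache_control_for(path):
    """Pick a Cache-Control value based on whether the path is a manifest or a segment."""
    extension = posixpath.splitext(path.lower())[1]
    if extension in MANIFEST_EXTENSIONS:
        # Mutable: short TTL, and allow a stale copy while refreshing.
        return "public, max-age=2, stale-while-revalidate=4"
    if extension in SEGMENT_EXTENSIONS:
        # Immutable: one year, never revalidated.
        return "public, max-age=31536000, immutable"
    return None  # not a streaming object; leave to other rules
```

Running a list of real manifest and segment URLs through a function like this before launch is one way to do the verification step: every URL should land in the rule you expect, and none should return `None`.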
The Math: One Token in the Query String, Ten Times the Bill
Take the same worked example from the previous article in this section. A live sports event carries 50,000 concurrent viewers at 3 megabits per second — abbreviated Mbps — with 4-second HLS segments. The system issues 12,500 segment fetches per second; each segment is 1.5 megabytes (3 Mbps × 4 s ÷ 8 = 1.5 MB).
Clean cache key. The cache key is the URL only. The edge hit ratio is 88 percent; the shield hit ratio is 92 percent; the residual miss rate at the origin is 0.96 percent. The arithmetic from the origin-shield article gives:
origin misses per second = 12,500 × 0.12 × 0.08 = 120 fetches/sec
origin egress = 120 × 1.5 MB = 180 MB/s
Cache key with a per-viewer session token. The same workload, but the cache key includes ?token=… and every viewer's token is distinct. Each segment URL is now distinct per viewer; the cache has 50,000 versions of segment 4172 instead of one. The edge cannot hold them all, the shield cannot hold them all, and almost every viewer's first request for the segment misses every tier. The hit ratio at both the edge and the shield collapses to roughly 5 percent; the residual is only the same viewer asking for the same segment within the cache window after their first fetch. The arithmetic:
origin misses per second = 12,500 × 0.95 × 0.95 = 11,281 fetches/sec
origin egress = 11,281 × 1.5 MB = 16.92 GB/s
Origin egress jumps from 180 MB/s to 16.92 GB/s — a factor of 94× on the same workload. At AWS's standard $0.09 per gigabyte EC2-to-public-internet rate, the broken steady-state bill goes from roughly $42,000 per month to roughly $3.95 million per month. Even at 4 hours of peak traffic per day — a more realistic sports-streamer profile — the broken bill lands near $660,000 per month, where the clean one is $7,000. A single line in a cache rule.
The point of running the arithmetic is not the absolute number; it is the order of magnitude. A cache-key mistake does not produce a 20 percent regression. It produces a 50× to 100× regression, fast enough that the on-call engineer sees the origin bill ticker move in real time before the alert from monitoring catches up.
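The arithmetic above is easy to re-run with your own numbers. The Python sketch below reproduces it under the same assumptions as the worked example: decimal units (1 GB = 1,000 MB), a 30-day month, and a flat $0.09 per gigabyte. The function names are hypothetical.

```python
def origin_load(viewers, bitrate_mbps, segment_seconds, edge_hit, shield_hit):
    """Return (origin misses per second, origin egress in MB/s)."""
    fetches_per_sec = viewers / segment_seconds
    segment_mb = bitrate_mbps * segment_seconds / 8  # megabits to megabytes
    # A request reaches the origin only if it misses both the edge and the shield.
    misses_per_sec = fetches_per_sec * (1 - edge_hit) * (1 - shield_hit)
    return misses_per_sec, misses_per_sec * segment_mb


def monthly_cost_usd(egress_mb_per_sec, usd_per_gb=0.09, hours_per_day=24, days=30):
    """Return the monthly egress cost at a flat per-gigabyte rate."""
    gigabytes = egress_mb_per_sec / 1000 * hours_per_day * 3600 * days
    return gigabytes * usd_per_gb


clean_misses, clean_egress = origin_load(50_000, 3, 4, edge_hit=0.88, shield_hit=0.92)
broken_misses, broken_egress = origin_load(50_000, 3, 4, edge_hit=0.05, shield_hit=0.05)
ratio = broken_egress / clean_egress
```

Passing `hours_per_day=4` to `monthly_cost_usd` gives the peak-hours-only profile. Changing the hit ratios shows how quickly the origin egress figure moves once either tier stops absorbing requests.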
Common mistake. "We already monitor cache hit ratio in our dashboard." Cache hit ratio is the right metric, but the dashboard refresh window is rarely shorter than the time it takes a session-token-in-query-string deployment to burn through a quarter's CDN budget. Run a cache-key audit before the deployment; treat the dashboard as a backstop, not a substitute. The pre-launch checklist below is the version we use internally.
A Pre-Launch Cache-Key Checklist
A short, vendor-agnostic list of the seven questions a streaming team should answer in writing before any live launch. We ship this same checklist as the downloadable companion at the bottom of the page.
The seven questions:
- What is in the cache key today? Print the exact rule from the CDN console. If you cannot articulate it in one sentence, the rule is too complicated.
- Are session tokens in the query string? If yes, move them to a signed cookie or a path-prefix signature.
- Is the Authorization header in the cache key? If yes, move it to the Origin Request Policy or its equivalent.
- What does the origin's Vary header list? If anything other than Accept-Encoding, ask whether it can be removed at the edge.
- How many hostnames serve the same content? If more than one, normalise to one cache-key host.
- Is the path normalised for case and percent-encoding? If not, add a one-line normalisation at the edge.
- Are .m3u8/.mpd and segments treated by separate rules? If not, split the rule.
For each, the answer is one short phrase and a link to the configuration line. A reviewer must be able to read the list, understand the cache-key design, and approve or reject it within ten minutes.
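Part of the checklist can be automated once the configuration is summarised in a vendor-neutral form. The sketch below assumes such a summary exists as a plain dictionary. Every key name in it, the set of token-like parameter names, and the function `audit_cache_key` are hypothetical, and the first question (stating the key in one sentence) remains a human step.

```python
# Query parameter names treated as per-viewer session tokens (assumed names).
SESSION_TOKEN_PARAMS = {"token", "session", "sid"}


def audit_cache_key(config):
    """Return a list of checklist findings for a vendor-neutral config summary."""
    findings = []
    keyed_params = {p.lower() for p in config.get("keyed_query_params", [])}
    keyed_headers = {h.lower() for h in config.get("keyed_headers", [])}
    vary = {h.lower() for h in config.get("origin_vary", [])}

    if keyed_params & SESSION_TOKEN_PARAMS:
        findings.append("session token in the cache key")
    if "authorization" in keyed_headers:
        findings.append("Authorization header in the cache key")
    if vary - {"accept-encoding"}:
        findings.append("origin Vary lists more than Accept-Encoding")
    if len(config.get("hostnames", [])) > 1 and not config.get("host_normalised", False):
        findings.append("multiple hostnames without host normalisation")
    if not config.get("path_normalised", False):
        findings.append("path not normalised for case and percent-encoding")
    if not config.get("separate_manifest_rule", False):
        findings.append("manifests and segments share one rule")
    return findings
```

An empty list means the summary passes the six mechanical questions. Each finding corresponds to one checklist item and should be resolved, or explicitly accepted in writing, before launch.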
Where Fora Soft Fits In
We have built and operated streaming stacks since 2005 — OTT/Internet TV, e-learning, telemedicine, video surveillance, and conferencing platforms with WebRTC and HLS hybrids — and the pre-launch cache-key audit is part of every project handover. In one OTT engagement we cut the operator's monthly CDN bill by more than 80 percent by removing the session token from the cache key alone; in another, an e-learning platform's first-month launch was rescued by collapsing four hostnames into one canonical streaming subdomain three days before the cohort opened. The pattern is consistent across verticals: cache-key configuration sits at the boundary of platform engineering and SRE, and most teams underweight it until the first incident makes it the centrepiece.
Per-Vendor Cheat Sheet for the 2026 Stack
| Task | Cloudflare | AWS CloudFront | Fastly | Akamai | Google Media CDN |
|---|---|---|---|---|---|
| Default cache key | scheme + host + path + sorted query string | URL + Cache Policy fields | URL + Host (vcl_hash) | URL (Cache ID) | URL |
| Add a query string to the key | Cache Rule > Custom Cache Key | Cache Policy > QueryStringsConfig | set req.url in vcl_recv | Cache Key Query Parameters behaviour | cdnPolicy.cacheKeyPolicy.includedQueryParameters |
| Forward header without keying | Workers / Transform Rule | Origin Request Policy | req.http.X = …; remove from vcl_hash | Modify Outgoing Request behaviour | Origin request configuration |
| Normalise host | Resolved host setting | Cache Policy > HeadersConfig (omit Host) | Rewrite req.http.Host in vcl_recv | Cache ID Modification > Hostname | Backend bucket host rewrite |
| Strip Vary at edge | Workers / Response Headers Transform Rule | Response Headers Policy | unset beresp.http.Vary in vcl_fetch | Remove Vary Header behaviour | Custom response headers |
What to Read Next
- Origin shielding and tiered caching — the layer that catches what the edge misses.
- Token authentication, signed URLs, and origin protection — how to authenticate without breaking the cache.
- CDN cost economics: 95th-percentile, commit, overage, transit — what a clean cache key buys, in dollars.
Talk to Us / See Our Work / Download
- Talk to a streaming engineer. Tell us about your stack; we'll review the cache-key configuration alongside the rest of the delivery layer.
- See our case studies. OTT, e-learning, telemedicine, video surveillance, WebRTC conferencing.
- Download the Cache-Key Pre-Launch Checklist (PDF). Single-page, vendor-agnostic, the version we use internally before every live launch.
References
- IETF RFC 9111, HTTP Caching, June 2022. §4.1 (Calculating Cache Keys with the Vary Header Field). The controlling specification for the secondary cache key concept.
- IETF RFC 9110, HTTP Semantics, June 2022. §12.5.5 (Vary). Defines the Vary response header that produces the secondary cache key.
- Cloudflare Cache (CDN) docs — Cache Keys. Default cache key shape; Custom Cache Key Rules; Query String Sort. Last verified 2026-05-24.
- Cloudflare Blog — Increasing Cache Hit Rates with Query String Sort. Engineering write-up on the query-string-sort optimisation.
- AWS CloudFront Developer Guide — Cache content based on query string parameters. Defines whitelist/all/none behaviour for CachePolicyQueryStringsConfig.
- AWS CloudFront API Reference — ParametersInCacheKeyAndForwardedToOrigin. Authoritative API definition of the fields that compose the cache key.
- AWS CloudFront Developer Guide — Origin request policies. The policy that forwards headers and query strings to the origin without including them in the cache key.
- Fastly Documentation — Manipulating the cache key. Default vcl_hash behaviour and the recommendation to prefer Vary over hash modifications.
- Fastly Documentation — VCL best practices. Section on vcl_hash and req.hash_always_miss.
- Fastly Blog — Getting the most out of Vary with Fastly. Engineering-level treatment of how Fastly implements Vary as a secondary cache key.
- Akamai TechDocs — Cache ID Modification. The Property Manager behaviour that changes the components of the cache ID (Akamai's term for the cache key).
- Akamai TechDocs — Remove Vary Header. The behaviour that strips Vary at the edge to keep cache hit ratios high.
- Google Media CDN Documentation — Authenticate content. Token-authentication patterns that preserve cache hit ratio.
- Google Media CDN Documentation — Dual-token authentication. Path-prefix + viewer-token split that keeps the segment cache shared.
- Apple HLS Authoring Specification for Apple Devices. General guidance on caching of manifests and segments. Last verified 2026-05-24.
- PortSwigger Research — Practical Web Cache Poisoning. The unkeyed-input attack taxonomy that motivates strict cache-key discipline.
- Unified Streaming — Caching recommendations. Vendor-independent caching configuration for HLS/DASH/CMAF stacks.


