Origin Shielding and Tiered Caching

Why This Matters

If you ship a live or popular VOD stream, the origin is the single point in the network that fails first, costs the most per gigabyte, and has the smallest margin for error — and the shield is the cheapest piece of infrastructure that solves all three problems at once. A team that runs without a shield will, on the day of the first hit event, watch the origin saturate, the bill spike, and the player rebuffer in a region nobody planned for; the same team running with a shield watches a graph that barely moves. This article gives a product manager the model needed to talk about origin protection with an engineering team, gives an architect the layered diagram and the math, and gives an operator the exact configuration knobs to set on AWS CloudFront, Cloudflare, Fastly, Akamai, and a Varnish-based custom shield. By the end you should be able to explain to a CFO why a $180,000 monthly origin egress line becomes a $14,000 one, and to a junior engineer why "we already have a CDN" is not the same as "we have a shield".

What an Origin Shield Is, Carefully

Start with a definition the rest of the article will refine. An origin shield is a single cache layer inside the CDN — usually one chosen Region or Point of Presence, abbreviated POP — that every upstream request from every edge funnels through before the request is allowed to reach the operator's own server. The cache that sits there is just a normal HTTP cache; what makes it a shield is the routing rule that says "no edge talks to the origin directly". The shield is the only thing that talks to the origin, and even the shield only talks to the origin when its own cache misses.

A second piece of vocabulary, used in this same article: tiered caching is the broader pattern of layering more than one cache between the edge and the origin. An origin shield is the specific name for the deepest of those layers — the one closest to the origin. The middle layers, when they exist, are called regional caches, mid-tier caches, or in Akamai's vocabulary the parent tier. So an origin shield is a special case of tiered caching: the case where one of the tiers sits as the last gate before the origin and is the only one that may cross it.

For a non-technical reader the analogy from the previous article in Block 6 — the chain of corner stores, the regional distribution centre, the single warehouse — still applies. The regional distribution centre is the shield. Without it, every corner store has to call the warehouse for every product the corner store does not have on its shelf; with it, the corner store calls the distribution centre, and the distribution centre calls the warehouse only when neither it nor any of its sister stores has the product in stock. Replace shelves with caches and trucks with HTTP requests and the picture is correct.

The technical definition Amazon Web Services — abbreviated AWS — publishes for its CloudFront product is consistent with this framing. Their developer guide says, in plain words, that CloudFront Origin Shield is "an additional layer in the CloudFront caching infrastructure that helps to minimize your origin's load, improve its availability, and reduce its operating costs" and that it acts as "a centralized caching layer between the regional edge caches and your origin" by "collapsing duplicate requests" before they reach the operator's server.

Five-layer streaming cache hierarchy from the viewer's player through edge POP, regional cache, origin shield, and finally the operator's origin, with the request volume shrinking at each hop.

Figure 1. The five-layer cache hierarchy. The shield is the last gate before the origin; it is the only cache that talks to the origin, and only when its own cache misses.

Why a Streaming Workload Needs This More Than a Static Site

A static website with global readers gets along fine with one or two cache layers. A streaming workload does not, and the reason is the time pattern of streaming requests rather than their volume. Every viewer of the same live event asks for the same segment at almost the same instant, every 2 to 6 seconds, for the duration of the event. The arithmetic at the start of a popular live stream is brutal: 100,000 concurrent viewers asking for segment 4172 within a 200-millisecond window is, from the origin's point of view, a single 100,000-fold spike on one file followed by a single 100,000-fold spike on the next file two seconds later, and so on.

Two failure modes follow. The first is the thundering herd, an industry term for the burst of duplicate fetches that converge on a single uncached object the instant a popular live segment is published. Without coalescing or shielding, every edge POP that has not yet cached segment 4172 sends one upstream fetch for it. Five hundred edges across the operator's CDN can mean five hundred origin requests for one file at the same millisecond: fewer than the 100,000 viewer-level requests, but more than the origin's load model assumed. The second failure mode is bandwidth: even when the origin's request-per-second budget is not exceeded, the bytes that leave the origin are the bytes you pay your cloud provider for, at the public-internet egress rate. Five hundred edges pulling the same 4 MB segment is 2 GB of egress per segment. With a new segment published every two seconds, that is 1 GB per second for as long as the stream is live. At AWS's standard $0.09 per GB EC2-to-public-internet rate, that is roughly $0.09 per second, $5.40 per minute, and about $324 per hour, for one rendition of one stream.

The shield solves both at once. Five hundred edge fetches become five hundred shield fetches, but the shield's cache is filled by the first one and serves the rest in microseconds; the origin sees one upstream fetch. The thundering herd is collapsed by the shield, and the bandwidth that leaves the origin shrinks by the cache hit ratio of the shield — typically 90 to 99 percent for a popular live workload, which translates into a 10× to 100× reduction in origin egress. The shield does not eliminate the cost of moving the bytes; it moves the cost from the operator's origin egress line to the CDN's per-request and per-GB line, where the prices are lower by an order of magnitude and the CDN handles the routing.
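The link between the shield's hit ratio and the origin's load is simple enough to check by hand. The following is a minimal sketch rather than a vendor formula; the function names are hypothetical, and the 500-edge and 4 MB figures are the ones used in the example above.

```python
def origin_reduction_factor(shield_hit_ratio: float) -> float:
    """How many times smaller origin traffic gets when the shield
    absorbs this share of the requests that reach it."""
    if not 0.0 <= shield_hit_ratio < 1.0:
        raise ValueError("hit ratio must be in [0, 1)")
    return 1.0 / (1.0 - shield_hit_ratio)


def origin_mb_per_segment(edges: int, segment_mb: float, shielded: bool) -> float:
    """Megabytes the origin uploads for one new segment,
    assuming every edge misses exactly once."""
    upstream_fetches = 1 if shielded else edges
    return upstream_fetches * segment_mb


print(round(origin_reduction_factor(0.90)))             # 10
print(round(origin_reduction_factor(0.99)))             # 100
print(origin_mb_per_segment(500, 4.0, shielded=False))  # 2000.0
print(origin_mb_per_segment(500, 4.0, shielded=True))   # 4.0
```

The sketch assumes an ideal shield that fills once and never evicts; a real shield sits somewhere between the two hit ratios shown.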

The Cloudflare engineering team published the same observation in their blog on concurrent streaming acceleration: request coalescing alone — even without shielding — reduces origin requests by more than 90 percent during stampedes. Add the shield's tiered cache on top of the coalescing and the residual 10 percent shrinks again. The two techniques are complementary, not redundant; almost every modern CDN runs both, and the engineer's job is to make sure both are turned on for the streaming workload.

How a Tiered Cache Hierarchy Works, Layer by Layer

The canonical streaming cache hierarchy has five named layers. Reading from the viewer toward the origin:

The player is the cache on the viewer's device. Modern players (hls.js, Shaka Player, dash.js, and the iOS native player) all buffer between 2 and 4 segments ahead; that buffer is the player's local cache and is the first place segments live.

The edge POP, or lower-tier cache, is the nearest CDN server to the viewer. An Akamai-scale operator runs more than 4,200 such POPs as of Q2 2026; Cloudflare publishes a 330+ city footprint; Fastly runs 100+ POPs sized for high throughput per location. The edge serves the bulk of the traffic — 95 to 99 percent of viewer requests if the configuration is healthy.

The regional cache, mid-tier cache, or upper-tier cache sits behind a group of edges, usually one per geographic region, and holds the long tail of less-popular content that the individual edges could not justify storing. Cloudflare implements this layer as their generic or smart tiered cache. Akamai positions Tiered Distribution at this level. Fastly's two-POP shielding pattern uses one chosen POP as both regional cache and shield depending on the configuration.

The origin shield is one chosen Region or POP that funnels every miss from every edge and every regional cache through one final cache before the origin is allowed to be asked. Cloudflare's documentation says explicitly that "only upper-tiers can ask your origin for content"; CloudFront's documentation positions Origin Shield as "the centralized point" through which requests are routed; Fastly's shielding pattern names it the "shield POP". Different vendors, identical idea.

The origin is the operator's own server — the packager, the storage layer, the just-in-time packaging tier — the only place where the canonical copy of every segment lives. For a healthy streaming workload the origin should see less than half a percent of the requests viewers issue. For an unhealthy one, it sees twenty times that.

Two questions follow. First: how many tiers should you run? Two is the minimum; the edge cannot reach the origin directly, so something has to sit between them. Three is the comfortable answer for most streaming workloads — edge, regional, shield — and is what every major commercial CDN ships as a sensible default. Four or more is common only for the largest deployments, where the hierarchy is built physically rather than logically: embedded ISP appliances at the eyeball-network edge, a CDN regional cache one hop deeper, then the operator's regional shield, then the origin. Second: what determines which Region you pick for the shield? The single rule is close to the origin, not close to the audience. A shield in us-east-1 for an origin in us-east-1 makes the shield-to-origin hop a few milliseconds; a shield in Sydney for an origin in Virginia makes that hop 150 milliseconds and adds a transpacific round trip to every cache miss. The shield serves a global audience because the regional caches serve the edges; the shield only has to be close to one thing, and that thing is the origin.
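The funnelling effect of the hierarchy can be sketched with a toy model. This is an illustration under strong assumptions (sequential requests, no TTLs, no eviction, unlimited storage), and the class names are hypothetical rather than any CDN's API.

```python
class Origin:
    """The operator's server; counts how many fetches actually reach it."""

    def __init__(self) -> None:
        self.fetches = 0

    def get(self, key: str) -> str:
        self.fetches += 1
        return f"bytes-of-{key}"


class CacheTier:
    """A cache that asks its parent (another tier or the origin) on a miss."""

    def __init__(self, parent) -> None:
        self.parent = parent
        self.store: dict[str, str] = {}

    def get(self, key: str) -> str:
        if key not in self.store:  # miss: go one hop toward the origin
            self.store[key] = self.parent.get(key)
        return self.store[key]


def origin_fetches(edge_count: int, use_shield: bool) -> int:
    """Every edge requests the same new segment once; return origin load."""
    origin = Origin()
    upstream = CacheTier(origin) if use_shield else origin
    edges = [CacheTier(upstream) for _ in range(edge_count)]
    for edge in edges:
        edge.get("segment-4172")
    return origin.fetches


print(origin_fetches(500, use_shield=False))  # 500
print(origin_fetches(500, use_shield=True))   # 1
```

Adding a regional tier is a matter of chaining one more `CacheTier` between the edges and the shield; the origin count stays at one.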

Request Collapsing, Coalescing, and the Two-Word Trick

Origin shielding works because the cache is shared. Request collapsing — also called coalescing or collapse forwarding — works because the miss is shared. The two are independent ideas that do similar work; almost every shielding CDN runs both.

The mechanism of collapsing is small enough to fit in one paragraph. When ten thousand viewers ask an edge POP for segment 4172 at the same millisecond and the edge has not yet cached it, the edge has a choice. The naive choice is to send ten thousand upstream fetches. The collapsed choice is to send one upstream fetch and park the other nine thousand nine hundred and ninety-nine on a queue, then release them all the instant the single upstream response arrives. Cloudflare, AWS CloudFront, Akamai, Fastly, and Varnish-based custom shields all implement collapsing on every cache layer. The Cloudflare engineering write-up reports origin-request reduction of more than 90 percent during stampedes from coalescing alone. Varnish Software's product page says the same thing in slightly different words: "Varnish identifies requests to the same uncached resource, queues them on a waiting list, and only sends a single request to the origin."
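The park-and-release mechanism described above can be sketched in a few lines of asyncio. This is a minimal illustration, not how any particular CDN or Varnish implements it; `CoalescingCache` and `slow_origin` are hypothetical names, and the 10 ms origin delay is an arbitrary assumption.

```python
import asyncio


class CoalescingCache:
    """Cache that sends one upstream fetch per key, however many callers wait."""

    def __init__(self, fetch_upstream):
        self._fetch_upstream = fetch_upstream
        self._store = {}
        self._in_flight = {}  # key -> Task for the single upstream fetch

    async def get(self, key):
        if key in self._store:
            return self._store[key]
        task = self._in_flight.get(key)
        if task is None:
            # First miss for this key: start the one upstream fetch.
            task = asyncio.create_task(self._fill(key))
            self._in_flight[key] = task
        # Every other caller parks here until that fetch completes.
        return await task

    async def _fill(self, key):
        try:
            body = await self._fetch_upstream(key)
            self._store[key] = body
            return body
        finally:
            self._in_flight.pop(key, None)


async def demo(viewers=10_000):
    upstream_calls = 0

    async def slow_origin(key):
        nonlocal upstream_calls
        upstream_calls += 1
        await asyncio.sleep(0.01)  # assumed origin latency
        return f"bytes-of-{key}"

    cache = CoalescingCache(slow_origin)
    bodies = await asyncio.gather(
        *(cache.get("segment-4172") for _ in range(viewers))
    )
    return upstream_calls, len(bodies)


upstream_calls, served = asyncio.run(demo())
print(upstream_calls, served)  # 1 10000
```

If the upstream fetch fails, every parked caller receives the same exception and the in-flight entry is cleared, so the next request retries. Production caches add timeouts and negative caching on top of this.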

Why the two techniques are not the same idea is worth a paragraph. A shield hit is a cache hit: the content is already in the shield, and the edge does not have to wait for the origin. A coalesced request is a cache miss: the content is not in the cache and the cache is fetching it once on behalf of many. The shield removes traffic from the origin's bill; coalescing removes load from the origin's request-per-second budget. The combination removes both: most segments are shield hits and never reach the origin; the few that miss the shield are coalesced into a single upstream fetch.

The configuration rule is "turn both on, in every tier". An origin shield without coalescing still sends bursts of duplicate misses upstream from each individual edge that has not yet warmed its cache for the segment; coalescing without a shield still sends one origin fetch from every edge that misses. Each technique catches the failure mode the other misses. Real CDNs default to both being on for streaming-marked properties; the engineer's job is to verify, not assume.

Two-row diagram showing thundering herd of viewers on the top row converging on a single edge, and on the bottom row the same herd collapsed by the edge's coalescing logic into a single upstream fetch.

Figure 2. Request collapsing at the edge. Ten thousand viewers ask for the same uncached segment; one upstream fetch is made while the other 9,999 requests are parked, and all ten thousand responses are served from the single upstream response.

The Math: Show Your Work

Take a worked example, drawn from the class of projects we ship. A sports streamer carries 50,000 concurrent viewers, each at an average video bitrate of 3 megabits per second (abbreviated Mbps), delivered as 4-second HLS segments. Each viewer issues one segment fetch every four seconds, so the system issues 50,000 ÷ 4 = 12,500 segment fetches per second.

Run without a shield. Edge POPs sit in front of the origin directly. The edge cache hit ratio runs at 88 percent — this is a typical live number; segments are new every few seconds and the edge has to warm its cache from cold roughly every segment boundary. The remaining 12 percent of fetches escape the edge tier:

edge misses per second = 12,500 × (1 − 0.88) = 1,500 fetches/sec

Each missed fetch retrieves a 4-second segment averaging 1.5 megabytes at 3 Mbps (3 Mbps × 4 s ÷ 8 = 1.5 MB). The origin must therefore upload 1,500 × 1.5 = 2.25 gigabytes per second to the CDN. Over a 30-day month of continuous traffic at that rate, the bill is:

egress per month = 2.25 GB/s × 86,400 s/day × 30 days ≈ 5.83 PB/month
cost at $0.09/GB     ≈ $524,000/month

Real-world live streaming is not continuous; the example above is the peak-rate steady-state bill, useful as a ceiling. Replace 30 days of continuous peak traffic with the equivalent of 4 hours of peak traffic per day and the bill becomes 1/6 of the steady state, or roughly $87,000 per month, which is closer to a real production sports-streamer number.

Now turn the shield on. Configure one Region, say eu-west-1 for an origin hosted in or near that Region, as the origin shield, and route every edge miss through the shield first. The shield's own cache hit ratio runs at roughly 92 percent for this workload, because the shield sees the combined miss stream of every edge in the world for the same handful of segments and almost always has the segment by the time the second edge asks. The arithmetic:

shield misses per second = 1,500 × (1 − 0.92) = 120 fetches/sec
egress per second        = 120 × 1.5 MB     = 180 MB/s
egress per month         = 180 MB/s × 86,400 × 30 ≈ 467 TB/month
cost at $0.09/GB         ≈ $42,000/month  (steady-state ceiling)
or, at 4 h/day of peak    ≈ $7,000/month

The origin egress drops by a factor of 12.5, exactly the ratio 0.12 / 0.0096, where 0.0096 = 0.12 × 0.08 is the residual miss rate after both the edge and the shield have done their work. The CDN bill stays approximately the same, because the bytes still leave the CDN to the viewer; what changes is the volume on the origin-to-shield leg, which is the only hop the operator is billed for at public-egress rates. Net saving in this worked example: more than $80,000 per month on a typical sports-streamer cadence, for one afternoon of configuration work.
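To rerun the arithmetic with your own numbers, the worked example can be captured in a short script. This is a sketch under the same assumptions as the text (decimal units, a flat $0.09 per GB, a constant hit ratio); the `Workload` class and function names are hypothetical.

```python
from dataclasses import dataclass

SECONDS_PER_MONTH = 86_400 * 30
EGRESS_USD_PER_GB = 0.09  # public-internet egress rate used in the text


@dataclass
class Workload:
    viewers: int = 50_000
    bitrate_mbps: float = 3.0
    segment_seconds: float = 4.0
    edge_hit_ratio: float = 0.88

    @property
    def segment_mb(self) -> float:
        # megabits per second x seconds / 8 bits per byte
        return self.bitrate_mbps * self.segment_seconds / 8

    @property
    def edge_misses_per_second(self) -> float:
        fetches_per_second = self.viewers / self.segment_seconds
        return fetches_per_second * (1 - self.edge_hit_ratio)


def monthly_origin_cost(w: Workload, shield_hit_ratio: float = 0.0,
                        peak_hours_per_day: float = 24.0) -> float:
    """Origin egress bill in USD; shield_hit_ratio=0.0 means no shield."""
    origin_fetches = w.edge_misses_per_second * (1 - shield_hit_ratio)
    gb_per_second = origin_fetches * w.segment_mb / 1000
    gb_per_month = gb_per_second * SECONDS_PER_MONTH * (peak_hours_per_day / 24)
    return gb_per_month * EGRESS_USD_PER_GB


w = Workload()
print(f"{monthly_origin_cost(w):,.0f}")                              # 524,880
print(f"{monthly_origin_cost(w, peak_hours_per_day=4):,.0f}")        # 87,480
print(f"{monthly_origin_cost(w, 0.92):,.0f}")                        # 41,990
print(f"{monthly_origin_cost(w, 0.92, peak_hours_per_day=4):,.0f}")  # 6,998
```

Changing `edge_hit_ratio` or the shield hit ratio shows how sensitive the bill is to each tier; the ratio between the shielded and unshielded bills depends only on the shield's hit ratio.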

The same arithmetic applies, at different scales, to every live workload. Run it with your own numbers before you talk to your CDN sales engineer. The point of running it is to know what you are looking for in the production dashboard once the shield is on; without the number in hand you cannot tell whether the shield is working as advertised.

Bar chart comparing origin egress (Gbps) and origin egress bill ($K per month) for the worked sports-streamer example, with and without origin shield.

Figure 3. The same 50,000-viewer live event with and without origin shield. Origin egress drops more than tenfold; the CDN-side spend is functionally unchanged.

The Names per Vendor — a Field Guide

The same idea ships under different names from each CDN. The table below maps the marketing words to the architectural concepts so you can find the right control panel quickly.

Vendor	Their name for the shield layer	Their name for the mid tier	Default for streaming?	Notes
AWS CloudFront	Origin Shield	Regional Edge Caches	Off; manual enable per origin	Pick the Region closest to the origin. AWS publishes per-Region pricing for shield requests. Compatible with Multi-CDN: CloudFront's shield can sit in front of an origin behind multiple CDNs.
Cloudflare	Smart Tiered Cache (upper tier)	Regional Tiered Cache (middle tier)	Plan-dependent; verify per zone	"Smart" picks the closest upper tier per origin automatically; "Custom" lets the support team build a topology. Cache Reserve is the persistent-tier addition launched in 2022.
Fastly	Shield POP	(Shield POP also serves as mid tier)	Configurable per service	Shielding designates one Fastly POP as the upstream for every other POP, funnelling all uncached requests through it.
Akamai	Tiered Distribution Map (TDM)	Parent tier	On by default for AMD (Adaptive Media Delivery)	SureRoute optimises non-cacheable paths; Tiered Distribution handles cacheable streaming content.
Google Media CDN	Origin shield (integrated)	Region-tiered caching	Streaming-aware default	Couples with Google's eyeball-network peering for the last hop.
Varnish (self-host)	Origin Shield	Layered Varnish topology	Manual	The reference open-source software shield; widely deployed in front of multi-CDN origins.
Bunny.net	Origin Shield	Perma-Cache	Off by default	Smaller-footprint CDN; pricing competitive on the per-GB axis.

A few notes on what to do with that table. First, "default on" matters: Akamai's Adaptive Media Delivery turns Tiered Distribution on for cacheable streaming content out of the box, so a team that lifts and shifts onto AMD inherits a tiered topology for free; an AWS CloudFront origin behind a generic distribution does not, and the engineer has to enable Origin Shield manually per origin. Second, the Region you pick for the shield is "closest to the origin" — every vendor's documentation says the same thing, and the wrong choice (a shield far from the origin) reverses the savings because every cache miss now adds a long round trip on the operator's expensive leg. Third, the price of the shield itself is real but small: AWS charges roughly $0.0035 per 10,000 shield requests in 2026 (varies by Region), and most teams find the request-side fees are recovered ten times over by the egress savings on the origin leg.
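The claim that the shield's request fees are small next to the egress savings can be checked against the worked sports-streamer numbers. This is a rough sketch: it uses the $0.0035 per 10,000 requests figure quoted above, and it assumes that every edge miss becomes exactly one billable shield request, which is a simplification rather than a statement of how AWS meters the feature.

```python
SHIELD_USD_PER_10K_REQUESTS = 0.0035  # figure quoted in the text; varies by Region
EGRESS_USD_PER_GB = 0.09
SECONDS_PER_MONTH = 86_400 * 30


def shield_request_fee(edge_misses_per_second: float) -> float:
    """Monthly shield request fee, assuming one shield request per edge miss."""
    requests = edge_misses_per_second * SECONDS_PER_MONTH
    return requests / 10_000 * SHIELD_USD_PER_10K_REQUESTS


def egress_saving(edge_misses_per_second: float, segment_mb: float,
                  shield_hit_ratio: float) -> float:
    """Monthly origin egress cost avoided because the shield answered."""
    gb_saved_per_second = edge_misses_per_second * shield_hit_ratio * segment_mb / 1000
    return gb_saved_per_second * SECONDS_PER_MONTH * EGRESS_USD_PER_GB


fee = shield_request_fee(1_500)
saving = egress_saving(1_500, 1.5, 0.92)
print(f"{fee:,.0f}")     # 1,361
print(f"{saving:,.0f}")  # 482,890
```

At continuous peak traffic the saving is several hundred times the fee in this example, comfortably above the tenfold recovery mentioned above; the ratio does not change when both figures are scaled to four peak hours a day.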

Common Pitfalls That Quietly Cost You the Savings

The five mistakes below are the ones we see on every audit. None of them are difficult to fix; all of them are easy to miss in the original configuration and easy to introduce again the next time someone touches the property.

Pitfall one: a cache key that includes a per-viewer query parameter. The edge and the shield both look up cached responses by cache key, a tuple of host, path, and selected query parameters and headers. A cache key that includes ?token=abc123 or ?sessionId=xyz789 produces a different key for every viewer, fragmenting what should be one shared cache entry into 100,000 single-viewer copies, none of which the next viewer can use. Edge cache hit ratio collapses; shield cache hit ratio collapses; the origin sees a fetch for every viewer. The fix is to verify signed-URL tokens at the edge as a separate authorization step and exclude them from the cache key. Every CDN's documentation explains how. Defaults differ by vendor and by policy: CloudFront's managed CachingOptimized policy leaves query strings out of the cache key, while Cloudflare's standard cache level and Fastly's default hash include the full query string. Read the property's cache-key configuration and make sure every per-viewer parameter is excluded.
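The effect of excluding per-viewer parameters can be sketched as a key-normalisation function. This illustrates the idea only and is not any CDN's actual key algorithm; the parameter names in `PER_VIEWER_PARAMS` are the hypothetical ones from the example above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical per-viewer parameters that must not fragment the cache.
PER_VIEWER_PARAMS = {"token", "sessionid"}


def cache_key(url: str) -> str:
    """Build a shared cache key from host, path, and the remaining query."""
    parts = urlsplit(url)
    shared = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in PER_VIEWER_PARAMS
    )
    key = f"{parts.netloc.lower()}{parts.path}"
    query = urlencode(shared)
    return f"{key}?{query}" if query else key


a = cache_key("https://cdn.example.com/live/seg4172.ts?token=abc123")
b = cache_key("https://cdn.example.com/live/seg4172.ts?sessionId=xyz789&token=zzz")
print(a == b)  # True
print(a)       # cdn.example.com/live/seg4172.ts
```

Sorting the surviving parameters also prevents `?a=1&b=2` and `?b=2&a=1` from producing two cache entries for the same object.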

Pitfall two: a per-viewer-customised manifest. Server-side ad insertion — abbreviated SSAI — rewrites the HLS or DASH manifest per viewer to splice in personalised ad segments. Done naively, the manifest URL itself becomes per-viewer (different query string or different path), and the shield cannot share manifests. The segments stay cacheable but the manifest does not, which still costs you the manifest egress and the request rate. The 2026 fix is server-guided ad insertion — abbreviated SGAI — where the manifest stays shared and the ad personalisation happens at the player level via DASH event streams or HLS interstitials; this preserves the cache shape. Article 9.6 in the Block 9 series covers SSAI vs SGAI in detail.

Pitfall three: short or absent TTLs on VOD assets. A Cache-Control: max-age=60 on a movie that has not changed in three years is wasteful: the edge re-validates the asset every minute, the shield re-validates every minute, and the origin sees a steady drip of conditional GETs from every CDN node that is currently serving it. The fix is max-age in days or weeks for VOD; the operator can still purge by versioned filename when a re-encode happens, which is the cleaner pattern anyway. The other direction is also a pitfall: a long max-age on a live segment that has been superseded is wrong, but live segments naturally fall out of the DVR window in minutes and rarely matter.
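The size of that steady drip is easy to estimate. The sketch below gives an upper bound, assuming every node serves the asset continuously and revalidates the moment its copy expires; the 500-node count and the seven-day TTL are illustrative assumptions.

```python
SECONDS_PER_DAY = 86_400


def conditional_gets_per_day(serving_nodes: int, max_age_seconds: int) -> float:
    """Upper bound on revalidation requests per day for one asset."""
    return serving_nodes * SECONDS_PER_DAY / max_age_seconds


print(round(conditional_gets_per_day(500, 60)))               # 720000
print(round(conditional_gets_per_day(500, 7 * SECONDS_PER_DAY)))  # 71
```

The same asset goes from hundreds of thousands of revalidations a day to a few dozen, which is why a versioned filename plus a long max-age is the cleaner pattern for VOD.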

Pitfall four: a shield in the wrong Region. As above, the shield's only job is to be close to the origin, not to be close to the audience. A team that picks a shield Region "in Europe because most of our traffic is there" but whose origin is in us-east-1 has added a transatlantic hop on every cache miss to the bill they were trying to reduce. The diagnostic is simple: open the dashboard and look at the shield-to-origin RTT. If it is more than 50 milliseconds for a normal cloud region the shield is in the wrong place.
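The diagnostic reduces to two small checks once the round-trip times are measured. This is a sketch with hypothetical RTT values; only the 50 ms threshold comes from the text.

```python
RTT_LIMIT_MS = 50.0  # threshold suggested in the text


def closest_shield_region(rtt_to_origin_ms: dict[str, float]) -> str:
    """Pick the candidate Region with the lowest measured RTT to the origin."""
    return min(rtt_to_origin_ms, key=rtt_to_origin_ms.get)


def shield_is_misplaced(current_region: str,
                        rtt_to_origin_ms: dict[str, float]) -> bool:
    """True when the configured shield sits too far from the origin."""
    return rtt_to_origin_ms[current_region] > RTT_LIMIT_MS


# Hypothetical measurements for an origin hosted in us-east-1.
measured = {"us-east-1": 2.0, "eu-west-1": 75.0, "ap-southeast-2": 200.0}

print(closest_shield_region(measured))             # us-east-1
print(shield_is_misplaced("eu-west-1", measured))  # True
print(shield_is_misplaced("us-east-1", measured))  # False
```

Audience location never appears as an input, which is the point of the pitfall: only the distance to the origin matters.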

Pitfall five: assuming the default is sensible. Several major CDNs ship the shield turned off for new distributions. AWS CloudFront defaults Origin Shield to off and requires the operator to enable it per origin. Cloudflare's Smart Tiered Cache is on by default for some plan tiers and off for others. A team that pointed DNS at the CDN and walked away may be running on the no-shield default for months without noticing. The diagnostic is to compare the origin's egress dashboard against the CDN's edge-egress dashboard; if the ratio is anywhere close to 1:1, the shield is not engaged and the operator is paying for traffic the shield should be absorbing.
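For CloudFront specifically, the "verify, do not assume" step can be scripted against the distribution configuration. The sketch below works on a plain dictionary shaped like CloudFront's DistributionConfig, where each origin carries an `OriginShield` block with `Enabled` and `OriginShieldRegion`. The sample data and origin names are hypothetical, and fetching the real configuration (for example through the AWS SDK) is left out.

```python
def origins_without_shield(distribution_config: dict) -> list[str]:
    """Return the Ids of origins whose Origin Shield is missing or disabled."""
    unshielded = []
    for origin in distribution_config.get("Origins", {}).get("Items", []):
        shield = origin.get("OriginShield") or {}
        if not shield.get("Enabled", False):
            unshielded.append(origin["Id"])
    return unshielded


# Hypothetical configuration with one shielded and one unshielded origin.
sample = {
    "Origins": {
        "Quantity": 2,
        "Items": [
            {
                "Id": "packager",
                "DomainName": "origin.example.com",
                "OriginShield": {"Enabled": True, "OriginShieldRegion": "us-east-1"},
            },
            {
                "Id": "vod-bucket",
                "DomainName": "vod.example.com",
                "OriginShield": {"Enabled": False},
            },
        ],
    }
}

print(origins_without_shield(sample))  # ['vod-bucket']
```

Running a check like this in CI catches the case where someone adds a new origin and inherits the shield-off default.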

Where Multi-CDN Changes the Picture

When the operator runs more than one CDN — the topic of article 6.4 — the shield question gets a second answer. Each CDN has its own shield, and each shield only sees the misses from its own edges. Run two CDNs in front of one origin and the origin still sees two upstream miss streams, one per CDN, without coordination. The standard remedy is a shared origin shield: a single shield placed in front of the origin and behind both CDNs, so that both CDNs' miss streams pass through one cache before reaching the origin. Varnish, Akamai Cloud Wrapper, and bespoke origin-shield clusters built on nginx, Squid, or Apache Traffic Server are all common implementations of this pattern; AWS publishes a reference architecture in which one CloudFront Origin Shield in a single Region serves as the shared shield for multi-CDN deployments where CloudFront is one of the CDNs.
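The difference between per-CDN shields and a shared shield can be counted directly. This is a toy model that assumes each CDN's own shield already deduplicates within that CDN and that nothing is ever evicted; the CDN names are placeholders.

```python
def origin_fetches(misses: list[tuple[str, str]], shared_shield: bool) -> int:
    """Count origin fetches for a list of (cdn_name, segment) miss events.

    Without a shared shield the origin sees one fetch per CDN per segment;
    with one it sees a single fetch per segment across all CDNs.
    """
    if shared_shield:
        return len({segment for _, segment in misses})
    return len(set(misses))


# Two CDNs, each missing on the same 100 live segments.
misses = [(cdn, f"seg-{n}") for cdn in ("cdn-a", "cdn-b") for n in range(100)]

print(origin_fetches(misses, shared_shield=False))  # 200
print(origin_fetches(misses, shared_shield=True))   # 100
```

With N CDNs the origin load without a shared shield grows linearly in N, which is why the shared shield matters more as CDNs are added.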

The trade-off is operational. A shared shield is a single point through which every miss from every CDN must pass, which means a single point that must be sized, monitored, and made highly available. The shared shield is itself often deployed across multiple Availability Zones — AWS documents that "all Origin Shield regions are built using a highly-available architecture that spans several Availability Zones and includes automatic failover to secondary Origin Shield regions" — but the operator still needs to confirm the failover works under load before the day of the first big event.

Cloud Wrapper, Akamai's commercial implementation of this pattern, names the trade-off explicitly: it "maintains shared cacheability across CDNs" and "centrally caches across CDNs, improving cache hits and collapsing requests from multiple CDNs before going forward to origin" — the same trick at a higher tier. Article 6.4 covers the architectural choices in detail.

Three Production-Grade Reference Architectures

A short tour of three shapes we ship for streaming projects in 2026. Each architecture is a real, deployable topology; the choice between them is driven by audience scale, geographic reach, and the operator's appetite for operational complexity.

Single-CDN with a shield, one origin. The smallest credible production architecture for a live streaming workload. One CDN (AWS CloudFront, Cloudflare, Fastly, or Akamai), Origin Shield enabled in the Region closest to the origin, one origin behind a single load balancer. This is what 80 percent of mid-sized streaming products run, and it is enough for audiences of tens of thousands of concurrent viewers. The configuration takes one engineer one day, including the dashboard wiring; the recurring cost saving over the no-shield baseline is the egress reduction shown earlier in the worked example: 12.5× there, and anywhere from 10× to 100× depending on the shield's hit ratio.

Multi-CDN with a shared shield in front of one origin. The standard pattern for products of hundreds of thousands to a few million concurrent viewers. Two or three CDNs (commonly Akamai + CloudFront, or Fastly + Cloudflare + a regional CDN), each pointed at a shared shield deployed on Varnish or AWS CloudFront-as-shield, the shield pointed at a single origin. The steering layer — content steering for HLS and DASH (RFC-flavoured spec described in article 6.5) or a third-party DNS-based router — decides which CDN each viewer uses. The shared shield ensures that the origin sees one combined miss stream rather than two or three. Operational complexity goes up: there is now a steering layer to monitor, a shield to size, and three or four cache tiers to debug.

Multi-CDN with embedded ISP appliances. The architecture of the very largest streaming operators — Netflix Open Connect, YouTube Edge Cache. Embedded appliances sit physically inside ISP networks, serve only the operator's content, and bypass the public internet entirely for the last hop. The embedded appliances are themselves a cache tier, ahead of the CDN edges; the CDN edges are themselves ahead of the operator's regional shield; the regional shield is in front of the origin. Four physical tiers. Netflix's published engineering position is that Open Connect Appliances "have the same capabilities as the OCAs that we use in our 60+ global data centres", and the model achieves a 99 percent+ cache hit ratio for popular titles. The architecture is the gold standard for performance and cost and is realistically available only to operators large enough that ISPs accept a custom appliance from them.

Where Fora Soft Fits In

Fora Soft has been building video streaming infrastructure since 2005, and the cache hierarchy is a layer we touch on every project — live and on-demand video for OTT and Internet TV platforms, low-latency streaming for sports and esports, video conferencing and telemedicine systems where a small population of viewers still benefits from an origin shield on the recording leg, e-learning platforms that mix VOD lectures with live workshops, and surveillance and AR/VR experiences with their tighter latency budgets. We design tiered topologies on AWS CloudFront, Cloudflare, Fastly, Akamai, Bunny, and Google Media CDN, build shared shields on Varnish for multi-CDN customers, and instrument the cache hit ratio per tier so the operator's CFO can see the savings on the dashboard the day after the shield is activated.

Call to action

Talk to a streaming engineer — book a 30-minute scoping call to talk through your origin shield plan.
See our case studies — 250+ shipped projects across video streaming, WebRTC, OTT, telemedicine, e-learning, surveillance, and AR/VR.
Download the Origin Shield Configuration Checklist — Single-page reference: the five-layer cache hierarchy, per-vendor knob names for AWS CloudFront, Cloudflare, Fastly, Akamai, Google Media CDN and Bunny, the five common pitfalls with one-line fixes, and the math worksheet for projecting….

References

AWS, "Use Amazon CloudFront Origin Shield", AWS CloudFront Developer Guide, accessed 2026-05-24. <https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/origin-shield.html> — Origin Shield definition, positioning, Region selection, and multi-AZ high-availability statement. The controlling vendor documentation for CloudFront's shield layer.
AWS, "Amazon CloudFront for Media — Best Practices for Streaming Media Delivery", AWS whitepaper, accessed 2026-05-24. <https://d1.awsstatic.com/whitepapers/amazon-cloudfront-for-media.pdf> — Reference architecture for the cache hierarchy and shield positioning for video workloads.
AWS Networking & Content Delivery Blog, "Using CloudFront Origin Shield to protect your origin in a multi-CDN deployment", accessed 2026-05-24. <https://aws.amazon.com/blogs/networking-and-content-delivery/using-cloudfront-origin-shield-to-protect-your-origin-in-a-multi-cdn-deployment/> — Reference architecture in which one CloudFront Origin Shield sits in front of multi-CDN origins.
AWS Media Blog, "Grabyo optimizes live cloud production with Amazon CloudFront Origin Shield", accessed 2026-05-24. <https://aws.amazon.com/blogs/media/grabyo-optimizes-live-cloud-production-with-amazon-cloudfront-origin-shield/> — Production case with p99 latency under two seconds and reduced origin load after enabling Origin Shield.
Cloudflare, "Tiered Cache", Cloudflare Cache (CDN) developer documentation, accessed 2026-05-24. <https://developers.cloudflare.com/cache/how-to/tiered-cache/> — Definition of the lower-tier / upper-tier topology and the "only upper-tiers can ask your origin for content" rule.
Cloudflare Engineering, "Live video just got more live: Introducing Concurrent Streaming Acceleration", Cloudflare blog. <https://blog.cloudflare.com/introducing-concurrent-streaming-acceleration/> — Request coalescing reduces origin requests by more than 90 percent during live-stream stampedes; behaviour at the edge for live video.
Cloudflare Engineering, "Introducing: Smarter Tiered Cache Topology Generation", Cloudflare blog, accessed 2026-05-24. <https://blog.cloudflare.com/introducing-smarter-tiered-cache-topology-generation/> — Smart Tiered Cache's automatic upper-tier selection per origin.
Cloudflare Engineering, "Reduce latency and increase cache hits with Regional Tiered Cache", Cloudflare blog, accessed 2026-05-24. <https://blog.cloudflare.com/introducing-regional-tiered-cache/> — Regional Tiered Cache positioning as a middle layer between edge and upper tier.
Fastly Documentation, "Shielding", Fastly Documentation, accessed 2026-05-24. <https://www.fastly.com/documentation/guides/getting-started/hosts/shielding/> — Fastly's shielding pattern; designating a single POP as the shield and routing all uncached requests through it; the "upwards of 99% of requests handled at the Fastly edge" benchmark.
Fastly Engineering, "Let the edge work for you: How shielding improves performance", Fastly blog, accessed 2026-05-24. <https://www.fastly.com/blog/let-the-edge-work-for-you-how-shielding-improves-performance> — Shielding architecture, request collapsing behaviour, and the role of the shield POP as the single mid-tier in front of the origin.
Akamai Tech Docs, "Tiered Distribution", Akamai Property Manager documentation, accessed 2026-05-24. <https://techdocs.akamai.com/property-mgr/docs/tiered-dist> — Tiered Distribution auto-enabled for Adaptive Media Delivery; cacheable-content path through the parent tier.
Akamai Tech Docs, "SureRoute", Akamai Property Manager documentation, accessed 2026-05-24. <https://techdocs.akamai.com/property-mgr/docs/sureroute-beh> — SureRoute's role for non-cacheable content paths; complementarity with Tiered Distribution.
Akamai, "Cloud Wrapper", Akamai product page, accessed 2026-05-24. <https://www.akamai.com/products/cloud-wrapper> — Shared cacheability across multi-CDN deployments; the central caching layer that collapses requests from multiple CDNs.
Varnish Software, "Request coalescing and other reasons to use Varnish as origin shield", Varnish Software blog, accessed 2026-05-24. <https://info.varnish-software.com/blog/request-coalescing-and-other-reasons-to-use-varnish-as-an-origin-shield> — Mechanics of request coalescing in an origin shield; the multi-CDN shielding pattern.
IETF RFC 9111, "HTTP Caching", R. Fielding, M. Nottingham, J. Reschke, June 2022. <https://www.rfc-editor.org/rfc/rfc9111.html> — The standards document for HTTP shared caches, the s-maxage directive, and the cache-key semantics that govern shielding behaviour at the protocol level. Cited per §4.3.2: the spec defines what "shared cache" means; the vendor blogs above describe what each vendor's implementation of it does.
Netflix, "Open Connect — Overview", Netflix Open Connect documentation, accessed 2026-05-24. <https://openconnect.netflix.com/Open-Connect-Overview.pdf> — Embedded ISP appliance model, OCA capability parity statement, and the 99%+ cache hit ratio engineering posture for popular titles.
OTTVerse, "CDN Request Collapsing and the Thundering Herds Problem Simplified", accessed 2026-05-24. <https://ottverse.com/request-collapsing-thundering-herds-in-cdn/> — Practitioner-level walkthrough of request collapsing and the thundering-herd failure mode for live streaming.

Related glossary terms

Shaka Player

Tiered caching

Cache hit ratio (CHR)

Peering

Live streaming

Content steering

Origin shielding

Cache key