Why this matters
If you run, or are about to build, an ad-supported streaming service — advertising video on demand (AVOD), a free ad-supported streaming TV (FAST) channel, or any hybrid that shows ads — the CSAI-versus-SSAI choice quietly decides three things you care about: how much ad revenue actually reaches your books, how good the viewing experience feels at every commercial break, and how much engineering you sign up for. Get it wrong on the web and ad-blockers erase a chunk of your revenue; get it wrong on a TV and clumsy ad transitions train viewers to look away. Connected-TV (CTV) advertising in the US is forecast at roughly $37.95 billion in 2026, and in 2026 CTV upfront commitments are projected to exceed primetime linear TV for the first time (eMarketer/industry forecasts, 2026) — so the plumbing that puts an ad into a stream is now a first-order revenue system, not a detail. This article is the builder's guide to that plumbing, and the companion to the OTT monetization map, which shows where ad revenue sits among all the ways a platform earns.
The one distinction: where the ad meets the video
Every other difference in this article follows from a single question. When a viewer reaches a commercial break, who joins the ad to the show, and where?
In client-side ad insertion — CSAI, the method built into most web and mobile video players — the joining happens on the viewer's device. The app's video player has a small advertising component (an "ad SDK"). When the show reaches an ad cue, the player pauses the content, sends a request out to an ad server (the system that decides which ad to show and hands back the ad's video file), downloads that ad, plays it in a separate ad player, then switches back to the show. The ad is a distinct event the device fetches and plays on its own.
In server-side ad insertion — SSAI, the method that dominates connected-TV — the joining happens on the platform's servers, before the video reaches the device. A stitching service splices the ad's video segments into the content stream so that, by the time the player receives the stream, the ad is already part of it. The player just keeps playing; it never makes a separate ad call and often cannot tell where the show ends and the ad begins.
Here is the analogy to hold onto. CSAI is a TV that, at each break, stops the show and dials out to a sponsor to ask "what should I play now?" — quick when the line is clear, awkward when it isn't, and easy for someone to cut the phone line. SSAI is the old broadcast model: the ads are spliced into the reel at the station, so the same continuous tape plays the show and the ads with no dialing out at all. Everything below — ad-blocker resistance, playback smoothness, measurement difficulty, build cost — is a consequence of that one choice of where the splice happens.
Figure 1. The one distinction. In CSAI the device calls the ad server itself and plays the ad separately; in SSAI a server-side stitcher bakes the ad into the same stream the player already trusts.
How CSAI works, step by step
Walk a single ad break in client-side insertion, because the mechanics explain both its strengths and its failure modes.
The player reaches an ad cue — a marker that says "a break goes here." The ad SDK builds a request and sends it to the ad server using a standard called VAST, short for Video Ad Serving Template — the XML format, maintained by the IAB Tech Lab, that an ad server uses to describe an ad to a player: where the video file is, what tracking to fire, what to do on a click (IAB Tech Lab VAST; current line VAST 4.3, December 2022, with the VAST CTV Addendum 2024). The ad server replies with a VAST document. The player reads it, downloads the ad creative, plays it in its ad player, and as the ad runs it fires tracking beacons — little web requests that report "the ad started," "25% watched," "50%," "complete" — back to the ad server and any measurement vendors. When the ad finishes, the player tears the ad player down and resumes the show.
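To make the VAST round trip concrete, here is a minimal sketch of the step where the player reads the ad server's reply. The sample document is a hypothetical, heavily trimmed VAST response (the URLs and ad id are invented), and `parse_vast` is an illustrative helper, not part of any real ad SDK. It pulls out the three things the paragraph above describes: the creative's media file, the impression URL, and the tracking beacons keyed by event.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed VAST response. Real documents carry many more fields.
SAMPLE_VAST = """<VAST version="4.2">
  <Ad id="demo-ad">
    <InLine>
      <Impression><![CDATA[https://ads.example.com/imp]]></Impression>
      <Creatives>
        <Creative>
          <Linear>
            <Duration>00:00:15</Duration>
            <TrackingEvents>
              <Tracking event="start"><![CDATA[https://ads.example.com/t?e=start]]></Tracking>
              <Tracking event="firstQuartile"><![CDATA[https://ads.example.com/t?e=q1]]></Tracking>
              <Tracking event="midpoint"><![CDATA[https://ads.example.com/t?e=mid]]></Tracking>
              <Tracking event="thirdQuartile"><![CDATA[https://ads.example.com/t?e=q3]]></Tracking>
              <Tracking event="complete"><![CDATA[https://ads.example.com/t?e=done]]></Tracking>
            </TrackingEvents>
            <MediaFiles>
              <MediaFile delivery="progressive" type="video/mp4" width="1280" height="720"><![CDATA[https://cdn.example.com/ad-720.mp4]]></MediaFile>
            </MediaFiles>
          </Linear>
        </Creative>
      </Creatives>
    </InLine>
  </Ad>
</VAST>"""


def parse_vast(xml_text):
    """Extract media files, impression URLs, and tracking beacons from a VAST document."""
    root = ET.fromstring(xml_text)
    media_files = [m.text.strip() for m in root.iter("MediaFile")]
    impressions = [i.text.strip() for i in root.iter("Impression")]
    tracking = {}
    for node in root.iter("Tracking"):
        # Several beacons may share one event name, so collect them in a list.
        tracking.setdefault(node.get("event"), []).append(node.text.strip())
    return {"media_files": media_files, "impressions": impressions, "tracking": tracking}


ad = parse_vast(SAMPLE_VAST)
print(ad["media_files"])
print(sorted(ad["tracking"]))
```

A real player would then download the media file and request each tracking URL when playback crosses the matching point. Wrapper ads, error handling, and timeouts are left out of this sketch.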
That round trip is the source of CSAI's two real problems. First, it is a call the device makes to a known ad domain, so it is blockable. An ad-blocker is mostly a list of ad-server domains plus a rule: drop any request to them. Because CSAI's ad request goes from the viewer's browser to an ad server the blocker recognizes, the blocker simply cancels it, the player gets no ad, and the impression — and its revenue — never happens. Second, the round trip takes time at the worst possible moment. Between tearing down the content and having an ad buffered, the player must resolve the ad domain, complete the request, parse the VAST, fetch the creative, and fill its buffer. On a clean connection that is a beat; on a congested phone it is one to three seconds of spinner, or a timeout that shows a blank slate. Every break is a small chance to stall.
CSAI's strengths come from the same architecture. Because the ad runs in a real player on the device, it can be interactive and clickable (a "tap to learn more" overlay, an expandable unit) through SIMID (Secure Interactive Media Interface Definition), the interactive-ad standard that replaced the older VPAID. And because the device itself plays the ad, the client knows exactly what happened: precise viewability, exact completion, device and session signals, easy per-device frequency capping (limiting how many times one viewer sees the same ad). CSAI keeps the richest data and the most interactivity. It just pays for that with ad-blocker exposure and a stutter at every break.
How SSAI works, step by step
Server-side insertion moves the whole join to the platform. To do that it needs a new component the client side never had: a stitcher, also called a manifest manipulator.
Start with the manifest — the small text playlist that tells a video player which short video chunks to download and in what order. Modern streaming (covered in packaging: CMAF, HLS, and DASH) cuts video into segments of a few seconds and lists them in this manifest; the two manifest formats are HLS (Apple's, RFC 8216) and MPEG-DASH (ISO/IEC 23009-1). The player just fetches whatever segments the manifest names. The stitcher's trick is to rewrite that manifest so it lists ad segments alongside content segments. For the cross-section mechanics of those manifest formats, see HLS vs DASH in the Video Streaming section; here we care only about what the stitcher does to them.
The sequence is the same at every break. The live or VOD stream carries an in-stream ad marker — an SCTE-35 cue, the broadcast standard that signals "an ad break starts here, this long" (ANSI/SCTE 35, covered in depth in SCTE-35 and ad signaling). The stitcher sees the cue and, for this viewer's session, calls the ad decision server with a VAST request — the same VAST standard CSAI uses, but now the server makes the call, not the device. The ad server returns the ad. The stitcher conditions the ad — transcodes it to match the exact resolutions and bitrates of the content's quality ladder, so it slots in without a jarring quality jump — and rewrites the manifest so the ad's segments appear in line with the content's, marked with a discontinuity tag (#EXT-X-DISCONTINUITY in HLS) that tells the player "the stream properties change here, keep playing." The player downloads ad segments exactly as it downloads content segments: same server, same format, same buffer.
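The manifest rewrite described above can be sketched in a few lines. This is a simplified illustration, not how any particular stitcher is implemented: `stitch_ad_break`, the file names, and the break position are all assumptions, and the ad segments are presumed to be already conditioned to the content's ladder. It inserts the ad segments into an HLS media playlist after a chosen content segment and brackets them with #EXT-X-DISCONTINUITY tags.

```python
CONTENT_PLAYLIST = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:6
#EXT-X-MEDIA-SEQUENCE:0
#EXTINF:6.000,
content_000.ts
#EXTINF:6.000,
content_001.ts
#EXTINF:6.000,
content_002.ts
#EXT-X-ENDLIST""".splitlines()

# (duration in seconds, URI) pairs for an already-conditioned ad. Hypothetical names.
AD_SEGMENTS = [(5.0, "ad_000.ts"), (5.0, "ad_001.ts")]


def stitch_ad_break(playlist_lines, ad_segments, after_segment):
    """Return a new playlist with the ad spliced in after the Nth content segment."""
    stitched = []
    segments_seen = 0
    for line in playlist_lines:
        stitched.append(line)
        # In a media playlist, a non-empty line that is not a tag is a segment URI.
        is_segment_uri = bool(line) and not line.startswith("#")
        if not is_segment_uri:
            continue
        segments_seen += 1
        if segments_seen == after_segment:
            stitched.append("#EXT-X-DISCONTINUITY")  # content -> ad boundary
            for duration, uri in ad_segments:
                stitched.append(f"#EXTINF:{duration:.3f},")
                stitched.append(uri)
            stitched.append("#EXT-X-DISCONTINUITY")  # ad -> content boundary
    return stitched


stitched = stitch_ad_break(CONTENT_PLAYLIST, AD_SEGMENTS, after_segment=2)
print("\n".join(stitched))
```

A production stitcher would do this per session and per rendition, and would keep the ad segment durations within the playlist's declared target duration. The sketch only shows the shape of the rewritten manifest the player ends up fetching.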
Two consequences fall straight out of this design. Because the ad arrives from the same servers and in the same stream as the content, there is no separate call to a recognizable ad domain — so a network ad-blocker has nothing to cancel without blocking the whole video. And because the ad segments sit in the same buffer the player already fills for content, the transition is broadcast-smooth: no dial-out, no spinner, no slate. SSAI trades CSAI's stutter and ad-blocker exposure for a gap-free, blocker-resistant break. What it gives up — measurement and interactivity — we come to after the picture.
Figure 2. The SSAI stitching pipeline. An SCTE-35 cue triggers a VAST call to the ad decision server; the stitcher conditions the ad to match the content ladder and rewrites the manifest, so the player sees one continuous stream.
Why SSAI beats ad-blockers — the part nobody explains clearly
This is the question that sends most people to this topic, so it deserves a precise answer rather than the usual hand-wave that SSAI is "more resistant."
An ad-blocker is, at heart, a bouncer with a guest list of ad-server domains. It inspects the requests a device makes and drops any addressed to a name on the list. CSAI hands that bouncer an easy target: the player's ad request travels from the viewer's device to an ad server whose domain is almost certainly on the list, so the bouncer cancels it and no ad loads. The ad request is a separate, identifiable, client-originated call — exactly what blockers are built to catch.
SSAI removes the call entirely. There is no client-to-ad-server request to inspect, because the server made that request, out of the viewer's sight, when it built the stream. What the device downloads is a sequence of video segments from the platform's own content servers, in the same format and from the same domain as the show. To the blocker, an ad segment and a content segment are indistinguishable bytes from a trusted source; it cannot drop one without dropping the video the viewer came to watch. That is the whole mechanism: SSAI wins not by hiding the ad better, but by deleting the separate request that ad-blockers depend on.
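The mechanism is easy to model. The sketch below treats a blocker as nothing more than a domain list and a drop rule, which is a deliberate simplification, and every domain and URL in it is invented for illustration. It runs the same filter over the requests a CSAI player and an SSAI player would make during one break.

```python
from urllib.parse import urlparse

# Hypothetical blocklist entry. Real blockers ship large curated lists.
BLOCKLIST = {"example-adserver.com"}


def is_blocked(url, blocklist=BLOCKLIST):
    """True if the URL's host is a listed domain or a subdomain of one."""
    host = urlparse(url).hostname or ""
    return any(host == domain or host.endswith("." + domain) for domain in blocklist)


def delivered(requests):
    """Requests that survive the blocker."""
    return [url for url in requests if not is_blocked(url)]


# CSAI: content from the platform CDN, plus a separate call to the ad server.
csai_requests = [
    "https://cdn.example-platform.com/show/seg_0041.ts",
    "https://ads.example-adserver.com/vast?slot=midroll1",
    "https://cdn.example-platform.com/show/seg_0042.ts",
]

# SSAI: ad segments arrive from the same CDN as the show.
ssai_requests = [
    "https://cdn.example-platform.com/show/seg_0041.ts",
    "https://cdn.example-platform.com/show/seg_0041_ad_000.ts",
    "https://cdn.example-platform.com/show/seg_0042.ts",
]

print(len(delivered(csai_requests)), "of", len(csai_requests), "CSAI requests delivered")
print(len(delivered(ssai_requests)), "of", len(ssai_requests), "SSAI requests delivered")
```

Under these assumptions the only request the filter can drop is the CSAI ad call. To stop the stitched ad it would have to block the platform's own CDN host, and with it the show.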
The revenue at stake is not small, especially on the web, where blocking is heaviest. By 2025 roughly 42% of internet users ran some ad-blocker, with desktop adoption around 51% and mobile near 36%, and ad-blocking was estimated to cost publishers on the order of $54–62 billion a year (industry compilations, 2025 — estimates vary by methodology). On connected-TV the threat is different — most TVs do not run browser ad-blockers — but SSAI still wins there for the other reason: app-store TV players are a fragmented zoo of devices, and a server-side stitch gives every one of them the same smooth, SDK-free ad experience instead of asking each device's flaky ad SDK to behave. On the web SSAI defends revenue; on the TV it defends experience and consistency.
Figure 3. Why SSAI resists blockers. CSAI's separate call to a known ad domain is exactly what a blocker cancels; SSAI's ad segments are indistinguishable from content served by the same CDN, so there is no separate request to block.
The honest trade-off: what SSAI gives up
SSAI is not a free win. Moving the join to the server breaks the two things the client did effortlessly, measurement and interactivity, and adds a cost the client never had.
Measurement gets harder because the device is no longer the one calling the ad server. In CSAI the player fires every tracking beacon itself, so the ad server sees exactly what the viewer saw. In SSAI the stitcher is the natural place to fire beacons, but the stitcher is a server — it knows it sent ad bytes, not whether the screen was on, the app was foregrounded, or the viewer was even in the room. The industry fixed this deliberately: VAST 4.1 (2017) added explicit support for server-side ad insertion and aligned with the Open Measurement standard (OMID), the IAB Tech Lab framework that lets an approved measurement script run in the player and report real viewability even when the ad was stitched server-side (IAB Tech Lab VAST 4.1; Open Measurement SDK). Done right, SSAI plus OMID measures viewability properly; done lazily — beacons fired blindly from the stitcher — it over-counts impressions and an auditor will catch it. Measurement is solvable, but it is now your problem to solve.
Interactivity gets harder for the same reason: a stitched ad is just video in the stream, so the rich clickable units CSAI supports need extra wiring (a metadata side-channel the player listens to) to work at all. Many SSAI deployments simply accept non-interactive spots, which is fine for TV-style brand ads and weak for performance advertising that needs a click.
Frequency capping and personalization need session plumbing. The infamous SSAI symptom is the same ad five times in one break — the result of a stitcher that calls the ad server without passing enough about who this viewer is and what they have already seen. To target ads and cap repetition, the stitcher must thread a per-viewer session identity and history through to the ad decision server on every call. That is exactly what dynamic ad insertion (DAI) means: SSAI plus per-session targeting, where every viewer gets a personalized manifest. It works, but the per-session manifest is less cacheable than a single shared stream, which raises delivery cost — a real line item the OTT cost model has to carry.
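As a rough illustration of the session plumbing, the sketch below keeps a per-viewer history and uses it when filling a break. The names (`ViewerSession`, `pick_ads`, the ad ids) and the cap of two plays per session are assumptions made for the example. In a real deployment this logic usually lives in the ad decision server, with the stitcher's job being to pass the session identity through.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ViewerSession:
    """Per-viewer state the stitcher threads through to ad decisioning."""
    session_id: str
    seen: Counter = field(default_factory=Counter)


def pick_ads(session, ranked_candidates, slots, cap_per_session=2):
    """Fill one break from ranked candidates, honoring a per-session cap."""
    chosen = []
    for ad_id in ranked_candidates:
        if len(chosen) == slots:
            break
        if ad_id in chosen:
            continue  # never repeat an ad inside the same break
        if session.seen[ad_id] >= cap_per_session:
            continue  # this viewer has already hit the cap for this ad
        chosen.append(ad_id)
    session.seen.update(chosen)  # remember what was shown for later breaks
    return chosen


session = ViewerSession("viewer-123")
candidates = ["cola", "car", "bank"]
for break_number in range(1, 4):
    print(break_number, pick_ads(session, candidates, slots=2))
```

Without the `session` argument every break would start from a blank history, which is how the "same ad five times" symptom arises. Note that a break can come back under-filled once the caps bite, which is a fill-rate cost worth tracking.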
And conditioning ads costs compute. Transcoding each ad to match every rung of the content ladder, for every campaign, is real transcode load — the price of the gap-free transition. None of these is a deal-breaker; together they are why SSAI is a build, not a checkbox.
Putting it side by side
The two approaches, and the hybrid we meet next, line up cleanly once the mechanism is clear. The table includes a "Beats ad-blockers?" row because that is the question most teams arrive with, but read every row, because the right answer depends on your surface (web, mobile, TV) and your goal (revenue defense, experience, interactivity).
| Property | CSAI (client-side) | SSAI (server-side) | SGAI / HLS interstitials (hybrid) |
|---|---|---|---|
| Where the ad joins the video | On the device, in a separate ad player | On the server, stitched into the stream | On the server as a pointer; the player fetches the ad |
| Beats ad-blockers? | No — the ad call is blockable | Yes — no separate ad call to block | Partly — pointer is server-side, ad fetch is client-side |
| Transition smoothness (QoE) | Stutter risk at every break | Broadcast-smooth | Smooth; ad payload decoupled from content |
| Device consistency | Varies by each device's SDK | Identical everywhere | Good on players that support the markers |
| Measurement / viewability | Native and precise | Harder; needs VAST 4.1 + OMID | Better than classic SSAI; client sees the ad |
| Interactivity / clickable ads | Strong (SIMID) | Weak without extra wiring | Improving; player handles the ad |
| Frequency capping & targeting | Easy (client knows the viewer) | Needs per-session plumbing (this is DAI) | Server-guided, with client context |
| CDN / caching efficiency | N/A (ad fetched separately) | Per-session manifest is less cacheable | Better — content stays shared/cacheable |
| Build & run complexity | Lower | Higher (stitcher + transcode + measurement) | Highest (newest, fewest turnkey options) |
Table 1. CSAI, SSAI, and the converging hybrid. No row makes one approach win outright; the choice follows your surface and your goal. The "Beats ad-blockers?" row is the question most teams start with, but the measurement and interactivity rows are where SSAI projects pay their dues.
A worked example: what the choice is worth in revenue
Abstract trade-offs become decisions when you put money on them. Ad revenue follows one formula everywhere, so start there:
ad revenue = ad impressions × CPM ÷ 1000
CPM is cost per mille — what an advertiser pays per thousand times an ad is shown. Take a mid-size AVOD service and use round numbers:
viewers (monthly) = 500,000
ad-supported hours each = 15 / month
ad slots per hour = 8 (e.g. 4 breaks × 2 ads)
ad opportunities (avails) = 500,000 × 15 × 8 = 60,000,000 / month
fill rate = 70% (share of avails an ad actually fills)
impressions = 60,000,000 × 0.70 = 42,000,000 / month
CPM = $20
ad revenue = 42,000,000 ÷ 1000 × $20 = $840,000 / month
Now the part the insertion method decides. In client-side insertion, a share of those 42 million impressions never renders or never counts — ad-blockers cancel the call on the web, ad SDKs time out on weak phones, slow ad loads get abandoned. Assume a conservative 15% loss for an audience with meaningful web and mobile reach:
impressions lost to CSAI failure = 42,000,000 × 0.15 = 6,300,000 / month
revenue lost = 6,300,000 ÷ 1000 × $20 = $126,000 / month
recovered by moving to SSAI ≈ $126,000 / month ≈ $1.5M / year
About $1.5 million a year, on the same audience and the same ad sales, recovered by deleting the blockable ad call and the lossy client-side fetch, assuming SSAI wins back the full 15%. That is the SSAI business case in one number, and a large part of why AVOD and FAST operators at scale tend to run server-side. The inputs here are illustrative; plug in your own audience mix, fill rate, and CPM, but the shape holds: the more of your audience sits on ad-blocked web and flaky mobile, the more SSAI is worth.
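For readers who want to plug in their own numbers, the worked example reduces to a short function. The function and parameter names are made up for this sketch. The inputs are the article's illustrative figures, and the 15% client-side loss is the same assumption used above, not a measured value.

```python
def monthly_ad_revenue(viewers, hours_each, slots_per_hour, fill_rate, cpm):
    """Return (impressions, revenue) for one month using impressions x CPM / 1000."""
    avails = viewers * hours_each * slots_per_hour
    impressions = avails * fill_rate
    revenue = impressions * cpm / 1000
    return impressions, revenue


def csai_loss(impressions, cpm, loss_rate):
    """Monthly revenue lost when a share of impressions never renders or counts."""
    return impressions * loss_rate * cpm / 1000


impressions, revenue = monthly_ad_revenue(
    viewers=500_000, hours_each=15, slots_per_hour=8, fill_rate=0.70, cpm=20
)
lost = csai_loss(impressions, cpm=20, loss_rate=0.15)

print(f"impressions per month: {impressions:,.0f}")
print(f"revenue per month:     ${revenue:,.0f}")
print(f"lost per month:        ${lost:,.0f}")
print(f"lost per year:         ${lost * 12:,.0f}")
```

Changing `loss_rate` is the quickest sensitivity check: a CTV-only audience might sit near zero, while a desktop-web-heavy one could plausibly sit above the 15% used here.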
There is a smaller, sharper version of the same argument in playback quality. A CSAI break spends one to three seconds dialing out and buffering the ad; an SSAI break spends roughly zero, because the ad is already in the buffer at the content's bitrate. Multiply a few seconds of stall risk across every break, every viewer, every day, and moving to SSAI trades measurable startup and rebuffering pain, the quality metrics that predict whether viewers stay, for a smooth break. Revenue and retention point the same way.
Where it is heading: the hybrid that ends the either/or
The cleanest news in this topic is that the industry is converging on a third option that keeps SSAI's smoothness while handing back some of CSAI's measurement and control. Instead of physically splicing ad bytes into the content manifest, the server inserts a pointer — a marker that tells the player "at this point, go play the ad from this separate list" — and the player fetches and presents the ad itself, natively.
Two names matter. HLS interstitials are Apple's version: the manifest carries an EXT-X-DATERANGE marker of the interstitial class with an X-ASSET-LIST that points to the ad, and a supporting player plays it as a player-native break (Apple HLS specification). Server-guided ad insertion (SGAI) is the broader pattern, formalized for DASH in the 6th edition of MPEG-DASH (2024–2025) and adopted by stitchers such as AWS Elemental MediaTailor, which added HLS-interstitials support for VOD in 2024 and live streams in 2025 (AWS Elemental MediaTailor documentation, 2024–2025).
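The pointer is small enough to show in full. The sketch below builds an interstitial marker and the asset list it points to. The helper names, URLs, and dates are invented, and the attribute names (the `com.apple.hls.interstitial` class, `X-ASSET-LIST`, and the `ASSETS` / `URI` / `DURATION` keys of the asset list) follow Apple's HLS interstitials documentation as best understood here, so re-check them against the current spec before relying on them.

```python
import json


def interstitial_daterange(break_id, start_date, duration_s, asset_list_url):
    """Build an EXT-X-DATERANGE tag that marks a player-native ad break."""
    attributes = [
        f'ID="{break_id}"',
        'CLASS="com.apple.hls.interstitial"',
        f'START-DATE="{start_date}"',
        f"DURATION={duration_s:.3f}",
        f'X-ASSET-LIST="{asset_list_url}"',
    ]
    return "#EXT-X-DATERANGE:" + ",".join(attributes)


def asset_list(ads):
    """Build the JSON document the player fetches from the X-ASSET-LIST URL."""
    return json.dumps(
        {"ASSETS": [{"URI": uri, "DURATION": duration} for uri, duration in ads]}
    )


tag = interstitial_daterange(
    break_id="midroll-1",
    start_date="2026-01-01T00:10:00.000Z",
    duration_s=30.0,
    asset_list_url="https://ads.example-platform.com/session/abc/break/1.json",
)
assets = asset_list([
    ("https://cdn.example-platform.com/ads/cola/master.m3u8", 15.0),
    ("https://cdn.example-platform.com/ads/car/master.m3u8", 15.0),
])

print(tag)
print(assets)
```

The content playlist itself is unchanged apart from this one tag, which is why it can stay shared and cacheable while only the asset list varies per viewer. A playlist carrying EXT-X-DATERANGE also needs EXT-X-PROGRAM-DATE-TIME so the player can place the break on the timeline.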
The payoff is structural. Because the ad payload is decoupled from the content stream, the content manifest stays shared and cacheable — cheaper to deliver than a per-session stitched manifest — while the ad fetch, being client-side, restores accurate viewability and easier interactivity. It is not a free lunch: it is the newest approach, support is still spreading across the device fleet, and there are fewer turnkey options than for classic SSAI. But the direction is clear — server decides, client renders — and a platform built today should treat SGAI/interstitials as the architecture it grows into, not a science project. This is why the comparison table has three columns, not two.
Figure 4. The convergence. CSAI keeps measurement and interactivity but loses to blockers and stutters; classic SSAI flips that; server-guided insertion (HLS interstitials, DASH SGAI) aims to keep the best of both — server decides, client renders.
A common mistake: choosing the insertion method before the surface and the goal
The recurring error is to treat this as a single platform-wide switch — "we'll do SSAI" — decided by an engineer's preference rather than by where the audience watches and what the ads must do. It backfires in predictable ways. A performance-advertising business that needs clickable, interactive units ships pure SSAI and discovers its ads cannot click. A web-heavy AVOD service ships CSAI to keep interactivity and watches ad-blockers erase a fifth of its revenue. A team rolls out SSAI without threading session identity to the ad server and ships the "same ad five times" experience that makes viewers mute the TV. Each is the same root mistake: the method was chosen before the surface (web, mobile, CTV) and the goal (revenue defense, smooth experience, interactivity, measurement) were named.
The fix is to decide per surface against a goal, not once for the whole platform. CTV, where blocking is rare but device fragmentation is brutal and the experience must feel like television, is the textbook home for SSAI or server-guided insertion. Ad-blocked web, where a separate ad call is a liability, is where SSAI defends revenue — unless interactivity is the whole business model, in which case CSAI with anti-blocker mitigations earns its keep. Most serious platforms end up mixed: server-side on TV, server-side or hybrid on web for blocker resistance, with measurement wired through VAST 4.1 and OMID regardless. Name the surface and the goal first; the method follows.
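The per-surface rule can be written down as a small decision helper. This is only a sketch of the reasoning in the paragraph above, not a complete selection framework: the function name and flags are hypothetical, and treating mobile the same way as web is an assumption the article does not spell out.

```python
def choose_insertion(surface, interactivity_is_core=False, interstitials_supported=False):
    """Suggest an insertion method for one surface and one goal."""
    surface = surface.lower()
    if surface not in {"web", "mobile", "ctv"}:
        raise ValueError(f"unknown surface: {surface!r}")

    # Interactivity as the whole business model is the one case where web CSAI
    # earns its keep. Extending that to mobile is an assumption of this sketch.
    if surface in {"web", "mobile"} and interactivity_is_core:
        return "CSAI with anti-blocker mitigations"

    # Everywhere else the server decides. Use the server-guided hybrid only
    # where the player fleet actually supports the markers.
    if interstitials_supported:
        return "SGAI / HLS interstitials"
    return "SSAI"


for surface, kwargs in [
    ("ctv", {}),
    ("web", {}),
    ("web", {"interactivity_is_core": True}),
    ("ctv", {"interstitials_supported": True}),
]:
    print(surface, kwargs, "->", choose_insertion(surface, **kwargs))
```

The useful part is the signature rather than the body: the function cannot be called without naming a surface, which is the discipline the section argues for.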
Figure 5. Choosing per surface and goal. Start from where the audience watches and what the ad must do — not from an engineering preference — and the insertion method falls out.
Where Fora Soft fits in
Ad insertion is where streaming scale meets advertising revenue, and the expensive failures are quiet ones: a fifth of web impressions blocked before they count, a TV fleet where every device's ad SDK breaks differently, a stitcher that over-counts impressions and fails an audit, or the "same ad on repeat" that drives viewers away. Fora Soft has built video-streaming and OTT/Internet-TV platforms since 2005, across 625+ shipped projects for 400+ clients, which means we have wired SCTE-35 ad signaling into live and VOD pipelines, integrated server-side stitching with ad decision servers, conditioned ad creative to match the encoding ladder so breaks stay gap-free, and kept measurement honest through VAST and Open Measurement. Our stance is scalability-first and vendor-neutral: we start from where your audience watches and the ad revenue you must protect at scale, then build the insertion architecture — CSAI, SSAI, or the server-guided hybrid — that your surfaces and your advertisers actually require.
What to read next
- SCTE-35 and ad signaling
- Ad serving, VAST/VMAP, and the ad stack
- The OTT monetization map: subscriptions, ads, transactions
For a shorter, product-level overview of streaming monetization, see our video streaming app monetization guide; to commission a build, talk to our streaming team via the link above.
Call to action
- Talk to a streaming engineer — book a 30-minute scoping call to talk through your SSAI vs CSAI plan.
- See our case studies — 250+ shipped projects across video streaming, WebRTC, OTT, telemedicine, e-learning, surveillance, and AR/VR.
- Download the SSAI vs CSAI — One-Page Decision & Architecture Reference — When CSAI vs SSAI vs server-guided insertion wins, the SCTE-35-to-stitched-manifest pipeline, and the measurement and frequency-capping gotchas, on a single sheet.
References
- ANSI/SCTE 35 2023r1 — Digital Program Insertion Cueing Message. Society of Cable Telecommunications Engineers (SCTE). Tier 1. The controlling standard for in-stream ad-break signaling; defines splice_insert, time_signal, and the segmentation_descriptor cues that mark ad avails, now carried in HLS and DASH delivery. Published 2023-11-30. https://account.scte.org/standards/library/catalog/scte-35-digital-program-insertion-cueing-message/ — accessed 2026-06-17.
- VAST — Digital Video Ad Serving Template (version history; VAST 4.3, Dec 2022; VAST 4.1, Aug 2017; VAST CTV Addendum 2024). IAB Tech Lab. Tier 1. The XML ad-response standard used by both CSAI and SSAI; VAST 4.1 added explicit server-side-ad-insertion support and Open Measurement alignment. Page last updated 2024-07-18. https://iabtechlab.com/standards/vast/ — accessed 2026-06-17.
- Open Measurement SDK (OMID). IAB Tech Lab. Tier 1. The viewability/verification framework that restores honest ad measurement under server-side insertion, declared via VAST <AdVerifications>. https://iabtechlab.com/standards/open-measurement-sdk/ — accessed 2026-06-17.
- HTTP Live Streaming — RFC 8216. IETF. Tier 1. The HLS manifest standard the stitcher rewrites; #EXT-X-DISCONTINUITY marks the content-to-ad boundary, and EXT-X-DATERANGE carries interstitial (ad) markers. https://www.rfc-editor.org/rfc/rfc8216 — accessed 2026-06-17.
- Information technology — Dynamic adaptive streaming over HTTP (DASH) — Part 1, ISO/IEC 23009-1. ISO/IEC. Tier 1. The DASH manifest standard; its 6th edition (2024–2025) formalizes server-guided ad insertion (SGAI) for high-concurrency live and VOD. https://www.iso.org/standard/83314.html — accessed 2026-06-17.
- HLS Interstitials (Apple HLS specification / authoring update). Apple. Tier 3 (first-party, spec author). Player-native ad-break markers (EXT-X-DATERANGE interstitial class with X-ASSET-LIST) that underpin the server-guided hybrid. https://developer.apple.com/streaming/ — accessed 2026-06-17. Spec evolving — re-verify.
- AWS Elemental MediaTailor — server-side ad insertion, manifest stitching, and HLS interstitials (VOD 2024, live 2025). Amazon Web Services. Tier 3 (first-party vendor). Documents how a production stitcher calls the ADS, conditions ad creative to the content ladder, rewrites manifests, and supports server-guided insertion. https://docs.aws.amazon.com/mediatailor/latest/ug/ — accessed 2026-06-17. Vendor docs — re-check.
- Google Ad Manager — Dynamic Ad Insertion (DAI) and DAI Pod Serving. Google. Tier 3 (first-party vendor). Server-side stitch for live, linear, and VOD; pod serving lets a platform own its workflow and receive ready-to-stitch ad pods, removing the ad request from the client SDK. https://support.google.com/admanager/answer/6147120 — accessed 2026-06-17. Vendor docs — re-check.
- Connected-TV advertising market forecast, 2026. eMarketer / industry forecast compilations. Tier 5. US CTV ad spend ~$37.95B in 2026 (~+14.5% YoY); 2026 CTV upfront commitments forecast to exceed primetime linear for the first time. https://www.emarketer.com/ — accessed 2026-06-17. Analyst estimates vary by source.
- Ad-blocker usage and revenue-impact statistics, 2025. Industry compilations (Backlinko, and similar). Tier 5. ~42% of internet users run an ad-blocker; desktop ~51%, mobile ~36%; estimated publisher revenue loss ~$54–62B/year. https://backlinko.com/ad-blockers-users — accessed 2026-06-17. Estimates vary by methodology; cited as ranges.
- CSAI vs SSAI — technical explainers. Bitmovin, Wowza, OTTVerse, AdExchanger. Tier 6 (orientation). Consulted for cross-checking the stitching mechanics and the ad-blocker framing; not used as the source for any spec claim. https://bitmovin.com/blog/csai-vs-ssai-client-side-server-side-ad-insertion/ — accessed 2026-06-17.
- Fora Soft — Video Streaming App Monetisation (overview blog). Fora Soft. Tier 7. Product-level companion on AVOD/SVOD/TVOD and ad-supported monetization; the commercial-intent counterpart this educational article links to. https://www.forasoft.com/blog/article/video-streaming-app-monetization-strategies — accessed 2026-06-17.
Where sources disagreed, the controlling standard or first-party document was followed. SCTE-35, VAST/OMID, RFC 8216 (HLS), and ISO/IEC 23009-1 (DASH) are cited from the issuing bodies; the stitching behavior is cross-checked against first-party vendor documentation (AWS Elemental MediaTailor, Google DAI) and dated, because vendor capabilities and the HLS-interstitials/SGAI specs are actively evolving in 2026. The popular framing that SSAI "hides" ads from blockers was overridden in favor of the precise mechanism — SSAI removes the separate client-to-ad-server request — per the spec-over-listicle hierarchy. CTV-spend and ad-blocker figures are 2025–2026 analyst ranges, never quoted as universal numbers.


