Why this matters

If you run, or are about to build, an ad-supported streaming service, the ad stack is where your audience becomes revenue. The previous two articles covered the plumbing that opens an ad break — SCTE-35, the in-stream cue that says "an ad goes here," and server-side ad insertion, the stitching that sews the ad into the stream. This article covers what happens in between: choosing the ad and delivering it. Connected-TV programmatic ad spend is forecast at roughly $36 billion in the US in 2026, up from about $28 billion in 2025 (industry analyst estimates, 2026), and almost every dollar of it flows through the VAST and VMAP standards described here. This is the builder's guide to the ad supply chain — the standards, the deal types, and the math — and the companion to the OTT monetization map, which shows where ad revenue sits among all the ways a platform earns, including the subscription billing that hybrid services run alongside ads.

The one job: from "an ad goes here" to "an ad on screen"

Start with the gap this whole stack exists to fill. By the time a viewer reaches an ad break, two things are already known: where the break is and how long it runs. That came from the in-stream cue. What is not yet known is which ads will play and where the video files for them live. Ad serving is the set of systems and standards that answer those two questions, fast enough that the viewer never notices the seam.

Here is the analogy to hold onto. Think of a commercial break on traditional television. The program timeline has slots reserved for ads — that is the schedule. A separate operation decides which commercials fill each slot and pulls the right tapes from the library. Streaming splits the same job across two standards: VMAP is the schedule of slots, and VAST is the instruction that delivers each commercial's "tape." Everything else in this article is detail about how those two documents are written, requested, auctioned, and assembled.

One distinction sets up the whole stack. A streaming platform can fill its ad slots two ways. It can sell them directly — a salesperson agrees with one advertiser to run their campaign for a flat price — or it can sell them programmatically, auctioning each slot in milliseconds to whichever buyer bids highest. Most platforms run both at once, and the machinery that lets all those demand sources compete for the same slot is the heart of the modern ad stack. We will build up to it piece by piece.

Figure 1. The ad supply chain. The in-stream cue opens a break; VMAP describes the slots; an ad server (direct) or an auction of demand partners (programmatic) returns the winning ads as VAST; the stitcher or player assembles them into one break.

The standards stack: who does what

Four standards do almost all the work, and the single most common source of confusion is mixing up their jobs. Each owns one layer, and they nest inside each other cleanly. The table below is the map; the sections after it walk each layer in turn.

Standard | Issued by | The one thing it does | Where it sits
SCTE-35 | SCTE | Marks where and how long a break is, in the stream | In the video stream (covered in 5.4)
VMAP | IAB | Lists the set of ad breaks and their positions in a piece of content | Around the content (the timeline)
VAST | IAB | Delivers one ad: its video file, tracking, and metadata | Inside each break (the ad itself)
OpenRTB | IAB Tech Lab | Runs the real-time auction that picks programmatic ads | Behind the ad server (the marketplace)

Table 1. The four standards of the ad stack and the layer each owns. SCTE-35 (the in-stream cue) and OpenRTB (the auction) bracket the two that carry the ads themselves: VMAP describes the timeline of breaks, and VAST delivers each ad inside them. Confusing VMAP with VAST is the most common beginner error — one is the schedule, the other is the ad.

A quick note on scope. SCTE-35 is the subject of the previous article and is the cue that opens an avail; this article picks up once the avail is open. VMAP, VAST, and the auction are the focus here.

VMAP: the timeline of ad breaks

VMAP stands for Video Multiple Ad Playlist, an IAB specification first published in July 2012. It solves a specific problem: the company that owns the content often does not control the video player it plays in. A studio licensing a film to three different streaming apps cannot reach inside each app's player to schedule ad breaks. VMAP lets the content owner describe the break structure once, as a separate XML document, that any compliant player can read.

A VMAP document is a list of ad breaks, and each break carries two essential attributes. The first is timeOffset: when the break happens. The forms you will meet most often are the literal value start (a pre-roll, before the content), end (a post-roll, after it), a precise timestamp in HH:MM:SS.mmm form (a mid-roll at an exact moment), and a percentage of the total duration; the specification also defines a positional form such as #1 for breaks whose timing is not known in advance. The second is breakType: usually linear, meaning a full-screen video ad that interrupts the content, as opposed to a non-linear overlay.

Here is a minimal VMAP describing three breaks — one before the content, one twelve minutes in, and one at the end:

<vmap:VMAP xmlns:vmap="http://www.iab.net/videosuite/vmap" version="1.0">
  <!-- Pre-roll: the AdSource points at the ad server that returns VAST -->
  <vmap:AdBreak timeOffset="start" breakType="linear" breakId="preroll">
    <vmap:AdSource><vmap:AdTagURI templateType="vast4">
      https://adserver.example/vast?slot=preroll
    </vmap:AdTagURI></vmap:AdSource>
  </vmap:AdBreak>
  <!-- Mid-roll at 12 minutes and post-roll: AdSource left out to keep the sample short -->
  <vmap:AdBreak timeOffset="00:12:00.000" breakType="linear" breakId="mid1"/>
  <vmap:AdBreak timeOffset="end" breakType="linear" breakId="postroll"/>
</vmap:VMAP>

Notice what is not in that document: the ads themselves. An AdBreak normally contains an AdSource (the sample spells it out only for the pre-roll to stay short), and the AdSource either holds an inline VAST document or, far more commonly, an AdTagURI pointing at the ad server that will supply the VAST when the break is reached. VMAP is the schedule and nothing more; it hands off to VAST for the actual ad. That clean separation is the whole point: the timeline is fixed when the content is published, but the ads are chosen fresh, per viewer, at playback.
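To make the schedule-versus-ad split concrete, here is a minimal sketch of how a player or stitcher might read a VMAP document and turn each timeOffset into a playback position. It uses only Python's standard library. The sample document, the 50% break, the function names, and the one-hour content duration are illustrative assumptions, and the namespace URI is the one commonly seen in IAB VMAP samples; match whatever your own documents declare. The positional (#1) form is not handled here.

```python
import xml.etree.ElementTree as ET

# Namespace URI as commonly seen in IAB VMAP 1.0 samples (assumption: adjust to your documents).
VMAP_NS = {"vmap": "http://www.iab.net/videosuite/vmap"}

# Illustrative document: the article's three breaks plus a hypothetical 50% mid-roll.
SAMPLE_VMAP = """<vmap:VMAP xmlns:vmap="http://www.iab.net/videosuite/vmap" version="1.0">
  <vmap:AdBreak timeOffset="start" breakType="linear" breakId="preroll">
    <vmap:AdSource><vmap:AdTagURI templateType="vast4">
      https://adserver.example/vast?slot=preroll
    </vmap:AdTagURI></vmap:AdSource>
  </vmap:AdBreak>
  <vmap:AdBreak timeOffset="00:12:00.000" breakType="linear" breakId="mid1"/>
  <vmap:AdBreak timeOffset="50%" breakType="linear" breakId="mid2"/>
  <vmap:AdBreak timeOffset="end" breakType="linear" breakId="postroll"/>
</vmap:VMAP>"""


def offset_to_seconds(time_offset, content_duration):
    """Convert a timeOffset (start, end, percentage, or HH:MM:SS.mmm) to seconds."""
    if time_offset == "start":
        return 0.0
    if time_offset == "end":
        return float(content_duration)
    if time_offset.endswith("%"):
        return content_duration * float(time_offset[:-1]) / 100.0
    hours, minutes, seconds = time_offset.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def list_breaks(vmap_xml, content_duration):
    """Return (breakId, seconds, ad tag URI or None) for each AdBreak, in time order."""
    root = ET.fromstring(vmap_xml)
    breaks = []
    for ad_break in root.findall("vmap:AdBreak", VMAP_NS):
        tag = ad_break.find("vmap:AdSource/vmap:AdTagURI", VMAP_NS)
        uri = tag.text.strip() if tag is not None and tag.text else None
        seconds = offset_to_seconds(ad_break.get("timeOffset"), content_duration)
        breaks.append((ad_break.get("breakId"), seconds, uri))
    return sorted(breaks, key=lambda b: b[1])


schedule = list_breaks(SAMPLE_VMAP, content_duration=3600)
for break_id, seconds, uri in schedule:
    print(break_id, seconds, uri)
```

Note that the output contains positions and ad tag URIs but no ads: fetching and parsing the VAST behind each URI is a separate step, done when the break is reached.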

VAST: the document that delivers one ad

VAST stands for Video Ad Serving Template, the IAB standard first launched in 2008 and now the universal language between an ad server and a video player. Where VMAP is the schedule, a VAST document describes a single ad: the location of its video file, the events the player should report (started, first quartile, midpoint, completed, clicked), the clickthrough destination, and metadata such as the ad's duration and identity. When a player or stitcher reaches a slot, it requests a VAST document, parses it, plays the video file it names, and fires the tracking pixels it lists.

VAST defines two kinds of response, and understanding the difference explains a large share of ad-delivery bugs. An InLine response is the real, final ad — it contains the actual MediaFile (the video the viewer will watch) plus all the tracking. A Wrapper response contains no video; it holds a VASTAdTagURI that redirects the player to another ad server, which returns either an InLine ad or yet another Wrapper. The redirect chain continues until some server finally returns an InLine response with a real media file.

This redirect chain exists for a reason: each hop is a different company in the supply chain — an agency's ad server, a supply-side platform, a verification vendor — and each adds its own tracking pixels along the way, so everyone measures the same impression. But the chain has a cost. Every redirect is another network round-trip, and convention limits Wrapper depth to about five before a player gives up and abandons the slot. A chain that is too long, or one server in it that is slow, is a frequent cause of an ad that never appears.
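The redirect logic can be sketched in a few lines. The example below follows Wrapper responses until an InLine appears, collects impression trackers at every hop, and abandons the slot when the chain exceeds a depth limit. It is a simplified illustration, not a compliant VAST client: the two in-memory "ad servers", their URLs, and the function name are hypothetical, the fetch function stands in for an HTTP call with a timeout, and only the first Ad and first MediaFile are considered.

```python
import xml.etree.ElementTree as ET

MAX_WRAPPER_DEPTH = 5  # the conventional limit described above; real players may differ

# Hypothetical in-memory "ad servers" so the sketch runs without a network.
FAKE_SERVERS = {
    "https://ssp.example/vast": """<VAST version="4.2"><Ad id="w1"><Wrapper>
        <VASTAdTagURI><![CDATA[https://dsp.example/vast]]></VASTAdTagURI>
        <Impression><![CDATA[https://ssp.example/imp]]></Impression>
      </Wrapper></Ad></VAST>""",
    "https://dsp.example/vast": """<VAST version="4.2"><Ad id="a1"><InLine>
        <Impression><![CDATA[https://dsp.example/imp]]></Impression>
        <Creatives><Creative><Linear><MediaFiles>
          <MediaFile delivery="progressive" type="video/mp4" width="1920" height="1080">
            <![CDATA[https://cdn.example/spot.mp4]]>
          </MediaFile>
        </MediaFiles></Linear></Creative></Creatives>
      </InLine></Ad></VAST>""",
}


def resolve_vast(url, fetch, max_depth=MAX_WRAPPER_DEPTH):
    """Follow Wrapper redirects until an InLine ad appears.

    Returns (media_file_url, impression_urls), or None if the slot must be abandoned.
    """
    impressions = []
    for _ in range(max_depth + 1):  # up to max_depth wrappers plus the final InLine
        body = fetch(url)
        if body is None:
            return None  # a hop failed or timed out
        ad = ET.fromstring(body).find("Ad")
        if ad is None:
            return None  # empty VAST: no ad available
        node = ad.find("InLine")
        if node is None:
            node = ad.find("Wrapper")
        if node is None:
            return None
        # Every hop contributes its own impression trackers.
        impressions += [i.text.strip() for i in node.findall("Impression") if i.text]
        if node.tag == "InLine":
            media = node.find(".//MediaFile")
            if media is None or not media.text:
                return None
            return media.text.strip(), impressions
        next_uri = node.find("VASTAdTagURI")
        if next_uri is None or not next_uri.text:
            return None
        url = next_uri.text.strip()
    return None  # chain is deeper than the limit


result = resolve_vast("https://ssp.example/vast", FAKE_SERVERS.get)
print(result)
```

In production the per-hop timeout matters as much as the depth limit, since one slow server stalls the whole chain; that budget is omitted here for brevity.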

Figure 2. InLine vs Wrapper. A Wrapper carries no video, only a VASTAdTagURI pointing to the next ad server. The chain redirects (collecting tracking at each hop) until a server returns an InLine ad with a real MediaFile. Convention caps the chain at about five wrappers.

The two standards nest. VMAP holds the breaks; each break points to VAST; each VAST resolves — possibly through a wrapper chain — to one or more real ads. If you remember one sentence about the structure, make it this: VMAP is the playlist of breaks, VAST is the ad inside a break, and a Wrapper is a VAST that points at another VAST.

Why the VAST version matters for OTT

VAST is not frozen, and for streaming on televisions the version genuinely matters. After the original 1.0, the releases are 2.0, 3.0, 4.0, 4.1, 4.2, and the current 4.3 (December 2022), with a VAST CTV Addendum published July 2024 layered on top. Three changes from 4.0 onward are the ones a streaming team must care about.

First, VAST 4.0 (2016) separated the media file from the interactive code. Earlier ads bundled the video and any interactive behavior together through an older interface called VPAID (Video Player-Ad Interface Definition). That bundling broke server-side ad insertion, because a stitcher needs a plain video file it can transcode and splice — not executable code. By giving the real MediaFile its own element, VAST 4.0 made ads stitchable, which is exactly what OTT and connected-TV delivery require.

Second, VPAID is deprecated and replaced by SIMID (Secure Interactive Media Interface Definition), introduced alongside VAST 4.2 in 2019. If your ad stack still depends on VPAID, it will fail on most living-room devices and break server-side insertion; SIMID plus the separated media file is the modern, CTV-safe path for any interactivity.

Third, VAST 4.x added the AdVerifications node, which is how the IAB's Open Measurement SDK (OM SDK) gets the information it needs to measure whether an ad was actually viewable. The OM SDK parses that node and reports standardized signals — screen and ad geometry, obstruction, viewable impressions, and quartile completion — to whichever measurement vendor the buyer trusts. The Open Measurement SDK has extended to connected-TV platforms including Samsung and LG, covering roughly 40% of CTV households as of 2024, and added device-attestation support to fight spoofing. The CTV Addendum 2024 went further with an Ad Creative ID Framework (ACIF) for cleaner creative identification and support for required Digital Services Act disclosure icons. The practical rule: on connected TV, target VAST 4.x, use SIMID rather than VPAID, and implement the AdVerifications node — buyers increasingly will not bid on inventory they cannot measure.
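As a rough illustration of that practical rule, the sketch below inspects an InLine response, skips any MediaFile flagged as VPAID, picks the highest-bitrate plain video file a stitcher could splice, and lists the vendors declared under AdVerifications. The sample response, URLs, vendor name, and function names are invented for the example; a real selection would also weigh resolution, codec, and device capability.

```python
import xml.etree.ElementTree as ET

# Hypothetical InLine response mixing a legacy VPAID file with two plain video files.
SAMPLE_INLINE = """<VAST version="4.2"><Ad id="a1"><InLine>
  <AdVerifications>
    <Verification vendor="verifier.example-omid">
      <JavaScriptResource apiFramework="omid"><![CDATA[https://verifier.example/omid.js]]></JavaScriptResource>
    </Verification>
  </AdVerifications>
  <Creatives><Creative><Linear><MediaFiles>
    <MediaFile delivery="progressive" type="application/javascript" apiFramework="VPAID">
      <![CDATA[https://cdn.example/vpaid.js]]></MediaFile>
    <MediaFile delivery="progressive" type="video/mp4" bitrate="2500" width="1280" height="720">
      <![CDATA[https://cdn.example/spot-720.mp4]]></MediaFile>
    <MediaFile delivery="progressive" type="video/mp4" bitrate="6000" width="1920" height="1080">
      <![CDATA[https://cdn.example/spot-1080.mp4]]></MediaFile>
  </MediaFiles></Linear></Creative></Creatives>
</InLine></Ad></VAST>"""


def pick_stitchable_media(vast_xml):
    """Return the URL of the highest-bitrate non-VPAID video file, or None."""
    root = ET.fromstring(vast_xml)
    candidates = []
    for media in root.iter("MediaFile"):
        is_vpaid = (media.get("apiFramework") or "").upper() == "VPAID"
        is_video = (media.get("type") or "").startswith("video/")
        if is_video and not is_vpaid and media.text:
            candidates.append((int(media.get("bitrate") or 0), media.text.strip()))
    if not candidates:
        return None  # nothing a stitcher can splice: treat the ad as unplayable
    return max(candidates)[1]


def verification_vendors(vast_xml):
    """List the vendor attribute of each Verification element, if any."""
    root = ET.fromstring(vast_xml)
    return [v.get("vendor") for v in root.iter("Verification")]


print(pick_stitchable_media(SAMPLE_INLINE))
print(verification_vendors(SAMPLE_INLINE))
```

An empty vendor list is worth logging on CTV inventory, since it signals an ad that measurement-sensitive buyers cannot verify.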

Ad pods: the commercial break, and its four controls

A single mid-roll rarely holds just one ad. A two-minute break might carry four thirty-second spots in a row, exactly like a television commercial break. That sequence of ads inside one break is an ad pod, and the concept has been part of VAST since version 3.0 (2012), expressed by a sequence attribute that orders the ads within the pod.

Pods are where streaming ad serving gets genuinely hard, because filling a slot with an ad is easy but filling a pod with the right set of ads requires four controls that traditional television solved decades ago and streaming is still perfecting:

  • Deduplication — the same creative must not play twice in one pod. Without it, a viewer sees the identical spot back-to-back, which reads as a glitch.
  • Competitive separation — two ads from the same category (two car brands, two banks) should not share a pod. Advertisers pay to not sit next to a competitor.
  • Frequency capping — a viewer should not see the same ad too many times across a session or a day. This is the difference between a memorable campaign and an irritating one.
  • Latency control — every ad in the pod must be selected and ready before the break starts, or the viewer gets a blank screen or a "we'll be right back" slate while the server scrambles.

In programmatic pods, these controls are negotiated in the auction itself. The current OpenRTB 2.6 specification (IAB Tech Lab) added pod bidding, which lets a seller describe the pod (its total length, the number of slots, their sequence) in the bid request, so buyers can respond with ads tailored to specific positions. It defines structured, dynamic, and hybrid pods depending on how much the seller fixes in advance. Pod management remains one of the least settled parts of CTV: get it right and the break feels like television, get it wrong and it feels broken.
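The first three controls can be sketched as filters applied while filling a pod. The example below is a deliberately simple greedy fill, highest CPM first, and not how any particular ad server works; the Candidate fields, category labels, and the views_so_far map are assumptions made for illustration. Latency control is not modeled: it amounts to running this selection, with all candidates already fetched, before the break starts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    creative_id: str
    category: str   # hypothetical labels such as "auto" or "bank"
    duration: int   # seconds
    cpm: float


def build_pod(candidates, pod_seconds, max_ads, views_so_far, frequency_cap):
    """Greedy fill: highest CPM first, subject to the pod controls."""
    pod, used_creatives, used_categories = [], set(), set()
    remaining = pod_seconds
    for ad in sorted(candidates, key=lambda c: c.cpm, reverse=True):
        if len(pod) == max_ads:
            break
        if ad.creative_id in used_creatives:                       # deduplication
            continue
        if ad.category in used_categories:                         # competitive separation
            continue
        if views_so_far.get(ad.creative_id, 0) >= frequency_cap:   # frequency capping
            continue
        if ad.duration > remaining:                                # must fit the break
            continue
        pod.append(ad)
        used_creatives.add(ad.creative_id)
        used_categories.add(ad.category)
        remaining -= ad.duration
    return pod


candidates = [
    Candidate("car-a", "auto", 30, 32.0),
    Candidate("car-b", "auto", 30, 30.0),
    Candidate("bank-a", "bank", 30, 28.0),
    Candidate("car-a", "auto", 30, 32.0),      # same creative offered twice
    Candidate("soda-a", "beverage", 15, 25.0),
    Candidate("phone-a", "telecom", 30, 22.0),
    Candidate("snack-a", "food", 30, 18.0),
]
pod = build_pod(candidates, pod_seconds=120, max_ads=4,
                views_so_far={"soda-a": 3}, frequency_cap=3)
print([ad.creative_id for ad in pod])  # ['car-a', 'bank-a', 'phone-a', 'snack-a']
```

A greedy pass is easy to reason about but does not guarantee the highest-revenue pod; packing ads of mixed lengths into a fixed break is a knapsack-style problem, which is part of why pod management is hard.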

Figure 3. A pod is a sequence of ads in one break. Four controls make it feel like television: no repeated creative (dedup), no two same-category brands (competitive separation), no over-exposure (frequency cap), and every slot ready before the break starts (latency control).

Direct-sold vs programmatic: how the slot gets filled

Now the demand side — who actually buys the slot. There are two broad routes, and the modern stack lets them compete.

The older route is direct-sold. A salesperson agrees with an advertiser to run a campaign at a fixed price for a guaranteed number of impressions. The platform's ad server (Google Ad Manager and SpringServe are common examples) stores the campaign and serves it when the targeting matches. Direct deals carry the highest prices and the most control, but they require a sales team and they cannot fill every slot of a large, fragmented audience.

The newer route is programmatic — automated, auction-based buying. When a slot opens, the platform broadcasts a bid request to many buyers (demand-side platforms representing advertisers) and the highest bid wins, all in well under a second. Programmatic comes in tiers: programmatic guaranteed (an automated version of a direct deal), private marketplace (a PMP — an invite-only auction for select buyers), and the open exchange (anyone can bid; lowest prices, highest fill).

The crucial architectural question is how these compete. The old way was the waterfall: the ad server offered each slot to demand sources one at a time, in a fixed priority order, until one accepted. The waterfall is slow and leaves money on the table, because a lower-priority buyer willing to pay more never gets asked once a higher-priority source accepts a lower price. The modern replacement is header bidding (also called a unified auction): all demand sources — direct, PMP, and open exchange — bid simultaneously on every slot, and the genuinely highest bid wins. Industry reporting puts the yield improvement from flattening the waterfall into a unified auction at roughly 15–25% for video and CTV (industry estimates, 2026). The lesson for a builder: a waterfall is simple but underprices your inventory; a unified auction is more work to wire up but is how serious ad-supported platforms maximize revenue.
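The difference between the two architectures fits in a few lines. The sketch below models a waterfall as "ask in priority order, take the first acceptable bid" and a unified auction as "compare every bid at once". The source names, bid values, and floor handling are hypothetical, and the model is a first-price simplification that ignores real-world concerns such as guaranteed-deal pacing and timeouts.

```python
def waterfall(sources, floor_by_source):
    """Ask sources in priority order; take the first whose bid meets its floor."""
    for name, bid in sources:  # the list is already in priority order
        if bid is not None and bid >= floor_by_source.get(name, 0.0):
            return name, bid
    return None


def unified_auction(sources, floor=0.0):
    """All sources compete at once; the highest bid at or above the floor wins."""
    bids = [(bid, name) for name, bid in sources if bid is not None and bid >= floor]
    if not bids:
        return None
    bid, name = max(bids)
    return name, bid


# Hypothetical CPM bids for one slot, listed in waterfall priority order.
sources = [("direct", 18.0), ("pmp", 24.0), ("open_exchange", 9.0)]

print(waterfall(sources, floor_by_source={}))  # ('direct', 18.0)
print(unified_auction(sources))                # ('pmp', 24.0)
```

Here the waterfall stops at the direct campaign's 18.0 and never hears the private-marketplace bid of 24.0, which is exactly the money left on the table that the paragraph above describes.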

Deal type | How it's bought | Typical price | Typical fill | Control over what runs
Direct-sold / guaranteed | Salesperson, fixed price | Highest | Limited to what's sold | Full
Programmatic guaranteed | Automated, fixed price | High | Committed volume | High
Private marketplace (PMP) | Invite-only auction | Medium–high | Medium | Curated buyers
Open exchange | Open auction | Lowest | Highest | Least

Table 2. The demand types that fill an ad slot, ordered from most to least control. Most platforms run all four through a single unified (header-bidding) auction so every source competes for every slot — the "typical fill" column shows why no single source is enough on its own. Prices and fill vary by audience, season, and content; treat the columns as relative, not absolute.

The money math: fill rate × CPM

Two numbers turn all of this into revenue, and working them out loud is the most useful thing in this article. The first is fill rate: the share of ad requests that actually come back with an ad. If your player asks for an ad ten million times in a month and seven million requests return one, your fill rate is 70%. The second is CPM: cost per mille, the price per one thousand impressions (mille is Latin for thousand). A $20 CPM means an advertiser pays $20 for every thousand times their ad is shown.

Revenue is the product of the two, run through the impressions. Walk it through:

ad requests in a month        = 10,000,000
fill rate                     = 70%
impressions  = 10,000,000 × 0.70 = 7,000,000
average CPM                    = $20
revenue      = 7,000,000 ÷ 1,000 × $20 = $140,000 / month

Now see what each lever does. Suppose a unified auction lifts fill rate from 70% to 85%:

impressions  = 10,000,000 × 0.85 = 8,500,000
revenue      = 8,500,000 ÷ 1,000 × $20 = $170,000 / month   (+$30,000)

And suppose competition in that auction also lifts the average CPM from $20 to $25:

revenue      = 8,500,000 ÷ 1,000 × $25 = $212,500 / month   (+$72,500 vs the start)

The same ten million requests now earn 52% more, purely from better fill and stronger demand. This is why the auction architecture above is not a technical detail: it is the revenue engine. There is also a closely related metric, eCPM (effective CPM), which is your actual earned revenue per thousand impressions across all sources blended together: total revenue ÷ impressions × 1,000. eCPM is the honest scorecard for price, because it blends premium deals and remnant ads into one number, whereas CPM is the price of a single deal. Because it is computed over impressions, it does not count requests that went unfilled, so read it alongside fill rate. Whether advertising is the right primary model for your catalog at all is a separate strategic question, covered in pricing, packaging, and the monetization decision.
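The worked example is easy to reproduce and to rerun with your own numbers. The sketch below wraps the two formulas in small functions; the function names and the premium/remnant split used for the blended eCPM are illustrative assumptions, not figures from the article.

```python
def monthly_revenue(ad_requests, fill_rate, cpm):
    """Revenue = requests x fill rate (impressions) / 1,000 x CPM."""
    impressions = ad_requests * fill_rate
    return impressions / 1000 * cpm


def ecpm(total_revenue, impressions):
    """Effective CPM: earned revenue per thousand impressions, all sources blended."""
    return total_revenue / impressions * 1000


baseline = monthly_revenue(10_000_000, 0.70, 20.0)
better_fill = monthly_revenue(10_000_000, 0.85, 20.0)
both_levers = monthly_revenue(10_000_000, 0.85, 25.0)
lift = (both_levers - baseline) / baseline

print(f"baseline     ${baseline:,.0f}")
print(f"better fill  ${better_fill:,.0f}")
print(f"both levers  ${both_levers:,.0f}  ({lift:.0%} above baseline)")

# Hypothetical mix: 3M premium impressions at $30 and 4M remnant impressions at $5.
mixed_revenue = 3_000_000 / 1000 * 30.0 + 4_000_000 / 1000 * 5.0
blended_ecpm = ecpm(mixed_revenue, 7_000_000)
print(f"blended eCPM ${blended_ecpm:.2f}")
```

The blended figure lands well below the $30 premium price even though no single deal was sold at that blended number, which is why eCPM rather than any one deal's CPM is the figure to track.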

Figure 4. The revenue funnel. Ad requests narrow to impressions by the fill rate, then multiply by CPM ÷ 1,000 to give revenue. The worked example shows how lifting fill (70% → 85%) and CPM ($20 → $25) compounds to +52% revenue on the same traffic.

Supply-chain transparency: ads.txt, app-ads.txt, and sellers.json

One more layer separates a platform that buyers trust from one they avoid, and on connected TV it is not optional. Programmatic buyers have been burned by fraud — spoofed apps and unauthorized resellers passing off fake inventory — so the IAB Tech Lab publishes a set of transparency standards that let a buyer verify they are buying real inventory from an authorized seller.

ads.txt (Authorized Digital Sellers) is a public file on a website listing exactly which companies may sell its ad inventory. app-ads.txt is the same idea for apps, which is the relevant one for OTT and CTV, where your service is an app on a television. sellers.json is the mirror image, published by the supply-side platforms and exchanges, naming every seller they represent. And the SupplyChain object (often "schain") rides along with each bid request, recording the full path the inventory took from your platform to the buyer. Together they let a buyer answer one question — "is this really Fora Soft's inventory, sold through authorized partners?" — and on CTV, where app-ads.txt is the norm, failing to publish one means premium buyers will simply not bid. Implementing these files is cheap; skipping them quietly caps your CPMs.
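To show what a buyer's check looks like in practice, here is a minimal sketch that parses an app-ads.txt file and tests whether a given exchange and seller account are listed. Each record is a comma-separated line of ad system domain, seller account ID, relationship (DIRECT or RESELLER), and an optional certification authority ID. The sample file, domains, account IDs, and function names are hypothetical, and the parser skips details a production crawler would need, such as fetching the file, redirects, and the full set of variable declarations.

```python
SAMPLE_APP_ADS_TXT = """\
# app-ads.txt for a hypothetical streaming app
ssp-one.example, pub-1234, DIRECT, abc123
ssp-two.example, 98765, RESELLER
contact=adops@publisher.example
"""


def parse_app_ads_txt(text):
    """Return a set of (ad_system_domain, seller_account_id, relationship) records."""
    records = set()
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or "=" in line.split(",")[0]:
            continue  # blank line, or a variable declaration such as contact=
        fields = [field.strip() for field in line.split(",")]
        if len(fields) < 3:
            continue  # malformed record
        domain, account_id, relationship = fields[0].lower(), fields[1], fields[2].upper()
        if relationship in ("DIRECT", "RESELLER"):
            records.add((domain, account_id, relationship))
    return records


def is_authorized(records, ad_system_domain, seller_id):
    """True if the exchange/seller pair appears in the publisher's file."""
    return any(d == ad_system_domain.lower() and a == seller_id for d, a, _ in records)


records = parse_app_ads_txt(SAMPLE_APP_ADS_TXT)
print(is_authorized(records, "ssp-one.example", "pub-1234"))  # True
print(is_authorized(records, "ssp-one.example", "pub-9999"))  # False
```

A buyer would typically run the same kind of lookup for the seller identified in the bid request's SupplyChain object and decline to bid when no matching record exists.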

A common mistake: chasing fill rate instead of revenue

The recurring error in ad monetization is optimizing the wrong number. A team watches the fill-rate dashboard, sees it climb toward 100% after they add a low-quality open-exchange partner, and declares victory — while revenue barely moves or even falls. The reason is that fill rate counts whether a slot was filled, not what it earned. A slot filled by a $1 remnant ad and a slot filled by a $30 premium ad both count as "filled," but they are not the same business.

The math makes it concrete. Going from 60% fill at a $30 blended eCPM to 100% fill at a $12 blended eCPM looks like a 40-point fill improvement and is actually a revenue cut: 0.60 × $30 = $18 of revenue per thousand requests, versus 1.00 × $12 = $12. The discipline is to optimize revenue per thousand requests (fill rate × eCPM together), not fill rate alone, and to keep a floor price that rejects bids so low they drag the blend down. Three related traps sit nearby: leaving deprecated VPAID creative in the stack so ads fail on TVs, letting wrapper chains grow so deep that slots time out into blank slates, and skipping frequency capping so your best advertiser's spot runs six times in one sitting and trains the audience to resent it. Correct ad serving is not "did the slot fill" — it is "did the slot earn, cleanly, on every device."
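The same comparison can be put into a small helper that a dashboard could compute alongside fill rate. The sketch below restates the article's two scenarios and adds one derived figure, the break-even eCPM: the blended price the higher-fill setup would need just to match the original revenue. The function names are illustrative.

```python
def revenue_per_thousand_requests(fill_rate, ecpm):
    """Revenue earned per 1,000 ad requests: fill rate x eCPM."""
    return fill_rate * ecpm


def break_even_ecpm(old_fill, old_ecpm, new_fill):
    """The eCPM needed at new_fill to match revenue at old_fill and old_ecpm."""
    return old_fill * old_ecpm / new_fill


curated = revenue_per_thousand_requests(0.60, 30.0)       # 60% fill, $30 eCPM
fill_chasing = revenue_per_thousand_requests(1.00, 12.0)  # 100% fill, $12 eCPM
needed = break_even_ecpm(0.60, 30.0, 1.00)

print(f"curated       ${curated:.2f} per 1,000 requests")
print(f"fill-chasing  ${fill_chasing:.2f} per 1,000 requests")
print(f"break-even eCPM at 100% fill: ${needed:.2f}")
```

Unless the blended eCPM at full fill stays above that break-even figure, the extra fill is a net loss, however good the fill-rate dashboard looks.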

Where Fora Soft fits in

Ad serving is where streaming scale meets advertising plumbing, and the expensive failures are quiet: a pod that times out into a blank screen on the highest-traffic night, a fill-rate number that hides falling revenue, or premium buyers who skip your inventory because there is no app-ads.txt to verify it. Fora Soft has built video-streaming and OTT/Internet-TV platforms since 2005, across 625+ shipped projects for 400+ clients, which means we have wired VMAP and VAST pipelines into players and server-side stitchers, connected them to ad servers and programmatic demand through header-bidding auctions, and built the pod controls — deduplication, competitive separation, frequency capping — that make a streaming break feel like television. Our stance is scalability-first and vendor-neutral: we start from the scale and yield your audience demands — every slot filled with the highest-earning ad, ready before the break, measurable on every device — then build the ad stack your AVOD, FAST, and hybrid streams actually require.

What to read next

For a shorter, product-level overview of streaming monetization, see our video streaming app monetization guide; to commission a build, talk to our streaming team via the link above.

References

  1. Digital Video Ad Serving Template (VAST) 4.3. IAB Tech Lab. Tier 1. The controlling standard for delivering a single video ad to a player: the XML schema, the InLine vs Wrapper response types and the VASTAdTagURI redirect, MediaFile (separated from interactivity since 4.0), the sequence attribute for ad pods, tracking events, and the AdVerifications node. Released December 2022; version history and addenda on the standards page. https://iabtechlab.com/standards/vast/ — accessed 2026-06-17.
  2. Video Multiple Ad Playlist (VMAP) 1.0. IAB. Tier 1. Defines the playlist of ad breaks around a piece of content: the AdBreak element, the timeOffset attribute (start / end / HH:MM:SS.mmm / percentage), breakType, and the AdSource / AdTagURI handoff to VAST. Published July 2012. https://www.iab.com/wp-content/uploads/2015/06/VMAPv1_0.pdf — accessed 2026-06-17.
  3. VAST CTV Addendum 2024. IAB Tech Lab. Tier 1. The connected-TV layer on VAST: the Ad Creative ID Framework (ACIF), Digital Services Act disclosure icons, and higher-resolution creative requirements for large screens. Published July 2024. https://iabtechlab.com/wp-content/uploads/2024/07/VAST-CTV-Addendum-2024-FINAL.pdf — accessed 2026-06-17.
  4. OpenRTB 2.6. IAB Tech Lab. Tier 1. The real-time-bidding protocol for programmatic auctions; version 2.6 adds CTV pod bidding (structured / dynamic / hybrid pods) and carries the SupplyChain object. https://iabtechlab.com/standards/openrtb/ — accessed 2026-06-17.
  5. Open Measurement SDK (OM SDK / OMID). IAB Tech Lab. Tier 1. The standardized viewability and verification SDK that parses the VAST 4+ AdVerifications node and reports geometry, obstruction, viewable impressions, and quartiles; extended to CTV (Samsung, LG ≈ 40% of CTV households) with device attestation. https://iabtechlab.com/standards/open-measurement-sdk/ — accessed 2026-06-17.
  6. ads.txt, app-ads.txt, and sellers.json — supply-chain transparency. IAB Tech Lab. Tier 1. The authorized-seller and seller-disclosure standards (with the SupplyChain object) that let buyers verify inventory is real and sold through authorized partners; app-ads.txt is the CTV-relevant form. https://iabtechlab.com/sellers-json/ — accessed 2026-06-17.
  7. Secure Interactive Media Interface Definition (SIMID). IAB Tech Lab. Tier 1. The interactive-ad interface that replaces the deprecated VPAID, separating interactivity from the media file so server-side insertion and CTV playback work. https://iabtechlab.com/standards/simid/ — accessed 2026-06-17.
  8. Tech Lab CodeBank: VAST (XML schemas and samples). IAB Tech Lab (GitHub). Tier 3 (issuing body's reference implementation). Cross-checked the InLine/Wrapper structure and the VMAP-to-VAST nesting against the official schema and sample tags. https://github.com/InteractiveAdvertisingBureau/vast — accessed 2026-06-17.
  9. CPM, eCPM, and fill rate — monetization metric definitions. Industry monetization references (AppsFlyer glossary; Dolby OptiView). Tier 5/6. Used for the standard formulas — CPM = cost ÷ impressions × 1,000; eCPM = revenue ÷ impressions × 1,000; fill rate = impressions ÷ requests — not for any spec claim. https://www.appsflyer.com/glossary/ecpm/ — accessed 2026-06-17.
  10. CTV ad pods: deduplication, competitive separation, frequency capping. Index Exchange; AdExchanger. Tier 5. Cross-checked the four pod controls and the "blank slate" failure mode that pod management addresses. https://www.indexexchange.com/2022/06/22/ctv-ad-pods/ — accessed 2026-06-17.
  11. Header bidding vs the waterfall; unified auctions for CTV. AdExchanger; PubMatic. Tier 5. Used for the waterfall-to-unified-auction yield framing (~15–25% video/CTV uplift) and the direct-plus-programmatic competition model; figures are industry estimates. https://www.adexchanger.com/tv-and-video/dont-go-chasing-waterfalls-what-ctv-publishers-need-to-know-about-unified-auctions/ — accessed 2026-06-17.
  12. Connected-TV programmatic ad-spend forecast, 2026. Industry analyst forecast compilations. Tier 5. US CTV programmatic spend ≈ $36B in 2026 (≈ $28B in 2025); cited for the scale of the ad-supported streaming this stack serves. https://www.emarketer.com/ — accessed 2026-06-17. Analyst estimates vary by source.
  13. Fora Soft — Video Streaming App Monetisation (overview blog). Fora Soft. Tier 7. Product-level companion on AVOD/FAST/SVOD/TVOD monetization; the commercial-intent counterpart this educational article links to. https://www.forasoft.com/blog/article/video-streaming-app-monetization-strategies — accessed 2026-06-17.

Where sources disagreed, the controlling standard was followed. The VAST response model (InLine vs Wrapper, the VASTAdTagURI redirect, the separated MediaFile, the sequence pod attribute, AdVerifications) is cited from the IAB Tech Lab VAST 4.3 standard and cross-checked against the official GitHub schema; the VMAP AdBreak / timeOffset model from the IAB VMAP 1.0 specification; pod bidding from OpenRTB 2.6. The popular shorthand that "VAST and VMAP are interchangeable" was overridden in favour of the precise framing — VMAP is the playlist of breaks, VAST is the ad inside a break. CPM/eCPM/fill-rate formulas and the CPM-vs-revenue worked examples are standard definitions; the CTV-spend and header-bidding-uplift figures are 2026 industry estimates, cited as such.