Why this matters

If you are an L&D director, an EdTech founder, or a product lead, this is the decision that quietly sets the ceiling on everything else. Treat it as "what's cheapest this quarter?" and you inherit someone else's video player, tracking model, and limits — or you over-build a custom system for parts of the product that were never going to differentiate you. The cost of getting it wrong is not a line item; it is a year of engineering pointed at the wrong layer. This article exists so you can make the call deliberately, with the 2026 numbers and the trade-offs in front of you. Read it with the platform anatomy map and the cost model open — this is the article that turns those two into a decision.

Stop asking "build or buy" as one question

The phrase "build versus buy" hides the mistake inside it. It makes you pick a single answer for a thing that is not a single thing. A learning-video platform — the software that hosts courses, plays video, tracks what learners do, and reports on it — is a stack of distinct layers, and each layer has its own answer.

Picture the stack from the bottom up. There is the learning management system (LMS), the software that holds courses, enrols learners, and records their progress. There is the content and video-on-demand (VOD) layer that stores and streams recorded lessons — VOD simply means video a learner can start any time, as opposed to a live broadcast. There is the live-classroom layer for real-time teaching. There is the interactive-video layer — quizzes, branching, and clickable overlays inside the player. There is the tracking-and-standards layer that turns player events into records an LMS can read. And increasingly there is an AI layer for captions, translation, tutoring, and summaries.

Asking "should we build or buy?" forces one verdict across all six. That is how teams end up building a course-enrolment system that a hundred vendors already sell, or buying a platform whose video player they will be fighting within a month. The better question is asked once per layer: for this specific layer, do we differentiate, or is this a solved commodity we should rent?

That reframing is the whole article. Everything below is how to answer it.

The three operating models, in plain language

Once you accept that the decision is per-layer, three recognizable patterns emerge across the whole platform. Define them before using them.

The first is buy-all: rent a finished, hosted learning platform that someone else operates, usually as software-as-a-service (SaaS — software you access over the web for a monthly fee instead of installing and running it yourself). You configure courses, learners log in, and you change nothing under the hood. Buy-all is renting a furnished apartment: you move in tomorrow, but you cannot move the walls.

The second is build-all: create a custom platform from an empty editor, choosing every technology and writing the course model, the player, the tracking, and the live stack yourself. Build-all is constructing a house from the foundation up on an empty lot — total freedom, total cost, total time.

The third is the hybrid: keep a bought or extended system for the commodity layers and build custom only the layer that makes the product special. Hybrid is buying a house with good bones and renovating the one room that is the reason you bought it. For a learning-video product that room is the video — the player, its interactivity, and how it is tracked.

[Image: three operating models for a learning-video platform (buy-all, hybrid, and build-all) placed on a speed-versus-control spectrum]
Figure 1. The three models on the one axis that explains them: speed and low starting cost on the left, control and differentiation on the right. The hybrid deliberately splits the difference by layer.

Underneath all three is a single trade-off, and once you see it the decision gets simpler. On one end is speed and low starting cost; on the other is control and the ability to differentiate. You cannot maximize both. Buy-all maximizes speed — learners in a course this week, a predictable monthly fee — at the cost of control. Build-all maximizes control — every pixel and every tracked event is yours — at the cost of time and money. Hybrid is the only model that refuses to pick globally and instead picks per layer, which is why it wins so often.

The one question that decides each layer

For every layer, ask one thing: is this where we differentiate, or is it a commodity? A commodity is a capability your buyers assume you have and never choose you for — course enrolment, a gradebook, single sign-on. A differentiator is the capability a learner or buyer would switch to you to get. Build differentiators; buy commodities. This is the oldest rule in build-vs-buy, and it holds because engineering spent on a commodity is engineering that produced no advantage you can charge for.

Run the rule down the stack and a clear default falls out for most video-first products.

The LMS core — accounts, enrolment, catalogs, grading, reporting plumbing — is a commodity. Hundreds of vendors sell it; learners do not pick you for your enrolment form. Buy or extend.

The standards-and-tracking layer is mostly commodity conformance with a differentiating edge. The conformance part — speaking SCORM, xAPI, cmi5, and LTI so your courses work inside other systems — is solved and you should inherit it. The edge is what you track in video, which a generic player will not capture. Buy the conformance, build the video-specific tracking.

The VOD, live, and interactive-video layers are where a video-first product differentiates. The feel of the player, the latency of the live class, the quizzes and branching baked into the timeline — that is the product. Build, or build on top of managed infrastructure.

The AI layer is its own build-vs-buy question and usually lands on "buy the model, build the wiring" — covered in depth in building vs buying AI features.

[Image: per-layer portfolio view, with each platform layer tagged buy, build, or hybrid]
Figure 2. The platform as a portfolio of layer decisions. The common winning pattern: buy the commodity layers, build the video layer. The mix is the strategy.

The picture that emerges — buy most layers, build one or two — is the hybrid model. It is not a compromise you settle for; it is the deliberate output of asking the differentiation question six times instead of once.
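
The per-layer rule can be written down as a small data structure. The following is a minimal sketch, not a prescribed tool: the `Part` class, the `decide_layer` function, and the way each layer is split into parts are illustrative assumptions, and the true/false flags simply restate the defaults this article suggests for a video-first product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    """One capability inside a layer (hypothetical modelling choice)."""
    name: str
    differentiates: bool  # would a learner or buyer choose you for this?


def decide_layer(parts: list[Part]) -> str:
    """Build differentiators, buy commodities; a mixed layer is 'hybrid'."""
    if not parts:
        raise ValueError("a layer needs at least one part")
    verdicts = {"build" if p.differentiates else "buy" for p in parts}
    return verdicts.pop() if len(verdicts) == 1 else "hybrid"


# Defaults for a video-first product, as described above. Adjust per product.
STACK = {
    "LMS core": [Part("accounts, enrolment, catalogs, grading", False)],
    "Tracking and standards": [
        Part("SCORM / xAPI / cmi5 / LTI conformance", False),
        Part("video-specific tracking", True),
    ],
    "VOD": [Part("player and playback experience", True)],
    "Live classroom": [Part("real-time class experience", True)],
    "Interactive video": [Part("quizzes, branching, overlays", True)],
    "AI": [Part("the model", False), Part("wiring into the product", True)],
}


def portfolio(stack: dict[str, list[Part]]) -> dict[str, str]:
    """Ask the differentiation question once per layer."""
    return {layer: decide_layer(parts) for layer, parts in stack.items()}


for layer, verdict in portfolio(STACK).items():
    print(f"{layer:24} {verdict}")
```

Flipping a single `differentiates` flag changes one layer's verdict without touching the others, which is the practical meaning of deciding per layer.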

Why the video layer is the usual build candidate

If the rule is "build where you differentiate," why does the video layer keep coming up as the thing to build? Because bought platforms treat video as a feature, and video-first products treat it as the product, and those two stances produce very different players.

A bought LMS gives you a competent, generic video player: it plays a file, it tracks that the file was watched, and it stops there. The moment you want a quiz that pauses the timeline and writes a score, a branching path where the learner's choice changes the next clip, a heatmap of where attention drops, or sub-second live latency for a real seminar, you are past what the generic player exposes. You can sometimes bolt interactivity on with a tool like H5P, but you are decorating someone else's player, not owning the experience.

There is also the tracking gap. A generic player reports "completed" and maybe "watched 90%." It does not, by default, emit the rich per-second video events — played, paused, seeked, completed, progress — defined by the xAPI Video Profile, the community profile that standardizes how video interactions are recorded as xAPI statements [5]. If your product's value is understanding how people watch, the generic player cannot see it. Owning the video layer is how you close both the experience gap and the tracking gap at once. The architecture for doing this is covered in building an interactive video player and, for real-time teaching, WebRTC for live learning.
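
To make the tracking gap concrete, here is a hedged sketch of the kind of statement a custom player could emit when a learner pauses. The function name and its parameters are hypothetical. The IRIs are written as I understand them from the published xAPI Video Profile and should be checked against the profile itself before use. The `progress` value is assumed to be computed by the player.

```python
import json
import uuid
from datetime import datetime, timezone

# IRIs as understood from the xAPI Video Profile; verify against the profile.
PROFILE_ID = "https://w3id.org/xapi/video"
VIDEO_ACTIVITY_TYPE = "https://w3id.org/xapi/video/activity-type/video"
VERB_PAUSED = "https://w3id.org/xapi/video/verbs/paused"
EXT_TIME = "https://w3id.org/xapi/video/extensions/time"
EXT_PROGRESS = "https://w3id.org/xapi/video/extensions/progress"
EXT_SESSION_ID = "https://w3id.org/xapi/video/extensions/session-id"


def paused_statement(learner_email: str, video_iri: str,
                     position_s: float, progress: float,
                     session_id: str) -> dict:
    """Build a 'paused' statement; progress is a 0..1 fraction from the player."""
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must be between 0 and 1")
    return {
        "id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor": {"objectType": "Agent", "mbox": f"mailto:{learner_email}"},
        "verb": {"id": VERB_PAUSED, "display": {"en-US": "paused"}},
        "object": {
            "objectType": "Activity",
            "id": video_iri,
            "definition": {"type": VIDEO_ACTIVITY_TYPE},
        },
        "result": {"extensions": {EXT_TIME: round(position_s, 3),
                                  EXT_PROGRESS: round(progress, 2)}},
        "context": {
            "contextActivities": {"category": [{"id": PROFILE_ID}]},
            "extensions": {EXT_SESSION_ID: session_id},
        },
    }


# Hypothetical learner and video identifiers, for illustration only.
stmt = paused_statement("maria@example.com",
                        "https://example.com/videos/module-3",
                        position_s=92.5, progress=0.4,
                        session_id="demo-session-1")
print(json.dumps(stmt, indent=2))
```

A generic player that reports only "completed" has no equivalent of the `result` extensions here, which is the signal a video-first product usually wants.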

Crucially, "build the video layer" rarely means building video infrastructure from raw parts. It means owning the player and its logic on top of managed delivery — a video API such as Mux, Cloudflare Stream, or api.video for recorded content, and a real-time platform such as LiveKit, Agora, or 100ms for live. You build the part learners feel; you rent the transcoding farms and global edge you never want to operate. That distinction is what keeps the hybrid affordable.

What it actually costs: a three-year total cost of ownership

The honest way to compare the three models is total cost of ownership (TCO) — every cost over a multi-year horizon, not the sticker price on day one. A leader who compares only upfront prices underestimates the long-term bill, because hidden integration, training, and per-seat fees can add 150–200% on top of a "buy" license over time [1]. Three years is the right window for a learning platform; five is better if your roadmap is stable.

The single most important fact about the three models is the shape of their cost, not the size. Buy-all is cheap to start and rises with every learner, forever. Build-all is expensive to start and then rises slowly with infrastructure. Hybrid starts in the middle and rises slowly. Those shapes cross, and where they cross decides the answer.

Let us make it concrete. Take a commercial education product whose video is the product, growing from about 5,000 active learners in year one to 20,000 in year two to 50,000 in year three.

Buy-all. A mid-market per-learner platform runs roughly $6 per active learner per month in 2026 [3][7]. Walk the arithmetic out loud for each year:

  • Year 1: 5,000 learners × $6 × 12 months = $360,000
  • Year 2: 20,000 × $6 × 12 = $1,440,000
  • Year 3: 50,000 × $6 × 12 = $3,600,000

Add a modest $20,000 first-year setup and the three-year buy-all total is about $5,420,000 — and at the end you still cannot build the custom interactive video the product needs, because the ceiling is the vendor's player. (Consumer course platforms sometimes charge a flat fee plus a revenue share instead of per-seat; the shape is the same — the bill grows with success, and the player is still theirs.)
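
The per-seat arithmetic above is easy to rerun with your own learner counts. This is a minimal sketch: the function name and parameters are illustrative, and the inputs are the article's example figures rather than a quote from any vendor.

```python
def per_seat_cost(learners_by_year, price_per_learner_month, setup_fee=0):
    """Return (cost per year, multi-year total) for a per-active-learner plan."""
    yearly = [n * price_per_learner_month * 12 for n in learners_by_year]
    return yearly, setup_fee + sum(yearly)


yearly, total = per_seat_cost([5_000, 20_000, 50_000], 6, setup_fee=20_000)
print(yearly)  # [360000, 1440000, 3600000]
print(total)   # 5420000
```

Because the bill is linear in learners, every change in the growth forecast moves the total by the same proportion.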

Build-all. A full custom platform — course model, player, tracking, live, the lot — lands around $250,000 to build in 2026, inside the $150,000–$300,000 range for a full-featured system [2][9]. Then you run it. Using managed video and live infrastructure, and paying for delivery that scales with learners (the detailed egress math lives in scaling delivery), illustrative run costs are about $70,000, $190,000, and $390,000 across the three years. Maintenance runs 15–25% of the build per year [2]; at 20% that is $50,000 a year after year one. Three-year build-all total: roughly $1,000,000.

Hybrid. Build only the video layer — custom player, interactivity, and the tracking bridge — for about $90,000, and extend or rent a commodity LMS core for the rest, say $30,000 a year to host and operate. You own the same video infrastructure as the build-all case, so the delivery costs are similar, about $40,000, $140,000, and $320,000 across three years. Maintenance on the smaller custom surface is about $18,000 a year after year one. Three-year hybrid total: roughly $716,000.

| Model | Up-front build | Run + license (3 yr) | Maintenance (3 yr) | 3-year TCO | Differentiation |
|---|---|---|---|---|---|
| Buy-all | ~$20,000 | ~$5,400,000 | $0 | ~$5,420,000 | Capped at vendor player |
| Build-all | ~$250,000 | ~$650,000 | ~$100,000 | ~$1,000,000 | Total, but you rebuilt commodity |
| Hybrid | ~$90,000 | ~$590,000 | ~$36,000 | ~$716,000 | High on video, commodity handled |

Figures are illustrative for a video-first product scaling to 50,000 learners; your numbers come from the worksheet below and the cost model. The point is the shape, not the cents.
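
To see the shape rather than only the totals, the three scenarios above can be laid out year by year. This is a sketch under the article's illustrative inputs. The function name is hypothetical, and it assumes up-front costs land in year one and maintenance (20% of the build here) starts in year two.

```python
from itertools import accumulate


def yearly_costs(run, build=0, setup=0, maintenance_rate=0.0):
    """Cost per year: build and setup in year 1, maintenance from year 2."""
    costs = []
    for year, run_cost in enumerate(run):
        extra = build + setup if year == 0 else build * maintenance_rate
        costs.append(round(run_cost + extra))
    return costs


models = {
    # Buy-all: per-seat fees from the example, plus a one-off setup fee.
    "buy-all": yearly_costs([360_000, 1_440_000, 3_600_000], setup=20_000),
    # Build-all: full custom build, illustrative run costs, 20% maintenance.
    "build-all": yearly_costs([70_000, 190_000, 390_000],
                              build=250_000, maintenance_rate=0.20),
    # Hybrid: delivery plus $30,000 a year for the commodity LMS core.
    "hybrid": yearly_costs([40_000 + 30_000, 140_000 + 30_000, 320_000 + 30_000],
                           build=90_000, maintenance_rate=0.20),
}

for name, costs in models.items():
    print(name, list(accumulate(costs)))
# buy-all [380000, 1820000, 5420000]
# build-all [320000, 560000, 1000000]
# hybrid [160000, 348000, 716000]
```

The last number in each list matches the 3-year TCO column. Replace the run-cost lists with figures from your own cost model to see where your curves cross.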

[Image: three-year cost curves for buy-all, build-all, and hybrid as learner count grows, with the crossover points where each model stops being cheapest]
Figure 3. Cost shape beats sticker price. Buy-all is cheapest at small, bounded scale and the most expensive at volume; hybrid and build-all start higher and rise slowly. Where the curves cross is where your answer changes.

Now flip the scenario, because buy-all is the right answer more often than founders admit. Take internal compliance training for a bounded audience of 2,000 employees, where video is lectures and screen recordings and nothing about the player differentiates anything. Buy-all costs 2,000 × $6 × 12 × 3 = $432,000 over three years, and you get standards conformance, a working player, and compliance reporting on day one. Building anything here would burn six figures to differentiate a cost centre that no learner chose you for. When the audience is bounded and the video is not the product, buy and move on.

The lesson from both scenarios is the same: there is no universal answer, only a crossover. Bounded audience or video-as-feature → buy. Growing audience and video-as-product → hybrid, then build-all only when scale and control justify rebuilding even the commodity.

The decision matrix

Cost is one axis. A real decision weighs several more. Here is the full comparison, including the standards-support row that tooling and platform tables in this section always carry.

| Criterion | Buy-all | Hybrid | Build-all |
|---|---|---|---|
| Time to first learners | Days to weeks | 2–4 months | 4–9 months |
| Up-front cost | Lowest | Medium | Highest |
| Cost shape as you scale | Rises with every learner | Rises slowly (infra) | Rises slowly (infra) |
| Control over video & UX | Low (vendor player) | High (you own video) | Total |
| Standards support (SCORM / xAPI / cmi5 / LTI) | Inherited from vendor | Inherited core + custom video tracking | You implement all of it |
| Best fit | Bounded audience, video-as-feature | Growing audience, video-as-product | Very large scale, deep control needs |
| Main risk | Ceiling + per-seat bill at scale | Integration seams between bought and built | Cost, timeline, and rebuilding commodity |

The winning-fit cells are not a single column — they move with your situation, which is exactly why a matrix beats a verdict. A compliance team reads down the "buy-all" column and stops. A scaling EdTech reads the "hybrid" column. A platform at MOOC scale, where even the commodity layers need control, reads "build-all" — see the MOOC reference architecture.

Standards support: who owns conformance in each model

Standards are where a build-vs-buy decision quietly becomes a conformance decision, so be precise. Four standards matter for learning video, and each is owned by a named body and a named version.

SCORM (Sharable Content Object Reference Model), maintained by ADL, packages a course so any compliant LMS can launch and track it; SCORM 1.2 and SCORM 2004 4th Edition track a fixed data model — completion, score, time, and a limited interactions set — inside an LMS launch [4]. xAPI (the Experience API), originally ADL's xAPI 1.0.3 and since October 2023 the IEEE standard 9274.1.1-2023 (also called xAPI 2.0), records learning as statements — "Maria completed Module 3" — written to a Learning Record Store (LRS), the database those statements live in, and can track learning anywhere, online or offline [6][8]. cmi5 (ADL) is the rulebook that lets xAPI content be launched and managed from an LMS, bridging SCORM-style launch with xAPI's richer tracking [4]. LTI (Learning Tools Interoperability 1.3 / LTI Advantage, from 1EdTech) lets a tool launch securely inside any LMS using an OpenID Connect login and a signed JSON Web Token — single sign-on is a consequence of the mechanism, not the mechanism itself — and its Assignment and Grade Services pass grades back to the LMS gradebook [10].
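
For readers who have not seen one, the "Maria completed Module 3" statement looks roughly like the following. This is a minimal sketch: the helper names, the email address, and the activity IRI are hypothetical. The `completed` verb IRI is the commonly used ADL one. A real LRS validates far more than the three required properties checked here.

```python
import json

VERB_COMPLETED = "http://adlnet.gov/expapi/verbs/completed"


def make_statement(actor_name, actor_email, verb_id, verb_label,
                   activity_id, activity_name):
    """Assemble the actor / verb / object triple every xAPI statement needs."""
    return {
        "actor": {"objectType": "Agent", "name": actor_name,
                  "mbox": f"mailto:{actor_email}"},
        "verb": {"id": verb_id, "display": {"en-US": verb_label}},
        "object": {"objectType": "Activity", "id": activity_id,
                   "definition": {"name": {"en-US": activity_name}}},
    }


def missing_required(statement):
    """List which of the required top-level properties are absent."""
    return [key for key in ("actor", "verb", "object") if key not in statement]


stmt = make_statement("Maria", "maria@example.com",
                      VERB_COMPLETED, "completed",
                      "https://example.com/courses/demo/module-3", "Module 3")
print(json.dumps(stmt, indent=2))
print(missing_required(stmt))  # []
```

SCORM has no equivalent free-form record: its run-time writes into a fixed data model, which is the contrast the paragraph above draws.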

Now map ownership to each model, because this is where hybrid earns its place:

In buy-all, the vendor implements the standards and you inherit their conformance. That is a real advantage — SCORM and LTI 1.3 work on day one — but you also inherit their limits: if the vendor's player does not emit xAPI Video Profile statements, you cannot see per-second video behaviour no matter how much you want to.

In build-all, you implement every standard yourself: the SCORM run-time, an xAPI pipeline and an LRS, cmi5 launch, and LTI 1.3 as both tool and platform. Maximum control, maximum effort, and real conformance risk — getting SCORM sequencing or an LTI signed launch subtly wrong is a classic, expensive bug.

In hybrid, you keep the bought core's standards conformance and build the video tracking the core cannot do. The custom player emits xAPI Video Profile statements to the core's LRS (or your own), so courses still launch via SCORM or cmi5 and report grades via LTI, while your video layer captures the rich signal a generic player throws away. That combination — inherited conformance plus owned video tracking — is the technical reason hybrid is the default for video-first products. For the full standards comparison, see SCORM vs xAPI vs cmi5 vs LTI.
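
The bridge between a custom player and the core's LRS is an HTTP call to the LRS statements resource. The sketch below builds that request without sending it. The endpoint URL and credentials are placeholders, the function name is hypothetical, and HTTP Basic auth is an assumption, so check which auth scheme and xAPI version your LRS expects.

```python
import base64
import json
from urllib.request import Request


def build_lrs_request(endpoint, key, secret, statements, version="1.0.3"):
    """Prepare a POST of one or more statements to an LRS (not sent here)."""
    token = base64.b64encode(f"{key}:{secret}".encode("utf-8")).decode("ascii")
    return Request(
        url=endpoint.rstrip("/") + "/statements",
        data=json.dumps(statements).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # xAPI requires the version header on every request.
            "X-Experience-API-Version": version,
            "Authorization": f"Basic {token}",
        },
    )


# Placeholder endpoint and credentials; a real player would batch its events.
req = build_lrs_request(
    "https://lrs.example.com/xapi/", "client-key", "client-secret",
    [{"actor": {"mbox": "mailto:maria@example.com"},
      "verb": {"id": "https://w3id.org/xapi/video/verbs/played"},
      "object": {"id": "https://example.com/videos/module-3"}}],
)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

Whether the statements go to the bought core's LRS or to one you operate, the player-side code is the same. Only the endpoint and credentials change.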

A decision tree you can run in ten minutes

You do not need a quarter of analysis to get a strong first answer. Four questions, asked in order, settle most cases.

1. Is the video experience a differentiator or a feature? If learners or buyers would choose you for the video — its interactivity, its latency, what it tracks — keep going down the build/hybrid path. If video is lectures nobody picks you for, jump to "buy-all" and stop.

2. Is your audience bounded or growing? A bounded audience (employees, a fixed cohort) keeps per-seat SaaS cheap, which favours buy-all even with decent video needs. A growing, open audience makes the per-seat bill the dominant cost and pushes you toward owning infrastructure.

3. Do you have, or can you fund, an engineering team? Building or extending requires engineers who will own the system for years, not just at launch. No team and no budget to hire one means buy-all regardless of ambition — a half-built platform nobody maintains is the worst outcome.

4. How deep are your integration and compliance requirements? If you must integrate with a specific corporate LMS, talent system, or single-sign-on, and meet standards conformance your buyers audit, that favours keeping a proven, certified core (buy or hybrid) over reimplementing it (build-all).

Run those four and you land in a model: video-as-feature or no team → buy-all; video-as-product with a team and growing scale → hybrid; very large scale with deep control needs and the budget to rebuild commodity → build-all.
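
The four questions can be sketched as a short function. This is one reading of the tree, not the scored worksheet. The field names are hypothetical, a bounded audience is simplified to a buy-all answer, and the two build-all flags are assumptions standing in for "very large scale" and "budget to rebuild commodity".

```python
from dataclasses import dataclass


@dataclass
class Situation:
    video_is_differentiator: bool   # Q1: would buyers choose you for the video?
    audience_is_growing: bool       # Q2: open and growing, or bounded?
    has_engineering_team: bool      # Q3: a team that will own it for years?
    needs_certified_core: bool      # Q4: deep integration or audited conformance?
    very_large_scale: bool = False            # assumption: MOOC-like scale
    can_fund_commodity_rebuild: bool = False  # assumption: budget to rebuild it


def recommend(s: Situation) -> str:
    """Route the four questions, in order, to an operating model."""
    if not s.video_is_differentiator:
        return "buy-all"   # video is a feature: stop here
    if not s.audience_is_growing:
        return "buy-all"   # simplification: bounded audience keeps per-seat cheap
    if not s.has_engineering_team:
        return "buy-all"   # nobody to maintain a custom layer
    if (s.very_large_scale and s.can_fund_commodity_rebuild
            and not s.needs_certified_core):
        return "build-all"
    return "hybrid"


print(recommend(Situation(True, True, True, True)))    # hybrid
print(recommend(Situation(False, False, True, True)))  # buy-all
```

Treat the output as a first answer to test against the cost numbers, since a borderline case can reasonably land on either side of a branch.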

[Image: decision tree with four questions (differentiator, audience size, engineering capacity, integration and compliance depth) leading to a buy-all, hybrid, or build-all recommendation]
Figure 4. The ten-minute decision tree. Four questions in order route you to a model; the worksheet turns the route into a scored, defensible recommendation.

To run this against your own numbers, download the build-vs-buy-vs-hybrid decision worksheet — a one-page scored questionnaire across differentiation, scale, team, integration, standards, and budget shape that turns the four questions into a recommendation you can put in front of a board. When you are ready to turn the recommendation into a number and a timeline, scoping and estimating a learning-video project is the next step.

Common mistakes

The most expensive errors in this decision are predictable, so name them.

Treating it as one switch. Building the enrolment form and the gradebook because you decided to "build the platform" burns months on commodity nobody chose you for. Decide per layer.

Buying for a product you intend to differentiate. A SaaS platform bought for a consumer product that competes on experience hits its customization ceiling in the first week of beta and never moves. If the video is the product, buy-all rarely survives year two.

Building to avoid a subscription, then under-funding its upkeep. A custom platform with no maintenance budget rots. Maintenance is 15–25% of the build cost every year [2]; if you cannot fund that, you cannot afford to build.

Ignoring the cost shape. Comparing year-one prices makes buy-all look cheapest every time. It is cheapest at small scale; model three years, because the per-seat curve is the whole point.

Forgetting standards until integration. Discovering at launch that your built platform does not speak LTI 1.3 to a buyer's LMS, or does not emit xAPI for the reports they require, turns a sale into a six-month retrofit. Decide standards support up front; it is a row in the matrix for a reason.

Where Fora Soft fits in

Fora Soft has built video software since 2005 — streaming, WebRTC conferencing, OTT, surveillance, telemedicine, and e-learning — across 239+ shipped projects, which means we have run this exact decision many times. We are not a reseller of any one model; our build-vs-buy framing is the work. In practice we most often help teams execute the hybrid: keep your proven LMS core and its standards conformance, and let us build the differentiating video layer — the interactive player, the live-classroom experience, and the xAPI Video Profile tracking — on top of managed infrastructure so it scales without an operations team you do not have. When the scenario genuinely calls for a full custom platform, we build that; when it calls for buying, we say so. The value is choosing the right model per layer, then engineering the one layer that earns it.

References

  1. TechTarget / Gartner TCO model — Total cost of ownership for build vs buy software, 2026. https://www.techtarget.com/searchcio/definition/TCO — Tier 5. Used for the 150–200% hidden-cost figure and the multi-year TCO lens.
  2. AnyforSoft — How Much Does Custom LMS Development Cost in 2026: From PoC to Full Build, 2026. https://anyforsoft.com/blog/custom-lms-development-cost-2026-from-poc-to-full-build/ — Tier 4. Used for build-cost ranges and the 15–25% annual maintenance figure.
  3. TalentLMS — 2026 LMS Pricing Guide: Hidden Costs & Tips for Buyers, 2026. https://www.talentlms.com/blog/lms-pricing/ — Tier 4. Used for per-learner SaaS pricing and hidden-cost framing.
  4. ADL Initiative — SCORM and cmi5 (Content Aggregation Model; Run-Time Environment; Sequencing and Navigation; cmi5 specification). https://adlnet.gov/projects/scorm/ and https://github.com/AICC/CMI-5_Spec_Current — Tier 1, primary standard. Used for SCORM's fixed data model and cmi5's launch role.
  5. ADL / xAPI community — xAPI Video Profile. https://adlnet.gov/projects/xapi-video/ — Tier 1, primary profile. Used for the per-second video events a generic player omits.
  6. IEEE Standards Association — IEEE 9274.1.1-2023: JSON Data Model Format and RESTful Web Service for Learner Experience Data Tracking and Access (xAPI 2.0), 2023. https://standards.ieee.org/ieee/9274.1.1/7321/ — Tier 1, primary standard. Used for xAPI's current standard status and the LRS definition.
  7. Docebo — Pricing and Plans, 2026. https://www.docebo.com/pricing/ — Tier 4. Used for enterprise LMS quote-based pricing context (~$25,000/yr for ~500 learners).
  8. ADL Initiative — Experience API (xAPI) overview, accessed 2026. https://www.adlnet.gov/projects/xapi/ — Tier 2, issuing-body guidance. Used for xAPI statement semantics and Tin Can history.
  9. Enacton — LMS Development Cost: The 2026 Breakdown, 2026. https://www.enacton.com/blog/lms-development-cost/ — Tier 4. Used to corroborate full-build and enterprise cost ranges.
  10. 1EdTech (IMS Global) — Learning Tools Interoperability (LTI) 1.3 and LTI Advantage (LTI 1.3 Core; Assignment and Grade Services 2.0; Names and Role Provisioning 2.0; Deep Linking 2.0). https://www.1edtech.org/standards/lti — Tier 1, primary standard. Used for the OIDC + signed-JWT launch mechanism and grade passback.

Where sources disagreed, the official standards (ADL, IEEE, 1EdTech) were followed over vendor blogs: in particular, vendor posts that describe xAPI loosely as "Tin Can" or omit its IEEE 9274.1.1-2023 status were overridden by the IEEE and ADL primary sources [6][8].