This is engineering guidance, not legal advice. Confirm specifics with qualified counsel.
Why this matters
The choice to build or buy an AI feature is made early, often by the person with the smallest budget and the least time, and it is expensive to reverse. Pick "buy" and you may be paying a per-provider fee forever and routing patient data through a vendor you cannot fully audit; pick "build" and you may discover, a year and several salaries in, that your model still is not better than the product you could have licensed on day one. This article is for the founder, product manager, or engineering lead deciding where each AI feature should come from and what it will actually cost. It draws the full-cost picture in plain language, walks the arithmetic out loud, and gives you a decision guide that keeps the compliance bar — the same bar either way — front and center.
"Build" and "buy" are a spectrum, not a switch
The first mistake is treating this as two doors. In practice an AI feature can come from any of four places along a spectrum, and most real products mix them feature by feature.
At one end is buy turnkey: you license a finished clinical-AI product — an ambient scribe, say — that does the whole job, signs a contract to handle patient data, and updates its own model. You integrate it and configure it; you do not train or host anything. At the other end is build from scratch: you train or heavily adapt your own model on your own data and run it on infrastructure you control. Almost nobody in telemedicine should start here, and we will see why.
Between those two are the options most teams actually choose. Buy the model, build the app means calling a general-purpose large language model through a vendor's hosted API (application programming interface, the standard way one program asks another for a result) and writing the clinical product logic around it yourself. Self-host an open model means taking an open-source model whose weights you can download, running it on a server you rent or own, and building the product around that. The difference between these two middle options is who runs the model: a vendor's data center, or yours.
One term will recur, so anchor it now. A business associate agreement, or BAA, is the signed contract in which any vendor that touches patient data on your behalf promises to guard it and accepts liability under United States health-privacy law (45 CFR §164.502(e)).¹ Think of it as the key-and-keycard agreement every contractor signs before they are let into the building. Where your feature sits on the spectrum changes who you need that agreement with — the finished-product vendor, the model API provider, or the cloud host — but it never removes the requirement.
Figure 1. The build-buy spectrum. Most telemedicine teams live in the middle two columns — buying a model and building the product around it — and choose per feature, not once for the whole platform.
The price you compare first is the tip of the iceberg
Here is the trap that sinks most build-vs-buy decisions: comparing the license fee against the engineer's salary and stopping there. Industry analyses of these decisions estimate that the obvious upfront numbers miss roughly 60–80% of the true cost of ownership.² The cost that matters has three layers, and only the top one is visible when you start.
The visible layer is what you compare on day one: the per-provider license fee if you buy, or the inference cost — the per-token API charge or the GPU rental — if you build. It is real, it is easy to quote, and it is the smallest of the three over a product's life.
The integration layer sits just below the waterline. An AI feature is worthless until it is wired into the actual visit: capturing audio from the call, putting a draft in front of the clinician, writing the result back to the medical record as a draft document, and logging every step for audit. That wiring is engineering work whether you build or buy — a bought product still has to be integrated — and it is usually weeks of it.
The validation-and-maintenance layer is the deep mass nobody quotes. Before launch you must prove the feature is accurate and unbiased on your patients (the validation tax, below). After launch, models drift, vendors change endpoints, regulations move, and clinicians find failure modes you did not test for. Across the industry, maintenance is consistently described as the part that dominates total cost of ownership — teams routinely spend the majority of their engineering time keeping what they built working rather than building anything new.² When you buy, much of this burden is the vendor's; when you build, all of it is yours, forever.
Figure 2. The three cost layers. The license fee or per-token rate you compare first is the visible tip; integration and the validation-and-maintenance burden below the waterline decide the real total — and "build" puts the whole submerged mass on your team.
A worked example: cost per consult for an AI scribe
Numbers make this concrete. Take the most common AI feature in telemedicine — an ambient scribe that listens to a visit and drafts the clinical note, covered in depth in our ambient clinical documentation (AI scribe) article — and price it per consult for a small product doing 2,000 visits a month, 15 minutes each. We will compare buying a managed service against building on a self-hosted open model. All figures are mid-2026 list prices, labeled, and meant as a planning model, not a quote.
The buy path. A managed clinical-transcription-and-summary service such as AWS HealthScribe is billed by audio length, listed at about $0.10 per minute of audio in mid-2026.³ Do the arithmetic out loud:
- 15 minutes per visit × $0.10/minute = $1.50 per consult in model cost.
- 2,000 visits × $1.50 = $3,000 per month, all-in for the AI inference, with the vendor's BAA, updates, and uptime included.
That is the whole model bill. Your remaining cost is the integration work — wiring it into the call and the record — which you pay once and then maintain.
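The arithmetic above can be written as a small planning function so the same calculation can be rerun when the rate or the visit length changes. This is a minimal sketch: the function name and parameters are illustrative, and the $0.10 rate is the mid-2026 list price quoted above, which should be re-verified before use.

```python
def managed_scribe_cost(visits_per_month: int,
                        minutes_per_visit: float,
                        price_per_audio_minute: float) -> tuple[float, float]:
    """Return (cost per consult, cost per month) for a service billed per audio minute."""
    per_consult = minutes_per_visit * price_per_audio_minute
    per_month = per_consult * visits_per_month
    return per_consult, per_month


# Figures from the worked example: 2,000 visits, 15 minutes each, $0.10 per minute.
per_consult, per_month = managed_scribe_cost(2_000, 15, 0.10)
print(f"${per_consult:.2f} per consult, ${per_month:,.2f} per month")
```

With the example's inputs this reproduces the $1.50 per consult and $3,000 per month figures.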
The build path. Self-hosting an open-source speech model plus an open language model to summarize looks cheaper per token and is genuinely cheaper at the margin. A capable GPU on a specialist cloud host rents for roughly $1.50–$3.50 per hour in mid-2026;⁴ a self-hosted dual-GPU box amortizes to roughly $900–$1,200 a month in hardware before power and hosting.⁴ At a few thousand visits a month, raw inference can land below the $3,000 the managed service costs. But that line is the visible tip. Underneath it you now own: the validation that the open model is accurate enough on clinical speech, the engineering to keep the server patched and the model current, the on-call when it falls over during a clinic's busy Monday, and the monitoring to catch drift. A fully loaded machine-learning engineer is commonly costed at $200,000–$300,000 a year,² or roughly $17,000–$25,000 a month; even one engineer assigned to this feature costs several times the entire $3,000 managed-service bill, so the inference savings cannot cover the salary until your volume is very large.
The lesson is not "buying is always cheaper." It is that build wins on per-unit inference cost and loses on everything around it until volume is high. The crossover exists — at enough consults per month, owning the model pays off — but it sits much further out than the per-token comparison suggests, and you only reach it if you can keep utilization high and an engineer assigned. For a team doing 2,000 visits a month, buy is almost always cheaper and lower-risk. For a hospital network doing 200,000, the math can flip.
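One way to see where the crossover sits is to model build as a fixed monthly cost plus a small per-consult cost, and buy as a pure per-consult cost. The sketch below is a planning toy, not a quote. The hardware and salary defaults are midpoints of the ranges cited above, while the $0.20 marginal cost per self-hosted consult and the one-full-time-engineer staffing level are hypothetical placeholders to replace with your own numbers. It also holds hardware constant, which understates build cost at very high volume.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BuildAssumptions:
    hardware_per_month: float = 1_050.0        # midpoint of the $900-$1,200 range
    engineer_cost_per_year: float = 250_000.0  # midpoint of the $200k-$300k range
    engineer_fraction: float = 1.0             # hypothetical: one full-time engineer
    marginal_cost_per_consult: float = 0.20    # hypothetical power/hosting per consult

    @property
    def fixed_per_month(self) -> float:
        """Costs that do not change with visit volume in this simplified model."""
        return self.hardware_per_month + self.engineer_fraction * self.engineer_cost_per_year / 12


def buy_monthly(visits: int, price_per_consult: float = 1.50) -> float:
    return visits * price_per_consult


def build_monthly(visits: int, a: BuildAssumptions) -> float:
    return a.fixed_per_month + visits * a.marginal_cost_per_consult


def break_even_visits(a: BuildAssumptions, price_per_consult: float = 1.50) -> Optional[float]:
    """Monthly visits at which build and buy cost the same, or None if build never wins."""
    saving_per_consult = price_per_consult - a.marginal_cost_per_consult
    if saving_per_consult <= 0:
        return None
    return a.fixed_per_month / saving_per_consult


assumptions = BuildAssumptions()
for visits in (2_000, 200_000):
    print(visits, round(buy_monthly(visits)), round(build_monthly(visits, assumptions)))
print("break-even visits per month:", round(break_even_visits(assumptions)))
```

Under these placeholder inputs, buy is far cheaper at 2,000 visits and build is cheaper at 200,000, which matches the direction of the argument above. The exact break-even point moves a lot with the staffing assumption, so treat it as a sensitivity check, not a target.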
Figure 3. The four approaches across the costs that actually decide it. Read the last two columns first: who owns maintenance, and whether the data path can be put under a BAA, often matter more than the per-token price.
The validation tax: you pay it either way
A tempting story says buying transfers the accuracy problem to the vendor. It does not transfer all of it. Whoever builds the model is responsible for its general performance, but you are responsible for whether it is safe and unbiased on your patient population and your workflow — and that responsibility is increasingly a legal one, not just good practice.
Two rules make this concrete, and both apply whether you built the model or bought it. First, the line between a tool that supports a clinician and one that makes the medical decision: software that effectively decides can be a regulated medical device under United States law. The FDA's Clinical Decision Support guidance sets four criteria, all of which a tool must meet to stay outside device regulation. Chief among them: the tool advises a health professional who can independently review the basis for the advice rather than relying primarily on it.⁵ A feature that crosses that line needs FDA clearance no matter where the model came from; our AI triage article walks this device line in depth. Buying a "cleared" component helps; bolting your own logic on top can push the combined product back over the line.
Second, the anti-discrimination duty. Under the 2024 Section 1557 rule, a covered health entity must make reasonable efforts to identify and mitigate discrimination from patient-care decision-support tools that use characteristics like race, age, or disability — directly or through a proxy (45 CFR §92.210).⁶ That duty lands on you, the deployer, even if a vendor trained the model. So your validation work — measuring accuracy and checking for bias on your own population before launch and on a schedule after — is not optional and not transferable. Buying shrinks it; it never erases it. The full treatment of these gates lives in our compliance and safety layer for clinical AI; the point here is that the validation tax is a line item in both the build budget and the buy budget.
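A first pass at "checking for bias on your own population" can be as simple as comparing accuracy across patient subgroups on a reviewed sample. The sketch below assumes you already have clinician-reviewed records labeled correct or incorrect and tagged with a subgroup. The group labels, the sample data, and the 5-point gap threshold are all hypothetical; the threshold is an internal tripwire for further review, not a regulatory standard.

```python
from collections import defaultdict
from typing import Iterable


def accuracy_by_group(records: Iterable[tuple[str, bool]]) -> dict[str, float]:
    """records: (subgroup label, was the output judged correct) pairs."""
    totals: dict[str, int] = defaultdict(int)
    correct: dict[str, int] = defaultdict(int)
    for group, ok in records:
        totals[group] += 1
        correct[group] += int(ok)
    return {group: correct[group] / totals[group] for group in totals}


def flag_gaps(accuracy: dict[str, float], max_gap: float = 0.05) -> list[str]:
    """Return subgroups whose accuracy trails the best subgroup by more than max_gap."""
    best = max(accuracy.values())
    return sorted(g for g, acc in accuracy.items() if best - acc > max_gap)


# Hypothetical reviewed sample: group A is 95/100 correct, group B is 80/100.
sample = ([("A", True)] * 95 + [("A", False)] * 5
          + [("B", True)] * 80 + [("B", False)] * 20)
acc = accuracy_by_group(sample)
print(acc, flag_gaps(acc))
```

A flagged group is a prompt to investigate, not a verdict: small subgroup samples produce noisy accuracy estimates, so sample size matters as much as the gap itself.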
There is a maintenance-of-accuracy wrinkle unique to AI. A model you keep improving is not a fixed product; the FDA addressed this with its Predetermined Change Control Plan framework, which lets a cleared model be updated within a pre-agreed envelope without a new submission each time.⁷ If you build and intend to retrain, that lifecycle is your responsibility; if you buy, ask the vendor how their model updates are governed and validated, because their drift becomes your patients' experience.
The compliance layer is identical on both sides
It is worth saying plainly, because it removes a false reason to build: the privacy rules do not get easier if you build it yourself. Every typed symptom, every recorded sentence of a visit, every draft note is Protected Health Information (PHI): health data tied to an identifiable person. The obligations that attach to PHI are the same whether the model runs in a vendor's cloud or on your own server.
If you buy, the model vendor is a business associate and needs a signed BAA before a single byte of patient data reaches it, plus — specifically for AI — a no-training clause so your patients' data is not absorbed into the vendor's future models.¹ If you build on a general-model API, the API provider is a business associate and needs exactly the same BAA and no-training clause; the major cloud model platforms offer their model services under a BAA, but you must use the covered enterprise service and confirm the agreement names it. If you self-host an open model, you have removed the model vendor from the data path — a real privacy advantage — but you have added yourself as the party that must now satisfy every safeguard the BAA would have required: encryption, access control, audit logging, and the rest. Building does not delete the compliance work; it moves it onto your own roof. Keeping PHI inside a controlled boundary, and de-identifying data before it leaves that boundary for analytics (45 CFR §164.514(b)), is the same job in all three cases.⁸
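The "not a single byte before the BAA" rule can be enforced in code as well as in contract, with a guard that runs before any request leaves your boundary. This is a minimal sketch under stated assumptions: `VendorAgreement`, `PhiNotPermitted`, and `assert_phi_allowed` are hypothetical names, the vendor and service names are placeholders, and the record of what is signed would come from wherever your organization tracks contracts.

```python
from dataclasses import dataclass


class PhiNotPermitted(Exception):
    """Raised when PHI would travel over a path that is not contractually covered."""


@dataclass(frozen=True)
class VendorAgreement:
    vendor: str
    baa_signed: bool
    no_training_clause: bool
    covered_services: frozenset = frozenset()  # services the BAA names explicitly


def assert_phi_allowed(agreement: VendorAgreement, service: str) -> None:
    """Refuse to proceed unless the BAA, the no-training clause, and the service all check out."""
    if not agreement.baa_signed:
        raise PhiNotPermitted(f"{agreement.vendor}: no signed BAA")
    if not agreement.no_training_clause:
        raise PhiNotPermitted(f"{agreement.vendor}: no no-training clause")
    if service not in agreement.covered_services:
        raise PhiNotPermitted(f"{agreement.vendor}: '{service}' is not named in the BAA")


# Hypothetical vendor record; call the guard before sending any transcript or audio.
example = VendorAgreement("example-vendor", True, True, frozenset({"transcription"}))
assert_phi_allowed(example, "transcription")  # passes silently
```

A guard like this does not make a system compliant. It only turns one contractual precondition into a failure that is loud and early instead of silent.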
Lock-in and the exit question
Price is what you pay; lock-in is what you cannot stop paying. Before committing a feature to a vendor, ask the exit question: if this vendor doubled its price or went out of business next year, how hard is it to leave?
Buying turnkey gives the fastest start and the deepest lock-in. Your product depends on the vendor's model, format, and roadmap; their price increase is your price increase, and migrating off can mean rebuilding the feature. Calling a model through an API is looser — you can often swap one model provider for another with bounded work if you kept your own prompt logic and data — but you still depend on that provider's availability and terms. Self-hosting an open model gives the most independence: the weights are yours to keep running even if the original publisher disappears, which is exactly why some teams accept the higher operational burden. Building from scratch gives total control and total responsibility.
None of these is the "right" answer; the point is to price lock-in as part of the decision, not discover it at renewal. A common, sensible middle path is to buy now to ship, while keeping the integration seam clean enough to swap later — wrap the vendor behind your own interface so the rest of your product does not know or care which model is behind it. That preserves the option to move toward build if and when your volume justifies it.
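A clean integration seam can be sketched as an interface your product owns, with each vendor or self-hosted model hidden behind an adapter. Everything here is hypothetical: `ScribeBackend`, `DraftNote`, and both adapter classes are illustrative names, and the adapter bodies are stand-ins where a real implementation would call the vendor's SDK or your own inference server.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class DraftNote:
    text: str
    source: str  # which backend produced the draft, useful for audit logs


class ScribeBackend(Protocol):
    """The only surface the rest of the product is allowed to depend on."""
    def draft_note(self, transcript: str) -> DraftNote: ...


class VendorScribe:
    """Stand-in adapter; a real one would call the licensed vendor's API here."""
    def draft_note(self, transcript: str) -> DraftNote:
        return DraftNote(text=f"[vendor draft] {transcript[:40]}", source="vendor")


class SelfHostedScribe:
    """Stand-in adapter; a real one would call a model on infrastructure you run."""
    def draft_note(self, transcript: str) -> DraftNote:
        return DraftNote(text=f"[self-hosted draft] {transcript[:40]}", source="self-hosted")


def document_visit(backend: ScribeBackend, transcript: str) -> DraftNote:
    """Product code: it never imports a vendor SDK, only the interface above."""
    return backend.draft_note(transcript)


note = document_visit(VendorScribe(), "Patient reports three days of mild headache.")
print(note.source, "|", note.text)
```

Swapping providers then means writing one new adapter and changing which object is passed to `document_visit`, while the clinician-facing code and the record write-back stay untouched.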
A four-question decision guide
Put the framework to work with four questions, answered per feature, not once for the whole platform.
1. What is your volume, and will it stay high? Below a few thousand consults a month, buy — you will not reach the crossover where owning the model pays for the engineers it needs. At hospital-network scale with steady utilization, building becomes defensible.
2. Is this feature your core differentiator, or table stakes? Transcription and scribing are commodities many vendors do well; buying them frees your team for what makes your product distinct. Build only where the AI is the product, or where no vendor can do what you need.
3. Can you staff the maintenance, honestly? Building is not a launch; it is a standing commitment to validation, monitoring, retraining, and on-call. If you cannot dedicate an engineer to keeping the feature alive indefinitely, you cannot build it — you can only ship it and watch it rot.
4. What does the data path require? If a vendor will sign a BAA with a no-training clause and that satisfies your risk posture, buying is clean. If your risk or contractual situation means PHI must never leave infrastructure you control, that pushes you toward self-hosting an open model — and you accept the operational cost that comes with it.
Figure 4. The decision guide as a tree. Most features exit at "buy" or "buy the model, build the app"; "self-host" is the answer when the data path or scale demands it and you can staff the upkeep.
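The four questions can also be encoded as a small function, which makes the order of the checks explicit and lets you run every feature on your roadmap through the same logic. This is one possible reading of the guide, not a definitive rule set: the function name, the ordering of the checks, and the 50,000-visit default for "high volume" are assumptions to adjust, since the article only places the crossover somewhere between a few thousand visits and hospital-network scale.

```python
def recommend(visits_per_month: int,
              core_differentiator: bool,
              can_staff_maintenance: bool,
              phi_must_stay_in_house: bool,
              high_volume_threshold: int = 50_000) -> str:
    """Return a sourcing recommendation for a single AI feature."""
    # Question 4: a hard data-path constraint overrides the cost questions.
    if phi_must_stay_in_house:
        if can_staff_maintenance:
            return "self-host an open model"
        return "unresolved: data path requires self-hosting but upkeep is unstaffed"
    # Question 3: without standing maintenance capacity, building is off the table.
    if not can_staff_maintenance:
        return "buy turnkey"
    # Question 1: only sustained high volume reaches the crossover.
    if visits_per_month >= high_volume_threshold:
        return "self-host an open model"
    # Question 2: build product logic only where the feature differentiates you.
    if core_differentiator:
        return "buy the model, build the app"
    return "buy turnkey"


print(recommend(2_000, core_differentiator=False,
                can_staff_maintenance=True, phi_must_stay_in_house=False))
```

The "unresolved" branch is deliberate: when the data path demands self-hosting but nobody can own the upkeep, the honest output is a staffing decision to make, not a sourcing choice.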
A common, expensive mistake
The signature failure here is building to save money and discovering the savings were never the cost that mattered. A team prices a self-hosted open model against a vendor license, sees a lower per-token number, and starts building. Twelve months later the inference savings are real but small, and they are buried under an engineer's salary, a validation effort nobody scoped, an on-call rotation for a feature that is not the product, and a model quietly drifting because no one owns its accuracy. Industry write-ups of these projects are blunt: most build-vs-buy analyses compare only upfront costs and miss the majority of the lifetime cost, and a large share of organizations that build clinical AI would have been better served buying.² The mistake is not choosing build; it is choosing build on a comparison that only looked at the tip of the iceberg.
The quieter cousin is buying without the exit seam — wiring a vendor so deeply into the product that leaving means a rebuild, then absorbing every price increase because the alternative is worse. Both mistakes come from pricing one layer and ignoring the others. Price all three layers, plus lock-in, and neither trap stays tempting.
Where Fora Soft fits in
The requirement comes first: an AI feature has to stay on the supportive side of the FDA's decision-support line, keep every byte of patient data inside a BAA-covered boundary, and be validated for accuracy and bias on the real population — and those duties hold whether the model is bought or built. Fora Soft has built real-time video, conferencing, and clinical-workflow software since 2005, including telemedicine platforms where AI scribing, transcription, and triage sit inside a live consult. We usually wire AI in as a bought model behind a clean integration seam — fast to ship, compliant by construction, and swappable for a self-hosted model later if a client's volume or data-residency needs justify owning it — so the build-vs-buy line stays a decision the product can revisit, not a corner it is painted into.
What to read next
- The compliance and safety layer for clinical AI
- Where AI fits in a telemedicine product — the map
- Build vs buy vs hybrid for telemedicine
Call to action
- Talk to a telemedicine engineer: book a 30-minute scoping call to talk through your build-vs-buy plan for healthcare AI.
- See our case studies — 250+ shipped projects across video streaming, WebRTC, OTT, telemedicine, e-learning, surveillance, and AR/VR.
- Download the AI Build-vs-Buy Cost & Decision Worksheet: one page to price a telemedicine AI feature across all three cost layers (inference, integration, and validation-and-maintenance) and run it through the four decision questions and the lock-in check before you commit.
References
1. HIPAA Privacy Rule: Business Associates (45 CFR §160.103, §164.502(e)). U.S. Department of Health and Human Services. Tier 1. Any model provider or cloud host that receives PHI on your behalf is a business associate and needs a signed BAA before any data reaches it; the basis for the no-training-clause requirement for AI vendors.
2. Build vs. Buy AI: The Total Cost of Ownership Framework. Hyperion Consulting (2026). Tier 6. Upfront-cost comparisons miss roughly 60–80% of total cost of ownership; maintenance dominates lifetime cost; fully loaded AI/ML engineer cost commonly $200,000–$300,000/year. Illustrative industry figures; confirm against your own rates.
3. AWS HealthScribe Pricing. Amazon Web Services, list prices current mid-2026. Tier 4 (vendor). Managed clinical transcription and summarization billed by audio length, ≈ $0.10 per minute of audio; HIPAA-eligible under the AWS BAA. Time-sensitive; re-verify the rate at publication.
4. Self-Hosted LLM Costs 2026. SitePoint (2026). Tier 6. Specialist-cloud GPU rental ≈ $1.50–$3.50/GPU-hour mid-2026; self-hosted dual-GPU hardware ≈ $900–$1,200/month amortized before power and hosting; on-prem wins only at high sustained utilization. Illustrative; re-verify current GPU pricing.
5. Clinical Decision Support Software: Guidance for Industry and FDA Staff. U.S. Food and Drug Administration, docket FDA-2017-D-6569; FD&C Act §520(o)(1)(E). Tier 1. The four criteria a software function must meet to stay a non-device CDS tool, including that it advises a health professional who can independently review the basis. Time-sensitive; confirm current guidance version at publication.
6. 45 CFR §92.210: Nondiscrimination in the use of patient care decision support tools (Section 1557 Final Rule). HHS (eCFR), source 89 FR 37692, May 6, 2024. Tier 1. The deployer's duty to make reasonable efforts to identify and mitigate discrimination from patient-care decision-support tools that use protected characteristics or their proxies.
7. Predetermined Change Control Plans for Machine Learning-Enabled Medical Devices. U.S. Food and Drug Administration. Tier 1. The framework for governing planned updates to a cleared machine-learning model within a pre-agreed envelope; the lifecycle question a builder owns and a buyer must ask the vendor.
8. Guidance Regarding Methods for De-identification of PHI (45 CFR §164.514(b)). U.S. Department of Health and Human Services. Tier 1. The Safe Harbor and Expert Determination methods for de-identifying PHI before it leaves the compliance boundary for analytics; the same obligation whether the model is built or bought.


