This is engineering guidance, not legal advice. Confirm specifics with qualified counsel.
Why this matters
The video layer is the single most expensive and least reversible technical decision in a telemedicine build. It sets your monthly infrastructure bill, your compliance posture, your time to first patient call, and the size of the engineering team you need to keep the lights on. Pick a managed API and you can demo a compliant consult in days, but you inherit that vendor's pricing, its roadmap, and, as Twilio's customers learned in 2024, its power to announce the product's end of life. Self-host an open-source server and you control everything, but you have just hired yourself a real-time-infrastructure team and pulled the patient's media inside your own walls. This article exists so a founder, product manager, or hospital IT lead can make that call deliberately in week one, with the compliance requirement, the cost math, and the vendor risk all on the table at once, instead of discovering the consequences on the first cloud invoice or the first audit.
The decision, stated cleanly
Almost no team writes a video codec or an SFU from scratch. As the topology article explained, every multi-party clinical call runs through a Selective Forwarding Unit (SFU) — a server that receives each participant's audio and video once and forwards a copy to everyone else. The real build-vs-buy question is narrower and sharper: do you run that SFU yourself on open-source software, or do you rent it from a managed vendor who hides it behind an API?
Fix two terms before going further. A managed video API — often sold as CPaaS, short for Communications Platform as a Service — is a paid service such as Twilio, Vonage, Agora, or Daily that operates the media servers, global relays, and client SDKs for you; you call their API and never touch a server. An open-source media server — mediasoup, Janus, or LiveKit — is software you download for free and run on cloud machines you rent and manage yourself. "Buy" means the CPaaS route. "Build" means self-hosting the open-source server. Everything below compares those two paths.
Figure 1. The build-vs-buy decision as a tree. Low volume and a small team point to buying a managed API; high sustained volume, strict data-custody needs, and in-house real-time expertise point to self-hosting; most platforms end on a hybrid.
The compliance filter comes first
In every other industry, a build-vs-buy comparison opens with features and price. In healthcare it opens with a contract, because the wrong first move is a compliance violation no feature can undo.
Here is the rule that governs the whole decision. Under the U.S. health-privacy law HIPAA — the Health Insurance Portability and Accountability Act — anyone who handles Protected Health Information (PHI) on a healthcare provider's behalf is a business associate, and a business associate must sign a Business Associate Agreement (BAA): a contract in which the vendor promises to protect that data and accepts legal liability for it (45 CFR §164.502(e), §164.504(e)) [5]. PHI is any health information tied to an identifiable person — and a live video consultation is PHI in motion. The patient's face, voice, and everything said in the visit are PHI the moment they leave the device.
Now the part teams get wrong. A video server "only forwards encrypted packets", the argument goes, so surely it is a neutral pipe — a "conduit" — and needs no BAA. That argument fails. The Department of Health and Human Services (HHS), which enforces HIPAA, addressed this directly in its official Guidance on HIPAA & Cloud Computing: a cloud provider that creates, receives, or maintains electronic PHI is a business associate "even if the [provider] cannot view the ePHI because it is encrypted and the [provider] does not have the decryption key" [1]. The narrow "conduit exception" HHS recognizes is reserved for pure transmission with only transient, incidental storage — like the postal service or an internet backbone — not for a media server that ingests, routes, and often records a clinical session. A WebRTC SFU sits squarely on the business-associate side of that line.
Two consequences fall straight out of this, one for each path:
- If you buy, the managed video API is your business associate. It must sign a BAA with you before a single real patient joins a call. An encrypted API without a signed BAA is still a HIPAA violation — encryption is necessary, not sufficient. The "BAA available?" column is therefore the first filter on any vendor shortlist, ahead of price and features.
- If you build, there is no third-party BAA for the media server because there is no third party — but the SFU now lives inside your compliance boundary. You own every technical safeguard around it: encryption in transit and at rest, access controls, and audit logging, per the HIPAA Security Rule's technical-safeguards standard (45 CFR §164.312) [6]. You still need BAAs with the cloud host underneath (AWS, GCP, Azure all sign one) and with any other vendor in the path.
Figure 2. Where the BAA sits. Buying puts the media server inside the vendor's responsibility under a signed BAA; building puts it inside your own covered environment, with a BAA only to the cloud host beneath it. Either way, the server is inside the HIPAA boundary — encryption does not move it out.
This single rule reframes the entire comparison. The deeper compliance pattern — how the whole video stack is wrapped in a HIPAA boundary regardless of which path you take — is the subject of the compliance architecture pattern, and the BAA itself is dissected in Business Associate Agreements.
Buy: the managed video APIs
Buying means a vendor runs the SFU, the global TURN relays, the recording infrastructure, and the client SDKs, and you reach all of it through an API. You ship a compliant consult in days, not months, and you never page an engineer at 2 a.m. because a media server fell over. The trade is recurring per-minute cost, dependence on the vendor's roadmap, and the fact that the patient's media flows through infrastructure you do not control.
The healthcare-relevant question for each vendor is binary and goes first: will it sign a BAA, on what plan, and at what price? The table below is a 2026 snapshot. Treat the "BAA available?" column as a hard gate — a "no" eliminates the vendor for clinical use no matter how good the SDK is.
| Video API / CPaaS | BAA available? | Healthcare terms & cost signal (2026) | Watch-outs |
|---|---|---|---|
| Twilio Video | Yes (addendum) | HIPAA-eligible since 2020; BAA via a Business Associate Addendum on eligible products; group rooms ~$0.004/participant-min [2][9] | Announced 2024 end-of-life, then reversed it — a live reminder of roadmap risk [8] |
| Vonage Video API | Yes (Enterprise) | One BAA covering Video, Voice, and SMS; ongoing third-party HIPAA audits; healthcare pricing via sales [3] | BAA only on Enterprise; contact sales for terms |
| Daily | Yes | HIPAA + BAA via a healthcare add-on at ~$500/mo; built by WebRTC spec authors; ~$0.004/participant-min after free tier [4] | Add-on is a fixed monthly floor on top of usage |
| Agora | Yes (Enterprise) | HIPAA-capable configuration on enterprise; HD video ~$3.99 per 1,000 min (~$0.004/min) [10] | Confirm BAA scope and recording path with sales |
| Whereby Embedded | Yes | BAA free on Enterprise; HIPAA add-on ~$16.99/mo on the Build plan; recordings to your own S3 bucket [7] | Embedded/prebuilt UI; less low-level control |
| Zoom Video SDK | Yes (qualifying plans) | BAA on qualifying paid plans only; some AI features unavailable under a BAA [7] | Consumer Zoom ≠ the SDK; configuration is involved |
| Pexip | Yes (Enterprise) | On-premise / private-cloud hosting; composited output for hospital room hardware [7] | Enterprise pricing; heavier to adopt |
| LiveKit Cloud | Yes (Scale/Enterprise) | Managed tier of the open-source server; BAA on Scale and Enterprise; recordings written to your own bucket [11] | The same code is self-hostable — a natural build/buy bridge |
Table 1. BAA availability across managed video APIs and CPaaS, 2026. The "BAA available?" column is the first filter for any healthcare build — an encrypted API without a signed BAA is still a HIPAA violation. Per-minute prices are list signals only; model your own volume. Verify every row against the vendor's current HIPAA page before committing — terms and prices change.
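The hard-gate rule can be sketched as a shortlist filter that checks the BAA before it looks at price. This is a minimal Python sketch, not a procurement tool: the `Vendor` type, the `shortlist` function, and the "ExampleRTC" row are hypothetical, and the flags and prices for the real vendors only restate the 2026 snapshot in Table 1, which should be re-verified before use.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Vendor:
    name: str
    baa_available: bool                  # hard gate for clinical use
    list_price_per_min: Optional[float]  # None means "pricing via sales"


def shortlist(vendors: List[Vendor]) -> List[Vendor]:
    """Apply the BAA gate first, then order the survivors by list price."""
    eligible = [v for v in vendors if v.baa_available]
    # Vendors with sales-only pricing (None) sort after priced ones.
    return sorted(
        eligible,
        key=lambda v: (v.list_price_per_min is None, v.list_price_per_min or 0.0),
    )


candidates = [
    Vendor("Twilio Video", True, 0.004),
    Vendor("Vonage Video API", True, None),
    Vendor("Daily", True, 0.004),
    Vendor("ExampleRTC", False, 0.001),  # hypothetical vendor with no BAA
]

shortlisted_names = [v.name for v in shortlist(candidates)]
```

The cheapest row in the input is the one that disappears, which is the point of running the gate before any price comparison.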
Two rows deserve a closer look because they teach the two failure modes of buying.
Twilio Video is the vendor-risk lesson. In March 2024 Twilio announced it would discontinue Twilio Video on December 5, 2026. Teams that had wired their entire clinical product into Twilio's SDK faced a forced migration. In October 2024 Twilio reversed the decision and committed to keeping the product [8]. The reversal is good news, but the episode is the permanent lesson of buying: the vendor controls the roadmap, and a deprecation notice can land on a product you cannot quickly replace. Architect an abstraction layer so the video vendor is swappable, and you blunt this risk.
Zoom is the "encrypted ≠ compliant" lesson. The consumer Zoom app is not the same thing as the Zoom Video SDK, and a BAA is available only on qualifying paid plans — with some AI features switched off under that BAA [7]. "We use Zoom, so we are HIPAA-compliant" is exactly the kind of loose claim that fails an audit. Compliance is a property of the configured, BAA-covered product, not the brand name.
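The abstraction layer recommended in the Twilio lesson can be as small as one interface that application code depends on, with one adapter per video backend. The sketch below is illustrative only: `VideoProvider`, both adapter classes, and `start_consult` are hypothetical names, and the adapters return placeholder strings where a real implementation would call the vendor SDK or your own SFU's server API.

```python
from typing import Dict, Protocol, Sequence


class VideoProvider(Protocol):
    """Hypothetical interface: the only video surface application code may touch."""

    def create_room(self, consult_id: str) -> str: ...

    def issue_join_token(self, room_id: str, participant_id: str) -> str: ...


class ManagedApiAdapter:
    """Placeholder for a CPaaS adapter; a real one would wrap that vendor's SDK."""

    def create_room(self, consult_id: str) -> str:
        return f"managed/{consult_id}"

    def issue_join_token(self, room_id: str, participant_id: str) -> str:
        return f"managed-token:{room_id}:{participant_id}"


class SelfHostedAdapter:
    """Placeholder for a self-hosted SFU adapter with the same two methods."""

    def create_room(self, consult_id: str) -> str:
        return f"selfhosted/{consult_id}"

    def issue_join_token(self, room_id: str, participant_id: str) -> str:
        return f"selfhosted-token:{room_id}:{participant_id}"


def start_consult(
    provider: VideoProvider, consult_id: str, participant_ids: Sequence[str]
) -> Dict[str, str]:
    """Application code: knows the interface, never the vendor."""
    room_id = provider.create_room(consult_id)
    return {pid: provider.issue_join_token(room_id, pid) for pid in participant_ids}


# Swapping the backend is a one-line change at the call site.
tokens_before = start_consult(ManagedApiAdapter(), "consult-42", ["patient", "clinician"])
tokens_after = start_consult(SelfHostedAdapter(), "consult-42", ["patient", "clinician"])
```

Keeping the interface this narrow is a design choice: the fewer vendor-specific concepts leak into clinical code, the cheaper a forced migration becomes.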
Build: the open-source media servers
Building means you download a free, open-source media server and run it on cloud machines you rent and operate. You gain full control of features, data custody, and per-minute economics, and you remove a third party from the patient's media path. You take on operating real-time infrastructure — autoscaling SFUs, global TURN relays, monitoring, on-call — and you sign your own servers into your HIPAA boundary, where every safeguard is now your job.
Three open-source servers dominate clinical builds, and their differences start with a detail most comparison articles skip: the software license, which is a legal constraint, not a technical one.
| Open-source SFU | License | What that means for a closed-source healthcare product | Best fit |
|---|---|---|---|
| mediasoup | ISC (permissive) [12] | Use freely in a closed-source commercial product; no source-disclosure obligation | Node.js / Rust teams wanting a low-level, high-performance SFU library |
| Janus | GPLv3 (copyleft) [13] | GPLv3 can require you to release source of derivative works; a commercial license is offered for closed products | C-based, plugin-rich gateway; teams comfortable with GPL or buying the commercial license |
| LiveKit | Apache 2.0 (permissive) [11] | Use freely in a closed-source product; same code runs as LiveKit Cloud | Teams wanting the most out-of-the-box features and an optional managed escape hatch |
Table 2. The three dominant open-source SFUs and their licenses. The license is a compliance-adjacent legal fact: GPLv3 (Janus) can force source disclosure for a derivative work unless you buy the commercial license, while ISC (mediasoup) and Apache 2.0 (LiveKit) place no such obligation on a closed-source product. Confirm the current license and your specific use with counsel.
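The license check in Table 2 reduces to a small lookup. The following is a rough sketch under stated assumptions: the license labels are copied from the table, the `COPYLEFT` set and the `needs_license_review` function are hypothetical, and the result is a prompt to involve counsel, not a legal conclusion.

```python
# License labels as given in Table 2; confirm the current license before relying on them.
SFU_LICENSES = {
    "mediasoup": "ISC",
    "Janus": "GPLv3",
    "LiveKit": "Apache 2.0",
}

# ASSUMPTION: only GPLv3 is treated as copyleft in this sketch.
COPYLEFT = {"GPLv3"}


def needs_license_review(
    sfu: str, closed_source: bool, has_commercial_license: bool = False
) -> bool:
    """True when a closed-source product sits on a copyleft SFU without a commercial license."""
    license_name = SFU_LICENSES[sfu]
    return closed_source and license_name in COPYLEFT and not has_commercial_license


flagged = [name for name in SFU_LICENSES if needs_license_review(name, closed_source=True)]
```

With a closed-source product and no commercial license, only Janus is flagged; passing `has_commercial_license=True` clears it.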
A word on each, kept to the clinical decision rather than the protocol internals — for the engineering-grade SFU comparison, the Video Streaming section has the full mediasoup/Janus/LiveKit/Jitsi/Pion deep-dive.
mediasoup is an ISC-licensed SFU that behaves like a low-level building block rather than a finished server. You get fine-grained control and strong performance, and the permissive ISC license [12] places no obligation on a closed-source product — but you assemble the signaling, scaling, and recording yourself. It suits a team with real-time expertise that wants to own the architecture.
Janus is a mature, C-based, plugin-oriented gateway from Meetecho. Its catch is the license: Janus is GPLv3 [13], a copyleft license that can require you to publish the source of derivative works. For a proprietary telemedicine product that is a real risk, which is why Meetecho also sells a commercial license for teams that cannot accept GPL terms [13]. Engineering-capable, but read the license before you commit.
LiveKit is the youngest of the three and ships the most out of the box — SDKs, scaling, and recording included — under the permissive Apache 2.0 license [11]. It is the gentlest on-ramp for a team without deep WebRTC experience. It also blurs the build-vs-buy line usefully: the same Apache-2.0 code runs as LiveKit Cloud, a managed tier that signs a BAA on Scale and Enterprise plans and writes recordings straight to your own storage bucket [11]. You can prototype on the managed tier and self-host later, or vice versa, without changing your application code.
Whichever you choose, the compliance posture is the same: the SFU is inside your covered environment, encrypted in transit with the WebRTC standard's DTLS-SRTP [11], encrypted at rest with strong keys you control, and wrapped in the access controls and audit logging the Security Rule requires (45 CFR §164.312) [6]. The encryption details and the difference between transit, at-rest, and true end-to-end encryption are covered in encryption for telemedicine.
The cost crossover: per-minute versus fixed
The most consequential number in this decision is the point where buying stops being cheaper than building. The two paths have fundamentally different cost shapes, and seeing the crossover is the whole game.
A managed API charges roughly per participant-minute. Twilio group rooms, Daily, and Agora HD video all land near $0.004 per participant-minute in 2026 list pricing [2][4][10]. That cost is zero when no one is on a call and rises linearly with usage — you pay for exactly what you use, with no idle servers.
Self-hosting has almost the opposite shape: a fixed monthly cost that scales with peak concurrency, not total minutes, plus a large up-front and ongoing engineering cost. Servers sized for a given peak cost roughly the same whether they carry one call or a hundred, and on top of them you pay the salaries of the people who build and operate them.
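To make the difference in cost shape concrete, the sketch below compares two hypothetical workloads with identical monthly minutes but different peaks. Every number except the $0.004 rate is a placeholder assumption chosen for illustration (`PARTICIPANTS_PER_NODE`, `USD_PER_NODE_MONTH`, and both peak figures); measure your own SFU capacity before relying on anything like it.

```python
import math

CPAAS_RATE = 0.004           # USD per participant-minute (list-price signal from the text)
PARTICIPANTS_PER_NODE = 500  # ASSUMPTION: placeholder capacity of one SFU node
USD_PER_NODE_MONTH = 300     # ASSUMPTION: placeholder monthly price of one node


def cpaas_cost(monthly_participant_minutes: int) -> float:
    """Managed-API bill: depends only on total minutes used."""
    return monthly_participant_minutes * CPAAS_RATE


def self_host_server_cost(peak_concurrent_participants: int) -> int:
    """Server bill only (no salaries): depends only on the peak you provision for."""
    nodes = max(1, math.ceil(peak_concurrent_participants / PARTICIPANTS_PER_NODE))
    return nodes * USD_PER_NODE_MONTH


monthly_minutes = 2_000_000
steady_peak = 400    # ASSUMPTION: load spread evenly across the day
bursty_peak = 1_600  # ASSUMPTION: the same minutes squeezed into clinic hours

same_cpaas_bill = cpaas_cost(monthly_minutes)         # identical for both workloads
steady_servers = self_host_server_cost(steady_peak)   # 1 node
bursty_servers = self_host_server_cost(bursty_peak)   # 4 nodes
```

The managed bill is the same for both workloads because the minutes are the same, while the self-hosted server bill differs by a factor of four because the peaks differ.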
Walk the buy-side arithmetic out loud for a mid-size platform. Take 50,000 consults a month, 20 minutes each, two participants per call (patient and clinician):
participant-minutes = consults × minutes × participants
= 50,000 × 20 × 2
= 2,000,000 participant-minutes / month
CPaaS cost = participant-minutes × $0.004
= 2,000,000 × $0.004
= $8,000 / month
At this volume buying is a bargain: $8,000 a month with zero infrastructure team. Now scale to a busy national platform at 500,000 consults a month — ten times the volume:
participant-minutes = 500,000 × 20 × 2 = 20,000,000 / month
CPaaS cost = 20,000,000 × $0.004 = $80,000 / month ≈ $960,000 / year
Nearly a million dollars a year, growing linearly with every new patient. A self-hosted SFU at that concurrency runs on the order of a few thousand dollars a month in cloud compute plus a small real-time-infrastructure team: call it $30,000–60,000 a month all-in including salaries. At $0.004 per participant-minute, a flat cost in that range equals the CPaaS bill at roughly 7.5 to 15 million participant-minutes per month. Below that band, buying is cheaper because you avoid the engineering cost; above it, building is cheaper because your cost stops tracking usage. The exact crossover depends on your salaries and concurrency, but the shape is universal.
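The arithmetic above can be reproduced, and the break-even volume derived, in a few lines. This is a minimal sketch, assuming the flat $0.004 list rate and treating the $30,000–60,000 all-in self-hosting estimate as a constant monthly cost (in practice it steps up with concurrency). The function names are illustrative.

```python
CPAAS_RATE = 0.004  # USD per participant-minute, the list-price signal used in the text


def participant_minutes(consults: int, minutes_per_consult: int = 20, participants: int = 2) -> int:
    """Monthly participant-minutes = consults x minutes x participants."""
    return consults * minutes_per_consult * participants


def cpaas_monthly_cost(p_minutes: float, rate: float = CPAAS_RATE) -> float:
    """Managed-API bill, linear in usage."""
    return p_minutes * rate


def break_even_minutes(self_host_monthly_usd: float, rate: float = CPAAS_RATE) -> float:
    """Volume at which the CPaaS bill equals a flat self-hosting cost."""
    return self_host_monthly_usd / rate


mid_size_bill = cpaas_monthly_cost(participant_minutes(50_000))   # the $8,000/month case
national_bill = cpaas_monthly_cost(participant_minutes(500_000))  # the $80,000/month case

low_break_even = break_even_minutes(30_000)   # about 7.5 million participant-minutes
high_break_even = break_even_minutes(60_000)  # about 15 million participant-minutes
```

Under those assumptions the break-even falls between about 7.5 and 15 million participant-minutes per month, which is roughly 190,000 to 375,000 twenty-minute, two-party consults. Swap in your own rate and team cost to move the band.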
Figure 3. The cost crossover. CPaaS cost (orange) starts near zero and rises with every minute used; self-hosting (green) starts higher because of fixed engineering and server cost but stays nearly flat. Below the crossover, buy; above it, build.
The trap is to compare the two on per-minute price alone. Self-hosting's "$0 per minute" ignores the salaries; buying's "$0.004" ignores how large the bill becomes as volume grows. Model your own projected volume against both cost shapes before you choose; a back-of-envelope version of this is exactly what the downloadable comparison sheet does.
The four axes beyond cost
Cost and compliance are the two that decide most cases, but four more axes separate the paths when the decision is close.
Time-to-market. Buying wins outright. A managed API gives you a working, BAA-covered consult in days; self-hosting a production-grade, autoscaling, multi-region SFU is a multi-month engineering project. For an MVP racing to a pilot, this axis alone often settles it.
Control and features. Building wins. You own the codec choices, the recording pipeline, the exact data-residency map, and any custom clinical feature — annotation, vitals overlays, specialty layouts. A managed API gives you what its roadmap gives you, when it gives it.
Operational burden. Buying wins. The vendor carries the pager for media-server outages, global relay capacity, and browser-compatibility churn. Self-hosting means you run a real-time-infrastructure on-call rotation forever.
Vendor and roadmap risk. Building wins. Open-source code cannot be discontinued out from under you; the Twilio 2024 end-of-life scare [8] is the canonical example of the risk you accept when you buy. LiveKit's self-hostable-plus-Cloud model is a deliberate hedge against exactly this.
Figure 4. The two paths on the axes that matter. Managed APIs maximize speed and minimize operational burden; open-source SFUs maximize control and cost-at-scale; LiveKit's self-host-or-Cloud model bridges the two.
A worked decision: two real telemedicine teams
Put the framework to work on two teams a telemedicine practice actually looks like.
An early-stage startup building an MVP has one patient and one clinician per call, a few thousand consults a month, a four-person engineering team, and a pilot to hit in ninety days. The verdict is buy, decisively. At a few thousand consults the per-minute bill is a few hundred dollars a month, time-to-market is the binding constraint, and the team has no spare capacity to operate a media server. Pick a vendor that signs a BAA on a plan you can afford — Daily's flat healthcare add-on or Whereby's low-cost HIPAA tier fit an MVP well — and put a thin abstraction layer around the SDK so you can switch later. Spending three months self-hosting an SFU here would be a textbook premature optimization.
A scaled national platform runs group behavioral-health sessions and high-volume primary care, 500,000-plus consults a month, with a strict requirement that patient media never leave infrastructure it controls and a real-time-engineering team already on staff. The verdict tilts to build, or to a hybrid. At this volume the CPaaS bill approaches a million dollars a year and grows with every patient, the data-custody requirement argues for keeping media in-house, and the team can operate the servers. Self-hosting LiveKit or mediasoup caps the cost and satisfies custody — with the SFU inside the platform's own HIPAA boundary, covered by the cloud host's BAA underneath.
The honest answer for most platforms that grow is a hybrid that changes over time: buy to launch, then migrate the high-volume paths to self-hosted infrastructure once the cost crossover is in sight and the team exists to run it. LiveKit's identical-code Cloud-and-self-host model makes that migration unusually smooth, which is why it shows up so often as the bridge choice. The build-vs-buy framing for the whole platform, not just the video layer, is the subject of build vs buy vs hybrid for telemedicine.
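The two worked cases and the hybrid default can be condensed into a rough decision rule. The sketch below is one possible encoding, not the article's official tree: the `Team` fields, the `recommend` function, and especially the `crossover_minutes` default are assumptions to replace with your own cost model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Team:
    monthly_participant_minutes: int
    has_realtime_engineers: bool
    media_must_stay_in_house: bool


def recommend(team: Team, crossover_minutes: int = 10_000_000) -> str:
    """Return 'buy', 'build', or 'hybrid'.

    crossover_minutes is a placeholder ASSUMPTION; derive yours from your own costs.
    """
    if not team.has_realtime_engineers:
        # Nobody to carry the pager: self-hosting is not yet an option.
        return "buy"
    high_volume = team.monthly_participant_minutes >= crossover_minutes
    if high_volume and team.media_must_stay_in_house:
        return "build"
    if high_volume or team.media_must_stay_in_house:
        return "hybrid"
    return "buy"


# 3,000 consults x 20 minutes x 2 participants, small team, no custody mandate
startup = Team(120_000, has_realtime_engineers=False, media_must_stay_in_house=False)
# 500,000 consults x 20 minutes x 2 participants, real-time team, strict custody
national = Team(20_000_000, has_realtime_engineers=True, media_must_stay_in_house=True)

startup_verdict = recommend(startup)    # "buy"
national_verdict = recommend(national)  # "build"
```

A team that meets only one of the two build conditions lands on "hybrid", which matches the buy-to-launch, migrate-later path described above.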
Common mistake: choosing the video layer on per-minute price or a feature checklist alone and discovering the BAA gap later. Three traps recur. First, signing up for an encrypted API and assuming "encrypted" means "compliant"; without a BAA it does not. Second, building on GPLv3-licensed Janus inside a closed-source product without reading the license or buying the commercial one. Third, picking a CPaaS on its $0.004 per-minute rate without modeling monthly volume, then watching the bill approach a million dollars a year at scale. Run the compliance filter, the license check, and the cost model before you fall in love with an SDK.
Where Fora Soft fits in
Fora Soft has built real-time video since 2005 across conferencing, streaming, surveillance, e-learning, and telemedicine, and the video-layer choice is one we make on every clinical project — compliance first. We start by drawing the HIPAA boundary, then decide build or buy against it: which managed APIs will sign a BAA on a plan that fits, or whether a self-hosted mediasoup or LiveKit deployment inside the client's own covered environment is the better long-run answer on cost and data custody. We have shipped on managed CPaaS and on self-hosted open-source SFUs, and we routinely build the abstraction layer that keeps the vendor swappable so a deprecation notice is never an emergency. If you want your projected call volume mapped to a build-vs-buy recommendation with the BAA path and the cost crossover spelled out, talk to our telemedicine team.
To make the decision repeatable, we condensed it into a one-page comparison sheet: the BAA gate, the open-source license check, the per-minute-versus-fixed cost model, and the four axes, all in one place. Download the video-layer build-vs-buy comparison sheet and run it against your roadmap before you commit.
What to read next
- P2P, SFU, MCU: the right topology for a consult, a group session, and a webinar
- Business Associate Agreements (BAA): the contract that makes or breaks your stack
- The compliance architecture pattern: how to wrap a video stack in HIPAA
Talk to our telemedicine team — get your call volume mapped to a build-vs-buy recommendation, with the BAA path and cost crossover, by engineers who have shipped compliant clinical video on both: telemedicine architecture review.
See our case studies — telemedicine and real-time video products we have built on CPaaS and on self-hosted SFUs: our work in telemedicine.
Download the video-layer build-vs-buy comparison sheet — the BAA gate, license check, cost model, and four axes on one page: get the PDF.
References
- [1] HHS: Guidance on HIPAA & Cloud Computing, U.S. Department of Health and Human Services, Office for Civil Rights, https://www.hhs.gov/hipaa/for-professionals/special-topics/health-information-technology/cloud-computing/index.html. A cloud/service provider that creates, receives, or maintains ePHI is a business associate even if it cannot view the ePHI because it is encrypted and lacks the decryption key; the conduit exception is limited to pure transmission with transient storage. Checked 2026-06-13. Tier 1 (agency guidance).
- [2] Twilio: Video Pricing (Programmable Video, Group Rooms), https://www.twilio.com/en-us/video/pricing. Group rooms billed per participant-minute (~$0.004); track recording and composition rates. Vendor source; checked 2026-06-13. Tier 4.
- [3] Vonage: HIPAA Compliance and BAA (Vonage Video API), https://api.support.vonage.com/hc/en-us/sections/6646174896284-HIPAA-Compliance-and-BAA. BAA available on Enterprise; single BAA across Video, Voice, SMS; ongoing third-party HIPAA audits. Vendor source; checked 2026-06-13. Tier 4.
- [4] Daily: HIPAA-compliant video chat, https://docs.daily.co/guides/privacy-and-security/hipaa. HIPAA + BAA via a healthcare add-on (~$500/mo); built by WebRTC spec authors; recordings to your own storage. Vendor source; checked 2026-06-13. Tier 4.
- [5] 45 CFR §164.502(e), §164.504(e): HIPAA Privacy Rule, business-associate contract standard and required clauses; with 45 CFR §164.308(b) (Security Rule business-associate contracts), https://www.ecfr.gov/current/title-45/subtitle-A/subchapter-C/part-164. The requirement that any business associate handling PHI sign a BAA. Current as of 2026-06-13. Tier 1.
- [6] 45 CFR §164.312: HIPAA Security Rule, technical safeguards (access control, audit controls, integrity, transmission security), https://www.ecfr.gov/current/title-45/subtitle-A/subchapter-C/part-164/subpart-C/section-164.312. The safeguards a self-hosted media server must implement inside the covered environment. Current as of 2026-06-13. Tier 1.
- [7] Whereby: The Best HIPAA-Compliant Video Call APIs for Telehealth Platforms (2026-05-12), https://whereby.com/blog/best-hipaa-compliant-video-call-apis/. BAA availability and pricing snapshot across Whereby, Daily, Vonage, Twilio, Zoom SDK, Pexip; verify terms per vendor. Vendor source. Tier 4.
- [8] Twilio: Twilio Video Will Remain a Standalone Product, Changelog 2024-10-21, https://www.twilio.com/en-us/changelog/-twilio-video-will-remain-a-standalone-product. March 2024 EOL (Dec 5, 2026) announcement reversed; product retained. Vendor source documenting roadmap/vendor risk. Tier 4.
- [9] Twilio: Twilio and HIPAA, https://www.twilio.com/en-us/hipaa. Programmable Video HIPAA-eligible (since 2020); BAA via Business Associate Addendum on HIPAA-eligible products; shared-responsibility model. Vendor source; checked 2026-06-13. Tier 4.
- [10] Agora: Video Calling Pricing, https://www.agora.io/en/pricing/video-calling/. HD video ~$3.99 per 1,000 minutes (~$0.004/participant-minute); free monthly tier. Vendor source; checked 2026-06-13. Tier 4.
- [11] LiveKit: Security, https://livekit.com/security. LiveKit media server, SIP, Egress, Ingress, and SDKs are Apache 2.0; LiveKit Cloud BAAs available for Scale and Enterprise; SOC 2 Type II; DTLS-SRTP media encryption, AES-256 at rest; Egress writes recordings to your own bucket; optional E2EE. First-party source; checked 2026-06-13. Tier 3.
- [12] mediasoup: LICENSE (ISC), versatica/mediasoup, https://github.com/versatica/mediasoup/blob/v3/LICENSE. mediasoup is ISC-licensed (permissive; no source-disclosure obligation for closed-source use). First-party source; checked 2026-06-13. Tier 3.
- [13] Janus WebRTC Server: License (COPYING) and README, Meetecho, https://janus.conf.meetecho.com/docs/COPYING.html. Janus is GPLv3-licensed (copyleft); a commercial license is offered for teams that cannot accept GPLv3. First-party source; checked 2026-06-13. Tier 3.
Where lower-tier vendor sources (tiers 4–6) supplied pricing and BAA terms, those figures are presented as 2026 list-price signals and per-vendor snapshots, flagged for re-verification against each vendor's own HIPAA page before publication. The compliance stance — that a WebRTC media server is a business associate and needs a BAA (buy) or must sit inside the covered environment (build) — follows the HHS Cloud Computing guidance [1] and 45 CFR §164.502(e)/§164.504(e) [5] over the looser "the SFU only forwards encrypted packets, so it is a mere conduit" framing common in vendor material.


