You own the streaming stack — encoder, SFU, delivery, AI pipeline — from day one. Built on mediasoup 3.16, LiveKit, HLS / LL-HLS / CMAF, and MoQ Transport. Proven at Sprii (€365M+ in e-commerce live streaming), Nucleus (600M+ live minutes/month), and BrainCert (500M+ minutes/month, virtual classrooms). Per-minute economics that beat Daily, Twilio Live, Mux, and Agora at scale.
SaaS streaming platforms — Daily, Twilio Live, Mux, Agora, Vonage — ship in days on a per-minute pricing model. Custom development takes longer to start and pays back the moment minutes pile up, AI features get bolted on, or data residency starts mattering. Below: where they differ on what buyers actually care about, then the protocol-by-protocol matrix underneath for the engineers in the room.
Numbers reflect Fora Soft production deployments on Sprii (€365M+ revenue in live commerce), Nucleus (600M+ live min/mo), BrainCert (500M+ min/mo virtual classrooms), and Worldcast Live (10K+ concurrent viewers per event). Your numbers will move with your concurrency curve, geographic spread, and codec choices.
A streaming platform isn't one service — it's an inference and delivery graph with a latency budget at every hop. Miss the budget anywhere and you ship dropped frames, lagging chat, AI captions that arrive after the moment they describe, or a CDN bill that doesn't survive your next growth quarter.
Hosts, cameras, and contributors push in over RTMP, SRT, WHIP, or WebRTC — picked by network reliability and how much resilience the source needs. Ingest sits at the edge of your VPC so authentication, recording consent, and DRM keys never leave your perimeter.
mediasoup 3.16 or LiveKit 1.x handles SFU routing for sub-200ms interactive streams. Janus or a custom MCU handles transcoded composite layouts. Sharding by room ID + geography keeps any single SFU node under 1500 participants, the conservative end of the 1500–2000 per-shard ceiling Nucleus and BrainCert have benchmarked.
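The sharding rule above can be sketched in a few lines of Python. This is an illustrative sketch, not routing code from any deployment named here: `pick_sfu_node`, `has_capacity`, and the node names are hypothetical, and the only figure taken from the text is the 1500-participant ceiling.

```python
import hashlib

# Per-node ceiling quoted in the text; everything else here is an assumption.
MAX_PARTICIPANTS_PER_NODE = 1500


def pick_sfu_node(room_id: str, region: str, nodes_by_region: dict) -> str:
    """Map a room to one SFU node inside its region, deterministically."""
    nodes = nodes_by_region[region]
    # Hash region + room ID so the same room always lands on the same node.
    digest = hashlib.sha256(f"{region}:{room_id}".encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


def has_capacity(current_participants: int, joining: int = 1) -> bool:
    """True if the node stays under the per-node ceiling after the join."""
    return current_participants + joining < MAX_PARTICIPANTS_PER_NODE
```

Plain modulo hashing remaps many rooms whenever the node list changes. A consistent-hash ring would limit that churn, at the cost of more code than fits a sketch.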
Whisper Large-v3, Deepgram Nova-3, or NVIDIA Parakeet for ASR; SeamlessM4T or DeepL Voice for live translation; custom moderation models for content classification. AI runs as a parallel pipeline so the primary stream isn't blocked by inference cost.
Multi-bitrate transcoding (FFmpeg, AWS MediaLive, Wowza, or custom pipelines) produces adaptive renditions. CDN distribution (Cloudflare, Fastly, AWS CloudFront) fans out to thousands of viewers. WebRTC fanout for sub-200ms interactive viewers; LL-HLS for large-audience live; MoQ Transport for the 2026–27 hybrid path.
Recordings to S3-compatible object storage with HLS segment indexing for instant scrubbing. Event metadata to ClickHouse or PostgreSQL for forensic search and viewer analytics. Quality-of-experience telemetry (rebuffering, dropped frames, join time) feeds an SRE dashboard so regressions get caught in minutes, not days.
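As a rough illustration of the three QoE signals named above (rebuffering, dropped frames, join time), the sketch below rolls per-session stats into dashboard numbers. The `SessionStats` fields and the choice of a nearest-rank 95th percentile for join time are assumptions made for the example, not a description of the telemetry schema used in production.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionStats:
    session_ms: int       # total time in the session, stalls included
    rebuffer_ms: int      # time spent stalled
    frames_rendered: int
    frames_dropped: int
    join_ms: int          # click-to-first-frame time


def qoe_summary(sessions: list) -> dict:
    """Aggregate per-session stats into three headline QoE numbers."""
    if not sessions:
        raise ValueError("need at least one session")
    total_ms = sum(s.session_ms for s in sessions)
    total_frames = sum(s.frames_rendered + s.frames_dropped for s in sessions)
    rebuffer_ms = sum(s.rebuffer_ms for s in sessions)
    dropped = sum(s.frames_dropped for s in sessions)
    joins = sorted(s.join_ms for s in sessions)
    # Nearest-rank 95th percentile; integer arithmetic gives an exact ceiling.
    p95_index = (95 * len(joins) + 99) // 100 - 1
    return {
        "rebuffer_ratio": rebuffer_ms / total_ms if total_ms else 0.0,
        "dropped_frame_ratio": dropped / total_frames if total_frames else 0.0,
        "join_p95_ms": float(joins[p95_index]),
    }
```

Alert thresholds on these ratios depend on the product and are deliberately left out.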
End-to-end budget depends on use case: sub-200ms for interactive WebRTC, sub-3s for large-audience LL-HLS, sub-second for the AI output (captions, translation) on AI-augmented broadcast. We benchmark every build against your scene density, concurrency curve, and geographic spread before sign-off.
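One way to make "a latency budget at every hop" concrete is to give each hop its own allowance and flag both per-hop and end-to-end overruns. The sketch below assumes a hypothetical `Hop` type. Hop names and per-hop allowances are supplied by the caller, and only the end-to-end targets come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    name: str
    budget_ms: float  # allowance for this hop (caller-supplied assumption)


def check_budget(hops: list, measured_ms: dict, target_ms: float) -> list:
    """Return hops that exceed their allowance, plus 'end-to-end' if the sum does."""
    violations = [h.name for h in hops if measured_ms.get(h.name, 0.0) > h.budget_ms]
    total = sum(measured_ms.get(h.name, 0.0) for h in hops)
    if total > target_ms:
        violations.append("end-to-end")
    return violations
```

For example, a 200ms interactive target split across capture, SFU, network, and render hops would report `["sfu"]` if only the SFU hop ran over its allowance while the total still fit.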
Every layer is a deliberate choice for scale, not a default. The list below is what we deploy on real production streams — not a survey of options. When something here doesn't fit your environment (Vonage instead of LiveKit, GCP instead of AWS, a regulated codec list), we name the substitute in the architecture document, not the marketing page.
Compliance overlays — GDPR, CCPA, HIPAA, SOC 2, FERPA — are enforced inside each layer: encryption at rest and in transit, role-based access for recordings, audit logs on stream join / leave, data residency pinned per region, retention windows configurable per room class.
Streaming infrastructure isn't generic. A live commerce host stream is not the same product as a regulated telehealth call, even if both run on WebRTC. The taxonomy, the AI overlays, the recording policy, the QoE telemetry — those are where custom development earns its keep. Six shapes Fora Soft has shipped to production.
Sprii — the flagship deployment — has powered €365M+ in revenue through host-led live shopping streams. Sub-200ms host-to-viewer latency, in-stream cart + checkout, multi-host shows, replay-with-purchase. WebRTC for hosts, LL-HLS for the audience overflow.
BrainCert runs 500M+ live minutes/month for virtual classrooms with proctoring, breakout rooms, whiteboards, and SCORM 2004 / LTI 1.3 integrations. Multi-room scheduling, instructor-led recording controls, on-demand replays with chapter markers.
Worldcast Live handles 10K+ concurrent viewers per event with LL-HLS delivery and a CDN fanout designed for spike days. Operator dashboards for live moderation, multi-camera director switching, instant clip generation for social cutdown.
HIPAA-grade WebRTC sessions with role-based access, session recording with patient consent flows, integration with Epic / Cerner / MEDITECH where the consult belongs to a chart. Encrypted recording at rest. NHS UK — in production.
Translinguist — $4.2M ARR — runs simultaneous interpretation streams alongside the main session, with sub-second translation latency via SeamlessM4T or human interpreter mixing (KUDO, Interprefy patterns). Multi-language audio tracks delivered over WebRTC.
Most engagements start with a profile nobody else has built before. The work is mapping the concurrency curve, the geographic spread, the codec / device matrix, and the AI features the product depends on — then designing the stack to hit it. Discovery call is the first hour.
SaaS streaming platforms — Daily, Twilio Live, Mux, Agora, Vonage — are excellent up to a point. They ship in days, the SDKs are mature, and the SLAs are signed. The point where custom development pays back is specific: when minutes pile up, when AI features need direct stack access, or when data residency stops matching the vendor's cloud. The decision isn't "which is better" — it's "where does your three-year cost curve land."
Vendor-owned cloud, vendor-owned SDK, per-minute pricing that scales linearly with usage. Excellent for getting to production fast.
mediasoup / LiveKit / FFmpeg / your codec choices / your AI pipeline. Higher upfront effort; flat infra cost after launch.
Hybrid is a real option — keep an existing SaaS for low-volume use cases, layer custom WebRTC + AI for the high-volume / high-AI-feature flows. We architect that bridge in roughly 25% of engagements.
Engagement model is matched to where you are, not where we'd prefer you to be. The three shapes below cover roughly 90% of how Fora Soft enters a streaming project.
Discovery → architecture → MVP → production. We own the stack and ship in 10–16 weeks on a defined scope. Best fit when there's no existing system or when the SaaS economics are about to flip. Sprii and BrainCert were both built this way.
SaaS-to-custom migration on a flow you've outgrown, AI overlay added to a running stack (transcription, translation, moderation), LL-HLS delivery layer added to an interactive WebRTC base, CDN re-architecture for spike days. We integrate without ripping out what works.
Inherited a streaming stack nobody fully understands? A previous vendor walked away mid-build? Streams dropping under load with no clear root cause? We've done the takeover dance enough times to make it boring: audit, stabilize, document, ship the next version. NDA before access; honest verdict on what's salvageable.
The number you see is the bracket the build typically lands in. Final scope depends on concurrency target, geographic spread, AI features, codec / device matrix, and compliance overlays — we name the moving parts in the discovery call before you commit.
Add-ons priced separately: per-region infrastructure, custom AI model training cycles, third-party SDK licenses (Bitmovin, Wowza), regulatory certification audits, premium CDN contracts. We itemize before contract.
An independent assessment of your streaming build, written by engineers who would actually ship it. Pick the one that fits where you are now: planning the MVP, mid-build, or stabilizing what's already in production. NDA before any code, footage, or system access changes hands.
Competitor analysis, core feature definition, monetization modeling, and a full launch blueprint — delivered within a week. Written by engineers who'll build what they plan.
An independent review of your system's technology choices, structural components, and workload fit — with a plain verdict on what's working, what's a liability, and exactly what to change to reach your goal. Delivered within a week.
A full audit of your code with every issue documented, evidenced, and located — exact file, exact line. Plus a system architecture review and a prioritized fix roadmap. Not a consultant's opinion. A case file. Delivered within a week.
No commitment. NDA before any code, footage, or system access is shared.
Not a generalist studio with a streaming practice. Not a SaaS reseller in a custom-dev jacket. Fora Soft has been building real-time video and WebRTC infrastructure since 2005 — and the live commerce, virtual classroom, broadcast, telehealth, and interpretation work below is the same team, the same stack, the same engineering bar.
625+ products shipped. Streaming is what we built the company on — long before WebRTC was a standard or LL-HLS had a draft. We've watched the streaming stack transition from RTMP to HLS, from HLS to LL-HLS, from MCU to SFU, and now from HLS to MoQ Transport. Every generation has shipped through us.
Powered €365M+ in revenue through host-led live shopping streams. Sub-200ms WebRTC for hosts, LL-HLS for the audience overflow, in-stream cart + checkout, multi-host shows, replay-with-purchase. The streaming stack that proves AI live commerce works at scale.
Nucleus runs 600M+ live minutes/month on a pure WebRTC stack, our scale benchmark for SFU-driven applications. BrainCert runs another 500M+ for virtual classrooms with proctoring, breakouts, and SCORM / LTI integration. Both deployments run on architecture Fora Soft designed and shipped.
No outsourcing chain. The WebRTC engineer who tunes your SFU sits next to the iOS engineer who builds the operator app and the SRE who runs your CMAF packaging. 100% Upwork Top-Rated Plus, 100% job success on enterprise engagements. NDA before any code access; honest verdict before any contract.
Break-even sits around 1M monthly participant-minutes for most use cases. Below that, SaaS is faster to launch and the per-minute economics work. Above it, custom builds amortize sharply: Nucleus runs 600M+ min/mo at a small fraction of a cent per minute, where a SaaS bill at $0.005/min (half a cent) would be $3M/month. The decision frame is in the Build vs Buy section above.
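The arithmetic behind that comparison is simple enough to check directly. The sketch below assumes a two-part cost model (flat monthly infrastructure plus a marginal per-minute cost for the custom build), which is a simplification introduced here. The function names are hypothetical, and the fixed-cost and marginal-cost inputs in the usage note are placeholders, not Fora Soft pricing.

```python
from decimal import Decimal


def saas_monthly_cost(minutes: int, rate_per_min: Decimal) -> Decimal:
    """Linear per-minute SaaS bill."""
    return minutes * rate_per_min


def break_even_minutes(fixed_monthly: Decimal, saas_rate: Decimal,
                       custom_rate: Decimal) -> Decimal:
    """Monthly minutes at which flat infra + custom marginal cost equals the SaaS bill."""
    if custom_rate >= saas_rate:
        raise ValueError("custom marginal cost must be below the SaaS rate")
    return fixed_monthly / (saas_rate - custom_rate)
```

`saas_monthly_cost(600_000_000, Decimal("0.005"))` gives 3,000,000, the $3M/month figure above. With placeholder inputs of $4,000/month fixed and $0.001/min marginal, `break_even_minutes` lands at 1,000,000 minutes; your own inputs will move that point.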
WebRTC: sub-200ms when the SFU is regional to your viewers. LL-HLS / CMAF: 1–3 seconds depending on segment length and CDN configuration. Standard HLS / DASH: 6–10 seconds. We pick per use case — interactive flows on WebRTC, large-audience live on LL-HLS, mobile / regulated on standard HLS. MoQ Transport (IETF draft) targets sub-300ms with CDN-scale fanout, which is the 2026–27 candidate for the hybrid path.
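The per-use-case choice can be read as "take the most scalable protocol whose worst-case latency still fits the budget". A minimal sketch under that assumption follows. It uses the upper bounds quoted above as worst cases, leaves MoQ Transport out because it is still a draft, and `pick_protocol` is a hypothetical helper rather than part of any SDK.

```python
from typing import Optional

# (protocol, worst-case latency in ms), ordered from the largest-audience
# option to the most interactive one. Bounds are the figures in the text.
PROTOCOLS = [
    ("HLS/DASH", 10_000),
    ("LL-HLS", 3_000),
    ("WebRTC", 200),
]


def pick_protocol(latency_budget_ms: int) -> Optional[str]:
    """Return the most scalable protocol that fits the budget, or None."""
    for name, worst_case_ms in PROTOCOLS:
        if worst_case_ms <= latency_budget_ms:
            return name
    return None  # budget is tighter than any listed protocol can promise
```

Real selection also weighs device support and compliance constraints, which this sketch ignores.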
Yes — we run roughly 25% of engagements as parallel-build migrations. New mediasoup / LiveKit SFU comes up next to the existing SaaS, traffic is split by room class or feature flag, the SaaS stays as fallback while metrics are validated, then traffic shifts over once latency and QoE match or beat the baseline. Typical migration window is 6–10 weeks.
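Splitting traffic by room class or feature flag usually comes down to a deterministic bucket per room, so a room never flips between stacks mid-migration. The sketch below is one hedged way to do it. `routes_to_custom` and the `rollout_percent` mapping are hypothetical names, and a real rollout would read the percentages from a feature-flag service.

```python
import hashlib


def routes_to_custom(room_id: str, room_class: str, rollout_percent: dict) -> bool:
    """True if this room should use the new SFU instead of the existing SaaS."""
    percent = rollout_percent.get(room_class, 0)  # unknown classes stay on SaaS
    # Stable bucket in [0, 100) derived from the room ID.
    digest = hashlib.sha256(room_id.encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent
```

Because the bucket is fixed per room, raising a class from 10 to 50 percent only adds rooms to the new stack; no room that already moved is sent back.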
Yes. mediasoup SFU instances run in each major region (us-east, us-west, eu-west, ap-southeast, ap-east) with a routing layer that pairs participants by latency to the nearest healthy SFU. Cross-region SFU pipes handle multi-region rooms when needed. This is the architecture Nucleus runs to hit 600M+ minutes/month.
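The routing decision described above reduces to "lowest measured latency among healthy regions". A minimal sketch follows, assuming the client has already measured round-trip times per region and a health feed exists. `nearest_healthy_sfu` is a hypothetical helper, not a mediasoup API.

```python
from typing import Optional


def nearest_healthy_sfu(rtt_ms: dict, healthy: set) -> Optional[str]:
    """Pick the healthy region with the lowest measured round-trip time."""
    candidates = {region: rtt for region, rtt in rtt_ms.items() if region in healthy}
    if not candidates:
        return None  # caller decides whether to retry or fail the join
    return min(candidates, key=candidates.get)
```

With probes of 20ms to us-east, 90ms to eu-west, and 180ms to ap-east, a participant lands on us-east, or on eu-west if us-east is marked unhealthy.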
AI runs as a parallel inference pipeline so the primary stream isn't blocked by ASR latency. Whisper Large-v3, Deepgram Nova-3, or NVIDIA Parakeet for transcription; SeamlessM4T or DeepL Voice for translation; custom moderation classifiers for content policy. Captions / translations are delivered as a separate data track over WebRTC or as a synchronized side-channel for HLS — budget is under 800ms end-to-end for the AI output.
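The "parallel pipeline" idea can be sketched with `asyncio`: media is forwarded first, and each caption runs as its own task with a deadline, so a slow model call only costs that one caption. The names `run_stream` and `caption_side_channel` and the callback signatures are assumptions for illustration; the 800ms default is the budget quoted above.

```python
import asyncio

AI_BUDGET_S = 0.8  # end-to-end budget for AI output quoted in the text


async def caption_side_channel(chunk, transcribe, publish, budget_s):
    """Run ASR for one chunk; drop the caption if it misses the budget."""
    try:
        text = await asyncio.wait_for(transcribe(chunk), timeout=budget_s)
    except asyncio.TimeoutError:
        return  # a late caption is worse than no caption
    publish(text)


async def run_stream(chunks, forward, transcribe, publish, budget_s=AI_BUDGET_S):
    """Forward every chunk immediately; captions follow on separate tasks."""
    tasks = []
    for chunk in chunks:
        forward(chunk)  # primary media path never waits on inference
        tasks.append(asyncio.create_task(
            caption_side_channel(chunk, transcribe, publish, budget_s)))
    await asyncio.gather(*tasks)
```

Here `transcribe` stands in for whichever ASR client is used and must be an async callable. Dropping late captions rather than queueing them is a design choice made for this sketch.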
When architected correctly, yes. Compliance is enforced inside each layer: encryption at rest and in transit, role-based access for recordings, audit logs on every join / leave, data residency pinned to region (us-east only, eu-west only, etc.), retention windows configurable per room class, BAA signed for HIPAA flows, DPIA documentation for GDPR. We sign Data Processing Agreements before any engagement.
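Region pinning and per-room-class retention lend themselves to a small policy table that the storage layer consults before writing or keeping a recording. The sketch below is illustrative only: the room classes, regions, and retention periods in `POLICIES` are made-up placeholders, and `may_store` and `is_expired` are hypothetical helpers.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class RoomPolicy:
    allowed_regions: frozenset  # where recordings may be stored
    retention_days: int         # how long recordings are kept


# Placeholder values; real policies come from legal and compliance review.
POLICIES = {
    "telehealth": RoomPolicy(frozenset({"eu-west"}), retention_days=30),
    "webinar": RoomPolicy(frozenset({"us-east", "eu-west"}), retention_days=365),
}


def may_store(room_class: str, region: str) -> bool:
    """True if a recording for this room class may be written in this region."""
    return region in POLICIES[room_class].allowed_regions


def is_expired(room_class: str, recorded_at: datetime, now: datetime) -> bool:
    """True once a recording has outlived its retention window."""
    return now - recorded_at > timedelta(days=POLICIES[room_class].retention_days)
```

Checking the policy at write time, rather than cleaning up afterwards, keeps data from ever landing in a region it should not be in.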
Yes — Worldcast Live runs 10K+ concurrent per event today on LL-HLS over CDN; the architecture extends linearly via CDN origin scaling. For interactive WebRTC flows, the ceiling is per SFU shard (1500–2000 participants) with horizontal sharding for multi-room or large-audience hybrid configurations (host on WebRTC, audience on LL-HLS).
10–12 weeks for a Startup-tier scope (single room class, up to ~500 concurrent, single region). 14–18 weeks for Growth (multi-room, AI overlay, multi-region SFU, up to ~10K concurrent). 18–24+ weeks for Enterprise (multi-region clusters, full AI pipeline, full compliance overlay, custom moderation models). Discovery call to first running room is typically 3–4 weeks regardless of tier.
You do. Models, training data, infrastructure code, operator UI, and AI pipelines are all delivered to your repositories under your name. Fora Soft retains no claim on the IP. The benefit of custom development over SaaS is exactly this: the streams, the recordings, and the unit economics live on your balance sheet rather than the vendor's.
Three shapes: handover to your in-house team with runbooks and on-call training (most common at Enterprise tier); ongoing SRE / AI-tuning retainer (typical at Growth tier when in-house streaming expertise isn't on the roadmap yet); or fixed-scope quarterly improvement cycles (new room classes, new AI features, codec migrations). All three are scoped after the initial build, not bundled.
Within 48 hours you'll get a realistic estimate, a technical recommendation, and an outline of next steps. No obligation. NDA before any access to your code, recordings, or operator dashboards.