P2P · SFU · MCU · Hybrid
250+ projects since 2005
Custom WebRTC Architecture

Custom WebRTC architecture, designed to scale

We design the real-time architecture behind your product: topology (P2P, SFU, MCU, or hybrid), media-server selection, and the scaling plan that holds up under load. Architecture blueprint in 1 week, from $3K. Our designs target sub-300 ms group video, and one has held sub-second latency at a concurrency of 10,000.

10,000 concurrent at sub-second latency, in one architecture we designed
99.995% uptime across 10 datacenters
20+ years designing real-time systems (since 2005)
Sub-300 ms group-video latency target
Who this is for

Built for teams whose real-time architecture has to hold up

Whether you are designing a new real-time system, outgrowing peer-to-peer, choosing a media server, or fixing an architecture that buckles under load, we have made these calls before — at hundreds of millions of minutes a month.

Designing a new RT system · Outgrowing P2P · Choosing an SFU · Multi-region scaling · MCU-to-SFU migration · CPaaS-to-self-hosted · HA & failover design · Recording architecture · Architecture audit · Re-architecting a stalled build · Cost/scale modeling
Topology

P2P, SFU, MCU, or hybrid

The topology decision shapes cost, scale, and quality more than any other choice in a real-time build. There is no universal winner — there is the right one for your call size, features, and budget. Here is how we pick.

| | P2P | SFU | MCU | Hybrid (SFU + MCU) |
|---|---|---|---|---|
| Best for | 1:1 calls | Group calls (the workhorse) | Heavy compositing / recording | Large calls needing both |
| Media server | None | mediasoup, LiveKit, Janus, Pion | Kurento, custom | SFU + selective MCU |
| Scales to | ~2–4 peers | Thousands per region | Limited by mixing cost | Thousands, with mixed outputs |
| Server cost | Lowest | Moderate | Highest | Tuned per stream |
| Client bandwidth | High (each peer) | Low (one up, many down) | Lowest (one mixed) | Low |
| Recording | Awkward | Per-track egress | Native composite | Composite where needed |
| We pick it when | Truly 1:1, no server | Most group video | Server-side mixing required | Scale + composite output |

Most production systems we design are SFU, or a hybrid that adds selective MCU mixing only where it earns its cost. We pick on your numbers, not a default. New to the trade-offs? See WebRTC architecture for production systems.
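The client-bandwidth trade-off between topologies can be made concrete by counting the streams each client sends and receives. The sketch below is illustrative only: `client_streams` and `ClientStreams` are hypothetical names, and it assumes every participant publishes exactly one stream (no simulcast layers, no screen share).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientStreams:
    """Media streams a single client sends and receives in one call."""
    uplink: int
    downlink: int


def client_streams(topology: str, participants: int) -> ClientStreams:
    """Count per-client streams, assuming each participant publishes one stream."""
    if participants < 2:
        raise ValueError("a call needs at least two participants")
    others = participants - 1
    if topology == "p2p":
        # Full mesh: a separate stream to and from every other peer.
        return ClientStreams(uplink=others, downlink=others)
    if topology == "sfu":
        # One upload to the server, which forwards everyone else's streams down.
        return ClientStreams(uplink=1, downlink=others)
    if topology == "mcu":
        # One upload, one server-mixed composite coming back.
        return ClientStreams(uplink=1, downlink=1)
    raise ValueError(f"unknown topology: {topology!r}")


if __name__ == "__main__":
    for topology in ("p2p", "sfu", "mcu"):
        print(topology, client_streams(topology, participants=6))
```

In a six-person call, a mesh client uploads five streams where an SFU client uploads one, which is why peer-to-peer stops being practical beyond a handful of peers.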

How we design

From requirements to a production architecture

A real-time architecture is a stack of decisions that compound — get the topology or the scaling model wrong early and you re-architect later. Here is the blueprint we design and how we get there.

Clients & SDKs: Web · iOS · Android · desktop
Signaling: WebSocket · token auth · presence
Media layer (SFU / MCU): mediasoup · LiveKit · Janus · Pion · Kurento
NAT traversal: coturn (TURN/STUN) · ICE
Recording & egress: to your storage · RTMP / WHIP
Scaling & orchestration: multi-region · Redis routing · autoscaling · observability
Figure 1: A WebRTC production architecture is six layers of decisions. Every one is made on your numbers — the media layer (highlighted) is where most of the cost and scale live.
1. Assess current & target (Week 1)

Concurrency, call sizes, quality bar, budget, compliance, and (if you have one) the current architecture’s pain points.

2. Design the topology

P2P, SFU, MCU, or hybrid, with the reasoning written down so your team understands the trade-offs, not just the answer.

3. Select the media server

mediasoup, LiveKit, Janus, Pion, Kurento, or a CPaaS, chosen on your features and scale, vendor-agnostic.

4. Design scaling & HA

Multi-region SFU, Redis routing, failover, a load model, and a cost model so you know what scale costs before you build.

5. Build it or hand off the blueprint (Your choice)

We build the architecture, or hand you the design docs and diagrams for your team to build. Your call.

The result holds sub-300 ms group video and scales without re-architecting — the topology and scaling model are right from the start.
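The load and cost model from step 4 can be approximated with back-of-the-envelope arithmetic before any load test. The sketch below is a simplified assumption, not our production model: it treats SFU egress bandwidth as the only bottleneck, assumes every participant publishes one stream at a fixed bitrate, and uses hypothetical names (`LoadModel`, `nodes_needed`, `hourly_cost`) and placeholder numbers.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadModel:
    concurrent_users: int        # users connected at peak
    room_size: int               # participants per call, all publishing
    stream_kbps: int             # bitrate of one forwarded video stream
    node_egress_mbps: int        # egress capacity of one SFU node
    target_utilization_pct: int  # share of that capacity to plan against
    node_cost_per_hour: float    # price of one node, in any currency


def total_egress_kbps(model: LoadModel) -> int:
    """Each user receives one stream from every other member of the room."""
    return model.concurrent_users * (model.room_size - 1) * model.stream_kbps


def nodes_needed(model: LoadModel) -> int:
    """SFU nodes required when egress bandwidth is the limiting resource."""
    usable_kbps = model.node_egress_mbps * 1000 * model.target_utilization_pct // 100
    return math.ceil(total_egress_kbps(model) / usable_kbps)


def hourly_cost(model: LoadModel) -> float:
    return nodes_needed(model) * model.node_cost_per_hour


if __name__ == "__main__":
    # Placeholder inputs for illustration; substitute measured values.
    model = LoadModel(
        concurrent_users=10_000,
        room_size=5,
        stream_kbps=500,
        node_egress_mbps=1_000,
        target_utilization_pct=70,
        node_cost_per_hour=0.50,
    )
    print(nodes_needed(model), "nodes,", hourly_cost(model), "per hour")
```

A real model would also account for CPU per forwarded stream, simulcast layers, and per-region headroom; the point of the sketch is that scale has a computable price before anything is built.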

The decisions

What we decide at each layer

Every layer is a decision with downstream consequences. We make them deliberately, document the reasoning, and design so the choices still hold as you grow.

| Layer | What we design |
|---|---|
| Topology | P2P / SFU / MCU / hybrid, chosen on call size, features, and cost |
| Media server | mediasoup, LiveKit, Janus, Pion, Kurento, or CPaaS (vendor-agnostic selection) |
| Signaling | WebSocket / Socket.io, token auth, presence, reconnection logic |
| NAT traversal | coturn (TURN/STUN), ICE tuning, relay capacity planning |
| Recording / egress | Per-track or composite, to your storage, RTMP/WHIP out |
| Scaling & HA | Multi-region SFU, Redis routing, autoscaling, failover, load + cost model |
| Compliance | HIPAA and SOC 2 patterns, data residency, media encryption |
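Relay capacity planning in the NAT-traversal layer comes down to estimating how much media must pass through TURN. The sketch below is a rough illustration with hypothetical function names; the share of clients that fall back to a relay varies by network and should be measured, so `relayed_fraction` is an input here, not a known constant.

```python
import math


def relay_throughput_mbps(
    concurrent_clients: int,
    relayed_fraction: float,
    client_kbps: int,
) -> float:
    """Media bitrate that must traverse TURN relays.

    client_kbps is the combined upload + download bitrate of one client.
    """
    if not 0.0 <= relayed_fraction <= 1.0:
        raise ValueError("relayed_fraction must be between 0 and 1")
    relayed_clients = math.ceil(concurrent_clients * relayed_fraction)
    return relayed_clients * client_kbps / 1000


def relays_needed(
    throughput_mbps: float,
    relay_capacity_mbps: float,
    spare: int = 1,
) -> int:
    """Relay servers to provision, plus spares so one failure is survivable."""
    return math.ceil(throughput_mbps / relay_capacity_mbps) + spare


if __name__ == "__main__":
    # Placeholder inputs for illustration only.
    needed = relay_throughput_mbps(10_000, relayed_fraction=0.25, client_kbps=2_000)
    print(needed, "Mbps through relays,", relays_needed(needed, 1_000), "relay servers")
```

Sizing relays against a measured fallback rate, with at least one spare, keeps TURN from becoming the silent bottleneck when a region's traffic shifts.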
Use cases

Architectures we have designed

Build vs Buy

Design it in-house, or with a team that has scaled it

Every team can sketch an architecture. The question is whether it survives contact with real load — and whether the people designing it have seen what breaks at scale. Here is the split.

[Figure: positioning chart. Axes (low to high): Architecture expertise; Scale-readiness & ownership. Plotted: In-house first attempt, Generalist consultant, Specialist architecture team (Fora Soft).]
Figure 2: Value axes, not scale. A specialist team brings architecture expertise that is already scale-tested — and documents the design so your team owns it.
Design it in-house when
  • Your team has shipped real-time at your target scale before
  • The system is small and unlikely to grow past one region
  • You have time to learn the failure modes the hard way
Right when: the architecture is low-stakes and your team already owns the expertise.

Bring in a specialist when
  • You are betting the product on real-time and cannot afford to re-architect later
  • You need the topology and scaling model right the first time
  • You want the media-server choice made on your numbers, not a vendor’s pitch
  • You want the architecture documented and owned by your team, not a black box
  • At any size: a sound architecture pays off from your first release
Right when: real-time is core and getting the architecture wrong is expensive; a blueprint in 2–4 weeks.

Not sure the current plan holds? The free architecture review below stress-tests it.

How we work

Four ways to bring us in

Pricing

Starting points, not size caps

Fixed-scope starting points. An architecture design sprint is the fastest way to a sound plan; build and scale engagements follow from it.

Architecture Design Sprint
from $3K
~1 week
  • Assessment + topology design
  • Media-server selection
  • Scaling plan
  • Documented architecture + diagrams
Start a design sprint
Most teams start here
Production Architecture
from $8K
~2–3 weeks
  • Design plus build
  • SFU/MCU, signaling, TURN
  • Recording, multi-platform
  • Deployed
Scope a build
Scale & Multi-Region
from $12K
~4+ weeks
  • Multi-region + HA/failover
  • Load-tested to your concurrency target
  • Observability
  • Cost model
Plan for scale

Infra (media servers, TURN, hosting) is billed at cost. An audit-only engagement is scoped on the call.

Free for qualified projects

Three ways to de-risk before you commit

Before you commit to an architecture, we will pressure-test the plan and catch the scaling traps.

Why Fora Soft

We have designed the architecture that holds at scale

We are not sketching from theory. We have designed and run real-time architectures at hundreds of millions of minutes a month, across streaming, EdTech, enterprise, and telehealth.

Track record

Since 2005, 250+ projects

Two decades of real-time architecture decisions, not a service line.

At scale

Proven under load

Worldcast at 10,000 concurrent sub-second; BrainCert across 10 datacenters at 99.995%; Nucleus at 600M+ minutes a month.

Vendor-agnostic

The media server, on your numbers

mediasoup, LiveKit, Janus, Pion, Kurento, plus CPaaS — we pick on your requirements and say why.

Failure modes

We design around what breaks

TURN capacity, region failover, mixing cost — the things that sink systems at scale, designed for up front.

Documented

Yours, not locked in

You get the architecture diagrams and docs; your team owns the design.

Compliance

HIPAA + SOC 2 by design

Compliance patterns designed into production systems (CirrusMed, Nucleus), not bolted on.

FAQ

WebRTC architecture, answered

The questions teams ask before they commit to an architecture. The same answers power this page’s FAQ schema.

What is custom WebRTC architecture?

P2P, SFU, MCU, or hybrid - how do you choose?

Which media server should we use?

How do you design WebRTC to scale to thousands?

Can you audit or fix our existing architecture?

What about TURN/STUN and connectivity at scale?

Do you design for HIPAA or SOC 2?

Do we get the architecture documented, or just code?

How is this different from just hiring WebRTC developers?

What does it cost and how long does it take?
Further reading

Go deeper on WebRTC architecture

Getting the architecture right the first time?

Tell us the call sizes, the scale, and the constraints. We will design the topology, choose the media server, and hand you a scaling plan — in one call. Need a team to build it? See WebRTC development. Want the background first? See WebRTC architecture for production systems.

Specialist software house for video, real-time and AI products. Founded 2005. 50 in-house engineers.

+1 (914) 775-5855
New York · USA
© Fora Soft, 2005–2026