Media over QUIC · QUIC Development

Media over QUIC (MoQ) & QUIC Development

We build live streaming on the transport that's replacing the old stack — Media over QUIC, raw QUIC, and WebTransport — for sub-second latency that holds at scale. Built on quic-go, Cloudflare quiche, and WHIP/WHEP ingest, deployed in your cloud. First working build in 1–2 weeks, from $8K.

0.4–0.5s: Glass-to-glass latency we've shipped (Worldcast Live)
10,000: Concurrent viewers at that latency
1B+: Streams/month on platforms we've built (Mangomolo)
20+ yrs: Real-time media since 2005, 250+ products

Who we build for

Live commerce · Live sports · Trading & fintech · OTT at scale · Live auctions · Cloud gaming & XR · Broadcast & events

The transport decision

WebRTC, LL-HLS, or Media over QUIC — what to build on in 2026

Media over QUIC (MoQ) is an IETF transport, in active draft as draft-ietf-moq-transport, that carries live media over QUIC. It aims to combine the sub-second latency of WebRTC with the CDN-scale fan-out of HLS — the two things you previously had to choose between. Here's how the four transports that matter actually compare for a production build.

| | WebRTC | LL-HLS / LL-DASH | Media over QUIC (MoQ) | SRT / RTMP |
| --- | --- | --- | --- | --- |
| Glass-to-glass latency | Sub-second (~0.2–0.5s) | 2–5s | Sub-second target (~0.3–1s) | 2–8s (RTMP), ~1s (SRT contribution) |
| Fan-out scale | Hard past ~10K without an SFU mesh | CDN-native, millions | CDN-native by design (relays over QUIC) | Contribution only, not delivery |
| CDN-friendly | No (bespoke infra) | Yes (HTTP) | Yes (QUIC / HTTP-3 path) | No |
| Head-of-line blocking | N/A (UDP) | Yes (TCP segments) | None (QUIC independent streams) | Yes (RTMP / TCP) |
| Maturity (2026) | Mature, ubiquitous | Mature | Emerging: IETF draft-17; Cloudflare relay network live across 330+ cities (2025) | Mature (ingest) |
| Best for | Two-way calls, conferencing | One-to-many VOD / live at scale, latency-tolerant | One-to-many live at scale and sub-second: commerce, betting, sports | Ingest / first-mile contribution |

Most 2026 stacks are hybrid: SRT or WHIP for ingest, WebRTC where two-way matters, and MoQ over QUIC for low-latency delivery at scale. We don't sell you a protocol — we map your latency target, audience size, device mix, and CDN strategy, then build the combination that fits.
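As a rough illustration of that mapping, the delivery columns of the comparison table can be encoded as data and filtered by requirement. This is a minimal sketch, not a scoping process: the `Transport` fields, the numeric thresholds, and the `candidates` helper are all assumptions read off the table, SRT / RTMP is left out because the table lists it as contribution only, and device mix and CDN strategy are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transport:
    name: str
    max_latency_s: float  # upper end of the latency range in the table
    cdn_scale: bool       # fan-out described as CDN-native in the table
    two_way: bool         # suited to two-way calls (assumption from "Best for")


# Values paraphrased from the comparison table; thresholds are illustrative.
TRANSPORTS = [
    Transport("WebRTC", 0.5, cdn_scale=False, two_way=True),
    Transport("LL-HLS / LL-DASH", 5.0, cdn_scale=True, two_way=False),
    Transport("Media over QUIC", 1.0, cdn_scale=True, two_way=False),
]


def candidates(latency_target_s: float, viewers: int, two_way: bool) -> list[str]:
    """Return the transports whose table entries satisfy the requirements."""
    needs_cdn_scale = viewers > 10_000  # the table's "~10K" WebRTC pain point
    result = []
    for t in TRANSPORTS:
        if t.max_latency_s > latency_target_s:
            continue  # too slow for the target
        if needs_cdn_scale and not t.cdn_scale:
            continue  # audience too large for a bespoke mesh
        if two_way and not t.two_way:
            continue  # conversation needs a two-way transport
        result.append(t.name)
    return result
```

An empty result, for example a two-way requirement at a million viewers, is the case where a hybrid stack comes in: no single column of the table covers it.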

The pipeline

How a Media over QUIC stream actually moves

A MoQ system replaces the brittle parts of the old low-latency stack — the TCP segment fetches of HLS, the bespoke SFU mesh of WebRTC at scale — with one QUIC-based path from contribution to player. Here's the route a frame takes, and where the latency budget goes.

01 Ingest: WHIP / SRT / RTMP, first mile, ~50–150 ms
02 QUIC transport: RFC 9000 streams, no HOL blocking, ~10–40 ms/hop
03 MoQ relay: moq-rs · moxygen, fan-out at the edge, ~10–30 ms/hop
04 Edge / CDN: QUIC / HTTP-3 path, CDN economics, ~5–20 ms
05 Player: WebTransport, MSE / jitter buffer, ~50–100 ms

Figure 1: Media over QUIC delivery path — ingest to player, with per-hop latency budget.

01 Ingest

The first mile arrives over WHIP (WebRTC ingest), SRT, or RTMP. We normalize it and hand it to the QUIC layer.

~50–150 ms

02 QUIC transport

QUIC (RFC 9000) carries media as independent streams, so a lost packet stalls only the stream whose data it carried and never the rest: there is no head-of-line blocking across streams. Connection migration also lets a session survive a network switch mid-stream.

~10–40 ms/hop
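
To make the head-of-line point concrete, here is a toy model of in-order delivery. It is a sketch under simplifying assumptions, not how any QUIC stack is implemented: each packet carries data for exactly one stream, one packet is lost and arrives again after a fixed retransmit delay, and `delivery_times` is a hypothetical helper. With `shared_order=True` every packet sits in one ordered byte stream (the TCP case); with `shared_order=False` ordering is enforced per stream only.

```python
def delivery_times(packets, lost_index, retransmit_delay, shared_order):
    """Return when each packet's data becomes readable by the application.

    packets: list of (stream_id, arrival_time) in send order.
    lost_index: index of the one packet that is lost and retransmitted.
    shared_order: True models a single ordered stream, False models
                  independent per-stream ordering.
    """
    arrival = [t for _, t in packets]
    arrival[lost_index] = packets[lost_index][1] + retransmit_delay

    ready = []
    latest = {}  # ordering domain -> delivery time of its previous packet
    for (stream_id, _), t in zip(packets, arrival):
        domain = "all" if shared_order else stream_id
        # In-order delivery: data cannot be read before its predecessor
        # in the same ordering domain.
        deliver = max(t, latest.get(domain, 0))
        latest[domain] = deliver
        ready.append(deliver)
    return ready
```

With packets `[("video", 0), ("audio", 1), ("video", 2), ("audio", 3)]`, the first packet lost, and a retransmit delay of 50, the shared ordering returns `[50, 50, 50, 50]` while the per-stream ordering returns `[50, 1, 50, 3]`: the audio stream is unaffected by the video loss.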

03 MoQ relay fan-out

A Media over QUIC relay (moq-rs, moxygen, or a Cloudflare relay) forwards objects to subscribers and downstream relays. Fan-out happens at the relay tier, the way a CDN scales HLS — but at sub-second latency.

~10–30 ms/hop
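
The fan-out idea can be sketched as a toy relay tree. This is an illustration only and not the moq-rs or moxygen API: the `Relay` class, its method names, and the track name used below are hypothetical, and the network is replaced by direct function calls. The track / group / object vocabulary follows the MoQ Transport draft's data model.

```python
from collections import defaultdict


class Relay:
    """Toy relay: receives one copy of each object per track and forwards it
    to local subscribers and to downstream relays."""

    def __init__(self, name):
        self.name = name
        self.subscribers = defaultdict(list)  # track -> subscriber callbacks
        self.downstream = defaultdict(list)   # track -> downstream relays
        self.objects_received = 0

    def subscribe(self, track, callback):
        """Register a local subscriber (for example a player session)."""
        self.subscribers[track].append(callback)

    def attach(self, track, relay):
        """Register a downstream relay that wants this track."""
        self.downstream[track].append(relay)

    def publish(self, track, group_id, object_id, payload):
        """Accept one object and fan it out one tier further."""
        self.objects_received += 1
        for deliver in self.subscribers[track]:
            deliver(group_id, object_id, payload)
        for relay in self.downstream[track]:
            relay.publish(track, group_id, object_id, payload)
```

With one origin relay, two edge relays attached to it, and three viewers subscribed at each edge, a single published object reaches all six viewers while the origin handles it once and sends only two copies downstream. That is the property that lets the relay tier scale like a CDN.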

04 Edge / CDN

Relays sit at the edge over the QUIC/HTTP-3 path, so delivery rides existing CDN economics instead of a bespoke real-time mesh.

~5–20 ms

05 Player

The client subscribes over WebTransport and renders through MSE or a custom decode path, with a jitter buffer tuned to your latency-vs-smoothness target.

~50–100 ms
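
The latency-vs-smoothness trade-off in the player comes down to one parameter, the target delay. The sketch below is a minimal fixed-delay jitter buffer, assuming frames carry a media timestamp in milliseconds relative to the start of the stream; the `JitterBuffer` class and its method names are hypothetical, and a production player would typically adapt the delay rather than fix it.

```python
import heapq


class JitterBuffer:
    """Hold frames for a fixed target delay and release them in timestamp order."""

    def __init__(self, target_delay_ms):
        self.target_delay_ms = target_delay_ms
        self._heap = []          # min-heap of (media_ts_ms, payload)
        self._last_played = -1   # timestamp of the most recently released frame
        self.dropped = 0         # frames that arrived after their slot was played

    def push(self, media_ts_ms, payload):
        """Queue a frame, or drop it if playout has already moved past it."""
        if media_ts_ms <= self._last_played:
            self.dropped += 1
            return
        heapq.heappush(self._heap, (media_ts_ms, payload))

    def pop_due(self, now_ms, stream_start_ms):
        """Release frames whose playout time (start + timestamp + delay) has passed."""
        out = []
        while self._heap:
            ts, payload = self._heap[0]
            if stream_start_ms + ts + self.target_delay_ms > now_ms:
                break
            heapq.heappop(self._heap)
            self._last_played = ts
            out.append((ts, payload))
        return out
```

A larger `target_delay_ms` absorbs more network jitter and reordering at the cost of added glass-to-glass latency; a smaller one does the opposite and shows up as a higher `dropped` count.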

A tuned MoQ path delivers glass-to-glass latency under one second to a CDN-scale audience — the combination WebRTC and HLS each gave you only half of. We've shipped 0.4–0.5s at 10,000 concurrent viewers on a custom WebRTC + Kurento build (Worldcast Live); MoQ is how we now take that latency to a far larger audience without the mesh. For the protocol-level detail, see how QUIC works.
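
The sub-second claim can be checked against the per-stage figures in Figure 1 with simple arithmetic. The sketch below sums those ranges; the hop counts are assumptions you would replace with your own topology, and the figures cover only the five listed stages (capture, encode, and decode time are not in the figure and are not included).

```python
# (low_ms, high_ms) per stage, taken from Figure 1.
STAGES = {
    "ingest": (50, 150),
    "quic_hop": (10, 40),
    "relay_hop": (10, 30),
    "edge": (5, 20),
    "player": (50, 100),
}


def budget_ms(quic_hops=1, relay_hops=1):
    """Sum the per-stage ranges; per-hop stages are multiplied by the hop count."""
    counts = {
        "ingest": 1,
        "quic_hop": quic_hops,
        "relay_hop": relay_hops,
        "edge": 1,
        "player": 1,
    }
    low = sum(STAGES[stage][0] * n for stage, n in counts.items())
    high = sum(STAGES[stage][1] * n for stage, n in counts.items())
    return low, high
```

With one hop of each kind, `budget_ms()` gives 125 to 340 ms; with three QUIC hops and three relay hops, `budget_ms(3, 3)` gives 165 to 480 ms, which leaves headroom under the one-second target for the stages the figure does not list.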

Why now

Why Media over QUIC matters in 2026

For a decade, live streaming forced a trade-off: WebRTC for sub-second latency but painful past ten thousand viewers, or HLS/DASH for CDN-scale reach but two-to-thirty-second delay. Media over QUIC collapses that choice. Three things made 2026 the year to build on it.

The standard stabilized

QUIC is RFC 9000 and HTTP/3 is RFC 9114; the IETF MoQ Transport draft (draft-ietf-moq-transport, now at revision -17, May 2026) is far enough along that production implementations track it closely.

The infrastructure shipped

Cloudflare launched a production MoQ relay network in August 2025, running on every server across 330+ cities (open-source moq-rs), with Meta, Google, and Cisco building interoperable implementations. Browser WebTransport support is broad enough to reach real audiences.

The tooling matured

quic-go, Cloudflare quiche, moq-rs, and moxygen are production-grade — you no longer write a QUIC stack from scratch to ship a MoQ product.

Being early is the advantage. The teams that build correct MoQ products in 2026 own the latency-sensitive use cases — live commerce, in-play betting, real-time auctions, interactive sports — before the field crowds in. We've spent twenty years on the hard half of this problem: the transport, the jitter buffers, the congestion control, the fan-out. The protocol is new; the engineering underneath it is what we've always done.

What we build

Low-latency systems we've shipped

Live commerce

Shoppable live video

Sub-second video so a viewer taps “buy” while the product is still on screen. We built Sprii's RTMP + WebRTC multistreaming on Cloudflare and Mux — 12.3 million products sold through live shopping in a single year.

Sports

In-play sync

In-play interactive overlays only work if every viewer sees the same moment at the same instant. MoQ keeps the whole audience inside a one-second window.

Trading & fintech

Real-time desktop streaming

Real-time desktop and chart streaming where a half-second of lag is a missed trade. We built Tradecaster — live trade streaming for 46,000+ users, auto-scaling through market-hour spikes.

OTT at scale

Low-latency live channels

Low-latency live channels inside a large VOD platform. We build for OTT scale: Mangomolo serves 1B+ streams a month to 30M+ daily viewers at up to 4K.

Live events

Two-way performance

Concerts and shows where remote performers play together and audiences talk back. Worldcast Live runs full-duplex HD at 0.4–0.5s latency for 10,000 concurrent viewers.

Cloud gaming & XR

Input-to-photon paths

Input-to-photon budgets where QUIC's stream independence and connection migration keep a session alive across a network change.
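
Why a session can survive a network change comes down to how packets are matched to connections. QUIC identifies a connection by a connection ID carried in the packet rather than by the client's IP address and port. The sketch below contrasts the two lookups; `SessionTable` is a hypothetical illustration, and real QUIC additionally validates the new path and rotates connection IDs (RFC 9000, section 9).

```python
class SessionTable:
    """Look up session state either by connection ID or by client address."""

    def __init__(self, key_by_connection_id):
        self.key_by_connection_id = key_by_connection_id
        self.sessions = {}

    def _key(self, addr, connection_id):
        # QUIC-style lookup ignores the address; address-based lookup
        # ignores the connection ID.
        return connection_id if self.key_by_connection_id else addr

    def open(self, addr, connection_id, state):
        self.sessions[self._key(addr, connection_id)] = state

    def lookup(self, addr, connection_id):
        return self.sessions.get(self._key(addr, connection_id))
```

If a client opens a session from one address and then sends from another (Wi-Fi to cellular, say), the connection-ID table still finds the session, while the address-keyed table returns `None` and the session has to be re-established.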

When custom wins

When a custom MoQ/QUIC build pays off

A low-latency SaaS (Millicast, Ant Media, Red5) is the right call when its feature set fits and you're happy renting the transport. Custom wins when latency-at-scale is the product itself, when you need to own the relay tier and the roadmap, or when you're early enough that being first with the right stack is the moat. It wins at any audience size — a thousand viewers or a million.

[Figure 2 axes: sub-second-at-scale requirement × control & future-proofing. Regions: Low-latency SaaS (rent the transport); Self-assembled open source (you wire & staff it); Fora: custom MoQ/QUIC build (own the relay tier · first-mover stack).]

Figure 2: Build vs Buy — sub-second-at-scale requirement × control and future-proofing. Custom wins the top-right at any audience size.

Buy a low-latency SaaS when:
› A managed product covers your latency and feature needs
› You don't need to own the transport or relay tier
› Audience and use case fit a standard template
› You want it live now and will revisit later

Build custom when:
› Sub-second-at-scale is the product (commerce, betting, sports, gaming)
› You need to own the relay tier, the roadmap, and the data path
› You're early on MoQ and want the first-mover stack as a moat
› A per-stream SaaS bill is outgrowing a build you'd own

Right when: low latency at audience scale is a feature your users pay for, at any size.

How we work

Three ways to start

Pricing

What a MoQ/QUIC build costs

Fixed-scope starting points. Final scope depends on ingest mix, audience scale, relay topology, and player targets — run the calculator for an instant estimate.

Starter · from $8K · Live in 1–2 weeks
  • One low-latency path: WHIP/SRT ingest, a QUIC/MoQ relay, a WebTransport player
  • Single region
  • Working low-latency stream you can demo

Growth (most chosen) · from $15K · 4–6 weeks
  • Multi-region relay fan-out
  • Hybrid transport (WebRTC two-way + MoQ delivery)
  • CDN integration, monitoring and QoE metrics

Enterprise · from $30K · 6–8 weeks
  • Owned relay tier, multi-CDN steering, DRM, SLA
  • Load-tested to your peak concurrency
  • Handover of source and infrastructure-as-code

Free for qualified projects

Start with a free working session

Before any contract, we'll give you something useful. Pick the one that fits where you are.

Why Fora Soft

Why teams pick us for low-latency streaming

Sub-second, at scale, already shipped

Worldcast Live runs 0.4–0.5s glass-to-glass for 10,000 concurrent viewers on a custom build. The hard part of MoQ is the part we've done for years.

The transport stack, in production

quic-go, Cloudflare quiche, WHIP/WHEP, SRT, WebRTC, mediasoup, CMAF, LL-HLS — shipped in real products, not slide decks. Sprii, Mangomolo, Tradecaster, Worldcast Live.

Early on the standard

We track the IETF MoQ draft, Cloudflare's production relay network, and the Meta/Google/Cisco implementations so your build is correct against where the standard is going, not where it was.

All in-house, 250+ products

Senior engineers, no offshore handoffs, 250+ products since 2005, and a 100% job-success score on Upwork. We finish and hand over clean.

FAQ

MoQ & QUIC development, answered

What is Media over QUIC, in one line?

How is MoQ different from WebRTC?

How is MoQ different from HLS/DASH?

What latency can you actually hit?

Is MoQ production-ready in 2026?

Will it work in browsers and on CDNs?

Can you migrate our existing HLS or WebRTC stack?

How much does a build cost?

How long does it take?

Do we need QUIC and MoQ, or just one?

Have an idea?

Let's scope your low-latency build.

Within 48 hours you'll get a realistic estimate, a technical recommendation, and an outline of next steps. No obligation. NDA before any access to your code, recordings, or operator dashboards.

Specialist software house for video, real-time and AI products. Founded 2005. 50 in-house engineers.

+1 (914) 775-5855
New York · USA
© Fora Soft, 2005–2026