WHEP: HTTP-Based Egress For WebRTC

Why This Matters

For two decades the streaming industry has had one ingest standard (RTMP) and one playback standard (HLS), and that asymmetry — push using one protocol family, pull using another — was simply accepted as how the field worked. The 2025 publication of WHIP as RFC 9725 began to close the contribution side. WHEP, in symmetry, exists to close the playback side: to give a browser, a smart TV, or a mobile app one standardised way to request a WebRTC stream from any compliant server. The promise is the same as WHIP's, on the receiving end: portability. The player you ship today should work against any WHEP server tomorrow.

But WHEP is also a story about how slowly standards actually move. As of May 2026 you can run WHEP in production (Cloudflare Stream, Dolby Millicast, OvenMediaEngine, MediaMTX, Janus, LiveKit, Ant Media, and several others have shipped it), yet the IETF draft has expired without progressing to RFC, the document sits in the working group's "Revised I-D Needed" state, and the most recent text dates from August 2025. This article is the canonical reference on the current state of WHEP for the engineers, product people, and architects considering it for a 2026 delivery stack: what the draft actually says, where it differs from WHIP, what the layer-selection and event extensions look like in practice, which platforms support it, what its real-world latency floor is, and what the unresolved standards risk means for procurement decisions.

What WHEP Is — In One Page

WHEP, the WebRTC-HTTP Egress Protocol, is a thin HTTP layer that lets a viewer set up, manage, and tear down a WebRTC playback session against a media server. The protocol does exactly one thing: it gives the viewer a standardised way to receive a single WebRTC media session — one bundled audio plus video stream, sent from server to viewer — and to clean up when the viewing ends. Everything else — the media transport, the codecs, the encryption, the congestion control — is plain WebRTC, defined by the existing W3C and IETF specifications. The protocol mimics WHIP in shape but reverses the direction.

The two URLs are the WHEP endpoint URL (where the viewer POSTs to begin a playback session) and the WHEP session URL (returned in the Location header of the 201 response, and used for everything that follows). The verbs are POST (to begin a session), DELETE (to end one), PATCH (in two roles — to update ICE state, or to deliver the viewer's SDP answer to a server counter-offer), OPTIONS (for CORS preflight and capability discovery), and GET (which returns an empty 2XX, useful only for health checks). That is the whole protocol — the rest of the draft is precise definitions of what each request and response must contain.

Mechanically: the viewer generates an SDP offer for a single WebRTC PeerConnection (one bundled audio + video media stream, receive-only direction, DTLS-SRTP encryption negotiated), serialises the offer to text, and sends it as the body of an HTTP POST request with Content-Type: application/sdp to the WHEP endpoint URL. The server has two paths it can take in response: if it can accept the offer as-is, it returns a 201 Created with the SDP answer in the body and the session URL in the Location header; if it cannot accept the offer (for example because the viewer asked for a codec the server does not encode), it returns a 406 Not Acceptable with an SDP counter-offer in the body, and the viewer must reply with an SDP answer carried in a PATCH request against the session resource. That two-path negotiation is the single biggest mechanical difference from WHIP and the reason most production WHEP deployments do not implement the counter-offer path at all: they hard-code the codec assumption and rely on the happy 201 Created path.

Once the offer/answer exchange completes, ICE candidate exchange runs (the server's candidates are delivered fully in the initial response, and only the viewer trickles new candidates afterward), DTLS finishes its handshake, and SRTP media starts flowing from server to viewer. The viewer terminates the session with an HTTP DELETE to the session URL; if the viewer simply disappears, the server detects the loss via WebRTC's own ICE connectivity checks and tears the session down on its own.

The wire transport is whatever WebRTC uses — UDP with ICE candidates of host / server-reflexive / relayed type, DTLS for encryption, SRTP for media — which means WHEP inherits WebRTC's NAT traversal story (STUN/TURN, explored in detail in our NAT article) and WebRTC's congestion control. The signalling layer is HTTP/1.1 or HTTP/2; the draft does not mandate a version.

Figure 1. The complete WHEP playback session, end to end. The happy path is a single POST that returns a 201; the counter-offer path adds a 406 and a follow-up PATCH. Media rides on standard WebRTC machinery underneath, sent from server to viewer.

The Short Version Of How WHEP Got Here

WHEP began life in 2021 as an individual Internet-Draft (draft-murillo-whep-00), authored by the same Sergio Garcia Murillo at Millicast who co-authored WHIP. The motivation was direct: if WHIP was going to standardise WebRTC contribution, the corresponding playback path needed the same treatment, otherwise vendors would standardise the easy side and leave the harder side fragmented. The IETF WISH working group adopted the work in 2022 as draft-ietf-wish-whep-00, with the goal of submitting it to IESG for publication by December 2024.

That milestone slipped. The working group has now produced four versions — -00 through -03 — with the latest (draft-ietf-wish-whep-03) published on 18 August 2025 and expiring without revision on 19 February 2026. The current IETF datatracker state is "Expired" at the IESG level and "Revised I-D Needed — Issue raised by WG" at the working-group level, which in plain English means the document still has open issues the working group must resolve before it can be re-submitted. The intended RFC status remains "Proposed Standard". The authors are now S. Garcia Murillo (Millicast), C. Chen (ByteDance), and D. Jenkins (Everycast Labs Ltd) as editor.

Two things make this status awkward. First, the draft itself includes the IETF's standard boilerplate language: "It is inappropriate to use Internet-Drafts as reference material or to cite them other than as 'work in progress.'" The text is, by IETF convention, not a stable reference. Second, every vendor shipping WHEP today is shipping against a target that has not stabilised — and several are still tracking the earlier individual draft (draft-murillo-whep-01) because that is the version their player libraries were written against. Cloudflare's documentation, for example, explicitly states that its implementation tracks draft-murillo-whep-01, not the working-group draft.

A useful contrast: WHIP was 16 drafts and three and a half years of WG work; WHEP is at four drafts (-00 through -03) and three years, with no reference implementation anchoring it the way DASH-IF's dash.js anchors DASH. Standards-track movement is slow on every protocol, but WHEP's specific blocker has been less about technical disagreement than about getting the harder semantics (codec negotiation, viewer-side resource management, and the events-stream extension question) written into stable text. The protocol is functional in production because every implementer has made local choices the spec leaves open.

The HTTP Verbs — What Each One Does

The protocol's surface area is small. The rest of this section walks through the wire mechanics in the order a viewer encounters them.

POST — begin a session (§4.2)

The viewer generates an SDP offer for a single WebRTC PeerConnection and sends it as the body of an HTTP POST request to the WHEP endpoint URL. The request must carry Content-Type: application/sdp. The SDP offer has constraints — the direction is recvonly (or sendrecv for a viewer that wants to send back-channel audio, an uncommon case); inactive and sendonly directions are forbidden in viewer offers; the session must bundle all media into a single transport (§4.5.1, using max-bundle policy); and exactly one MediaStream is permitted (§4.5.2). Within those constraints, anything legal in WebRTC is legal in the WHEP offer.
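The offer constraints above can be checked mechanically before the POST is sent. The following is a minimal sketch, not a full SDP parser: the function name checkWhepOffer is hypothetical, it looks only at media-level direction attributes and the session-level BUNDLE group, and it does not attempt to verify the single-MediaStream rule.

```javascript
// Checks a viewer SDP offer against the direction and bundling constraints.
// Returns a list of problems; an empty list means the offer passed these checks.
function checkWhepOffer(sdp) {
  const problems = [];
  const lines = sdp.split(/\r?\n/);

  // Group lines into media sections; each section starts at an "m=" line.
  const sections = [];
  for (const line of lines) {
    if (line.startsWith("m=")) sections.push([line]);
    else if (sections.length > 0) sections[sections.length - 1].push(line);
  }
  if (sections.length === 0) problems.push("offer has no media sections");

  // Bundling: the a=group:BUNDLE line should list the mid of every section.
  const bundle = lines.find((l) => l.startsWith("a=group:BUNDLE"));
  const bundledMids = bundle ? bundle.split(" ").slice(1) : [];
  if (!bundle) problems.push("missing a=group:BUNDLE");

  for (const section of sections) {
    const kind = section[0].slice(2).split(" ")[0]; // "audio", "video", ...
    const midLine = section.find((l) => l.startsWith("a=mid:"));
    const mid = midLine ? midLine.slice("a=mid:".length) : null;
    if (bundle && (mid === null || !bundledMids.includes(mid))) {
      problems.push(`${kind} section is not in the bundle`);
    }
    // recvonly and sendrecv are allowed; sendonly and inactive are not.
    if (section.includes("a=sendonly") || section.includes("a=inactive")) {
      problems.push(`${kind} section uses a forbidden direction`);
    }
  }
  return problems;
}
```

In a browser, creating the RTCPeerConnection with bundlePolicy "max-bundle" and recvonly transceivers should produce an offer that passes these checks, so a helper like this is mostly useful in tests or in non-browser clients that assemble SDP themselves.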

The server has three possible responses. The most common is 201 Created (§4.2.1) — the server accepted the offer as-is, generates an SDP answer with direction sendonly, gathers its own ICE candidates before responding, and returns the answer in the body with two response headers: Location (pointing at the session URL) and ETag (the initial entity-tag identifying the ICE session, required if ICE restarts are supported).

The second response is 406 Not Acceptable (§4.2.2): the server could not honour the viewer's offer but is willing to negotiate, and returns its own SDP counter-offer in the body. The response carries the same Location header pointing at the session resource and a valid-until parameter on Content-Type indicating how long the counter-offer remains valid (default 30 seconds). The viewer is then expected to respond with an SDP answer (direction recvonly) carried in an HTTP PATCH against the session URL, with Content-Type: application/sdp and a body containing the answer; the server responds with 204 No Content to confirm. This two-step counter-offer negotiation is unique to WHEP. WHIP does not have it, because the encoder always defines the offer and the server takes it.

The third response is an error: a 4XX for malformed SDP or missing auth, a 409 Conflict with a Retry-After header when no live publisher exists yet (§4.2.8, important for the scenario where the viewer arrives before the broadcaster), or a 503 Service Unavailable when the server cannot allocate resources (§4.6).
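A player has to branch on these outcomes, so it can help to reduce the POST response to a small tagged result first. The sketch below assumes the caller has already copied the response headers into a plain object with lower-cased names; the function name and the result labels are illustrative and do not come from the draft.

```javascript
// Maps the status code and headers of a WHEP POST response to one outcome.
// `headers` is a plain object keyed by lower-cased header name (an assumption
// of this sketch, not something the draft defines).
function classifyPostResponse(status, headers) {
  if (status === 201) {
    // Happy path: the SDP answer is in the body, the session URL in Location.
    return {
      kind: "accepted",
      sessionUrl: headers["location"] ?? null,
      etag: headers["etag"] ?? null,
    };
  }
  if (status === 406) {
    // Counter-offer path: the viewer answers with a PATCH to the session URL.
    return { kind: "counter-offer", sessionUrl: headers["location"] ?? null };
  }
  if (status === 409) {
    // No live publisher yet. Retry-After may also be an HTTP date, which this
    // sketch does not parse; in that case retryAfterSeconds is null.
    const seconds = Number.parseInt(headers["retry-after"] ?? "", 10);
    return {
      kind: "no-publisher",
      retryAfterSeconds: Number.isNaN(seconds) ? null : seconds,
    };
  }
  if (status === 503) return { kind: "overloaded" };
  if (status >= 400 && status < 500) return { kind: "client-error" };
  return { kind: "unexpected" };
}
```

A caller would typically retry on "no-publisher" after the advertised delay, surface "client-error" to the application, and either implement the PATCH reply or fail clearly on "counter-offer".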

DELETE — end a session (§4.3)

When the viewer is done, it sends an HTTP DELETE to the WHEP session URL. The server tears down the ICE and DTLS sessions, releases the media resources, and returns 200 OK. The DELETE is the only clean way to end a session; if the viewer disappears without sending one, the server falls back to WebRTC's own ICE-connectivity-check timers and consent freshness (RFC 7675), which typically detect the loss within a few seconds and tear down state on the server side.

PATCH — two distinct jobs (§4.2.2 and §4.4)

PATCH carries two completely different bodies in WHEP, depending on the job. The first job, already covered above, is to deliver the viewer's SDP answer when the server returned a counter-offer; the request carries Content-Type: application/sdp and the server responds with 204 No Content.

The second job, and the more common one, is to update ICE state during an active session (§4.4). The body uses Content-Type: application/trickle-ice-sdpfrag (the trickle-ICE SDP fragment format from RFC 8840), and the request must carry an If-Match header. There are two sub-cases. A trickle-ICE PATCH (new candidates only, no restart) carries the session's current ETag value in If-Match and the server responds with 204 No Content, no new ETag. An ICE-restart PATCH carries the literal If-Match: * (the wildcard) and the server responds with 200 OK, an application/trickle-ice-sdpfrag body containing the new ufrag/pwd plus the new server candidate set, and a new ETag header identifying the new ICE session.

The response codes encode the server's interpretation: 204 No Content (trickle succeeded), 200 OK with new ETag (restart succeeded), 412 Precondition Failed (the If-Match ETag does not match — the session state has moved on), 428 Precondition Required (the request did not carry If-Match at all), 422 Unprocessable Content (the server supports one of trickle / restart but not the other, §4.4.1). The ETag mechanism is the same state machinery as in WHIP — it is what lets the client know whether its mental model of the session is still valid.
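The two ICE PATCH variants and the status codes listed above fit into a pair of small helpers. This is a sketch under the assumption that the trickle case sends the ETag received at session creation and the restart case sends the wildcard; the function names and the returned labels are hypothetical.

```javascript
// Builds the options object for an ICE PATCH, in the shape fetch() accepts.
// kind is "trickle" (new candidates only) or "restart" (new ICE session).
function buildIcePatch(kind, sdpFrag, currentEtag) {
  if (kind !== "trickle" && kind !== "restart") {
    throw new Error(`unknown ICE PATCH kind: ${kind}`);
  }
  return {
    method: "PATCH",
    headers: {
      "Content-Type": "application/trickle-ice-sdpfrag",
      // Trickle pins the request to the current ICE session; restart uses "*".
      "If-Match": kind === "restart" ? "*" : currentEtag,
    },
    body: sdpFrag,
  };
}

// Translates the status code of an ICE PATCH response into a label.
function interpretIcePatchStatus(status) {
  switch (status) {
    case 204: return "trickle-accepted";
    case 200: return "restart-accepted"; // response carries a new ETag
    case 412: return "stale-etag";       // session state has moved on
    case 428: return "missing-if-match";
    case 422: return "unsupported";      // trickle or restart not supported
    default: return "unexpected";
  }
}
```

On "restart-accepted" the client should store the ETag from the response and use it for later trickle requests; on "stale-etag" it should stop trickling against the old value.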

OPTIONS and GET — preflight and health (§4.1, §4.3)

OPTIONS is used for two purposes. First, CORS preflight: a browser that wants to POST to a WHEP endpoint will issue an OPTIONS request first, and the WHEP endpoint MUST handle the preflight with the appropriate Access-Control-Allow-Origin headers. Second, capability advertising: a 200 OK response to an OPTIONS request SHOULD include Accept-Post: application/sdp. GET against either the endpoint or a session URL returns a 2XX with no body — useful only for liveness checks.

Layer Selection, Events, And Extensions — The Open Edges

Here is where the draft is at its thinnest and where production deployments differ most from one another.

Two features many WHEP players genuinely need — selecting which simulcast or SVC layer to receive, and receiving server-pushed events such as "the publisher just stopped" or "current viewer count is 8,427" — are not specified normatively in the draft. The draft includes an extension framework (§4.9 and §6) that says "extensions advertise themselves via a Link header on the 201 Created response, with a rel attribute carrying an IANA-registered URN starting urn:ietf:params:whep:ext:", and gives one example of a hypothetical Server-Sent Events extension — but that is presented as illustrative, not normative, and the document explicitly states "this document does not specify such an extension". Earlier individual drafts (draft-murillo-whep-*) carried a layer-selection JSON API and an SSE events stream, but those were removed before the working group adoption, leaving the field to vendor extension.
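Because extensions are advertised through Link headers, a player can discover what a given server offers without any vendor-specific code. The sketch below collects every link whose rel starts with the WHEP extension URN prefix. The function name is hypothetical, the parser is deliberately simplified (one relation type per link, no commas inside quoted parameter values), and any URN shown in examples is made up for illustration.

```javascript
const WHEP_EXT_PREFIX = "urn:ietf:params:whep:ext:";

// Returns a Map from extension URN to target URL, built from the value of the
// Link header(s) on the 201 response (multiple headers joined with ", ").
function findWhepExtensions(linkHeader) {
  const found = new Map();
  // One link-value looks like: <target>; rel="..."; other="..."
  const linkValue = /<([^>]*)>((?:\s*;\s*[^;,=\s]+=(?:"[^"]*"|[^;,\s]*))*)/g;
  for (const match of linkHeader.matchAll(linkValue)) {
    const target = match[1];
    const rel = /;\s*rel=(?:"([^"]*)"|([^;,\s]*))/i.exec(match[2]);
    const relValue = rel ? (rel[1] ?? rel[2]) : "";
    if (relValue.startsWith(WHEP_EXT_PREFIX)) found.set(relValue, target);
  }
  return found;
}
```

A player can then check the returned map for the extension URNs it knows how to speak and silently ignore the rest, which keeps the core playback path portable even where the managed features are not.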

In practice every vendor has solved both problems in their own way. Cloudflare exposes layer-selection through a separate per-session HTTP endpoint that accepts a JSON body with {mediaId, rid, spatialLayerId, temporalLayerId} fields — the shape that lived in the older drafts. Dolby Millicast uses an SSE channel discovered via a Link header for events such as viewercount, active, inactive, and layers. OvenMediaEngine implements layer selection through query parameters on the WHEP session URL. None of these are wire-compatible across vendors. A player that knows how to talk to Cloudflare's layer-selection endpoint will not know how to talk to Millicast's.

For procurement, that fragmentation matters more than the raw draft text does. The wire is interoperable enough that an OvenMediaEngine viewer can play a Millicast stream; the managed playback (switch to a lower simulcast layer, listen for end-of-stream) is not. Most vendors document their extensions clearly, but they are vendor-specific surface area you must integrate against per platform.

Authentication — Bearer Tokens And Why You Should Use Them

The draft says (§4.8) that "all WHEP endpoints, sessions and clients MUST support HTTP Authentication" and (§4.8.1) that "bearer token authentication ... MUST be supported by all WHEP entities". Every compliant WHEP server understands the Authorization: Bearer header, and every compliant client knows how to send it. If the client is not configured with a token, the spec further requires that the header MUST NOT be sent in any request — closing the footgun where a default or stale value leaks.
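The "MUST NOT be sent" rule is easy to get wrong when headers are built from a template. A minimal sketch of a header builder that omits Authorization entirely when no token is configured follows; the function name is hypothetical.

```javascript
// Builds request headers for a WHEP call. When no token is configured the
// Authorization header is left out completely rather than sent empty.
function whepRequestHeaders(token, contentType) {
  const headers = {};
  if (contentType) headers["Content-Type"] = contentType;
  if (typeof token === "string" && token.length > 0) {
    headers["Authorization"] = `Bearer ${token}`;
  }
  return headers;
}
```

The same helper can serve POST (with application/sdp), PATCH (with application/trickle-ice-sdpfrag), and DELETE (with no content type), so the token handling lives in one place.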

Unlike WHIP, where the bearer token is almost always tied to a stable publisher identity (the broadcaster's account), WHEP tokens have richer lifecycles in production. Some vendors issue short-lived per-viewer JWTs scoped to a single session and signed by the platform auth service; others issue stream-level tokens that any authorised viewer can present; a few support "open" streams that require no token at all (the Authorization header is then simply absent). For paid content the token is the gate. Treat it like any other API credential: short TTLs, scope to one stream where possible, store in your secrets manager.

The cryptographic protection on the wire is twofold: HTTPS protects the signalling (the SDP offer, the SDP answer, the bearer token, the candidate exchanges), and DTLS-SRTP protects the media (WebRTC's standard DTLS handshake produces SRTP master keys that encrypt every media packet between the server and the viewer). Both layers are mandatory; §5 (Security Considerations) of the draft says HTTPS SHALL be used. There is no plaintext mode.

What The Latency Floor Actually Is

WHEP inherits WebRTC's latency budget. The realistic number for a complete WHIP-to-WHEP path on a clean network in 2026 is between 200 milliseconds and 800 milliseconds of glass-to-glass latency, depending on geography and configuration.

The arithmetic, out loud. Take a typical 30 fps stream where one frame arrives every 33 ms. Add the encoder's pipeline (30 to 100 ms for hardware H.264), DTLS-SRTP packetisation, the ingest network round-trip (5 to 100 ms same-region), the server's media routing (10 to 50 ms inside the SFU), the egress network round-trip (5 to 100 ms same-region), and the WHEP-side jitter buffer (typically 100 to 300 ms in WebRTC, configurable). A same-region wired path adds up to around 300 to 500 ms; a transatlantic path doubles it.
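The same arithmetic can be written down as a budget table and summed. The stage figures below are the ranges quoted above; DTLS-SRTP packetisation is left out because no figure is given for it. Adding every best case gives 183 ms and adding every worst case gives 683 ms, a wider envelope than the 300 to 500 ms quoted for a same-region path, which is expected since the stages rarely all sit at an extreme at once.

```javascript
// Per-stage latency ranges in milliseconds, taken from the paragraph above.
const latencyBudgetMs = [
  { stage: "frame interval at 30 fps", min: 33, max: 33 },
  { stage: "encoder pipeline", min: 30, max: 100 },
  { stage: "ingest network", min: 5, max: 100 },
  { stage: "SFU media routing", min: 10, max: 50 },
  { stage: "egress network", min: 5, max: 100 },
  { stage: "viewer jitter buffer", min: 100, max: 300 },
];

// Sums the best-case and worst-case columns independently.
function sumBudget(stages) {
  return stages.reduce(
    (total, s) => ({ min: total.min + s.min, max: total.max + s.max }),
    { min: 0, max: 0 }
  );
}

const envelope = sumBudget(latencyBudgetMs);
console.log(`glass-to-glass envelope: ${envelope.min} to ${envelope.max} ms`);
```

The jitter buffer is the largest single term in both columns, which is why it is usually the first setting to tune when chasing a lower number.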

Compare against the alternatives end-to-end:

Configuration	Glass-to-glass	Notes
WHIP → WHEP, same-region wired	200–500 ms	The lowest-latency contribution + delivery path in 2026.
WHIP → WHEP, transatlantic wired	400–800 ms	Each transatlantic RTT eats roughly 100 ms.
RTMPS → server → WHEP	800–1500 ms	The RTMPS contribution leg dominates; the WHEP leg is fast.
RTMPS → server → LL-HLS	3–6 s	The LL-HLS packager dominates.
WHIP → server → LL-HLS	3–6 s	The contribution side is fast; the LL-HLS packager still dominates.
RTMPS → server → HLS	8–20 s	The classical broadcast-on-internet baseline.

The other axis worth understanding: WHEP, like all WebRTC delivery, requires the playback path to be relatively clean. WebRTC's congestion control and jitter buffer can tolerate a few percent of packet loss, but a 5% loss event on a hostile cellular link or a saturated home Wi-Fi link is where WebRTC degrades visibly. HLS players degrade by switching to a lower bitrate variant; a WHEP player degrades by stalling. The 2026 industry intuition — use LL-HLS for "low latency at scale on lossy networks"; use WHEP for "sub-second latency on clean networks for a moderate viewer count" — captures the tradeoff well.

A Worked Example — Playing Cloudflare Stream Over WHEP

Let us trace one concrete configuration. We are playing a 1080p30 H.264 stream at 6 Mbps from Cloudflare Stream's WHEP endpoint, in a Chrome browser tab on a wired Ethernet uplink, against a stream that an OBS encoder is currently pushing into the same Cloudflare account over WHIP.

The client code is straightforward. The browser creates an RTCPeerConnection, adds two RTCRtpTransceiver instances in recvonly mode (one for video, one for audio), calls createOffer() to generate the SDP offer, and then issues fetch(WHEP_URL, { method: 'POST', headers: { 'Content-Type': 'application/sdp', 'Authorization': 'Bearer ' + token }, body: offer.sdp }). Cloudflare's WHEP endpoint validates the offer, gathers its full set of ICE candidates (host candidates in every Cloudflare PoP region plus the TURN relay), generates the SDP answer with sendonly media sections and the full candidate list, and returns 201 Created with Location: /webrtc/sessions/ and ETag: "view-abc123...".

The browser receives the answer, calls pc.setRemoteDescription(answer), ICE begins, and a connectivity check succeeds against a Cloudflare host candidate in the same region (round-trip time roughly 15 ms). DTLS handshake completes in two round-trips (~30 ms). SRTP keys are derived, and media starts flowing from server to viewer. The RTCPeerConnection.ontrack event fires for the video and audio tracks; the application attaches them to a video element with videoElement.srcObject = stream; playback starts. End-to-end latency from glass to glass, measured by displaying a millisecond-resolution clock on the source camera and reading it in the WHEP-side browser tab: approximately 380 ms.

During the session, the browser gathers one additional server-reflexive ICE candidate (the public-IP discovery completes after the initial POST) and issues a PATCH with Content-Type: application/trickle-ice-sdpfrag, If-Match: "view-abc123...", and the candidate in the body. Cloudflare responds with 204 No Content (a successful trickle update, no new ETag). The session runs for 90 minutes uninterrupted; when the user closes the tab, the browser issues DELETE /webrtc/sessions/ with the bearer token, Cloudflare responds 200 OK, the SRTP flow stops, and the resources are released.
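Put together, the happy path of this walkthrough fits in a short browser function. The sketch below is generic rather than Cloudflare-specific: playWhep and resolveSessionUrl are hypothetical names, the endpoint URL and token come from the caller, the counter-offer and trickle paths are left out, and the offer is sent without waiting for ICE gathering to finish. It uses only standard browser APIs (RTCPeerConnection, MediaStream, fetch).

```javascript
// The Location header may be a relative path, so resolve it against the
// endpoint URL before using it for PATCH or DELETE.
function resolveSessionUrl(endpointUrl, location) {
  return new URL(location, endpointUrl).toString();
}

async function playWhep(endpointUrl, token, videoElement) {
  const pc = new RTCPeerConnection({ bundlePolicy: "max-bundle" });
  pc.addTransceiver("video", { direction: "recvonly" });
  pc.addTransceiver("audio", { direction: "recvonly" });

  // Collect incoming tracks into one stream and attach it to the element.
  const stream = new MediaStream();
  videoElement.srcObject = stream;
  pc.ontrack = (event) => stream.addTrack(event.track);

  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);

  const auth = token ? { Authorization: `Bearer ${token}` } : {};
  const response = await fetch(endpointUrl, {
    method: "POST",
    headers: { "Content-Type": "application/sdp", ...auth },
    body: offer.sdp,
  });
  if (response.status !== 201) {
    throw new Error(`WHEP POST failed with status ${response.status}`);
  }
  const location = response.headers.get("Location");
  if (!location) throw new Error("201 response had no readable Location header");
  const sessionUrl = resolveSessionUrl(endpointUrl, location);

  await pc.setRemoteDescription({ type: "answer", sdp: await response.text() });

  // Call stop() when the viewer leaves: closes the peer and sends the DELETE.
  const stop = async () => {
    pc.close();
    await fetch(sessionUrl, { method: "DELETE", headers: auth });
  };
  return { pc, sessionUrl, stop };
}
```

For cross-origin endpoints the server has to list Location and ETag in Access-Control-Expose-Headers, otherwise the browser hides them from script and the session URL cannot be read.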

Compare the same content path running over HLS. The same OBS encoder, the same Cloudflare account, but the viewer pulls an HLS manifest instead of opening a WHEP session: glass-to-glass latency is approximately 12 seconds. Compare against LL-HLS on the same content: approximately 4 seconds. WHEP's roughly 380 ms is one and a half orders of magnitude lower than HLS, and a factor of ten lower than LL-HLS. That is the only number that matters for the use case decision.

Figure 2. The same source camera, the same Cloudflare Stream service, three delivery stacks. WHEP delivers roughly 380 ms glass-to-glass on a clean wired path; LL-HLS roughly 4 s; HLS roughly 12 s. The latency difference is the entire reason WHEP exists.

What Platforms Actually Support WHEP In 2026

The table below summarises WHEP playback support across the platforms we encounter in client engagements in 2026. The data is current as of May 2026; cross-check vendor documentation for the latest status before final architecture decisions.

Platform	WHEP playback	Trickle ICE	Layer selection	Events stream	Notes
Cloudflare Stream	Yes (GA)	Yes	Vendor JSON API	No	Tracks draft-murillo-whep-01; sub-second playback to unlimited viewers via Cloudflare's edge.
Dolby Millicast	Yes (GA)	Yes	Vendor JSON API	Yes (SSE via Link header)	The protocol's first commercial home; the events stream extension is the most mature in the industry.
AWS IVS Real-Time	Yes (GA)	Yes	No	No	WebRTC playback through the IVS Web Broadcast SDK; WHEP at the API surface.
OvenMediaEngine	Yes (GA)	Yes	Query-parameter API	No	Open-source; very common self-hosted choice for WHIP-to-WHEP pipelines.
MediaMTX	Yes (GA)	Yes	No	No	Open-source media server with broad protocol coverage.
Janus	Yes (plugin)	Yes	Through plugin	No	Self-hosted SFU; WHEP plugin from the community since 2023.
mediasoup	Yes (community)	Yes	Through application	No	Self-hosted SFU; community WHEP gateways exist.
LiveKit	Yes (GA)	Yes	Through SDK	Through SDK	Cloud and self-hosted; WHEP supported alongside LiveKit's native SDK.
Ant Media Server	Yes (GA)	Yes	Vendor API	Yes	WHEP added in v2.10; both managed and self-hosted.
THEO Technologies	Yes (GA)	Yes	No	No	Integrated with HESP for hybrid stacks.
Wowza Streaming Engine	Yes	Yes	No	No	Self-hosted; WHEP added in 2024.
Mux Live	Yes (GA)	Yes	No	No	Three-protocol delivery endpoint.
Vimeo Livestream	Yes	Yes	No	No	Three-protocol delivery endpoint.
YouTube	No	—	—	—	HLS / DASH only.
Twitch	No	—	—	—	HLS only.
Facebook Live	No	—	—	—	HLS / DASH only.

The split mirrors the WHIP support map: developer-platform and B2B video products ship WHEP alongside HLS and DASH, while consumer social platforms are HLS-first and show no public roadmap for adding WHEP. The reason is the same on both ends — consumer social runs at scales where the WebRTC cost-per-viewer model is uncompetitive, and HLS-over-CDN remains cheaper at that scale by an order of magnitude.

Common Mistakes — The Things That Break WHEP In Production

A short list of the failures we see most often when bringing up a new WHEP delivery link. Most are configuration mismatches, not bugs in the protocol.

Pitfall 1: ICE failure because TURN is not configured. WebRTC needs a TURN server when the viewer is behind a symmetric NAT that prevents direct peer-to-peer connectivity. Every production WHEP deployment must ship a TURN server (or use the platform's hosted TURN — Cloudflare's anycast TURN, Twilio's TURN, or a Coturn cluster). The single most common failure is the viewer reporting "ICE failed" because no TURN candidates were returned in the SDP answer. The fix is either to configure the WHEP server to advertise a TURN URL (the draft permits this via a Link header with rel="ice-server"), or to use the platform's TURN.
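When the server does advertise ICE servers through Link headers, the player still has to turn them into an RTCPeerConnection configuration. The sketch below assumes the attribute names used in RFC 9725 for WHIP (rel="ice-server", username, credential) and that a WHEP server follows the same convention; the function name is hypothetical and the parser does not handle commas inside quoted values.

```javascript
// Converts Link header values with rel="ice-server" into RTCIceServer-shaped
// objects suitable for new RTCPeerConnection({ iceServers }).
function iceServersFromLink(linkHeader) {
  const servers = [];
  const linkValue = /<([^>]*)>((?:\s*;\s*[^;,=\s]+=(?:"[^"]*"|[^;,\s]*))*)/g;
  const param = /;\s*([^;,=\s]+)=(?:"([^"]*)"|([^;,\s]*))/g;
  for (const [, target, rawParams] of linkHeader.matchAll(linkValue)) {
    const params = {};
    for (const [, name, quoted, bare] of rawParams.matchAll(param)) {
      params[name.toLowerCase()] = quoted ?? bare;
    }
    if (params.rel !== "ice-server") continue;
    const server = { urls: target };
    if (params.username) server.username = params.username;
    if (params.credential) server.credential = params.credential;
    servers.push(server);
  }
  return servers;
}
```

Since the header arrives on the response to the POST, a browser client that has already created its offer can apply the result with pc.setConfiguration({ iceServers }) or use it for the next session.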

Pitfall 2: Counter-offer path not implemented in the player. The draft's 406 Not Acceptable counter-offer path is rare in production but real on some self-hosted servers when the viewer's offered codecs do not match the server's encoded ones. Most player libraries fail closed when they see a 406 because the counter-offer SDP parsing was never wired up. The fix is either to widen the viewer's offered codec set (always offer H.264 baseline + Opus, which every server supports) or to implement the PATCH-with-SDP-answer response.

Pitfall 3: HTTPS not enforced. §5 of the draft says HTTPS SHALL be used. A few self-hosted servers ship a plaintext-HTTP mode for local development convenience; never deploy that mode in production. Cleartext HTTP exposes the bearer token on the wire (it travels in the Authorization header of the POST), and a leaked token grants playback rights to any attacker on the path. Use HTTPS, with a valid certificate, on every WHEP endpoint.

Pitfall 4: Bearer token in the URL query string. Some vendors document a "convenience" form where the bearer token is appended to the URL as a query parameter. The form is not in the draft, and using it leaks the token to every HTTP intermediary that logs URLs (load balancers, proxies, CDN edge logs, browser history). Use the Authorization: Bearer header instead.

Pitfall 5: Treating WHEP like HLS for scale. Every WHEP viewer is a real WebRTC peer on the server. A WHEP server that serves 10,000 viewers maintains 10,000 SRTP encryption contexts, 10,000 jitter buffers, 10,000 ICE sessions, and 10,000 streams of outbound media. The unit economics are very different from an HLS/CDN egress model. Plan capacity per-viewer, not per-stream; pick a provider that handles the SFU fan-out architecture for you (see our WebRTC scale article for the details).
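A back-of-envelope calculation makes the per-viewer point concrete. The sketch below multiplies viewer count by stream bitrate; the 6 Mbps figure is borrowed from the worked example earlier in this article, and the function name is hypothetical. It ignores retransmissions, simulcast layers, and protocol overhead.

```javascript
// Rough server-side load for a WebRTC fan-out: one peer session per viewer
// and one copy of the stream per viewer on the egress side.
function fanOutLoad(viewers, bitrateMbps) {
  return {
    peerSessions: viewers,
    egressGbps: (viewers * bitrateMbps) / 1000,
  };
}

const load = fanOutLoad(10000, 6);
console.log(`${load.peerSessions} sessions, ${load.egressGbps} Gbps egress`);
```

At 10,000 viewers and 6 Mbps that is 60 Gbps of outbound media terminated on WebRTC servers, which is why the fan-out architecture matters more than the signalling protocol at this scale.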

Pitfall 6: Assuming the draft is stable. WHEP is currently draft-ietf-wish-whep-03, expired in February 2026, with a working-group note of "Revised I-D Needed". A future revision could change ETag semantics, the counter-offer flow, the layer-selection question, or any IANA registration. Existing production deployments are unlikely to see churn, because vendors have shipped what they implemented, but if you are building a player library you maintain, track the working-group mailing list. The text is genuinely a moving target.

When To Choose WHEP — A Decision Framework

WHEP is the right delivery choice in a specific set of conditions. The framework below is the one we walk clients through when sketching a 2026 playback architecture.

Choose WHEP when: (a) the latency budget is under one second glass-to-glass; (b) the viewer's network is wired or a healthy Wi-Fi / 5G link with under 1% expected packet loss; (c) the audience is bounded (typically under 10,000 concurrent viewers per stream, where WebRTC fan-out economics still work); (d) the target platform has shipped WHEP delivery (any of the platforms in the table above); and (e) the player runs in a browser, mobile app, or smart TV environment with a WebRTC stack. The contribution path should ideally be WHIP for the lowest total latency.

Choose LL-HLS when: latency must be under five seconds but the audience is larger than 10,000 (or unbounded), the network mix is unknown (because LL-HLS degrades by switching bitrates rather than stalling), or the device coverage requirement includes legacy smart TVs and set-top boxes that have no WebRTC stack.

Choose LL-DASH / CMAF chunked when: the toolchain is already DASH-native (Bitmovin, Shaka Packager pipelines) and the player ecosystem is dash.js or Shaka.

Choose HLS when: latency is not the constraint (VOD, archived live, broadcast tail) — HLS-over-CDN is the cheapest per-viewer egress economy by a wide margin.

Choose HESP when: a HESP-Alliance toolchain is already in place and the 400 ms latency claim is verified for your path.
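The core of the framework can be sketched as a function over the measurable inputs. This is an illustration of the thresholds stated above, not a procurement tool: the function and field names are hypothetical, the LL-DASH and HESP branches are omitted because they depend on an existing toolchain rather than on numbers, and whether the target platform has shipped WHEP is assumed to have been checked separately.

```javascript
// Picks a delivery protocol from latency budget, audience size, expected
// packet loss, and whether the target devices have a WebRTC stack.
function chooseDelivery(req) {
  const { latencyBudgetMs, concurrentViewers, expectedLossPct, hasWebRtcStack } = req;
  const whepFits =
    latencyBudgetMs < 1000 &&
    expectedLossPct < 1 &&
    concurrentViewers < 10000 &&
    hasWebRtcStack;
  if (whepFits) return "WHEP";
  // Low latency still wanted, but scale, network, or devices rule out WebRTC.
  if (latencyBudgetMs < 5000) return "LL-HLS";
  return "HLS";
}
```

Note that a sub-second budget combined with a large or lossy audience falls through to LL-HLS here, which cannot meet that budget; in practice that combination is a signal to renegotiate the requirement or split the audience into an interactive tier and a broadcast tier.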

Figure 3. The 2026 delivery decision tree. WHEP for sub-second on clean paths with bounded audience; LL-HLS for sub-five-second at scale; HLS for cheap CDN egress; LL-DASH or HESP when an existing toolchain dictates them.

Where Fora Soft Fits In

We have shipped WHIP-to-WHEP delivery stacks across several verticals since the protocol stabilised enough to be production-credible: telemedicine deployments where a clinical device pushes via WHIP to a central SFU that fans out to specialists via WHEP for sub-second remote consultation; e-learning live classroom rigs where the instructor pushes via WHIP and active participants receive via WHEP while passive viewers fall back to LL-HLS; live shopping and auction products where the moderator pushes via WHIP and the audience watches over WHEP with the LL-HLS broadcast tail behind; AR/VR live-event streaming where the WHEP latency keeps the audience inside the comfort budget that human perception requires. The pattern is consistent: when the latency target is unambiguously sub-second and the audience is bounded, WHEP is the right tool, and the open-edge extensions (layer selection, events) are integrated against the specific vendor we picked.

Call to action

Talk to a streaming engineer: book a 30-minute scoping call to talk through your WHEP plan.
See our case studies: 250+ shipped projects across video streaming, WebRTC, OTT, telemedicine, e-learning, surveillance, and AR/VR.
Download the WHEP Integration Checklist: One-page printable summary of the WHEP draft-03 wire flow, the HTTP verbs, the counter-offer mechanism, the ETag-based ICE restart state machine, the bearer-token auth pattern, the vendor extension surface (layer selection, events….

References

draft-ietf-wish-whep-03 — WebRTC-HTTP Egress Protocol (WHEP), S. Garcia Murillo, C. Chen, and D. Jenkins (Ed.), IETF WISH Working Group, Internet-Draft published 18 August 2025, expired 19 February 2026; intended status "Proposed Standard"; current WG state "Revised I-D Needed"; current IESG state "Expired". The canonical specification text. Every section reference in this article (§4.1 HTTP usage, §4.2 Playback session set up, §4.3 Playback session termination, §4.4 ICE support, §4.5 WebRTC constraints, §4.8 Authentication, §4.9 Protocol extensions, §5 Security Considerations, §6 IANA Considerations) points at this draft. <https://datatracker.ietf.org/doc/draft-ietf-wish-whep/> (tier 1, official IETF Internet-Draft; per IETF convention, drafts are "work in progress" and may change before RFC publication).
RFC 9725 — WebRTC-HTTP Ingestion Protocol (WHIP), S. Garcia Murillo and A. Gouaillard, IETF Standards Track, Proposed Standard, March 2025. The contribution mirror of WHEP; the published RFC that WHEP draft is modelled on. Used in this article for the WHIP-vs-WHEP comparison and the latency-budget arithmetic. <https://www.rfc-editor.org/info/rfc9725> (tier 1, official IETF Standards Track RFC).
RFC 8840 — A SIP Usage Of The Trickle ICE Mechanism, IETF, January 2021. The trickle-ICE SDP fragment format used by WHEP PATCH bodies (application/trickle-ice-sdpfrag). <https://www.rfc-editor.org/info/rfc8840> (tier 1, official RFC).
RFC 9429 — JSEP / WebRTC SDP Offer/Answer, IETF, March 2024. The WebRTC SDP offer/answer rules WHEP §4.5 normatively references for max-bundle policy. <https://www.rfc-editor.org/info/rfc9429> (tier 1, official RFC).
RFC 9143 — Negotiating Media Multiplexing Using the Session Description Protocol (SDP), IETF, February 2022. The SDP bundle specification WHEP §4.5.1 normatively references. <https://www.rfc-editor.org/info/rfc9143> (tier 1, official RFC).
RFC 9110 — HTTP Semantics, IETF, June 2022. The HTTP authentication framework WHEP §4.8 normatively references, plus the entity-tag (If-Match, ETag) semantics. <https://www.rfc-editor.org/info/rfc9110> (tier 1, official RFC).
RFC 6750 — The OAuth 2.0 Authorization Framework: Bearer Token Usage, IETF, October 2012. The bearer-token semantics WHEP §4.8.1 normatively references. <https://www.rfc-editor.org/info/rfc6750> (tier 1, official RFC).
RFC 7675 — Session Traversal Utilities for NAT (STUN) Usage for Consent Freshness, IETF, October 2015. The consent revocation procedure WHEP §4.3 references when a viewer disappears without sending DELETE. <https://www.rfc-editor.org/info/rfc7675> (tier 1, official RFC).
IETF WISH Working Group, IETF. The chartering documents, mailing-list archive, and draft history for the standardisation work, including the December 2024 milestone for IESG submission that has slipped. <https://datatracker.ietf.org/wg/wish/about/> (tier 1, IETF working group records).
draft-murillo-whep-01 — WebRTC-HTTP Egress Protocol (WHEP) [historical], S. Garcia Murillo, IETF individual draft, 2022. The earlier individual draft that Cloudflare and several other vendors still track in production deployments; carried the layer-selection JSON API and the SSE events stream that the working-group draft -03 does not normatively include. <https://datatracker.ietf.org/doc/draft-murillo-whep/> (tier 1, IETF individual draft; superseded by working-group drafts).
Cloudflare Stream — WebRTC (Beta) Documentation, Cloudflare, retrieved May 2026. The platform documentation for Cloudflare's WHIP/WHEP implementation, including trickle ICE support, the explicit statement that the implementation tracks draft-murillo-whep-01, and the migration to Cloudflare Realtime (Calls) starting 13 March 2025 with no API changes. <https://developers.cloudflare.com/stream/webrtc-beta/> (tier 4, vendor documentation).
Cloudflare Stream Changelog — Trickle ICE Support For WHIP And WHEP, Cloudflare, 2025. The release notes for the trickle-ICE addition on Cloudflare's WHIP/WHEP endpoints. <https://developers.cloudflare.com/stream/changelog/> (tier 4, vendor documentation).
WebRTC live streaming to unlimited viewers, with sub-second latency, Cloudflare blog, 2022. The original Cloudflare announcement of WHIP and WHEP support; useful for the historical adoption timeline. <https://blog.cloudflare.com/webrtc-whip-whep-cloudflare-stream/> (tier 4, first-party vendor engineering blog).
Dolby OptiView (Millicast) Platform and Media Server Changelog, Dolby, retrieved May 2026. The vendor changelog for the WHEP and events-stream extension implementation that is the most mature in the industry. <https://optiview.dolby.com/docs/millicast/changelog/changelog-dolbyio-platform-media-server/> (tier 4, vendor documentation).
OvenMediaEngine Documentation — WebRTC / WHIP / WHEP, AirenSoft, retrieved May 2026. The open-source media server's documentation for WHEP playback, including the query-parameter layer-selection convention. <https://docs.ovenmediaengine.com/> (tier 4, reference-implementation documentation).
How Cloudflare Glares At WebRTC With WHIP And WHEP, webrtcHacks (Chad Hart), 2022. The community engineering overview of Cloudflare's WHIP/WHEP implementation. <https://webrtchacks.com/how-cloudflare-glares-at-webrtc-with-whip-and-whep/> (tier 4, engineering blog).

Related glossary terms

WebRTC delivery (egress)

mediasoup

Packet loss

Contribution

Congestion control

Shaka Packager

Live streaming

WebRTC ingest