Why This Matters
The ingest leg is the riskiest hop in the pipeline and the leg with the most freedom of choice. A delivery protocol is mostly decided by the device you ship to: an Apple TV gets HLS, a latency-sensitive product gets WebRTC or Media over QUIC, and the choice is narrow. The ingest protocol is open: the same camera, the same encoder, the same uplink will often run on three different protocols depending on who configured the encoder this morning.
This article is for the product manager, founder, or technical lead who has finished reading Block 3 and wants to take one decision into the next scoping call rather than seven. The deeper articles still own the protocol-by-protocol details. This one owns the choice — and the way to defend it when the customer's network engineer pushes back on the recommendation.
What "Picking an Ingest Protocol" Actually Decides
The ingest protocol is one of seven decisions you make about the contribution leg of a live pipeline. The other six — encoder hardware, encoder software, primary uplink, backup uplink, audio codec, bitrate ladder — interact with the protocol choice but do not determine it. Picking the protocol means picking, in this order, four things: the transport class (TCP or UDP or QUIC), the reliability scheme (none, automatic-repeat-request, forward-error-correction, or both), the signalling layer (none, HTTP, vendor-proprietary, or session-description-protocol over HTTP), and the encoder ecosystem the protocol forces you into.
That sequence matters. The transport class is what survives or fails the contribution path's packet loss. The reliability scheme is what recovers from loss within a latency budget. The signalling layer is what makes the protocol easy or hard to operate. The encoder ecosystem is what determines whether the operator at the venue can configure the protocol without a phone call to support.
You will hear all four conflated as "protocol choice", and you will read marketing decks that pick a winner on one of them — almost always reliability — and ignore the others. The right way to pick is to evaluate all four together against the path the bytes have to travel.
The Five Questions That Drive the Decision
The seven Block 3 articles ask many questions; five of them dominate the answer.
Question 1 — Where is the encoder?
The encoder's location determines the protocol's encoder ecosystem more than any other factor. There are five locations in 2026:
A software encoder on a laptop or desktop (OBS Studio, vMix, Wirecast, Streamlabs, XSplit). Almost every such encoder supports RTMP by default; OBS Studio added native SRT output in version 25 (2020) and WHIP output in version 30 (2023); vMix and Wirecast added SRT around the same time and WHIP in their 2025 releases.
A hardware encoder appliance (Teradek, LiveU, AVIWest, Sienna, Blackmagic Web Presenter HD). Every hardware encoder shipped after 2020 supports RTMP and SRT; the higher-end professional units (Teradek Prism, LiveU LU2000, AVIWest DMNG) also support RIST and Zixi; some 2025-and-later units support WHIP.
A mobile capture device (Larix Broadcaster on iOS / Android, Switcher Studio, GoPro Hero with a stream-key destination, an iPhone running native broadcast extensions). Mobile encoders are dominated by RTMP and SRT; WHIP support on mobile encoders is patchy and depends on the app.
A browser (a Chrome tab running a custom capture page; a vendor's web studio like StreamYard, Restream, Riverside, or the Fora Soft web SDK). Browser-based encoders are dominated by WHIP and the older RTMP-over-WebSocket bridges; native browser RTMP is not possible because browsers do not expose raw TCP.
A broadcast facility (the camera that is downstream of a SMPTE ST 2110 fabric, a multi-viewer that emits NDI, a video router that hands a feed to an encoder in the master control room). Facility encoders run RIST or Zixi by default; SRT is increasingly common on the egress side as broadcasters bridge their internal world to cloud platforms.
The protocol you pick has to be one the encoder in the operator's hands speaks natively. Asking the operator to install a new encoder five minutes before the event is not a real option.
Question 2 — What is the path?
The contribution path determines the reliability requirements more than any other factor. Five categories cover almost every real path:
A studio LAN — a switched Gigabit Ethernet network inside a single building. Packet loss is statistically zero; jitter is below 1 millisecond; the only reliability requirement is that the protocol does not break a working network. NDI wins this category by design; RTMP and SRT also work without complaint.
A business or residential broadband uplink — a wired connection from a venue to a regional internet provider. Packet loss is typically 0–0.5% under normal load with occasional 1–3% spikes during congestion. Round-trip times to a regional ingest endpoint are 10–50 milliseconds. RTMP, SRT, and WHIP all work; RTMP under the worst spikes degrades visibly, SRT and WHIP recover within their latency budgets.
A cellular or mobile uplink — a 4G LTE or 5G connection from a phone, a Teradek bonded-cellular unit, or a LiveU rucksack. Packet loss runs 1–5% under normal load with 10%+ spikes when the operator walks behind a wall; round-trip times are 30–150 milliseconds and themselves jitter. RTMP collapses; SRT, RIST, and Zixi survive; WHIP survives on a good cell with caveats.
A satellite or remote contribution path — a Starlink terminal, a Ka-band satellite uplink, a fly-pack in a desert with a microwave link to a regional hub. Loss is unpredictable; round-trip times can reach 250 milliseconds on Ka-band, 30–60 milliseconds on Starlink. RTMP is unusable; SRT with a 4× round-trip-time latency window is the practical default; RIST and Zixi cover the tier-1-broadcaster end of the market.
A public Wi-Fi or hostile hotel network, the worst of all worlds: loss, jitter, captive-portal redirects, deep packet inspection, and bandwidth caps. SRT with a generous latency budget, behind a UDP-over-TCP tunnel where the hotel blocks UDP, is the only protocol that survives reliably; even then, expect to switch to a hotspot.
Question 3 — What is the latency budget?
The latency budget is the maximum acceptable glass-to-glass latency — the time from light hitting the camera sensor to the same image appearing on the viewer's screen. Three bands cover almost every product:
A passive-viewing band of 6 seconds or more. Anything that the audience watches without interacting with the talent — a sports broadcast, a corporate keynote, a concert stream. RTMP, SRT with a generous latency window, and a regional ingest endpoint all comfortably fit. The dominant constraint is the player buffer, not the ingest protocol.
A near-real-time band of 2–6 seconds. Anything where the audience can react in chat — most live events with a side chat panel, esports tournaments, news broadcasts where the producer reads chat questions back. LL-HLS or LL-DASH on the delivery side, paired with SRT or WHIP on the ingest side. RTMP can fit, but the ingest leg's three-second buffer eats most of the budget.
A real-time interactive band below 500 milliseconds. Anything where the talent and the viewer talk to each other — auctions, e-learning lessons with a Q&A, telemedicine consultations, surveillance with two-way audio, gambling with a live dealer. WHIP is the default; SRT with a small latency window can fit on a clean wired path. RTMP does not fit. This is the band WebRTC was designed for.
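The three bands can be read as a simple lookup. The sketch below is illustrative only: the function name `latency_band` is hypothetical, a budget of exactly 6 seconds is assigned to the passive-viewing band as an assumption, and the text does not name the range from 500 milliseconds up to 2 seconds, so the sketch reports that range as falling between bands rather than guessing.

```python
def latency_band(glass_to_glass_ms: int) -> str:
    """Map a glass-to-glass latency budget (milliseconds) to one of the three bands."""
    if glass_to_glass_ms >= 6000:
        # Assumption: exactly 6 s counts as passive viewing.
        return "passive-viewing"
    if glass_to_glass_ms >= 2000:
        return "near-real-time"
    if glass_to_glass_ms < 500:
        return "real-time interactive"
    # 500 ms up to 2 s is not covered by any of the three bands in the text.
    return "between bands"
```

A budget that lands between bands is a prompt to go back to the product owner, not something the lookup can resolve.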
Question 4 — Who owns the ingest endpoint?
The owner of the ingest endpoint determines the operational story more than any other factor. Two categories:
A third-party platform — YouTube Live, Twitch, TikTok Live, Facebook Live, Kick, LinkedIn Live, Mux, Cloudflare Stream, AWS IVS, Wowza Cloud. Almost every third-party platform's primary ingest is RTMP / RTMPS. Some accept SRT (YouTube added SRT to its Live API in 2024; Twitch is still RTMP-only as of mid-2026; Cloudflare Stream accepts both RTMP and SRT). A handful accept WHIP (Cloudflare Stream Live, Mux WebRTC ingest, AWS IVS Real-Time). You match the platform; you do not negotiate.
Your own ingest endpoint — an SRT server, a WHIP server, a Zixi Broadcaster, an LL-HLS origin, a custom WebRTC SFU. You pick the protocol; the operator at the venue picks the encoder. Your job is to pick the protocol that fits the encoder ecosystem your operators already have. For an internal corporate platform that ships an encoder, you can mandate one protocol; for a SaaS platform that ingests from customers' encoders, you accept several and route them to the right downstream.
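For the third-party case, "you match the platform" amounts to a set intersection between what the platform accepts and what the encoder speaks. The sketch below is a minimal illustration: the table `PLATFORM_INGEST` and the function `usable_protocols` are hypothetical names, and the entries are transcribed from the paragraph above as of mid-2026, so they should be re-checked against each platform's current documentation before use.

```python
# Accepted ingest protocols per platform, as stated in the text (mid-2026).
# Assumption: RTMP and RTMPS are treated as one entry.
PLATFORM_INGEST = {
    "YouTube Live": {"RTMP", "SRT"},
    "Twitch": {"RTMP"},
    "Cloudflare Stream": {"RTMP", "SRT", "WHIP"},
}


def usable_protocols(platform: str, encoder_protocols: list) -> list:
    """Return the protocols both the platform and the encoder support, sorted by name."""
    if platform not in PLATFORM_INGEST:
        raise KeyError(f"no ingest data recorded for {platform!r}")
    return sorted(PLATFORM_INGEST[platform] & set(encoder_protocols))
```

An empty result means the encoder cannot reach that platform without a gateway or a different encoder.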
Question 5 — What is the operator able to debug?
The last question is the one decision frameworks usually skip and the one that determines the actual reliability of your product. RTMP fails loudly: a connection refused, a stream key rejected, a bitrate too high, an audio sample-rate mismatch all surface in the encoder's UI within seconds. SRT fails quietly: a stream that runs forever with 12% packet loss and a stuttering audience and no on-encoder alert. WHIP fails in a third way: an ICE failure that an operator with no NAT or TURN knowledge cannot debug at all.
If your operator is a hotel-AV technician with a vMix laptop, you pick a protocol whose failure modes the operator can read. If your operator is a broadcast engineer with a Teradek rucksack, you pick the protocol whose dashboard the operator already monitors. Choosing a "better" protocol the operator cannot debug is a worse decision than choosing a "worse" protocol they can.
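One way to make a quiet failure loud is to watch the receiver's loss counters and raise an alert yourself. The sketch below assumes the receiver exposes per-interval counts of lost and received packets in some form; the function names, the 3% threshold, and the three-interval rule are all illustrative assumptions, not values from any SRT implementation.

```python
def loss_percent(lost: int, received: int) -> float:
    """Packet loss over one reporting interval, as a percentage of packets expected."""
    total = lost + received
    return 100.0 * lost / total if total else 0.0


def should_alert(samples, threshold_pct: float = 3.0, consecutive: int = 3) -> bool:
    """samples: iterable of (lost, received) tuples, oldest first.

    Returns True once loss has exceeded the threshold for `consecutive`
    intervals in a row, so a single spike does not page the operator.
    """
    streak = 0
    for lost, received in samples:
        if loss_percent(lost, received) > threshold_pct:
            streak += 1
            if streak >= consecutive:
                return True
        else:
            streak = 0
    return False
```

With these defaults, the 12% loss case described above trips the alert after three reporting intervals instead of running unnoticed.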
The Decision Tree
The five questions above resolve into a single tree. Read it top to bottom; the first answer that fits is the right one. If two answers fit, the higher one in the tree is the right one — the tree is biased toward the protocol with the broader encoder ecosystem at each branch.
The walkthrough in prose, for the readers who learn from words rather than diagrams:
Start with the encoder. If the operator's encoder is a browser tab, go to WHIP. If the encoder is inside a broadcast facility on an SMPTE ST 2110 fabric or talking NDI to other devices, stay inside that family for the studio hop and pick a contribution protocol for the WAN hop separately. If the encoder is anywhere else, drop to the next branch.
Then ask about latency. If the product needs sub-500-millisecond glass-to-glass and the operator's path is reasonable (wired or 5G, not satellite), WHIP is again the answer. If the budget is 2–6 seconds and the path is lossy, SRT is the answer. If the budget is 6 seconds or more and the path is clean, RTMP is still acceptable and is often what your third-party destination accepts.
Then ask about the path. Lossy cellular or satellite paths rule out RTMP regardless of latency. SRT with a 4× round-trip-time latency window absorbs most cellular spikes. RIST is the SMPTE-standardised peer of SRT for organisations that want an open, IETF/SMPTE-formalised spec; Zixi is the proprietary peer for vendors with existing Zixi deployments.
Then ask about the destination. If you push to YouTube Live, Twitch, TikTok Live, Facebook Live, Kick, or LinkedIn Live, RTMP is what they accept. If you push to your own ingest endpoint, you have a choice; bias toward SRT on lossy paths and WHIP on low-latency products.
Finally, ask about debuggability. If the operator is a non-technical talent at a venue, prefer RTMP for its loud, visible failure modes; the slight degradation under loss is a price worth paying for the operator being able to fix the problem at all. If the operator is a broadcast engineer, the operator's existing dashboard probably already speaks SRT, RIST, or Zixi; pick whichever one they already monitor.
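The walkthrough can also be sketched as a function that checks the branches in the same order. This is a simplified illustration under stated assumptions, not a complete encoding of the tree: the `Scenario` fields and the function name are hypothetical, cellular, satellite, and public Wi-Fi are assumed to be the lossy paths, "cellular" in the low-latency branch is assumed to mean a good 5G cell, and the facility case covers only the WAN hop.

```python
from dataclasses import dataclass
from typing import Optional

LOSSY_PATHS = {"cellular", "satellite", "public_wifi"}  # assumption: LAN and broadband count as clean
LOSS_TOLERANT = {"SRT", "RIST", "Zixi"}


@dataclass
class Scenario:
    encoder: str                     # "browser", "software", "hardware", "mobile", "facility"
    latency_budget_ms: int           # glass-to-glass budget
    path: str                        # "lan", "broadband", "cellular", "satellite", "public_wifi"
    third_party_rtmp_only: bool = False
    operator_protocol: Optional[str] = None  # protocol the operator already knows or monitors


def pick_ingest_protocol(s: Scenario) -> str:
    lossy = s.path in LOSSY_PATHS
    # 1. Encoder: a browser tab can only speak WHIP.
    if s.encoder == "browser":
        return "WHIP"
    # 2. Latency.
    if s.latency_budget_ms < 500 and s.path in {"lan", "broadband", "cellular"}:
        return "WHIP"
    if 2000 <= s.latency_budget_ms < 6000 and lossy:
        return "SRT"
    if s.latency_budget_ms >= 6000 and not lossy:
        return "RTMP"
    # 3. Path: lossy paths rule out RTMP; prefer what the operator already monitors.
    if lossy:
        return s.operator_protocol if s.operator_protocol in LOSS_TOLERANT else "SRT"
    # 4. Destination.
    if s.third_party_rtmp_only:
        return "RTMP"
    # 5. Debuggability: fall back to the protocol the operator can read, else RTMP.
    return s.operator_protocol or "RTMP"
```

For example, `pick_ingest_protocol(Scenario("hardware", 3000, "cellular"))` returns `"SRT"`. The sketch does not resolve the conflict between a lossy path and an RTMP-only destination; that case needs a gateway or a human decision.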
The Comparison Matrix — Eight Protocols, Twelve Criteria
The decision tree is the answer; the matrix below is the evidence. Every branch of the tree corresponds to a row in the matrix. When you take the tree into a scoping call, you take the matrix with you to defend the recommendation.
| Criterion | RTMP / RTMPS | SRT | RIST (Main + Adv) | WHIP | WebTransport + WHIP | Zixi | NDI 6 / 6.3 | SMPTE ST 2110 |
|---|---|---|---|---|---|---|---|---|
| Transport class | TCP | UDP | UDP | UDP (DTLS) | QUIC | UDP | TCP / UDP / QUIC | RTP / UDP multicast |
| Reliability scheme | TCP retransmit | Selective ARQ in latency window | NACK ARQ + optional FEC + SMPTE 2022-7 bonding | DTLS-SRTP, NACK, FEC | QUIC + WHIP layer | Adaptive FEC + ARQ + bonding + hitless failover | TCP retransmit; FEC option | RTP only — engineered path |
| Standards body | Adobe (de-facto); no IETF | Haivision; IETF draft expired then resumed | VSF / SMPTE TR-06-1/2/3 | IETF RFC 9725 (Mar 2025) | W3C / IETF drafts | Vendor proprietary | Vizrt / NDI Group | SMPTE + IETF (RFC 4175) + AMWA NMOS |
| Spec status | Last revision Dec 2012 | draft-sharabayko-srt-01 (SRT v1.5) | TR-06-3:2020 latest | RFC 9725 (Standards Track) | Working Draft 2025-10-08 | Proprietary; no public RFC | NDI 6 Apr 2024; 6.3 NAB 2026 | Multiple ST 2110-xx; ongoing |
| Latency floor (glass-to-glass) | 2–5 s with player buffer | 0.5–2 s typical | 0.5–2 s typical | 200–500 ms | 200–500 ms | 0.5–2 s typical | Sub-frame to 100 ms | Sub-frame |
| Packet-loss tolerance | Effectively zero (TCP stalls) | Up to ~25% within budget | Up to ~25% within budget | ~10% with NACK + FEC | ~10% with NACK + FEC | Up to ~30% with bonding | Designed for ~0% LAN loss | ~0% (engineered) |
| Encryption | RTMPS adds TLS | AES-128/192/256 built-in | DTLS or PSK (Main / Adv) | DTLS-SRTP mandatory | QUIC TLS + DTLS-SRTP | AES-128/192/256 | Optional | Out-of-band, IPsec / MACsec |
| Encoder ecosystem | Universal | Wide and growing | Broadcast-grade | Browser + 2025+ encoders | Early adopters only | Vendor-managed | OBS / vMix / ProAV | Camera / mixer / replay |
| Operator skill needed | Low | Medium | High | Medium-high (NAT debugging) | High | Medium (ZEN Master) | Low (zero-config) | Very high |
| Third-party platform acceptance | Almost universal | YouTube, Cloudflare, Mux, AWS, Wowza | Limited (broadcast-focused) | Cloudflare Stream Live, Mux, AWS IVS | None as of mid-2026 | Zixi-enabled platforms | LAN-only | Facility-only |
| 2026 momentum | Stable, legacy default | High — replacing RTMP for contribution | Steady inside broadcast | Very high — RFC published Mar 2025 | Early but real | Steady inside tier-1 broadcasters | High in ProAV and esports | High in tier-1 facilities |
| Right answer for… | OBS pushing to YouTube; office uplink | Field contribution over public internet | New broadcast contribution paths | Browser ingest; low-latency products | Forward-looking browser ingest | Legacy + multi-protocol gateway | Studio LAN production | Premium broadcast facility |
The encoder ecosystem row deserves the most weight when you choose between two protocols that otherwise fit. RTMP's universal encoder support is the single most underrated reason it remains the default; SRT's growing-but-incomplete encoder support is the single most underrated reason teams retreat to RTMP after a frustrating field trial. Test the encoder you actually plan to use, not the encoder the protocol's marketing deck implies you can use.
The operator skill row is the one that decides whether a "better" protocol actually delivers better outcomes. An SRT pipeline that an operator cannot configure becomes an RTMP pipeline at 5 a.m. on event day.
The third-party platform acceptance row is the constraint you have to live with. YouTube Live, Twitch, TikTok Live, Kick, Instagram Live, LinkedIn Live — every consumer-facing destination accepts RTMP, most of them only RTMP. Your decision tree is operating inside that constraint when the customer pushes to a third party.
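Two rows of the matrix, latency floor and packet-loss tolerance, are numeric enough to use as a first filter. The sketch below is a rough illustration: the `MATRIX` table and `candidates` function are hypothetical names, each latency floor is the lower bound of the range in the matrix, and "sub-frame" is encoded as 0 ms as an assumption. It deliberately ignores the encoder, operator, and platform rows, which still have to be checked by hand.

```python
# protocol -> (latency floor in ms, packet-loss tolerance in %), read off the matrix.
MATRIX = {
    "RTMP / RTMPS": (2000, 0),
    "SRT": (500, 25),
    "RIST": (500, 25),
    "WHIP": (200, 10),
    "WebTransport + WHIP": (200, 10),
    "Zixi": (500, 30),
    "NDI": (0, 0),            # assumption: sub-frame encoded as 0 ms
    "SMPTE ST 2110": (0, 0),  # assumption: sub-frame encoded as 0 ms
}


def candidates(latency_budget_ms: int, expected_loss_pct: float) -> list:
    """Protocols whose latency floor fits the budget and whose loss tolerance covers the path."""
    return [
        name
        for name, (floor_ms, loss_tolerance_pct) in MATRIX.items()
        if floor_ms <= latency_budget_ms and loss_tolerance_pct >= expected_loss_pct
    ]
```

A 3-second budget on a path with 8% loss leaves SRT, RIST, WHIP, WebTransport + WHIP, and Zixi; the remaining rows then narrow that list to one.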
Five Worked Examples
The decision tree generalises; the worked examples below test it against the kinds of events Fora Soft helps customers plan every month.
Example 1 — A 50,000-viewer corporate town hall from a hotel ballroom
The encoder is OBS Studio on the CEO's laptop, plugged into a Crestron HDMI capture device that feeds the conference room's main camera into the laptop's USB. The uplink is the hotel's wired business-internet line, 100 Mbps down and 10 Mbps up, with packet loss usually under 0.5% but with the occasional 2–3% spike when other hotel guests stream Netflix. The audience watches on a corporate intranet page that buffers for 6 seconds. Latency budget: 6 seconds is fine; nobody is asking the CEO a question live.
Walk the tree: encoder is a laptop, so we skip the browser branch and skip the facility branch. Latency budget is 6 seconds, so we do not need WHIP. The path is wired with occasional spikes — SRT would handle them more gracefully than RTMP, but the destination matters next.
The destination is the corporate intranet — but the corporate intranet does not run its own ingest server, it pushes the stream to a CDN that accepts both RTMP and SRT. Either fits.
The deciding question is the operator. The IT lead at the hotel knows OBS Studio and has streamed three previous town halls with RTMP to YouTube. SRT would survive the 2–3% spikes more gracefully, but the IT lead has never configured SRT, the hotel's firewall might block UDP outbound, and the worst case under RTMP is a 1–3-second buffering event at the audience's player, which is survivable.
The answer is RTMP to the corporate CDN, on the hotel's wired uplink, with the player buffer set to 6 seconds. SRT would be a marginal improvement, not a categorical one; the operational simplicity of the protocol the IT lead already knows is worth more than the marginal reliability gain.
Example 2 — A live football match for a Tier 1 sports rights-holder
The encoder is a Teradek Prism XR in the broadcast truck outside the stadium, fed by an SDI feed from the production switcher inside the truck. The uplink is a Starlink Roam terminal mounted on the truck's roof, with a backup AT&T 5G modem on a bonded link. Packet loss varies from 1 to 8% depending on cloud cover and stadium congestion; round-trip time is 35–80 milliseconds. The audience watches on a live-streaming product with a 4-second player buffer. Latency budget: 8 seconds glass-to-glass is the contractual ceiling.
Walk the tree: encoder is a hardware appliance, not a browser, not a facility. Latency budget is 8 seconds, which is generous, so no WHIP is needed. The path is cellular plus Starlink with packet-loss spikes of up to 8%, so RTMP is out. The encoder is RIST-capable and SRT-capable; the destination is the broadcaster's own ingest endpoint in AWS US-East.
Both SRT and RIST fit. The deciding question is the operator and the existing tooling. The Tier 1 broadcaster's operations team monitors RIST through their existing JT-NM Tested orchestration platform; their engineers know the SMPTE TR-06 profile by heart. The cost of switching to SRT for this event would be operational tooling, not protocol mechanics.
The answer is RIST Main Profile to the broadcaster's AWS ingest endpoint, with SMPTE 2022-7 path bonding across Starlink and 5G, and a 2-second receiver buffer. SRT would also work and is what most non-Tier-1 sports rights-holders use; for this customer, RIST is the right answer because of the operations stack they already run.
Example 3 — An online lecture with live student Q&A
The encoder is a Chrome browser tab on the lecturer's MacBook, running a custom web SDK Fora Soft built for the e-learning platform. The students are scattered globally; the platform serves their video over the same WebRTC SFU the lecturer's browser connects to. Latency budget: 300 milliseconds from lecturer's camera to student's screen so the Q&A feels live.
Walk the tree: encoder is a browser. Stop. The answer is WHIP, with no further decisions to make. The SDK uses WHIP to push the lecturer's camera into the platform's WebRTC SFU; the SFU distributes to the students via WebRTC delivery. RTMP cannot live inside a browser tab; SRT-over-WebSocket is technically possible but adds 200 milliseconds of bridge latency; no other protocol matches the encoder location.
The deciding sub-question inside WHIP is whether to use WebTransport instead of the default WHIP-over-HTTP signalling — in mid-2026 the answer is no for production, yes for experimental builds; WebTransport browser support is still incomplete and the W3C draft is not yet a Recommendation.
Example 4 — A live esports tournament with cellular-bonded camera operators
The encoders are five LiveU LU2000 rucksacks worn by camera operators around a venue, each bonded across four cellular SIMs and a venue Wi-Fi link. The director's switcher is in a production trailer 200 metres away; the trailer's uplink is a bonded fibre + 5G connection to the cloud. Latency budget: 3 seconds from camera to viewer.
Walk the tree: encoder is hardware on cellular. A latency budget of 3 seconds does not require WHIP and leaves room for SRT. The path is cellular with double-digit loss spikes, so RTMP is out. The encoder is SRT-capable, RIST-capable, and Zixi-capable; the destination is the venue's own SRT receiver in the production trailer, then a cloud SRT-to-LL-HLS bridge.
The answer is SRT from each LU2000 to the trailer's SRT receiver, with a 4× round-trip-time latency window, then SRT from the trailer to a cloud bridge that emits LL-HLS to the audience. Zixi or RIST would also work; SRT wins on encoder ecosystem (LU2000 has SRT in its base firmware) and operator familiarity (LiveU's dashboard speaks SRT natively).
Example 5 — A continuous live-camera feed from a hospital surgical theatre to a remote training audience
The encoder is a hardware appliance attached to the surgical microscope's HDMI output, hidden inside a non-touch sterile enclosure. The uplink is the hospital's private fibre to a regional data centre, no public internet. Latency budget: 2 seconds from microscope to trainee, so the trainee can ask "what was that?" and have the surgeon hear it at the same moment. Encryption: mandatory, HIPAA-grade.
Walk the tree: encoder is a hardware appliance on a private network. Latency budget is 2 seconds — borderline WHIP / SRT. The path is a private fibre with effectively zero loss; the deciding question is encryption.
Both SRT and WHIP can meet the encryption requirement: SRT has built-in AES-128/192/256 encryption, which has to be enabled on both the encoder and the receiver, and WHIP mandates DTLS-SRTP. The hospital's IT department is more comfortable with a TCP-or-UDP transport they can inspect with conventional tooling than with WebRTC's NAT-traversal machinery. SRT wins on operational fit.
The answer is SRT over the private fibre with AES-256 encryption and a 500-millisecond latency window, then a bridge to a WebRTC delivery endpoint for the trainees, who watch on a browser with sub-second additional latency. WHIP would have shaved 200 milliseconds off the ingest leg; the hospital's IT department's comfort with the protocol is worth those 200 milliseconds.
Where Fora Soft Fits In
We have shipped video products since 2005 — 239+ projects across video streaming, WebRTC conferencing, telemedicine, e-learning, OTT, video surveillance, and AR/VR — and we have stood in every one of the rooms above with someone choosing an ingest protocol. The pattern we see most often is that teams pick a protocol from a marketing comparison, push it to production, and quietly fall back to RTMP three months later when the cellular operators cannot debug their SRT pipeline. The fix is rarely a different protocol; it is a different way of choosing — encoder-first, path-second, latency-third, destination-fourth, operator-fifth. The tree above is the way we choose; the matrix is how we defend the choice; the examples are the kinds of conversations we have weekly. If you want a second opinion on your own ingest design, our engineers are happy to walk it through with you.
Common Pitfalls
Pitfall 1: picking the protocol with the best matrix and forgetting the encoder. WHIP has the best matrix for a low-latency interactive product. If the operator's encoder is OBS Studio version 29, WHIP is not an option — OBS added WHIP in version 30. The matrix and the encoder are two different constraints; both must fit.
Pitfall 2: trusting the latency floor on a protocol slide. Every protocol's "sub-second" or "sub-500-ms" floor is measured on a clean LAN with no packet loss and a tuned receiver. On a real cellular path with 3% loss and a 90-ms round-trip time, every protocol pays a real latency tax. Read the latency floor as the best case; design the player buffer for two times the worst case.
Pitfall 3: building an SRT pipeline with the default 120-ms latency window. SRT recovers lost packets inside a latency window; the default 120 ms is too tight for any path with a 50-ms-or-larger round-trip time. The Haivision recommendation is 4× the round-trip time, but most encoders ship with the default. Verify the window against the path before showtime.
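The arithmetic behind that check is small enough to script. The sketch below is a minimal illustration, assuming a measured round-trip time in milliseconds; the function name is hypothetical, and keeping the 120-ms default as a lower bound is an assumption of this sketch, not part of the recommendation quoted above.

```python
import math

SRT_DEFAULT_LATENCY_MS = 120  # default window cited in the text


def srt_latency_window_ms(rtt_ms: float, multiplier: float = 4.0) -> int:
    """Suggested SRT latency window: multiplier x RTT, never below the default."""
    return max(SRT_DEFAULT_LATENCY_MS, math.ceil(rtt_ms * multiplier))
```

For a 50-ms round trip this gives 200 ms, and for a 250-ms Ka-band path it gives 1000 ms, which is why the default window only suits paths with a round-trip time of about 30 ms or less.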
Pitfall 4: assuming the third-party platform accepts your protocol. YouTube Live added SRT in 2024; Twitch is still RTMP-only as of mid-2026; Cloudflare Stream accepts WHIP; TikTok and Kick are RTMP-only. Check the platform's current ingest documentation the week of the event — these things change.
Pitfall 5: putting a bonded-cellular encoder on a single SIM and calling it "5G". Single-SIM 5G is one cellular tower, one route, one set of failure modes. An uplink bonded across SIMs from two or three operators behaves categorically differently and is what the field-contribution market means by "bonded cellular". The protocol choice does not save a single-SIM 5G ingest; the bonded uplink does.
Pitfall 6: ignoring the audio codec when picking the protocol. RTMP carries AAC and effectively nothing else; SRT carries whatever the encoder produces; WHIP carries Opus (preferred) or AAC. A pipeline configured for WHIP with the encoder set to AAC will work; the encoder ecosystem expects Opus, and the cost of the difference shows up in the downstream transcode. Match the audio codec to the protocol's ecosystem.
Pitfall 7: building one ingest endpoint and calling it "production-ready". Single ingest endpoints are single points of failure. The cheapest reliability win in the entire pipeline is a second ingest endpoint in a different region with a DNS failover; the encoder configures both, the operator never knows, the event survives the regional outage that hits every cloud provider once a quarter.
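Where DNS failover is not available, the same idea can be approximated on the sending side by probing an ordered list of endpoints. The sketch below is a client-side stand-in, not the DNS failover described above: the function names and hostnames are hypothetical, and the default TCP probe is only meaningful for TCP-based ingest such as RTMP, so a UDP protocol would need its own health check passed in as `is_healthy`.

```python
import socket
from typing import Callable, Optional, Sequence, Tuple

Endpoint = Tuple[str, int]  # (host, port)


def tcp_reachable(endpoint: Endpoint, timeout_s: float = 2.0) -> bool:
    """Return True if a TCP connection to the endpoint opens within the timeout."""
    try:
        with socket.create_connection(endpoint, timeout=timeout_s):
            return True
    except OSError:
        return False


def choose_endpoint(
    endpoints: Sequence[Endpoint],
    is_healthy: Callable[[Endpoint], bool] = tcp_reachable,
) -> Optional[Endpoint]:
    """Return the first healthy endpoint in priority order, or None if all fail."""
    for endpoint in endpoints:
        if is_healthy(endpoint):
            return endpoint
    return None


# Hypothetical endpoints in two regions, primary first.
INGEST_ENDPOINTS = [("ingest-a.example.com", 1935), ("ingest-b.example.com", 1935)]
```

Calling `choose_endpoint(INGEST_ENDPOINTS)` before the event, and again on reconnect, keeps the encoder pointed at whichever region is up.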
A One-Page Companion
The decision tree, the matrix, and the five worked examples above pack the entire choice into one read. For the times you walk into a scoping call without a laptop, we have packaged the same content into a one-page printable companion — the ingest-protocol decision tree, the eight-protocol matrix's eight most-asked criteria, and a quick checklist of the five questions to ask the customer's operator first. Download the one-page tree.
What to Read Next
- What ingest is, and why it's the riskiest hop in the pipeline — the Block 3 pillar.
- WHIP and the RFC 9725 standard — the protocol you reach for whenever the encoder is a browser.
- Picking a delivery protocol in 2026 — the equivalent decision tree for the downstream half of the pipeline.
References
- IETF RFC 9725, "WebRTC-HTTP Ingestion Protocol (WHIP)", Standards Track, March 2025. The controlling document for WHIP.
https://www.rfc-editor.org/rfc/rfc9725.html
- draft-sharabayko-srt-01, "The SRT Protocol", IETF Internet-Draft, March 2025 (SRT v1.5). Subject to revision before RFC publication.
https://datatracker.ietf.org/doc/draft-sharabayko-srt/
- Adobe RTMP 1.0 Specification, "Real-Time Messaging Protocol (RTMP) Specification", December 2012, last published revision. The protocol has not been revised by Adobe since; it is a de-facto standard maintained by the encoder ecosystem.
https://rtmp.veriskope.com/pdf/rtmp_specification_1.0.pdf
- W3C WebTransport, Working Draft, 2025-10-08. Not yet a Recommendation.
https://www.w3.org/TR/webtransport/
- SMPTE TR-06-1:2018, "Reliable Internet Stream Transport (RIST) Protocol Specification — Simple Profile"; TR-06-2:2019, Main Profile; TR-06-3:2020, Enhanced Profile. The controlling documents for RIST. The article followed these in any disagreement with vendor blog summaries; most online "RIST vs SRT" comparisons over-state SRT's bonding maturity relative to TR-06-2's SMPTE 2022-7 layering.
- SMPTE ST 2110-10:2022, System timing & definitions; ST 2110-20:2022, uncompressed active video; ST 2110-22:2019, JPEG XS constant bit-rate compressed video. Read from the SMPTE pub.smpte.org portal. Catalogue-grade copies are paywalled at smpte.org; the controlling text is the pub.smpte.org PDF.
- Vizrt NDI 6 announcement, 3 April 2024; NDI 6.3 announcements, NAB Show April 2026.
https://www.ndi.video/
- Zixi product documentation, docs.zixi.com. Zixi was acquired by Clearhaven Partners on 13 June 2024 (terms undisclosed); vendor blog announcement.
- YouTube Live ingestion-protocol documentation, accessed 2026-05-21 (RTMP and SRT supported; HLS-ingest deprecated for new streams).
https://support.google.com/youtube/answer/2907883
- Bitmovin 2025 Video Developer Report, published October 2025. The most recent industry survey of ingest protocol adoption: RTMP cited as primary by 60% of respondents, SRT by 22%, WHIP and WebRTC variants by 8%.
- IETF RFC 8825 "Overview: Real-Time Protocols for Browser-Based Applications" (the WebRTC umbrella RFC), 2021. The article cites it as the controlling overview for the WebRTC stack that WHIP layers on top of; per §4.3.2, in any disagreement between vendor blogs and the umbrella RFC, the RFC wins.
- AMWA NMOS Specifications, IS-04 through IS-10, https://specs.amwa.tv/. Controlling documents for facility control planes on ST 2110.
- draft-ietf-moq-transport-17, "Media over QUIC Transport", IETF Internet-Draft, January 2026. Referenced for the forward-looking comparison with QUIC-based protocols; subject to revision. The article does not recommend MoQ for ingest in 2026; cited for completeness only.
- Haivision SRT Deployment Guide, accessed 2026-05-21. Vendor-published, used in the article's "Pitfall 3" example for the 4×-round-trip-time latency-window recommendation.
- Cloudflare Stream Live documentation, accessed 2026-05-21. RTMP, SRT, and WHIP all listed as supported ingest. The matrix's "third-party platform acceptance" row is grounded in this and in the Mux and AWS IVS Real-Time pages.


