Live Captions (SFU-Side ASR Fan-Out) Engineering Cheat Sheet

One-page reference: what a live caption is (timed, speaker-labelled WebVTT-style cues); the SFU-side ASR fan-out pattern (tap audio at the SFU, VAD-gate recognition, fan caption text out over the data channel); the client-vs-server cost math ($13.86 naive vs $0.69 fan-out for a 1-hour 30-person call); partial-vs-final result handling; caption delivery over RFC 8831 data channels; the WCAG 1.2.4 Level AA / DOJ ADA April 2026 accessibility deadline; and a build-vs-buy checklist (mediasoup/Janus/Jitsi vs LiveKit; Deepgram/AssemblyAI/Whisper).
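The two cost figures in the summary can be sanity-checked with a little arithmetic. The sketch below is not taken from the cheat sheet itself. It assumes the naive figure means one ASR stream per participant for the whole call, and that both figures use the same flat per-minute rate. Under those assumptions it backs out the implied rate and the number of audio minutes the fan-out design would actually send to the recognizer. All function and variable names are illustrative.

```python
# Back-of-the-envelope check of the cost figures quoted in the summary.
# Assumptions (not stated in the source): the naive design bills one ASR
# stream per participant for the full call, and both designs pay the same
# flat per-minute ASR rate.

PARTICIPANTS = 30
CALL_MINUTES = 60
NAIVE_COST_USD = 13.86   # figure quoted in the summary
FANOUT_COST_USD = 0.69   # figure quoted in the summary


def implied_rate_per_minute(total_cost: float, participants: int, minutes: int) -> float:
    """Per-minute ASR rate implied by billing every participant for every minute."""
    return total_cost / (participants * minutes)


def implied_billable_minutes(total_cost: float, rate_per_minute: float) -> float:
    """Audio minutes that a given total cost buys at a given per-minute rate."""
    return total_cost / rate_per_minute


rate = implied_rate_per_minute(NAIVE_COST_USD, PARTICIPANTS, CALL_MINUTES)
fanout_minutes = implied_billable_minutes(FANOUT_COST_USD, rate)
savings_ratio = NAIVE_COST_USD / FANOUT_COST_USD

print(f"implied rate: ${rate:.4f} per audio minute")
print(f"naive billable minutes: {PARTICIPANTS * CALL_MINUTES}")
print(f"fan-out billable minutes: {fanout_minutes:.1f}")
print(f"naive / fan-out cost ratio: {savings_ratio:.1f}x")
```

Read this way, the fan-out design pays for roughly an hour and a half of recognized speech instead of 1,800 stream-minutes, which fits VAD-gated recognition on a single server-side tap where only active speech is transcribed. That interpretation is an inference from the two totals, not a breakdown given in the source.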


Specialist software house for video, real-time and AI products. Founded 2005. 50 in-house engineers.

© Fora Soft, 2005–2026