
Key takeaways
• Fora Soft won two 2024 APAC Insider Business Awards — Best Custom Real-Time Interaction Software Development Company and Streaming Software Innovation Excellence. The awards reflect a long bet on hard real-time and streaming engineering, not generic custom dev.
• Real-time and streaming software is the highest-stakes layer in modern apps. Latency, packet loss recovery, codec choices, and concurrency all hit the user in seconds — and they hit your AWS bill in real money.
• WebRTC vs SRT/RIST vs HLS/DASH is a decision per use case, not per company. Sub-500ms two-way calls demand WebRTC; one-to-many broadcast at scale rewards HLS or DASH; contribution feeds across regions live on SRT or RIST.
• APAC streaming has its own constraints. Cross-border bandwidth is volatile, mobile networks dominate, regional CDN POPs matter, and compliance is fragmented (China, India, Indonesia each behave differently).
• The deeper bet is on AI-augmented streaming. On-device transcription, real-time translation, content moderation, and adaptive bitrate decisions driven by ML are the 2026–2028 differentiators — and where the APAC Insider judges saw our work.
In late 2024 APAC Insider named Fora Soft a winner in two categories: Best Custom Real-Time Interaction Software Development Company and the Streaming Software Innovation Excellence Award. The APAC Business Awards are now in their ninth edition and recognise companies driving innovation across the Asia-Pacific region. We are pleased — and we think the more useful version of this article is not the press release, but the playbook for what serious real-time and streaming engineering actually looks like in 2026, in APAC and beyond.
Below: how to think about real-time vs streaming architectures, the protocols that matter, the cost shape of running this stuff at scale, the APAC-specific constraints we plan around, where AI is genuinely changing the game, and the decision questions we ask every founder before quoting a streaming or real-time build.
Why Fora Soft wrote this playbook
Fora Soft has been delivering custom video, audio, and real-time communication software since 2005. We have shipped WebRTC SFUs, HLS/DASH delivery pipelines, SRT/RIST contribution stacks, and on-device ML for streaming apps across iOS, Android, web, and desktop. Recent reference points include Perspire.tv (a live fitness streaming product with low-latency video), BrainCert (a virtual classroom platform with multi-region delivery), VOLO (real-time translation deployed at Black Hat for 22,000 attendees), and Sprii (a live video-shopping platform).
We use Agent Engineering internally, which compresses delivery time on most workstreams by 30–40% versus a baseline team — documented in our AI software development case study. The streaming and real-time content below is what we tell founders on real scoping calls, including the parts vendors usually leave out of pitches.
Building a real-time or streaming product?
A 30-minute scoping call — we map your latency, scale, region, and codec needs against the architectures below and tell you what to build, buy, or skip.
Real-time vs streaming — the distinction that drives every architecture choice
The single most expensive mistake we see in this category is conflating real-time with streaming. They are different jobs, with different protocols, different infrastructure, and very different cost curves.
Real-time interaction. Two- or many-way conversations where humans expect sub-500 millisecond latency. Video calls, voice chat, multiplayer co-presence, live tutoring, real-time translation. The protocol is WebRTC. The infrastructure is SFUs (and sometimes MCUs), TURN servers for NAT traversal, and tight regional placement.
Live streaming. One-to-many broadcast where viewers tolerate 2–10 seconds of latency in exchange for higher quality and massive scale. Sports, concerts, conference live streams, live shopping. The protocols are HLS or MPEG-DASH for delivery, with SRT or RIST for contribution feeds. The infrastructure is origin servers, packagers, and a CDN.
Low-latency streaming — the hybrid. When you want broadcast scale but with sub-2-second latency: live auctions, live betting, interactive game shows. Low-Latency HLS, Low-Latency DASH, or WebRTC-over-CDN architectures. More expensive than classic HLS, more scalable than pure WebRTC.
Streaming and real-time protocols compared
| Protocol | Latency | Best fit | Trade-off |
|---|---|---|---|
| WebRTC | ~100–500 ms | Two-way calls, real-time interaction | Hard to scale beyond a few hundred participants per SFU |
| SRT | ~250 ms–3 s | Contribution from camera to cloud over WAN | Open source but not browser-native |
| RIST | ~250 ms–3 s | Pro broadcast contribution, multi-vendor | Smaller community than SRT |
| Low-Latency HLS / DASH | ~1–5 s | Live shopping, auctions, betting at scale | Higher CDN cost than classic HLS |
| Classic HLS / DASH | ~6–30 s | Concerts, sports, conference broadcasts | No interactivity, latency too high for chat |
| RTMP (legacy ingest) | ~2–5 s ingest | Encoder → cloud, where SRT not available | Flash-era; mostly replaced by SRT/WebRTC ingest |
Our deeper analysis lives in WebRTC architecture guide for business, WebRTC vs HLS, and WebRTC vs Agora trade-offs.
Reach for WebRTC when: the user needs to interact in real time with another human, latency must stay under 500 ms, and concurrency per session is in the tens or low hundreds — otherwise you are paying for an interactivity layer you do not use.
WebRTC architecture — SFU, MCU, P2P, and when to pick which
P2P (mesh). Each participant sends to every other participant. Cheap and simple at 2–3 people, brutal at 6+. Use only for one-to-one or very small calls.
SFU (Selective Forwarding Unit). A server that receives each participant’s stream once and forwards copies to others. The 2026 default for 3–200+ participants. Mature open-source options include mediasoup, Janus, and SFUs built on the Pion WebRTC library, such as ion-sfu.
MCU (Multipoint Control Unit). A server that mixes streams server-side and sends one composite back to each participant. Higher CPU cost, lower bandwidth, useful for downstream playback or recording.
Hybrid SFU + MCU. The pattern we use most often for serious products: SFU for live participants, MCU for recording, transcoding, or broadcast egress.
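The trade-off between the three topologies is easiest to see by counting streams per participant. The sketch below is a simplification, assuming every participant publishes exactly one stream and ignoring simulcast and audio/video separation; the function and type names are illustrative, not taken from any SFU library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamLoad:
    """Media streams one participant sends and receives in an n-person call."""
    uplink: int
    downlink: int


def per_participant_load(topology: str, participants: int) -> StreamLoad:
    """Count streams per participant, assuming everyone publishes one stream."""
    if participants < 2:
        raise ValueError("a call needs at least two participants")
    others = participants - 1
    if topology == "mesh":
        # P2P: send a separate copy to every peer, receive one from each.
        return StreamLoad(uplink=others, downlink=others)
    if topology == "sfu":
        # SFU: send once to the server, which forwards the other streams.
        return StreamLoad(uplink=1, downlink=others)
    if topology == "mcu":
        # MCU: send once, receive a single server-side composite.
        return StreamLoad(uplink=1, downlink=1)
    raise ValueError(f"unknown topology: {topology}")


if __name__ == "__main__":
    for n in (2, 3, 6, 20):
        print(n, {t: per_participant_load(t, n) for t in ("mesh", "sfu", "mcu")})
```

In a six-person call, mesh asks every phone to upload five copies of its video, while an SFU or MCU keeps the uplink at one. That uplink count is why mesh becomes painful so quickly on mobile networks.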
APAC-specific constraints we always plan around
1. Cross-border bandwidth is volatile. Mainland China to Singapore to Sydney all behave differently and change over the year. Always test from real consumer ISPs in each target market, not just from cloud regions.
2. Mobile-first traffic. Most APAC viewers arrive on phones. Adaptive bitrate ladders need a strong sub-720p story; default-on data-saver mode is a competitive advantage.
3. Regional CDN POPs. Cloudflare, Akamai, Tencent Cloud, AWS CloudFront, and Alibaba Cloud each have different POP density per country. Pick by where your viewers actually live, not by familiar logo.
4. Compliance is fragmented. Mainland China requires ICP filing for hosted content; India has data-localisation rules under DPDP for some categories; Indonesia, Vietnam, and South Korea each have their own privacy regimes. Build a flexible region+jurisdiction architecture from day one.
5. App distribution beyond Google Play. In several APAC markets, Huawei AppGallery, Xiaomi GetApps, OPPO App Market, and others are first-class distribution channels. Plan multi-store distribution from the start; we discuss the survivors in our guide to distributing Android apps without Google Play.
Reach for a multi-CDN strategy in APAC when: your viewers cross more than two of these markets — mainland China, India, Indonesia, Vietnam, South Korea, Japan, Australia — otherwise a single strong CDN with regional POPs will usually do the job at MVP.
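Point 2 above (a strong sub-720p story with default-on data-saver) can be sketched as a simple rule-based rung picker. The ladder values, the 0.8 safety factor, and the 480p data-saver cap are all hypothetical choices for illustration; a real ladder should come from encoding tests on your own content.

```python
from typing import NamedTuple


class Rung(NamedTuple):
    name: str
    height: int        # vertical resolution in pixels
    bitrate_kbps: int  # video bitrate for this rendition


# Hypothetical ladder for illustration, ordered from lowest to highest bitrate.
LADDER = [
    Rung("240p", 240, 300),
    Rung("360p", 360, 600),
    Rung("480p", 480, 1000),
    Rung("720p", 720, 2500),
    Rung("1080p", 1080, 4500),
]


def pick_rung(throughput_kbps: float, data_saver: bool = True,
              safety: float = 0.8, saver_max_height: int = 480) -> Rung:
    """Pick the highest rung that fits within a fraction of measured throughput."""
    candidates = [r for r in LADDER
                  if not data_saver or r.height <= saver_max_height]
    budget = throughput_kbps * safety
    fitting = [r for r in candidates if r.bitrate_kbps <= budget]
    # Fall back to the lowest rung so playback starts even on a poor link.
    return max(fitting, key=lambda r: r.bitrate_kbps) if fitting else candidates[0]


if __name__ == "__main__":
    print(pick_rung(5000))                    # data-saver on: capped at 480p
    print(pick_rung(5000, data_saver=False))  # data-saver off: 720p fits
```

Data-saver defaults to on here, so a viewer on a fast but metered connection stays at 480p until they opt out.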
Cost shape — where the streaming bill actually comes from
A serious streaming product’s monthly cost falls into four buckets. The mistake is to optimise the wrong one.
1. Egress bandwidth. Usually the largest line. Concurrent viewers × bitrate × minutes. CDN pricing varies by region by 2–5x. Above ~50 TB/month, negotiated rates beat list prices significantly.
2. Compute (transcoding, packaging, SFU). Live transcoding to multiple bitrates is expensive; smart use of per-title encoding and pre-computed renditions cuts the bill. SFU compute scales with concurrent participants, not viewers.
3. Storage. Often overlooked: VOD storage of recordings adds up once you reach tens of terabytes. Lifecycle policies that move cold content to cheaper tiers save real money.
4. Third-party APIs. Agora, Twilio, Daily, Vonage, etc. Convenient at MVP, very expensive at scale. We benchmark these in WebRTC vs Agora trade-offs.
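The egress formula in bucket 1 is worth turning into a small calculator before any pricing conversation. The sketch below assumes decimal units (1 GB = 1,000 MB) and a flat per-GB rate; the rates in the usage example are placeholders, not quotes from any CDN.

```python
def egress_gb(concurrent_viewers: int, bitrate_mbps: float, minutes: float) -> float:
    """Egress volume in decimal gigabytes: viewers x bitrate x time."""
    megabits = concurrent_viewers * bitrate_mbps * minutes * 60
    return megabits / 8 / 1000  # megabits -> megabytes -> gigabytes


def egress_cost_usd(gb: float, price_per_gb_usd: float) -> float:
    """Cost at a flat per-GB rate (real CDN pricing is tiered and regional)."""
    return gb * price_per_gb_usd


if __name__ == "__main__":
    gb = egress_gb(concurrent_viewers=1000, bitrate_mbps=1.0, minutes=60)
    print(f"{gb:.0f} GB")
    for rate in (0.05, 0.10):  # hypothetical per-GB rates
        print(f"${egress_cost_usd(gb, rate):.2f} at ${rate:.2f}/GB")
```

For 1,000 concurrent viewers at 1 Mbps for one hour this gives 450 GB, matching the sanity check in the FAQ.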
AI in streaming and real-time — the genuine 2026 differentiators
The APAC Insider judges singled out our innovation work, and the bulk of that is AI inside streaming and real-time products. Five capabilities now have credible production-grade implementations.
1. Real-time transcription and translation. On-device or cloud, sub-second latency. Translates a Korean tutor for an Indonesian student in real time. Our case study lives in VOLO.
2. Content moderation. ML classifiers running over the live video and audio stream, flagging policy violations within seconds — essential for live shopping, gaming, and UGC platforms.
3. AI-driven adaptive bitrate. ML models predict the next ABR rung from network signals, reducing rebuffering versus rule-based ABR by ~10–25% on volatile mobile networks.
4. Background and noise removal. On-device, low-latency, vendor-agnostic. Tools like Krisp, NVIDIA Maxine, or in-house models. Now table stakes for serious calling apps.
5. Auto-summarisation and chapter generation. Post-stream AI generates a transcript, summary, and chapter markers within minutes — critical for VOD discoverability.
Need a live-streaming or real-time-AI product across APAC?
A 30-minute call — we map your latency, scale, region, and budget against the architectures above and tell you what to build, buy, or skip. Free.
Mini case — VOLO real-time translation at Black Hat for 22,000 attendees
VOLO is a real-time translation platform we built around speech recognition, machine translation, and live audio mixing — all latency-sensitive, all running across multiple language pairs in parallel.
VOLO was deployed at Black Hat for an audience of 22,000 attendees, translating live talks across multiple languages in real time. The architecture combined a WebRTC ingest layer for the speaker, an STT layer for transcription, an MT layer for translation, and a TTS or audio-mix layer back to the listener — with sub-second end-to-end latency on the critical path. The lesson generalises: real-time AI in streaming is now production-viable, but only with disciplined latency budgeting and careful model placement (on-device, edge, cloud) per layer.
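Latency budgeting of this kind can be as simple as a table of per-stage allowances checked against an end-to-end target. The sketch below is illustrative only: the stage names follow the layers described above, but the placements and millisecond values are hypothetical and are not measurements from VOLO.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    placement: str   # "on-device", "edge", or "cloud"
    budget_ms: int   # worst-case time this stage may take


def check_budget(stages: list[Stage], end_to_end_ms: int) -> tuple[int, int]:
    """Return (total budgeted ms, remaining headroom ms); raise if overspent."""
    total = sum(s.budget_ms for s in stages)
    headroom = end_to_end_ms - total
    if headroom < 0:
        raise ValueError(f"budget exceeded by {-headroom} ms")
    return total, headroom


# Hypothetical numbers for illustration only; not measurements from VOLO.
PIPELINE = [
    Stage("webrtc_ingest", "edge", 150),
    Stage("speech_to_text", "cloud", 300),
    Stage("machine_translation", "cloud", 150),
    Stage("tts_or_audio_mix", "edge", 250),
]

if __name__ == "__main__":
    total, headroom = check_budget(PIPELINE, end_to_end_ms=1000)
    print(f"budgeted {total} ms, headroom {headroom} ms")
```

Running a check like this in CI against measured stage timings is one way to catch a model swap that quietly eats the headroom.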
A decision framework — pick a streaming or real-time architecture in five questions
1. What is your latency budget? Sub-500 ms → WebRTC. 1–5 s → LL-HLS / LL-DASH. 6–30 s → classic HLS / DASH.
2. What is your concurrency profile? Tens of two-way participants → WebRTC SFU. Hundreds of viewers → CDN-based HLS. Millions of viewers → multi-CDN HLS with smart packaging.
3. Where do your viewers actually live? Pick CDN POPs by your real distribution, not by your familiar logo. Test from real consumer ISPs in each region.
4. Build vs buy on the WebRTC layer? Buy (Agora, Daily, Twilio, LiveKit Cloud) when MVP speed beats unit economics. Build when scale, customisation, or unit economics dominate. Our analysis is in WebRTC vs Agora trade-offs.
5. Where does AI add real differentiation? Transcription, translation, content moderation, ABR tuning, noise removal. These are the 2026 differentiators: the features users notice immediately when they are missing.
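Question 1 can be sketched as a lookup from latency budget to protocol family. The cut-offs below are an assumption layered on the bands in the list: a budget is treated as a ceiling, so anything tighter than 1 s falls to WebRTC and anything tighter than 6 s falls to the low-latency variants. The `two_way` flag is an illustrative addition reflecting that interactive sessions need WebRTC regardless of budget.

```python
def pick_protocol(latency_budget_ms: int, two_way: bool = False) -> str:
    """Map a glass-to-glass latency ceiling to a protocol family."""
    if latency_budget_ms <= 0:
        raise ValueError("latency budget must be positive")
    if two_way or latency_budget_ms < 1_000:
        # Only WebRTC reliably lands under one second.
        return "WebRTC"
    if latency_budget_ms < 6_000:
        # Classic segmented delivery cannot meet a ceiling under ~6 s.
        return "LL-HLS / LL-DASH"
    return "Classic HLS / DASH"


if __name__ == "__main__":
    for budget in (400, 3_000, 10_000):
        print(budget, "->", pick_protocol(budget))
```

The remaining four questions then refine or overrule this first cut; concurrency in particular can push a sub-second requirement toward WebRTC-over-CDN rather than a plain SFU.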
Five pitfalls in real-time and streaming product builds
1. Picking WebRTC for one-to-many broadcast. WebRTC scales linearly with concurrency; CDN-based HLS scales sub-linearly with viewers. Wrong protocol → 5–20x cost overrun.
2. Single-region SFU placement. Latency is the product. SFUs need to live near participants — multi-region from the start.
3. Skipping TURN. A meaningful share of users sit behind symmetric NAT or restrictive firewalls. No TURN relay = silent connection failures and cryptic support tickets.
4. Ignoring per-region compliance. Streaming products that handle PII or recorded content collide with DPDP (India), PDPA (Singapore), PIPL (China), and comparable privacy laws across the region.
5. Locking into one vendor SDK. Hard to migrate if pricing or capability shifts. Always design with a swappable backend interface, even if you start on a managed service.
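Pitfall 5 is mostly a matter of where the vendor SDK is allowed to appear in your code. A minimal sketch, assuming a two-method interface is enough for illustration: `RealtimeBackend`, `InMemoryBackend`, and `start_session` are hypothetical names, and a real adapter would wrap a managed service's SDK behind the same methods.

```python
from typing import Protocol


class RealtimeBackend(Protocol):
    """The only surface the application is allowed to depend on."""

    def create_room(self, room_id: str) -> None: ...

    def join_token(self, room_id: str, user_id: str) -> str: ...


class InMemoryBackend:
    """Stand-in backend for tests; a vendor adapter would wrap a real SDK."""

    def __init__(self) -> None:
        self.rooms: set[str] = set()

    def create_room(self, room_id: str) -> None:
        self.rooms.add(room_id)

    def join_token(self, room_id: str, user_id: str) -> str:
        if room_id not in self.rooms:
            raise KeyError(f"unknown room: {room_id}")
        return f"{room_id}:{user_id}"  # placeholder, not a real credential


def start_session(backend: RealtimeBackend, room_id: str, user_id: str) -> str:
    """Application code sees the interface, never a vendor class."""
    backend.create_room(room_id)
    return backend.join_token(room_id, user_id)


if __name__ == "__main__":
    print(start_session(InMemoryBackend(), "room-1", "alice"))
```

Moving from a managed service to a self-hosted SFU then means writing one new adapter class instead of touching every call site.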
KPIs to track on streaming and real-time products
Quality KPIs. Glass-to-glass latency p50 and p95, rebuffering rate (target <1% of viewer-minutes), startup time (target <2s on broadband), connection success rate (target >99.5%).
Business KPIs. Cost per concurrent viewer per minute, CDN cost as a share of revenue, average session duration, and monetisable engagement (purchases, votes, gifts) per stream.
Reliability KPIs. Stream success rate (target >99.9%), mean-time-to-detect on a regional outage (target <5 minutes), and mean-time-to-recover (target <30 minutes).
When NOT to build your own streaming or real-time stack
If you have under ~1,000 concurrent users and are still validating product-market fit, lean on a managed WebRTC vendor (Agora, Daily, LiveKit Cloud, Twilio Programmable Video). The unit economics will be worse, but the time-to-market will be 6–12 weeks faster.
If your latency budget is genuinely 10+ seconds and you are running a one-to-many broadcast, classic HLS with a single solid CDN is almost always the right answer — do not overpay for low-latency variants you do not need.
If you are pre-product-market fit, the right output is users in the room, not perfect protocol selection. Pick the simplest stack that works, ship, and re-architect when scale forces it.
Where streaming and real-time are heading next
1. AV1 and Versatile Video Coding (VVC) adoption picks up. 30–50% bandwidth savings over the older codecs they replace at the same perceived quality, which means meaningful CDN cost cuts at scale once decoding is mature on phones.
2. Edge inference becomes default. Transcription, moderation, and ABR decisions move from cloud to edge POPs and on-device, cutting latency and cost.
3. WebTransport and WebCodecs gain ground. Browser-native low-latency primitives reduce the WebRTC blast radius and open up custom transports.
4. Live shopping, live betting, live tutoring keep growing. Especially across APAC, where mobile-first commerce blends with creator culture faster than in most Western markets.
5. AI-generated avatars and synthetic media in calls. Mostly novelty in 2026, structurally relevant by 2028. Plan how your product handles authenticity verification.
Where the APAC Insider Awards sit in our wider recognition pattern
Recognition is most useful when the pattern is consistent across multiple specialisms and verified directories. In recent years Fora Soft has been listed in the Clutch 1000 for 2025, named a top iOS app development company on Techreviewer (2024 and 2026), top education software development company on GoodFirms (2025), top custom audio & video software development company in 2025, Clutch Global Leader for both Spring and Fall 2024 — and now both APAC Insider categories for real-time interaction and streaming software innovation in 2024.
For a buyer, the right read is the cluster: streaming, video, AI, mobile, and education engagement signals all in the same window, across multiple independent rating bodies. That pattern carries more signal than any single award — including this one.
Security and DRM — what serious streaming products owe their content owners
Streaming security is a layered conversation, not a single tick-box. Five layers matter at MVP and only get more important at scale.
1. Transport encryption. HTTPS and SRTP for everything — ingest, delivery, signalling. No plaintext anywhere on the wire.
2. Token-based access control. Signed URLs or HLS / DASH manifest tokens with short TTLs and IP/region binding. Never expose a stream URL that anyone can hotlink.
3. DRM where content rights demand it. Widevine (Android, Chrome), FairPlay (Apple), PlayReady (Microsoft, smart TVs). Multi-DRM is mandatory for serious VOD. Studio-grade content effectively requires it.
4. Watermarking. Forensic watermarks (per-session) for premium content, visible watermarks for live events. Invisible per-user watermarks deter screen-recording leaks.
5. Geoblocking and rights enforcement. Country-level geoblocking by IP, paired with token regional binding, is the baseline. CDN-level geoblocking is faster than app-level for blocked regions.
Reach for multi-DRM at MVP when: you license content from third parties (sports leagues, studios, music labels) — otherwise tokenised HLS without DRM is enough until your monetisation model catches up.
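Layer 2 (token-based access control) is the cheapest of the five to prototype. The sketch below signs a stream path with an expiry time and the viewer's IP using HMAC-SHA256. The query parameter names and the `path|expires|ip` message layout are assumptions for illustration; each CDN defines its own token format, so treat this as the shape of the idea rather than something a particular CDN will accept.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse


def _signature(secret: bytes, path: str, expires: int, client_ip: str) -> str:
    message = f"{path}|{expires}|{client_ip}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_url(path: str, secret: bytes, ttl_s: int, client_ip: str,
             now: Optional[float] = None) -> str:
    """Return the path with an expiry and a signature bound to the viewer IP."""
    issued = time.time() if now is None else now
    expires = int(issued) + ttl_s
    sig = _signature(secret, path, expires, client_ip)
    return f"{path}?{urlencode({'expires': expires, 'sig': sig})}"


def verify_url(url: str, secret: bytes, client_ip: str,
               now: Optional[float] = None) -> bool:
    """Accept only unexpired URLs whose signature matches this path and IP."""
    parsed = urlparse(url)
    query = parse_qs(parsed.query)
    try:
        expires = int(query["expires"][0])
        sig = query["sig"][0]
    except (KeyError, ValueError):
        return False
    current = time.time() if now is None else now
    if current > expires:
        return False
    expected = _signature(secret, parsed.path, expires, client_ip)
    # Constant-time comparison avoids leaking signature bytes via timing.
    return hmac.compare_digest(sig, expected)


if __name__ == "__main__":
    url = sign_url("/live/show.m3u8", b"demo-secret", ttl_s=60, client_ip="203.0.113.7")
    print(url, verify_url(url, b"demo-secret", "203.0.113.7"))
```

Binding to IP trades some robustness for security: viewers who switch from Wi-Fi to mobile data mid-stream will need a fresh token, so short TTLs with silent refresh tend to work better than long-lived links.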
Observability — instrumenting a streaming product so you actually know what is happening
A streaming product without observability will ship a bad release and not know it. Five signals to instrument from day one.
1. Glass-to-glass latency. Per session, per region, p50 and p95. Trend over weeks, not days — regressions creep in slowly.
2. Rebuffering rate. Per viewer-minute. Anything over 1% is a sign the bitrate ladder or CDN is mis-tuned for that region.
3. Connection success rate. Per region, per network type (Wi-Fi, 4G, 5G). Flag anything below 99.5%.
4. Per-session bitrate distribution. If too many sessions sit at the lowest rung, your encoding ladder is wrong; if too many sit at the top, you are wasting CDN egress.
5. Player error events. Categorised, counted, and alerted on. Mux Data, Datadog RUM, or open-source player-side analytics all work.
Reach for a dedicated streaming observability stack when: you cross ~1,000 concurrent viewers in production, or your product crosses two or more APAC markets with different network behaviour — otherwise lean instrumentation in your existing stack will do.
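The first three signals reduce to a few lines of arithmetic once the raw samples are collected. A minimal sketch, assuming nearest-rank percentiles and rebuffering measured as stalled time over watch time; the function names are illustrative, and the thresholds are the 1% and 99.5% figures from the list above.

```python
import math


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile; pct is in (0, 100]."""
    if not values:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def rebuffer_rate(stalled_s: float, watched_s: float) -> float:
    """Share of viewing time spent stalled."""
    if watched_s <= 0:
        raise ValueError("watched_s must be positive")
    return stalled_s / watched_s


def alerts(stalled_s: float, watched_s: float,
           connects_ok: int, connects_total: int) -> list[str]:
    """Flag the thresholds named in the list above for one region."""
    found = []
    if rebuffer_rate(stalled_s, watched_s) > 0.01:
        found.append("rebuffering above 1%")
    if connects_total > 0 and connects_ok / connects_total < 0.995:
        found.append("connection success below 99.5%")
    return found


if __name__ == "__main__":
    latencies_ms = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
    print("p50", percentile(latencies_ms, 50), "p95", percentile(latencies_ms, 95))
    print(alerts(stalled_s=30, watched_s=6000, connects_ok=990, connects_total=1000))
```

Computing these per region and per network type, rather than globally, is what surfaces the APAC-specific problems: a healthy global p95 can hide one market where the CDN is mis-tuned.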
Real-time and streaming work in the Fora Soft portfolio
A short tour of the streaming and real-time projects most relevant to the topics above.
Perspire.tv. A live fitness streaming product with low-latency video, real-time engagement features, and multi-region delivery to a creator-led audience.
Sprii. A live video-shopping platform combining streaming, interactive overlays, and commerce flows — the canonical APAC use case for low-latency broadcast plus real-time interaction.
BrainCert. A virtual classroom platform with multi-region WebRTC for real-time tutoring and HLS for recorded session playback.
VOLO. A real-time translation system deployed at Black Hat for 22,000 attendees — the kind of latency-sensitive AI workload that is hard to get right.
If you would like a written 2–3 paragraph case-study summary on any of these for your own internal review, send us a note — happy to share.
Want a deeper look at one of our streaming case studies?
A 30-minute call — we walk through the architecture, the failure modes we hit, and what we would do differently. Useful even if you never hire us.
FAQ
What did the APAC Insider Awards specifically recognise?
Two awards. The 2024 APAC Insider Business Award for Best Custom Real-Time Interaction Software Development Company, and the 2024 APAC Insider Streaming Software Innovation Excellence Award. Both reflect Fora Soft’s long focus on hard real-time and streaming engineering rather than generic custom development.
When should I pick WebRTC vs HLS for my live product?
WebRTC when latency must stay under 500 ms and the experience is two-way (calls, tutoring, multiplayer). HLS or DASH when you are broadcasting one-to-many and viewers tolerate a few seconds of delay (sports, concerts, conference live streams). LL-HLS / LL-DASH when you want broadcast scale with sub-2-second latency (live auctions, live betting, live shopping).
How big does an audience need to be before building our own SFU pays off vs using Agora or LiveKit Cloud?
Rule of thumb: under ~1,000 peak concurrent participants, managed services usually win on time-to-market. Above ~10,000 peak concurrent participants, owning the SFU stack and CDN deals usually wins on unit economics. Between those, it depends on customisation, region coverage, and your team’s WebRTC experience.
What APAC-specific issues do streaming products usually trip on?
Cross-border bandwidth volatility, mobile-first viewing patterns, fragmented compliance regimes (mainland China, India, Indonesia, Vietnam, South Korea), regional CDN POP density differences, and app distribution beyond Google Play. Plan for multi-region, multi-CDN, multi-store from the start.
Is real-time AI translation in calls actually production-ready?
Yes for many language pairs, with sub-second latency on the critical path, when the architecture is built around it from day one. We deployed it for VOLO at Black Hat for 22,000 attendees. The trade-offs are model placement (cloud vs edge vs on-device), language coverage, and acoustics — but the technology is no longer a research demo.
How much does a live-streaming product cost to run per month?
Highly dependent on concurrent viewers and bitrate, but the dominant line is egress bandwidth. A useful sanity check: at 1 Mbps average bitrate, 1,000 concurrent viewers for 1 hour consume roughly 450 GB of egress (1 Mbps × 3,600 s = 450 MB per viewer). CDN list price for 450 GB ranges from a few tens of dollars to a few hundred depending on region. Add transcoding, origin storage, and SFU compute on top.
Can Fora Soft build a streaming or real-time product end-to-end?
Yes. We cover discovery, architecture, custom SFU work, CDN strategy, codec choices, AI feature integration, mobile clients on iOS and Android, web clients, and post-launch maintenance. Recent reference points include VOLO, Perspire.tv, BrainCert, and Sprii. The starting point is a free 30-minute scoping call.
What is APAC Insider, and how do they pick winners?
APAC Insider is a quarterly business publication covering the Asia-Pacific region. Their annual APAC Business Awards, now in their ninth edition, recognise companies for innovation, performance, and contribution to their sector. The selection is editorial and research-based, not pay-to-play.
What to read next
WebRTC
WebRTC architecture guide for business
A 2026 deep dive on SFU, MCU, P2P, TURN, multi-region placement, and managed-service trade-offs.
Build vs buy
WebRTC vs Agora trade-offs
When a managed WebRTC platform pays off and when owning the stack wins on unit economics.
Protocols
WebRTC vs HLS — which to ship
A practical comparison for product teams choosing between real-time interaction and broadcast streaming.
Case study
How AI cut 30–40% off our delivery time
A first-person case study of Agent Engineering on a 1M+ line video streaming platform — numbers and methodology.
Estimation
Streaming app development time estimation
Realistic time estimates for the major workstreams in a serious streaming app build.
Ready to ship a real-time or streaming product across APAC and beyond?
The APAC Insider awards reflect the bet we have been making for years: that real-time interaction and streaming are not generic custom-software categories — they are specialisms with their own protocols, their own infrastructure economics, their own AI opportunities, and their own regional constraints. The companies winning in 2026 are the ones that treat them that way.
If you are scoping a video, audio, or real-time product — or rethinking one that is hitting a scaling wall — that is exactly the conversation we run on a 30-minute call. We bring our case studies, our cycle-time data, and our written assumptions about your project. You walk away with a prioritised plan whether you hire us or not.
Let’s talk about your real-time or streaming project
A free 30-minute call — we challenge your scope, validate your stack, and give you a written priority list whether you hire us or not.

