Latency measures how long a feature makes the user wait — from a spoken word to its caption, or a frame to its blurred background. Real-time calls feel broken above roughly 100-200 ms, so every model in the path must fit inside a strict time budget.