A latency budget assigns each stage — capture, model, network, render — a slice of the allowed delay, forcing trade-offs so the whole pipeline feels instant. Designing to a budget is how teams keep AI in a live call from making it lag.