Inference is the step where a finished model is fed a new image, audio clip, or text and returns an answer. In a video product it runs on every frame or every utterance, so its speed and price per call decide whether a feature is affordable at scale.

