Inference

Definition

Running a trained AI model on new input to get a prediction. The recurring cost of an AI feature comes mainly from inference, not training: training is paid for once, while inference is paid for on every request.

Inference is the step where a finished model is fed a new image, audio clip, or text and returns an answer. In a video product it runs on every frame or every utterance, so its speed and price per call decide whether a feature is affordable at scale.
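To make the "affordable at scale" point concrete, here is a minimal back-of-the-envelope sketch. The function name, the price of $0.0001 per call, the 30 frames-per-second rate, and the 1,000 hours of video per month are all hypothetical figures chosen for illustration, not numbers from any real provider.

```python
def monthly_inference_cost(calls_per_second: float,
                           active_hours_per_month: float,
                           price_per_call: float) -> float:
    """Estimate monthly spend as total inference calls times price per call."""
    total_calls = calls_per_second * 3600 * active_hours_per_month
    return total_calls * price_per_call


# Hypothetical workload: 1,000 hours of video per month at $0.0001 per call.
PRICE_PER_CALL = 0.0001   # assumed price, for illustration only
VIDEO_HOURS = 1000        # assumed monthly volume

# Running the model on every frame of 30 fps video.
every_frame = monthly_inference_cost(30, VIDEO_HOURS, PRICE_PER_CALL)

# Running the model on one sampled frame per second instead.
sampled = monthly_inference_cost(1, VIDEO_HOURS, PRICE_PER_CALL)

print(f"every frame: ${every_frame:,.0f} per month")
print(f"one frame per second: ${sampled:,.0f} per month")
```

Under these assumed figures, running on every frame costs about $10,800 a month, while sampling one frame per second costs about $360. The same per-call price gives a thirtyfold difference in spend, which is why call frequency and price per call together decide whether the feature is viable.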

Also known as

prediction, serving