A residual is the difference between what the encoder predicted a block would look like and what it actually looks like. After the prediction step (intra or inter), the encoder subtracts the prediction from the real pixels and gets a residual: a block of small numbers, mostly close to zero, that represents the leftover information the prediction didn't capture. The residual is what then gets transformed (DCT or similar), quantized, and entropy-coded for storage. Apart from headers and other signalling, almost everything in the actual bitstream is either a prediction instruction (motion vector, intra mode) or a residual.
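The subtraction step can be sketched in a few lines of Python. This is a minimal illustration, not how any particular encoder implements it: the 4x4 block size, the pixel values, and the helper names `residual` and `reconstruct` are all made up for the example.

```python
def residual(actual, predicted):
    """Element-wise difference: real pixels minus predicted pixels."""
    return [[a - p for a, p in zip(row_a, row_p)]
            for row_a, row_p in zip(actual, predicted)]


def reconstruct(predicted, res):
    """Decoder side: add the residual back onto the same prediction."""
    return [[p + r for p, r in zip(row_p, row_r)]
            for row_p, row_r in zip(predicted, res)]


# Hypothetical 4x4 luma block and a reasonably good prediction of it.
actual = [
    [52, 55, 61, 66],
    [63, 59, 55, 90],
    [62, 59, 68, 113],
    [63, 58, 71, 122],
]
predicted = [
    [50, 55, 60, 65],
    [62, 60, 55, 88],
    [62, 58, 70, 110],
    [64, 58, 70, 120],
]

res = residual(actual, predicted)
for row in res:
    print(row)

# Without quantization, prediction + residual recovers the block exactly.
assert reconstruct(predicted, res) == actual
```

The residual values here all sit within a few units of zero, even though the pixels themselves range from 52 to 122. That narrow range is what the later stages exploit.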
Why residuals matter: they're the mathematical reason modern video compresses so well. A perfect prediction would leave a residual of all zeros, which entropy-codes to nearly nothing. A bad prediction leaves a residual essentially as informative as the original pixels, which compresses no better than the input. The whole encoder is engineered to make residuals as small and as close to zero as possible before they reach the transform and quantization stages.
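One rough way to see this is to measure the zero-order Shannon entropy of the residual, used here as a stand-in for how many bits an entropy coder would need per sample. This is a simplified sketch: real codecs entropy-code quantized transform coefficients with context modelling, and the sample values and the function name `entropy_bits_per_sample` are assumptions for illustration.

```python
import math
from collections import Counter


def entropy_bits_per_sample(values):
    """Zero-order Shannon entropy in bits per sample (a crude size proxy)."""
    counts = Counter(values)
    n = len(values)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


# Hypothetical row of 16 pixels forming a smooth ramp.
pixels = list(range(100, 116))

perfect_prediction = list(pixels)                            # matches exactly
good_prediction = [p - (i % 2) for i, p in enumerate(pixels)]  # off by 0 or 1
bad_prediction = [128] * len(pixels)                         # ignores the content


def residual(actual, predicted):
    return [a - p for a, p in zip(actual, predicted)]


h_pixels = entropy_bits_per_sample(pixels)
h_perfect = entropy_bits_per_sample(residual(pixels, perfect_prediction))
h_good = entropy_bits_per_sample(residual(pixels, good_prediction))
h_bad = entropy_bits_per_sample(residual(pixels, bad_prediction))

print(h_pixels, h_perfect, h_good, h_bad)
```

In this toy case the perfect prediction yields zero bits per sample, the good prediction yields one bit, and the bad prediction yields the same four bits as the raw pixels, because subtracting a constant only shifts the values without making them any more predictable.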
For a product team, "residual" is mostly a piece of jargon you'll hear in encoder documentation rather than a knob you tune. But it's useful for diagnosing where compression goes wrong. Visible blockiness on flat areas? The prediction was good but quantization rounded the residual too aggressively. Visible ghosting or smearing on motion? The prediction was bad and the residual was too big to encode at the available bitrate. Sharp text losing crispness? The intra prediction couldn't capture the edge precisely, leaving a high-frequency residual that quantization mauled. Mapping each artefact back to "prediction failed" or "residual was quantized too hard" makes encoder tuning a lot more methodical.
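The "residual was quantized too hard" case can be illustrated with a toy uniform quantizer. This is a sketch under simplifying assumptions: real encoders quantize transform coefficients rather than spatial residuals, and the residual values, step sizes, and the helper names `quantize` and `dequantize` are hypothetical.

```python
def quantize(values, step):
    """Uniform quantizer: index of the nearest multiple of step."""
    return [round(v / step) for v in values]


def dequantize(levels, step):
    """Decoder side: scale the indices back up by the step size."""
    return [level * step for level in levels]


# Hypothetical small residual left over after a good prediction.
res = [3, -2, 1, 0, -3, 2, 4, -1]

for step in (3, 10):
    levels = quantize(res, step)
    restored = dequantize(levels, step)
    error = [r - q for r, q in zip(res, restored)]
    print(f"step={step} levels={levels} max error={max(abs(e) for e in error)}")

fine = dequantize(quantize(res, 3), 3)
coarse = dequantize(quantize(res, 10), 10)
```

With the smaller step the residual survives with an error of at most one unit. With the larger step every value rounds to zero, so the decoder shows the bare prediction and any subtle detail the residual carried is lost, which is the mechanism behind blockiness on flat areas.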

