Quantization is the one step where lossy video compression actually loses information. Every other step — prediction, DCT transform, entropy coding — is mathematically reversible. Quantization is where the encoder decides "this coefficient is close enough to zero to call it zero", or "this brightness is close enough to 128 to call it 128", and the original precision is gone forever. All visible compression artefacts (banding, blockiness, blurry textures, posterised colours) ultimately originate here.
Mechanically, quantization divides each dct coefficient by a step size and rounds to an integer. A small step size keeps most coefficients intact (lossless or near-lossless); a big step size collapses many coefficients to zero, which then compress to almost nothing in the entropy-coding stage. The step size is controlled by the qp quantization parameter. The bigger the step, the smaller the file — and the more visible the loss. This is the fundamental rate-vs-quality dial of every lossy codec.
What makes modern codecs feel "good" despite this brutal-sounding step is that they quantize selectively — aggressively in places the eye can't see and gently where it can. Flat areas where banding would show up get fine quantization. Busy textures get coarse quantization because the visual noise of the texture masks the loss. Faces get fine quantization because viewers stare at them. This perceptual tuning, combined with smart rate control (crf, vbr), is why a 4K HDR Netflix stream at 15 Mbps looks essentially identical to an uncompressed 6 Gbps master to almost any viewer.

