Entropy coding

Entropy coding is the final, lossless squeeze that a video codec applies right before writing the bitstream. After all the clever tricks (prediction, motion compensation, the DCT, quantization), the encoder is left with a long stream of numbers. Entropy coding takes that stream and packs it into as few bits as it can, with no further loss of information. It's pure data compression: the same kind of math that powers zip files, applied to the encoder's output.

The principle goes back to Shannon's 1948 work on information theory: if a symbol appears often, give it a short code; if it appears rarely, give it a longer one. The average code length can approach the entropy of the source but cannot drop below it. For example, a binary stream where 0 appears 80 % of the time has an entropy of about 0.72 bits per symbol, so it can be coded in less than one bit per symbol on average. Modern codecs go further with context-adaptive entropy coding (CAVLC and CABAC in H.264, CABAC in HEVC, the Daala-derived multi-symbol arithmetic coder in AV1): they adapt their code tables or probability estimates on the fly as they move through the frame, so the code lengths get even closer to the theoretical optimum.
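
To make the 80 % example concrete, here is a minimal Python sketch. It computes the Shannon entropy of that binary source, then estimates the ideal cost of coding a random stream when the probabilities are learned on the fly. The names `entropy_bits` and `adaptive_cost_bits` are made up for this illustration, and the count-based estimator is an assumption chosen for simplicity. It is not how CABAC or the AV1 coder actually tracks probabilities.

```python
import math
import random


def entropy_bits(probabilities):
    """Shannon entropy in bits per symbol for a list of symbol probabilities."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def adaptive_cost_bits(symbols, alphabet_size):
    """Ideal total code length, in bits, when probabilities are learned while coding.

    Hypothetical model: plain symbol counts, each starting at 1. An ideal
    arithmetic coder would spend -log2(p) bits on a symbol whose current
    estimated probability is p.
    """
    counts = [1] * alphabet_size
    total = alphabet_size
    bits = 0.0
    for s in symbols:
        bits += -math.log2(counts[s] / total)  # cost under the current estimate
        counts[s] += 1                          # then update the estimate
        total += 1
    return bits


# Binary source where 0 appears 80 % of the time.
h = entropy_bits([0.8, 0.2])  # about 0.722 bits per symbol

# Simulated stream from that source; the seed is fixed only for repeatability.
rng = random.Random(0)
stream = [0 if rng.random() < 0.8 else 1 for _ in range(10_000)]
avg_adaptive = adaptive_cost_bits(stream, 2) / len(stream)

print(f"entropy:        {h:.3f} bits/symbol")
print(f"adaptive model: {avg_adaptive:.3f} bits/symbol")
```

The adaptive figure should land close to the entropy even though the model starts out knowing nothing about the source. A per-symbol prefix code such as Huffman has to spend at least one whole bit on every symbol, so getting below one bit per symbol takes arithmetic coding or coding several symbols at once.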

The practical impact: as a rough figure, entropy coding accounts for around 10–20 % of a modern codec's compression efficiency. It also has zero quality impact: being lossless, it lets the decoder recover the original numbers exactly. What it does cost is decoder complexity: the more sophisticated the entropy coder, the more compute the decoder needs, which is why low-power devices (older phones, video doorbells) sometimes use simpler entropy coders, such as CAVLC instead of CABAC in H.264. For product decisions, this is invisible plumbing. You rarely set it directly, but it's part of why every new codec generation needs more silicon to decode.

Related terms

Huffman coding

CAVLC

CABAC

Arithmetic coding

Entropy (information)