Entropy coding is the final, lossless squeeze that a video encoder applies right before it writes the bitstream. After all the clever tricks (prediction, motion compensation, the DCT-style transform, quantization) the encoder is left with a long stream of numbers: quantized coefficients, motion vectors, mode decisions. Entropy coding takes that stream and packs it into as few bits as its probability model allows, with no further loss of information. It's pure data compression: the same family of techniques that zip files rely on, applied to the encoder's output.
The principle goes back to Shannon's 1948 work on information theory: if a symbol appears often, give it a short code; if it appears rarely, give it a longer one. So a binary stream in which 0 appears 80 % of the time can be encoded in about 0.72 bits per symbol on average, instead of the one bit per symbol a naive encoding would spend, and the saving grows as the distribution gets more lopsided. Modern codecs go further with context-adaptive entropy coding (CABAC in H.264 and HEVC, CAVLC as the simpler alternative in H.264, a multi-symbol adaptive arithmetic coder derived from Daala in AV1): they update their probability estimates on the fly as they work through the frame, so the code lengths get even closer to the theoretical optimum.
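To make those numbers concrete, here is a minimal Python sketch, not taken from any codec. It computes the Shannon entropy of a binary source and the ideal code length an arithmetic coder would approach when it uses a simple adaptive probability estimate. The function names, the moving-average update, and the `rate` parameter are illustrative assumptions; real coders such as CABAC use fixed-point or table-driven updates and many separate contexts.

```python
import math
import random


def binary_entropy(p_zero: float) -> float:
    """Shannon entropy, in bits per symbol, of a binary source with P(0) = p_zero."""
    if p_zero in (0.0, 1.0):
        return 0.0  # a certain outcome carries no information
    p_one = 1.0 - p_zero
    return -(p_zero * math.log2(p_zero) + p_one * math.log2(p_one))


def adaptive_cost_bits(bits, rate: float = 0.05) -> float:
    """Ideal code length, in bits, of `bits` under an adaptive estimate of P(0).

    An ideal arithmetic coder spends -log2(p) bits on a symbol it predicted
    with probability p. The estimate starts at 0.5 and is nudged toward each
    observed symbol (a hypothetical update rule chosen for illustration).
    """
    p_zero = 0.5  # no prior knowledge about the source
    total = 0.0
    for b in bits:
        p_symbol = p_zero if b == 0 else 1.0 - p_zero
        total += -math.log2(p_symbol)
        # Move the estimate a small step toward what was just seen.
        target = 1.0 if b == 0 else 0.0
        p_zero += rate * (target - p_zero)
        # Keep the estimate away from 0 and 1 so no symbol costs infinite bits.
        p_zero = min(max(p_zero, 0.01), 0.99)
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    bits = [0 if rng.random() < 0.8 else 1 for _ in range(10_000)]
    print(f"entropy at P(0)=0.8: {binary_entropy(0.8):.3f} bits/symbol")
    print(f"adaptive estimate:   {adaptive_cost_bits(bits) / len(bits):.3f} bits/symbol")
```

With 80 % zeros the entropy comes out at about 0.72 bits per symbol. The adaptive estimate lands somewhat above that, because it has to learn the probability first and keeps reacting to noise afterwards. A smaller `rate` narrows the gap on stationary data but adapts more slowly when the statistics change, which is the trade-off every adaptive entropy coder has to strike.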
The practical impact: entropy coding accounts for a meaningful share of a modern codec's compression efficiency, often put at around 10–20 %. In H.264, for instance, switching from CAVLC to CABAC typically saves roughly 10–15 % of the bitrate at the same quality. It also has zero quality impact: being lossless, it lets the decoder recover the original numbers exactly. What it does cost is complexity, especially in the decoder: the more sophisticated the entropy coder, the more compute the decoder needs, which is why low-power devices (older phones, video doorbells) sometimes use simpler entropy coders, such as CAVLC in the H.264 Baseline profile. For product decisions, this is mostly invisible plumbing. You rarely set it directly, but it's part of why each new codec generation tends to need more silicon to decode.

