Entropy in information theory is the mathematical lower bound on how compactly data can be losslessly encoded: the minimum average number of bits per symbol that any lossless code can achieve for a given source. Introduced by Claude Shannon in 1948, it's the theoretical answer to "what's the absolute minimum number of bits needed to represent this data?" A data source with low entropy (highly predictable, lots of repetition) can be compressed dramatically; a source with maximum entropy (uniformly random, unpredictable) can't be losslessly compressed at all. Entropy is, in a sense, the inherent information content of the signal.
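As a rough illustration of the definition above, the following Python sketch estimates entropy from byte frequencies using Shannon's formula H = sum(p * log2(1/p)). The function name shannon_entropy_bits and the sample inputs are assumptions made for this example. The estimate treats every byte as independent of its neighbours (a zero-order model), so it can overstate the entropy of data that has structure across bytes.

```python
import math
from collections import Counter


def shannon_entropy_bits(data: bytes) -> float:
    """Zero-order Shannon entropy of `data`, in bits per byte.

    Computes H = sum(p * log2(1 / p)) over the observed byte frequencies p.
    """
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Hypothetical sample inputs, chosen so the results are easy to check by hand.
repetitive = b"A" * 1024            # a single symbol: fully predictable
two_symbols = b"AB" * 512           # two symbols, each half of the time
all_bytes = bytes(range(256)) * 4   # all 256 byte values, equally often

print(shannon_entropy_bits(repetitive))   # 0.0 bits per byte
print(shannon_entropy_bits(two_symbols))  # 1.0 bit per byte
print(shannon_entropy_bits(all_bytes))    # 8.0 bits per byte
```

Note that the "AB" pattern scores 1 bit per byte under this simple model even though it is perfectly predictable once you look at neighbouring bytes. Entropy is always measured relative to a model of the source, which is why real compressors work hard to find a better model first.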
The intuition: if a video frame is mostly one shade of blue with a few details, the encoder can describe "everything's blue except for these few spots" in very few bits, which means low entropy and high compressibility. If the same frame is full of random pixel noise, every pixel is independent information and no clever code can shorten the description much, which means high entropy and low compressibility. Real video sits somewhere in between, and every codec's job is largely to transform the signal (through prediction, transforms, and quantization) into a form with as low entropy as possible, so that the final entropy coder can pack what remains into very few bits.
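The flat-frame versus noise intuition can be sketched in a few lines of Python. This is a toy example under stated assumptions, not how a video codec works: it uses the general-purpose zlib compressor from the standard library as a stand-in for an entropy coder, a hypothetical 64x64 single-channel "frame", and a simple "subtract the previous pixel" step as a stand-in for a codec's prediction stage.

```python
import math
import os
import zlib
from collections import Counter


def entropy_bits(symbols) -> float:
    """Zero-order Shannon entropy of a sequence, in bits per symbol."""
    total = len(symbols)
    counts = Counter(symbols)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Part 1: a flat frame versus a frame of random noise (assumed 64x64, 1 byte per pixel).
width, height = 64, 64
flat_frame = bytes([200]) * (width * height)   # every pixel the same value
noise_frame = os.urandom(width * height)       # every pixel independent and random

flat_size = len(zlib.compress(flat_frame, 9))
noise_size = len(zlib.compress(noise_frame, 9))
print(f"flat frame:  {len(flat_frame)} -> {flat_size} bytes")
print(f"noise frame: {len(noise_frame)} -> {noise_size} bytes")

# Part 2: a transform can lower entropy before the final coding step.
# A smooth ramp uses every value exactly once, so its raw entropy is high...
ramp = list(range(256))
# ...but predicting each pixel from its left neighbour leaves almost
# nothing to encode: the residuals are one 0 followed by 255 ones.
residuals = [ramp[0]] + [ramp[i] - ramp[i - 1] for i in range(1, len(ramp))]

print(f"ramp entropy:     {entropy_bits(ramp):.3f} bits/symbol")
print(f"residual entropy: {entropy_bits(residuals):.3f} bits/symbol")
```

The flat frame shrinks to a few dozen bytes while the noise frame does not shrink at all, and the ramp drops from 8 bits per symbol to well under 0.1 once the prediction step has removed the part the decoder could have guessed anyway.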
For a product team, entropy isn't something you tune directly, but it explains a recurring pattern. Content with high natural entropy (snow, rain, sparkling water, confetti, action-heavy sports) consistently needs more bitrate to look acceptable, because the signal genuinely contains more information per second. Content with low natural entropy (talking heads, animations, slide decks, screencasts of static UIs) needs much less. This is why streaming services use per-title encoding (lower-bitrate ladders for low-entropy content, higher-bitrate ones for high-entropy content) and why a 90-minute documentary might fit in around 500 MB while a 90-minute action movie at the same perceived quality might need around 2 GB.
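To put the 500 MB and 2 GB figures in bitrate terms, here is a small back-of-the-envelope calculation in Python. It assumes decimal units (1 MB = 1,000,000 bytes, 1 GB = 1,000,000,000 bytes) and treats the whole file as video, ignoring audio tracks and container overhead; the helper name average_bitrate_mbps is made up for this example.

```python
def average_bitrate_mbps(size_bytes: float, duration_minutes: float) -> float:
    """Average bitrate in megabits per second for a file of the given size."""
    bits = size_bytes * 8
    seconds = duration_minutes * 60
    return bits / seconds / 1_000_000


MB = 1_000_000        # decimal megabyte (assumption)
GB = 1_000_000_000    # decimal gigabyte (assumption)

documentary = average_bitrate_mbps(500 * MB, 90)
action_movie = average_bitrate_mbps(2 * GB, 90)

print(f"documentary:  {documentary:.2f} Mbit/s")   # 0.74 Mbit/s
print(f"action movie: {action_movie:.2f} Mbit/s")  # 2.96 Mbit/s
print(f"ratio:        {action_movie / documentary:.1f}x")
```

Under these assumptions the action movie needs about four times the average bitrate of the documentary for the same running time, which is the gap a per-title bitrate ladder is meant to capture.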

