Scene-cut detection is the encoder's ability to recognise when one shot ends and another begins: a change of camera angle, a change of lighting, or an entirely different scene. When the encoder spots a cut, it forces an I-frame at that exact boundary. The alternative, predicting the new shot from the old, does not work because the two have almost nothing in common; the attempt produces a large residual that costs many bits and still looks poor.
The basic detection is simple in principle. The encoder measures how much each frame differs from the previous one (or runs a more elaborate content analysis) and declares a cut when the difference exceeds a threshold. Hard cuts, where the picture changes instantly, are easy. Soft cuts (fades, dissolves) are harder because the change is spread over many frames and no single frame-to-frame difference stands out; modern encoders handle them by detecting the transition window and placing keyframes at appropriate points. The result is better compression at the cut boundary. It can also help adaptive streaming: with I-frames at scene cuts, seek points and segment boundaries can fall on visually meaningful moments rather than mid-shot, provided the keyframes land on the same frames in every rendition.
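The threshold test described above can be sketched in a few lines of Python. This is a toy illustration, not how any particular encoder implements it: the function names, the flat-list frame representation, and the threshold value of 30 are all assumptions made for the example, and production encoders typically rely on more elaborate cost-based heuristics than a raw pixel difference.

```python
from typing import List, Sequence

# Assumption for this sketch: a frame is a flat sequence of 8-bit luma samples.
Frame = Sequence[int]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average absolute per-sample difference between two equally sized frames."""
    if len(a) != len(b) or not a:
        raise ValueError("frames must be non-empty and equal in size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def detect_cuts(frames: Sequence[Frame], threshold: float = 30.0) -> List[int]:
    """Return the indices of frames that start a new shot.

    Frame i is flagged when its difference from frame i - 1 exceeds
    `threshold` (a hypothetical value chosen for this toy data).
    """
    cuts = []
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i - 1], frames[i]) > threshold:
            cuts.append(i)
    return cuts


# Hard cut: a dark shot followed abruptly by a bright shot.
dark, bright = [20] * 16, [200] * 16
clip = [dark, dark, [22] * 16, bright, bright, [198] * 16]
print(detect_cuts(clip))  # [3]

# Fade: brightness rises by 20 per frame, so no single step crosses the threshold.
fade = [[v] * 16 for v in range(20, 201, 20)]
print(detect_cuts(fade))  # []
```

The fade case shows why soft cuts need extra handling: the clip ends up completely different from where it started, yet no individual frame-to-frame difference exceeds the threshold.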
For a product team, scene-cut detection is invisible plumbing with a large effect on perceived quality at cuts. Without it, a fast-cut action sequence shows visible blockiness right after every cut, because the encoder is trying (and failing) to predict the new shot from the old. With it, every cut is clean. x264 and x265 enable scene-cut detection by default; for other encoders such as SVT-AV1, check the default in the version you use. In FFmpeg, -sc_threshold tunes the sensitivity (with libx264 it maps to x264's scenecut setting). For sports and live broadcast with cuts every few seconds, scene-cut detection saves roughly 5–10% bitrate at the same quality. The closely related concept is adaptive GOP: with scene-cut detection, GOP length is not fixed but adapts to content.
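To make the adaptive-GOP idea concrete, here is a minimal sketch of how keyframe positions might be planned once cut locations are known. The function name plan_keyframes and its parameters are hypothetical; max_gop and min_gop are loosely modelled on the maximum and minimum keyframe-interval settings that encoders expose, and the default values are illustrative only. Ignoring cuts that arrive too soon after a keyframe is a simplification of what real encoders do.

```python
from typing import Iterable, List


def plan_keyframes(
    num_frames: int,
    cuts: Iterable[int],
    max_gop: int = 250,  # assumed upper bound on keyframe distance
    min_gop: int = 25,   # assumed lower bound on keyframe distance
) -> List[int]:
    """Choose keyframe positions from detected cuts and GOP length limits.

    A keyframe is placed at frame 0, at every cut that is at least
    `min_gop` frames after the previous keyframe, and whenever `max_gop`
    frames have passed without one.
    """
    if num_frames <= 0:
        return []
    cut_set = set(cuts)
    keyframes = [0]
    for i in range(1, num_frames):
        since_last = i - keyframes[-1]
        if since_last >= max_gop:
            keyframes.append(i)  # forced keyframe: GOP reached its maximum length
        elif i in cut_set and since_last >= min_gop:
            keyframes.append(i)  # keyframe aligned to a scene cut
    return keyframes


# With cuts, GOP lengths follow the content (100, 250, 50, then the remainder).
print(plan_keyframes(600, [100, 110, 400]))  # [0, 100, 350, 400]

# Without cuts, the plan falls back to a fixed GOP.
print(plan_keyframes(600, []))  # [0, 250, 500]
```

In the first call the cut at frame 110 is skipped because it falls only 10 frames after the keyframe at 100, and the keyframe at 350 is forced by the maximum GOP length rather than by content.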

