MS-SSIM, multi-scale SSIM, is the natural extension of structural similarity from one scale to several. Plain SSIM measures luminance, contrast, and structure at a single scale with a fixed window size, which silently assumes one viewing distance and display resolution. MS-SSIM instead computes the contrast and structure comparisons on the full-resolution image, then repeatedly low-pass-filters and downsamples it, repeating those comparisons at each scale (five in the standard design) and adding the luminance comparison only at the coarsest one. The per-scale results are combined as a weighted product, with fixed exponents (0.0448, 0.2856, 0.3001, 0.2363, 0.1333, finest to coarsest) that were tuned against human judgments. The payoff is a score that is more stable across viewing distance and display resolution than single-scale SSIM, which is why many modern pipelines that want a structure-based score reach for it. It shares SSIM's family traits and limits: it is still a spatial, structure-based proxy, usually computed on luma, and its exact value depends on the implementation. Like every objective metric, it is graded against subjective scores, where perception-trained VMAF generally surpasses it.
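
To make the combination step concrete, here is a minimal sketch of how the five per-scale results are usually merged. It assumes the original MS-SSIM formulation, in which the weights act as exponents in a product and the luminance term enters only at the coarsest scale. The function name `combine_ms_ssim` and its arguments are hypothetical, chosen for illustration; the filtering, downsampling, and per-scale SSIM computation are deliberately left out, and inputs are assumed to be positive.

```python
# Standard five-scale exponents, ordered finest to coarsest.
WEIGHTS = (0.0448, 0.2856, 0.3001, 0.2363, 0.1333)


def combine_ms_ssim(cs_per_scale, luminance_coarsest, weights=WEIGHTS):
    """Merge per-scale terms into a single MS-SSIM-style score.

    cs_per_scale: mean contrast-times-structure value at each scale,
        finest first (hypothetical input; assumed to lie in (0, 1]).
    luminance_coarsest: mean luminance term, taken at the coarsest
        scale only (also assumed to lie in (0, 1]).
    """
    if len(cs_per_scale) != len(weights):
        raise ValueError("need one contrast-structure value per weight")
    if min(cs_per_scale) <= 0 or luminance_coarsest <= 0:
        # Fractional powers of non-positive values are not meaningful here;
        # real implementations typically clamp or normalise first.
        raise ValueError("inputs must be positive")

    # The luminance term shares the exponent of the coarsest scale.
    score = luminance_coarsest ** weights[-1]
    for cs, w in zip(cs_per_scale, weights):
        score *= cs ** w
    return score


# Identical images give 1.0 at every scale, so the product is 1.0.
perfect = combine_ms_ssim([1.0] * 5, 1.0)

# The same loss hurts more at a mid scale than at the finest scale,
# because the mid-scale exponent is larger.
fine_loss = combine_ms_ssim([0.5, 1.0, 1.0, 1.0, 1.0], 1.0)
mid_loss = combine_ms_ssim([1.0, 1.0, 0.5, 1.0, 1.0], 1.0)
```

Because the exponents are largest at the middle scales, a distortion confined to the finest scale moves the score far less than the same distortion at a mid scale. This is one way to read the claim that the weights encode how visible errors are at typical viewing conditions.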