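SSIM, the Structural Similarity Index, is a full-reference metric that asks whether the structure of a picture survived rather than how many pixel values changed. It slides a small window across each frame and at every position compares the original and distorted patches on three things the eye cares about: luminance (mean brightness), contrast (standard deviation), and structure (the correlation of patterns, measured via covariance). It then multiplies the three into one score with a maximum of 1, where 1 means identical. For natural images the score usually lands between 0 and 1, although it can dip below zero when the two patches are anti-correlated. Averaging the per-window map gives the Mean SSIM that tools report. Because human vision is tuned to structure, SSIM generally tracks perceived quality better than PSNR: it reacts to blur, blocking, and ringing while being more lenient toward small, uniform brightness shifts. It has several catches. There is no single SSIM, because window shape, window size, and stabilizing constants differ between implementations, so scores are comparable only when they come from the same tool with the same settings. It is usually computed on luma only and frame by frame, so it ignores chroma and temporal artifacts. It also saturates: high-quality encodes cluster just below 1, where small numeric differences are hard to interpret. MS-SSIM extends the idea across several scales, and VMAF, which fuses several features with a learned model, often correlates better with viewer ratings.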
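
The windowed comparison described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the computation any particular tool performs: the function name `mean_ssim` is hypothetical, the window is a uniform 8x8 box (the original SSIM paper uses an 11x11 Gaussian window), the constants use the conventional defaults K1 = 0.01 and K2 = 0.03 with an 8-bit range of 255, and the contrast and structure terms are merged using the common simplification C3 = C2 / 2. It materializes every patch product, so it is only suitable for small frames.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def mean_ssim(ref, dist, win=8, data_range=255.0, k1=0.01, k2=0.03):
    """Mean SSIM of two same-sized grayscale images (illustrative sketch).

    Assumptions: uniform win x win window, population (biased) variance,
    and the simplified formula with C3 = C2 / 2.
    """
    x = np.asarray(ref, dtype=np.float64)
    y = np.asarray(dist, dtype=np.float64)
    if x.ndim != 2 or x.shape != y.shape:
        raise ValueError("expected two 2-D arrays of the same shape")

    c1 = (k1 * data_range) ** 2  # stabilizes the luminance term
    c2 = (k2 * data_range) ** 2  # stabilizes the contrast/structure term

    # Views of every win x win patch: shape (H-win+1, W-win+1, win, win).
    px = sliding_window_view(x, (win, win))
    py = sliding_window_view(y, (win, win))

    # Per-window statistics.
    mu_x = px.mean(axis=(-2, -1))
    mu_y = py.mean(axis=(-2, -1))
    var_x = px.var(axis=(-2, -1))
    var_y = py.var(axis=(-2, -1))
    cov_xy = (px * py).mean(axis=(-2, -1)) - mu_x * mu_y

    # Luminance term times the merged contrast-and-structure term.
    ssim_map = ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )

    # Averaging the per-window map gives the Mean SSIM.
    return float(ssim_map.mean())
```

Calling `mean_ssim(img, img)` on any 2-D array returns a value within rounding error of 1, while comparing a noise image with its inverse (`255 - img`) yields a negative score, which illustrates why the range is not strictly 0 to 1. Changing `win` or the constants changes the result, which is the practical reason scores from different tools should not be compared directly.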

