Ground truth in video quality is the human verdict: the subjective Mean Opinion Score (MOS) produced when a panel of viewers rates clips under controlled conditions. It is the bedrock the field rests on, because video quality is ultimately defined by the people watching. Objective metrics earn their authority entirely by reference to it: SSIM was validated by showing that it tracks human ratings better than older pixel-error measures such as MSE and PSNR, and VMAF is a machine-learning model trained directly on subjective opinion scores. The standard way to measure agreement between a metric and MOS, set out in ITU-T P.1401, uses the Pearson correlation for linear prediction accuracy, the Spearman rank correlation for how well the metric preserves rank order, and RMSE for the size of the prediction error. The catch is that ground truth is expensive and slow to gather: a valid test needs at least fifteen observers, controlled viewing conditions, and a confidence interval reported alongside each score, so it cannot be run on every encode. The governing rule is simple: when a metric and a properly run subjective test disagree, the ground truth wins, and the metric has hit a blind spot on that content.
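To make the agreement statistics concrete, here is a minimal sketch in plain Python of the quantities named above: a MOS with an approximate 95% confidence interval, plus Pearson, Spearman, and RMSE between a metric and MOS. The function names and the sample numbers are hypothetical and chosen only for illustration. The sketch also assumes the metric scores have already been mapped onto the MOS scale before RMSE is computed, and it uses a normal approximation for the confidence interval where a Student-t quantile would be more exact for small panels.

```python
import math
from statistics import mean, stdev


def mos_with_ci(ratings, z=1.96):
    """Return (MOS, half-width of an approximate 95% confidence interval)."""
    # Assumption: normal approximation (z = 1.96) rather than a Student-t quantile.
    half_width = z * stdev(ratings) / math.sqrt(len(ratings))
    return mean(ratings), half_width


def pearson(x, y):
    """Pearson linear correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rmse(predicted, observed):
    """Root-mean-square error between predicted and observed scores."""
    return math.sqrt(mean((p - o) ** 2 for p, o in zip(predicted, observed)))


# Hypothetical data: five clips, each with a MOS and a metric score
# that is assumed to be already mapped onto the same 1-5 scale.
mos = [1.8, 2.6, 3.1, 3.9, 4.5]
metric = [2.0, 2.4, 3.4, 3.7, 4.6]

print("Pearson :", round(pearson(metric, mos), 3))
print("Spearman:", round(spearman(metric, mos), 3))
print("RMSE    :", round(rmse(metric, mos), 3))
print("MOS, CI :", mos_with_ci([5, 4, 4, 3, 4]))
```

The three statistics answer different questions, which is why they are reported together: a metric can rank clips perfectly (Spearman of 1) while still being offset from the MOS scale, and only RMSE would reveal that offset.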