Harmonic-mean pooling collapses per-frame scores into one number by averaging their reciprocals and then inverting that average. Because reciprocals grow fastest near zero, the result is dragged toward the low values, so bad frames weigh more heavily than they do in the arithmetic mean. For a clip of 270 frames at VMAF 95 and 30 frames at VMAF 30 (one bad second out of ten, assuming 30 fps), the arithmetic mean reads 88.5 while the harmonic mean reads 78.1, more than ten points lower, because the bad second now counts for more. It is a gentle correction rather than a worst-case alarm: the score still reflects the whole clip, but the low frames get a louder vote, which better matches the human tendency to weight the worst moments. The sharp edge to know is zero scores. Because the method sums reciprocals, a single frame at exactly zero turns one term into 1/0, and the pool either fails outright or collapses to zero, so implementations quietly shift zero and near-zero scores to a small positive value. When much of a clip is broken, prefer percentile pooling or the minimum, which have no division to break.
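
The arithmetic in the example clip can be checked with a short sketch. It is only an illustration under stated assumptions: the function names `arithmetic_pool` and `harmonic_pool` and the `floor` parameter are hypothetical, and the clamp shown here is one simple way to avoid dividing by zero, not the behavior of any particular VMAF tool.

```python
def arithmetic_pool(scores):
    """Plain average of the per-frame scores."""
    return sum(scores) / len(scores)


def harmonic_pool(scores, floor=1e-3):
    """Harmonic mean: invert the average of the reciprocals.

    `floor` is a hypothetical guard (an assumption for this sketch):
    scores below it are raised to it so no reciprocal divides by zero.
    """
    clamped = [max(s, floor) for s in scores]
    return len(clamped) / sum(1.0 / s for s in clamped)


# The clip from the text: 270 good frames and 30 bad ones.
clip = [95.0] * 270 + [30.0] * 30

print(round(arithmetic_pool(clip), 1))  # 88.5
print(round(harmonic_pool(clip), 1))    # 78.1

# Replace one bad frame with an exact zero: the clamp keeps the pool
# defined, but that single frame now dominates the result.
with_zero = clip[:-1] + [0.0]
print(harmonic_pool(with_zero) < 1.0)   # True
```

The last line shows why a clamp alone is not a full remedy: the pool stays computable, yet one zero frame pulls a mostly good clip below 1, which is the situation where percentile pooling or the minimum is the safer choice.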

