Pose estimation outputs a skeleton of keypoints (joints such as shoulders, elbows, and knees) for each person, frame by frame. From those joint coordinates a product can count reps, analyse a golf swing, or recognise gestures. MediaPipe Pose, RTMPose, and ViTPose are commonly used models.
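
As a rough illustration of how joint coordinates can become a rep count, the sketch below computes the angle at a joint from three keypoints and counts a rep each time that angle drops below one threshold and then rises above another. It is not tied to any of the models named above: the function names `joint_angle` and `count_reps`, the `(x, y)` tuple format, and the 70/150 degree thresholds are all assumptions chosen for the example, and a real product would tune them per exercise and would likely smooth the keypoints first.

```python
import math


def joint_angle(a, b, c):
    """Return the angle at point b (in degrees) formed by the segments b->a and b->c.

    Each point is an (x, y) tuple, e.g. shoulder, elbow, wrist.
    Returns None if two points coincide and the angle is undefined.
    """
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        return None
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to guard against floating-point drift just outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


def count_reps(angles, flexed_below=70.0, extended_above=150.0):
    """Count reps in a per-frame sequence of joint angles.

    A rep is one full cycle: the angle falls below `flexed_below`,
    then rises above `extended_above`. Using two thresholds (hysteresis)
    keeps jitter around a single threshold from being counted as reps.
    The default thresholds are illustrative assumptions, not tuned values.
    """
    reps = 0
    flexed = False
    for angle in angles:
        if angle is None:  # skip frames where the angle could not be computed
            continue
        if not flexed and angle < flexed_below:
            flexed = True
        elif flexed and angle > extended_above:
            flexed = False
            reps += 1
    return reps
```

For a bicep curl, for example, `joint_angle(shoulder, elbow, wrist)` would be evaluated on every frame and the resulting list passed to `count_reps`. The two-threshold design is a deliberate choice: a partial movement that never reaches full flexion or full extension is not counted.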