A cue point is a named, timestamped marker, either embedded in the video timeline or stored in a parallel metadata track, that tells the player to perform a specific action when playback reaches that position: displaying an overlay, pausing for a quiz, firing an xAPI statement, marking a chapter boundary, or branching to a different scene. Cue points are the basic primitive from which interactivity in an interactive video player is assembled: each hotspot activation, in-video quiz, chapter label, and timeline-triggered xAPI Video Profile statement is ultimately driven by a cue point being crossed. In HTML5, the browser-native mechanism is the TextTrack API with kind "metadata": the text of a WebVTT metadata cue is free-form, so it can carry a JSON payload that the player parses and acts on when the track's cuechange event fires, which makes metadata tracks a standards-compliant way to attach interaction data without modifying the video file itself. Alternatively, a platform can store its cue point list as JSON or XML served alongside the video manifest and run its own polling loop on the timeupdate event. Precision matters: video is typically delivered at 24–30 frames per second, but cue-point resolution depends on how often the player samples the current time. Browsers fire timeupdate only every 15 to 250 ms, so a loop driven by that event alone can fire a cue up to about a quarter of a second late, which is especially noticeable when a quiz is meant to appear exactly on a particular spoken word. Cue points are closely tied to the video-player SDK, which provides the scheduling and dispatch infrastructure; at the data-model level, a cue point record contains at minimum a timestamp, a type identifier, and a payload object whose schema is defined by the type. A practical gotcha is cue drift on adaptive-bitrate streams: if the player's clock and the video's media clock diverge during a bitrate switch, cue points can fire early or late, so the SDK needs to resynchronise on segment boundaries.
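The data model and polling approach described above can be sketched roughly as follows. This is a minimal illustration, not any particular SDK's implementation: the names `CuePoint`, `cueFromMetadata`, `CueDispatcher`, and `maxStep` are hypothetical, and the JSON shape `{"type": ..., "payload": ...}` for the metadata cue text is an assumption. The dispatcher fires every cue whose timestamp falls between two consecutive time samples and treats a backward or unusually large jump as a seek, so skipped cues do not fire.

```typescript
// Hypothetical cue point record: timestamp, type identifier, type-specific payload.
type CuePoint = {
  time: number;     // seconds from the start of the video
  type: string;     // e.g. "quiz", "chapter", "overlay" (illustrative names)
  payload: unknown; // schema is defined by `type`
};

// Build a cue point from the text of a metadata cue, assuming (hypothetically)
// that the text is JSON of the form {"type": "...", "payload": {...}}.
function cueFromMetadata(startTime: number, text: string): CuePoint | null {
  try {
    const data: unknown = JSON.parse(text);
    if (typeof data !== "object" || data === null) return null;
    const record = data as Record<string, unknown>;
    const type = record["type"];
    if (typeof type !== "string") return null;
    return { time: startTime, type, payload: record["payload"] ?? null };
  } catch {
    return null; // cue text was not valid JSON
  }
}

class CueDispatcher {
  private readonly cues: CuePoint[];
  private readonly onCue: (cue: CuePoint) => void;
  private readonly maxStep: number;
  private lastTime = 0;

  // maxStep: largest forward jump (in seconds) still treated as normal playback.
  constructor(cues: CuePoint[], onCue: (cue: CuePoint) => void, maxStep = 1.0) {
    this.cues = [...cues].sort((a, b) => a.time - b.time);
    this.onCue = onCue;
    this.maxStep = maxStep;
  }

  // Call with the player's current time on every sample.
  tick(currentTime: number): void {
    const delta = currentTime - this.lastTime;
    if (delta > 0 && delta <= this.maxStep) {
      // Normal playback: fire cues in the half-open window [lastTime, currentTime).
      for (const cue of this.cues) {
        if (cue.time >= currentTime) break;
        if (cue.time >= this.lastTime) this.onCue(cue);
      }
    }
    // Backward or large jumps are treated as seeks: nothing fires for the gap.
    this.lastTime = currentTime;
  }
}
```

In a browser, `tick` would typically be called from a `timeupdate` listener with `video.currentTime`. Sampling more often narrows the window in which a cue can fire late, at the cost of more frequent checks.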