RTSP (Real-Time Streaming Protocol) is the "remote control" for a video stream. It does not carry the video itself; it sets up and controls the session — DESCRIBE to learn what a stream offers, SETUP to negotiate transport, then PLAY, PAUSE, and TEARDOWN. When a VMS connects to a camera at a URL like rtsp://camera/stream1, RTSP is the conversation that establishes how the live feed will flow. It is standardised by the IETF in RFC 2326 (RTSP 1.0, 1998) and RFC 7826 (RTSP 2.0, 2016).
In a surveillance pipeline, ONVIF typically discovers the camera and hands over its RTSP stream URIs; the VMS then opens an RTSP session, and the actual media is delivered by RTP underneath, described by SDP. This division of labour — RTSP to control, RTP to transport, SDP to describe — is the backbone of how nearly every IP camera delivers live video.
The pitfall is networking, not the protocol. RTSP/RTP commonly runs over UDP for low latency, which firewalls and NAT can block or mangle, producing the classic "camera connects but no video" symptom; switching to RTSP-over-TCP (interleaving the media in the control channel) fixes many such cases at the cost of some latency. The deep transport behaviour is owned by the Video Streaming section; for surveillance, RTSP is simply how the stream gets turned on.

