A camera stream is the continuous flow of encoded video, often with audio and metadata, that a camera sends to the video management system (VMS). In an IP system the stream is usually set up with RTSP (the "remote control" that issues commands such as PLAY, PAUSE, and TEARDOWN), carried as RTP packets on the network, and compressed with a codec such as H.264 or H.265. The VMS opens the stream, decodes it for live view, and writes the still-encoded bytes to storage for recording.
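
To make the "remote control" role of RTSP concrete, here is a minimal sketch that only formats the text of a typical request sequence (DESCRIBE, SETUP, PLAY, TEARDOWN) without opening a socket. The camera URL, the `trackID=0` control path, the client ports, and the session ID are hypothetical placeholders; a real client reads the control path from the SDP returned by DESCRIBE and the session ID from the SETUP response.

```python
def build_rtsp_request(method, url, cseq, headers=None):
    """Format one RTSP/1.0 request: request line, CSeq, extra headers, blank line."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    return "\r\n".join(lines) + "\r\n\r\n"


# Hypothetical values, for illustration only.
URL = "rtsp://camera.example/stream1"
SESSION = "12345678"  # in practice, taken from the server's SETUP response

handshake = [
    # Ask the camera to describe the stream (codec, tracks) as SDP.
    build_rtsp_request("DESCRIBE", URL, 1, {"Accept": "application/sdp"}),
    # Negotiate how RTP will be delivered for one track.
    build_rtsp_request(
        "SETUP", URL + "/trackID=0", 2,
        {"Transport": "RTP/AVP;unicast;client_port=5000-5001"},
    ),
    # Start the flow of RTP packets.
    build_rtsp_request("PLAY", URL, 3, {"Session": SESSION, "Range": "npt=0.000-"}),
    # End the session and release camera resources.
    build_rtsp_request("TEARDOWN", URL, 4, {"Session": SESSION}),
]

if __name__ == "__main__":
    for request in handshake:
        print(request)
```

The video itself never travels in these messages; they only control the session, while the encoded frames arrive separately as RTP packets.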

Most cameras publish more than one stream at once. A high-resolution "main stream" (commonly 2–8 Mbps) is recorded and shown full-screen, while a lower-resolution "substream" feeds multi-camera grids and mobile clients cheaply. This multi-stream design is what lets a wall of 32 thumbnails stay responsive without decoding 32 full 4K feeds.
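
The effect of the multi-stream design can be sketched with a small selection rule. The profile values below (a 4K main stream at 8 Mbps and a 640x360 substream at 0.5 Mbps) and the names `StreamProfile`, `pick_stream`, and `grid_bandwidth_mbps` are assumptions chosen for illustration, not settings from any particular camera or VMS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamProfile:
    name: str
    width: int
    height: int
    bitrate_mbps: float


# Assumed example profiles; real values depend on the camera configuration.
MAIN = StreamProfile("main", 3840, 2160, 8.0)
SUB = StreamProfile("sub", 640, 360, 0.5)


def pick_stream(tiles_on_screen: int) -> StreamProfile:
    """Use the main stream only for a single expanded view, else the substream."""
    return MAIN if tiles_on_screen == 1 else SUB


def grid_bandwidth_mbps(tiles_on_screen: int) -> float:
    """Total bitrate the client pulls when every tile uses the picked stream."""
    return tiles_on_screen * pick_stream(tiles_on_screen).bitrate_mbps


if __name__ == "__main__":
    print(grid_bandwidth_mbps(32))      # 32 substreams: 16.0 Mbps
    print(32 * MAIN.bitrate_mbps)       # 32 main streams: 256.0 Mbps
```

Under these assumed bitrates, a 32-tile wall costs 16 Mbps with substreams versus 256 Mbps with main streams, before counting the much larger decode load of 4K frames.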

The classic pitfall is pulling the main stream everywhere. Displaying dozens of main streams in a grid overloads the client's decode capacity and the network, causing dropped frames and lag; the fix is to wire substreams to the grid and reserve the main stream for the single expanded view and for recording. Stream count, resolution, frame rate, and codec together determine the bitrate, and the bitrate sets the bandwidth and storage bill, so they are the first numbers to plan.
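
As a rough planning aid, the storage side of that bill can be estimated from the bitrate alone. This sketch assumes continuous recording at a constant average bitrate, ignores container and filesystem overhead, and uses decimal units (1 GB = 10^9 bytes). The 4 Mbps figure sits inside the 2–8 Mbps range mentioned above; the 16 cameras and 30-day retention are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def storage_gb(bitrate_mbps: float, days: float, cameras: int = 1) -> float:
    """Decimal gigabytes needed to record `cameras` streams for `days` days."""
    bytes_per_second = bitrate_mbps * 1_000_000 / 8  # megabits/s to bytes/s
    total_bytes = bytes_per_second * SECONDS_PER_DAY * days * cameras
    return total_bytes / 1_000_000_000


if __name__ == "__main__":
    # One camera at 4 Mbps for one day: about 43.2 GB.
    print(storage_gb(4.0, days=1))
    # Sixteen cameras at 4 Mbps for 30 days: about 20,736 GB (roughly 20.7 TB).
    print(storage_gb(4.0, days=30, cameras=16))
```

Because the estimate is linear in bitrate, halving the recorded bitrate (for example through a more efficient codec or a lower frame rate) halves the storage requirement under the same assumptions.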