Failover is the automatic switch to a standby component when a primary one fails, so the system keeps running through a fault instead of going dark. In surveillance it usually means a spare recording server that takes over the cameras of a failed one, and a redundant management server that keeps the system controllable if the primary management host dies. The goal is continuity of recording — the moment you most need footage is often the moment hardware is under stress.
Failover works by monitoring the health of each server and redirecting its work when it fails. In an N+1 recording design, one extra server stands ready for a group of N active recorders; if any one of them fails, its camera streams are reassigned to the spare and recording continues with at most a brief gap. Management failover (commonly built on clustering such as Windows Server Failover Clustering) keeps the control plane, meaning users, search, and configuration, available even when a server is lost. Good designs pair this with UPS power so that a mains blip does not take everything down.
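The N+1 reassignment described above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's implementation: the `Recorder` class, the `fail_over` function, and the camera names are all hypothetical, and a real system would detect failure through heartbeats or health probes rather than a flag set by hand.

```python
from dataclasses import dataclass, field


@dataclass
class Recorder:
    """Hypothetical model of a recording server and the cameras it records."""
    name: str
    healthy: bool = True
    cameras: list[str] = field(default_factory=list)


def fail_over(active: list[Recorder], spare: Recorder) -> list[str]:
    """Move the cameras of the first unhealthy recorder onto the spare.

    Returns the names of the reassigned cameras. Returns an empty list if
    no recorder has failed, or if the spare is already carrying cameras
    (a single spare can cover only one failure at a time).
    """
    if spare.cameras:
        return []  # spare already in use: N+1 is exhausted
    for recorder in active:
        if not recorder.healthy:
            moved = list(recorder.cameras)
            spare.cameras.extend(moved)   # spare starts recording these streams
            recorder.cameras.clear()      # failed server no longer owns them
            return moved
    return []
```

Calling `fail_over` after marking one recorder unhealthy moves that recorder's cameras to the spare; a second failure while the spare is occupied moves nothing, which is the practical limit of a design with only one spare.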
The two main pitfalls are untested failover and the switchover gap. Failover that has never been tested often does not work when it is finally needed: a misconfigured cluster or an undersized spare fails silently until a real outage exposes it, so failover must be exercised, not assumed. Failover is also not instantaneous. There is usually a short window during the switch in which a few seconds of footage can be lost, which is why high-assurance cameras also record at the edge (to an SD card in the camera) as a second line of defense. Design failover deliberately, size the spare for the load it must absorb, and test it on a schedule.
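Sizing the spare comes down to simple arithmetic: it must be able to carry the heaviest load of any single recorder it protects. The sketch below assumes load is measured as ingest bandwidth in Mbit/s and adds a 20% headroom margin; the function name, the units, and the margin are illustrative assumptions, not figures from any standard or product.

```python
def spare_can_absorb(
    recorder_loads_mbps: dict[str, float],
    spare_capacity_mbps: float,
    headroom: float = 0.2,
) -> bool:
    """Check whether one spare can take over the busiest protected recorder.

    recorder_loads_mbps maps each active recorder to its ingest load.
    headroom is an assumed safety margin (0.2 means 20% above the worst case).
    """
    if not recorder_loads_mbps:
        return True  # nothing to protect
    worst_case = max(recorder_loads_mbps.values())
    return spare_capacity_mbps >= worst_case * (1 + headroom)
```

With recorders at 300 and 450 Mbit/s, a spare rated for 600 Mbit/s passes the check (it needs roughly 540), while one rated for 500 Mbit/s does not. A check like this belongs in the design review, and the scheduled failover test then confirms it under real load.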

