Edge AI means running the analytics — the object detection, classification, or event logic — directly on the camera or a device right next to it, rather than shipping the video to a server or the cloud to be analysed. The camera's own processor looks at the frames, decides "that is a person", and emits a small result. The video may never leave the device unless something interesting happens.
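
As a rough illustration of that loop, the sketch below shows a camera analysing a frame locally and emitting only a compact JSON event. It is a minimal sketch under stated assumptions, not any vendor's API: `detect` is a hypothetical stand-in for the on-camera model, a "frame" is simulated as a list of objects already in the scene, and the field names (`camera`, `ts`, `detections`) and the `min_confidence` threshold are invented for the example.

```python
import json
import time
from dataclasses import dataclass, asdict


@dataclass
class Detection:
    label: str         # e.g. "person"
    confidence: float  # 0.0 to 1.0
    box: tuple         # (x, y, width, height) in pixels


def detect(frame):
    """Hypothetical stand-in for the on-camera model.

    A real device would run inference on pixels here; in this sketch a
    "frame" is simply a list of dicts describing what is in the scene.
    """
    return [Detection(**obj) for obj in frame]


def process_frame(frame, camera_id, min_confidence=0.5, timestamp=None):
    """Return a compact JSON event if something is detected, else None."""
    hits = [d for d in detect(frame) if d.confidence >= min_confidence]
    if not hits:
        return None  # nothing interesting, so nothing leaves the device
    event = {
        "camera": camera_id,
        "ts": time.time() if timestamp is None else timestamp,
        "detections": [asdict(d) for d in hits],
    }
    return json.dumps(event)


# Simulated frames: an empty scene and one containing a person.
empty_scene = []
busy_scene = [{"label": "person", "confidence": 0.91, "box": (120, 40, 60, 180)}]

print(process_frame(empty_scene, "cam-01"))
print(process_frame(busy_scene, "cam-01", timestamp=0.0))
```

The event for the busy scene is a short string of metadata, while the empty scene produces nothing to send at all.
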
The advantages are speed, bandwidth, and privacy. Deciding on the camera avoids the network round-trip, so an edge result can arrive in roughly 25–100 ms, versus several hundred milliseconds for a cloud round-trip. It lets the camera send mostly metadata instead of a continuous video stream, which can cut upload traffic by around 99%. And it keeps raw imagery local, which eases privacy and data-residency concerns. On-camera silicon (an NPU, or neural processing unit) commonly offers a few TOPS (trillions of operations per second) of compute, enough for person/vehicle detection and simple rules.
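
The bandwidth figure can be sanity-checked with simple arithmetic. The sketch below compares a continuous video stream with a stream of small metadata events. The inputs (a 4 Mbit/s stream, 250-byte events, 5 events per second) are assumed values chosen for illustration, not measurements, and `upload_reduction` is a hypothetical helper.

```python
def upload_reduction(video_kbps, event_bytes, events_per_second):
    """Fraction of upload saved by sending metadata instead of continuous video."""
    metadata_kbps = event_bytes * 8 * events_per_second / 1000
    return 1 - metadata_kbps / video_kbps


# Assumed figures: 4 Mbit/s video, 250-byte events, 5 events per second.
saving = upload_reduction(video_kbps=4000, event_bytes=250, events_per_second=5)
print(f"{saving:.2%}")  # prints 99.75%
```

Under these assumptions the metadata stream is 10 kbit/s against 4,000 kbit/s of video. Uploading occasional clips when something interesting happens would lower the saving, so the real figure depends on how often events occur.
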
The pitfall is expecting edge AI to do everything. A camera's compute, memory, and power budget cap how big and how accurate a model can be, so the heaviest tasks — cross-camera re-identification, large vision-language reasoning, training — still belong on a server or in the cloud. Edge AI is best understood as the fast, cheap first filter; the deployment decision (what runs where) is the real design problem, and the model internals belong to the AI for Video Engineering section.
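
One way to make the "what runs where" decision concrete is to compare each task's needs with the camera's budget. The sketch below is only an illustration of that idea: the `Task` and `CameraBudget` types, the `place` function, and every number in it are hypothetical, and a real design would also weigh power, latency targets, and cost.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    tops_needed: float                 # sustained compute, in TOPS (assumed)
    memory_mb: int                     # model plus working memory (assumed)
    needs_other_cameras: bool = False  # e.g. cross-camera re-identification


@dataclass
class CameraBudget:
    tops: float
    memory_mb: int


def place(task, budget):
    """Return "edge" if the task fits on the camera, otherwise "server"."""
    if task.needs_other_cameras:
        # Assumption for this sketch: work spanning several feeds is centralised.
        return "server"
    if task.tops_needed > budget.tops or task.memory_mb > budget.memory_mb:
        return "server"
    return "edge"


# Illustrative numbers only.
camera = CameraBudget(tops=4.0, memory_mb=512)
tasks = [
    Task("person/vehicle detection", tops_needed=1.0, memory_mb=200),
    Task("cross-camera re-identification", tops_needed=2.0, memory_mb=400,
         needs_other_cameras=True),
    Task("vision-language reasoning", tops_needed=50.0, memory_mb=8000),
]

for task in tasks:
    print(f"{task.name}: {place(task, camera)}")
```

With these made-up budgets, only the detection task stays on the camera; the other two are routed to a server, matching the "first filter" role described above.
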

