Object detection answers 'what is where' by returning a labelled bounding box for every object it finds, such as a person, a car, or a package. It is the most widely used computer vision (CV) task in video products and the foundation that tracking, counting, and analytics build on.
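To make the idea of a "labelled bounding box" concrete, here is a minimal Python sketch of what a detector's output for one frame might look like, plus a simple count built on top of it. The `Detection` class, the `count_by_label` helper, the pixel corner coordinates, the `score` confidence field, and the sample values are all illustrative assumptions, not the format of any particular detector.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One detected object: a class label plus an axis-aligned box in pixels."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    score: float  # assumed confidence in [0, 1]; hypothetical field

    @property
    def area(self) -> float:
        """Box area in square pixels (0 for a degenerate box)."""
        width = max(0.0, self.x_max - self.x_min)
        height = max(0.0, self.y_max - self.y_min)
        return width * height


def count_by_label(detections, min_score=0.5):
    """Count objects per label, ignoring detections below min_score."""
    return Counter(d.label for d in detections if d.score >= min_score)


# Hypothetical detector output for a single video frame (made-up values).
frame_detections = [
    Detection("person", 12, 30, 112, 330, 0.91),
    Detection("car", 200, 150, 520, 360, 0.88),
    Detection("person", 400, 40, 470, 300, 0.42),  # below the threshold
    Detection("package", 90, 300, 150, 350, 0.77),
]

counts = count_by_label(frame_detections)
for label, n in sorted(counts.items()):
    print(f"{label}: {n}")
```

Counting here is just a tally over the per-frame detections, which is one way downstream features such as tracking or analytics can consume the same list of boxes.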