Object Detection

Robotics applications need to detect and localize where objects are in a scene in order to react intelligently to their presence. Object detection finds bounding boxes in pixel coordinates of target objects in a single monocular camera image.

While classical computer vision algorithms have been somewhat successful in object detection, deep learned models trained on real examples with correct bounding boxes have changed the game. DNN architectures such as DetectNetV2 and YOLOv8 are world-class at detecting people, common objects, machine parts, or anything needed for the task at hand.

Object detection annotated on an image