A training-free framework that converts SAM3 into a real-time multi-class open-vocabulary detector. Achieves 55.8 AP on COCO val2017 (80 classes) and runs at 15.8 FPS (4 classes, 1008px) on a single RTX 4080.
Only a Chinese version is available at the moment. If you would like an English version, please open an issue for it. At the very least, let me know you are interested in the project :D The project is not stable yet.
This important work introduces an integrated open-source platform for behavioral acquisition and pose estimation that substantially improves the accessibility and speed of real-time animal tracking ...
Abstract: Object detection is a core computer vision problem that requires real-time performance as an indispensable companion of accuracy. The YOLO family (You Only Look Once) has gained popularity ...
Abstract: This review marks the tenth anniversary of You Only Look Once (YOLO), one of the most influential frameworks in real-time object detection. Over the past decade, YOLO has evolved from a ...