Training-free framework that converts SAM3 into a real-time multi-class open-vocabulary detector. Achieves 55.8 AP on COCO val2017 (80 classes) and runs at 15.8 FPS (4 classes, 1008 px) on a single RTX 4080.
HOI-DETR is a transformer-based framework for detecting hands, hand-held objects, and their interactions in images and video. Built on the Co-DETR architecture, it adds a lightweight interaction ...
Abstract: Small target detection in remote sensing images is a significant research focus within the remote sensing domain. Recently, various YOLO algorithms have demonstrated remarkable achievements ...
Version 5.0 Modernizes DNN Engine, Adds LLM/VLM Support, and Enhances Core, Hardware Acceleration, and 3D Stack.
Abstract: Tiny-object detection is increasingly crucial in fields such as remote sensing, traffic monitoring, and robotics. Inspired by human visual perception, the attention mechanism has become a ...