LOVE-R1: Advancing Long Video Understanding with an Adaptive Zoom-in Mechanism via Multi-Step Reasoning • Paper • 2509.24786 • Published Sep 29, 2025
HumanOmniV2: From Understanding to Omni-Modal Reasoning with Context • Paper • 2506.21277 • Published Jun 26, 2025
ViSpeak: Visual Instruction Feedback in Streaming Videos • Paper • 2503.12769 • Published Mar 17, 2025