OpenTAD: A Unified Framework and Comprehensive Study of Temporal Action Detection Paper • 2502.20361 • Published Feb 27
$β$-CLIP: Text-Conditioned Contrastive Learning for Multi-Granular Vision-Language Alignment Paper • 2512.12678 • Published 13 days ago