Look Where It Matters: High-Resolution Crops Retrieval for Efficient VLMs Paper • 2603.16932 • Published 11 days ago • 72
WAVECLIP: Wavelet Tokenization for Adaptive-Resolution CLIP Paper • 2509.21153 • Published Sep 25, 2025 • 1
Speak While You Think: Streaming Speech Synthesis During Text Generation Paper • 2309.11210 • Published Sep 20, 2023 • 1
Exploring the Benefits of Tokenization of Discrete Acoustic Units Paper • 2406.05547 • Published Jun 8, 2024 • 1
Advancing Speech Understanding in Speech-Aware Language Models with GRPO Paper • 2509.16990 • Published Sep 21, 2025 • 22
Continuous Speech Synthesis using per-token Latent Diffusion Paper • 2410.16048 • Published Oct 21, 2024 • 30
HiMu: Hierarchical Multimodal Frame Selection for Long Video Question Answering Paper • 2603.18558 • Published 6 days ago • 10
Benchmarking Label Noise in Instance Segmentation: Spatial Noise Matters Paper • 2406.10891 • Published Jun 16, 2024 • 1
Context-aware Prompt Tuning: Advancing In-Context Learning with Adversarial Methods Paper • 2410.17222 • Published Oct 22, 2024 • 3
Hysteresis Activation Function for Efficient Inference Paper • 2411.10573 • Published Nov 15, 2024 • 1
$\mathbf{R}^3$: Reconstruction, Raw, and Rain: Deraining Directly in the Bayer Domain Paper • 2509.24022 • Published Sep 28, 2025 • 1