AIBench: Evaluating Visual-Logical Consistency in Academic Illustration Generation Paper • 2603.28068 • Published 6 days ago • 9
Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders Paper • 2603.06569 • Published about 1 month ago • 118
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding Paper • 2501.13106 • Published Jan 22, 2025 • 91
What Is a Good Caption? A Comprehensive Visual Caption Benchmark for Evaluating Both Correctness and Thoroughness Paper • 2502.14914 • Published Feb 19, 2025