Qwen/Qwen2.5-VL-72B-Instruct Image-Text-to-Text • 73B • Updated Jun 6, 2025 • 66.3k • 579
Qwen/Qwen2.5-VL-7B-Instruct Image-Text-to-Text • 8B • Updated Apr 6, 2025 • 2.29M • 1.42k
microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • 6B • Updated 28 days ago • 214k • 1.56k
docling-project/SmolDocling-256M-preview Image-Text-to-Text • 0.3B • Updated Sep 17, 2025 • 47.2k • 1.6k
Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers Paper • 2503.00865 • Published Mar 2, 2025 • 64
DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking Paper • 2502.20730 • Published Feb 28, 2025 • 38