Qwen/Qwen3-VL-30B-A3B-Thinking Image-Text-to-Text • 31B • Updated Nov 26, 2025 • 63.6k • 197
Qwen/Qwen3-VL-235B-A22B-Instruct-FP8 Image-Text-to-Text • 236B • Updated Nov 26, 2025 • 45k • 43
Qwen/Qwen3-VL-235B-A22B-Thinking-FP8 Image-Text-to-Text • 236B • Updated Nov 26, 2025 • 3.95k • 28
Qwen/Qwen3-VL-235B-A22B-Instruct Image-Text-to-Text • 236B • Updated Nov 26, 2025 • 1.08M • 383
Qwen/Qwen3-VL-235B-A22B-Thinking Image-Text-to-Text • 236B • Updated Nov 26, 2025 • 228k • 389
Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning Paper • 2506.07044 • Published Jun 8, 2025 • 114
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published Apr 14, 2025 • 308
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs Paper • 2406.07476 • Published Jun 11, 2024 • 36
LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization Paper • 2502.13922 • Published Feb 19, 2025 • 27