The Smol Training Playbook 📚 • The secrets to building world-class LLMs • 2.83k
MangaVQA and MangaLMM: A Benchmark and Specialized Model for Multimodal Manga Understanding Paper • 2505.20298 • Published May 26, 2025 • 9
microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • 6B • Updated about 1 month ago • 172k • 1.56k
JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation Paper • 2410.17250 • Published Oct 22, 2024 • 14
TextDiffuser 2 📚 • Generate images from text prompts with layout planning • 142