prithivMLmods/Herculis-CUA-GUI-Orchestrator-4B Image-Text-to-Text • 4B • Updated about 2 hours ago
Running on Zero MCP Featured 62 Qwen Image Edit 2509 LoRAs Fast Fusion ⭐ 62 Qwen Image Editing Fusion Collection LoRA Demo
HiF-VLA: Hindsight, Insight and Foresight through Motion Representation for Vision-Language-Action Models Paper • 2512.09928 • Published 3 days ago • 10
InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models Paper • 2512.08829 • Published 4 days ago • 13
Running on Zero MCP Featured 402 Qwen Image Edit 2509 LoRAs Fast ⚡ 402 Demo of the Collection of Qwen Image Edit LoRAs
Running on Zero MCP 379 Multimodal OCR 379 nanonets ocr2 / olmocr / qwen2vl ocr / aya vision / rolmocr
Running on Zero MCP 168 DocScope-R1 168 cosmos reason1 / docscopeocr / visionocr / captioner relaxed
Running on Zero MCP Featured 139 Multimodal OCR2 139 nanonets ocr / smoldocling / monkey ocr / typhoon ocr
Running on Zero MCP 110 VisionScope-R2 110 deepcaption / skycaptioner / spacethinker / spaceom / coreocr
Running on Zero MCP Featured 100 Photo Mate i2i 100 Image manipulation with Kontext adapters. [demo]