UniMedVL: Unifying Medical Multimodal Understanding and Generation through Observation-Knowledge-Analysis Paper • 2510.15710 • Published Oct 17 • 6
Parameter-Efficient Fine-Tuning for Pre-Trained Vision Models: A Survey and Benchmark Paper • 2402.02242 • Published Feb 3, 2024
dMLLM-TTS: Self-Verified and Efficient Test-Time Scaling for Diffusion Multi-Modal Large Language Models Paper • 2512.19433 • Published 4 days ago • 3
LLaDA2.0: Scaling Up Diffusion Language Models to 100B Paper • 2512.15745 • Published 17 days ago • 77
PICABench: How Far Are We from Physically Realistic Image Editing? Paper • 2510.17681 • Published Oct 20 • 62
Resurrect Mask AutoRegressive Modeling for Efficient and Scalable Image Generation Paper • 2507.13032 • Published Jul 17
Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding Paper • 2510.06308 • Published Oct 7 • 54
ArtiMuse: Fine-Grained Image Aesthetics Assessment with Joint Scoring and Expert-Level Understanding Paper • 2507.14533 • Published Jul 19 • 5