Update (March 2026): We are excited to introduce LLaDA-o, the latest model in the LLaDA series. LLaDA-o is a length-adaptive omni diffusion model for unified multimodal understanding and generation, extending the LLaDA line to broader multimodal settings: visual understanding, text-to-image generation, and instruction-based image editing. For more details, please check out the paper and code.

LLaDA-V

We introduce LLaDA-V, a competitive, purely diffusion-based vision-language model that outperforms other diffusion-based multimodal large language models (MLLMs).

It was presented in the paper LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning.

Project Page: https://ml-gsai.github.io/LLaDA-V-demo/

Code: https://github.com/ML-GSAI/LLaDA-V
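As a quick orientation for loading the weights, here is a minimal sketch in the style of other LLaDA model cards. The repo id, the AutoModel/AutoTokenizer entry points, and trust_remote_code loading are assumptions on my part; the inference scripts in the repository above are authoritative.

```python
# Hypothetical loading sketch, not the official API: the actual
# inference entry points live in the LLaDA-V repository linked above.
import torch
from transformers import AutoModel, AutoTokenizer

# trust_remote_code=True follows the convention of other LLaDA
# checkpoints; whether LLaDA-V loads this way is an assumption.
model = AutoModel.from_pretrained(
    "GSAI-ML/LLaDA-V",          # assumed repo id, matching this page
    torch_dtype=torch.float16,  # the card lists F16 weights
    trust_remote_code=True,
).to("cuda").eval()

tokenizer = AutoTokenizer.from_pretrained(
    "GSAI-ML/LLaDA-V", trust_remote_code=True
)
```

Note that, as a diffusion model, LLaDA-V decodes by iterative denoising rather than left-to-right sampling, so generation is expected to go through the sampler shipped with the repository's code rather than a standard autoregressive model.generate loop.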

Model size: 8B parameters (Safetensors, F16)
