CosyVoice 3: Towards In-the-wild Speech Generation via Scaling-up and Post-training Paper • 2505.17589 • Published May 23 • 5
jina-vlm Collection Jina-VLM: Small Multilingual Vision Language Model • 3 items • Updated about 14 hours ago • 7
OmniPSD: Layered PSD Generation with Diffusion Transformer Paper • 2512.09247 • Published 6 days ago • 43
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance Paper • 2512.08765 • Published 7 days ago • 121
Skywork-R1V4: Toward Agentic Multimodal Intelligence through Interleaved Thinking with Images and DeepResearch Paper • 2512.02395 • Published 14 days ago • 46
Skywork-R1V4 Collection Toward Agentic Multimodal Intelligence through Interleaved Thinking with Images and DeepResearch • 4 items • Updated 7 days ago • 7
PaCoRe Collection Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning • 3 items • Updated 7 days ago • 8
Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning Paper • 2512.07461 • Published 8 days ago • 72
Changelog Team & Enterprise Articles Now Featured on the Hugging Face Blog 8 days ago • 55
Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length Paper • 2512.04677 • Published 12 days ago • 166
Article Introducing swift-huggingface: The Complete Swift Client for Hugging Face 11 days ago • 31
XVLA Collection X-VLA is a soft-prompted Transformer for cross-embodiment robot learning • 6 items • Updated 12 days ago • 10