PhysBrain: Human Egocentric Data as a Bridge from Vision Language Models to Physical Intelligence Paper • 2512.16793 • Published 25 days ago • 72
LongVie 2: Multimodal Controllable Ultra-Long Video World Model Paper • 2512.13604 • Published 28 days ago • 73
RealGen: Photorealistic Text-to-Image Generation via Detector-Guided Rewards Paper • 2512.00473 • Published Nov 29, 2025 • 25
InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models Paper • 2512.08829 • Published Dec 9, 2025 • 18
OmniPSD: Layered PSD Generation with Diffusion Transformer Paper • 2512.09247 • Published Dec 10, 2025 • 46
Composing Concepts from Images and Videos via Concept-prompt Binding Paper • 2512.09824 • Published Dec 10, 2025 • 27
StereoWorld: Geometry-Aware Monocular-to-Stereo Video Generation Paper • 2512.09363 • Published Dec 10, 2025 • 71
Preserving Source Video Realism: High-Fidelity Face Swapping for Cinematic Quality Paper • 2512.07951 • Published Dec 8, 2025 • 48
Z Image Turbo: Generate realistic images from text descriptions • Running on Zero • MCP • Featured • 1.6k