---
license: mit
language:
- en
base_model:
- Wan-AI/Wan2.1-T2V-14B
---

# KALEIDO: OPEN-SOURCED MULTI-SUBJECT REFERENCE VIDEO GENERATION MODEL

This repository contains the official implementation of **Kaleido**, proposed in our paper [Kaleido: Open-Sourced Multi-Subject Reference Video Generation Model](https://arxiv.org/abs/2510.18573).

## Update and News

* 2025.10.28: 🔥 We release the checkpoints of Kaleido-14B-S2V.
* 2025.10.22: 🔥 We propose **Kaleido**, a novel multi-subject reference video generation model.

### Checkpoints Download

Use the following commands to download the model weights. For convenience, the Wan VAE and T5 modules are bundled into this checkpoint.

```bash
# Download the repository (skip automatic LFS file downloads)
GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/Crilias/Kaleido-14B-S2V

# Enter the repository folder
cd Kaleido-14B-S2V

# Merge the checkpoint files
python merge_kaleido.py
```

Arrange the model files into the following structure:

```text
.
├── Kaleido-14B-S2V
│   ├── model
│   │   └── ....
│   ├── Wan2.1_VAE.pth
│   └── umt5-xxl
│       └── ....
├── configs
├── sat
└── sgm
```

## Citation

If you find our work helpful, please cite our paper:

```bibtex
@misc{zhang2025kaleidoopensourcedmultisubjectreference,
      title={Kaleido: Open-Sourced Multi-Subject Reference Video Generation Model},
      author={Zhenxing Zhang and Jiayan Teng and Zhuoyi Yang and Tiankun Cao and Cheng Wang and Xiaotao Gu and Jie Tang and Dan Guo and Meng Wang},
      year={2025},
      eprint={2510.18573},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2510.18573},
}
```