	---
	library_name: HunyuanImage-2.1
	license: other
	license_name: tencent-hunyuan-community
	license_link: https://github.com/Tencent-Hunyuan/HunyuanImage-2.1/blob/master/LICENSE
	language:
	- en
	- zh
	tags:
	- text-to-image
	- comfyui
	- diffusers
	pipeline_tag: text-to-image
	extra_gated_eu_disallowed: true
	---
	<div align="center">
	<img src="https://cdn-uploads.huggingface.co/production/uploads/63473b59e5c0717e6737b872/5DZez8C7TeFwRn3FcKDix.png" alt="HunyuanImage-2.1 Banner" />
	<h1> HunyuanImage-2.1 fp8 e4m3fn </h1>
	<h2>An Efficient Diffusion Model for High-Resolution (2K) Text-to-Image Generation</h2>
	</div>



	<div align="center">
	<a href="https://github.com/Tencent-Hunyuan/HunyuanImage-2.1" target="_blank"><img src="https://img.shields.io/badge/Code-black.svg?logo=github" height="22px"></a>
	<a href="https://huggingface.co/spaces/tencent/HunyuanImage-2.1" target="_blank">
	<img src="https://img.shields.io/badge/Demo%20Page-blue" height="22px"></a>
	<a href="https://huggingface.co/tencent/HunyuanImage-2.1" target="_blank"><img src="https://img.shields.io/badge/%F0%9F%A4%97%20Models-d96902.svg" height="22px"></a>
	<a href="#" target="_blank"><img src="https://img.shields.io/badge/Report-Coming%20Soon-blue" height="22px"></a>
	<a href="https://hunyuan-promptenhancer.github.io/" target="_blank"><img src="https://img.shields.io/badge/PromptEnhancer-bb8a2e.svg?logo=github" height="22px"></a>
	<a href="https://x.com/TencentHunyuan" target="_blank"><img src="https://img.shields.io/badge/Hunyuan-black.svg?logo=x" height="22px"></a>
	</div>

	---

	## Performance on RTX 5090
	> With the quantized encoder and the quantized base model, HunyuanImage-2.1 typically uses
	> 26–30 GB of VRAM on an NVIDIA RTX 5090, with an average inference time of 16 seconds.
	> Both figures vary with resolution, batch size, and prompt complexity.
	> Users have also reported that it runs on GPUs with 16 GB of VRAM.

	⚠ Important Note:
	The refiner is not yet implemented and cannot be used in ComfyUI.
	The distilled model, however, now works in ComfyUI; the recommended settings are 8 steps with a CFG of 1.5–2.5.

	---

	<p align="center">
	<img src="https://cdn-uploads.huggingface.co/production/uploads/63473b59e5c0717e6737b872/auZ_xmiKPw0QdBYUrTLn-.png" alt="Image1"/>
	</p>
	<p align="center">
	<img src="https://cdn-uploads.huggingface.co/production/uploads/63473b59e5c0717e6737b872/qod1zCPWjzOZSNcOWx49-.png" alt="Image2"/>
	</p>

	![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/63473b59e5c0717e6737b872/drMNYMjvB01RvgZKS6kX6.jpeg)
	![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/63473b59e5c0717e6737b872/uxhsoLKjzJu24eCZh_RQ8.jpeg)

	---
	## Download Quantized Model (FP8 e4m3fn)
	[Download hunyuanimage2.1_fp8_e4m3fn.safetensors](https://huggingface.co/drbaph/HunyuanImage-2.1_fp8/blob/main/hunyuanimage2.1_fp8_e4m3fn.safetensors)
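
The download can also be scripted. The sketch below assumes the `huggingface_hub` package is installed; the repository ID and filename are taken from the link above, while `download_checkpoint` and its `local_dir` argument are illustrative names. For example, `download_checkpoint("models/diffusion_models")` would fetch the file into that (hypothetical) folder. Because the repository metadata marks it as gated, an authenticated Hugging Face session may be required.

```python
from pathlib import Path

# Both values come from the download link in this README.
REPO_ID = "drbaph/HunyuanImage-2.1_fp8"
FILENAME = "hunyuanimage2.1_fp8_e4m3fn.safetensors"


def download_checkpoint(local_dir: str) -> Path:
    """Download the FP8 checkpoint into `local_dir` and return its local path."""
    # Imported lazily so this module still loads where huggingface_hub is absent.
    from huggingface_hub import hf_hub_download

    path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME, local_dir=local_dir)
    return Path(path)
```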

	---
	### Workflow Notes
	- Model: HunyuanImage-2.1
	- Mode: Quantized Encoder + Quantized Base Model
	- VRAM Usage: ~26–30 GB on RTX 5090
	- Resolution Tested: 2K (2048×2048)
	- Frameworks: ComfyUI & Diffusers
	- Optimisations: Works with Patch Sage Attention + Lazycache / TeaCache ✅
	- Distilled Model: ✅ Now works in ComfyUI with 8 steps / 1.5-2.5 CFG
	- Refiner: ❌ Still not implemented, not available in ComfyUI
	- License: [tencent-hunyuan-community](https://github.com/Tencent-Hunyuan/HunyuanImage-2.1/blob/master/LICENSE)
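
As a rough illustration, the recommended distilled-model settings from the notes above (8 steps, CFG between 1.5 and 2.5) can be written down as a small check. This is a sketch only: `check_distilled_settings` and its constants are hypothetical names, not part of ComfyUI or Diffusers, and the values are recommendations rather than hard limits, so the function returns warnings instead of raising.

```python
# Recommended values quoted from the workflow notes; the names are illustrative.
RECOMMENDED_STEPS = 8
RECOMMENDED_CFG_RANGE = (1.5, 2.5)


def check_distilled_settings(steps: int, cfg: float) -> list[str]:
    """Return a warning for each setting that departs from the recommendation."""
    warnings = []
    if steps != RECOMMENDED_STEPS:
        warnings.append(f"steps={steps}, recommended is {RECOMMENDED_STEPS}")
    low, high = RECOMMENDED_CFG_RANGE
    if not low <= cfg <= high:
        warnings.append(f"cfg={cfg}, recommended range is {low}-{high}")
    return warnings


# Settings inside the recommended range produce no warnings.
print(check_distilled_settings(steps=8, cfg=2.0))
# Settings outside it produce one warning per deviating value.
print(check_distilled_settings(steps=20, cfg=3.5))
```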
	---
	<p align="center">
	🚀 Optimized for High-Resolution, Memory-Efficient Text-to-Image Generation
	</p>