How to use Qwen/Qwen-Image-Edit with Diffusers:
```
pip install -U diffusers transformers accelerate
```

```python
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import load_image

# switch to "mps" for apple devices
pipe = DiffusionPipeline.from_pretrained("Qwen/Qwen-Image-Edit", dtype=torch.bfloat16, device_map="cuda")

prompt = "Turn this cat into a dog"
input_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png")

image = pipe(image=input_image, prompt=prompt).images[0]
```
Releasing FP8 & FP16 Models
First of all, thank you for the open-source models. Qwen is bringing huge growth to the open-source development of LLMs and now image generation.
I hope in the future there will also be FP8 and FP16 models released for lower-end GPUs with only 8–16 GB of VRAM.
It would be great to have multiple models, such as one focused on realism and another on animation, similar to the fine-tuned models of SDXL and SD 1.5.
Since these large models are mostly practical for enterprises but very difficult for personal or retail users, smaller optimized versions would be a big help.
Again, thank you for the superb model.
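To put the 8–16 GB request in perspective, here is a rough back-of-the-envelope sketch of how much memory the weights alone need at different precisions. The 20-billion parameter count is an assumption made for illustration, not a figure from this thread, and the estimate ignores the text encoder, the VAE, activations, and the small overhead of quantization constants.

```python
# Approximate bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "fp16": 2.0, "fp8": 1.0, "nf4": 0.5}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Return the memory needed for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


# ASSUMPTION: 20e9 parameters, used only to illustrate the arithmetic.
ASSUMED_PARAMS = 20e9

for dtype in ("bf16", "fp16", "fp8", "nf4"):
    print(f"{dtype}: {weight_memory_gib(ASSUMED_PARAMS, dtype):.1f} GiB")
```

Under that assumption, halving the bytes per parameter halves the footprint, which is why FP8 and 4-bit variants matter for consumer cards.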
It can be converted directly through Diffusers, right?
https://huggingface.co/Qwen/Qwen-Image-Edit/discussions/6#68a39afdf4aa9e784e43afc0
In the process of finding out right now.
Will let you know.
The downloads are killing me, softly.
Why would you use FP16 instead of BF16, though?
If your GPU doesn't support BF16, I don't think you could even run this.
Wait for an FP8 scaled model from Kijai (smart scaling is way better than a naive truncated FP8).
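The difference between a scaled and a naively cast FP8 model can be illustrated with a small simulation. The sketch below is a simplified pure-Python model of the FP8 e4m3 number grid (3 mantissa bits, maximum 448), not Kijai's conversion code. The function names and the synthetic Gaussian weights with a standard deviation of 0.005 are assumptions chosen for illustration. "Scaled" here means simple per-tensor scaling: multiply by a factor so the largest weight lands on 448, round, then divide the factor back out.

```python
import math
import random

FP8_E4M3_MAX = 448.0   # largest finite value of the e4m3 format
MIN_NORMAL_EXP = -6    # exponent of the smallest normal number
MANTISSA_BITS = 3


def fake_fp8_e4m3(x: float) -> float:
    """Round x to the nearest value on a simulated FP8 e4m3 grid."""
    if x == 0.0:
        return 0.0
    x = max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, x))      # saturate out-of-range values
    _, e = math.frexp(abs(x))                         # abs(x) = m * 2**e with 0.5 <= m < 1
    exponent = max(e - 1, MIN_NORMAL_EXP)             # below the normal range the grid stops refining
    step = 2.0 ** (exponent - MANTISSA_BITS)          # spacing between representable values
    return round(x / step) * step


def naive_cast(weights):
    """Cast each weight directly, with no rescaling."""
    return [fake_fp8_e4m3(w) for w in weights]


def scaled_cast(weights):
    """Per-tensor scaling: stretch the weights across the FP8 range first."""
    amax = max(abs(w) for w in weights)
    if amax == 0.0:
        return list(weights)
    scale = FP8_E4M3_MAX / amax
    return [fake_fp8_e4m3(w * scale) / scale for w in weights]


def mean_abs_error(reference, approx):
    return sum(abs(r - a) for r, a in zip(reference, approx)) / len(reference)


random.seed(0)
# ASSUMPTION: synthetic small-magnitude weights, not real model weights.
weights = [random.gauss(0.0, 0.005) for _ in range(10_000)]

naive_err = mean_abs_error(weights, naive_cast(weights))
scaled_err = mean_abs_error(weights, scaled_cast(weights))
print(f"naive cast error:  {naive_err:.6f}")
print(f"scaled cast error: {scaled_err:.6f}")
```

Small weights fall into the coarse low end of the FP8 grid when cast directly, while scaling moves them into the range where the format has its full 3 bits of mantissa. The scale factor has to be stored alongside the tensor and applied again at inference time.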
Both the bitsandbytes code and torchao code are now functional.
They can be found here:
bitsandbytes: ~17GB VRAM
https://huggingface.co/Qwen/Qwen-Image-Edit/discussions/6#68a3f2b63a24e2df78974f5d
torchao: ~23GB VRAM
https://huggingface.co/Qwen/Qwen-Image-Edit/discussions/6#68a4013ec45c7fbadef91472
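For readers who have not used quantized loading in Diffusers before, the general shape of a bitsandbytes setup looks roughly like the sketch below. This is not the code from the linked posts; it is a minimal sketch assuming a recent Diffusers release that ships `QwenImageEditPipeline` and `QwenImageTransformer2DModel`, with bitsandbytes installed. The helper name `load_nf4_pipeline` is hypothetical, and this configuration is not guaranteed to reproduce the VRAM figures quoted above.

```python
def load_nf4_pipeline(model_id: str = "Qwen/Qwen-Image-Edit"):
    """Load the pipeline with a 4-bit (NF4) quantized transformer.

    Hypothetical helper for illustration. Imports live inside the function so
    the sketch can be read and imported without the heavy dependencies.
    """
    import torch
    from diffusers import (
        BitsAndBytesConfig,
        QwenImageEditPipeline,
        QwenImageTransformer2DModel,
    )

    # 4-bit NF4 storage, with matmuls computed in bfloat16.
    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    # Quantize only the diffusion transformer, the largest component.
    transformer = QwenImageTransformer2DModel.from_pretrained(
        model_id,
        subfolder="transformer",
        quantization_config=quant_config,
        torch_dtype=torch.bfloat16,
    )

    pipe = QwenImageEditPipeline.from_pretrained(
        model_id,
        transformer=transformer,
        torch_dtype=torch.bfloat16,
    )
    # Keep idle components on the CPU to lower peak VRAM use.
    pipe.enable_model_cpu_offload()
    return pipe
```

Calling `load_nf4_pipeline()` downloads the full checkpoint and needs a CUDA GPU; the returned pipeline is then used like the unquantized one.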
NielsGx: There's a "fast fp16_accumulation" that makes FP16 faster on some (nvidia, as far as I know) cards. Shows up as "fp16_fast" I believe, in some ComfyUI nodes. So that'd be >A< reason.
Found the reference, from the Kijai Wan 2.1 T2V workflow: "fp_16_fast enables 'Full FP16 Accumulation in FP16 GEMMs' feature available in the very latest pytorch nightly, this is around 20% speed boost."
So that's >A< reason, if you've got vram to burn.
Why would you use FP16 instead of BF16, though?
If your GPU doesn't support BF16, I don't think you could even run this.
Wait for an FP8 scaled model from Kijai (smart scaling is way better than a naive truncated FP8).
Well, there are a ton of cheap Tesla V100s with 16 GB or 32 GB of HBM that don't support BF16, and they support proper NVLink.
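The V100 case comes down to compute capability: native BF16 arrived with NVIDIA's Ampere generation (compute capability 8.0), while the V100 is Volta (7.0). A minimal sketch of choosing a half-precision dtype from that number follows; the function name `preferred_half_dtype` is hypothetical.

```python
def preferred_half_dtype(compute_capability: tuple[int, int]) -> str:
    """Pick a 16-bit dtype name from an NVIDIA compute capability.

    BF16 is natively supported from compute capability 8.0 (Ampere) onward;
    older cards such as the V100 (7.0) fall back to FP16.
    """
    major, _minor = compute_capability
    return "bfloat16" if major >= 8 else "float16"


print("V100 (7.0):", preferred_half_dtype((7, 0)))
print("Ampere card (8.6):", preferred_half_dtype((8, 6)))
```

On a real machine, `torch.cuda.get_device_capability()` returns this `(major, minor)` tuple for the current GPU.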