CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

This is a Hugging Face Gradio Space that applies textures to images using the Qwen-Image-Edit-2509 model enhanced with custom LoRA adapters. The application takes a content image and a texture image as inputs, then applies the texture to the content based on a text description.

Key Architecture

Pipeline Structure

The application uses a custom diffusion pipeline built on top of Diffusers (a minimal loading sketch follows this list):

  • Base Model: Qwen/Qwen-Image-Edit-2509 with FlowMatchEulerDiscreteScheduler
  • LoRA Adapters: Two LoRAs are fused at startup:
    1. tarn59/apply_texture_qwen_image_edit_2509 - texture application capability
    2. lightx2v/Qwen-Image-Lightning - 4-step inference acceleration
  • Custom Components (in qwenimage/ module):
    • QwenImageEditPlusPipeline - modified pipeline for dual image input
    • QwenImageTransformer2DModel - custom transformer implementation
    • QwenDoubleStreamAttnProcessorFA3 - FlashAttention 3 processor for performance
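
A minimal loading sketch, assuming the custom QwenImageEditPlusPipeline exposes the standard diffusers from_pretrained and LoRA-loading API; the exact arguments (weight filenames, adapter weights) live in app.py:

import torch
from qwenimage.pipeline_qwenimage_edit_plus import QwenImageEditPlusPipeline

# Load the base editing pipeline in bfloat16 (actual loading code is in app.py)
pipe = QwenImageEditPlusPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit-2509", torch_dtype=torch.bfloat16
)

# Load both LoRAs, fuse them into the base weights, then drop the adapter copies
# (the Lightning repo may require a specific weight_name; check app.py for the exact call)
pipe.load_lora_weights(
    "tarn59/apply_texture_qwen_image_edit_2509", adapter_name="texture"
)
pipe.load_lora_weights("lightx2v/Qwen-Image-Lightning", adapter_name="lightning")
pipe.set_adapters(["texture", "lightning"])
pipe.fuse_lora()
pipe.unload_lora_weights()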

Pipeline Initialization Flow

  1. Scheduler configured with exponential time shift and dynamic shifting
  2. Base pipeline loaded with bfloat16 dtype
  3. Both LoRAs loaded and fused (then unloaded to save memory)
  4. Transformer class swapped to custom implementation
  5. FlashAttention 3 processor applied
  6. Pipeline moved to GPU and optimized with AOT compilation
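
Steps 1, 4, 5, and 6 roughly correspond to the calls below. This continues from the pipe created in the loading sketch above; the scheduler kwargs and the class-swap idiom are assumptions based on the diffusers API, not verbatim code from app.py:

from diffusers import FlowMatchEulerDiscreteScheduler
from qwenimage.transformer_qwenimage import QwenImageTransformer2DModel
from qwenimage.qwen_fa3_processor import QwenDoubleStreamAttnProcessorFA3

# Step 1: rebuild the scheduler with exponential time shift and dynamic shifting
pipe.scheduler = FlowMatchEulerDiscreteScheduler.from_config(
    pipe.scheduler.config,
    time_shift_type="exponential",
    use_dynamic_shifting=True,
)

# Step 4: swap the transformer class for the repo's custom implementation
pipe.transformer.__class__ = QwenImageTransformer2DModel

# Step 5: install the FlashAttention 3 attention processor
pipe.transformer.set_attn_processor(QwenDoubleStreamAttnProcessorFA3())

# Step 6: move to GPU; AOT compilation is handled afterwards by optimization.py
pipe.to("cuda")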

Optimization System

optimization.py implements ahead-of-time (AOT) compilation using the Spaces GPU infrastructure (sketched after this list):

  • Exports transformer with torch.export using dynamic shapes for variable sequence lengths
  • Compiles with TorchInductor using aggressive optimizations (max_autotune, cudagraphs, etc.)
  • Applies compiled model back to pipeline transformer
  • Performs a warmup run during initialization with 1024x1024 dummy images
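
A rough sketch of that flow using the ZeroGPU AoT helpers (see Spaces Integration below). The dynamic-shape specs and the exact Inductor flags (max_autotune, cudagraphs, etc.) are configured inside optimization.py and are omitted here:

import spaces
import torch

def optimize_pipeline(pipe, *example_args, **example_kwargs):
    # Capture the exact tensor inputs the transformer receives during one dummy call
    with spaces.aoti_capture(pipe.transformer) as call:
        pipe(*example_args, **example_kwargs)

    # Export the transformer graph; optimization.py additionally passes dynamic_shapes
    # so the compiled graph accepts variable sequence lengths
    exported = torch.export.export(
        pipe.transformer, args=call.args, kwargs=call.kwargs
    )

    # Compile with TorchInductor and swap the compiled artifact back into the pipeline
    compiled = spaces.aoti_compile(exported)
    spaces.aoti_apply(compiled, pipe.transformer)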

Running the Application

Local Development

# Install dependencies
pip install -r requirements.txt

# Run the Gradio app
python app.py

Testing Inference

The main function is apply_texture() in app.py:82. Key parameters (a call sketch follows this list):

  • content_image: PIL Image or file path - the base image
  • texture_image: PIL Image or file path - the texture to apply
  • prompt: Text description (e.g., "Apply wood texture to mug")
  • num_inference_steps: Default 4 (optimized for Lightning LoRA)
  • true_guidance_scale: Default 1.0
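
A usage sketch based only on the parameters listed above; the real signature at app.py:82 may take extra Gradio-specific arguments, and the return format should be checked in the source:

from PIL import Image
from app import apply_texture

content = Image.open("mug.png")          # base image (placeholder filename)
texture = Image.open("wood_grain.png")   # texture to apply (placeholder filename)

result = apply_texture(
    content_image=content,
    texture_image=texture,
    prompt="Apply wood texture to mug",
    num_inference_steps=4,       # Lightning LoRA is tuned for 4 steps
    true_guidance_scale=1.0,
)
result.save("mug_wood.png")              # assumes a single PIL image is returned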

Image Dimension Handling

Output dimensions are calculated from the content image by calculate_dimensions() at app.py:59 (re-implemented in the sketch after this list):

  • Largest side is scaled to 1024px
  • Aspect ratio preserved
  • Both dimensions rounded to multiples of 8 (required by model)
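
A minimal re-implementation of that logic for reference; the actual calculate_dimensions() at app.py:59 may differ in rounding details:

def calculate_dimensions(width: int, height: int, target: int = 1024) -> tuple[int, int]:
    # Scale so the largest side becomes `target`, preserving aspect ratio
    scale = target / max(width, height)
    new_w, new_h = width * scale, height * scale

    # Snap both sides to multiples of 8, as the model requires (floor here; app.py may round)
    return int(new_w) // 8 * 8, int(new_h) // 8 * 8

For example, a 2000x1500 content image maps to 1024x768.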

Important Technical Details

Model Device and Dtype

  • Uses torch.bfloat16 for memory efficiency and H100 compatibility
  • Automatically selects CUDA if available, falls back to CPU
  • Pipeline optimization assumes GPU availability (uses @spaces.GPU decorator)
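
The selection boils down to a few lines (sketch; the real code is in app.py, and pipe refers to the pipeline from the loading sketch above):

import torch

# bfloat16 keeps memory use low and matches H100-class hardware
dtype = torch.bfloat16

# Prefer CUDA when available; CPU works as a fallback but is far slower
device = "cuda" if torch.cuda.is_available() else "cpu"
pipe.to(device)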

Spaces Integration

This app is designed for Hugging Face Spaces with ZeroGPU (decorator sketch after this list):

  • @spaces.GPU decorator on inference function allocates GPU on-demand
  • Optimization uses spaces.aoti_capture(), spaces.aoti_compile(), and spaces.aoti_apply()
  • Compilation happens once at startup with 1500s duration allowance
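
The decorator pattern looks like this (sketch; the function bodies and the exact duration argument live in app.py and optimization.py):

import spaces

# ZeroGPU attaches a GPU only while a decorated function executes
@spaces.GPU
def apply_texture(content_image, texture_image, prompt,
                  num_inference_steps=4, true_guidance_scale=1.0):
    ...  # inference body in app.py

# The one-time startup compilation is given a longer GPU allowance
@spaces.GPU(duration=1500)
def compile_transformer():
    ...  # calls the spaces.aoti_* helpers sketched under Optimization System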

Custom Module Dependencies

The qwenimage/ module contains modified Diffusers components (a quick import check follows this list):

  • Not installed via pip; included directly in the repository
  • Must be kept in sync if updating base Diffusers version
  • Implements dual-image input handling for texture application
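
When bumping the base Diffusers version, a quick way to confirm which copies are in use (illustrative check, not part of the repo):

# Sanity check that the local repo modules (not site-packages) are being imported
import diffusers
import qwenimage.pipeline_qwenimage_edit_plus as local_pipeline

print("diffusers version:", diffusers.__version__)
print("local pipeline module:", local_pipeline.__file__)   # should point inside this repo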

Common Development Commands

# Test the app locally (will download ~10GB of models on first run)
python app.py

# Check dependencies
pip list | grep -E "diffusers|transformers|torch"

# View GPU memory usage during inference (if running on GPU)
nvidia-smi

Key Files

  • app.py - Main Gradio interface and inference logic
  • optimization.py - AOT compilation and quantization utilities
  • qwenimage/pipeline_qwenimage_edit_plus.py - Custom dual-image pipeline
  • qwenimage/transformer_qwenimage.py - Modified transformer model
  • qwenimage/qwen_fa3_processor.py - FlashAttention 3 attention processor
  • requirements.txt - Includes diffusers from GitHub main branch