CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

This is a Hugging Face Gradio Space that applies textures to images using the Qwen-Image-Edit-2509 model enhanced with custom LoRA adapters. The application takes a content image and a texture image as inputs, then applies the texture to the content based on a text description.

Key Architecture

Pipeline Structure

The application uses a custom diffusion pipeline built on top of Diffusers (a minimal loading sketch follows this list):

  • Base Model: Qwen/Qwen-Image-Edit-2509 with FlowMatchEulerDiscreteScheduler
  • LoRA Adapters: Two LoRAs are fused at startup:
    1. tarn59/apply_texture_qwen_image_edit_2509 - texture application capability
    2. lightx2v/Qwen-Image-Lightning - 4-step inference acceleration
  • Custom Components (in qwenimage/ module):
    • QwenImageEditPlusPipeline - modified pipeline for dual image input
    • QwenImageTransformer2DModel - custom transformer implementation
    • QwenDoubleStreamAttnProcessorFA3 - FlashAttention 3 processor for performance
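
A minimal loading sketch, assuming the custom QwenImageEditPlusPipeline exposes the standard diffusers from_pretrained and LoRA-loading API; the exact arguments (weight filenames, adapter weights) live in app.py:

import torch
from qwenimage.pipeline_qwenimage_edit_plus import QwenImageEditPlusPipeline

# Load the base editing pipeline in bfloat16 (actual loading code is in app.py)
pipe = QwenImageEditPlusPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit-2509", torch_dtype=torch.bfloat16
)

# Load both LoRAs, fuse them into the base weights, then drop the adapter copies
# (the Lightning repo may require a specific weight_name; check app.py for the exact call)
pipe.load_lora_weights(
    "tarn59/apply_texture_qwen_image_edit_2509", adapter_name="texture"
)
pipe.load_lora_weights("lightx2v/Qwen-Image-Lightning", adapter_name="lightning")
pipe.set_adapters(["texture", "lightning"])
pipe.fuse_lora()
pipe.unload_lora_weights()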

Pipeline Initialization Flow

  1. Scheduler configured with exponential time shift and dynamic shifting
  2. Base pipeline loaded with bfloat16 dtype
  3. Both LoRAs loaded and fused (then unloaded to save memory)
  4. Transformer class swapped to custom implementation
  5. FlashAttention 3 processor applied
  6. Pipeline moved to GPU and optimized with AOT compilation
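
Steps 1, 4, 5, and 6 roughly correspond to the calls below. This continues from the pipe created in the loading sketch above; the scheduler kwargs and the class-swap idiom are assumptions based on the diffusers API, not verbatim code from app.py:

from diffusers import FlowMatchEulerDiscreteScheduler
from qwenimage.transformer_qwenimage import QwenImageTransformer2DModel
from qwenimage.qwen_fa3_processor import QwenDoubleStreamAttnProcessorFA3

# Step 1: rebuild the scheduler with exponential time shift and dynamic shifting
pipe.scheduler = FlowMatchEulerDiscreteScheduler.from_config(
    pipe.scheduler.config,
    time_shift_type="exponential",
    use_dynamic_shifting=True,
)

# Step 4: swap the transformer class for the repo's custom implementation
pipe.transformer.__class__ = QwenImageTransformer2DModel

# Step 5: install the FlashAttention 3 attention processor
pipe.transformer.set_attn_processor(QwenDoubleStreamAttnProcessorFA3())

# Step 6: move to GPU; AOT compilation is handled afterwards by optimization.py
pipe.to("cuda")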

Optimization System

optimization.py implements ahead-of-time (AOT) compilation using the Spaces GPU infrastructure (sketched after this list):

  • Exports transformer with torch.export using dynamic shapes for variable sequence lengths
  • Compiles with TorchInductor using aggressive optimizations (max_autotune, cudagraphs, etc.)
  • Applies compiled model back to pipeline transformer
  • Performs a warmup run during initialization with 1024x1024 dummy images
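
A rough sketch of that flow using the ZeroGPU AoT helpers (see Spaces Integration below). The dynamic-shape specs and the exact Inductor flags (max_autotune, cudagraphs, etc.) are configured inside optimization.py and are omitted here:

import spaces
import torch

def optimize_pipeline(pipe, *example_args, **example_kwargs):
    # Capture the exact tensor inputs the transformer receives during one dummy call
    with spaces.aoti_capture(pipe.transformer) as call:
        pipe(*example_args, **example_kwargs)

    # Export the transformer graph; optimization.py additionally passes dynamic_shapes
    # so the compiled graph accepts variable sequence lengths
    exported = torch.export.export(
        pipe.transformer, args=call.args, kwargs=call.kwargs
    )

    # Compile with TorchInductor and swap the compiled artifact back into the pipeline
    compiled = spaces.aoti_compile(exported)
    spaces.aoti_apply(compiled, pipe.transformer)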

Running the Application

Local Development

# Install dependencies
pip install -r requirements.txt

# Run the Gradio app
python app.py

Testing Inference

The main function is apply_texture() in app.py:82. Key parameters (a call sketch follows this list):

  • content_image: PIL Image or file path - the base image
  • texture_image: PIL Image or file path - the texture to apply
  • prompt: Text description (e.g., "Apply wood texture to mug")
  • num_inference_steps: Default 4 (optimized for Lightning LoRA)
  • true_guidance_scale: Default 1.0
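
A usage sketch based only on the parameters listed above; the real signature at app.py:82 may take extra Gradio-specific arguments, and the return format should be checked in the source:

from PIL import Image
from app import apply_texture

content = Image.open("mug.png")          # base image (placeholder filename)
texture = Image.open("wood_grain.png")   # texture to apply (placeholder filename)

result = apply_texture(
    content_image=content,
    texture_image=texture,
    prompt="Apply wood texture to mug",
    num_inference_steps=4,       # Lightning LoRA is tuned for 4 steps
    true_guidance_scale=1.0,
)
result.save("mug_wood.png")              # assumes a single PIL image is returned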

Image Dimension Handling

Output dimensions are calculated from the content image by calculate_dimensions() at app.py:59 (re-implemented in the sketch after this list):

  • Largest side is scaled to 1024px
  • Aspect ratio preserved
  • Both dimensions rounded to multiples of 8 (required by model)
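
A minimal re-implementation of that logic for reference; the actual calculate_dimensions() at app.py:59 may differ in rounding details:

def calculate_dimensions(width: int, height: int, target: int = 1024) -> tuple[int, int]:
    # Scale so the largest side becomes `target`, preserving aspect ratio
    scale = target / max(width, height)
    new_w, new_h = width * scale, height * scale

    # Snap both sides to multiples of 8, as the model requires (floor here; app.py may round)
    return int(new_w) // 8 * 8, int(new_h) // 8 * 8

For example, a 2000x1500 content image maps to 1024x768.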

Important Technical Details

Model Device and Dtype

  • Uses torch.bfloat16 for memory efficiency and H100 compatibility
  • Automatically selects CUDA if available, falls back to CPU
  • Pipeline optimization assumes GPU availability (uses @spaces.GPU decorator)
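
The selection boils down to a few lines (sketch; the real code is in app.py, and pipe refers to the pipeline from the loading sketch above):

import torch

# bfloat16 keeps memory use low and matches H100-class hardware
dtype = torch.bfloat16

# Prefer CUDA when available; CPU works as a fallback but is far slower
device = "cuda" if torch.cuda.is_available() else "cpu"
pipe.to(device)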

Spaces Integration

This app is designed for Hugging Face Spaces with ZeroGPU (decorator sketch after this list):

  • @spaces.GPU decorator on inference function allocates GPU on-demand
  • Optimization uses spaces.aoti_capture(), spaces.aoti_compile(), and spaces.aoti_apply()
  • Compilation happens once at startup with 1500s duration allowance
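
The decorator pattern looks like this (sketch; the function bodies and the exact duration argument live in app.py and optimization.py):

import spaces

# ZeroGPU attaches a GPU only while a decorated function executes
@spaces.GPU
def apply_texture(content_image, texture_image, prompt,
                  num_inference_steps=4, true_guidance_scale=1.0):
    ...  # inference body in app.py

# The one-time startup compilation is given a longer GPU allowance
@spaces.GPU(duration=1500)
def compile_transformer():
    ...  # calls the spaces.aoti_* helpers sketched under Optimization System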

Custom Module Dependencies

The qwenimage/ module contains modified Diffusers components (a quick import check follows this list):

  • Not installed via pip; included directly in the repository
  • Must be kept in sync if updating base Diffusers version
  • Implements dual-image input handling for texture application
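
When bumping the base Diffusers version, a quick way to confirm which copies are in use (illustrative check, not part of the repo):

# Sanity check that the local repo modules (not site-packages) are being imported
import diffusers
import qwenimage.pipeline_qwenimage_edit_plus as local_pipeline

print("diffusers version:", diffusers.__version__)
print("local pipeline module:", local_pipeline.__file__)   # should point inside this repo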

Common Development Commands

# Test the app locally (will download ~10GB of models on first run)
python app.py

# Check dependencies
pip list | grep -E "diffusers|transformers|torch"

# View GPU memory usage during inference (if running on GPU)
nvidia-smi

Key Files

  • app.py - Main Gradio interface and inference logic
  • optimization.py - AOT compilation and quantization utilities
  • qwenimage/pipeline_qwenimage_edit_plus.py - Custom dual-image pipeline
  • qwenimage/transformer_qwenimage.py - Modified transformer model
  • qwenimage/qwen_fa3_processor.py - FlashAttention 3 attention processor
  • requirements.txt - Includes diffusers from GitHub main branch