Update README.md
README.md
CHANGED
@@ -10,5 +10,99 @@ pinned: false
license: mit
short_description: input text, a video from the past to the future
---

This is a Gradio-based application that generates interpolated images between two concepts using CLIP-guided diffusion with the FLUX model. The sections below describe its key components and functionality.

## English Explanation

### Overview
This application creates a "Time Stream" effect: a series of images that transition smoothly between two states or concepts. For example, it can show a tomato progressing from "fresh" to "rotten", producing a time-lapse-like visualization.

### Key Features

1. **CLIP-Guided Image Generation**
   - Uses the FLUX.1-schnell model with CLIP guidance
   - Finds a latent direction between two concepts using CLIP embeddings
   - Generates intermediate images along this direction

2. **Main Components**
   - **Prompt**: The base description of what to generate
   - **1st/2nd Direction**: The two states to interpolate between (e.g., "Fresh" → "Rotten")
   - **Strength**: Controls how extreme the transformation is
   - **Output**: Creates both an image strip and a looping video

3. **Advanced Options**
   - Number of intermediate images (3-65)
   - CLIP direction iterations (0-400)
   - Inference steps (1-4)
   - Guidance scale (0.1-10.0)
   - Seed control for reproducibility

4. **Output Formats**
   - Individual generated images
   - Image strip showing all transitions
   - Looping video of the transformation
   - Interactive slider to view specific frames
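
The idea of moving along a direction between two concepts can be sketched in a few lines. This README does not show how the app derives its direction from CLIP or which range of offsets it uses, so everything here is an assumption for illustration: the direction is a plain normalized difference of two toy vectors, the offsets run symmetrically from `-strength` to `+strength`, and the function names are hypothetical.

```python
import math


def direction_between(first, second):
    """Unit vector pointing from the first embedding toward the second."""
    diff = [b - a for a, b in zip(first, second)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("the two embeddings are identical; no direction exists")
    return [d / norm for d in diff]


def interpolation_scales(strength, num_images):
    """Evenly spaced offsets from -strength to +strength (assumed range)."""
    if num_images < 2:
        raise ValueError("need at least two images to span a range")
    step = 2.0 * strength / (num_images - 1)
    return [-strength + i * step for i in range(num_images)]


def shifted_embeddings(base, direction, scales):
    """One shifted copy of the base embedding per offset."""
    return [[x + s * d for x, d in zip(base, direction)] for s in scales]


# Toy 3-dimensional stand-ins for real CLIP embeddings.
fresh = [1.0, 0.0, 0.0]
rotten = [0.0, 1.0, 0.0]
prompt = [0.5, 0.5, 1.0]

direction = direction_between(fresh, rotten)
scales = interpolation_scales(strength=2.0, num_images=5)
frames = shifted_embeddings(prompt, direction, scales)
```

With five images the middle frame is the unshifted prompt embedding, and each neighbor moves one equal step toward either concept. In the real app each shifted embedding would condition one image generation; that step is not shown here.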

### Technical Implementation
- Uses the `spaces.GPU` decorator for GPU acceleration
- Uses `AutoencoderTiny` (a compact VAE) for faster processing
- Detects Korean text in the input and warns that it is passed to the model directly, without translation
- Saves images with unique UUID filenames
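
Two of the points above, Korean text detection and UUID-based filenames, need only the standard library. The sketch below is one possible way to do both and is not taken from the app's source: the function names, the Unicode ranges chosen, and the `.png` extension are assumptions.

```python
import uuid


def contains_korean(text):
    """True if any character falls in a Hangul Unicode block."""
    for ch in text:
        code = ord(ch)
        if (0xAC00 <= code <= 0xD7A3          # Hangul syllables
                or 0x1100 <= code <= 0x11FF   # Hangul Jamo
                or 0x3130 <= code <= 0x318F): # Hangul compatibility Jamo
            return True
    return False


def unique_image_name(extension="png"):
    """Random filename based on a version-4 UUID (32 hex characters)."""
    return f"{uuid.uuid4().hex}.{extension}"


for prompt in ["fresh tomato", "신선한 토마토"]:
    if contains_korean(prompt):
        print(f"warning: Korean text is used directly, without translation: {prompt}")
    print(unique_image_name())
```

Random UUID names keep the output files of concurrent requests from overwriting each other.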

### Example Use Cases
- Showing decay/aging processes
- Seasonal changes
- Weather transitions
- Urban development/deterioration
- Any temporal transformation

---

## Korean Explanation

### Overview
This application creates a "Time Stream" effect by generating a series of images that transition smoothly between two different states or concepts. For example, it can produce a time-lapse visualization showing a tomato changing from "fresh" to "rotten".

### Key Features

1. **CLIP-Guided Image Generation**
   - Uses the FLUX.1-schnell model with CLIP guidance
   - Finds a latent direction between two concepts using CLIP embeddings
   - Generates intermediate images along this direction

2. **Main Components**
   - **Prompt**: The base description of the subject to generate
   - **1st/2nd Direction**: The two states to interpolate between (e.g., "Fresh" → "Rotten")
   - **Strength**: Controls how extreme the change is
   - **Output**: Creates both an image strip and a looping video

3. **Advanced Options**
   - Number of intermediate images (3-65)
   - CLIP direction iterations (0-400)
   - Inference steps (1-4)
   - Guidance scale (0.1-10.0)
   - Seed control for reproducibility

4. **Output Formats**
   - Individual generated images
   - Image strip showing all transitions
   - Looping video of the transformation
   - Interactive slider for viewing specific frames

### Technical Implementation
- Uses the `spaces.GPU` decorator for GPU acceleration
- Uses `AutoencoderTiny` for faster processing
- Detects Korean text and shows a warning that it is used directly, without translation
- Saves images with unique UUID filenames

### Usage Examples
- Depicting decay/aging processes
- Seasonal changes
- Weather transitions
- Urban development/decline
- Any temporal change

### Notes
- Korean input is supported, but the model is optimized for English, so results may be limited
- Strength values of 2.5 or higher may be unstable
- More intermediate images give a smoother transition
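
The numeric ranges listed under Advanced Options and the strength note above can be combined into a simple pre-flight check. This is a hypothetical helper, not part of the app: the function name, argument names, and message wording are assumptions, and only the limits themselves come from this README.

```python
def check_settings(num_images, clip_iterations, inference_steps,
                   guidance_scale, strength):
    """Collect range errors and stability warnings for one generation request."""
    messages = []
    if not 3 <= num_images <= 65:
        messages.append("error: number of intermediate images must be 3-65")
    if not 0 <= clip_iterations <= 400:
        messages.append("error: CLIP direction iterations must be 0-400")
    if not 1 <= inference_steps <= 4:
        messages.append("error: inference steps must be 1-4")
    if not 0.1 <= guidance_scale <= 10.0:
        messages.append("error: guidance scale must be 0.1-10.0")
    if strength >= 2.5:
        messages.append("warning: strength of 2.5 or higher may be unstable")
    return messages


# Too few images and a strength at the stability threshold.
for message in check_settings(num_images=2, clip_iterations=300,
                              inference_steps=4, guidance_scale=3.5,
                              strength=2.5):
    print(message)
```

Returning a list of messages rather than raising on the first problem lets a UI show every issue at once.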