Flux.1-dev TensorRT-RTX BF16 Ampere
TensorRT-RTX optimized engines for Flux.1-dev on NVIDIA Ampere architecture (RTX 30 series, A100, etc.) with BF16 precision.
Model Details
- Base Model: black-forest-labs/FLUX.1-dev
- Architecture: Ampere (Compute Capability 8.6)
- Precision: BF16 (16-bit brain floating point)
- TensorRT-RTX Version: 1.0.0.21
- Image Resolution: 1024x1024
- Batch Size: 1 (static)
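Because the resolution and batch size are static, a request with any other shape cannot be served by these engines. A minimal sketch of a guard that could run before inference follows; the helper name `check_static_request` and the constants are hypothetical and are not part of the NVIDIA demo code.

```python
# Hypothetical helper: the values mirror the Model Details list above.
ENGINE_HEIGHT = 1024
ENGINE_WIDTH = 1024
ENGINE_BATCH_SIZE = 1


def check_static_request(height: int, width: int, batch_size: int = 1) -> None:
    """Raise ValueError if a request does not match the static engine profile."""
    if (height, width) != (ENGINE_HEIGHT, ENGINE_WIDTH):
        raise ValueError(
            f"engines are built for {ENGINE_HEIGHT}x{ENGINE_WIDTH}, "
            f"got {height}x{width}"
        )
    if batch_size != ENGINE_BATCH_SIZE:
        raise ValueError(
            f"engines are built for batch size {ENGINE_BATCH_SIZE}, "
            f"got {batch_size}"
        )


check_static_request(1024, 1024)  # matches the profile, so nothing is raised
```

Failing early with a clear message is usually easier to debug than a shape mismatch reported from inside the runtime.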
Engine Files
This repository contains 4 TensorRT engine files:
- clip.plan: CLIP text encoder
- t5.plan: T5 text encoder
- transformer.plan: Flux transformer model
- vae.plan: VAE decoder
Total Size: 17.3GB
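After downloading, it can be worth confirming that all four engine files are present before loading them. The sketch below uses only the Python standard library; the file names come from the list above, while the helper names and the idea that all files sit in one flat directory are assumptions.

```python
from pathlib import Path

# File names are taken from the Engine Files list; the flat layout is assumed.
EXPECTED_ENGINES = ("clip.plan", "t5.plan", "transformer.plan", "vae.plan")


def missing_engines(engine_dir: str) -> list[str]:
    """Return the expected engine file names that are absent from engine_dir."""
    root = Path(engine_dir)
    return [name for name in EXPECTED_ENGINES if not (root / name).is_file()]


def total_size_gb(engine_dir: str) -> float:
    """Sum the sizes of the engine files that are present, in decimal GB."""
    root = Path(engine_dir)
    present = [root / name for name in EXPECTED_ENGINES if (root / name).is_file()]
    return sum(path.stat().st_size for path in present) / 1e9
```

For example, `missing_engines("./engines")` returns an empty list when the download is complete, and `total_size_gb("./engines")` can be compared against the total size stated above.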
Hardware Requirements
- NVIDIA RTX 30 series (RTX 3080, 3090) or A100
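- Compute Capability 8.6 (RTX 30 series; the A100 is Compute Capability 8.0, so confirm the engines load on it before relying on that card)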
- Minimum 24GB VRAM recommended
- TensorRT-RTX 1.0.0.21 runtime
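The compute capability and VRAM requirements above can be checked programmatically. The following is a sketch that assumes PyTorch is installed for the GPU query, which this model card does not itself require; the helper names are hypothetical. The comparison logic is kept in a pure function so it can be exercised without a GPU.

```python
REQUIRED_CAPABILITY = (8, 6)  # from the Hardware Requirements list
RECOMMENDED_VRAM_GB = 24      # from the Hardware Requirements list


def meets_requirements(capability: tuple[int, int], total_vram_bytes: int) -> bool:
    """Compare a (major, minor) capability and VRAM size with the requirements.

    VRAM is measured in decimal GB because drivers typically report slightly
    less than a card's nominal capacity.
    """
    vram_gb = total_vram_bytes / 1e9
    return capability == REQUIRED_CAPABILITY and vram_gb >= RECOMMENDED_VRAM_GB


def check_current_gpu(device_index: int = 0) -> bool:
    """Query the local GPU through PyTorch (an assumption) and apply the check."""
    import torch  # imported lazily so meets_requirements works without PyTorch

    if not torch.cuda.is_available():
        return False
    props = torch.cuda.get_device_properties(device_index)
    return meets_requirements((props.major, props.minor), props.total_memory)
```

The check compares the capability for exact equality because the card lists 8.6 specifically; relax it only if the engines are confirmed to load on other devices.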
Usage
# Example usage with TensorRT-RTX backend
from nvidia_demos.TensorRT_RTX.demo.flux1_dev.pipelines.flux_pipeline import FluxPipeline

pipeline = FluxPipeline(
    cache_dir="./cache",
    hf_token="your_hf_token",
)

# Load pre-built engines
pipeline.load_engines(
    transformer_precision="bf16",
    opt_batch_size=1,
    opt_height=1024,
    opt_width=1024,
)

# Generate image
image = pipeline.infer(
    prompt="A beautiful landscape with mountains",
    height=1024,
    width=1024,
)
Performance
- Inference Speed: ~8-12 seconds per image (RTX 3090)
- Memory Usage: ~18-20GB VRAM
- Optimizations: Static shapes, BF16 precision, Ampere-specific kernels
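To compare the inference speed quoted above with a local setup, a simple wall-clock timer is enough. This is a sketch using only the standard library; the helper name `time_calls` is hypothetical, and it assumes the timed callable blocks until the image is fully generated.

```python
import time
from statistics import mean
from typing import Callable


def time_calls(fn: Callable[[], object], warmup: int = 1, repeats: int = 3) -> float:
    """Return the mean wall-clock seconds per call, measured after warm-up runs."""
    for _ in range(warmup):
        fn()  # warm-up calls are excluded from the measurement
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return mean(durations)


# Example, using the pipeline object from the Usage section:
# seconds = time_calls(
#     lambda: pipeline.infer(prompt="A beautiful landscape with mountains",
#                            height=1024, width=1024)
# )
```

The warm-up run matters because the first call often includes one-time setup cost that would otherwise skew the average.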
License
This model follows the Flux.1-dev license terms. Please refer to the original model repository for licensing details.
Built With
- TensorRT-RTX 1.0.0.21
- NVIDIA Flux Demo
- Built on NVIDIA GeForce RTX 3090 (Ampere 8.6)