---
license: mit
---
This repository serves as the official model zoo for **Let ViT Speak: Generative Language-Image Pre-training**.
## Currently released models
1. Models from fixed low-resolution pretraining:
- GenLIP-L16-224
- GenLIP-So16-224
- GenLIP-g16-224
2. NaViT models:
- GenLIP-L16-NaViT
- GenLIP-So16-NaViT
- GenLIP-g16-NaViT
We use the SigLIP image preprocessor for our fixed low-resolution models (\*-224) and a Qwen2-VL-style image preprocessor for our NaViT models (\*-NaViT).
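As a quick illustration of the naming rule above, the following minimal sketch maps each released model name to the preprocessor family it expects. The helper `preprocessor_family` and the labels `"siglip"` and `"qwen2-vl"` are hypothetical names chosen for this example; they are not part of the GenLIP codebase, and the actual preprocessor classes should be taken from the codebase itself.

```python
# Hypothetical helper: not part of the GenLIP codebase.
# It only encodes the naming convention stated in this README.

RELEASED_MODELS = [
    "GenLIP-L16-224",
    "GenLIP-So16-224",
    "GenLIP-g16-224",
    "GenLIP-L16-NaViT",
    "GenLIP-So16-NaViT",
    "GenLIP-g16-NaViT",
]


def preprocessor_family(model_name: str) -> str:
    """Return the image-preprocessor family implied by the model name suffix."""
    if model_name.endswith("-NaViT"):
        # NaViT models use a Qwen2-VL-style image preprocessor.
        return "qwen2-vl"
    if model_name.endswith("-224"):
        # Fixed low-resolution models use the SigLIP image preprocessor.
        return "siglip"
    raise ValueError(f"Unrecognized GenLIP model name: {model_name!r}")


# Lookup table for all released models (labels are illustrative only).
PREPROCESSOR_BY_MODEL = {name: preprocessor_family(name) for name in RELEASED_MODELS}

if __name__ == "__main__":
    for name, family in PREPROCESSOR_BY_MODEL.items():
        print(f"{name}: {family}")
```

Raising on an unknown suffix, rather than falling back to a default, keeps a mistyped model name from silently pairing with the wrong preprocessor.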
Pretraining and implementation details can be found in our codebase [[GenLIP](https://github.com/YanFangCS/GenLIP)].