---

license: mit
---

This repository serves as the official model zoo for **Let ViT Speak: Generative Language-Image Pre-training**.

## Currently released models

1. Models from fixed low-resolution pretraining:
- GenLIP-L16-224
- GenLIP-So16-224
- GenLIP-g16-224

2. NaViT models:
- GenLIP-L16-NaViT
- GenLIP-So16-NaViT
- GenLIP-g16-NaViT

We use the SigLIP image preprocessor for our fixed low-resolution models (\*-224) and a Qwen2-VL-style image preprocessor for our NaViT models (\*-NaViT).
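
As a minimal sketch of the naming convention above, the preprocessor family can be inferred from the model-name suffix. The function name `preprocessor_family` and the returned labels are hypothetical and used only for illustration; they are not part of the GenLIP codebase.

```python
# Hypothetical helper: maps a released model name to the preprocessor
# family described in this README. Labels are illustrative only.
def preprocessor_family(model_name: str) -> str:
    if model_name.endswith("-NaViT"):
        # NaViT models use a Qwen2-VL-style image preprocessor.
        return "qwen2-vl-style"
    if model_name.endswith("-224"):
        # Fixed low-resolution models use the SigLIP image preprocessor.
        return "siglip"
    raise ValueError(f"Unrecognized GenLIP model name: {model_name!r}")


RELEASED_MODELS = [
    "GenLIP-L16-224",
    "GenLIP-So16-224",
    "GenLIP-g16-224",
    "GenLIP-L16-NaViT",
    "GenLIP-So16-NaViT",
    "GenLIP-g16-NaViT",
]

for name in RELEASED_MODELS:
    print(f"{name} -> {preprocessor_family(name)}")
```

Raising on an unknown suffix, rather than defaulting to one family, avoids silently pairing a model with the wrong preprocessor.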

Pretraining and implementation details can be found in our codebase [[GenLIP](https://github.com/YanFangCS/GenLIP)].