Video-Text-to-Text
Transformers
Safetensors
English
llava
text-generation
multimodal
Eval Results (legacy)
Instructions to use lmms-lab/LLaVA-Video-7B-Qwen2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use lmms-lab/LLaVA-Video-7B-Qwen2 with Transformers:
    # Load model directly
    from transformers import AutoProcessor, AutoModelForCausalLM

    processor = AutoProcessor.from_pretrained("lmms-lab/LLaVA-Video-7B-Qwen2")
    model = AutoModelForCausalLM.from_pretrained("lmms-lab/LLaVA-Video-7B-Qwen2")
- Notebooks
- Google Colab
- Kaggle
Force yourself to work hard
#14 opened about 1 year ago by ibugueye
max_frames_num use vram gpu?
#13 opened over 1 year ago by n3xt1lxs
Batch Inference with LLaVA-Video generates '!' for samples with smaller prompt
#12 opened over 1 year ago by rranjan1
Issue with `time_instruction` variable in README.md example
#11 opened over 1 year ago by seongwoncho98
size mismatch for vision_model
#10 opened over 1 year ago by XiaoHangjia (4 comments)
Add more packages to "pip install"
#9 opened over 1 year ago by nogayevnuras
Missing steps
#8 opened over 1 year ago by Martins6 (9 comments)
Difference between 7B-DPO and 7B-Qwen2
#7 opened over 1 year ago by RachelZhou (1 comment)
There are some modules missing like 'einops'
#6 opened over 1 year ago by deleted (4 comments)
Update pipeline tag
#4 opened over 1 year ago by nielsr
Bfloat16 problem
#2 opened over 1 year ago by Aniel99 (1 comment)
Is this a newer/better model than OneVision?
#1 opened over 1 year ago by ehayes-haiper (3 comments)