LongD-CLIP
Retaining Knowledge and Enhancing Long-Text Representations in CLIP through Dual-Teacher Distillation
CVPR 2025
This repository provides resources for our CVPR 2025 paper:
"Retaining Knowledge and Enhancing Long-Text Representations in CLIP through Dual-Teacher Distillation"
Introduction
Our work focuses on improving CLIP's ability to handle long-text inputs while retaining its original knowledge.
We propose a Dual-Teacher Distillation framework that:
- Retains knowledge from the original CLIP, and
- Enhances long-text representations through teacher guidance.
This work extends the research line of Long-CLIP and further advances long-text representation learning in multimodal models.
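As a rough illustration of the two goals listed above, here is a minimal sketch of how a loss with two teachers could be combined. This README does not describe the paper's actual objectives, so everything here is an assumption made for illustration: the function names (`cosine_similarity`, `dual_teacher_loss`), the cosine-distance terms, and the `alpha` weighting are hypothetical and are not taken from the LongD-CLIP code.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def dual_teacher_loss(student_short, student_long,
                      retain_teacher, long_teacher, alpha=0.5):
    """Hypothetical two-term distillation loss (not the paper's formulation).

    retain term:  pulls the student's embedding toward the embedding
                  produced by the original CLIP (knowledge retention).
    enhance term: pulls the student's long-text embedding toward the
                  embedding produced by a long-text teacher.
    alpha:        assumed weight that balances the two terms.
    """
    retain_term = 1.0 - cosine_similarity(student_short, retain_teacher)
    enhance_term = 1.0 - cosine_similarity(student_long, long_teacher)
    return alpha * retain_term + (1.0 - alpha) * enhance_term


# Toy 2-D embeddings: the student matches the retention teacher exactly
# but is orthogonal to the long-text teacher.
loss = dual_teacher_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0])
print(loss)
```

In this toy setup only the second term contributes, which shows how the weight `alpha` would trade off retention against long-text enhancement under these assumptions.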
For implementation details, see also LongD-CLIP.
Resources
- Paper: CVPR 2025 proceedings
- Model Weights: LongD-CLIP on Hugging Face
- Related Codebase: Long-CLIP