LongD-CLIP

Retaining Knowledge and Enhancing Long-Text Representations in CLIP through Dual-Teacher Distillation

📄 CVPR 2025

This repository provides resources for our CVPR 2025 paper:
"Retaining Knowledge and Enhancing Long-Text Representations in CLIP through Dual-Teacher Distillation"

🔍 Introduction

Our work focuses on improving CLIP’s ability to handle long-text inputs while retaining its original knowledge.
We propose a Dual-Teacher Distillation framework that:

Retains knowledge from the original CLIP, and
Enhances long-text representations through teacher guidance.

This work extends the research line of Long-CLIP and further advances long-text representation learning in multimodal models.
👉 For the implementation, see also LongD-CLIP.
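
To make the two goals above concrete, here is a minimal sketch of how a dual-teacher objective could be combined. It is an illustration under stated assumptions, not the loss used in the paper: the function names (`cosine_distance`, `dual_teacher_loss`), the cosine-distance terms, and the `alpha` weighting are all hypothetical, and embeddings are plain Python lists so the snippet runs without any dependencies.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def dual_teacher_loss(student_short, student_long,
                      retain_teacher, long_teacher, alpha=0.5):
    """Hypothetical combined objective (an assumption, not the paper's loss).

    student_short:  student embedding of a short caption
    student_long:   student embedding of a long caption
    retain_teacher: embedding from the frozen original CLIP (knowledge retention)
    long_teacher:   embedding from a long-text teacher (long-text enhancement)
    alpha:          assumed weight balancing the two terms
    """
    # Term 1: keep the student close to the original CLIP on short text.
    retain_term = cosine_distance(student_short, retain_teacher)
    # Term 2: pull the student toward the long-text teacher on long text.
    enhance_term = cosine_distance(student_long, long_teacher)
    return alpha * retain_term + (1.0 - alpha) * enhance_term


# Toy example: the student matches the long-text teacher but is orthogonal
# to the original-CLIP teacher, so only the retention term contributes.
loss = dual_teacher_loss([1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], alpha=0.3)
```

In this sketch a larger `alpha` favors staying close to the original CLIP, while a smaller one favors the long-text teacher. The actual formulation and weighting are given in the paper and the linked implementation.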

🚀 Resources

Paper: CVPR 2025 proceedings
Model Weights: Hugging Face – LongD-CLIP
Related Codebase: Long-CLIP
