flash attention not working with model

by XVII - opened Apr 9, 2025

Apr 9, 2025

If you try to use sentence-transformers with flash_attention_2, you get this error: NameError: name '_flash_supports_window_size' is not defined
If you uncomment lines 49-53 in modeling_qwen.py, everything works fine.

Code to reproduce:

from sentence_transformers import SentenceTransformer
import torch

class InfRetrieverV1Embedder:
    def __init__(self):
        self.model = SentenceTransformer(
            "infly/inf-retriever-v1", 
            trust_remote_code=True,
            device='cuda',
            model_kwargs = {
                'attn_implementation': 'flash_attention_2',
                "torch_dtype": torch.bfloat16
            }
        )
        
        self.embedding_dims = 3584
        self.max_length = 4096
        self.batch_size = 8
        self.model_name =  "inf-retriever-v1"

        self.model.max_seq_length = self.max_length
        
    def encode(self, texts, mode='document'):
        assert mode in ('query', 'document')
        if mode=='document':
            res = self.model.encode(texts, batch_size=self.batch_size)
        else:
            res = self.model.encode(
                    texts, 
                    prompt="You are given code snippet with incomplete line. Retrieve relevant code snippets that help to complete this line.",
                    batch_size=self.batch_size
                )
        return res.tolist()
    
embedder = InfRetrieverV1Embedder()

load = ['def hello_world'*10000] * 256
embedder.encode(load)

Versions: transformers 4.49.0, flash-attn 2.7.1.post1, sentence-transformers 3.4.1
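
For anyone comparing environments, here is a small sketch that prints the installed versions of the packages involved. It uses only the standard library; the helper name installed_versions is made up for illustration, and the distribution names are assumed to be the usual pip names.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Package is not installed in this environment.
            found[name] = None
    return found


# Assumed pip distribution names for the three libraries mentioned above.
print(installed_versions(["transformers", "flash-attn", "sentence-transformers"]))
```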

XVII

Apr 9, 2025

Is it safe to modify this code, or have you run into some hidden consequences of using flash attention?

SamuelYang

inftech.ai org Apr 10, 2025

Is it safe to modify this code, or have you run into some hidden consequences of using flash attention?

We commented out lines 49-53 just for convenience, to remove the dependency on flash_attn. You can safely uncomment those lines.
