Instructions for using deepseek-ai/deepseek-coder-6.7b-instruct with libraries, inference providers, notebooks, and local apps.

Libraries

How to use deepseek-ai/deepseek-coder-6.7b-instruct with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="deepseek-ai/deepseek-coder-6.7b-instruct")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result so it is visible when run as a script
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-coder-6.7b-instruct")
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/deepseek-coder-6.7b-instruct")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use deepseek-ai/deepseek-coder-6.7b-instruct with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "deepseek-ai/deepseek-coder-6.7b-instruct"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "deepseek-ai/deepseek-coder-6.7b-instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
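The same request can be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `send_chat_request` are illustrative, and it assumes a server is already listening at the given base URL (port 8000 as in the curl call above) and returns the usual OpenAI-style `choices[0].message.content` response shape.

```python
import json
import urllib.request

MODEL_ID = "deepseek-ai/deepseek-coder-6.7b-instruct"


def build_chat_request(prompt, base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request):
    """Send the request and return the text of the first choice.

    Assumes an OpenAI-compatible response body.
    """
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Usage (requires a running server):
#   req = build_chat_request("What is the capital of France?")
#   print(send_chat_request(req))
```

Building the request separately from sending it makes the payload easy to inspect, and `base_url` can point at any other OpenAI-compatible server on a different port.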

Use Docker

docker model run hf.co/deepseek-ai/deepseek-coder-6.7b-instruct

SGLang

How to use deepseek-ai/deepseek-coder-6.7b-instruct with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "deepseek-ai/deepseek-coder-6.7b-instruct" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "deepseek-ai/deepseek-coder-6.7b-instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "deepseek-ai/deepseek-coder-6.7b-instruct" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "deepseek-ai/deepseek-coder-6.7b-instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use deepseek-ai/deepseek-coder-6.7b-instruct with Docker Model Runner:
```
docker model run hf.co/deepseek-ai/deepseek-coder-6.7b-instruct
```

Confirming the EOS token? 32021 or 32014? Or both?

by TheBloke - opened Nov 5, 2023

Discussion

TheBloke

Nov 5, 2023

I'm having issues with my GGUF quantisations where the model won't stop generating, and generates endless <|EOT|> tokens.

I made the GGUF with special tokens set as per tokenizer_config.json, i.e. EOS is set to token ID 32014.

But in your README I realised you're actually setting it to 32021 for the Instruct models?

# 32021 is the id of <|EOT|> token
outputs = model.generate(inputs, max_new_tokens=512, do_sample=False, top_k=50, top_p=0.95, num_return_sequences=1, eos_token_id=32021)

So I just wanted to double check that for Instruct, EOS should be set to 32021 and that tokenizer_config.json is wrong in this regard?

Is there a reason that tokenizer_config.json and config.json don't have EOS set to 32021, but rather to 32014? Would you consider changing that, or do other aspects of model generation depend on 32014?

Thanks
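A mismatch like the one described in this post can be checked before converting a model by comparing the `eos_token_id` declared in a `config.json`-style file with the id used at generation time. The sketch below is illustrative only: `declared_eos_ids` and `eos_mismatch` are hypothetical helper names, and it assumes the file stores `eos_token_id` as either an integer or a list of integers.

```python
import json


def declared_eos_ids(config_json_text):
    """Return the EOS ids declared in a config.json-style document as a list."""
    value = json.loads(config_json_text).get("eos_token_id")
    if value is None:
        return []
    # Some configs store a single id, others a list of ids.
    return list(value) if isinstance(value, list) else [value]


def eos_mismatch(config_json_text, expected_id):
    """True when the expected EOS id is not among the declared ones."""
    return expected_id not in declared_eos_ids(config_json_text)


# Example with the ids from this thread: the file declares 32014,
# while the README passes 32021 to generate().
sample_config = '{"eos_token_id": 32014}'
print(eos_mismatch(sample_config, 32021))
```

In practice the text would come from reading the model's `config.json` from disk; the inline string here only keeps the example self-contained.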

deepseek-admin

DeepSeek org Nov 5, 2023

For the instruct model, the eos_id is 32021, i.e. the <|EOT|> token. For the base model, the eos_id is 32014, i.e. the <｜end▁of▁sentence｜> token. We will reset the eos_id for the different models. Thanks for pointing it out.
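Until the published config is corrected, a runtime can work around the endless-generation symptom by cutting the output at the stop id for the variant in use. This is a minimal sketch under that assumption: the two ids come from the reply above, while `EOS_ID_BY_VARIANT` and `truncate_at_eos` are hypothetical names, not part of any library.

```python
# EOS ids as stated in this thread; the mapping itself is illustrative.
EOS_ID_BY_VARIANT = {"instruct": 32021, "base": 32014}


def truncate_at_eos(token_ids, variant="instruct", extra_stop_ids=()):
    """Return token_ids up to (not including) the first stop token.

    extra_stop_ids lets a caller also stop on another id, for example
    treating both 32021 and 32014 as end-of-sequence.
    """
    stop_ids = {EOS_ID_BY_VARIANT[variant], *extra_stop_ids}
    for index, token_id in enumerate(token_ids):
        if token_id in stop_ids:
            return list(token_ids[:index])
    # No stop token found: keep the whole sequence.
    return list(token_ids)


# A run of repeated <|EOT|> ids is dropped for the instruct variant.
print(truncate_at_eos([5, 6, 32021, 32021, 32021]))  # [5, 6]
```

With Transformers, a similar effect is available at generation time, since `generate` accepts `eos_token_id` as either a single id or a list of ids.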

TheBloke

Nov 5, 2023

Great, thank you for confirming that quickly.

I will re-make all my Instruct GGUF files once you've been able to update the tokenizer config.

zqh11

DeepSeek org Nov 5, 2023

I have fixed the mistakes in the instruction models. Thanks!

TheBloke

Nov 5, 2023

Thanks very much - but could you update tokenizer_config.json as well? Or I can open a PR if you like.

Chester111 changed discussion status to closed Nov 9, 2023
