Instructions to use deepseek-ai/DeepSeek-R1-Distill-Qwen-32B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use deepseek-ai/DeepSeek-R1-Distill-Qwen-32B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="deepseek-ai/DeepSeek-R1-Distill-Qwen-32B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Qwen-32B")
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Qwen-32B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use deepseek-ai/DeepSeek-R1-Distill-Qwen-32B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B

SGLang

How to use deepseek-ai/DeepSeek-R1-Distill-Qwen-32B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use deepseek-ai/DeepSeek-R1-Distill-Qwen-32B with Docker Model Runner:
```
docker model run hf.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
```

Garbage characters generated when using 32B

by carlosbdw - opened Jan 21, 2025

Discussion

carlosbdw

Jan 21, 2025

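When drafting the specific lines, I decided to start from the large model's broad knowledge. For example, “瀚海犇 lượt， ככל諾計算” reflects the large model's ability to process massive amounts of data. Next, “百亿餘鏈結” shows its complex network structure.

Next, I depict the large model's process of learning and evolution. “自學自演進，逝者如斯” expresses how the large model keeps optimizing itself through continuous learning. Then “古今知性， unterschiedlicher 認知” emphasizes its wisdom spanning time and space.

Then I use “ PURPOSE 判斷，.Structure 構建” to show the large model's abilities in judging purpose and building structure. After that, “.module 模塊，.innovation 创新” further details its functional modules and its innovations.

Finally, I end with “大模型，.model 模型，不分你我”, emphasizing the close relationship between large models and humans and the vision of developing together.

After finishing the first draft, I read it through to make sure the wording flows and fits the rhythm and meter of "Qinyuanchun" (沁园春). I also checked whether the word choices were appropriate and accurately conveyed the large model's characteristics and strengths.

Through this thought process, I hope this “沁园春·大模型” (Qinyuanchun: Large Model) both reflects the brilliance of modern technology and retains the classical beauty of Song-dynasty ci poetry, meeting the user's requirements.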

瀚海犇 lượt，盡數計算，百亿餘鏈結。自學自演進，逝者如斯，古今知性， unterschiedlicher 認知。PURPOSE 判斷，STRUCTURE 構建，MODULE 模塊，INNOVATION 创新，大模型，MODEL 模型，不分你我！

滄海橫流，問.isAuthenticated Woody, 雄才何處？.scssViewInitState Matilde HELP <VisuallyUiautomationElementInformation_Id_<c_bag thawed;qzws_of▋ mulher

pays_limit_items 思想_join_anon conect—all—resposta 幻影瞬間，思如海，智似星漢。

望數計算，DeepThink運算，,function Component 決策 rapidly。多少遴選，_LAYERestate為民，digits managers Unique)<組合 >.< Ad Hoc_override Стаڵểuفاق 。

見賢思齊， KendiCore.Silver clam subspace Colorectal surgery 一體 NOW > searchingCOLLAPSIVITY.pWeb any < Зараздаidentifier ， erratic future>

雄才偉略，<M lượcDIGITAL THINKING>。

dongdongunique

Jan 23, 2025

I have encountered this problem as well.

U1PNS

Jan 23, 2025

edited Jan 23, 2025

Same thing with the 14B size.

MultEase

Jan 24, 2025

Please try lowering the temperature for better results.
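
To illustrate why a lower temperature might help with stray tokens like the ones above, here is a minimal sketch in plain Python. It is not the model's actual sampler, and the logits are made-up values chosen only for illustration; it just shows how dividing logits by a smaller temperature shifts probability mass away from unlikely tokens.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw logits to sampling probabilities at a given temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits (an assumption, not taken from the model):
# one sensible token followed by three unlikely ones.
logits = [5.0, 2.0, 1.0, 0.5]

for t in (1.0, 0.6, 0.3):
    probs = softmax_with_temperature(logits, t)
    tail_mass = sum(probs[1:])  # chance of sampling any unlikely token
    print(f"temperature={t}: top={probs[0]:.4f} tail={tail_mass:.4f}")
```

In practice the temperature is set on the inference side: with Transformers it can be passed to `model.generate` together with `do_sample=True`, and OpenAI-compatible servers such as the vLLM and SGLang ones accept a `"temperature"` field in the request body. Which value works best for this model is not stated in this thread.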
