# GigaChat-20B-A3B-instruct
A dialogue model from the GigaChat family, based on GigaChat-20B-A3B-base. It supports a context of 131 thousand tokens.

This repository contains the instruction-tuned model of the GigaChat family: Efficient Russian Language Modeling Through Mixture of Experts Architecture.

More details are available in the Habr article.

Weights in bf16 and int8 are also available for this model.

Upd. The weights have been re-uploaded in .safetensors format.
## Benchmarks

| | T-lite-instruct-0.1 (Llama 3.0 8B based) | gemma-2-9b-it | GigaChat-20B-A3B-instruct |
|---|---|---|---|
| MERA | 0.335 | 0.392 | 0.513 |
| ru-MMLU 5-shot | 0.555 | 0.625 | 0.598 |
| Shlepa | 0.36 | 0.388 | 0.482 |
## GigaChat family

| | GigaChat-20B-A3B-instruct | GigaChat-Pro v26.20 | GigaChat-Max v26.20 |
|---|---|---|---|
| **Mathematical tasks** | | | |
| GSM8K 5-shot | 0.763 | 0.782 | 0.929 |
| MATH 4-shot | 0.426 | 0.446 | 0.53 |
| **Code writing** | | | |
| HumanEval 0-shot | 0.329 | 0.439 | 0.64 |
| MBPP 0-shot | 0.385 | 0.487 | 0.667 |
| **General knowledge** | | | |
| MMLU EN 5-shot | 0.648 | 0.687 | 0.804 |
| MMLU RU 5-shot (data translated from MMLU EN 5-shot) | 0.598 | 0.645 | 0.75 |
| MMLU RU 1-shot | – | 0.617 | 0.718 |
| MMLU PRO EN 5-shot | 0.348 | 0.431 | 0.589 |
| RUBQ 0-shot | 0.675 | 0.724 | 0.73 |
| WINOGRANDE 4-shot | 0.75 | 0.796 | 0.832 |
| CyberMetric 0-shot | 0.798 | 0.827 | 0.864 |
| **Instruction following** | | | |
| IFEval 0-shot | 0.411 | 0.566 | 0.721 |
## Notes on the measurements

GSM8K is a benchmark that tests how well models solve word problems with numbers. In our evaluation we used 5 shots and took the last number in the model's answer; in the original benchmark the answer is extracted with the pattern "### number".

The MATH benchmark likewise has different versions that test models' mathematical abilities. In our evaluation we gave 4 examples and took the last expression in the format '\boxed{expression}', then checked the results for a match using the sympy library.
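As an illustration, here is a minimal sketch of that extraction-and-comparison logic. The helper names and regular expressions below are our own illustration of the described protocol, not the exact evaluation code:

```python
import re
import sympy

def extract_gsm8k_answer(completion: str):
    """Return the last number in the completion (the GSM8K protocol above)."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion.replace(",", ""))
    return numbers[-1] if numbers else None

def extract_math_answer(completion: str):
    r"""Return the last \boxed{...} expression (the MATH protocol above)."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", completion)
    return matches[-1] if matches else None

def answers_match(prediction: str, reference: str) -> bool:
    """Check symbolic equality of two expressions with sympy."""
    try:
        diff = sympy.simplify(sympy.sympify(prediction) - sympy.sympify(reference))
        return diff == 0
    except (sympy.SympifyError, TypeError):
        # Fall back to exact string comparison if parsing fails
        return prediction.strip() == reference.strip()

print(extract_math_answer(r"The answer is \boxed{1/2}"))  # 1/2
print(answers_match("1/2", "0.5"))                        # True
```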
## Requirements

```
transformers>=4.47
```
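One way to satisfy this requirement is via pip:

```bash
pip install "transformers>=4.47"
```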
## Usage example with transformers

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig

model_name = "ai-sage/GigaChat-20B-A3B-instruct"

# trust_remote_code is required: the model ships custom modeling/tokenization code
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True, torch_dtype=torch.bfloat16, device_map="auto")
model.generation_config = GenerationConfig.from_pretrained(model_name)

messages = [
    {"role": "user", "content": "Покажи теорему о неподвижной точке"}  # "Show the fixed-point theorem"
]

# Let the chat template tokenize the conversation directly
input_tensor = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(input_tensor.to(model.device))

# Decode only the newly generated tokens, skipping the prompt
result = tokenizer.decode(outputs[0][input_tensor.shape[1]:], skip_special_tokens=False)
print(result)
```
## Usage example with vLLM

```python
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

model_name = "ai-sage/GigaChat-20B-A3B-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
llm = LLM(model=model_name, trust_remote_code=True)
sampling_params = SamplingParams(temperature=0.3, max_tokens=8192)

messages_list = [
    [{"role": "user", "content": "Покажи теорему о неподвижной точке"}],  # "Show the fixed-point theorem"
]

# Apply the chat template ourselves and hand vLLM the resulting token ids
prompt_token_ids = [tokenizer.apply_chat_template(messages, add_generation_prompt=True) for messages in messages_list]
outputs = llm.generate(prompt_token_ids=prompt_token_ids, sampling_params=sampling_params)

generated_text = [output.outputs[0].text for output in outputs]
print(generated_text)
```
GigaChat-20B-A3B-instruct uses a special text tokenization scheme, so the following scenario is not recommended:

```python
# Not recommended: rendering the template to a string and re-tokenizing it
input_string = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
input_tensor = tokenizer(input_string, return_tensors="pt")
```
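Instead, let the chat template produce token ids directly, as in the transformers example above:

```python
# Recommended: tokenize through the chat template in one step
input_tensor = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
```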
## Usage example with the vLLM server

Starting the server:

```bash
vllm serve ai-sage/GigaChat-20B-A3B-instruct \
    --disable-log-requests \
    --trust-remote-code \
    --dtype bfloat16 \
    --max-model-len 8192
```
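Once the server is up, you can check that the model is being served; the `/v1/models` endpoint below is part of the OpenAI-compatible API that vLLM exposes:

```bash
curl http://localhost:8000/v1/models
```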
Example request (the Russian system and user prompts mean "You are a VERY smart mathematician" and "Show the fixed-point theorem"):

```bash
curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "ai-sage/GigaChat-20B-A3B-instruct",
        "messages": [
            {"role": "system", "content": "Ты ОЧЕНЬ умный математик"},
            {"role": "user", "content": "Покажи теорему о неподвижной точке"}
        ]
    }'
```
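Since the server exposes an OpenAI-compatible API, the same request can be sent from Python. Below is a minimal sketch assuming the official `openai` client; the `api_key` value is a placeholder, as vLLM does not check it by default:

```python
from openai import OpenAI

# Point the client at the local vLLM server instead of api.openai.com
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="ai-sage/GigaChat-20B-A3B-instruct",
    messages=[
        {"role": "system", "content": "Ты ОЧЕНЬ умный математик"},
        {"role": "user", "content": "Покажи теорему о неподвижной точке"},
    ],
    temperature=0.3,
)
print(response.choices[0].message.content)
```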