How to use malteos/hermeo-7b with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="malteos/hermeo-7b")
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("malteos/hermeo-7b")
model = AutoModelForCausalLM.from_pretrained("malteos/hermeo-7b")
How to use malteos/hermeo-7b with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "malteos/hermeo-7b"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "malteos/hermeo-7b",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
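The same call can be made from Python. The following is a minimal sketch using only the standard library, assuming the vLLM server from the command above is listening on `localhost:8000`; the helper names `build_completion_request` and `complete` are hypothetical and not part of vLLM.

```python
import json
import urllib.request


def build_completion_request(
    prompt,
    base_url="http://localhost:8000",  # assumed server address, as in the curl call
    model="malteos/hermeo-7b",
    max_tokens=512,
    temperature=0.5,
):
    """Build a POST request for the OpenAI-compatible /v1/completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first completion.

    Requires a running server; raises urllib.error.URLError otherwise.
    """
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With the server running, `complete("Once upon a time,")` should return the generated continuation. For the SGLang server, pass `base_url="http://localhost:30000"` instead.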
How to use malteos/hermeo-7b with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "malteos/hermeo-7b" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "malteos/hermeo-7b",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
# Alternatively, start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "malteos/hermeo-7b" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "malteos/hermeo-7b",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
How to use malteos/hermeo-7b with Docker Model Runner:
docker model run hf.co/malteos/hermeo-7b
Hermes + Leo = Hermeo
A German-English language model merged from DPOpenHermes-7B-v2 and leo-mistral-hessianai-7b-chat using mergekit. Both base models are fine-tuned versions of Mistral-7B-v0.1.
You can use this model directly with a pipeline for text generation. Because generation involves random sampling, we set a seed for reproducibility:
>>> from transformers import pipeline, set_seed
>>> generator = pipeline('text-generation', model='malteos/hermeo-7b')
>>> set_seed(42)
>>> generator("Hallo, Ich bin ein Sprachmodell,", max_length=40, num_return_sequences=1)
[{'generated_text': 'Hallo, Ich bin ein Sprachmodell, das dir bei der Übersetzung von Texten zwischen Deutsch und Englisch helfen kann. Wenn du mir einen Text in Deutsch'}]
The evaluation follows the methodology of the Open LLM Leaderboard.
| German tasks: | MMLU-DE | Hellaswag-DE | ARC-DE | Average |
|---|---|---|---|---|
| Models / Few-shots: | (5 shots) | (10 shots) | (24 shots) | |
| 7B parameters | | | | |
| llama-2-7b | 0.400 | 0.513 | 0.381 | 0.431 |
| leo-hessianai-7b | 0.400 | 0.609 | 0.429 | 0.479 |
| bloom-6b4-clp-german | 0.274 | 0.550 | 0.351 | 0.392 |
| mistral-7b | 0.524 | 0.588 | 0.473 | 0.528 |
| leo-mistral-hessianai-7b | 0.481 | 0.663 | 0.485 | 0.543 |
| leo-mistral-hessianai-7b-chat | 0.458 | 0.617 | 0.465 | 0.513 |
| DPOpenHermes-7B-v2 | 0.517 | 0.603 | 0.515 | 0.545 |
| hermeo-7b (this model) | 0.511 | 0.668 | 0.528 | 0.569 |
| 13B parameters | | | | |
| llama-2-13b | 0.469 | 0.581 | 0.468 | 0.506 |
| leo-hessianai-13b | 0.486 | 0.658 | 0.509 | 0.551 |
| 70B parameters | | | | |
| llama-2-70b | 0.597 | 0.674 | 0.561 | 0.611 |
| leo-hessianai-70b | 0.653 | 0.721 | 0.600 | 0.658 |

| English tasks: | MMLU | Hellaswag | ARC | Average |
|---|---|---|---|---|
| Models / Few-shots: | (5 shots) | (10 shots) | (24 shots) | |
| llama-2-7b | 0.466 | 0.786 | 0.530 | 0.594 |
| leo-hessianai-7b | 0.423 | 0.759 | 0.522 | 0.568 |
| bloom-6b4-clp-german | 0.264 | 0.525 | 0.328 | 0.372 |
| mistral-7b | 0.635 | 0.832 | 0.607 | 0.691 |
| leo-mistral-hessianai-7b | 0.550 | 0.777 | 0.518 | 0.615 |
| hermeo-7b (this model) | 0.601 | 0.821 | 0.620 | 0.681 |
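The Average column in both tables appears to be the unweighted mean of the three task scores, rounded to three decimals. The sketch below recomputes it for a few rows copied from the tables above; the function name `average` is only illustrative.

```python
# Task scores copied from the tables above: (MMLU, Hellaswag, ARC).
german_scores = {
    "leo-mistral-hessianai-7b-chat": (0.458, 0.617, 0.465),
    "DPOpenHermes-7B-v2": (0.517, 0.603, 0.515),
    "hermeo-7b": (0.511, 0.668, 0.528),
}
english_scores = {
    "mistral-7b": (0.635, 0.832, 0.607),
    "hermeo-7b": (0.601, 0.821, 0.620),
}


def average(scores):
    """Unweighted mean of the task scores, rounded to three decimals."""
    return round(sum(scores) / len(scores), 3)


for name, scores in german_scores.items():
    print(f"German  {name}: {average(scores):.3f}")
for name, scores in english_scores.items():
    print(f"English {name}: {average(scores):.3f}")
```

On these rows the recomputed means match the reported averages, for example 0.569 (German) and 0.681 (English) for hermeo-7b.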
Prompt dialogue template (ChatML format):
"""
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
"""
The model input can contain multiple conversation turns between user and assistant, e.g.
<|im_start|>user
{prompt 1}<|im_end|>
<|im_start|>assistant
{reply 1}<|im_end|>
<|im_start|>user
{prompt 2}<|im_end|>
<|im_start|>assistant
(...)
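The template above can be assembled programmatically. This is a minimal sketch, assuming the exact ChatML layout shown above; `build_chatml_prompt` is a hypothetical helper, not part of the model or of Transformers.

```python
def build_chatml_prompt(messages, system_message=None):
    """Format a conversation as a ChatML prompt string.

    messages: list of (role, content) pairs, role being "user" or "assistant".
    The result ends with an open assistant turn for the model to complete.
    """
    parts = []
    if system_message is not None:
        parts.append(f"<|im_start|>system\n{system_message}<|im_end|>\n")
    for role, content in messages:
        if role not in ("user", "assistant"):
            raise ValueError(f"unsupported role: {role!r}")
        parts.append(f"<|im_start|>{role}\n{content}<|im_end|>\n")
    # Leave the assistant turn open so generation continues from here.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = build_chatml_prompt(
    [
        ("user", "Was ist die Hauptstadt von Deutschland?"),
        ("assistant", "Die Hauptstadt von Deutschland ist Berlin."),
        ("user", "And in English, please?"),
    ],
    system_message="You are a helpful bilingual assistant.",
)
print(prompt)
```

The resulting string can be passed as the prompt to the text-generation pipeline or to a completions endpoint. When generating, it may help to stop at `<|im_end|>` so the model does not continue into a further turn.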