Instructions to use StentorLabs/Stentor2-12M-Preview with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use StentorLabs/Stentor2-12M-Preview with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="StentorLabs/Stentor2-12M-Preview")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("StentorLabs/Stentor2-12M-Preview")
model = AutoModelForCausalLM.from_pretrained("StentorLabs/Stentor2-12M-Preview")

Local Apps

vLLM

How to use StentorLabs/Stentor2-12M-Preview with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "StentorLabs/Stentor2-12M-Preview"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "StentorLabs/Stentor2-12M-Preview",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
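
The same request can be sent from Python. The following is a minimal sketch using only the standard library; it assumes the server started above is reachable at `http://localhost:8000` and returns an OpenAI-style completion response. The helper names `build_completion_request` and `complete` are illustrative and are not part of vLLM.

```python
import json
import urllib.request


def build_completion_request(prompt, base_url="http://localhost:8000",
                             model="StentorLabs/Stentor2-12M-Preview",
                             max_tokens=512, temperature=0.5):
    """Build a POST request for the server's /v1/completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the first generated completion text."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumption: OpenAI-style responses list results under "choices".
    return body["choices"][0]["text"]


# Example (requires a running server):
# print(complete("Once upon a time,"))
```

To target a server on a different port, such as an SGLang server on port 30000, pass `base_url="http://localhost:30000"`.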

Use Docker

docker model run hf.co/StentorLabs/Stentor2-12M-Preview

SGLang

How to use StentorLabs/Stentor2-12M-Preview with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "StentorLabs/Stentor2-12M-Preview" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "StentorLabs/Stentor2-12M-Preview",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "StentorLabs/Stentor2-12M-Preview" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "StentorLabs/Stentor2-12M-Preview",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use StentorLabs/Stentor2-12M-Preview with Docker Model Runner:
```
docker model run hf.co/StentorLabs/Stentor2-12M-Preview
```

Welcome and Guidelines

by StentorLabs - opened Feb 25

Discussion

StentorLabs

Owner Feb 25

Welcome to the official discussion hub for Stentor-12M!
We are excited to have you here. This space is dedicated to sharing feedback, asking questions, and collaborating on the development and application of the Stentor-12M model. Whether you are using it for research, fine-tuning it for a specific task, or deploying it on edge devices, your input is invaluable.
To ensure this community remains a helpful and productive space for everyone, please follow these guidelines:
🌟 How to Use This Discussion Board
Questions & Support: If you’re having trouble running the model or need help with implementation, please check the Model Card first. If your question hasn't been answered, feel free to start a new thread.
Showcase Your Work: Did you fine-tune Stentor-12M on a unique dataset? Are you using it in a cool project? We’d love to see it! Share your results and links to your spaces or repos.
Feature Requests & Feedback: Stentor-12M is a 12M-parameter model, and we are constantly looking for ways to optimize its performance. Let us know what features or architectural improvements you'd like to see.
📜 Community Guidelines
Be Respectful: Maintain a professional and welcoming tone. We are all here to learn.
Search Before Posting: Before opening a new topic, please use the search bar to see if your question has already been answered.
Provide Context: When reporting a bug or unexpected behavior, please include:
The environment you are using (e.g., Transformers version, hardware).
A minimal code snippet to reproduce the issue.
Expected vs. actual results.
Follow the HF Code of Conduct: We adhere to the Hugging Face Community Code of Conduct.
Report Misuse: If you find any safety concerns or misuse of the model, please use the "Report" button or open a private issue.
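
For the "Provide Context" item above, the environment details can be collected with a short script and pasted into a report. This is only a sketch: the package list (`transformers`, `torch`, `tokenizers`) is an assumption about what is relevant, and the helper names are illustrative. It reports the CPU architecture but not GPU details, which would need to be added by hand.

```python
import platform
from importlib import metadata


def package_version(name):
    """Return the installed version of a package, or a placeholder if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def environment_report(packages=("transformers", "torch", "tokenizers")):
    """Collect interpreter, OS, and package versions as a plain-text report."""
    lines = [
        f"Python: {platform.python_version()}",
        f"Platform: {platform.platform()}",
        f"Machine: {platform.machine()}",
    ]
    # Assumption: these packages are the ones worth listing in a bug report.
    lines += [f"{name}: {package_version(name)}" for name in packages]
    return "\n".join(lines)


print(environment_report())
```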
Thank you for being part of the StentorLabs journey. Let’s build something great together!
