Instructions for using vanta-research/wraith-coder-7b with libraries and local apps.

Libraries

How to use vanta-research/wraith-coder-7b with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="vanta-research/wraith-coder-7b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result; a bare pipe(messages) call discards the output when run as a script.
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("vanta-research/wraith-coder-7b")
model = AutoModelForCausalLM.from_pretrained("vanta-research/wraith-coder-7b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
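inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, dropping the prompt and any special tokens.
prompt_length = inputs["input_ids"].shape[-1]
print(tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True))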
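
The pipeline helper above returns a nested structure rather than a plain string. The following is a minimal sketch of pulling out just the reply text. It assumes that, for chat-style input, the result is a list with one dict whose `generated_text` field holds the conversation with the new assistant turn appended; check this against your installed Transformers version. The helper name `last_assistant_reply` is hypothetical and not part of the library.

```python
def last_assistant_reply(pipeline_output):
    """Return the text of the final assistant message in a chat-style pipeline result.

    Assumption: pipeline_output looks like
    [{"generated_text": [{"role": ..., "content": ...}, ...]}].
    """
    generated = pipeline_output[0]["generated_text"]
    # Plain-string prompts yield a plain string, so pass it through unchanged.
    if isinstance(generated, str):
        return generated
    # Walk the conversation backwards to find the most recent assistant turn.
    for message in reversed(generated):
        if message.get("role") == "assistant":
            return message["content"]
    raise ValueError("no assistant message found in pipeline output")
```

With the pipeline defined earlier, `last_assistant_reply(pipe(messages))` would give only the model's answer under the assumption stated above.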

Local Apps

vLLM

How to use vanta-research/wraith-coder-7b with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "vanta-research/wraith-coder-7b"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "vanta-research/wraith-coder-7b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
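
The same request can also be sent from Python. This is a minimal sketch using only the standard library. The helper names `build_chat_request` and `send_chat_request` are hypothetical, and the default `base_url` assumes the vLLM server started above is listening on `http://localhost:8000`. The response parsing assumes the usual OpenAI-compatible shape, with the reply under `choices[0].message.content`.

```python
import json
import urllib.request

MODEL_ID = "vanta-research/wraith-coder-7b"


def build_chat_request(prompt, base_url="http://localhost:8000", model=MODEL_ID):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumed OpenAI-compatible response layout.
    return body["choices"][0]["message"]["content"]
```

With a server running, `send_chat_request(build_chat_request("What is the capital of France?"))` returns the reply as a string. Pass a different `base_url` to target a server on another host or port.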

Use Docker

docker model run hf.co/vanta-research/wraith-coder-7b

SGLang

How to use vanta-research/wraith-coder-7b with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "vanta-research/wraith-coder-7b" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "vanta-research/wraith-coder-7b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "vanta-research/wraith-coder-7b" \
        --host 0.0.0.0 \
        --port 30000

Docker Model Runner
How to use vanta-research/wraith-coder-7b with Docker Model Runner:
```
docker model run hf.co/vanta-research/wraith-coder-7b
```

wraith-coder-7b / LICENSE.md

Tyler Williams

Initial commit: Wraith Coder 7B - Concise code assistant via iterative fine-tuning

cc49567 7 months ago


License

Model License

This model is licensed under the Qwen License Agreement as it is derived from Qwen2.5-Coder-7B-Instruct.

The original Qwen2.5-Coder license permits:

Commercial use
Modification and derivative works
Distribution with attribution

Full license text: https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct/blob/main/LICENSE

Training Data License

Training datasets include:

Apollo V2.3 (various subsets)
Centauri coding datasets
Custom persona and reasoning datasets

Dataset licenses vary by source. Users should review individual dataset licenses for compliance requirements.

Attribution

When using this model, please cite:

@misc{wraith-coder-7b-2024,
  author = {Vanta},
  title = {Wraith Coder 7B: Concise Code Assistant via Iterative Fine-Tuning},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/vanta-research/wraith-coder-7b}}
}

And cite the original Qwen2.5-Coder model:

@misc{qwen2.5-coder-2024,
  title={Qwen2.5-Coder Technical Report}, 
  author={Qwen Team},
  year={2024},
  publisher={Alibaba Cloud},
  howpublished={\url{https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct}}
}

Disclaimer

This model is provided "as is" without warranties of any kind. Users are responsible for:

Validating outputs for production use
Ensuring compliance with applicable laws and regulations
Reviewing generated code for security vulnerabilities
Testing in appropriate environments before deployment

The authors and contributors assume no liability for damages arising from model use.