Impressive achievement.
Is this one epoch of the 3 million GPT-3.5 examples?
Can you please share the command line and configuration files you used to train this? Hyperparameters?
What tool did you use?
What prompt format?
Did you include the system prompts in the training or just the questions and answers?
As Orca is 4 epochs of GPT-3.5 and 4 epochs of GPT-4, this isn't enough, but it's a good start, and it's possible I could train the rest starting from this model instead of starting from scratch.
I used this
SYSTEM: [system prompt]
USER: [prompt]
ASSISTANT:
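For readers who want to reproduce that format in code, here is a minimal sketch of filling the template. The function name `build_prompt` is hypothetical, and the exact whitespace (a single newline between fields, nothing after `ASSISTANT:`) is an assumption; the thread only shows the three labels, so check the model card before relying on it.

```python
def build_prompt(user_prompt: str, system_prompt: str = "") -> str:
    """Fill the SYSTEM / USER / ASSISTANT template quoted above.

    Assumption: fields are separated by single newlines and the string
    ends right after "ASSISTANT:" so the model writes the reply.
    """
    return (
        f"SYSTEM: {system_prompt}\n"
        f"USER: {user_prompt}\n"
        "ASSISTANT:"
    )


if __name__ == "__main__":
    # Example inputs are placeholders, not taken from the training data.
    print(build_prompt("What is 2 + 2?", "You are a helpful assistant."))
```

The resulting string can be passed as the prompt to whatever generation call you use.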
Hey Eric, thanks for the dataset. Sure, will update everything in the model card.
I used MosaicML's llm-foundry for finetuning, and this is the config YAML I used: https://huggingface.co/manojpreveen/mpt-30b-orca-v2/blob/main/mpt-30b_orca.yaml
My effort is called Dolphin, and we haven't collaborated, so there are differences here.
I.e., I use the Microsoft recipe of 4 epochs of GPT-3.5 and 4 epochs of GPT-4, and the prompt template is like:
SYSTEM: <system prompt>
USER: <prompt>
ASSISTANT:
Could you please name it uniquely? For example, you could call it porpoise, or manojdolphin, or something? Otherwise people will be confused. I intend to release, and am currently working on, an official full dolphin-mpt-30b with all epochs and with the same prompt format as all the other dolphin releases.
Or, if you prefer not to rename it, it would at least be appreciated if you mentioned "this model is not part of ehartford's dolphin series of models, but it's derived from ehartford's dolphin dataset".
Cool, no worries.
Much appreciated