Instructions to use Yhyu13/LMCocktail-10.7B-v1-function-calling with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use Yhyu13/LMCocktail-10.7B-v1-function-calling with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Yhyu13/LMCocktail-10.7B-v1-function-calling")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Yhyu13/LMCocktail-10.7B-v1-function-calling")
model = AutoModelForCausalLM.from_pretrained("Yhyu13/LMCocktail-10.7B-v1-function-calling", device_map="auto")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use Yhyu13/LMCocktail-10.7B-v1-function-calling with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Yhyu13/LMCocktail-10.7B-v1-function-calling"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Yhyu13/LMCocktail-10.7B-v1-function-calling",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
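The same request can also be issued from Python. This is a minimal sketch using only the standard library, assuming the vLLM server started above is listening on localhost:8000 and returns the usual OpenAI chat-completions response shape. `build_chat_request` and `send_chat_request` are illustrative helper names, not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "Yhyu13/LMCocktail-10.7B-v1-function-calling"


def build_chat_request(prompt, base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the assistant's reply text.

    Assumption: the response follows the OpenAI chat-completions shape,
    i.e. the text lives at choices[0].message.content.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server running, `send_chat_request(build_chat_request("What is the capital of France?"))` returns the reply text. Building and sending are kept separate so the payload can be inspected without a live server.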

Use Docker

docker model run hf.co/Yhyu13/LMCocktail-10.7B-v1-function-calling

SGLang

How to use Yhyu13/LMCocktail-10.7B-v1-function-calling with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Yhyu13/LMCocktail-10.7B-v1-function-calling" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Yhyu13/LMCocktail-10.7B-v1-function-calling",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Yhyu13/LMCocktail-10.7B-v1-function-calling" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Yhyu13/LMCocktail-10.7B-v1-function-calling",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use Yhyu13/LMCocktail-10.7B-v1-function-calling with Docker Model Runner:
```
docker model run hf.co/Yhyu13/LMCocktail-10.7B-v1-function-calling
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

license: apache-2.0

LMCocktail-10.7B function calling

This model merges https://huggingface.co/Yhyu13/LMCocktail-10.7B-v1 with the SFT function-calling LoRA linked here

This model is accompanied by a PR on textgen-webui that enables its function-calling ability in the style of GPTs: Add function calling ability to openai extension

Caution

This model is recommended over my phi-2 variant, since it is better grounded when following function-calling prompts.

This is so far my second-best function-calling model; it succeeds on 9 out of 10 cases from the OpenAI function-calling cookbook.

Also check out my best function-calling model here: https://huggingface.co/Yhyu13/dolphin-2.6-mistral-7b-dpo-laser-function-calling

Detail

The function call is wrapped in a simple XML tag for easier identification.

<functioncall> {\"name\": \"calculate_loan_payment\", \"arguments\": '{\"principal\": 50000, \"interest_rate\": 5, \"loan_term\": 10}'} </functioncall>

The call can be extracted like this:

import re
import json

input_str = "<functioncall> {\"name\": \"calculate_loan_payment\", \"arguments\": '{\"principal\": 50000, \"interest_rate\": 5, \"loan_term\": 10}'} </functioncall>"

# Define the pattern to match the JSON string within the functioncall tags
pattern = r'<functioncall>(.*?)</functioncall>'

# Use re.search to find the matched pattern
match = re.search(pattern, input_str, re.DOTALL)

if match:
    json_str = match.group(1)
    # Remove the single quotes surrounding the inner JSON string, which turns
    # "arguments" into a nested object (this also strips any apostrophes inside values)
    json_str = json_str.replace("'", "")
    
    # Load the JSON string into a Python dictionary
    json_data = json.loads(json_str)
    print(json_data)
else:
    print("No match found.")

Alternatively, if you want to keep the single quotes that wrap the arguments value (OpenAI formats it this way, which makes json.loads fail on the original json_str), use ast.literal_eval instead.

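import ast

# Reuses `match` from the snippet above
if match:
    # Strip surrounding whitespace so the payload is a bare dict literal
    json_str = match.group(1).strip()
    # Parse as a Python literal, so the single-quoted arguments value stays a string. See:
    # https://www.datasciencebyexample.com/2023/03/16/what-to-do-when-single-quotes-in-json-string/
    json_dict = ast.literal_eval(json_str)
    print(json_dict['name'], json_dict['arguments'])
else:
    print("No match found.")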
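The two approaches above can be combined into one helper. This is a sketch rather than the parser used by any particular app: `parse_function_call` is a hypothetical name, and it assumes the payload is either strict JSON or a Python-style dict literal whose arguments value is a single-quoted JSON string. Malformed output raises an exception from `ast.literal_eval` or `json.loads`.

```python
import ast
import json
import re

FUNCTIONCALL_PATTERN = re.compile(r"<functioncall>(.*?)</functioncall>", re.DOTALL)


def parse_function_call(text):
    """Return (name, arguments_dict) for the first <functioncall> block, or None."""
    match = FUNCTIONCALL_PATTERN.search(text)
    if match is None:
        return None
    payload = match.group(1).strip()

    # Try strict JSON first; fall back to a Python literal when the
    # arguments value is wrapped in single quotes.
    try:
        call = json.loads(payload)
    except json.JSONDecodeError:
        call = ast.literal_eval(payload)

    arguments = call.get("arguments", {})
    # The single-quoted form leaves arguments as a JSON string, so decode it.
    if isinstance(arguments, str):
        arguments = json.loads(arguments)
    return call["name"], arguments


sample = (
    '<functioncall> {"name": "calculate_loan_payment", "arguments": '
    '\'{"principal": 50000, "interest_rate": 5, "loan_term": 10}\'} </functioncall>'
)
print(parse_function_call(sample))
# ('calculate_loan_payment', {'principal': 50000, 'interest_rate': 5, 'loan_term': 10})
```

Returning the arguments as a dictionary in both cases means the caller can pass them straight to the target function with `**arguments`.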

Hopefully, this model can be a drop-in replacement for apps (e.g. memgpt) that require function calling ability from LLMs.

A note on interpreting function call results:

The function response is placed between <functionresponse> tags so that it can be identified as a function call result. The call may be evaluated behind the scenes, and its result should in principle be treated as part of the user input; the assistant then processes it to form a conversational response.

<functionresponse> json_str </functionresponse>
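Feeding a result back to the model could look like the sketch below. `format_function_response` is a hypothetical helper, the result value is made up for illustration, and sending the wrapped result under the "user" role is an assumption based on the note above that the result should be treated as part of the user input.

```python
import json


def format_function_response(result):
    """Serialize a function result and wrap it in <functionresponse> tags."""
    return f"<functionresponse> {json.dumps(result)} </functionresponse>"


# Hypothetical result of running the requested function locally.
result = {"monthly_payment": 530.33}

# Assumption: the wrapped result is sent back as the next user turn.
follow_up_message = {"role": "user", "content": format_function_response(result)}
print(follow_up_message["content"])
# <functionresponse> {"monthly_payment": 530.33} </functionresponse>
```

Appending `follow_up_message` to the conversation and generating again lets the assistant turn the raw result into a conversational reply.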

Downloads last month: 12

Safetensors

Model size: 11B params

Tensor type: F16

Model tree for Yhyu13/LMCocktail-10.7B-v1-function-calling

Quantizations: 2 models


Spaces using Yhyu13/LMCocktail-10.7B-v1-function-calling 9