How to create in Ollama??

by iammahadev - opened Dec 16, 2023

Discussion

iammahadev

Dec 16, 2023

2023/12/16 21:21:42 parser.go:62: WARNING: Unknown command: [PAD51186]
2023/12/16 21:21:42 parser.go:62: WARNING: Unknown command: [PAD51187]
2023/12/16 21:21:42 parser.go:62: WARNING: Unknown command: [PAD51188]
2023/12/16 21:21:42 parser.go:62: WARNING: Unknown command: [PAD51189]
2023/12/16 21:21:42 parser.go:62: WARNING: Unknown command: [PAD51190]
2023/12/16 21:21:42 parser.go:62: WARNING: Unknown command: [PAD51191]
2023/12/16 21:21:42 parser.go:62: WARNING: Unknown command: [PAD51192]
2023/12/16 21:21:42 parser.go:62: WARNING: Unknown command: [PAD51193]
2023/12/16 21:21:42 parser.go:62: WARNING: Unknown command: [PAD51194]
2023/12/16 21:21:42 parser.go:62: WARNING: Unknown command: [PAD51195]
2023/12/16 21:21:42 parser.go:62: WARNING: Unknown command: [PAD51196]
2023/12/16 21:21:42 parser.go:62: WARNING: Unknown command: [PAD51197]
2023/12/16 21:21:42 parser.go:62: WARNING: Unknown command: [PAD51198]

Error: no FROM line for the model was specified

This is the error I am getting when I run ollama create phi-2-q4 -f ./phi-2_Q8_0.gguf

Any idea how to create it?
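
A note on the error above: the -f flag of ollama create expects a Modelfile (a small text file), not the GGUF weights themselves. The [PAD…] warnings look like the Modelfile parser reading the binary GGUF as text, which is why it never finds a FROM line. A minimal sketch of generating such a Modelfile follows; the helper name build_modelfile is hypothetical, and the GGUF path is simply the one from the command in the post. This only addresses the FROM error. Whether the model then loads depends on the Ollama build supporting phi-2.

```python
from pathlib import Path


def build_modelfile(gguf_path: str) -> str:
    """Return the text of a minimal Ollama Modelfile for a local GGUF file.

    Hypothetical helper for illustration; not part of Ollama.
    """
    # FROM is the instruction the error message says is missing.
    return f"FROM {gguf_path}\n"


if __name__ == "__main__":
    # Path taken from the command in the post; adjust it to where the file lives.
    modelfile_text = build_modelfile("./phi-2_Q8_0.gguf")
    Path("Modelfile").write_text(modelfile_text, encoding="utf-8")
    print(modelfile_text, end="")
    # Then, in a shell, point -f at the Modelfile instead of the GGUF:
    #   ollama create phi-2-q4 -f ./Modelfile
```

The resulting Modelfile contains a single line, FROM ./phi-2_Q8_0.gguf, and the create command is otherwise unchanged.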

rob-x-ai

Owner Dec 16, 2023 (edited Dec 16, 2023)

Some modifications will be needed, as this isn't merged into llama.cpp yet: https://github.com/mrgraycode/llama.cpp/commit/12cc80cb8975aea3bc9f39d3c9b84f7001ab94c5#diff-150dc86746a90bad4fc2c3334aeb9b5887b3adad3cc1459446717638605348efR6239 but you can fork it.

iammahadev

Dec 16, 2023

Yeah, but it is not working in Ollama.

rob-x-ai

Owner Dec 18, 2023 (edited Dec 18, 2023)

Support has been added to llama.cpp master, so the ball is in Ollama's court now.

https://github.com/ggerganov/llama.cpp/commit/b9e74f9bca5fdf7d0a22ed25e7a9626335fdfa48

namankhator

Dec 19, 2023

LM Studio Beta is updated,
so I am trying it there now!

Thanks!!

namankhator

Dec 19, 2023

Not a good model.

rob-x-ai

Owner Dec 23, 2023

I find it interesting for a model with only 3B parameters that you will soon be able to run anywhere. It won't do math, or you would probably have to implement chain-of-thought in the prompts or use external tools for post-processing.

sebubeck

Dec 29, 2023

@namankhator: thanks for the feedback! Please recall that this is a base completion model, so the format of your question really matters. When you give an instruction, I recommend using the format:

Instruct: YOUR INSTRUCTION
Output:

Moreover, for any kind of reasoning it's useful to add "Let's think step by step", even for easy questions. If you do both of those things, it works for your example.
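
The format above can be sketched as a small helper, for anyone building prompts in code rather than typing them into a UI. The function name format_phi2_prompt is hypothetical, and where exactly the "Let's think step by step" cue goes is an assumption (here it is appended to the instruction); the post does not specify it.

```python
def format_phi2_prompt(instruction: str, step_by_step: bool = False) -> str:
    """Wrap an instruction in the "Instruct: ... / Output:" completion format."""
    instruction = instruction.strip()
    if step_by_step:
        # Assumption: the reasoning cue is appended to the instruction itself.
        instruction = f"{instruction} Let's think step by step."
    # The model is expected to continue the text after "Output:".
    return f"Instruct: {instruction}\nOutput:"


if __name__ == "__main__":
    print(format_phi2_prompt("What is 17 + 25?", step_by_step=True))
```

The returned string is what gets passed to the model as the raw completion prompt, with no extra chat template wrapped around it.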

namankhator

Dec 29, 2023

Hey @sebubeck

Thanks for the recommendations.
I believe Instruct and Output are already set. (attached image from LM Studio)

I tried the prompt you suggested, but it still did not work.

namankhator

Dec 29, 2023

I will try for tasks other than reasoning, and if need be will update.
