Instructions for using concedo/Phi-SoSerious-Mini-V1 with libraries and local apps.

Libraries

How to use concedo/Phi-SoSerious-Mini-V1 with Transformers:

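# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="concedo/Phi-SoSerious-Mini-V1", trust_remote_code=True)
# Illustrative call; the example prompt follows the Alpaca template this model was trained on
prompt = "### Instruction:\nWrite a two-line poem about the sea.\n\n### Response:\n"
print(pipe(prompt, max_new_tokens=64)[0]["generated_text"])

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("concedo/Phi-SoSerious-Mini-V1", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("concedo/Phi-SoSerious-Mini-V1", trust_remote_code=True)
# Illustrative generation, reusing the example prompt defined above
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))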

Local Apps

vLLM

How to use concedo/Phi-SoSerious-Mini-V1 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "concedo/Phi-SoSerious-Mini-V1"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "concedo/Phi-SoSerious-Mini-V1",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
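
The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_completion_request` and `complete` are hypothetical, and the response shape (`choices[0].text`) is assumed to follow the OpenAI completions format that the server advertises compatibility with.

```python
import json
import urllib.request


def build_completion_request(
    prompt,
    base_url="http://localhost:8000",
    model="concedo/Phi-SoSerious-Mini-V1",
    max_tokens=512,
    temperature=0.5,
):
    """Build (but do not send) a POST request mirroring the curl example."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the first completion's text.

    Requires a server to be running at base_url. Assumes an
    OpenAI-style response body with a "choices" list.
    """
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With a server running, `complete("Once upon a time,")` would return the generated continuation. Passing a different `base_url` points the same helper at another OpenAI-compatible endpoint.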

Use Docker

docker model run hf.co/concedo/Phi-SoSerious-Mini-V1

SGLang

How to use concedo/Phi-SoSerious-Mini-V1 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "concedo/Phi-SoSerious-Mini-V1" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "concedo/Phi-SoSerious-Mini-V1",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "concedo/Phi-SoSerious-Mini-V1" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "concedo/Phi-SoSerious-Mini-V1",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use concedo/Phi-SoSerious-Mini-V1 with Docker Model Runner:
```
docker model run hf.co/concedo/Phi-SoSerious-Mini-V1
```

Phi-SoSerious-Mini-V1

Let's put a smile on that face!

This is a finetune of https://huggingface.co/microsoft/Phi-3-mini-4k-instruct trained on a variant of the Kobble Dataset. Training took under 4 hours on a single Nvidia RTX 3090 GPU using QLoRA (learning rate 1.2e-4, rank 16, alpha 16, batch size 3, gradient accumulation 3, 2048-token context).
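
As a rough sanity check on these hyperparameters, the arithmetic below works out the effective batch size and the LoRA scaling factor. It assumes that the quoted batch size is the per-step micro-batch on the single GPU, and that the adapter update is scaled by alpha / rank as in standard LoRA. Neither assumption is stated explicitly in this card.

```python
# Hyperparameters quoted in the model card.
rank = 16
alpha = 16
micro_batch_size = 3
grad_accum_steps = 3
context_len = 2048
num_gpus = 1  # single RTX 3090

# Sequences contributing to each optimizer step (assumption: batch size is per device).
effective_batch = micro_batch_size * grad_accum_steps * num_gpus

# Upper bound on tokens per optimizer step, if every sequence fills the context.
max_tokens_per_step = effective_batch * context_len

# Standard LoRA scales the low-rank update by alpha / rank (assumed convention).
lora_scale = alpha / rank

print(effective_batch, max_tokens_per_step, lora_scale)  # 9 18432 1.0
```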

You can obtain the GGUF quantization of this model here: https://huggingface.co/concedo/Phi-SoSerious-Mini-V1-GGUF

Dataset and Objectives

The Kobble Dataset is a semi-private aggregated dataset made from multiple online sources and web scrapes, augmented with some synthetic data. It contains content chosen and formatted specifically to work with KoboldAI software and Kobold Lite. The objective of this model was to produce a version of Phi-3-mini that is usable for storywriting, conversations, and instructions, without an excessive tendency to refuse.

Dataset Categories:

Instruct: Single-turn instruct examples presented in the Alpaca format, with an emphasis on uncensored and unrestricted responses.
Chat: Two-participant roleplay conversation logs in the multi-turn raw chat format that KoboldAI uses.
Story: Unstructured fiction excerpts, including literature containing various erotic and provocative content.

Prompt template: Alpaca

### Instruction:
{prompt}

### Response:
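
A minimal sketch of applying this template in Python follows. The helper names `build_alpaca_prompt` and `extract_response` are hypothetical, and the single newline after `### Response:` is an assumption about where generation should begin.

```python
# Alpaca template from the model card; the trailing newline is an assumption.
ALPACA_TEMPLATE = "### Instruction:\n{prompt}\n\n### Response:\n"
RESPONSE_MARKER = "### Response:"


def build_alpaca_prompt(instruction: str) -> str:
    """Wrap a user instruction in the Alpaca instruct template."""
    return ALPACA_TEMPLATE.format(prompt=instruction.strip())


def extract_response(generated: str) -> str:
    """Return the text after the first response marker.

    Useful when the backend echoes the prompt along with the completion.
    If the marker is absent, the input is returned stripped.
    """
    _, sep, tail = generated.partition(RESPONSE_MARKER)
    return (tail if sep else generated).strip()


print(build_alpaca_prompt("Tell me a joke."))
```

The string returned by `build_alpaca_prompt` is what would be sent as the `prompt` to a text-generation backend.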

Note: No assurances will be provided about the origins, safety, or copyright status of this model, or of any content within the Kobble dataset.
If you belong to a country or organization that has strict AI laws or restrictions against unlabelled or unrestricted content, you are advised not to use this model.

Safetensors
Model size: 4B params
Tensor type: F16