---
base_model: unsloth/gpt-oss-20b-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- gpt_oss
license: apache-2.0
language:
- en
new_version: EpistemeAI/VibeCoder-20B-alpha-0.001
---
# Model card

# Test our endpoint
[FriendliAI](https://friendli.ai/suite/WTHFpZnt6oAT/VGDaGrYOXeIm/dedicated-endpoints/depoqch056a4j4a/playground)

# Summary
This is a first-generation vibe-code alpha (preview) LLM. It is optimized to produce both natural-language and code completions directly from loosely structured “vibe coding” prompts. Compared to earlier-generation LLMs, it has lower prompt-engineering overhead and smoother latent-space interpolation, making it easier to guide toward usable code. The following capabilities can be leveraged:
- **Agentic capabilities**: Use the native capabilities of OpenAI's gpt-oss-20b for function calling, web browsing, Python code execution, and Structured Outputs.
- **Harmony format**: This model was trained on the [harmony response format](https://github.com/openai/harmony) and should only be used with that format; it will not work correctly otherwise.

# Vibe-Code LLM

This is a **first-generation vibe-code LLM**.  
It’s optimized to produce both natural-language and code completions directly from loosely structured, *“vibe coding”* prompts.  

Unlike earlier LLMs that demanded rigid prompt engineering, vibe-code interaction lowers the overhead: you can sketch intent, describe functionality in free-form language, or mix pseudo-code with natural text. The model interpolates smoothly in latent space, making it easier to guide toward usable and executable code.  

---

## Key Features

- **Low Prompt-Engineering Overhead**  
  Accepts incomplete or intuitive instructions, reducing the need for explicit formatting or rigid templates.  

- **Latent-Space Interpolation**  
  Transitions fluidly between natural-language reasoning and syntax-aware code generation. Produces semantically coherent code blocks even when the prompt is under-specified.  

- **Multi-Domain Support**  
  Handles a broad range of programming languages and styles: Python, JavaScript, C++, shell scripting, and pseudo-code scaffolding.  

- **Context-Sensitive Completion**  
  Leverages attention mechanisms to maintain coherence across multi-turn coding sessions.  

- **Syntax-Aware Decoding**  
  Biases output distribution toward syntactically valid tokens, improving out-of-the-box executability of code.  

- **Probabilistic Beam & Sampling Controls**  
  Supports temperature scaling, top-k, and nucleus (top-p) sampling to modulate creativity vs. determinism.  

- **Hybrid Text + Code Responses**  
  Generates inline explanations, design rationales, or docstrings alongside code for improved readability and maintainability.  

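To make the sampling controls above concrete, here is a minimal, standard-library-only sketch of how temperature scaling, top-k, and nucleus (top-p) filtering reshape a next-token distribution. It illustrates the general technique only; it is not this model's or Transformers' decoding code, and the function names `filter_distribution` and `sample_token` are hypothetical. In practice you would pass `do_sample=True`, `temperature`, `top_k`, and `top_p` as generation arguments instead.

```python
import math
import random


def filter_distribution(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return {token_index: probability} after temperature, top-k, and top-p filtering.

    Illustrative sketch only; real decoders operate on tensors.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Temperature scaling: values below 1 sharpen the distribution, above 1 flatten it.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank token indices from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-k: keep only the k most likely tokens (0 disables the filter).
    if top_k > 0:
        ranked = ranked[:top_k]
    mass = sum(probs[i] for i in ranked)

    # Top-p: keep the smallest prefix whose renormalized mass reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i] / mass
        if cumulative >= top_p:
            break

    # Renormalize the surviving tokens so they sum to 1.
    norm = sum(probs[i] for i in kept)
    return {i: probs[i] / norm for i in kept}


def sample_token(logits, temperature=1.0, top_k=0, top_p=1.0, rng=random):
    """Draw one token index from the filtered distribution."""
    dist = filter_distribution(logits, temperature, top_k, top_p)
    indices = list(dist)
    weights = [dist[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]
```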
---

## Example Usage

```plaintext
Prompt:  
"make me a fast vibe function that sorts numbers but with a cool twist"

Response:  
- Natural explanation of sorting method  
- Code snippet (e.g., Python quicksort variant)  
- Optional playful commentary to match the vibe  
```

---

## Ideal Applications

- Rapid prototyping & exploratory coding  
- Creative coding workflows with minimal boilerplate  
- Educational contexts where explanation + code matter equally  
- Interactive REPLs, notebooks, or editor assistants that thrive on loose natural-language input  

---

## Limitations

- Not tuned for production-grade formal verification.  
- May require post-processing or linting to ensure strict compliance with project coding standards.  
- Designed for *“fast prototyping vibes”*, not for long-horizon enterprise-scale codebases.  

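As one possible form of the post-processing mentioned above, the sketch below pulls fenced Python blocks out of a model reply and checks that they parse before anything is executed. It uses only the standard library; the helper names `extract_python_blocks` and `check_syntax` are hypothetical, and a syntax check is a cheap first gate, not a substitute for linting or tests.

```python
import ast
import re

# Build the fence marker programmatically so this snippet can itself live in a fenced block.
FENCE = "`" * 3
BLOCK_RE = re.compile(FENCE + r"(\w*)[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_python_blocks(reply):
    """Return the bodies of fenced blocks tagged python/py (or untagged) in a reply."""
    blocks = []
    for match in BLOCK_RE.finditer(reply):
        language, body = match.group(1).lower(), match.group(2)
        if language in ("", "python", "py"):
            blocks.append(body)
    return blocks


def check_syntax(code):
    """Return None if the code parses as Python, else a short error description."""
    try:
        ast.parse(code)
    except SyntaxError as err:
        return f"line {err.lineno}: {err.msg}"
    return None
```

A caller could loop over `extract_python_blocks(reply)` and only hand blocks to a linter or test runner when `check_syntax` returns `None`.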

# Inference examples

## Transformers

You can use this model with Transformers in the same way as `gpt-oss-120b` and `gpt-oss-20b`. If you use the Transformers chat template, it will automatically apply the [harmony response format](https://github.com/openai/harmony). If you use `model.generate` directly, you need to apply the harmony format manually, either through the chat template or with the [openai-harmony](https://github.com/openai/harmony) package.

To get started, install the necessary dependencies to set up your environment:

```
pip install -U transformers kernels torch 
```

For Google Colab (free/Pro):
```
!pip install -q --upgrade torch

!pip install -q transformers triton==3.4 kernels

!pip uninstall -q torchvision torchaudio -y
```

Once set up, you can run the model with the snippet below:

```py
from transformers import pipeline
import torch
model_id = "EpistemeAI/VibeCoder-20B-alpha"
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype="auto",
    device_map="auto",
)
messages = [
    {"role": "user", "content": "Let’s start with the header and navigation for the landing page. Start by creating the top header section for the dashboard. We’ll add the content blocks below afterward."},
]
outputs = pipe(
    messages,
    max_new_tokens=3000,
)
print(outputs[0]["generated_text"][-1])
```
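If you prefer to call `model.generate` directly rather than going through the pipeline, the chat template can be applied by hand. The following is a minimal sketch under a few assumptions: the helper names `build_messages` and `generate_reply` are hypothetical, the model id is the one used in this card, and the default `max_new_tokens` is arbitrary. The Transformers import is deferred so that the message helper can be used without loading the model.

```python
def build_messages(user_prompt, system_prompt=None):
    """Build a chat-style message list that the tokenizer's chat template can consume."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    return messages


def generate_reply(messages, model_id="EpistemeAI/VibeCoder-20B-alpha", max_new_tokens=512):
    """Apply the chat template (harmony format) manually, then call model.generate."""
    # Imported lazily: loading Transformers and the model weights is expensive.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )

    # The chat template renders the messages into the format the model was trained on.
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_tensors="pt",
        return_dict=True,
    ).to(model.device)

    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

For example, `generate_reply(build_messages("Sketch a responsive nav bar in HTML and CSS"))` would return the decoded reply as a string.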

## Amazon SageMaker
```py
import json
import sagemaker
import boto3
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client('iam')
    role = iam.get_role(RoleName='sagemaker_execution_role')['Role']['Arn']

# Hub Model configuration. https://huggingface.co/models
hub = {
    'HF_MODEL_ID': 'EpistemeAI/VibeCoder-20B-alpha',
    'SM_NUM_GPUS': json.dumps(1)
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface", version="3.2.3"),
    env=hub,
    role=role,
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    container_startup_health_check_timeout=300,
)

# send request
predictor.predict({
    "inputs": "Hi, what can you help me with?",
})
```
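The request above sends only `inputs`. The Hugging Face LLM container is based on Text Generation Inference, which also accepts a `parameters` object alongside `inputs`. The sketch below shows one way to build such a payload; the helper name `build_tgi_payload` and the default values are hypothetical, and whether a given container version serves this model correctly is an assumption you should verify on your own endpoint.

```python
import json


def build_tgi_payload(prompt, max_new_tokens=512, temperature=0.7, top_p=0.9, stop=None):
    """Build a request body with generation parameters for a TGI-backed endpoint."""
    parameters = {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,  # enable sampling so temperature and top_p take effect
        "temperature": temperature,
        "top_p": top_p,
    }
    if stop:
        parameters["stop"] = list(stop)  # optional stop sequences
    return {"inputs": prompt, "parameters": parameters}


payload = build_tgi_payload("Write a Python function that reverses a string.", max_new_tokens=256)
print(json.dumps(payload, indent=2))

# With a deployed endpoint (see the snippet above) you could then call:
#   response = predictor.predict(payload)
# and, when finished, remove the endpoint to stop incurring charges:
#   predictor.delete_endpoint()
```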

# Uploaded finetuned model

- **Developed by:** EpistemeAI
- **License:** apache-2.0
- **Finetuned from model:** unsloth/gpt-oss-20b-unsloth-bnb-4bit

This gpt_oss model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)