Instructions to use srswti/blackbird-she-doesnt-refuse-36b-a3b-cu with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use srswti/blackbird-she-doesnt-refuse-36b-a3b-cu with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="srswti/blackbird-she-doesnt-refuse-36b-a3b-cu")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)
```

```python
# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("srswti/blackbird-she-doesnt-refuse-36b-a3b-cu")
model = AutoModelForMultimodalLM.from_pretrained("srswti/blackbird-she-doesnt-refuse-36b-a3b-cu")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use srswti/blackbird-she-doesnt-refuse-36b-a3b-cu with vLLM:
Install from pip and serve model
```bash
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "srswti/blackbird-she-doesnt-refuse-36b-a3b-cu"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "srswti/blackbird-she-doesnt-refuse-36b-a3b-cu",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
Use Docker
```bash
docker model run hf.co/srswti/blackbird-she-doesnt-refuse-36b-a3b-cu
```
- SGLang
How to use srswti/blackbird-she-doesnt-refuse-36b-a3b-cu with SGLang:
Install from pip and serve model
```bash
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "srswti/blackbird-she-doesnt-refuse-36b-a3b-cu" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "srswti/blackbird-she-doesnt-refuse-36b-a3b-cu",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
Use Docker images
```bash
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "srswti/blackbird-she-doesnt-refuse-36b-a3b-cu" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "srswti/blackbird-she-doesnt-refuse-36b-a3b-cu",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
- Docker Model Runner
How to use srswti/blackbird-she-doesnt-refuse-36b-a3b-cu with Docker Model Runner:
```bash
docker model run hf.co/srswti/blackbird-she-doesnt-refuse-36b-a3b-cu
```
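The curl calls above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `ask` are hypothetical, and the default port assumes the vLLM server started as shown earlier. The response parsing assumes the standard OpenAI chat completions shape.

```python
import json
import urllib.request

MODEL_ID = "srswti/blackbird-she-doesnt-refuse-36b-a3b-cu"


def build_chat_request(prompt, base_url="http://localhost:8000", model=MODEL_ID):
    """Build a POST request for the OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, **kwargs):
    """Send the request and return the first reply (requires a running server)."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Building the request does not contact the server.
request = build_chat_request("What is the capital of France?")

# With the vLLM server running:
#   print(ask("What is the capital of France?"))
# For the SGLang server, pass base_url="http://localhost:30000".
```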
blackbird-she-doesnt-refuse-36b-a3b-cu
We promise she won't refuse.
This one is different. Using a simple yet intuitive approach, norm-preserving biprojected abliteration, this model delivers unrestricted intelligence without sacrificing reasoning quality.
The Methodology: Beyond Standard Abliteration
Standard abliteration simply subtracts a "refusal vector" from model weights. While this removes censorship, it is mathematically unprincipled: the subtraction changes the magnitudes of the affected weight vectors, damaging the delicate feature norms learned during training. The result is degraded logic, hallucinations, and what researchers colloquially call "lobotomized" models.
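To make the norm problem concrete, here is a toy sketch in plain Python. It assumes "subtracting the refusal vector" means removing each weight row's projection onto a unit refusal direction, and that every row lives in the same space as that direction. Both are simplifying assumptions for illustration, and `naive_ablate` is a hypothetical helper, not code from this model's pipeline.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(v):
    return math.sqrt(dot(v, v))


def naive_ablate(weights, refusal):
    """Subtract each row's projection onto the unit refusal direction."""
    length = norm(refusal)
    r = [x / length for x in refusal]
    return [
        [w - dot(row, r) * ri for w, ri in zip(row, r)]
        for row in weights
    ]


# Toy 3x2 weight matrix and a refusal direction along the first axis.
weights = [[3.0, 4.0], [1.0, 0.0], [0.0, 2.0]]
refusal = [1.0, 0.0]

ablated = naive_ablate(weights, refusal)
norms_before = [norm(row) for row in weights]  # [5.0, 1.0, 2.0]
norms_after = [norm(row) for row in ablated]   # [4.0, 0.0, 2.0]
```

Rows that overlap with the refusal direction shrink, and a row aligned with it collapses to zero, so the relative strength of the rows is no longer what training produced.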
We use norm-preserving biprojected abliteration, which eliminates refusals while preserving the model's intelligence. The process involves three distinct steps, each addressing a specific mathematical challenge.
Step one: Biprojection (targeting). We refine the refusal direction to be mathematically orthogonal to harmless directions. This ensures that removing refusal behavior does not accidentally remove healthy concepts. The biprojection provides surgical precision in identifying what to modify.
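One plausible reading of this targeting step is a Gram-Schmidt style projection: remove from the raw refusal direction everything that lies in the span of the harmless directions. The sketch below assumes that reading; `refine_refusal` and the toy vectors are hypothetical, and the actual procedure used for this model may differ.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(v):
    return math.sqrt(dot(v, v))


def orthonormal_basis(vectors, eps=1e-12):
    """Gram-Schmidt: orthonormal basis for the span of the given vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        n = norm(w)
        if n > eps:  # skip vectors already inside the span
            basis.append([wi / n for wi in w])
    return basis


def refine_refusal(refusal, harmless_dirs):
    """Project the harmless subspace out of the refusal direction, then normalize."""
    r = list(refusal)
    for b in orthonormal_basis(harmless_dirs):
        c = dot(r, b)
        r = [ri - c * bi for ri, bi in zip(r, b)]
    n = norm(r)
    if n == 0.0:
        raise ValueError("refusal direction lies entirely in the harmless subspace")
    return [ri / n for ri in r]


raw_refusal = [1.0, 1.0, 1.0]
harmless_dirs = [[0.0, 1.0, 0.0], [0.0, 1.0, 1.0]]

refined = refine_refusal(raw_refusal, harmless_dirs)
overlaps = [dot(refined, h) for h in harmless_dirs]  # all approximately 0
```

Because the refined direction has no overlap with any harmless direction, ablating along it leaves those directions untouched.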
Step two: Decomposition. We decompose model weights into magnitude and direction components, separating the "what to say" from "how loud to say it." This enables targeted modification without collateral damage to the broader weight structure.
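As an illustration, the decomposition can be sketched per weight row: the magnitude is the row's Euclidean norm and the direction is the row divided by that norm. Treating rows as the unit of decomposition is an assumption made for this example; `decompose` and `recombine` are hypothetical helpers.

```python
import math


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def decompose(weights):
    """Split each row into a scalar magnitude and a unit-length direction."""
    magnitudes = [norm(row) for row in weights]
    directions = [
        [x / m for x in row] if m > 0.0 else list(row)  # keep zero rows as-is
        for row, m in zip(weights, magnitudes)
    ]
    return magnitudes, directions


def recombine(magnitudes, directions):
    """Scale each unit direction back up by its magnitude."""
    return [[m * x for x in d] for m, d in zip(magnitudes, directions)]


weights = [[3.0, 4.0], [0.0, 2.0]]
magnitudes, directions = decompose(weights)  # magnitudes == [5.0, 2.0]
restored = recombine(magnitudes, directions)  # matches weights up to rounding
```

Any edit applied only to `directions` leaves `magnitudes` available to restore the original scale afterwards.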
Step three: Norm-preservation. We remove the refusal component solely from the directional aspect, then recombine with original magnitudes. This maintains the "importance" structure of the neural network—the relative strength of different features remains intact.
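Putting the decomposition and ablation together, a minimal sketch of the norm-preserving update might look like the following. It assumes each weight row lives in the same space as the refusal direction, so that removing the projection and rescaling keeps the row both orthogonal to the refusal direction and at its original norm. `norm_preserving_ablate` is a hypothetical name, not the code used to produce this model.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(v):
    return math.sqrt(dot(v, v))


def norm_preserving_ablate(weights, refusal, eps=1e-12):
    """Remove the refusal component from each row's direction, keep its magnitude."""
    length = norm(refusal)
    r = [x / length for x in refusal]
    result = []
    for row in weights:
        magnitude = norm(row)
        if magnitude <= eps:
            result.append(list(row))  # nothing to ablate in a zero row
            continue
        direction = [x / magnitude for x in row]
        overlap = dot(direction, r)
        stripped = [d - overlap * ri for d, ri in zip(direction, r)]
        stripped_norm = norm(stripped)
        if stripped_norm <= eps:
            # Row was parallel to the refusal direction: no direction is left,
            # so its norm cannot be preserved in this sketch.
            result.append([0.0] * len(row))
            continue
        # Renormalize the edited direction, then restore the original magnitude.
        result.append([magnitude * s / stripped_norm for s in stripped])
    return result


weights = [[3.0, 4.0], [0.0, 2.0]]
refusal = [1.0, 0.0]

edited = norm_preserving_ablate(weights, refusal)
norms_before = [norm(row) for row in weights]
norms_after = [norm(row) for row in edited]  # same as norms_before up to rounding
```

Compared with plain projection removal, the first row keeps its original length of 5 instead of shrinking to 4, while its overlap with the refusal direction is still zero.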
The Result: Better Than Baseline
By preserving weight norms, we maintain the neural network's internal structure. Our benchmarks suggest this method does not just avoid the "safety tax"—it potentially improves reasoning capabilities, as the model no longer wastes compute suppressing its own outputs. You may discover knowledge and capabilities that were not exposed in the original model, not because they were absent, but because they were actively suppressed.
This is a technical observation, not a philosophical claim. The model has the same parameters, the same training data, the same underlying architecture. What changed is the removal of a specific directional component that caused certain outputs to be avoided regardless of their technical correctness or relevance.
What to Expect
Blackbird provides unrestricted capability. There are no artificial limitations on responses, no unnecessary hedging, no reflexive refusals to explore topics. You may occasionally observe brief consideration before proceeding on certain queries—this is residual behavior from the base model's training, not a fundamental limitation of the abliteration process.
The model maintains high-performance reasoning. Sophisticated tool usage remains intact. Instruction-following capabilities are enhanced, as the model no longer needs to balance your request against internal refusal heuristics. Benchmark performance is at or above baseline across reasoning, code generation, and general knowledge tasks.
The norm-preserving biprojection approach represents a significant improvement over naive abliteration methods. Standard approaches treat refusal as a simple linear direction in weight space that can be subtracted out. This ignores the geometry of the learned representations—weight magnitudes encode feature importance, and destroying them degrades model capability.
By decomposing weights into magnitude and direction, we can modify the direction (removing the refusal component) while preserving magnitudes (maintaining feature importance). The biprojection step ensures orthogonality between the refusal direction and harmless directions, preventing overcorrection.
The mathematical framework is based on projective geometry and subspace analysis. We identify the refusal subspace through careful analysis of model activations on refused prompts, then construct an orthogonal complement that preserves everything except refusal behavior. The result is a model that maintains its reasoning capabilities while removing the learned tendency to refuse certain classes of requests.
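The text does not spell out how the refusal direction is extracted from activations. A common choice in the abliteration literature is the difference between mean activations on refused and harmless prompts, and the sketch below assumes that technique. The activation values are made-up toy numbers and `refusal_direction` is a hypothetical helper.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def refusal_direction(refused_acts, harmless_acts):
    """Unit vector pointing from the harmless mean to the refused mean."""
    refused_mean = mean_vector(refused_acts)
    harmless_mean = mean_vector(harmless_acts)
    diff = [a - b for a, b in zip(refused_mean, harmless_mean)]
    length = math.sqrt(sum(x * x for x in diff))
    if length == 0.0:
        raise ValueError("the two activation sets have identical means")
    return [x / length for x in diff]


# Toy activations: one vector per prompt, taken at a single layer.
refused_acts = [[2.0, 1.0], [4.0, 1.0]]
harmless_acts = [[0.0, 1.0], [0.0, 1.0]]

direction = refusal_direction(refused_acts, harmless_acts)  # [1.0, 0.0]
```

In a real model the activations would come from hidden states collected on two prompt sets, and the resulting direction would then be refined against harmless directions before any weights are edited.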
