✨ YOSHII-7B-BR ✨
Yoshii is a state-of-the-art Brazilian Portuguese LLM, specialized in customer service and business automation.
Built with QLoRA on top of Mistral 7B, it delivers enterprise-grade performance with accessible hardware requirements.
🎯 Overview
💼 Use Cases
| Use Case | Description |
|---|---|
| 🎧 Customer Service | Automated support for e-commerce, SaaS, and service businesses |
| 💰 Sales Assistants | Product recommendations and lead qualification |
| 📱 WhatsApp Bots | Conversational AI for messaging platforms |
| ⚙️ Automation | Scheduling, FAQs, and workflow automation |
🚀 Quick Start
Installation
```bash
pip install transformers accelerate bitsandbytes torch
```
Basic Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("yoshii-ai/Yoshii-7B-BR", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("yoshii-ai/Yoshii-7B-BR")

messages = [{"role": "user", "content": "Olá, preciso de ajuda com meu pedido"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
Optimized Usage (4-bit)
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# NF4 quantization with double quantization keeps VRAM usage low
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "yoshii-ai/Yoshii-7B-BR",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("yoshii-ai/Yoshii-7B-BR")

# Your conversation here
messages = [{"role": "user", "content": "Qual o horário de funcionamento?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
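In both snippets, `apply_chat_template` handles prompt formatting for you. For reference, Mistral-Instruct-style templates roughly wrap each user turn in `[INST]` tags. The sketch below is an approximation under the assumption that Yoshii-7B-BR inherits the base model's template; the exact whitespace and special-token handling may differ, so verify against `tokenizer.apply_chat_template(..., tokenize=False)` on the real tokenizer:

```python
# Rough approximation of a Mistral-Instruct chat template (assumption:
# Yoshii-7B-BR keeps the base model's template).
def build_prompt(messages):
    """Render a user/assistant message list into a single prompt string."""
    parts = ["<s>"]
    for msg in messages:
        if msg["role"] == "user":
            parts.append(f"[INST] {msg['content']} [/INST]")
        elif msg["role"] == "assistant":
            # Completed assistant turns are closed with an end-of-sequence token
            parts.append(f"{msg['content']}</s>")
    return "".join(parts)

prompt = build_prompt([
    {"role": "user", "content": "Qual o horário de funcionamento?"},
])
print(prompt)  # <s>[INST] Qual o horário de funcionamento? [/INST]
```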
💬 Example Conversations
The model page includes collapsible example transcripts covering:

- 🎧 Customer Support
- 📦 Product Inquiry
- 🚚 Shipping Info
- 🕐 Business Hours
📈 Training Details
🔧 Full Training Configuration
```python
# QLoRA Configuration
LoraConfig(
    r=32,                  # LoRA rank
    lora_alpha=64,         # LoRA alpha (scaling)
    lora_dropout=0.05,     # Dropout on LoRA layers
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj"],
)
```
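With `r=32` applied only to `q_proj` and `v_proj`, the adapter stays tiny relative to the base model. A back-of-the-envelope sketch, assuming Mistral 7B's published shapes (32 decoder layers, hidden size 4096, grouped-query attention with 8 KV heads of dim 128, so `v_proj` maps 4096 → 1024):

```python
# LoRA adds two low-rank matrices per target module: A (r x in) and B (out x r),
# i.e. r * (in_features + out_features) trainable parameters per module.
r = 32
n_layers = 32                    # Mistral 7B decoder layers (assumption)
hidden = 4096
kv_dim = 8 * 128                 # 8 KV heads x head_dim 128 = 1024

q_proj = r * (hidden + hidden)   # 4096 -> 4096
v_proj = r * (hidden + kv_dim)   # 4096 -> 1024
lora_params = n_layers * (q_proj + v_proj)

print(f"{lora_params / 1e6:.1f}M trainable LoRA parameters")   # ~13.6M
print(f"{lora_params / 7.25e9:.3%} of ~7.25B base parameters")
```

That sub-1% trainable fraction is what makes single-GPU fine-tuning feasible.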
```python
# Training Arguments
SFTConfig(
    num_train_epochs=3,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    max_length=512,
    fp16=True,
    gradient_checkpointing=True,
    optim="paged_adamw_8bit",
)
```
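These batch settings trade memory for throughput: with a per-device batch of 1 and 16 accumulation steps, gradients are applied once every 16 sequences. A quick sketch of the resulting step counts (the dataset size below is a hypothetical placeholder, not a figure from the actual training run):

```python
per_device_batch = 1
grad_accum = 16
effective_batch = per_device_batch * grad_accum   # sequences per optimizer step

dataset_size = 10_000   # hypothetical example; actual size not published
epochs = 3
steps_per_epoch = dataset_size // effective_batch
total_steps = steps_per_epoch * epochs

print(effective_batch)  # 16
print(total_steps)      # 1875 for this hypothetical dataset
```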
```python
# Quantization
BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)
```
📊 Performance
| Metric | Value |
|---|---|
| Training Loss (final) | ~0.8 |
| Inference Speed | ~30 tokens/sec (RTX 3070) |
| Memory Usage | ~5.8 GB VRAM |
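The ~5.8 GB figure is roughly what NF4 weights plus runtime overhead would predict. A rough sketch, assuming ~7.25B parameters at ~0.5 bytes each (4-bit); the overhead term below is an illustrative estimate for quantization metadata, activations, and KV cache, not a measurement:

```python
params = 7.25e9
weights_gb = params * 0.5 / 1e9   # 4-bit = 0.5 bytes per parameter -> ~3.6 GB
overhead_gb = 2.2                 # illustrative: quant metadata, activations, KV cache
print(f"~{weights_gb + overhead_gb:.1f} GB")  # close to the ~5.8 GB reported
```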
⚠️ Limitations
- Language: Optimized for Brazilian Portuguese only
- Domain: Best suited for customer service contexts
- Responses: May require fine-tuning for specialized domains
- Hallucinations: Like all LLMs, may generate inaccurate information
📝 Citation
```bibtex
@misc{yoshii7bbr2025,
  author       = {Richard Sakaguchi},
  title        = {Yoshii-7B-BR: Brazilian Portuguese Customer Service LLM},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/yoshii-ai/Yoshii-7B-BR}},
}
```
Base model: mistralai/Mistral-7B-Instruct-v0.3 (itself fine-tuned from mistralai/Mistral-7B-v0.3)