
✨ YOSHII-7B-BR ✨








Yoshii is a state-of-the-art Brazilian Portuguese LLM, specialized in customer service and business automation.

Built with QLoRA on top of Mistral 7B, it offers enterprise-grade performance with accessible hardware requirements.


🎯 Overview

📊 Model Specs

| Attribute | Value |
|---|---|
| Architecture | Mistral 7B |
| Parameters | 7.2B |
| Context Window | 32,768 tokens |
| Quantization | 4-bit NF4 |
| Model Size | 4.7 GB |
| VRAM Required | ~6 GB |
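As a rough sanity check on the figures above (back-of-envelope arithmetic, not an exact accounting of the checkpoint layout):

```python
# Back-of-envelope estimate of the 4-bit model size (illustrative arithmetic only).
params = 7.2e9               # parameter count from the spec table
bytes_4bit = params * 0.5    # NF4 stores ~4 bits (0.5 bytes) per weight

weights_gb = bytes_4bit / 1e9
print(f"Quantized weights alone: ~{weights_gb:.1f} GB")  # ~3.6 GB
# Layers kept in higher precision (embeddings, norms) plus quantization
# metadata account for the gap up to the listed 4.7 GB on-disk size.
```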

🔧 Training Info

| Attribute | Value |
|---|---|
| Method | QLoRA |
| Dataset | 755 conversations |
| Epochs | 3 |
| LoRA Rank | 32 |
| Training Time | 37 min |
| Hardware | RTX 3070 8GB |

💼 Use Cases


- 🎧 **Customer Service**: Automated support for e-commerce, SaaS, and service businesses
- 💰 **Sales Assistants**: Product recommendations and lead qualification
- 📱 **WhatsApp Bots**: Conversational AI for messaging platforms
- ⚙️ **Automation**: Scheduling, FAQs, and workflow automation
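For messaging bots in particular, one practical concern is keeping a long-running conversation inside the 32,768-token context window. The sketch below is purely illustrative (the `trim_history` helper and the 4-characters-per-token heuristic are assumptions, not part of the model's API; use the real tokenizer in production):

```python
# Illustrative sketch: keep a rolling WhatsApp-style conversation
# within the model's 32,768-token context window.
CONTEXT_WINDOW = 32_768
RESERVED_FOR_REPLY = 256  # matches max_new_tokens in the usage examples

def rough_token_count(text: str) -> int:
    """Crude estimate (~4 characters per token)."""
    return max(1, len(text) // 4)

def trim_history(messages: list[dict],
                 budget: int = CONTEXT_WINDOW - RESERVED_FOR_REPLY) -> list[dict]:
    """Drop the oldest turns until the estimated prompt fits the budget."""
    kept, used = [], 0
    for msg in reversed(messages):  # newest turns are the most relevant
        cost = rough_token_count(msg["content"])
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))

history = [
    {"role": "user", "content": "Ola!" * 50_000},  # oversized old turn
    {"role": "assistant", "content": "Ola! Como posso ajudar?"},
    {"role": "user", "content": "Quanto custa o frete?"},
]
print(len(trim_history(history)))  # → 2 (the oversized first turn is dropped)
```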

🚀 Quick Start

Installation

```bash
pip install transformers accelerate bitsandbytes torch
```

Basic Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("yoshii-ai/Yoshii-7B-BR", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("yoshii-ai/Yoshii-7B-BR")

messages = [{"role": "user", "content": "Ola, preciso de ajuda com meu pedido"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Optimized Usage (4-bit)

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "yoshii-ai/Yoshii-7B-BR",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("yoshii-ai/Yoshii-7B-BR")

# Your conversation here
messages = [{"role": "user", "content": "Qual o horario de funcionamento?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
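Both snippets sample with `temperature=0.7`. Temperature divides the logits before softmax, so values below 1 sharpen the distribution toward the most likely tokens. A minimal self-contained illustration (the logit values here are made up):

```python
import math

def softmax(logits, temperature=1.0):
    # Divide logits by T, then apply a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # made-up next-token logits
for t in (1.0, 0.7):
    probs = softmax(logits, t)
    print(t, [round(p, 3) for p in probs])
# At T=0.7 the top token gets more probability mass than at T=1.0.
```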

💬 Example Conversations

🎧 Customer Support

User: Ola, gostaria de cancelar meu pedido
Yoshii: Ola! Claro, posso ajudar com o cancelamento.
        Poderia informar o numero do pedido?

📦 Product Inquiry

User: Esse produto tem garantia?
Yoshii: Sim! Todos os nossos produtos possuem
        garantia de 12 meses contra defeitos.

🚚 Shipping Info

User: Quanto custa o frete para SP?
Yoshii: O frete para Sao Paulo varia conforme
        o CEP. Pode informar seu CEP?

🕐 Business Hours

User: Qual o horario de atendimento?
Yoshii: Nosso atendimento funciona de segunda
        a sexta, das 8h as 18h.

📈 Training Details

🔧 Full Training Configuration
```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig
from trl import SFTConfig

# QLoRA configuration
lora_config = LoraConfig(
    r=32,                               # LoRA rank
    lora_alpha=64,                      # LoRA alpha
    lora_dropout=0.05,                  # Dropout
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj"],
)

# Training arguments
training_args = SFTConfig(
    num_train_epochs=3,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    max_length=512,
    fp16=True,
    gradient_checkpointing=True,
    optim="paged_adamw_8bit",
)

# 4-bit quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)
```
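The configuration above implies the following rough numbers (a sketch assuming Mistral 7B's published shapes: 32 layers, hidden size 4096, and a 1024-dim `v_proj` output from 8 KV heads; exact counts depend on the implementation):

```python
import math

r, n_layers, hidden, kv_dim = 32, 32, 4096, 1024

# LoRA adds two factors per adapted matrix: A (r x d_in) and B (d_out x r),
# i.e. r * (d_in + d_out) trainable parameters per matrix.
per_layer = r * (hidden + hidden) + r * (hidden + kv_dim)  # q_proj + v_proj
trainable = per_layer * n_layers
print(f"Trainable LoRA params: ~{trainable / 1e6:.1f}M "
      f"({trainable / 7.2e9:.2%} of the base model)")

# Effective batch size and optimizer steps over the 755-conversation dataset.
effective_batch = 1 * 16  # per_device_train_batch_size * gradient_accumulation_steps
steps_per_epoch = math.ceil(755 / effective_batch)
print(f"~{steps_per_epoch} optimizer steps per epoch, ~{steps_per_epoch * 3} total")
```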

📊 Performance

| Metric | Value |
|---|---|
| Training Loss (final) | ~0.8 |
| Inference Speed | ~30 tokens/sec (RTX 3070) |
| Memory Usage | ~5.8 GB VRAM |
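At ~30 tokens/sec, expected reply latency follows directly (decode time only; prompt processing adds a little extra):

```python
# Straight arithmetic from the performance table: seconds per reply length.
tokens_per_sec = 30
for reply_tokens in (64, 128, 256):
    print(f"{reply_tokens} tokens: ~{reply_tokens / tokens_per_sec:.1f} s")
# A full 256-token reply (the max_new_tokens used above) takes ~8.5 s.
```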

⚠️ Limitations

  • Language: Optimized for Brazilian Portuguese only
  • Domain: Best suited for customer service contexts
  • Responses: May require fine-tuning for specialized domains
  • Hallucinations: Like all LLMs, may generate inaccurate information

📝 Citation

```bibtex
@misc{yoshii7bbr2025,
  author       = {Richard Sakaguchi},
  title        = {Yoshii-7B-BR: Brazilian Portuguese Customer Service LLM},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/yoshii-ai/Yoshii-7B-BR}},
}
```

🛠️ Built with

Transformers · TRL · PEFT · BitsAndBytes


🤗 Yoshii AI · 🌐 Sakaguchi IA · 📊 Dataset


Made in Brazil




In Memoriam: Yoshii
