Instructions for using hsienchen/Llama3.2-homedepot with libraries and local apps.

Libraries

How to use hsienchen/Llama3.2-homedepot with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="hsienchen/Llama3.2-homedepot")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("hsienchen/Llama3.2-homedepot")
model = AutoModelForCausalLM.from_pretrained("hsienchen/Llama3.2-homedepot")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use hsienchen/Llama3.2-homedepot with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "hsienchen/Llama3.2-homedepot"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "hsienchen/Llama3.2-homedepot",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
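
The same request can also be sent from Python. This is a minimal sketch using only the standard library, assuming the server started above is listening on localhost:8000. The helper names `build_chat_request` and `send_chat_request` are hypothetical and are not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(user_message, base_url="http://localhost:8000",
                       model="hsienchen/Llama3.2-homedepot"):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-style responses put the reply under choices[0].message.content
    return body["choices"][0]["message"]["content"]
```

Calling `send_chat_request(build_chat_request("What is the capital of France?"))` mirrors the curl command. Passing `base_url="http://localhost:30000"` targets a server on port 30000 instead.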

Use Docker

docker model run hf.co/hsienchen/Llama3.2-homedepot

SGLang

How to use hsienchen/Llama3.2-homedepot with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "hsienchen/Llama3.2-homedepot" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "hsienchen/Llama3.2-homedepot",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "hsienchen/Llama3.2-homedepot" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "hsienchen/Llama3.2-homedepot",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use hsienchen/Llama3.2-homedepot with Docker Model Runner:
```
docker model run hf.co/hsienchen/Llama3.2-homedepot
```

Model Card for HomeDepot-LoRA-GuideBot

This model is a LoRA (Low-Rank Adaptation) fine-tune of Cagatayd/llama3.2-1B-Instruct-Egitim for reasoning-guided product recommendation. It was trained on the Ukhushn/home-depot dataset to produce helpful responses to customer product-search queries, using Chain-of-Thought-style formatting with <think> annotations.

Model Details

Model Description

Base model: Cagatayd/llama3.2-1B-Instruct-Egitim
Adapter method: PEFT (LoRA)
Quantization: 8-bit (BitsAndBytes)
Tokenizer: AutoTokenizer from the same base model with padding_side="left"
Language(s): English
License: MIT (inherits from base + dataset)
Finetuned by: Udemy DeepSeek Fine-Tuning Notebook (Udemy_DeepSeek.ipynb)

Model Sources

Dataset: Ukhushn/home-depot
Training Notebook: Provided in the repository (Udemy_DeepSeek.ipynb)

Uses

Direct Use

Product search assistance for Home Depot-like catalogs.
Reasoning-style answers for DIY and specification-based shopping.
Embedded assistant in LLM playgrounds or chatbots with context-rich inputs.

Out-of-Scope Use

Complex multi-product comparisons without product metadata.
Open-domain generation or general reasoning beyond retail context.

Training Details

Training Data

Dataset: Ukhushn/home-depot
Split: 80/20 (train_test_split(seed=42))
Preprocessed Format: JSON-like ChatML with system/user/assistant roles using <think> tags for reasoning supervision.
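
As a rough illustration of the format described above, one preprocessed record might be built as follows. The card does not show the notebook's actual field names or system prompt, so `to_chat_record`, its arguments, and the prompt text are hypothetical. `split_80_20` is a standard-library stand-in for the 80/20 split and will not reproduce the same shuffle as `train_test_split(seed=42)`.

```python
import random


def to_chat_record(query, reasoning, answer,
                   system_prompt="You are a helpful product search assistant."):
    """Wrap one example as system/user/assistant messages with <think> reasoning."""
    # ASSUMPTION: reasoning goes inside <think> tags, followed by the final answer
    assistant_text = f"<think>\n{reasoning}\n</think>\n{answer}"
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": query},
            {"role": "assistant", "content": assistant_text},
        ]
    }


def split_80_20(records, seed=42):
    """Shuffle with a fixed seed, then cut into 80% train and 20% test."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]
```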

Training Procedure

LoRA target modules: ["q_proj", "v_proj"]
LoRA config: r=8, lora_alpha=16, lora_dropout=0.05, bias="none"
TrainingArguments:
- max_steps=60, learning_rate=2e-4, gradient_accumulation_steps=1, per_device_train_batch_size=2
- fp16=True, save_strategy="steps", eval_steps=10, save_steps=20
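
To make these settings concrete, the sketch below works out the adapter size and the step schedule they imply. The layer dimensions are an assumption: they come from the published Llama 3.2 1B configuration (hidden size 2048, 16 layers, 8 key/value heads of dimension 64) and are assumed to apply to the base model used here. The sketch also assumes a single GPU and that evaluation runs every `eval_steps` steps.

```python
# Hyperparameters listed in the card
R, LORA_ALPHA = 8, 16
MAX_STEPS, BATCH_SIZE, GRAD_ACCUM = 60, 2, 1
EVAL_STEPS, SAVE_STEPS = 10, 20

# ASSUMPTION: dimensions of the published Llama 3.2 1B config
HIDDEN_SIZE = 2048
NUM_LAYERS = 16
KV_DIM = 8 * 64  # 8 key/value heads x head dimension 64


def lora_param_count(in_features, out_features, r):
    """LoRA adds A (r x in) and B (out x r) beside each frozen weight."""
    return r * (in_features + out_features)


per_layer = (
    lora_param_count(HIDDEN_SIZE, HIDDEN_SIZE, R)  # q_proj
    + lora_param_count(HIDDEN_SIZE, KV_DIM, R)     # v_proj
)
trainable_params = per_layer * NUM_LAYERS
scaling = LORA_ALPHA / R  # factor applied to the low-rank update

# Step schedule implied by the training arguments
examples_seen = MAX_STEPS * BATCH_SIZE * GRAD_ACCUM
eval_at = list(range(EVAL_STEPS, MAX_STEPS + 1, EVAL_STEPS))
checkpoints_at = list(range(SAVE_STEPS, MAX_STEPS + 1, SAVE_STEPS))

print(trainable_params, scaling, examples_seen, eval_at, checkpoints_at)
```

Under these assumptions the adapter has 851,968 trainable parameters with a scaling factor of 2.0, the run processes 120 training examples, and checkpoints are written at steps 20, 40, and 60.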

Compute Environment

GPU: NVIDIA RTX 3060 12GB
Platform: WSL Ubuntu 22.04
Precision: fp16, 8-bit quantized base
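
For a sense of scale, the following back-of-the-envelope sketch estimates weight memory at different precisions. It assumes a rounded count of one billion parameters and counts weights only, so activations, optimizer state, and quantization overhead are ignored.

```python
PARAM_COUNT = 1_000_000_000  # ASSUMPTION: rounded "1B" parameter count
BYTES_PER_PARAM = {"int8": 1, "fp16": 2, "fp32": 4}


def weight_memory_gib(param_count, dtype):
    """Approximate memory for the weights alone, in GiB."""
    return param_count * BYTES_PER_PARAM[dtype] / 2**30


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gib(PARAM_COUNT, dtype):.2f} GiB")
```

By this estimate the 8-bit base weights take a little under 1 GiB, which leaves most of a 12 GB card free for activations and the LoRA adapter.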

Evaluation

Metrics

Training Loss: ~1.91
Validation Loss: ~1.88 at step 60
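
These losses can be read as perplexities. The sketch below assumes they are mean token-level cross-entropy values in nats, which is the usual convention for causal language model training but is not stated in the card.

```python
import math


def perplexity(mean_cross_entropy_nats):
    """Perplexity is the exponential of the mean cross-entropy."""
    return math.exp(mean_cross_entropy_nats)


train_ppl = perplexity(1.91)
val_ppl = perplexity(1.88)
print(f"train perplexity ~{train_ppl:.2f}, validation perplexity ~{val_ppl:.2f}")
```

Under that assumption the reported losses correspond to perplexities of roughly 6.8 (training) and 6.6 (validation).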

How to Use

from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# Load the base model in 8-bit, matching the quantization used for training
model_name = "Cagatayd/llama3.2-1B-Instruct-Egitim"
bnb_config = BitsAndBytesConfig(load_in_8bit=True)
base_model = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=bnb_config, device_map="auto")

# Tokenizer from the base model, left-padded as described under Model Description
tokenizer = AutoTokenizer.from_pretrained(model_name, padding_side="left")
# Attach the LoRA adapter from the local training checkpoint
model = PeftModel.from_pretrained(base_model, "./results/checkpoint-60")

prompt = "I am tiling a shower in a 5x7 ft basement bathroom. What should I consider?"
# Move inputs to the device chosen by device_map instead of hard-coding "cuda"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Safetensors

Model size: 1B params
Tensor type: F32, F16