---
license: llama3
base_model: meta-llama/Llama-4-Scout-17B-16E
tags:
  - llama4
  - moe
  - torchtitan
  - custom
---

# Llama 4 Debug Model - Trained with TorchTitan

A custom-trained Llama 4 debug model built with the TorchTitan framework.

## Model Details

- **Training Framework:** TorchTitan
- **Training Steps:** 10,000
- **Model Size:** ~220 MB
- **Precision:** bfloat16
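The checkpoint size and precision together give a rough sense of the parameter count: bfloat16 stores each weight in 2 bytes, so a ~220 MB checkpoint corresponds to roughly 115M parameters. A quick back-of-envelope check (the 220 MB figure is approximate, so treat the result as an estimate):

```python
# Rough parameter estimate from checkpoint size.
# bfloat16 uses 2 bytes per parameter; the ~220 MB size is approximate.
size_bytes = 220 * 1024**2
bytes_per_param = 2
params = size_bytes // bytes_per_param
print(f"~{params / 1e6:.0f}M parameters")  # ~115M parameters
```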

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("lakhera2023/llama4-debugmodel-10k")
tokenizer = AutoTokenizer.from_pretrained("lakhera2023/llama4-debugmodel-10k")

# Generate a short continuation of a prompt
prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```