HuggingFaceH4/ultrachat_200k
Llama-3.1-8B fine-tuned with supervised instruction tuning (SFT) and no prompt masking: the loss is computed on all tokens, prompt and response alike.
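
To make "without prompt masking" concrete, here is a minimal sketch of the difference between averaging the loss over every token and averaging it over response tokens only. It is not the training code for this model: the function name `sft_loss`, the `mask_prompt` flag, and the toy log-probabilities are all illustrative assumptions.

```python
def sft_loss(token_logprobs, is_response, mask_prompt=False):
    """Mean negative log-likelihood over the tokens that count toward the loss.

    token_logprobs: log-probability the model assigned to each target token.
    is_response:    True where the token belongs to the response, False for the prompt.
    mask_prompt:    hypothetical flag; False reproduces the "all tokens" setting.
    """
    if len(token_logprobs) != len(is_response):
        raise ValueError("token_logprobs and is_response must have the same length")
    if mask_prompt:
        # Prompt masking: only response tokens contribute.
        counted = [lp for lp, resp in zip(token_logprobs, is_response) if resp]
    else:
        # No prompt masking: every token contributes.
        counted = list(token_logprobs)
    if not counted:
        raise ValueError("no tokens contribute to the loss")
    return -sum(counted) / len(counted)


# Toy sequence (made-up values): three prompt tokens followed by two response tokens.
logprobs = [-2.0, -1.0, -3.0, -0.5, -0.5]
is_response = [False, False, False, True, True]

loss_all_tokens = sft_loss(logprobs, is_response)                       # setting described above
loss_response_only = sft_loss(logprobs, is_response, mask_prompt=True)  # masked alternative
print(loss_all_tokens, loss_response_only)
```

In a typical PyTorch setup the masked variant is usually obtained by setting prompt positions in the label tensor to the ignore index of the cross-entropy loss; leaving the labels untouched corresponds to the unmasked setting.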
| Benchmark | Baseline | This Model |
|---|---|---|
| GSM8K | 16.4% | 29.0% |
| MMLU | 58.1% | 58.4% |
| SST Safety | 62.0% | 78.0% |
| AlpacaEval | 1.57% | 5.3% |
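
The gains in the table can be read as percentage-point differences between the two columns. The short sketch below recomputes them from the reported numbers; the names `results` and `point_gain` are illustrative only.

```python
# Scores copied from the table above: (baseline %, this model %).
results = {
    "GSM8K": (16.4, 29.0),
    "MMLU": (58.1, 58.4),
    "SST Safety": (62.0, 78.0),
    "AlpacaEval": (1.57, 5.3),
}


def point_gain(baseline, tuned):
    """Absolute change in percentage points, rounded to avoid float noise."""
    return round(tuned - baseline, 2)


gains = {name: point_gain(base, tuned) for name, (base, tuned) in results.items()}

for name, gain in gains.items():
    print(f"{name}: {gain:+.2f} points")
```

On these numbers the largest absolute gains are on SST Safety and GSM8K, while MMLU is essentially unchanged.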

eval_baseline/: Baseline evaluation results (pre-finetuning Llama-3.1-8B)

Part of CS336 Assignment 5 (SFT Instruction Tuning). See building-from-scratch/sft for details.
Base model: meta-llama/Llama-3.1-8B