**SmolLM2-135M-Instruct**: The 135M model was trained on 2 trillion tokens drawn from a diverse combination of datasets: FineWeb-Edu, DCLM, and The Stack, along with new filtered datasets we curated and will release soon. We developed the instruct version through supervised fine-tuning (SFT) on a combination of public datasets and our own curated datasets, then applied Direct Preference Optimization (DPO) using UltraFeedback.
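For reference, here is a minimal sketch of chatting with the instruct model through `transformers`, assuming the upstream `HuggingFaceTB/SmolLM2-135M-Instruct` checkpoint (the GGUF files listed below target llama.cpp-style runtimes instead):

```python
# Minimal sketch: running the instruct model with transformers.
# Assumes the upstream checkpoint id HuggingFaceTB/SmolLM2-135M-Instruct.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "HuggingFaceTB/SmolLM2-135M-Instruct"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

messages = [{"role": "user", "content": "What is gravity?"}]
# Apply the model's chat template so the prompt matches the SFT formatting.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=100, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```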
## Model Files
| File Name | Size | Format Description |
|---|---|---|
| SmolLM2-135M-Instruct.F32.gguf | 540 MB | Full precision (32-bit floating point) |
| SmolLM2-135M-Instruct.BF16.gguf | 271 MB | Brain floating point 16-bit |
| SmolLM2-135M-Instruct.F16.gguf | 271 MB | Half precision (16-bit floating point) |
| SmolLM2-135M-Instruct.Q8_0.gguf | 145 MB | 8-bit quantization |
| SmolLM2-135M-Instruct.Q6_K.gguf | 138 MB | 6-bit quantization (K-quant) |
| SmolLM2-135M-Instruct.Q5_K_M.gguf | 112 MB | 5-bit quantization (K-quant, medium) |
| SmolLM2-135M-Instruct.Q5_K_S.gguf | 110 MB | 5-bit quantization (K-quant, small) |
| SmolLM2-135M-Instruct.Q4_K_M.gguf | 105 MB | 4-bit quantization (K-quant, medium) |
| SmolLM2-135M-Instruct.Q4_K_S.gguf | 102 MB | 4-bit quantization (K-quant, small) |
| SmolLM2-135M-Instruct.Q3_K_L.gguf | 97.5 MB | 3-bit quantization (K-quant, large) |
| SmolLM2-135M-Instruct.Q3_K_M.gguf | 93.5 MB | 3-bit quantization (K-quant, medium) |
| SmolLM2-135M-Instruct.Q3_K_S.gguf | 88.2 MB | 3-bit quantization (K-quant, small) |
| SmolLM2-135M-Instruct.Q2_K.gguf | 88.2 MB | 2-bit quantization (K-quant) |
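As a rough guide, Q4_K_M is a common balance of size and quality. The sketch below shows how one of these quants could be loaded, assuming the `huggingface_hub` and `llama-cpp-python` packages; the repo id is a placeholder, so substitute the repository actually hosting these files:

```python
# Minimal sketch: running one of the GGUF quants with llama-cpp-python.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="your-namespace/SmolLM2-135M-Instruct-GGUF",  # placeholder repo id
    filename="SmolLM2-135M-Instruct.Q4_K_M.gguf",  # matches the Q4_K_M row above
)
llm = Llama(model_path=model_path, n_ctx=2048)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is gravity?"}],
    max_tokens=100,
)
print(response["choices"][0]["message"]["content"])
```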
## Quants Usage
(sorted by size, not necessarily quality; IQ-quants are often preferable over similarly sized non-IQ quants)
Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):