kureha295/deepseek-ai-DeepSeek-R1-Distill-Llama-8B-ortho-baseline-layer-11 8B • Updated 7 days ago • 11
kureha295/deepseek-ai-DeepSeek-R1-Distill-Qwen-7B-ortho-baseline-layer-17 8B • Updated 7 days ago • 10
Bochkov/growing-transformers-model-frozen-16-bit-baseline-monolyth-181m Updated about 20 hours ago • 20