JameSand/qwen3-1.7b-base-muon-1e-4-bs128-kl0.0-run2-global_step_100 2B • Updated about 21 hours ago • 10
JameSand/qwen3-4b-base-svd-muon-adam-1e-6-adamlr-1e-6-bs128-kl0.0-global_step_20 4B • Updated Jan 25 • 8
JameSand/qwen3-4b-base-svd-muon-adam-1e-6-adamlr-1e-6-bs128-kl0.0-global_step_40 4B • Updated Jan 25 • 9
JameSand/qwen3-4b-base-svd-muon-adam-1e-6-adamlr-1e-6-bs128-kl0.0-global_step_60 4B • Updated Jan 25 • 11
JameSand/qwen3-4b-base-svd-muon-adam-1e-6-adamlr-1e-6-bs128-kl0.0-global_step_80 4B • Updated Jan 25 • 9
JameSand/qwen3-4b-base-svd-muon-adam-1e-6-adamlr-1e-6-bs128-kl0.0-global_step_100 4B • Updated Jan 25 • 7
JameSand/qwen3-4b-base-svd-muon-adam-1e-6-adamlr-1e-6-bs128-kl0.0-global_step_120 4B • Updated Jan 25 • 8
JameSand/qwen3-4b-base-svd-muon-adam-1e-6-adamlr-1e-6-bs128-kl0.0-global_step_140 4B • Updated Jan 25 • 8
JameSand/qwen3-4b-base-svd-muon-adam-1e-6-adamlr-1e-6-bs128-kl0.0-global_step_160 4B • Updated Jan 25