lindsaybordier/Qwen3-0.6B-SFT-DPO_not-robust_argilla_acc4_beta0.13 • Text Generation • 0.6B • Updated Jun 9 • 13
kowndinya23/ultrafeedback_binarized-alpaca-llama-3-3b-2-epochs-alpha-0.2-beta-0.4-2-epochs • Text Generation • 3B • Updated Jun 9 • 13
kowndinya23/ultrafeedback_binarized-alpaca-llama-3-3b-2-epochs-alpha-0.6-beta-0.4-2-epochs • Text Generation • 3B • Updated Jun 9 • 12
kowndinya23/ultrafeedback_binarized-alpaca-llama-3-3b-2-epochs-alpha-0.8-beta-0.4-2-epochs • Text Generation • 3B • Updated Jun 9 • 11
kowndinya23/ultrafeedback_binarized-alpaca-llama-3-3b-2-epochs-alpha-1-beta-0.4-2-epochs • Text Generation • 3B • Updated Jun 9 • 12
kowndinya23/ultrafeedback_binarized-alpaca-llama-3-3b-2-epochs-alpha-0-beta-0.6-2-epochs • Text Generation • 3B • Updated Jun 9 • 12
kowndinya23/ultrafeedback_binarized-alpaca-llama-3-3b-2-epochs-alpha-0.2-beta-0.6-2-epochs • Text Generation • 3B • Updated Jun 9 • 11
sayantan0013/ultrafeedback-binarized-preferences-cleaned_math-stack_Qwen3-0_ramp • Text Generation • 0.6B • Updated Jun 9 • 11
kowndinya23/ultrafeedback_binarized-alpaca-llama-3-3b-2-epochs-alpha-0.4-beta-0.6-2-epochs • Text Generation • 3B • Updated Jun 9 • 10
lindsaybordier/Qwen3-0.6B-DPO_not-robust_final-dataset_acc4_beta0.07 • Text Generation • 0.6B • Updated Jun 10 • 11
kowndinya23/ultrafeedback_binarized-alpaca-llama-3-3b-2-epochs-alpha-0.6-beta-0.6-2-epochs • Text Generation • 3B • Updated Jun 10 • 11
lindsaybordier/Qwen3-0.6B-DPO_not-robust_final-dataset_acc4_beta0.10 • Text Generation • 0.6B • Updated Jun 10 • 12
kowndinya23/ultrafeedback_binarized-alpaca-llama-3-3b-2-epochs-alpha-0.8-beta-0.6-2-epochs • Text Generation • 3B • Updated Jun 10 • 12
lindsaybordier/Qwen3-0.6B-DPO_not-robust_final-dataset_acc4_beta0.13 • Text Generation • 0.6B • Updated Jun 10 • 14
kowndinya23/ultrafeedback_binarized-alpaca-llama-3-3b-2-epochs-alpha-1-beta-0.6-2-epochs • Text Generation • 3B • Updated Jun 10 • 13