Active filters: rloo
blakenp/Qwen2.5-1.5B-Policy • Text Generation • 0.5B • Updated • 2
blakenp/Qwen2.5-1.5B-Policy2 • Text Generation • 2B • Updated • 4
mradermacher/Qwen2.5-1.5B-Policy2-GGUF • 2B • Updated • 44
mradermacher/Qwen2.5-1.5B-Policy-GGUF • 0.5B • Updated • 71
Text Generation • 0.6B • Updated • 16
thomasjhuang/qwen2-rloo-countdown-step150 • Text Generation • 0.5B • Updated • 5
thomasjhuang/qwen2-rloo-countdown-step250 • Text Generation • 0.5B • Updated • 5
thomasjhuang/qwen2-rloo-countdown-step350 • Text Generation • 0.5B • Updated • 4
leobianco/npov_PERL_google_S200898_eps10000_lr2e-5_kl1e-4_2507031331 • Updated
JW17/Q25-3B-It-A0.25-C0.75-v1.0 • Text Generation • Updated • 4
JW17/Q25-3B-It-A0.5-C0.5-v1.0 • Text Generation • Updated • 5
JW17/Q25-3B-It-A1.0-C0.0-v1.0 • Text Generation • Updated • 3
JW17/Q25-3B-It-A0.75-C0.25-v1.0 • Text Generation • Updated • 4
JW17/Q25-3B-It-PreNorm-v2.0 • Text Generation • Updated • 6
Prathyusha101/qwen2-0.5b-rl00 • Text Generation • 0.5B • Updated • 4
Prathyusha101/qwen2-0.5b-REINFORCE-no-baseline-kl-disabled • Text Generation • 0.5B • Updated • 4
leobianco/bosch_PERL_google_S130104_eps10000_lr2e-5_kl1e-4_2510291047 • Updated
leobianco/bosch_PERL_google_S051179_eps10000_lr2e-5_kl1e-4_2510310721 • Updated
leobianco/bosch_PERL_google_S130104_eps10000_lr2e-5_kl1e-4_2510310722 • Updated
leobianco/bosch_PERL_google_S200898_eps10000_lr2e-5_kl1e-4_2511051107 • Updated
leobianco/bosch_PERL_google_S200898_eps10000_lr2e-5_kl1e-4_2511060943 • Updated
leobianco/bosch_PERL_google_S051179_eps20000_lr2e-5_kl1e-4_2511060944 • Updated
leobianco/bosch_PERL_google_S200898_eps10000_lr2e-5_kl1e-4_2511061519 • Updated
leobianco/bosch_PERL_google_S200898_eps10000_lr1e-4_kl1e-4_2511070928 • Updated
leobianco/bosch_PERL_google_S130104_eps10000_lr5e-5_kl1e-4_2511071026 • Updated
Thrillcrazyer/Qwen-1.5B_NOTHIP_RLOO • Text Generation • 2B • Updated • 13
Thrillcrazyer/Qwen-1.5B_THIP_RLOO • Text Generation • 2B • Updated • 20
Thrillcrazyer/Qwen-7B_TAC_RLOO • Text Generation • 333k • Updated • 13