Pythia models fine-tuned with supervised fine-tuning (SFT) and with DPO on the full Anthropic-hh-rlhf dataset for 1 epoch.
Laura O'Mahony
lomahony
AI & ML interests
PhD student
Recent Activity
updated a Space 8 days ago: lomahony/First_agent_template
new activity 8 months ago: lomahony/pythia-1.4b-helpful-sft: Adding `safetensors` variant of this model
new activity 12 months ago: lomahony/pythia-410m-helpful-sft: Adding `safetensors` variant of this model
Organizations
None yet