Skywork-Reward: Bag of Tricks for Reward Modeling in LLMs
Paper
• 2410.18451 • Published
Skywork reward model series
Note: A new version of our 27B reward model, trained on Skywork-Reward-Preference-80K-v0.2, the decontaminated version of Skywork-Reward-Preference-80K-v0.1.
Note: A new version of our 8B reward model, trained on Skywork-Reward-Preference-80K-v0.2, the decontaminated version of Skywork-Reward-Preference-80K-v0.1.