AlphaResearch: Accelerating New Algorithm Discovery with Language Models Paper • 2511.08522 • Published Nov 11
Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward Paper • 2510.03222 • Published Oct 3