xx18's Collections

TFPI

ICLR 2026: Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners https://arxiv.org/abs/2509.26226