Toward Reliable Biomedical Hypothesis Generation: Evaluating Truthfulness and Hallucination in Large Language Models Paper • 2505.14599 • Published May 20, 2025 • 2
RAG-Gym: Optimizing Reasoning and Search Agents with Process Supervision Paper • 2502.13957 • Published Feb 19, 2025 • 1