Demystifying Group Relative Policy Optimization: Its Policy Gradient is a U-Statistic
Paper • 2603.01162 • Published
Learn-to-Distance: Distance Learning for Detecting LLM-Generated Text
AdaDetectGPT: Adaptive Detection of LLM-Generated Text with Statistical Guarantees