baidu/ERNIE-4.5-VL-28B-A3B-Thinking Image-Text-to-Text • 30B • Updated about 1 month ago • 14.9k • 528
CE-GPPO: Controlling Entropy via Gradient-Preserving Clipping Policy Optimization in Reinforcement Learning Paper • 2509.20712 • Published Sep 25, 2025 • 19