RLinf-VLA: A Unified and Efficient Framework for VLA+RL Training • Paper • arXiv:2510.06710 • Published Oct 8, 2025 • 39 upvotes
Cache-to-Cache: Direct Semantic Communication Between Large Language Models • Paper • arXiv:2510.03215 • Published Oct 3, 2025 • 97 upvotes
R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing • Paper • arXiv:2505.21600 • Published May 27, 2025 • 71 upvotes
FlashDecoding++: Faster Large Language Model Inference on GPUs • Paper • arXiv:2311.01282 • Published Nov 2, 2023 • 37 upvotes