Xirui Li PRO

AIcell

https://xirui-li.github.io/

AI & ML interests

Multi-Modality

Recent Activity

upvoted a paper about 18 hours ago

Imagination Helps Visual Reasoning, But Not Yet in Latent Space

Paper • 2602.22766 • Published Feb 26 • 44

upvoted a paper 3 days ago

RAGEN-2: Reasoning Collapse in Agentic RL

Paper • 2604.06268 • Published 6 days ago • 55

upvoted 6 papers 4 days ago

FileGram: Grounding Agent Personalization in File-System Behavioral Traces

Paper • 2604.04901 • Published 7 days ago • 39

Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding

Paper • 2604.05015 • Published 7 days ago • 227

ClawsBench: Evaluating Capability and Safety of LLM Productivity Agents in Simulated Workspaces

Paper • 2604.05172 • Published 7 days ago • 22

How Well Do Agentic Skills Work in the Wild: Benchmarking LLM Skill Usage in Realistic Settings

Paper • 2604.04323 • Published 7 days ago • 37

ACES: Who Tests the Tests? Leave-One-Out AUC Consistency for Code Generation

Paper • 2604.03922 • Published 8 days ago • 51

Claw-Eval: Toward Trustworthy Evaluation of Autonomous Agents

Paper • 2604.06132 • Published 6 days ago • 110

upvoted a paper 5 days ago

ClawArena: Benchmarking AI Agents in Evolving Information Environments

Paper • 2604.04202 • Published 8 days ago • 33

upvoted a paper 8 days ago

ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers

Paper • 2603.24414 • Published 18 days ago • 182

upvoted 2 papers 18 days ago

WildWorld: A Large-Scale Dataset for Dynamic World Modeling with Actions and Explicit State toward Generative ARPG

Paper • 2603.23497 • Published 19 days ago • 91

Omni-WorldBench: Towards a Comprehensive Interaction-Centric Evaluation for World Models

Paper • 2603.22212 • Published 20 days ago • 126

upvoted 3 papers 20 days ago

MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild

Paper • 2603.17187 • Published 26 days ago • 137

A Very Big Video Reasoning Suite

Paper • 2602.20159 • Published Feb 23 • 519

HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning

Paper • 2603.17024 • Published 26 days ago • 109

upvoted a paper 21 days ago

Stroke of Surprise: Progressive Semantic Illusions in Vector Sketching

Paper • 2602.12280 • Published Feb 12 • 34

upvoted 4 papers 22 days ago

SkillNet: Create, Evaluate, and Connect AI Skills

Paper • 2603.04448 • Published Feb 26 • 91

Thinking in Uncertainty: Mitigating Hallucinations in MLRMs with Latent Entropy-Aware Decoding

Paper • 2603.13366 • Published Mar 9 • 94

Beyond Language Modeling: An Exploration of Multimodal Pretraining

Paper • 2603.03276 • Published Mar 3 • 103

MOSSBench: Is Your Multimodal Language Model Oversensitive to Safe Queries?

Paper • 2406.17806 • Published Jun 22, 2024 • 2