Qiushi

QiushiSun

https://qiushisun.github.io/

AI & ML interests

Code Intelligence; Large Language Models; AI Agents

Recent Activity

authored a paper 1 day ago

OdysseyArena: Benchmarking Large Language Models For Long-Horizon, Active and Inductive Interactions

upvoted a paper 2 days ago

MSign: An Optimizer Preventing Training Instability in Large Language Models via Stable Rank Restoration

upvoted a paper 2 days ago

OdysseyArena: Benchmarking Large Language Models For Long-Horizon, Active and Inductive Interactions

Organizations

authored a paper 1 day ago

OdysseyArena: Benchmarking Large Language Models For Long-Horizon, Active and Inductive Interactions

Paper • 2602.05843 • Published 5 days ago • 53

upvoted 2 papers 2 days ago

MSign: An Optimizer Preventing Training Instability in Large Language Models via Stable Rank Restoration

Paper • 2602.01734 • Published 9 days ago • 30

OdysseyArena: Benchmarking Large Language Models For Long-Horizon, Active and Inductive Interactions

Paper • 2602.05843 • Published 5 days ago • 53

updated a collection 3 days ago

OS-Sentinel

Collection

2 items • Updated 3 days ago • 1

upvoted a paper 6 days ago

SPARKLING: Balancing Signal Preservation and Symmetry Breaking for Width-Progressive Learning

Paper • 2602.02472 • Published 8 days ago • 44

updated a Space 7 days ago

README

🚀

upvoted a paper 19 days ago

EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience

Paper • 2601.15876 • Published 20 days ago • 89

upvoted a paper 20 days ago

MMDeepResearch-Bench: A Benchmark for Multimodal Deep Research Agents

Paper • 2601.12346 • Published 24 days ago • 49

authored a paper 29 days ago

OS-Symphony: A Holistic Framework for Robust and Generalist Computer-Using Agent

Paper • 2601.07779 • Published 29 days ago • 28

upvoted a paper 29 days ago

OS-Symphony: A Holistic Framework for Robust and Generalist Computer-Using Agent

Paper • 2601.07779 • Published 29 days ago • 28

upvoted 2 papers about 1 month ago

OpenNovelty: An LLM-powered Agentic System for Verifiable Scholarly Novelty Assessment

Paper • 2601.01576 • Published Jan 4 • 18

Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone

Paper • 2512.22615 • Published Dec 27, 2025 • 48

liked a dataset 2 months ago

heroding77/gui_agent_evaluation

Updated Dec 22, 2025 • 210 • 1

upvoted a paper 2 months ago

PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling

Paper • 2512.04784 • Published Dec 2, 2025 • 25

upvoted 3 papers 3 months ago

ARISE: An Adaptive Resolution-Aware Metric for Test-Time Scaling Evaluation in Large Reasoning Models

Paper • 2510.06014 • Published Oct 7, 2025 • 10

InteractScience: Programmatic and Visually-Grounded Evaluation of Interactive Scientific Demonstration Code Generation

Paper • 2510.09724 • Published Oct 10, 2025 • 11

OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows

Paper • 2510.24411 • Published Oct 28, 2025 • 72

commented a paper 3 months ago

OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows

Paper • 2510.24411 • Published Oct 28, 2025 • 72

updated a dataset 3 months ago

OS-Copilot/MobileRisk

Viewer • Updated Oct 31, 2025 • 3.91k • 58 • 1
