AgentHazard: A Benchmark for Evaluating Harmful Behavior in Computer-Use Agents Paper • 2604.02947 • Published 5 days ago • 15
PixelSmile: Toward Fine-Grained Facial Expression Editing Paper • 2603.25728 • Published 12 days ago • 116
Evolve the Method, Not the Prompts: Evolutionary Synthesis of Jailbreak Attacks on LLMs Paper • 2511.12710 • Published Nov 16, 2025 • 39
JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence Paper • 2510.23538 • Published Oct 27, 2025 • 98