Eden Yavin
EdenYav
AI & ML interests
Reinforcement learning, online learning, cybersecurity, large language models
Organizations
None yet
LLM Evaluation
- Are We on the Right Way for Assessing Document Retrieval-Augmented Generation?
  Paper • 2508.03644 • Published • 25
- WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent
  Paper • 2508.05748 • Published • 141
- MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers
  Paper • 2508.20453 • Published • 63
VisionLM
models 13
EdenYav/rl_course_vizdoom_health_gathering_supreme
Reinforcement Learning • Updated
EdenYav/ppo-Unit1-LunarLander-v2
Reinforcement Learning • Updated
• 2
EdenYav/ppo-LunarLander-v2
Reinforcement Learning • Updated
• 2
EdenYav/poca-SoccerTwos
Reinforcement Learning • Updated
• 9
EdenYav/a2c-PandaReachDense-v2
Reinforcement Learning • Updated
• 2
EdenYav/a2c-AntBulletEnv-v0
Reinforcement Learning • Updated
• 1
EdenYav/dqn-SpaceInvadersNoFrameskip-v4
Reinforcement Learning • Updated
• 5
EdenYav/ppo-Pyramids
Reinforcement Learning • Updated
• 25
EdenYav/ppo-SnowballTarget
Reinforcement Learning • Updated
• 16
EdenYav/Reinforce-PixelCopter
Reinforcement Learning • Updated
datasets 0
None public yet