hitchhiker3010's Collections

Reasoning MLLM

updated Jul 9, 2025

  • Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

    Paper • 2503.12605 • Published Mar 16, 2025 • 35

  • R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization

    Paper • 2503.12937 • Published Mar 17, 2025 • 30

  • Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection

    Paper • 2503.12271 • Published Mar 15, 2025 • 9

  • Video-T1: Test-Time Scaling for Video Generation

    Paper • 2503.18942 • Published Mar 24, 2025 • 90

  • Kwai Keye-VL Technical Report

    Paper • 2507.01949 • Published Jul 2, 2025 • 130