Papers
arxiv:2510.24870

Seeing Through the MiRAGE: Evaluating Multimodal Retrieval Augmented Generation

Published on Jun 1
Authors:
,
,
,
,
,
,
,

Abstract

MiRAGE is a claim-centric evaluation framework for multimodal retrieval-augmented generation that assesses factuality, information coverage, and citation support through InfoF1 and CiteF1 metrics, with automatic implementations and multimodal adaptations of existing text-based RAG metrics.

We introduce MiRAGE, an evaluation framework for retrieval-augmented generation (RAG) from multimodal sources. As audiovisual media becomes a prevalent source of information online, it is essential for RAG systems to integrate information from these sources into generation. However, existing evaluations for RAG are text-centric, limiting their applicability to multimodal settings. MiRAGE is a claim-centric approach to multimodal RAG evaluation, consisting of InfoF1, which assesses factuality and information coverage, and CiteF1, which assesses citation support and completeness. We show that, when applied by humans, MiRAGE strongly aligns with extrinsic judgments of output quality. We additionally introduce an automatic implementation of MiRAGE as well as multimodal variants of three prominent text-based RAG metrics -- ALCE, ARGUE, and RAGAS -- demonstrating the limitations of text-centric work and laying the groundwork for automatic evaluation. We release open-source implementations and outline evaluation methods for multimodal RAG.

Community

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2510.24870 in a model README.md to link it from this page.

Datasets citing this paper 1

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2510.24870 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.