Available Data

We have organized and preprocessed the public evaluation datasets most commonly used in current RAG research and released them on Hugging Face Datasets. Users can download and use them directly without any further processing. The table below lists the supported task types and dataset statistics:
| Task Type | Dataset Name | Original Data Quantity | Leaderboard Sample Quantity |
| --- | --- | --- | --- |
| qa | nq | 3,610 | 1,000 |
| qa | TriviaQA | 11,313 | 1,000 |
| qa | popqa | 14,267 | 1,000 |
| qa | AmbigQA | 2,002 | 1,000 |
| qa | MarcoQA | 101,093; 55,636 (filtered no-answer version) | 1,000 (based on filtered version) |
| qa | WebQuestions | 2,032 | 1,000 |
| Multi-hop qa | hotpotqa | 7,405 | 1,000 |
| Multi-hop qa | 2WikiMultiHopQA | 12,576 | 1,000 |
| Multi-hop qa | Musique | 2,417 | 1,000 |
| Multi-hop qa | bamboogle | 125 | 125 (unprocessed) |
| Multi-hop qa | strategy-qa | 2,290 | 1,000 |
| Multiple-choice | ARC | 3,548 (options are uppercase letters A-E; option E appears in 1 item) | 1,000 |
| Multiple-choice | mmlu | 14,042 (options are uppercase letters A-D) | 1,000 |
| Long-form QA | ASQA | 948 | 948 (unprocessed) |
| fact-verification | FEVER | 13,332 (only SUPPORTS and REFUTES labels retained) | 1,000 |
| dialogue | WoW | 3,054 | 1,000 |
| slot-filling | T-REx | 5,000 | 1,000 |
Corpus Statistics:

| Corpus Name | Number of Documents |
| --- | --- |
| wiki2018 | 21,015,324 |
| wiki2024 | Coming soon |

Data Format Description

We recommend users process all test data into `.jsonl` format, following the structure specifications below to ensure compatibility with UltraRAG modules.

Non-multiple-choice data format:
{
  "id": 0,  // integer identifier
  "question": "xxxx",  // question text
  "golden_answers": ["xxx", "xxx"],  // list of standard answers, can contain multiple
  "metadata": { ... }  // other information fields, optional
}
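As a quick sanity check, a record in this format can be written and read back with Python's standard library. The field values and the filename below are illustrative; only the field names and types follow the specification above:

```python
import json

# One evaluation record following the non-multiple-choice schema above.
record = {
    "id": 0,  # integer identifier
    "question": "Who wrote The Old Man and the Sea?",
    "golden_answers": ["Ernest Hemingway", "Hemingway"],  # multiple answers allowed
    "metadata": {"source": "example"},  # optional extra fields
}

# A .jsonl file holds one JSON object per line.
with open("test_data.jsonl", "w", encoding="utf-8") as f:
    f.write(json.dumps(record, ensure_ascii=False) + "\n")

# Read the file back and verify the required fields and types.
with open("test_data.jsonl", encoding="utf-8") as f:
    loaded = [json.loads(line) for line in f]

assert isinstance(loaded[0]["id"], int)
assert isinstance(loaded[0]["golden_answers"], list)
```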
Multiple-choice data format:
{
  "id": 0,
  "question": "xxxx",
  "golden_answers": ["A"],  // standard answer as option letter (e.g., A–D)
  "choices": ["xxx", "xxx", "xxx", "xxx"],  // list of option texts
  "metadata": { ... }
}
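Because `golden_answers` stores the option letter rather than the option text, evaluation code often needs to map the letter back to an entry in `choices`. A minimal sketch, assuming options are ordered A, B, C, ... (the helper name and example record are hypothetical, not part of UltraRAG's API):

```python
# Resolve a multiple-choice golden answer letter to its option text.
# Assumes choices[0] corresponds to "A", choices[1] to "B", and so on.
def resolve_choice(record):
    letter = record["golden_answers"][0]
    index = ord(letter) - ord("A")  # "A" -> 0, "B" -> 1, ...
    return record["choices"][index]

example = {
    "id": 0,
    "question": "Which color is a mix of blue and yellow?",
    "golden_answers": ["C"],
    "choices": ["red", "orange", "green", "purple"],
    "metadata": {},
}

answer_text = resolve_choice(example)  # "green"
```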
Corpus data format:
{
  "id": "0",
  "contents": "xxxxx"  // text segment after corpus chunking
}
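The `contents` field holds one chunk of a longer document. A minimal chunking sketch that emits records in this format; the fixed character-based chunk size is illustrative (real pipelines often chunk by tokens or sentences), and note that `id` is a string here, unlike in the QA formats:

```python
# Split a long document into fixed-size character chunks and emit
# corpus records in the format above.
def chunk_corpus(text, chunk_size=200):
    records = []
    for start in range(0, len(text), chunk_size):
        records.append({
            "id": str(len(records)),  # string identifier
            "contents": text[start:start + chunk_size],  # one text segment
        })
    return records

docs = chunk_corpus("some long document text " * 20, chunk_size=100)
```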