This section walks you through implementing a typical iterative-reasoning RAG workflow: IterRetGen. The process repeatedly updates the query based on the model's output, gradually converging on a better answer.
Paper: https://arxiv.org/pdf/2305.15294

Step 1: Clarify the Workflow Structure

Let's first review the workflow of IterRetGen as proposed in the paper. In UltraRAG, you can quickly implement this workflow from existing modules; its overall architecture can be abstracted as follows:
  • The answer generated by the model in each round is concatenated with the original question to form the query for the next round of retrieval.
  • This process repeats up to N times, where N is the configured maximum number of loop iterations.
  • Apart from the prompt and the custom module, which you implement yourself, everything else can be reused directly from UltraRAG's built-in tools.
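The loop above can be sketched in plain Python. This is a minimal sketch of the control flow only; `retrieve` and `generate` are hypothetical stand-ins for UltraRAG's retriever and generation servers, not real APIs:

```python
def iter_ret_gen(question, retrieve, generate, max_iters=3):
    # Round 0: retrieve with the original question and produce a first answer
    passages = retrieve(question)
    answer = generate(question, passages)
    for _ in range(max_iters):
        # Concatenate the original question with the latest answer
        # to form the query for the next retrieval round
        query = f"{question} {answer}"
        passages = retrieve(query)
        answer = generate(question, passages)
    return answer
```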

Step 2: Implement the Necessary Tool

IterRetGen concatenates the original query with the answer generated in the current round to form the query for the next round. To support this, add the following tool in servers/custom/src/custom.py:
servers/custom/src/custom.py
from typing import Dict, List

@app.tool(output="q_ls,ans_ls->nextq_ls")
def iterretgen_nextquery(
    q_ls: List[str],
    ans_ls: List[str],
) -> Dict[str, List[str]]:
    # Pair each question with its latest answer and concatenate them
    # to form the query for the next retrieval round
    ret = []
    for q, ans in zip(q_ls, ans_ls):
        next_query = f"{q} {ans}"
        ret.append(next_query)
    return {"nextq_ls": ret}
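To make the expected behavior concrete, here is a standalone sketch of the same concatenation logic (without the @app.tool decorator), showing the input and output shapes:

```python
from typing import Dict, List

def next_queries(q_ls: List[str], ans_ls: List[str]) -> Dict[str, List[str]]:
    # Join each question with its latest answer, separated by a space
    return {"nextq_ls": [f"{q} {ans}" for q, ans in zip(q_ls, ans_ls)]}

print(next_queries(["Who wrote Hamlet?"], ["Shakespeare"]))
# {'nextq_ls': ['Who wrote Hamlet? Shakespeare']}
```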

Step 3: Write the Pipeline Configuration File

Create a new YAML file under the examples/ directory, such as IterRetGen.yaml:
examples/IterRetGen.yaml
# IterRetGen demo

# MCP Server
servers:
  benchmark: servers/benchmark
  retriever: servers/retriever
  prompt: servers/prompt
  generation: servers/generation
  evaluation: servers/evaluation
  custom: servers/custom

# MCP Client Pipeline
pipeline:
- benchmark.get_data
- retriever.retriever_deploy_search
- prompt.qa_rag_boxed
- generation.generate
- custom.output_extract_from_boxed
- loop:
    times: 3
    steps:
    - custom.iterretgen_nextquery:
        input:
          ans_ls: pred_ls
    - retriever.retriever_deploy_search:
        input:
          query_list: nextq_ls
    - prompt.qa_rag_boxed
    - generation.generate
    - custom.output_extract_from_boxed
- evaluation.evaluate

Step 4: Configure Pipeline Parameters

Run the following command:
ultrarag build examples/IterRetGen.yaml
Open the generated examples/parameter/IterRetGen_parameter.yaml and modify the configuration as follows:
examples/parameter/IterRetGen_parameter.yaml
benchmark:
  benchmark:
    key_map:
      gt_ls: golden_answers
      q_ls: question
    limit: 2
    name: asqa
    path: data/sample_asqa_5.jsonl
custom: {}
evaluation:
  metrics:
  - acc
  - f1
  - em
  - coverem
  - stringem
  - rouge-1
  - rouge-2
  - rouge-l
  save_path: output/asqa.json
generation:
  base_url: http://localhost:8000/v1
  model_name: openbmb/MiniCPM4-8B
  sampling_params:
    extra_body:
      chat_template_kwargs:
        enable_thinking: false
      include_stop_str_in_output: true
      top_k: 20
    max_tokens: 2048
    temperature: 0.7
    top_p: 0.8
prompt:
  template: prompt/qa_boxed.jinja
retriever:
  query_instruction: 'Query: '
  retriever_url: http://localhost:8080
  top_k: 5

Step 5: Run Your Inference Pipeline!

Once everything is ready, execute the following command to start the inference pipeline:
ultrarag run examples/IterRetGen.yaml