串型 RAG Workflow

本节将手把手带你实现最基本的RAG 工作流：Vanilla RAG。你将学会如何使用 UltraRAG 构建一个从数据加载、检索、生成到评估完整流程的推理系统。

Step 1：明确工作流结构

Vanilla RAG 的最基本流程如下：

数据加载 → 文档检索 → 模型生成 → 答案评估

为确保检索环节正常运行，请你优先完成语料库的编码与索引构建。相关内容可参考教程：[使用 UltraRAG 对大规模语料库进行编码与索引]

Step 2：实现必要Tool

在该流程中，我们希望模型的最终答案以 \boxed{} 包裹，以便后续自动提取。因此，我们需要：

自定义一个 Prompt Tool：将问题与检索内容构造成标准化的生成输入；
自定义一个 Tool：提取 \boxed{} 中的答案文本；
其余模块（检索、生成、评估）均可复用 UltraRAG 的现有组件。

Step 2.1: 构建 Prompt

首先准备 Prompt 模板文件 prompt/qa_boxed.jinja：

prompt/qa_boxed.jinja

Please answer the following question.
Think step by step.
Provide your final answer in the format \boxed{YOUR_ANSWER}.

Question: {{question}}

然后，在 Prompt Server 中实现如下 Tool：

servers/prompt/src/prompt.py

# prompt for QA RAG boxed
@app.prompt(output="q_ls,ret_psg,template->prompt_ls")
def qa_rag_boxed(
    q_ls: List[str], ret_psg: List[str | Any], template: str | Path
) -> list[PromptMessage]:
    template: Template = load_prompt_template(template)
    ret = []
    for q, psg in zip(q_ls, ret_psg):
        passage_text = "\n".join(psg)
        p = template.render(question=q, documents=passage_text)
        ret.append(p)
    return ret

Step 2.2: 提取答案

为了从模型输出中提取 \boxed{} 包裹的答案，可以在 Custom Server 中实现如下 Tool：

servers/custom/src/custom.py

@app.tool(output="ans_ls->pred_ls")
def output_extract_from_boxed(ans_ls: List[str]) -> Dict[str, List[str]]:
    def extract(ans: str) -> str:
        start = ans.rfind(r"\boxed{")
        if start == -1:
            content = ans.strip()
        else:
            i = start + len(r"\boxed{")
            brace_level = 1
            end = i
            while end < len(ans) and brace_level > 0:
                if ans[end] == "{":
                    brace_level += 1
                elif ans[end] == "}":
                    brace_level -= 1
                end += 1
            content = ans[i : end - 1].strip()
            content = re.sub(r"^\$+|\$+$", "", content).strip()
            content = re.sub(r"^\\\(|\\\)$", "", content).strip()
            if content.startswith(r"\text{") and content.endswith("}"):
                content = content[len(r"\text{") : -1].strip()
            content = content.strip("()").strip()
        # 还原 \\
        content = content.replace("\\", " ")
        content = content.replace("  ", " ")
        return content

    return {"pred_ls": [extract(ans) for ans in ans_ls]}

Step 3：编写 Pipeline 配置文件

完成上述代码开发后，在 examples/ 目录下创建一个新的配置文件：vanilla_rag.yaml。

examples/vanilla_rag.yaml

# Vanilla RAG

# MCP Server
servers:
  benchmark: servers/benchmark
  retriever: servers/retriever
  prompt: servers/prompt
  generation: servers/generation
  evaluation: servers/evaluation
  custom: servers/custom

# MCP Client Pipeline
pipeline:
- benchmark.get_data
# 如果没有部署好的 retriever，可替换为以下两步：
# - retriever.retriever_init      
# - retriever.retriever_search
- retriever.retriever_deploy_search
- prompt.qa_rag_boxed
- generation.generate
- custom.output_extract_from_boxed
- evaluation.evaluate

Step 4：配置 Pipeline 参数

执行下列命令：

ultrarag build examples/vanilla_rag.yaml

打开生成的 examples/parameter/vanilla_rag_parameter.yaml，修改如下配置：

examples/parameter/vanilla_rag_parameter.yaml

benchmark:
  benchmark:
    key_map:
      gt_ls: golden_answers
      q_ls: question
    limit: 2
    name: asqa
    path: data/sample_asqa_5.jsonl
custom: {}
evaluation:
  metrics:
  - acc
  - f1
  - em
  - coverem
  - stringem
  - rouge-1
  - rouge-2
  - rouge-l
  save_path: output/asqa.json
generation:
  base_url: http://localhost:8000/v1
  model_name: openbmb/MiniCPM4-8B
  sampling_params:
    extra_body:
      chat_template_kwargs:
        enable_thinking: false
      include_stop_str_in_output: true
      top_k: 20
    max_tokens: 2048
    temperature: 0.7
    top_p: 0.8
prompt:
  template: prompt/qa_boxed.jinja
retriever:
  query_instruction: 'Query: '
  retriever_url: http://localhost:8080
  top_k: 5

Step 5：运行你的推理流程！

一切准备就绪后，执行以下命令启动推理流程：

ultrarag run examples/vanilla_rag.yaml

评估结果

模型预测结果将自动评估，并保存至 Evaluation Server 中配置的 save_path 路径。例如：

output/asqa.json

运行日志

推理过程中的详细日志会保存在 logs/ 目录下，文件名基于运行时间生成，便于检索和复现。例如：

logs/20250804_193900.log

中间结果文件

所有流程执行过程中的中间结果（包括每一步输入输出）将记录为 memory 文件，默认保存在 output/ 目录，例如：

output/memory_asqa_vanilla_rag_20250804_193900.json

开始使用

开发指南

Step 1：明确工作流结构

Step 2：实现必要Tool

Step 2.1: 构建 Prompt

Step 2.2: 提取答案

Step 3：编写 Pipeline 配置文件

Step 4：配置 Pipeline 参数

Step 5：运行你的推理流程！

评估结果

运行日志

中间结果文件

开始使用

开发指南

​Step 1：明确工作流结构

​Step 2：实现必要Tool

​Step 2.1: 构建 Prompt

​Step 2.2: 提取答案

​Step 3：编写 Pipeline 配置文件

​Step 4：配置 Pipeline 参数

​Step 5：运行你的推理流程！

​评估结果

​运行日志

​中间结果文件

Step 1：明确工作流结构

Step 2：实现必要Tool

Step 2.1: 构建 Prompt

Step 2.2: 提取答案

Step 3：编写 Pipeline 配置文件

Step 4：配置 Pipeline 参数

Step 5：运行你的推理流程！

评估结果

运行日志

中间结果文件