DeepResearch-Like RAG Workflow

在实际应用中，用户提出的问题往往包含模糊、泛化或需要多轮知识补充的信息点，传统的一次性检索 + 生成（RAG）流程难以准确作答。为了解决这一问题，DeepResearch 提出了一种更具“研究式思考”特征的 RAG 工作流：模型在回答前会先进行规划，主动识别缺失信息，分阶段生成子问题，并通过多轮“搜索 - 推理 - 页面更新”构建完整的信息框架，最终生成高质量的答案。本节我们将基于 UltraRAG 框架实现一个 DeepResearch-like 系统。该系统引导模型根据原始问题生成整体结构规划（Plan）和初始页面（Page），随后通过多轮子查询生成与检索，对页面逐步填充、精化，构建出与问题密切相关的知识页面。

Step 1：明确工作流结构

该算法包含以下关键阶段：

Step 1.1：初始化规划与页面结构

大模型基于原始问题生成一个完整的 Plan（结构化的页面规划），并构建带有 [待补充] 占位符的初始页面。示例输入：

Question: 《夜之摇滚人》的演唱者在《欲望的旋律》中饰演了谁？

示例 Plan：

{
  "mainTitle": "识别《夜之摇滚人》的演唱者及其在《欲望的旋律》中的角色",
  "sections": [
    {
      "title": "了解歌曲及其演唱者",
      "focus": "本节将识别《夜之摇滚人》这首歌的演唱者，并提供关于该艺术家的背景信息。",
      "subtopics": [
        "《夜之摇滚人》歌曲概述",
        "与该歌曲相关的演唱者身份确认"
      ]
    }... 

初始页面：

# 识别《夜之摇滚人》的演唱者及其在《欲望的旋律》中的角色
## 了解歌曲及其演唱者
[待补充]
## 《欲望的旋律》概览
[待补充]
## 演唱者在《欲望的旋律》中的角色
[待补充]

Step 1.2：进入迭代流程

每一轮中，模型：

拆解当前待补充内容，生成子问题（Sub-question）
对子问题进行检索，返回文档
基于当前 Plan、页面与文档内容更新页面中对应段落

该过程会持续进行，直到：

页面中已无 [待补充] 内容
或达到最大迭代轮数（默认 10 轮）

Step 2：实现必要 Tool

Step 2.1：实现 prompt server 用到的函数

提示模版位于 prompt/ 目录，函数实现于 servers/prompt/src/prompt.py，每个函数绑定一个模板文件，需在 parameter.yaml 中单独注册字段（避免冲突）：

servers/prompt/src/prompt.py

@app.prompt(output="q_ls,plan_ls,webnote_init_page_template->prompt_ls")
def webnote_init_page(
    q_ls: List[str],
    plan_ls: List[str],
    template: str | Path,
) -> List[PromptMessage]:
    template: Template = load_prompt_template(template)
    all_prompts = []
    for q, plan in zip(q_ls, plan_ls):
        p = template.render(question=q, plan=plan)
        all_prompts.append(p)
    return all_prompts

@app.prompt(output="q_ls,webnote_gen_plan_template->prompt_ls")
def webnote_gen_plan(
    q_ls: List[str],
    template: str | Path,
) -> List[PromptMessage]:
    template: Template = load_prompt_template(template)
    all_prompts = []
    for q in q_ls:
        p = template.render(question=q)
        all_prompts.append(p)
    return all_prompts

@app.prompt(output="q_ls,plan_ls,page_ls,webnote_gen_subq_template->prompt_ls")
def webnote_gen_subq(
    q_ls: List[str],
    plan_ls: List[str],
    page_ls: List[str],
    template: str | Path,
) -> List[PromptMessage]:
    template: Template = load_prompt_template(template)
    all_prompts = []
    for q, plan, page in zip(q_ls, plan_ls, page_ls):
        p = template.render(question=q, plan=plan, page=page)
        all_prompts.append(p)
    return all_prompts

@app.prompt(output="q_ls,plan_ls,page_ls,subq_ls,psg_ls,webnote_fill_page_template->prompt_ls")
def webnote_fill_page(
    q_ls: List[str],
    plan_ls: List[str],
    page_ls: List[str],
    subq_ls: List[str],
    psg_ls: List[Any],
    template: str | Path,
) -> List[PromptMessage]:
    template: Template = load_prompt_template(template)
    all_prompts = []
    for q, plan, page, subq, psg in zip(q_ls, plan_ls, page_ls, subq_ls, psg_ls):
        p = template.render(question=q, plan=plan, page=page, subq=subq, psg=psg)
        all_prompts.append(p)
    return all_prompts

@app.prompt(output="q_ls,plan_ls,page_ls,webnote_gen_answer_template->prompt_ls")
def webnote_gen_answer(
    q_ls: List[str],
    plan_ls: List[str],
    page_ls: List[str],
    template: str | Path,
) -> List[PromptMessage]:
    template: Template = load_prompt_template(template)
    all_prompts = []
    for q, plan, page in zip(q_ls, plan_ls, page_ls):
        p = template.render(question=q, plan=plan, page=page)
        all_prompts.append(p)
    return all_prompts

完整函数列表如下：

webnote_gen_plan：根据问题生成结构化 Plan
webnote_init_page：根据 Plan 构建初始页面
webnote_gen_subq：生成子查询
webnote_fill_page：结合子查询结果，填充页面内容
webnote_gen_answer：整合页面信息生成最终答案

每个函数都对应一个 .jinja 模板。除此之外，请确保在 servers/prompt/parameter.yaml 中新增以下参数配置，以显式指定各模板路径：

servers/prompt/parameter.yaml

template: prompt/qa_boxed.jinja
webnote_gen_plan_template: prompt/webnote_gen_plan.jinja
webnote_init_page_template: prompt/webnote_init_page.jinja
webnote_gen_subq_template: prompt/webnote_gen_subq.jinja
webnote_fill_page_template: prompt/webnote_fill_page.jinja
webnote_gen_answer_template: prompt/webnote_gen_answer.jinja

Step 2.2：实现 Router Server

用于判断当前页面是否已完成填充。若仍存在 [to be filled] 等占位符，标记为 incomplete，继续循环，否则终止流程。

servers/router/src/router.py

@app.tool(output="page_ls->page_ls")
def webnote_check_page(page_ls: List[str]) -> Dict[str, List[Dict[str, str]]]:
    """Check if the page is complete or incomplete.
    Args:
        page_ls (list): List of pages to check.
    Returns:
        dict: Dictionary containing the list of pages with their states.
    """
    page_ls = [
        {
            "data": page,
            "state": "incomplete" if "to be filled" in page.lower() else "complete",
        }
        for page in page_ls
    ]
    return {"page_ls": page_ls}

Step 3：编写 Pipeline 配置文件

在 examples/webnote.yaml 中定义如下模块结构和执行流程：

examples/webnote.yaml

# WebNote demo

# MCP Server
servers:
  benchmark: servers/benchmark
  generation: servers/generation
  retriever: servers/retriever
  prompt: servers/prompt
  evaluation: servers/evaluation
  custom: servers/custom
  router: servers/router

# MCP Client Pipeline
pipeline:
- benchmark.get_data
# 初始化检索服务
- retriever.retriever_deploy_search
# 加载数据集

# 生成plan
- prompt.webnote_gen_plan
- generation.generate:
    output:
      ans_ls: plan_ls
# 初始化page
- prompt.webnote_init_page
- generation.generate:
    output:
      ans_ls: page_ls
# 循环，生成子问题，检索，逐步填充page
- loop:
    times: 10
    steps:
    # 触发器检查，判断page是否完成
    - branch:
        router:
        - router.webnote_check_page
        branches:
          # 如果page没有完成，继续
          incomplete:
          # 生成子问题
          - prompt.webnote_gen_subq
          - generation.generate:
              output:
                ans_ls: subq_ls
          # 检索答案
          - retriever.retriever_deploy_search:
              input:
                query_list: subq_ls
              output:
                ret_psg: psg_ls
          # 填充page
          - prompt.webnote_fill_page
          - generation.generate:
              output:
                ans_ls: page_ls
          # 如果page完成，结束
          complete: []
# 生成答案
- prompt.webnote_gen_answer
- generation.generate
# 评估结果
- custom.output_extract_from_boxed
- evaluation.evaluate

Step 4：配置 Pipeline 参数

执行以下命令构建参数模板：

ultrarag build examples/webnote.yaml

webnote_parameter.yaml 的格式如下所示：

examples/webnote_parameter.yaml

benchmark:
  benchmark:
    key_map:
      gt_ls: golden_answers
      q_ls: question
    limit: 2
    name: asqa
    path: data/sample_asqa_5.jsonl
custom: {}
evaluation:
  metrics:
  - acc
  - f1
  - em
  - coverem
  - stringem
  - rouge-1
  - rouge-2
  - rouge-l
  save_path: output/nq.json
generation:
  base_url: http://localhost:8000/v1
  model_name: openbmb/MiniCPM4-8B
  sampling_params:
    extra_body:
      chat_template_kwargs:
        enable_thinking: false
      include_stop_str_in_output: true
      top_k: 20
    max_tokens: 2048
    temperature: 0.7
    top_p: 0.8
prompt:
  webnote_fill_page_template: prompt/webnote_fill_page.jinja
  webnote_gen_answer_template: prompt/webnote_gen_answer.jinja
  webnote_gen_plan_template: prompt/webnote_gen_plan.jinja
  webnote_gen_subq_template: prompt/webnote_gen_subq.jinja
  webnote_init_page_template: prompt/webnote_init_page.jinja
retriever:
  query_instruction: 'Query: '
  retriever_url: http://localhost:8080
  top_k: 5

Step 5：运行你的推理流程！

一切准备就绪后，执行以下命令启动推理流程：

ultrarag run examples/webnote.yaml

开始使用

开发指南

DeepResearch-Like RAG Workflow

Step 1：明确工作流结构

Step 1.1：初始化规划与页面结构

Step 1.2：进入迭代流程

Step 2：实现必要 Tool

Step 2.1：实现 prompt server 用到的函数

Step 2.2：实现 Router Server

Step 3：编写 Pipeline 配置文件

Step 4：配置 Pipeline 参数

Step 5：运行你的推理流程！

开始使用

开发指南

​Step 1：明确工作流结构

​Step 1.1：初始化规划与页面结构

​Step 1.2：进入迭代流程

​Step 2：实现必要 Tool

​Step 2.1：实现 prompt server 用到的函数

​Step 2.2：实现 Router Server

​Step 3：编写 Pipeline 配置文件

​Step 4：配置 Pipeline 参数

​Step 5：运行你的推理流程！

Step 1：明确工作流结构

Step 1.1：初始化规划与页面结构

Step 1.2：进入迭代流程

Step 2：实现必要 Tool

Step 2.1：实现 prompt server 用到的函数

Step 2.2：实现 Router Server

Step 3：编写 Pipeline 配置文件

Step 4：配置 Pipeline 参数

Step 5：运行你的推理流程！