> ## Documentation Index
> Fetch the complete documentation index at: https://ultrarag.openbmb.cn/llms.txt
> Use this file to discover all available pages before exploring further.

# Retriever

## 作用

Retriever Server 是 UltraRAG 中的核心检索模块，集成了 模型加载、文本编码、索引构建与检索查询 等一体化功能。
它原生支持 [Sentence-Transformers](https://github.com/UKPLab/sentence-transformers)、[Infinity](https://github.com/michaelfeil/infinity) 以及 [OpenAI](https://platform.openai.com/docs/guides/embeddings) 等多种后端接口，
可灵活适配不同规模与类型的语料库，满足 大规模向量化 与 高效文档召回 的需求。

## 使用示例

### 语料库编码与索引

以下示例展示如何使用 Retriever Server 对语料库进行编码与索引构建。

```yaml examples/corpus_index.yaml icon="https://mintcdn.com/ultrarag/T7GffHzZitf6TThi/images/yaml.svg?fit=max&auto=format&n=T7GffHzZitf6TThi&q=85&s=69b41e79144bc908039c2ee3abbb1c3b" theme={null}
# MCP Server
servers:
  retriever: servers/retriever

# MCP Client Pipeline
pipeline:
- retriever.retriever_init
- retriever.retriever_embed
- retriever.retriever_index
```

运行以下命令编译 Pipeline：

```shell theme={null}
ultrarag build examples/corpus_index.yaml
```

根据实际情况修改参数文件。下面分别展示两种典型场景：文本语料编码 与 图像语料编码。

1. 文本语料编码

示例：使用 `Qwen3-Embedding-0.6B` 对文本语料进行向量化。

```yaml examples/parameters/corpus_index_parameter.yaml icon="https://mintcdn.com/ultrarag/T7GffHzZitf6TThi/images/yaml.svg?fit=max&auto=format&n=T7GffHzZitf6TThi&q=85&s=69b41e79144bc908039c2ee3abbb1c3b" highlight="2" theme={null}
retriever:
  backend: sentence_transformers # 我们这里以st为例
  backend_configs:
    bm25:
      lang: en
      save_path: index/bm25
    infinity:
      bettertransformer: false
      model_warmup: false
      pooling_method: auto
      trust_remote_code: true
    openai:
      api_key: ''
      base_url: https://api.openai.com/v1
      model_name: text-embedding-3-small
    sentence_transformers:
      sentence_transformers_encode:
        encode_chunk_size: 256
        normalize_embeddings: false
        psg_prompt_name: document
        psg_task: null
        q_prompt_name: query
        q_task: null
      trust_remote_code: true
  batch_size: 16
  collection_name: wiki
  corpus_path: data/corpus_example.jsonl
  embedding_path: embedding/embedding.npy
  gpu_ids: '1'
  index_backend: faiss
  index_backend_configs:
    faiss:
      index_chunk_size: 10000
      index_path: index/index.index
      index_use_gpu: true
    milvus:
      id_field_name: id
      id_max_length: 64
      index_chunk_size: 1000
      index_params:
        index_type: AUTOINDEX
        metric_type: IP
      metric_type: IP
      search_params:
        metric_type: IP
        params: {}
      text_field_name: contents
      text_max_length: 60000
      token: null
      uri: index/milvus_demo.db
      vector_field_name: vector
  is_demo: false
  is_multimodal: false
  model_name_or_path: openbmb/MiniCPM-Embedding-Light # [!code --]
  model_name_or_path: Qwen/Qwen3-Embedding-0.6B # [!code ++]
  overwrite: false
```

2. 图像语料编码

示例：使用 `jinaai/jina-embeddings-v4` 对图像语料进行向量化。

```yaml examples/parameters/corpus_index_parameter.yaml icon="https://mintcdn.com/ultrarag/T7GffHzZitf6TThi/images/yaml.svg?fit=max&auto=format&n=T7GffHzZitf6TThi&q=85&s=69b41e79144bc908039c2ee3abbb1c3b" theme={null}
retriever:
  backend: sentence_transformers
  backend_configs:
    bm25:
      lang: en
      save_path: index/bm25
    infinity:
      bettertransformer: false
      model_warmup: false
      pooling_method: auto
      trust_remote_code: true
    openai:
      api_key: ''
      base_url: https://api.openai.com/v1
      model_name: text-embedding-3-small
    sentence_transformers:
      sentence_transformers_encode:
        encode_chunk_size: 256
        normalize_embeddings: false
        psg_prompt_name: document # [!code --]
        psg_task: null # [!code --]
        q_prompt_name: query # [!code --]
        q_task: null # [!code --]
        psg_prompt_name: null # [!code ++]
        psg_task: retrieval # [!code ++]
        q_prompt_name: query # [!code ++]
        q_task: retrieval # [!code ++]
      trust_remote_code: true
  batch_size: 16
  collection_name: wiki
  corpus_path: data/corpus_example.jsonl # [!code --]
  corpus_path: corpora/image.jsonl # [!code ++]
  embedding_path: embedding/embedding.npy
  gpu_ids: 0,1 # [!code --]
  gpu_ids: 1 # [!code ++]
  index_backend: faiss
  index_backend_configs:
    faiss:
      index_chunk_size: 10000
      index_path: index/index.index
      index_use_gpu: true
    milvus:
      id_field_name: id
      id_max_length: 64
      index_chunk_size: 1000
      index_params:
        index_type: AUTOINDEX
        metric_type: IP
      metric_type: IP
      search_params:
        metric_type: IP
        params: {}
      text_field_name: contents
      text_max_length: 60000
      token: null
      uri: index/milvus_demo.db
      vector_field_name: vector
  is_demo: false
  is_multimodal: false # [!code --]
  is_multimodal: true # [!code ++]
  model_name_or_path: openbmb/MiniCPM-Embedding-Light # [!code --]
  model_name_or_path: jinaai/jina-embeddings-v4 # [!code ++]
  overwrite: false
```

运行以下命令执行该 Pipeline：

```shell theme={null}
ultrarag run examples/corpus_index.yaml
```

编码与索引阶段通常涉及大规模语料处理，耗时较长。建议使用 `screen` 或 `nohup` 将任务挂载至后台运行，例如：

```shell theme={null}
nohup ultrarag run examples/corpus_index.yaml > log.txt 2>&1 &
```

### 向量检索

以下示例展示如何使用 Retriever Server 在已构建的索引上执行向量检索任务。

```yaml examples/corpus_search.yaml icon="https://mintcdn.com/ultrarag/T7GffHzZitf6TThi/images/yaml.svg?fit=max&auto=format&n=T7GffHzZitf6TThi&q=85&s=69b41e79144bc908039c2ee3abbb1c3b" theme={null}
# MCP Server
servers:
  benchmark: servers/benchmark
  retriever: servers/retriever

# MCP Client Pipeline
pipeline:
- benchmark.get_data
- retriever.retriever_init
- retriever.retriever_search

```

运行以下命令编译 Pipeline：

```shell theme={null}
ultrarag build examples/corpus_search.yaml
```

修改参数：

```yaml examples/parameters/corpus_search_parameter.yaml icon="https://mintcdn.com/ultrarag/T7GffHzZitf6TThi/images/yaml.svg?fit=max&auto=format&n=T7GffHzZitf6TThi&q=85&s=69b41e79144bc908039c2ee3abbb1c3b" theme={null}
benchmark:
  benchmark:
    key_map:
      gt_ls: golden_answers
      q_ls: question
    limit: -1
    name: nq
    path: data/sample_nq_10.jsonl
    seed: 42
    shuffle: false
retriever:
  backend: sentence_transformers
  backend_configs:
    bm25:
      lang: en
      save_path: index/bm25
    infinity:
      bettertransformer: false
      model_warmup: false
      pooling_method: auto
      trust_remote_code: true
    openai:
      api_key: ''
      base_url: https://api.openai.com/v1
      model_name: text-embedding-3-small
    sentence_transformers:
      sentence_transformers_encode:
        encode_chunk_size: 256
        normalize_embeddings: false
        psg_prompt_name: document
        psg_task: null
        q_prompt_name: query
        q_task: null
      trust_remote_code: true
  batch_size: 16
  collection_name: wiki
  corpus_path: data/corpus_example.jsonl
  gpu_ids: '1'
  index_backend: faiss
  index_backend_configs:
    faiss:
      index_chunk_size: 10000
      index_path: index/index.index
      index_use_gpu: true
    milvus:
      id_field_name: id
      id_max_length: 64
      index_chunk_size: 1000
      index_params:
        index_type: AUTOINDEX
        metric_type: IP
      metric_type: IP
      search_params:
        metric_type: IP
        params: {}
      text_field_name: contents
      text_max_length: 60000
      token: null
      uri: index/milvus_demo.db
      vector_field_name: vector
  is_demo: false
  is_multimodal: false
  model_name_or_path: openbmb/MiniCPM-Embedding-Light # [!code --]
  model_name_or_path: Qwen/Qwen3-Embedding-0.6B # [!code ++]
  query_instruction: ''
  top_k: 5
```

运行 Pipeline：

```shell theme={null}
ultrarag run examples/corpus_search.yaml
```

### BM25检索

除了向量检索外，UltraRAG 还内置了经典的 BM25 文本检索算法。BM25 是一种基于 词频–逆文档频率（TF-IDF） 改进的 稀疏检索方法，
常用于快速、轻量的文本语义匹配任务。在实际应用中，BM25 可与向量检索（Dense Retrieval）互补，共同提升检索的覆盖率与召回多样性。

Step 1：构建 BM25 索引

使用 BM25 进行检索前，需要先对文档进行分词并构建稀疏索引。

```yaml examples/bm25_index.yaml icon="https://mintcdn.com/ultrarag/T7GffHzZitf6TThi/images/yaml.svg?fit=max&auto=format&n=T7GffHzZitf6TThi&q=85&s=69b41e79144bc908039c2ee3abbb1c3b" theme={null}
# MCP Server
servers:
  retriever: servers/retriever

# MCP Client Pipeline
pipeline:
- retriever.retriever_init
- retriever.bm25_index
```

运行以下命令编译 Pipeline：

```shell theme={null}
ultrarag build examples/bm25_index.yaml
```

修改参数：

```yaml examples/parameters/bm25_index_parameter.yaml icon="https://mintcdn.com/ultrarag/T7GffHzZitf6TThi/images/yaml.svg?fit=max&auto=format&n=T7GffHzZitf6TThi&q=85&s=69b41e79144bc908039c2ee3abbb1c3b" theme={null}
retriever:
  backend: sentence_transformers # [!code --]
  backend: bm25 # [!code ++]
  backend_configs:
    bm25:
      lang: en
      save_path: index/bm25
    infinity:
      bettertransformer: false
      model_warmup: false
      pooling_method: auto
      trust_remote_code: true
    openai:
      api_key: ''
      base_url: https://api.openai.com/v1
      model_name: text-embedding-3-small
    sentence_transformers:
      sentence_transformers_encode:
        encode_chunk_size: 256
        normalize_embeddings: false
        psg_prompt_name: document
        psg_task: null
        q_prompt_name: query
        q_task: null
      trust_remote_code: true
  batch_size: 16
  collection_name: wiki
  corpus_path: data/corpus_example.jsonl
  gpu_ids: '1'
  index_backend: faiss
  index_backend_configs:
    faiss:
      index_chunk_size: 10000
      index_path: index/index.index
      index_use_gpu: true
    milvus:
      id_field_name: id
      id_max_length: 64
      index_chunk_size: 1000
      index_params:
        index_type: AUTOINDEX
        metric_type: IP
      metric_type: IP
      search_params:
        metric_type: IP
        params: {}
      text_field_name: contents
      text_max_length: 60000
      token: null
      uri: index/milvus_demo.db
      vector_field_name: vector
  is_demo: false
  is_multimodal: false
  model_name_or_path: openbmb/MiniCPM-Embedding-Light
  overwrite: false
```

运行：

```shell theme={null}
ultrarag run examples/bm25_index.yaml
```

Step 2：执行 BM25 检索

索引构建完成后，即可进行基于 BM25 的文档检索。

```yaml examples/bm25_search.yaml icon="https://mintcdn.com/ultrarag/T7GffHzZitf6TThi/images/yaml.svg?fit=max&auto=format&n=T7GffHzZitf6TThi&q=85&s=69b41e79144bc908039c2ee3abbb1c3b" theme={null}
# MCP Server
servers:
  benchmark: servers/benchmark
  retriever: servers/retriever

# MCP Client Pipeline
pipeline:
- benchmark.get_data
- retriever.retriever_init
- retriever.bm25_search
```

编译 Pipeline：

```shell theme={null}
ultrarag build examples/bm25_search.yaml
```

修改参数：

```yaml examples/parameters/bm25_search_parameter.yaml icon="https://mintcdn.com/ultrarag/T7GffHzZitf6TThi/images/yaml.svg?fit=max&auto=format&n=T7GffHzZitf6TThi&q=85&s=69b41e79144bc908039c2ee3abbb1c3b" theme={null}
benchmark:
  benchmark:
    key_map:
      gt_ls: golden_answers
      q_ls: question
    limit: -1
    name: nq
    path: data/sample_nq_10.jsonl
    seed: 42
    shuffle: false
retriever:
  backend: sentence_transformers # [!code --]
  backend: bm25 # [!code ++]
  backend_configs:
    bm25:
      lang: en
      save_path: index/bm25
    infinity:
      bettertransformer: false
      model_warmup: false
      pooling_method: auto
      trust_remote_code: true
    openai:
      api_key: ''
      base_url: https://api.openai.com/v1
      model_name: text-embedding-3-small
    sentence_transformers:
      sentence_transformers_encode:
        encode_chunk_size: 256
        normalize_embeddings: false
        psg_prompt_name: document
        psg_task: null
        q_prompt_name: query
        q_task: null
      trust_remote_code: true
  batch_size: 16
  collection_name: wiki
  corpus_path: data/corpus_example.jsonl
  gpu_ids: '1'
  index_backend: faiss
  index_backend_configs:
    faiss:
      index_chunk_size: 10000
      index_path: index/index.index
      index_use_gpu: true
    milvus:
      id_field_name: id
      id_max_length: 64
      index_chunk_size: 1000
      index_params:
        index_type: AUTOINDEX
        metric_type: IP
      metric_type: IP
      search_params:
        metric_type: IP
        params: {}
      text_field_name: contents
      text_max_length: 60000
      token: null
      uri: index/milvus_demo.db
      vector_field_name: vector
  is_demo: false
  is_multimodal: false
  model_name_or_path: openbmb/MiniCPM-Embedding-Light
  top_k: 5
```

运行检索流程：

```shell theme={null}
ultrarag run examples/bm25_search.yaml
```

### 混合检索

在实际应用中，单一的检索方式往往难以兼顾 召回率 与 精准度。
例如，BM25 在关键词匹配上表现出色，而向量检索则在语义理解上更具优势。
因此，UltraRAG 支持将稀疏检索（BM25）与稠密检索（Dense Retrieval）进行融合，
通过混合策略（Hybrid Retrieval）综合两者的优点，进一步提升检索的多样性与鲁棒性。

以下示例展示了如何在同一个 Pipeline 中同时运行 BM25 与向量检索，并通过自定义模块进行结果合并。

<Note>你可以参考本示例，将检索方式灵活扩展为任意组合，例如结合本地知识库与在线 Web 检索，或融合文本与图像等多模态检索结果，以构建更强大的混合检索 Pipeline。</Note>

```yaml examples/hybrid_search.yaml icon="https://mintcdn.com/ultrarag/T7GffHzZitf6TThi/images/yaml.svg?fit=max&auto=format&n=T7GffHzZitf6TThi&q=85&s=69b41e79144bc908039c2ee3abbb1c3b" theme={null}
# MCP Server
servers:
  benchmark: servers/benchmark
  dense: servers/retriever
  bm25: servers/retriever
  custom: servers/custom

# MCP Client Pipeline
pipeline:
- benchmark.get_data
- dense.retriever_init
- bm25.retriever_init
- dense.retriever_search:
    output:
      ret_psg: dense_psg
- bm25.bm25_search:
    output:
      ret_psg: sparse_psg
- custom.merge_passages:
    input:
      ret_psg: dense_psg
      temp_psg: sparse_psg
```

<Note>该 Pipeline 涉及 [参数重命名](/pages/cn/rag_client/data_and_params) 与 [模块复用](/pages/cn/rag_client/multi_agents) 机制，可点击链接查看详细说明。</Note>

运行以下命令编译 Pipeline：

```shell theme={null}
ultrarag build examples/hybrid_search.yaml
```

修改参数：

```yaml examples/parameters/hybrid_search_parameter.yaml icon="https://mintcdn.com/ultrarag/T7GffHzZitf6TThi/images/yaml.svg?fit=max&auto=format&n=T7GffHzZitf6TThi&q=85&s=69b41e79144bc908039c2ee3abbb1c3b" theme={null}
benchmark:
  benchmark:
    key_map:
      gt_ls: golden_answers
      q_ls: question
    limit: -1
    name: nq
    path: data/sample_nq_10.jsonl
    seed: 42
    shuffle: false
bm25:
  backend: sentence_transformers # [!code --]
  backend: bm25 # [!code ++]
  backend_configs:
    bm25:
      lang: en
      save_path: index/bm25
    infinity:
      bettertransformer: false
      model_warmup: false
      pooling_method: auto
      trust_remote_code: true
    openai:
      api_key: ''
      base_url: https://api.openai.com/v1
      model_name: text-embedding-3-small
    sentence_transformers:
      sentence_transformers_encode:
        encode_chunk_size: 256
        normalize_embeddings: false
        psg_prompt_name: document
        psg_task: null
        q_prompt_name: query
        q_task: null
      trust_remote_code: true
  batch_size: 16
  collection_name: wiki
  corpus_path: data/corpus_example.jsonl
  gpu_ids: '1'
  index_backend: faiss
  index_backend_configs:
    faiss:
      index_chunk_size: 10000
      index_path: index/index.index
      index_use_gpu: true
    milvus:
      id_field_name: id
      id_max_length: 64
      index_chunk_size: 1000
      index_params:
        index_type: AUTOINDEX
        metric_type: IP
      metric_type: IP
      search_params:
        metric_type: IP
        params: {}
      text_field_name: contents
      text_max_length: 60000
      token: null
      uri: index/milvus_demo.db
      vector_field_name: vector
  is_demo: false
  is_multimodal: false
  model_name_or_path: openbmb/MiniCPM-Embedding-Light
  top_k: 5
custom: {}
dense:
  backend: sentence_transformers
  backend_configs:
    bm25:
      lang: en
      save_path: index/bm25
    infinity:
      bettertransformer: false
      model_warmup: false
      pooling_method: auto
      trust_remote_code: true
    openai:
      api_key: ''
      base_url: https://api.openai.com/v1
      model_name: text-embedding-3-small
    sentence_transformers:
      sentence_transformers_encode:
        encode_chunk_size: 256
        normalize_embeddings: false
        psg_prompt_name: document
        psg_task: null
        q_prompt_name: query
        q_task: null
      trust_remote_code: true
  batch_size: 16
  collection_name: wiki
  corpus_path: data/corpus_example.jsonl
  gpu_ids: '1'
  index_backend: faiss
  index_backend_configs:
    faiss:
      index_chunk_size: 10000
      index_path: index/index.index
      index_use_gpu: true
    milvus:
      id_field_name: id
      id_max_length: 64
      index_chunk_size: 1000
      index_params:
        index_type: AUTOINDEX
        metric_type: IP
      metric_type: IP
      search_params:
        metric_type: IP
        params: {}
      text_field_name: contents
      text_max_length: 60000
      token: null
      uri: index/milvus_demo.db
      vector_field_name: vector
  is_demo: false
  is_multimodal: false
  model_name_or_path: openbmb/MiniCPM-Embedding-Light # [!code --]
  model_name_or_path: Qwen/Qwen3-Embedding-0.6B # [!code ++]
  query_instruction: ''
  top_k: 5
```

运行混合检索 Pipeline：

```shell theme={null}
ultrarag run examples/hybrid_search.yaml
```

### 部署检索模型

UltraRAG 完全兼容 OpenAI API 接口规范，因此任何符合该接口标准的Embedding 模型都可以直接接入，无需额外适配或修改代码。
以下示例展示如何使用 [vLLM](https://docs.vllm.ai/en/latest/cli/serve.html#parallelconfig) 部署本地检索模型。

**step1: 后台部署模型**

推荐使用 Screen 方式后台运行，以便实时查看日志和状态。

进入一个新的 Screen 会话：

```shell theme={null}
screen -S retriever
```

执行以下命令部署模型（以 Qwen3-Embedding-0.6B 为例）：

```shell script/vllm_serve_emb.sh theme={null}
CUDA_VISIBLE_DEVICES=2 python -m vllm.entrypoints.openai.api_server \
    --served-model-name qwen-embedding \
    --model Qwen/Qwen3-Embedding-0.6B \
    --trust-remote-code \
    --host 0.0.0.0 \
    --port 65504 \
    --task embed \
    --gpu-memory-utilization 0.2
```

出现类似以下输出，表示模型服务启动成功：

```
(APIServer pid=2270761) INFO:     Started server process [2270761]
(APIServer pid=2270761) INFO:     Waiting for application startup.
(APIServer pid=2270761) INFO:     Application startup complete.
```

按下 Ctrl + A + D 可退出并保持服务在后台运行。
如需重新进入该会话，可执行：

```shell theme={null}
screen -r retriever
```

**Step 2：修改 Pipeline 参数**

以 corpus\_search Pipeline 为例，只需将检索后端切换为 openai，
并将 base\_url 指向本地 vLLM 服务即可：

```yaml examples/parameters/corpus_search_parameter.yaml icon="https://mintcdn.com/ultrarag/T7GffHzZitf6TThi/images/yaml.svg?fit=max&auto=format&n=T7GffHzZitf6TThi&q=85&s=69b41e79144bc908039c2ee3abbb1c3b" theme={null}
benchmark:
  benchmark:
    key_map:
      gt_ls: golden_answers
      q_ls: question
    limit: -1
    name: nq
    path: data/sample_nq_10.jsonl
    seed: 42
    shuffle: false
retriever:
  backend: sentence_transformers # [!code --]
  backend: openai # [!code ++]
  backend_configs:
    bm25:
      lang: en
      save_path: index/bm25
    infinity:
      bettertransformer: false
      model_warmup: false
      pooling_method: auto
      trust_remote_code: true
    openai:
      api_key: '' # [!code --]
      api_key: 'abc' # [!code ++]
      base_url: https://api.openai.com/v1 # [!code --]
      base_url: http://127.0.0.1:65504/v1 # [!code ++]
      model_name: text-embedding-3-small # [!code --]
      model_name: qwen-embedding # [!code ++]
    sentence_transformers:
      sentence_transformers_encode:
        encode_chunk_size: 256
        normalize_embeddings: false
        psg_prompt_name: document
        psg_task: null
        q_prompt_name: query
        q_task: null
      trust_remote_code: true
  batch_size: 16
  collection_name: wiki
  corpus_path: data/corpus_example.jsonl
  gpu_ids: '1'
  index_backend: faiss
  index_backend_configs:
    faiss:
      index_chunk_size: 10000
      index_path: index/index.index
      index_use_gpu: true
    milvus:
      id_field_name: id
      id_max_length: 64
      index_chunk_size: 1000
      index_params:
        index_type: AUTOINDEX
        metric_type: IP
      metric_type: IP
      search_params:
        metric_type: IP
        params: {}
      text_field_name: contents
      text_max_length: 60000
      token: null
      uri: index/milvus_demo.db
      vector_field_name: vector
  is_demo: false
  is_multimodal: false
  model_name_or_path: openbmb/MiniCPM-Embedding-Light
  query_instruction: ''
  top_k: 5

```

完成配置后，即可像使用普通向量检索一样运行.

### Web Serach API

UltraRAG 原生集成了三种主流 Web 检索 API：[Tavily](https://www.tavily.com/)、[Exa](https://exa.ai/) 以及 [GLM](https://docs.z.ai/guides/tools/web-search)。
这些 API 可直接作为 Retriever Server 的检索后端使用，实现在线信息检索与实时知识增强。

**Step 1：配置 API Key**

使用前需设置对应服务的 API Key。你可以在运行 Pipeline 前手动导出环境变量：

```shell theme={null}
export TAVILY_API_KEY="your retriever key"
```

更推荐使用 .env 配置文件 进行统一管理：
在 UltraRAG 根目录下，将模板文件 `.env.dev` 重命名为 `.env`，
并填写你的密钥信息，例如：

```
LLM_API_KEY=
RETRIEVER_API_KEY=
TAVILY_API_KEY=tvly-dev-yourapikeyhere
EXA_API_KEY=
ZHIPUAI_API_KEY=
```

UltraRAG 会在启动时自动读取该文件并加载相关配置。

**Step 2：Web Search**

以下示例演示如何使用 Tavily API 进行 Web 检索：

```yaml examples/web_search.yaml icon="https://mintcdn.com/ultrarag/T7GffHzZitf6TThi/images/yaml.svg?fit=max&auto=format&n=T7GffHzZitf6TThi&q=85&s=69b41e79144bc908039c2ee3abbb1c3b" theme={null}
# MCP Server
servers:
  benchmark: servers/benchmark
  retriever: servers/retriever

# MCP Client Pipeline
pipeline:
- benchmark.get_data
- retriever.retriever_tavily_search
```

编译 Pipeline：

```shell theme={null}
ultrarag build examples/web_search.yaml
```

在自动生成的参数文件中，填写数据路径与检索参数：

```yaml examples/parameters/web_search_parameter.yaml icon="https://mintcdn.com/ultrarag/T7GffHzZitf6TThi/images/yaml.svg?fit=max&auto=format&n=T7GffHzZitf6TThi&q=85&s=69b41e79144bc908039c2ee3abbb1c3b" theme={null}
benchmark:
  benchmark:
    key_map:
      gt_ls: golden_answers
      q_ls: question
    limit: -1
    name: nq
    path: data/sample_nq_10.jsonl
    seed: 42
    shuffle: false
retriever:
  retrieve_thread_num: 1
  top_k: 5
```

执行以下命令即可启动 Web 检索流程：

```shell theme={null}
ultrarag run examples/web_search.yaml
```

<Note>你可以将 retriever\_tavily\_search 替换为 retriever\_exa\_search 或 retriever\_zhipuai\_search 作为 Web 检索源。</Note>

### 部署 Retriever Server

在同一语料库下测试多种 benchmark 或模型性能时，如果每次都重新初始化 retriever server，都会重复加载大型语料库和索引，耗时且低效。

为此，UltraRAG 提供了 常驻式 Retriever Server 部署脚本，可让 retriever 长时间运行在 CPU 或 GPU 上，避免重复加载，加速实验流程。

**step1: 参数设置**

与普通 retriever server 相同，需先准备配置文件：

```json script/deploy_retriever_config.json icon="https://mintcdn.com/ultrarag/T7GffHzZitf6TThi/images/json.svg?fit=max&auto=format&n=T7GffHzZitf6TThi&q=85&s=81a8c440100333f3454ca984a5b0fe5a" theme={null}
{
  "model_name_or_path": "openbmb/MiniCPM-Embedding-Light",
  "corpus_path": "data/corpus_example.jsonl",
  "collection_name": "ultrarag_embeddings",

  "backend": "sentence_transformers",
  "backend_configs": {
    "infinity": {
      "bettertransformer": false,
      "pooling_method": "auto",
      "model_warmup": false,
      "trust_remote_code": true
    },
    "sentence_transformers": {
      "trust_remote_code": true,
      "sentence_transformers_encode": {
        "normalize_embeddings": false,
        "encode_chunk_size": 10000,
        "q_prompt_name": "query",
        "psg_prompt_name": "document",
        "psg_task": null,
        "q_task": null
      }
    },
    "openai": {
      "model_name": "text-embedding-3-small",
      "base_url": "https://api.openai.com/v1",
      "api_key": ""
    },
    "bm25": {
      "lang": "en",
      "save_path": "index/bm25"
    }
  },

  "index_backend": "faiss",
  "index_backend_configs": {
    "faiss": {
      "index_use_gpu": true,
      "index_chunk_size": 50000,
      "index_path": "index/index.index"
    },
    "milvus": {
      "uri": "index/milvus_demo.db",
      "token": null,
      "id_field_name": "id",
      "vector_field_name": "vector",
      "text_field_name": "contents",
      "id_max_length": 64,
      "text_max_length": 60000,
      "metric_type": "IP",
      "index_params": {
        "index_type": "AUTOINDEX",
        "metric_type": "IP"
      },
      "search_params": {
        "metric_type": "IP",
        "params": {}
      },
      "index_chunk_size": 50000
    }
  },

  "batch_size": 16,
  "gpu_ids": "0,1",
  "is_multimodal": false,
  "is_demo": false
}
```

**step2: 后台部署**

推荐使用 Screen 以便让 retriever 后台长期运行、随时查看日志。

创建 Screen 会话：

```shell theme={null}
screen -S retriever
```

启动 retriever server：

```python script/deploy_retriever_server.py theme={null}
python ./script/deploy_retriever_server.py \
    --config_path script/deploy_retriever_config.json \
    --host 0.0.0.0 \
    --port 64501
```

Server 启动后，将常驻内存，无需重复加载语料与索引。

**step3: 在线检索**

在线检索时，无需重新初始化 retriever，只需在 pipeline 中指定已部署的地址：

```yaml examples/deploy_corpus_search.yaml icon="https://mintcdn.com/ultrarag/T7GffHzZitf6TThi/images/yaml.svg?fit=max&auto=format&n=T7GffHzZitf6TThi&q=85&s=69b41e79144bc908039c2ee3abbb1c3b"  theme={null}
#  Deploy Corpus Search Demo

# MCP Server
servers:
  benchmark: servers/benchmark
  retriever: servers/retriever

# MCP Client Pipeline
pipeline:
- benchmark.get_data
- retriever.retriever_deploy_search
```

运行以下命令编译 Pipeline：

```shell theme={null}
ultrarag build examples/deploy_corpus_search.yaml
```

修改参数：

```yaml examples/parameters/deploy_corpus_search_parameter.yaml icon="https://mintcdn.com/ultrarag/T7GffHzZitf6TThi/images/yaml.svg?fit=max&auto=format&n=T7GffHzZitf6TThi&q=85&s=69b41e79144bc908039c2ee3abbb1c3b" theme={null}
benchmark:
  benchmark:
    key_map:
      gt_ls: golden_answers
      q_ls: question
    limit: -1
    name: nq
    path: data/sample_nq_10.jsonl
    seed: 42
    shuffle: false
retriever:
  query_instruction: ''
  retriever_url: http://127.0.0.1:64501
  top_k: 5
```

运行 Pipeline：

```shell theme={null}
ultrarag run examples/deploy_corpus_search.yaml
```
