retriever_init

Signature
async def retriever_init(
    model_name_or_path: str,
    backend_configs: Dict[str, Any],
    batch_size: int,
    corpus_path: str,
    gpu_ids: Optional[object] = None,
    is_multimodal: bool = False,
    backend: str = "sentence_transformers",
    index_backend: str = "faiss",
    index_backend_configs: Optional[Dict[str, Any]] = None,
    is_demo: bool = False,
    collection_name: str = "",
) -> None
Function
  • Initializes the retrieval service (a usage sketch follows below).
  • Embedding backend (backend): converts text or images into vectors (Infinity, SentenceTransformers, OpenAI, BM25).
  • Index backend (index_backend): stores the vectors and serves retrieval (FAISS, Milvus).
  • Demo mode: if is_demo=True, forces an OpenAI + Milvus configuration and ignores some of the other parameters.
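Example (sketch)
A minimal initialization sketch, assuming the retriever coroutines can be imported and awaited directly; the import path below is a placeholder, and the argument values mirror the example configuration later on this page.
import asyncio

# Placeholder import path; adjust to wherever your deployment exposes the retriever coroutines.
from servers.retriever import retriever_init

async def main():
    # Values mirror the parameter.yaml example in the Configuration section below.
    await retriever_init(
        model_name_or_path="openbmb/MiniCPM-Embedding-Light",
        backend_configs={"sentence_transformers": {"trust_remote_code": True}},
        batch_size=16,
        corpus_path="data/corpus_example.jsonl",
        gpu_ids="1",
        backend="sentence_transformers",
        index_backend="faiss",
        index_backend_configs={"faiss": {"index_path": "index/index.index"}},
    )

asyncio.run(main())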

retriever_embed

Signature
async def retriever_embed(
    embedding_path: Optional[str] = None,
    overwrite: bool = False,
    is_multimodal: bool = False,
) -> None
Function
  • (Non-demo mode) Computes vector representations of the corpus in batches and saves them as a .npy file (see the sketch below).
  • Applies only to dense embedding backends; BM25 is not supported.
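Example (sketch)
Continuing the initialization sketch above, inside the same async context (non-demo, dense backend); paths follow the example configuration.
# Inside the async context established in the retriever_init sketch above.
await retriever_embed(
    embedding_path="embedding/embedding.npy",  # where the .npy file is written
    overwrite=False,                           # set True to recompute an existing file
)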

retriever_index

Signature
async def retriever_index(
    embedding_path: str,
    overwrite: bool = False,
    collection_name: str = "",
    corpus_path: str = ""
) -> None
Function
  • Builds retrieval index.
  • FAISS: reads the embeddings from embedding_path (.npy) and builds a local index file.
  • Milvus / demo mode: reads corpus_path (.jsonl), generates vectors, and inserts them into the specified collection_name (sketch below).
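Example (sketch)
For the FAISS path, the index step consumes the .npy file written above; for Milvus or demo mode, corpus_path and collection_name drive the insertion instead. Same assumptions as the earlier sketch.
# Inside the same async context: build a local FAISS index from the saved embeddings.
await retriever_index(
    embedding_path="embedding/embedding.npy",
    overwrite=False,
)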

retriever_search

Signature
async def retriever_search(
    query_list: List[str],
    top_k: int = 5,
    query_instruction: str = "",
    collection_name: str = "",
) -> Dict[str, List[List[str]]]
Function
  • Retrieves passages for one or more queries.
  • Automatically vectorizes the queries (prepending query_instruction) and returns the top-k passages from the specified collection_name (Milvus) or the default index.
Output Format (JSON)
{"ret_psg": [["passage 1", "passage 2"], ["..." ]]} 

retriever_batch_search

Signature
async def retriever_batch_search(
    batch_query_list: List[List[str]],
    top_k: int = 5,
    query_instruction: str = "",
    collection_name: str = "",
) -> Dict[str, List[List[List[str]]]]
Function
  • Batch version of retriever_search that accepts a nested list of queries.
Output Format (JSON)
{"ret_psg_ls": [[["psg 1-1"], ["psg 1-2"]], [["psg 2-1"]]]}

bm25_index

Signature
async def bm25_index(
    overwrite: bool = False,
) -> None
Function
  • When backend="bm25", builds the BM25 sparse index and saves it to the configured save_path (sketch below).
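Example (sketch)
A sketch assuming the service was initialized with backend="bm25" (the bm25 block in backend_configs supplies lang and save_path).
# Requires retriever_init(..., backend="bm25", ...) beforehand, inside the same async context.
await bm25_index(overwrite=False)  # set overwrite=True to rebuild an existing index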

bm25_search

Signature
async def bm25_search(
    query_list: List[str],
    top_k: int = 5,
) -> Dict[str, List[List[str]]]
Function
  • Keyword retrieval based on the BM25 algorithm.
Output Format (JSON)
{"ret_psg": [["passage 1", "passage 2"], ["..." ]]} 

retriever_deploy_search

Signature
async def retriever_deploy_search(
    retriever_url: str,
    query_list: List[str],
    top_k: int = 5,
    query_instruction: str = "",
) -> Dict[str, List[List[str]]]
Function
  • Acts as a client and sends the queries to the remote retrieval service deployed at retriever_url.
Output Format (JSON)
{"ret_psg": [["passage 1", "passage 2"], ["..." ]]} 

retriever_exa_search

Signature
async def retriever_exa_search(
    query_list: List[str],
    top_k: Optional[int] = 5,
    retrieve_thread_num: Optional[int] = 1,
) -> Dict[str, List[List[str]]]
Function
  • Calls Exa Web retrieval (requires EXA_API_KEY).
Output Format (JSON)
{"ret_psg": [["snippet 1", "snippet 2"], ["..." ]]} 

retriever_tavily_search

Signature
async def retriever_tavily_search(
    query_list: List[str],
    top_k: Optional[int] = 5,
    retrieve_thread_num: Optional[int] = 1,
) -> Dict[str, List[List[str]]]
Function
  • Calls Tavily Web retrieval (requires TAVILY_API_KEY).
Output Format (JSON)
{"ret_psg": [["snippet 1", "snippet 2"], ["..." ]]} 

retriever_zhipuai_search

Signature
async def retriever_zhipuai_search(
    query_list: List[str],
    top_k: Optional[int] = 5,
    retrieve_thread_num: Optional[int] = 1,
) -> Dict[str, List[List[str]]]
Function
  • Calls ZhipuAI web_search (requires ZHIPUAI_API_KEY).
Output Format (JSON)
{"ret_psg": [["snippet 1", "snippet 2"], ["..." ]]} 

Configuration

servers/retriever/parameter.yaml
model_name_or_path: openbmb/MiniCPM-Embedding-Light
corpus_path: data/corpus_example.jsonl
embedding_path: embedding/embedding.npy
collection_name: wiki

# Embedding Backend Configuration
backend: sentence_transformers # options: infinity, sentence_transformers, openai, bm25
backend_configs:
  infinity:
    bettertransformer: false
    pooling_method: auto
    model_warmup: false
    trust_remote_code: true
  sentence_transformers:
    trust_remote_code: true
    sentence_transformers_encode:
      normalize_embeddings: false
      encode_chunk_size: 256
      q_prompt_name: query
      psg_prompt_name: document
      psg_task: null
      q_task: null
  openai:
    model_name: text-embedding-3-small
    base_url: "https://api.openai.com/v1"
    api_key: "abc"
  bm25:
    lang: en
    save_path: index/bm25

# Index Backend Configuration
index_backend: faiss # options: faiss, milvus
index_backend_configs:
  faiss:
    index_use_gpu: true
    index_chunk_size: 10000
    index_path: index/index.index
  milvus:
    uri: index/milvus_demo.db # Local file for Lite, or http://host:port
    token: null
    id_field_name: id
    vector_field_name: vector
    text_field_name: contents
    index_params:
      index_type: AUTOINDEX
      metric_type: IP

batch_size: 16
top_k: 5
gpu_ids: "1"
query_instruction: ""
is_multimodal: false
overwrite: false
retrieve_thread_num: 1
retriever_url: "http://127.0.0.1:64501"
is_demo: false
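
Example (sketch)
The YAML keys map one-to-one onto the tool arguments above; a small sketch of reading the nested blocks with PyYAML (how UltraRAG itself loads this file may differ).
import yaml

# Load the example configuration and pull out nested backend settings.
with open("servers/retriever/parameter.yaml") as f:
    cfg = yaml.safe_load(f)

active_backend = cfg["backend"]  # "sentence_transformers" in the example
encode_opts = cfg["backend_configs"][active_backend]["sentence_transformers_encode"]
faiss_index_path = cfg["index_backend_configs"]["faiss"]["index_path"]

print(active_backend, encode_opts["encode_chunk_size"], faiss_index_path)
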
Parameter Description:

Parameter | Type | Description
model_name_or_path | str | Retrieval model path or name (e.g., a HuggingFace model ID)
corpus_path | str | Input corpus JSONL file path
embedding_path | str | Save path for the embedding vector file (.npy)
collection_name | str | Milvus collection name
backend | str | Retrieval (embedding) backend: infinity, sentence_transformers, openai, bm25
index_backend | str | Index backend: faiss, milvus
backend_configs | dict | Parameter configuration for each embedding backend (see table below)
index_backend_configs | dict | Parameter configuration for each index backend (see table below)
batch_size | int | Batch size for vector generation or retrieval
top_k | int | Number of candidate passages returned
gpu_ids | str | Visible GPU devices, e.g., "0,1"
query_instruction | str | Query prefix (used by instruction-tuned models)
is_multimodal | bool | Whether to enable multimodal embedding (e.g., images)
overwrite | bool | Whether to overwrite an existing embedding or index file
retrieve_thread_num | int | Number of concurrent threads for external web retrieval (Exa/Tavily/Zhipu)
retriever_url | str | URL of a deployed retriever server
is_demo | bool | Demo-mode switch (forces OpenAI + Milvus with a simplified configuration)
backend_configs sub-items:

Backend | Parameter | Type | Description
infinity | bettertransformer | bool | Whether to enable efficient inference optimization
infinity | pooling_method | str | Pooling method (e.g., auto, mean)
infinity | model_warmup | bool | Whether to preload the model into VRAM
infinity | trust_remote_code | bool | Whether to trust remote code (for custom models)
sentence_transformers | trust_remote_code | bool | Whether to trust remote model code
sentence_transformers | sentence_transformers_encode | dict | Detailed encoding parameters, see table below
openai | model_name | str | OpenAI model name (e.g., text-embedding-3-small)
openai | base_url | str | API base URL
openai | api_key | str | OpenAI API key
bm25 | lang | str | Language (determines stop words and tokenizer)
bm25 | save_path | str | Save directory for the BM25 sparse index
sentence_transformers_encode parameters:

Parameter | Type | Description
normalize_embeddings | bool | Whether to normalize the output vectors
encode_chunk_size | int | Encoding chunk size (to avoid VRAM overflow)
q_prompt_name | str | Query prompt template name
psg_prompt_name | str | Passage prompt template name
q_task | str | Task description for queries (for models that require an explicit task)
psg_task | str | Task description for passages (for models that require an explicit task)
index_backend_configs parameters:

Backend | Parameter | Type | Description
faiss | index_use_gpu | bool | Whether to use the GPU to build and search the index
faiss | index_chunk_size | int | Batch size when building the index
faiss | index_path | str | Save path for the FAISS index file (.index)
milvus | uri | str | Milvus connection address (a local file path enables Milvus Lite)
milvus | token | str | Authentication token (if required)
milvus | id_field_name | str | Primary-key field name (default id)
milvus | vector_field_name | str | Vector field name (default vector)
milvus | text_field_name | str | Text-content field name (default contents)
milvus | id_max_length | int | Maximum length of a string primary key
milvus | text_max_length | int | Maximum length of the text field (truncated if longer)
milvus | metric_type | str | Distance metric (e.g., IP inner product, L2 Euclidean distance)
milvus | index_params | dict | Index construction parameters (e.g., index_type: AUTOINDEX)
milvus | search_params | dict | Search parameters (e.g., nprobe)
milvus | index_chunk_size | int | Batch size when inserting data