
retriever_init

Signature
def retriever_init(
    model_name_or_path: str,
    backend_configs: Dict[str, Any],
    batch_size: int,
    corpus_path: str,
    index_path: Optional[str] = None,
    faiss_use_gpu: bool = False,
    gpu_ids: Optional[object] = None,
    is_multimodal: bool = False,
    backend: str = "infinity",
) -> None
Function
  • Initializes the retrieval backend and corpus index.

retriever_embed

Signature
async def retriever_embed(
    embedding_path: Optional[str] = None,
    overwrite: bool = False,
    is_multimodal: bool = False,
) -> None
Function
  • Encodes corpus embeddings and saves them to *.npy.
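As a sketch of the on-disk format (assuming a standard float32 (N, D) array; the exact shape and dtype depend on the model):

```python
import numpy as np
import tempfile, os

# Toy stand-in for corpus embeddings: 4 passages, 8-dim vectors.
embeddings = np.random.rand(4, 8).astype(np.float32)

path = os.path.join(tempfile.mkdtemp(), "embedding.npy")
np.save(path, embeddings)  # what retriever_embed writes to embedding_path
loaded = np.load(path)     # what retriever_index later reads back
```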

retriever_index

Signature
def retriever_index(
    embedding_path: str,
    index_path: Optional[str] = None,
    overwrite: bool = False,
    index_chunk_size: int = 50000,
) -> None
Function
  • Builds a FAISS vector index over the embeddings.
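Conceptually, a flat vector index scores each query against every stored embedding by inner product and returns the top-k ids. A minimal NumPy equivalent of that lookup (the real implementation uses FAISS and inserts vectors in chunks of index_chunk_size):

```python
import numpy as np

# Three 2-dim corpus embeddings and one query embedding (toy values).
corpus = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]], dtype=np.float32)
query = np.array([[0.9, 0.1]], dtype=np.float32)

scores = query @ corpus.T                      # inner-product scores, shape (1, 3)
top_k = 2
ids = np.argsort(-scores, axis=1)[:, :top_k]   # best-scoring passage ids first

print(ids[0])  # → [0 2]
```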

bm25_index

Signature
def bm25_index(
    index_path: Optional[str] = None,
    overwrite: bool = False,
) -> None
Function
  • Builds a BM25 index and saves stopwords and the vocabulary.

retriever_search

Signature
async def retriever_search(
    query_list: List[str],
    top_k: int = 5,
    query_instruction: str = "",
) -> Dict[str, List[List[str]]]
Function
  • Standard semantic retrieval over the FAISS vector index; supports the infinity, sentence_transformers, and openai backends.
Output Format (JSON)
{"ret_psg": [["passage 1", "passage 2"], ["..."]]} 
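All search tools share this output shape: ret_psg holds one list of passages per query, in query order. A minimal consumer (result values here are hypothetical):

```python
# Hypothetical result for two queries, as returned by retriever_search.
result = {"ret_psg": [["passage 1", "passage 2"], ["passage 3"]]}

# Pair each query with its retrieved passages, preserving query order.
queries = ["what is RAG?", "what is BM25?"]
pairs = {q: psgs for q, psgs in zip(queries, result["ret_psg"])}
```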

retriever_search_colbert_maxsim

Signature
async def retriever_search_colbert_maxsim(
    query_list: List[str],
    embedding_path: str,
    top_k: int = 5,
    query_instruction: str = "",
) -> Dict[str, List[List[str]]]
Function
  • Supports ColBERT/ColPali multi-vector retrieval (only for the infinity backend).
  • Reads embedding_path (shape (N, Kd, D) or dtype=object for variable-length vectors), aggregates scores via MaxSim, and returns top-k results.
Output Format (JSON)
{"ret_psg": [["passage 1", "passage 2"], ["..."]]} 
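MaxSim aggregates multi-vector similarities: for each query token vector, take its maximum similarity over the passage's token vectors, then sum across query tokens. A NumPy sketch for one query against one passage (toy shapes; real embeddings are read from embedding_path):

```python
import numpy as np

q = np.array([[1.0, 0.0], [0.0, 1.0]])              # (Kq, D) query token vectors
p = np.array([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]])  # (Kd, D) passage token vectors

sim = q @ p.T                    # (Kq, Kd) token-level inner products
maxsim = sim.max(axis=1).sum()   # best passage match per query token, summed
```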

bm25_search

Signature
async def bm25_search(
    query_list: List[str],
    top_k: int = 5,
) -> Dict[str, List[List[str]]]
Function
  • Performs BM25 inverted index retrieval and returns the top-k text results for each query.
Output Format (JSON)
{"ret_psg": [["passage 1", "passage 2"], ["..."]]} 
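For intuition, BM25 ranks a passage by summing, over query terms, an IDF weight times a saturated term frequency. A toy scorer with standard k1/b parameters (the server's actual tokenizer and stopwords depend on the lang setting):

```python
import math

def bm25_score(query, doc, corpus, k1=1.5, b=0.75):
    """Score one tokenized doc against a tokenized query over a toy corpus."""
    avgdl = sum(len(d) for d in corpus) / len(corpus)
    score = 0.0
    for term in query:
        df = sum(term in d for d in corpus)  # document frequency of the term
        idf = math.log((len(corpus) - df + 0.5) / (df + 0.5) + 1)
        tf = doc.count(term)
        score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc) / avgdl))
    return score

corpus = [["fast", "vector", "search"], ["bm25", "text", "search"], ["image", "models"]]
scores = [bm25_score(["bm25", "search"], d, corpus) for d in corpus]
# doc 1 mentions both query terms, so it scores highest
```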

retriever_exa_search

Signature
async def retriever_exa_search(
    query_list: List[str],
    top_k: Optional[int] = 5,
    retrieve_thread_num: Optional[int] = 1,
) -> Dict[str, List[List[str]]]
Function
  • Performs Exa Web Search (requires EXA_API_KEY).
Output Format (JSON)
{"ret_psg": [["snippet 1", "snippet 2"], ["..."]]} 

retriever_tavily_search

Signature
async def retriever_tavily_search(
    query_list: List[str],
    top_k: Optional[int] = 5,
    retrieve_thread_num: Optional[int] = 1,
) -> Dict[str, List[List[str]]]
Function
  • Performs Tavily Web Search (requires TAVILY_API_KEY).
Output Format (JSON)
{"ret_psg": [["snippet 1", "snippet 2"], ["..."]]} 

retriever_zhipuai_search

Signature
async def retriever_zhipuai_search(
    query_list: List[str],
    top_k: Optional[int] = 5,
    retrieve_thread_num: Optional[int] = 1,
) -> Dict[str, List[List[str]]]
Function
  • Performs ZhipuAI Web Search (requires ZHIPUAI_API_KEY).
Output Format (JSON)
{"ret_psg": [["snippet 1", "snippet 2"], ["..."]]} 
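retrieve_thread_num bounds how many web queries run concurrently. The effect can be sketched with a thread pool over a stub search function (fetch_snippets is hypothetical, standing in for one Exa/Tavily/ZhipuAI API call):

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_snippets(query, top_k=5):
    # Hypothetical stand-in for one web search API call.
    return [f"snippet for {query!r} #{i}" for i in range(top_k)]

query_list = ["rag", "bm25", "colbert"]
retrieve_thread_num = 2  # at most 2 in-flight requests

with ThreadPoolExecutor(max_workers=retrieve_thread_num) as pool:
    results = list(pool.map(fetch_snippets, query_list))

output = {"ret_psg": results}  # same shape as the local search tools
```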

Parameter Configuration

servers/retriever/parameter.yaml
model_name_or_path: openbmb/MiniCPM-Embedding-Light
corpus_path: data/corpus_example.jsonl
embedding_path: embedding/embedding.npy
index_path: index/index.index

backend: sentence_transformers # options: infinity, sentence_transformers, openai, bm25
backend_configs:
  infinity:
    bettertransformer: false
    pooling_method: auto
    device: cuda
    model_warmup: false
    trust_remote_code: true
  sentence_transformers:
    device: cuda
    trust_remote_code: true
    sentence_transformers_encode:
      normalize_embeddings: false
      encode_chunk_size: 10000
      q_prompt_name: query
      psg_prompt_name: document
      psg_task: null
      q_task: null
  openai:
    model_name: text-embedding-3-small
    base_url: "https://api.openai.com/v1"
    api_key: ""
  bm25:
    lang: en

batch_size: 16
top_k: 5
gpu_ids: "0,1"
query_instruction: ""
is_multimodal: false
faiss_use_gpu: true
overwrite: false
index_chunk_size: 50000
retrieve_thread_num: 1
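At load time, only the backend_configs subtree matching backend matters. A sketch of that selection (the config dict mirrors the YAML above):

```python
# Mirrors the relevant part of parameter.yaml.
config = {
    "backend": "sentence_transformers",
    "backend_configs": {
        "sentence_transformers": {"device": "cuda", "trust_remote_code": True},
        "bm25": {"lang": "en"},
    },
}

# Pick the active backend's settings, as retriever_init would.
active = config["backend_configs"][config["backend"]]
```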
Parameter Description:

Parameter | Type | Description
model_name_or_path | str | Path or name of the retrieval model (e.g., a Hugging Face model ID)
corpus_path | str | Path to the input corpus JSONL file
embedding_path | str | Path to save the vector embeddings (.npy)
index_path | str | Path to save the FAISS/BM25 index (.index)
backend | str | Retrieval backend: infinity, sentence_transformers, openai, or bm25
backend_configs | dict | Backend-specific configuration (see below)
batch_size | int | Batch size for embedding generation or retrieval
top_k | int | Number of passages to return per query
gpu_ids | str | Visible GPU devices, e.g., "0,1"
query_instruction | str | Query prefix (used by instruction-tuned models)
is_multimodal | bool | Enables multimodal embedding (e.g., images)
faiss_use_gpu | bool | Whether to use GPU acceleration for FAISS
overwrite | bool | Whether to overwrite existing embedding or index files
index_chunk_size | int | Number of vectors per batch during index building
retrieve_thread_num | int | Number of concurrent threads for external web retrieval (Exa/Tavily/ZhipuAI)

backend_configs Subfields:

Backend | Parameter | Description
infinity | bettertransformer | Enables optimized inference acceleration
infinity | pooling_method | Pooling method (e.g., auto, mean)
infinity | device | Execution device (cuda or cpu)
infinity | model_warmup | Whether to preload the model into GPU memory
infinity | trust_remote_code | Whether to trust remote custom code (for custom models)
sentence_transformers | device | Execution device (cuda or cpu)
sentence_transformers | trust_remote_code | Allows loading models with custom code
sentence_transformers | sentence_transformers_encode | Advanced encoding parameters (see below)
openai | model_name | OpenAI model name (e.g., text-embedding-3-small)
openai | base_url | API base URL
openai | api_key | OpenAI API key
bm25 | lang | Language setting (determines stopwords and tokenizer)

sentence_transformers_encode Parameters

Parameter | Type | Description
normalize_embeddings | bool | Whether to normalize vectors
encode_chunk_size | int | Chunk size during encoding (to prevent OOM)
q_prompt_name | str | Query prompt template name
psg_prompt_name | str | Passage prompt template name
q_task / psg_task | str/null | Task tags (if prompt adaptation is needed)