retriever_init
Signature
- Initializes the retrieval backend and corpus index.
retriever_embed
Signature
- Encodes corpus embeddings and saves them to `*.npy`.
retriever_index
Signature
- Builds a vector index over the embeddings using FAISS.
bm25_index
Signature
- Builds a BM25 index and saves stopwords and the vocabulary.
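The scoring that a BM25 index enables can be sketched in pure Python. This is a minimal illustration of the classic Okapi BM25 formula over pre-tokenized documents, not the module's actual index implementation (function and variable names here are hypothetical):

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized doc against the query with classic Okapi BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N  # average document length
    # document frequency of each query term across the corpus
    df = {t: sum(1 for d in docs if t in d) for t in query_terms}
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            if tf[t] == 0:
                continue
            idf = math.log((N - df[t] + 0.5) / (df[t] + 0.5) + 1)
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

docs = [["machine", "learning", "retrieval"],
        ["bm25", "inverted", "index", "retrieval"],
        ["cooking", "recipes"]]
scores = bm25_scores(["bm25", "retrieval"], docs)
```

A real index precomputes the term statistics (and applies stopword removal and a language-specific tokenizer) so queries avoid this full corpus scan.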
retriever_search
Signature
- Standard FAISS vector-index-based semantic retrieval; supported backends: infinity / sentence_transformers / openai.
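What the vector index computes can be shown with a brute-force equivalent: rank corpus vectors by cosine similarity to the query vector and return the top-k indices. A FAISS index accelerates this search; the logic is the same (names below are illustrative, not the module's API):

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def search(query_vec, corpus_vecs, top_k=2):
    """Brute-force stand-in for a flat FAISS index: rank by cosine similarity."""
    ranked = sorted(enumerate(corpus_vecs),
                    key=lambda iv: cosine(query_vec, iv[1]),
                    reverse=True)
    return [i for i, _ in ranked[:top_k]]

corpus = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
result = search([1.0, 0.1], corpus, top_k=2)  # nearest: index 0, then 2
```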
retriever_search_colbert_maxsim
Signature
- Supports ColBERT/ColPali multi-vector retrieval (only for the `infinity` backend). Reads `embedding_path` (shape `(N, Kd, D)`, or `dtype=object` for variable-length vectors), aggregates scores via MaxSim, and returns the top-k results.
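MaxSim aggregation itself is simple: for each query token vector, take the maximum similarity over all document token vectors, then sum across query tokens. A minimal sketch (plain dot products on Python lists; the real code operates on the stored embedding arrays):

```python
def maxsim_score(query_vecs, doc_vecs):
    """ColBERT-style MaxSim: per query token, max dot product over the
    document's token vectors; the document score is the sum over query tokens."""
    return sum(
        max(sum(q * d for q, d in zip(qv, dv)) for dv in doc_vecs)
        for qv in query_vecs
    )

# Documents with different numbers of token vectors, as a dtype=object
# embedding array would hold them.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
doc_b = [[0.2, 0.2]]
scores = [maxsim_score(query, d) for d in (doc_a, doc_b)]
```

Top-k retrieval then just sorts documents by this score.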
bm25_search
Signature
- Performs BM25 inverted index retrieval and returns the top-k text results for each query.
retriever_exa_search
Signature
- Performs Exa Web Search (requires `EXA_API_KEY`).
retriever_tavily_search
Signature
- Performs Tavily Web Search (requires `TAVILY_API_KEY`).
retriever_zhipuai_search
Signature
- Performs ZhipuAI Web Search (requires `ZHIPUAI_API_KEY`).
Parameter Configuration
| Parameter | Type | Description |
|---|---|---|
| model_name_or_path | str | Path or name of the retrieval model (e.g., a HuggingFace model ID) |
| corpus_path | str | Path to the input corpus JSONL file |
| embedding_path | str | Path to save the vector embeddings (.npy) |
| index_path | str | Path to save the FAISS/BM25 index (.index) |
| backend | str | Retrieval backend: infinity, sentence_transformers, openai, or bm25 |
| backend_configs | dict | Backend-specific configuration (see below) |
| batch_size | int | Batch size for embedding generation or retrieval |
| top_k | int | Number of passages to return |
| gpu_ids | str | Visible GPU devices, e.g., "0,1" |
| query_instruction | str | Query prefix (used with instruction-tuned models) |
| is_multimodal | bool | Enables multimodal embedding (e.g., images) |
| faiss_use_gpu | bool | Whether to use GPU acceleration for FAISS |
| overwrite | bool | Whether to overwrite existing embedding or index files |
| index_chunk_size | int | Number of vectors per batch during index building |
| retrieve_thread_num | int | Number of concurrent threads for external web retrieval (Exa/Tavily/ZhipuAI) |
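Assembled into a configuration, the parameters above might look like the following. All values (paths, the model ID, batch sizes) are illustrative, and the exact config format the module expects may differ:

```python
# Illustrative retriever configuration covering the parameters above.
retriever_config = {
    "model_name_or_path": "BAAI/bge-base-en-v1.5",  # example HuggingFace model ID
    "corpus_path": "data/corpus.jsonl",
    "embedding_path": "data/corpus_emb.npy",
    "index_path": "data/corpus.index",
    "backend": "sentence_transformers",
    "backend_configs": {},        # backend-specific options, see the subfields table
    "batch_size": 64,
    "top_k": 5,
    "gpu_ids": "0,1",
    "query_instruction": "",
    "is_multimodal": False,
    "faiss_use_gpu": False,
    "overwrite": False,
    "index_chunk_size": 50000,
    "retrieve_thread_num": 4,
}
```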
backend_configs Subfields:
| Backend | Parameter | Description |
|---|---|---|
| infinity | bettertransformer | Enables optimized inference acceleration |
| infinity | pooling_method | Pooling method (e.g., auto, mean) |
| infinity | device | Execution device (cuda or cpu) |
| infinity | model_warmup | Whether to preload the model into GPU memory |
| infinity | trust_remote_code | Whether to trust remote custom code (for custom models) |
| sentence_transformers | device | Execution device (cuda or cpu) |
| sentence_transformers | trust_remote_code | Allows loading models with custom code |
| sentence_transformers | sentence_transformers_encode | Advanced encoding parameters (see below) |
| openai | model_name | OpenAI model name (e.g., text-embedding-3-small) |
| openai | base_url | API base URL |
| openai | api_key | OpenAI API key |
| bm25 | lang | Language setting (determines stopwords and tokenizer) |
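A sketch of how these subfields nest, keyed by backend, with the sentence_transformers_encode block (documented in the next table) nested one level deeper. The values, and the assumption that each backend's options live under its own key, are illustrative:

```python
# Illustrative backend_configs; only the section matching `backend` would be read.
backend_configs = {
    "infinity": {
        "bettertransformer": True,
        "pooling_method": "auto",
        "device": "cuda",
        "model_warmup": True,
        "trust_remote_code": False,
    },
    "sentence_transformers": {
        "device": "cuda",
        "trust_remote_code": False,
        "sentence_transformers_encode": {
            "normalize_embeddings": True,
            "encode_chunk_size": 10000,
            "q_prompt_name": "query",
            "psg_prompt_name": "passage",
            "q_task": None,
            "psg_task": None,
        },
    },
    "openai": {
        "model_name": "text-embedding-3-small",
        "base_url": "https://api.openai.com/v1",
        "api_key": "sk-...",  # placeholder
    },
    "bm25": {
        "lang": "en",
    },
}
```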
sentence_transformers_encode Parameters
| Parameter | Type | Description |
|---|---|---|
| normalize_embeddings | bool | Whether to normalize embedding vectors |
| encode_chunk_size | int | Chunk size during encoding (to prevent OOM) |
| q_prompt_name | str | Query prompt template name |
| psg_prompt_name | str | Passage prompt template name |
| q_task / psg_task | str/null | Task tags (if prompt adaptation is needed) |