- (`str`): Path or name of the embedding model
- (`str`): Path to the corpus file (JSONL format)
- (`str`): Path to the `.npy` file storing the generated embedding vectors
- (`str`): Path to the `.index` file storing the Faiss index
- (`dict`): Model configuration passed to `infinity_emb.EngineArgs`
- (`bool`): Whether to enable BetterTransformer acceleration (default `false`)
- (`str`): Vector pooling method (`auto`, `cls`, or `mean`)
- (`str`): Device to load the model on (e.g., `cuda`, `cpu`)
- (`int`): Batch size when embedding in batches
- (`str`): Visible GPU setting, e.g., `"0,1"` makes GPU 0 and GPU 1 visible
- (`str`): Prefix prompt for queries
- (`bool`): Whether to enable GPU Faiss (falls back to CPU on failure)
- (`int`): Number of retrieved documents to return
- (`bool`): Whether to overwrite if the target exists
- (`str`): Service address (either `host:port` or a full `http://host:port`; port defaults to `8080` if not specified)
- (`int`): Batch size for chunked `add_with_ids` insertion
### `retriever_init`
If `index_path` is provided and the file exists, the existing Faiss index is loaded (GPU or CPU).
- (`str`): Infinity model path or name (passed to `EngineArgs`)
- (`str`): Corpus file (JSONL); each line must contain the key `"contents"`
- (`Optional[str]`): Path to an existing `.index` file of a built index; loaded if it exists
- (`bool`): Whether to enable GPU Faiss (falls back to CPU on failure)
- (`Optional[Dict[str, Any]]`): Additional parameters passed to `EngineArgs` (e.g., `dtype`, `batch_size`)
- (`Optional[str]`): Visible GPUs, e.g., `"0,1"`; exported as `CUDA_VISIBLE_DEVICES`
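The initialization behavior described above can be sketched in a few lines of Python. The function name `init_retriever` and its arguments are hypothetical stand-ins for the documented parameters; the sketch only shows the two observable effects: exporting `CUDA_VISIBLE_DEVICES` and loading an existing index only when the file is actually present.

```python
import os
from typing import Optional


def init_retriever(cuda_devices: Optional[str], index_path: Optional[str]) -> dict:
    # Hypothetical sketch of the documented retriever_init behavior.
    if cuda_devices:  # e.g. "0,1" -> GPU 0 and GPU 1 become visible
        os.environ["CUDA_VISIBLE_DEVICES"] = cuda_devices
    # Load the prebuilt Faiss index only if a path was given and it exists;
    # otherwise the index would have to be built later from the corpus.
    load_existing = index_path is not None and os.path.exists(index_path)
    return {"load_existing_index": load_existing}
```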
### `retriever_embed`

Encodes the corpus into embedding vectors and saves them to a `.npy` file.
- (`Optional[str]`): Output `.npy` file path; defaults to `<project_root>/output/embedding/embedding.npy` if not provided
- (`bool`): Whether to overwrite if the target exists
### `retriever_index`
Builds a Faiss index (`IndexIDMap2(FlatIP)`) from the `.npy` embeddings; supports chunked insertion and GPU construction.
- (`str`): `.npy` file containing the encoded embeddings
- (`Optional[str]`): Output `.index` file path; defaults to `<project_root>/output/index/index.index` if not provided
- (`bool`): Whether to overwrite if the target exists
- (`int`): Batch size for chunked `add_with_ids` insertion
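Chunked insertion simply slices the embedding matrix and its id vector into fixed-size batches before each `add_with_ids`-style call, which keeps peak memory bounded for large corpora. A minimal sketch, with `index_add` standing in for `faiss.IndexIDMap2.add_with_ids`:

```python
import numpy as np


def add_in_chunks(index_add, embeddings: np.ndarray, chunk_size: int) -> int:
    # index_add is a stand-in for an add_with_ids-style callable taking
    # (vectors, ids); IndexIDMap2 pairs each vector with an explicit int64 id.
    n = embeddings.shape[0]
    ids = np.arange(n, dtype=np.int64)
    for start in range(0, n, chunk_size):
        end = min(start + chunk_size, n)
        index_add(embeddings[start:end], ids[start:end])
    return n
```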
### `retriever_search`
- (`List[str]`): List of queries, or a single string (automatically wrapped into a list)
- (`int`): Number of results to return
- (`str`): Prefix prompt added to each query (for instruction-style queries)
- (`bool`): Whether to use OpenAI to generate query vectors; by default the Infinity model is used

Returns:

- (`Dict[str, List[List[str]]]`): Retrieved passages
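Since the index is an exact inner-product (`FlatIP`) index, the retrieval step reduces to scoring each query vector against every corpus vector and keeping the highest-scoring rows. The NumPy version below is an illustration of that computation, not the service's actual code:

```python
import numpy as np


def search_topk(query_vecs: np.ndarray, corpus_vecs: np.ndarray, top_k: int) -> np.ndarray:
    # Exact inner-product scores for every (query, document) pair.
    scores = query_vecs @ corpus_vecs.T          # shape (n_queries, n_docs)
    # Indices of the top_k documents per query, best score first.
    return np.argsort(-scores, axis=1)[:, :top_k]
```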
### `retriever_deploy_service`
- (`str`): Service address (either `host:port` or a full `http://host:port`; port defaults to `8080` if not specified)
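Normalizing the accepted address forms (`host`, `host:port`, full `http://host:port`, with `8080` as the default port) can be done with the standard library; the helper name is hypothetical:

```python
from urllib.parse import urlparse


def normalize_address(addr: str) -> str:
    # Accept a bare host or host:port by assuming an http scheme,
    # then fill in the documented default port 8080 when none is given.
    if "://" not in addr:
        addr = "http://" + addr
    parsed = urlparse(addr)
    port = parsed.port if parsed.port is not None else 8080
    return f"{parsed.scheme}://{parsed.hostname}:{port}"
```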
### `retriever_deploy_search`
Calls the `/search` endpoint of the service started by `retriever_deploy_service` as a client and returns the remote retrieval results.
- (`str`): Remote base address
- (`List[str]`): List of queries, or a single string (automatically wrapped into a list)
- (`int`): Number of results to return
- (`str`): Prefix prompt added to each query (for instruction-style queries)

Returns:

- (`Dict[str, List[List[str]]]`): Retrieved passages
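The client side of such a call can be sketched with the standard library. The JSON field names (`query`, `top_k`) are assumptions for illustration, not the service's documented request schema; note how a single query string is wrapped into a list and the prefix prompt is prepended, as described above:

```python
import json
from urllib import request


def build_search_request(base_url, query, top_k=5, query_instruction=""):
    # Hypothetical client sketch for the /search endpoint; field names
    # in the payload are assumptions, not the actual schema.
    if isinstance(query, str):  # single string -> list, as documented
        query = [query]
    payload = json.dumps({
        "query": [query_instruction + q for q in query],
        "top_k": top_k,
    }).encode("utf-8")
    return request.Request(
        base_url.rstrip("/") + "/search",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
```

The returned `Request` would then be sent with `urllib.request.urlopen` and the response body parsed as JSON.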