get_data
Signature
- Multi-format Loading: Supports loading evaluation datasets in `.jsonl`, `.json`, or `.parquet` formats from local storage.
- Dynamic Field Mapping: Uses `key_map` to map different column names in raw data (such as `question`, `answer`) to standardized output keys (usually `q_ls` and `gt_ls`).
- Data Preprocessing: Built-in support for random shuffling (`shuffle`) and quantity truncation (`limit`).
- Used in demos to receive user input, treating it as a single piece of data (`q_ls`).
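
The field mapping described above can be illustrated with a minimal sketch. This is not the tool's actual implementation: `load_records` and `apply_key_map` are hypothetical helper names, the sketch assumes `key_map` maps each output key to a raw field name, and it covers only `.jsonl` and `.json` using the standard library (`.parquet` would need an additional library and is left out).

```python
import json
from pathlib import Path


def load_records(path):
    """Read a list of dict records from a .jsonl or .json file (hypothetical helper)."""
    p = Path(path)
    text = p.read_text(encoding="utf-8")
    if p.suffix == ".jsonl":
        # One JSON object per non-empty line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    if p.suffix == ".json":
        # Assumption: the file holds a JSON array of objects.
        return json.loads(text)
    raise ValueError(f"unsupported format in this sketch: {p.suffix}")


def apply_key_map(records, key_map):
    """Collect each mapped raw field into a list stored under its output key."""
    return {
        out_key: [rec[raw_field] for rec in records]
        for out_key, raw_field in key_map.items()
    }
```

With `key_map = {"q_ls": "question", "gt_ls": "golden_answers"}`, every record's `question` value ends up in the `q_ls` list and every `golden_answers` value in the `gt_ls` list, in file order.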
Configuration
| Parameter | Type | Description |
|---|---|---|
| `name` | str | Evaluation set name, used only for logging and identification (example: `nq`) |
| `path` | str | Data file path; supports `.jsonl`, `.json`, `.parquet` |
| `key_map` | dict | Field mapping table, mapping raw fields to tool output keys |
| `q_ls` | str | Raw field name mapped to the question list (e.g., the `question` column in the file) |
| `gt_ls` | str | Raw field name mapped to the golden answer list (e.g., the `golden_answers` column in the file) |
| `shuffle` | bool | Whether to shuffle the sample order (default `false`) |
| `seed` | int | Random seed (effective when `shuffle=true`) |
| `limit` | int | Upper limit on the number of data items loaded. Default is `-1` (load all); a positive integer N truncates to the first N items |
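
As a rough illustration of how `shuffle`, `seed`, and `limit` might interact, here is a small sketch. It assumes shuffling happens before truncation, which the table does not state, and `shuffle_and_limit` is a hypothetical helper rather than part of the tool. The `path` and `seed` values in the example configuration are placeholders.

```python
import random

# Example configuration as a plain dict; path and seed are placeholder values.
config = {
    "name": "nq",
    "path": "data/nq.jsonl",
    "key_map": {"q_ls": "question", "gt_ls": "golden_answers"},
    "shuffle": True,
    "seed": 42,
    "limit": 3,
}


def shuffle_and_limit(records, shuffle=False, seed=None, limit=-1):
    """Optionally shuffle whole records, then keep the first `limit` of them."""
    records = list(records)  # copy so the caller's list is not modified
    if shuffle:
        # A dedicated Random instance keeps the order reproducible for a given seed.
        random.Random(seed).shuffle(records)
    if limit > 0:
        # Assumption: truncation is applied after shuffling; -1 keeps everything.
        records = records[:limit]
    return records


sample = shuffle_and_limit(
    range(10),
    shuffle=config["shuffle"],
    seed=config["seed"],
    limit=config["limit"],
)
```

Shuffling whole records, rather than each output list separately, keeps every question aligned with its golden answers.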