get_data
Signature
- Loads evaluation samples from a local file, supporting .jsonl / .json / .parquet formats.
- Maps original fields to standardized output keys (e.g., `q_ls`, `gt_ls`) according to `key_map`.
- Supports sample shuffling (`shuffle`) and limiting the number of samples (`limit`).
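
The loading and field-mapping behavior described above can be sketched as follows. This is a minimal illustration, not the tool's actual implementation: the helper names `load_records` and `map_fields` are hypothetical, a `.json` file is assumed to hold a top-level list of objects, and the `.parquet` branch assumes `pandas` (with a parquet engine) is installed.

```python
import json
from pathlib import Path


def load_records(path):
    """Read a list of dict records from a .jsonl, .json, or .parquet file.

    Hypothetical helper: the real loader may differ in details.
    """
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".jsonl":
        # One JSON object per line; blank lines are skipped.
        with path.open(encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]
    if suffix == ".json":
        # Assumption: the file contains a top-level JSON array of objects.
        with path.open(encoding="utf-8") as f:
            data = json.load(f)
        if not isinstance(data, list):
            raise ValueError("expected a top-level list in the .json file")
        return data
    if suffix == ".parquet":
        # Optional dependency, imported only when needed.
        import pandas as pd
        return pd.read_parquet(path).to_dict(orient="records")
    raise ValueError(f"unsupported file format: {suffix}")


def map_fields(records, key_map):
    """Build {output_key: [value for each record]} according to key_map."""
    return {
        out_key: [record[src_field] for record in records]
        for out_key, src_field in key_map.items()
    }
```

With `key_map = {"q_ls": "question", "gt_ls": "golden_answers"}`, each record's `question` value is collected into `q_ls` and its `golden_answers` value into `gt_ls`, so both output lists stay aligned by sample index.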
Parameter Configuration
| Parameter | Type | Description |
|---|---|---|
| `name` | str | Name of the benchmark dataset, used for logging and identification (e.g., `nq`) |
| `path` | str | Path to the data file; supports `.jsonl`, `.json`, and `.parquet` |
| `key_map` | dict | Field mapping table that maps original dataset fields to tool output keys |
| `key_map.q_ls` | str | Name of the question field (e.g., `question`) |
| `key_map.gt_ls` | str | Name of the ground truth field (e.g., `golden_answers`; lists are supported) |
| `shuffle` | bool | Whether to shuffle the samples (default: `false`) |
| `seed` | int | Random seed (takes effect only when `shuffle=true`) |
| `limit` | int | Sampling limit: `-1` means all samples; a positive integer specifies the number of samples to take |
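
To make the interaction of `shuffle`, `seed`, and `limit` concrete, here is a hedged sketch. The function `select_samples` is hypothetical, and two behaviors are assumptions not stated in the table: shuffling is applied before the limit, and any non-positive `limit` is treated like `-1` (keep everything). The `path` and `seed` values in the example parameters are placeholders.

```python
import random


def select_samples(samples, shuffle=False, seed=42, limit=-1):
    """Optionally shuffle with a fixed seed, then keep the first `limit` items.

    Assumptions: shuffle happens before truncation, and limit <= 0 keeps all.
    """
    selected = list(samples)  # copy so the caller's list is not modified
    if shuffle:
        # A dedicated Random instance keeps the result reproducible per seed.
        random.Random(seed).shuffle(selected)
    if limit > 0:
        selected = selected[:limit]
    return selected


# Example parameter set mirroring the table; path and seed are placeholders.
params = {
    "name": "nq",
    "path": "data/nq.jsonl",
    "key_map": {"q_ls": "question", "gt_ls": "golden_answers"},
    "shuffle": True,
    "seed": 42,
    "limit": 2,
}

subset = select_samples(
    ["s0", "s1", "s2", "s3"],
    shuffle=params["shuffle"],
    seed=params["seed"],
    limit=params["limit"],
)
```

Under these assumptions, running the selection twice with the same `seed` yields the same subset, which is what makes a shuffled, limited evaluation run repeatable.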