`.jsonl` format.

`servers/benchmark/parameter.yaml`:
- `name`: Dataset name, used for logging, debugging, or identifying the currently loaded dataset in the system.
- `path`: Path to the data file; this is the entry point for the `get_data` utility.
- `key_map`: Field mapping table that specifies which fields to extract from each sample and the alias each one receives. For example, `q_ls: question` maps the original field `question` to `q_ls`; `p_ls: retrieved_passage` works the same way.
- `shuffle`: Whether to enable random sampling.
- `seed`: Sets the random seed.
- `limit`: Number of samples to load; `-1` loads all data.

`get_data`: A utility function that loads and parses data during the preprocessing phase, extracting key fields (such as questions, answers, and retrieved passages) for downstream modules.
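To make the interaction between these parameters concrete, here is a minimal Python sketch of a loader that honors `path`, `key_map`, `shuffle`, `seed`, and `limit`. It is not the actual `get_data` implementation: the function name `load_benchmark`, the default values, the returned dictionary shape, and the choice to shuffle before applying `limit` are all assumptions made for illustration.

```python
import json
import random
from typing import Any, Dict, List


def load_benchmark(
    path: str,
    key_map: Dict[str, str],
    shuffle: bool = False,
    seed: int = 42,
    limit: int = -1,
) -> Dict[str, List[Any]]:
    """Hypothetical stand-in for get_data: read a .jsonl file, return aliased columns."""
    with open(path, "r", encoding="utf-8") as f:
        # A .jsonl file holds one JSON object per non-empty line.
        samples = [json.loads(line) for line in f if line.strip()]

    if shuffle:
        # A dedicated seeded generator keeps the order reproducible.
        random.Random(seed).shuffle(samples)

    if limit >= 0:
        # Assumption: any negative limit (such as -1) keeps every sample.
        samples = samples[:limit]

    # key_map reads as alias -> original field, e.g. {"q_ls": "question"}.
    return {
        alias: [sample[field] for sample in samples]
        for alias, field in key_map.items()
    }
```

Called with `key_map={"q_ls": "question", "p_ls": "retrieved_passage"}`, the sketch returns a dictionary with two lists, `q_ls` and `p_ls`, and drops any field not named in the mapping. Shuffling before truncating means `limit` draws a random subset when `shuffle` is enabled; a real loader may order these steps differently.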