get_data

Signature
@app.tool(output="benchmark->q_ls,gt_ls")
def get_data(benchmark: Dict[str, Any]) -> Dict[str, List[Any]]
Function
  • Loads evaluation samples from a local file, supporting .jsonl / .json / .parquet formats.
  • Maps original fields to standardized output keys (e.g., q_ls, gt_ls) according to key_map.
  • Supports sample shuffling (shuffle) and limiting the number of samples (limit).
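The loading and field-mapping steps above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: the helper names load_records and apply_key_map are hypothetical, only .jsonl and .json are handled (reading .parquet would need an extra library such as pandas), and a .json file is assumed to hold a top-level array of records.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def load_records(path: str) -> List[Dict[str, Any]]:
    """Hypothetical helper: read a .jsonl or .json file into a list of dicts."""
    p = Path(path)
    suffix = p.suffix.lower()
    if suffix == ".jsonl":
        # One JSON object per line; blank lines are skipped.
        with p.open(encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]
    if suffix == ".json":
        # Assumption: the file contains a top-level array of records.
        with p.open(encoding="utf-8") as f:
            data = json.load(f)
        if not isinstance(data, list):
            raise ValueError("expected a top-level JSON array of records")
        return data
    raise ValueError(f"unsupported file format in this sketch: {suffix}")


def apply_key_map(
    records: List[Dict[str, Any]], key_map: Dict[str, str]
) -> Dict[str, List[Any]]:
    """Hypothetical helper: build one output list per key_map entry."""
    # key_map maps output key (e.g. "q_ls") to source field (e.g. "question").
    return {
        out_key: [rec[src_field] for rec in records]
        for out_key, src_field in key_map.items()
    }
```

With key_map set to {"q_ls": "question", "gt_ls": "golden_answers"}, apply_key_map returns a dict with the q_ls and gt_ls lists, one entry per sample.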
Output Format (JSON)
{
  "q_ls": ["Question 1", "Question 2"],
  "gt_ls": [["Answer A1", "Answer A2"], ["Answer B"]]
}

Parameter Configuration

servers/benchmark/parameter.yaml
benchmark:
  name: nq
  path: data/sample_nq_10.jsonl
  key_map:
    q_ls: question
    gt_ls: golden_answers
  shuffle: false
  seed: 42
  limit: -1
Parameter Description:
| Parameter | Type | Description |
| --- | --- | --- |
| name | str | Name of the benchmark dataset, used for logging and identification (e.g., nq) |
| path | str | Path to the data file; supports .jsonl, .json, and .parquet |
| key_map | dict | Field mapping table that maps original dataset fields to tool output keys |
| key_map.q_ls | str | Name of the question field (e.g., question) |
| key_map.gt_ls | str | Name of the ground truth field (e.g., golden_answers; supports lists) |
| shuffle | bool | Whether to shuffle the samples (default: false) |
| seed | int | Random seed (effective when shuffle=true) |
| limit | int | Sampling limit: -1 means all samples; a positive integer specifies the number of samples to take |
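To make the interaction of shuffle, seed, and limit concrete, here is a small sketch. The function name shuffle_and_limit is hypothetical, and two behaviors are assumptions rather than documented facts: that shuffling is applied before the limit, and that any non-positive limit (not only -1) keeps all samples.

```python
import random
from typing import Any, List


def shuffle_and_limit(
    records: List[Any], shuffle: bool = False, seed: int = 42, limit: int = -1
) -> List[Any]:
    """Hypothetical helper illustrating the shuffle / seed / limit parameters."""
    out = list(records)  # copy so the caller's list is not mutated
    if shuffle:
        # A dedicated Random instance makes the order reproducible for a given
        # seed without touching the global random state.
        random.Random(seed).shuffle(out)
    if limit > 0:
        # Assumption: the limit is applied after shuffling, so a shuffled run
        # takes a random subset rather than the first `limit` rows of the file.
        out = out[:limit]
    # Assumption: limit <= 0 (including the documented -1) keeps every sample.
    return out
```

Under these assumptions, two runs with shuffle=true and the same seed select the same samples in the same order, while shuffle=false with limit=3 simply takes the first three samples in file order.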