(List[str] | None): List of evaluation metric names
(str): Save path (filename will automatically include a timestamp)

evaluate
(List[str]): List of predicted answers
(List[List[str]]): List of reference answers; each entry may contain one or more references
(List[str] | None): List of evaluation metric names
(str): Save path (filename will automatically include a timestamp)
(Dict[str, Any]): Evaluation results
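
As an illustration of how a function with these argument and return types might behave, here is a minimal sketch. The parameter names (`predictions`, `references`, `metrics`, `save_path`), the `exact_match` metric, the `METRICS` registry, the `saved_to` key, and the timestamp format are all assumptions made for this example; none of them are confirmed by the reference above.

```python
import json
import os
from datetime import datetime
from typing import Any, Callable, Dict, List, Optional


def exact_match(prediction: str, references: List[str]) -> float:
    """Return 1.0 if the prediction equals any reference (case-insensitive, trimmed)."""
    normalized = prediction.strip().lower()
    return float(any(normalized == ref.strip().lower() for ref in references))


# Hypothetical metric registry; the real metric names are not listed in this section.
METRICS: Dict[str, Callable[[str, List[str]], float]] = {"exact_match": exact_match}


def evaluate(
    predictions: List[str],
    references: List[List[str]],
    metrics: Optional[List[str]] = None,
    save_path: str = "",  # assumption: an empty string means "do not save"
) -> Dict[str, Any]:
    """Score each prediction against its references and average per metric."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")

    # Assumption: passing None selects every registered metric.
    names = metrics if metrics is not None else list(METRICS)

    results: Dict[str, Any] = {}
    for name in names:
        scores = [METRICS[name](p, r) for p, r in zip(predictions, references)]
        results[name] = sum(scores) / len(scores) if scores else 0.0

    if save_path:
        # Insert a timestamp between the file stem and its extension.
        root, ext = os.path.splitext(save_path)
        stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        stamped_path = f"{root}_{stamp}{ext or '.json'}"
        with open(stamped_path, "w", encoding="utf-8") as f:
            json.dump(results, f, indent=2)
        results["saved_to"] = stamped_path

    return results


preds = ["Paris", "4"]
refs = [["Paris", "paris, France"], ["four"]]
print(evaluate(preds, refs, metrics=["exact_match"]))  # {'exact_match': 0.5}
```

Each inner list in `refs` holds the acceptable answers for the prediction at the same index, which is why the references argument is typed `List[List[str]]` while predictions are a flat `List[str]`.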