For each model we release its per-sample prediction dump on the validation split — the exact outputs extract_predicts produced — so you can reproduce the reported metrics by running only the evaluator ...
一些您可能无法访问的结果已被隐去。
显示无法访问的结果一些您可能无法访问的结果已被隐去。
显示无法访问的结果