This repository contains three AI-generated Python scripts for analyzing benchmark results from JSONL files.
## compare.py

Compares two benchmark result files and reports differences in their metrics and values.

Usage:

```
python compare.py <file1> <file2>
```

Example:

```
# Compares file2 against file1; differences are signed.
python compare.py ./results-summary-1.jsonl ./results-summary-2.jsonl
```

Output:
- Metrics that are missing from one file
- Value differences above a 5% threshold
- Benchmarks that exist in only one file
Sample Output:

```
cpu-memory-bw-latency - Value differences:
  - mem_bandwidth_matrix_numa_0_1_bw: +5.2%
      file1: 7056.89
      file2: 7424.32

gpu-copy-bw:perf - Missing metrics:
  Only in file2:
    - new_metric_name
```
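The comparison logic fits in a few lines. The sketch below is illustrative rather than a copy of compare.py: it assumes each JSONL line is a single flat JSON object mapping `<benchmark>/<metric>` keys to numeric values (the layout implied by the sample output above), and it compares only the first line of each file.

```python
# Illustrative sketch only -- assumes each JSONL line is a flat JSON object
# mapping "<benchmark>/<metric>" keys to values; compares the first line
# of each file.
import json
import sys

THRESHOLD = 0.05  # report relative differences above 5%

def load_summary(path):
    """Parse the first JSON object in a JSONL file."""
    with open(path) as f:
        return json.loads(f.readline())

def compare(file1, file2):
    a, b = load_summary(file1), load_summary(file2)
    # Signed value differences: positive means file2 is higher than file1.
    for key in sorted(a.keys() & b.keys()):
        v1, v2 = a[key], b[key]
        if isinstance(v1, (int, float)) and isinstance(v2, (int, float)) and v1 != 0:
            diff = (v2 - v1) / abs(v1)
            if abs(diff) > THRESHOLD:
                print(f'{key}: {diff:+.1%} (file1: {v1}, file2: {v2})')
    # Metrics present in only one file.
    for key in sorted(a.keys() - b.keys()):
        print(f'Only in file1: {key}')
    for key in sorted(b.keys() - a.keys()):
        print(f'Only in file2: {key}')

if __name__ == '__main__':
    compare(sys.argv[1], sys.argv[2])
```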
## check_return_code.py

Checks a single benchmark result file for non-zero return codes, which indicate failed benchmarks.

Usage:

```
python check_return_code.py <file>
```

Example:

```
python check_return_code.py ./results-summary.jsonl
```

Output:
- Lists all benchmarks with non-zero return codes
- Shows a success message if all benchmarks passed
Sample Output:

```
Benchmarks with non-zero return codes:
  • cudnn-function/return_code:124
  • gemm-flops/return_code:124
```
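A minimal sketch of this check, under the same flat `<benchmark>/<metric>` key assumption as above; the `/return_code` suffix matches the sample output, but the actual script may match keys differently.

```python
# Illustrative sketch only -- assumes "<benchmark>/return_code" keys in a
# flat JSON object on the first line of the file, as in the sample output.
import json
import sys

def main(path):
    with open(path) as f:
        summary = json.loads(f.readline())
    failures = {k: v for k, v in summary.items()
                if k.endswith('/return_code') and v != 0}
    if failures:
        print('Benchmarks with non-zero return codes:')
        for key, code in sorted(failures.items()):
            print(f'  • {key}:{code}')
    else:
        print('All benchmarks passed.')

if __name__ == '__main__':
    main(sys.argv[1])
```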
## check_benchmarks.py

Checks that every benchmark listed under `superbench.enable` in a SuperBench YAML config appears in a JSONL results file.

Usage:

```
python check_benchmarks.py <config.yaml> <results-summary.jsonl>
```

Example:

```
python check_benchmarks.py gb200.yaml results-summary.jsonl
```

Output:
- Prints each enabled benchmark with `OK` if any key in the JSONL starts with `<benchmark>/`, or `MISSING` otherwise
- Exits with status `0` if all benchmarks are present, or `1` if any benchmarks are missing
Sample Output:

```
kernel-launch : OK
gemm-flops : OK
…
computation-communication-overlap : MISSING
…
All enabled benchmarks are present.   # or a list of the missing benchmarks
```
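A sketch of this check, assuming `superbench.enable` is a plain list of benchmark names, PyYAML is available, and results keys use the `<benchmark>/<metric>` layout; the actual script's matching rules may differ.

```python
# Illustrative sketch only -- assumes superbench.enable is a plain list of
# benchmark names and results keys use the "<benchmark>/<metric>" layout.
import json
import sys

import yaml  # PyYAML

def main(config_path, results_path):
    with open(config_path) as f:
        enabled = yaml.safe_load(f)['superbench']['enable']
    with open(results_path) as f:
        keys = json.loads(f.readline()).keys()
    missing = []
    for name in enabled:
        ok = any(k.startswith(name + '/') for k in keys)
        print(f'{name} : {"OK" if ok else "MISSING"}')
        if not ok:
            missing.append(name)
    if missing:
        sys.exit(1)  # at least one enabled benchmark has no results
    print('All enabled benchmarks are present.')
    sys.exit(0)

if __name__ == '__main__':
    main(sys.argv[1], sys.argv[2])
```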