
Stabilizing benchmark results #23

Open
bjacob opened this issue Oct 11, 2024 · 1 comment
bjacob commented Oct 11, 2024

At the moment, gemm_bench.py passes --benchmark_repetitions=3 to stabilize benchmark results. Then the bench_summary_process function defined in bench_utils.py returns the mean time (judging by the local variable names; I'm trusting that they're accurate).

In my experience, when benchmarking on AMD GPUs, the single most effective way to stabilize results is to pass something like --benchmark_min_warmup_time=0.1, which runs warm-up iterations that are then discarded. Just 0.1 seconds seems to be more than enough.
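For illustration, here's a minimal sketch of how that flag could be threaded through the harness. It assumes the harness shells out to a Google Benchmark binary and builds the command line as a list; the flag names are real Google Benchmark flags, but everything else here is hypothetical and may not match how gemm_bench.py is actually structured:

```python
# Hypothetical sketch; the function name and plumbing are illustrative,
# only the --benchmark_* flags are real Google Benchmark flags.
import subprocess

def run_benchmark(binary: str, warmup_seconds: float = 0.1, repetitions: int = 3) -> str:
    cmd = [
        binary,
        f"--benchmark_min_warmup_time={warmup_seconds}",  # warm-up iterations, discarded
        f"--benchmark_repetitions={repetitions}",
        "--benchmark_format=json",
    ]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
```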

If, after doing that, we still want to run multiple repetitions to further reduce noise, then I would suggest taking either the min or the median of the iteration latencies, not the mean. The problem with the mean as an estimator is that if one of the N input values is noisy, then 1/N of that noise still shows up in the mean. The noise doesn't decrease quickly as N increases; it only decreases as 1/N. By contrast, if we run 3 repetitions, the median is unaffected by one bad repetition, and the min is unaffected by two bad repetitions.
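A quick illustration of the difference, with made-up latencies in milliseconds where one of three repetitions was hit by noise:

```python
# Illustrative only: one outlier skews the mean but not the median or min.
import statistics

latencies = [10.2, 10.3, 25.0]  # one repetition hit by, e.g., a clock ramp-up

print(statistics.mean(latencies))    # 15.17 -> pulled far off by the one outlier
print(statistics.median(latencies))  # 10.3  -> unaffected by one bad repetition
print(min(latencies))                # 10.2  -> unaffected by up to two bad repetitions
```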

Ideally, these mechanisms should be shared via bench_utils.py, not duplicated locally in each benchmark such as gemm_bench.py.
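For example, a shared helper in bench_utils.py could look something like the sketch below. The name and signature are hypothetical, not the existing API:

```python
# Hypothetical shape of a shared summarization helper for bench_utils.py.
import statistics
from typing import Sequence

def summarize_latencies(latencies_ms: Sequence[float], estimator: str = "median") -> float:
    """Aggregate per-repetition latencies with an outlier-robust estimator."""
    if estimator == "median":
        return statistics.median(latencies_ms)
    if estimator == "min":
        return min(latencies_ms)
    raise ValueError(f"unknown estimator: {estimator}")
```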

WDYT @saienduri, @kuhar ?

kuhar commented Oct 11, 2024

+1. I think we can start small by adding this to all benchmarks, and then work towards sharing the implementation across these three harnesses. It would also be nice to start adding tests to the repo, as it's starting to outgrow the initial simplicity that comes with short scripts.
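As a starting point, a first test for the summarization logic could look like this. It assumes the hypothetical summarize_latencies helper sketched above, which is not the repo's current API:

```python
# Hypothetical pytest sketch against the assumed summarize_latencies helper.
import pytest
from bench_utils import summarize_latencies

def test_median_ignores_one_outlier():
    assert summarize_latencies([10.2, 10.3, 25.0], estimator="median") == 10.3

def test_min_ignores_two_outliers():
    assert summarize_latencies([10.2, 25.0, 30.0], estimator="min") == 10.2

def test_unknown_estimator_raises():
    with pytest.raises(ValueError):
        summarize_latencies([1.0], estimator="mean-of-squares")
```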
