Single-GPU benchmark for preconditioners #171
namgyu-youn wants to merge 3 commits into facebookresearch:main
The benchmark compares the performance of several preconditioners (SGD, AdaGrad, Root Inverse Shampoo, Eigendecomposed Shampoo, and Eigenvalue-Corrected Shampoo) using the rich console and the PyTorch profiler.

In the rich console, you can check the following:
- Total time taken for each preconditioner
- Average time taken per epoch
- Memory usage in MB
- GPU utilization percentage (if applicable)

In the PyTorch profiler, you can check the following:
- Most time-consuming operations (top 5)
- Bottleneck analysis for each preconditioner

Requested by @tsunghsienlee in facebookresearch#157 to improve the developer experience.

Co-authored-by: Tsung-Hsien Lee <zong@meta.com>
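To make the "total time" and "average time per epoch" metrics concrete, here is a minimal sketch of a timing harness. It is not the code from this PR: `BenchmarkResult`, `run_benchmark`, and `train_one_epoch` are hypothetical names, and the sketch uses only the standard library so it runs without a GPU. In a real CUDA benchmark one would presumably call `torch.cuda.synchronize()` before reading the timer (CUDA kernels run asynchronously) and read peak memory with `torch.cuda.max_memory_allocated()`.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class BenchmarkResult:
    """Timing summary for one preconditioner (hypothetical container)."""

    name: str
    total_time_s: float
    avg_epoch_time_s: float


def run_benchmark(
    name: str,
    train_one_epoch: Callable[[], None],
    num_epochs: int,
) -> BenchmarkResult:
    """Run `train_one_epoch` `num_epochs` times and record wall-clock time.

    Assumption: `train_one_epoch` performs one full pass of training with
    the optimizer under test. On a GPU it would need to synchronize the
    device before returning so that the timer sees the finished work.
    """
    if num_epochs <= 0:
        raise ValueError("num_epochs must be positive")

    start = time.perf_counter()
    for _ in range(num_epochs):
        train_one_epoch()
    total = time.perf_counter() - start

    return BenchmarkResult(
        name=name,
        total_time_s=total,
        avg_epoch_time_s=total / num_epochs,
    )
```

Calling `run_benchmark("SGD", train_one_epoch, num_epochs=3)` once per preconditioner yields one row per optimizer, which could then be rendered as a table in the console.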
Sorry for the multiple PRs; I still have a lot to learn about version control. I am also happy to wait for the review until July, per the comment in #163. Still, I truly believe this PR would be useful. An example is attached in the README.md of #171.
1. Hardcode the device to "cuda" and the basic configurations for the benchmarks.
2. Improve the sorting logic for profiling results.
3. Fix a typo in the rich Console output.
- `top_ops` is not a valid name for the variable; it should be `profiling_table`.
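As an illustration of what sorting profiling results into a top-5 table can look like, here is a small standard-library sketch. `OpRecord` and `build_profiling_table` are hypothetical names, and this is not the PR's implementation. With the real PyTorch profiler, a comparable view is usually obtained from `prof.key_averages().table(sort_by="cuda_time_total", row_limit=5)`.

```python
from typing import Iterable, List, NamedTuple


class OpRecord(NamedTuple):
    """One profiled operation (hypothetical record type)."""

    name: str
    total_time_us: float
    calls: int


def build_profiling_table(
    records: Iterable[OpRecord],
    row_limit: int = 5,
) -> List[OpRecord]:
    """Return the `row_limit` most time-consuming operations, slowest first."""
    ranked = sorted(records, key=lambda r: r.total_time_us, reverse=True)
    return ranked[:row_limit]


# Made-up records, for illustration only (not measured results).
example_records = [
    OpRecord("op_a", 120.0, 10),
    OpRecord("op_b", 950.0, 4),
    OpRecord("op_c", 430.0, 7),
]
profiling_table = build_profiling_table(example_records, row_limit=2)
```

Here `profiling_table` holds `op_b` followed by `op_c`, the two records with the largest `total_time_us`.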
Hi @namgyu-youn, sorry for my late reply. I have been too busy with work, so I might not be able to review this. I am sorry for having suggested this idea to you earlier.
Never mind. Since learning [...] Lastly, I want to ask whether this PR could be triaged, because I believe this update would be helpful for your team. I will wait for @runame, but it seems the review might be delayed (or neglected). The resulting log message is here, and I hope this update will be helpful for the project. Please consider reviewing it.

Introduces single-GPU benchmarks for comparing various preconditioners: SGD, AdaGrad, Root Inverse Shampoo, Eigendecomposed Shampoo, and Eigenvalue-Corrected Shampoo
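In rich Console, developers can check the following:
- Total time taken for each preconditioner
- Average time taken per epoch
- Memory usage in MB
- GPU utilization percentage (if applicable)

In PyTorch profiler, developers can check the following:
- Most time-consuming operations (top 5)
- Bottleneck analysis for each preconditioner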
Co-authored-by: Tsung-Hsien Lee <zong@meta.com>