Why
To find the low-hanging fruit and gain insight into which parts of the generated code are worth optimizing.
How
We need to capture the cycle count of running MASM code and be able to visualize it per procedure/block, down to the individual op. I'm thinking flamegraphs.
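As a rough sketch of the flamegraph side, assuming we can get a trace of procedure/block enter and exit events plus per-op cycle counts out of the VM (the event shape, the `fold_stacks` helper, and the procedure/op names and cycle numbers below are all made up for illustration, not an existing API), the trace could be collapsed into the "folded stack" text format that flamegraph tools consume: one line per unique call path, frames joined by `;`, followed by a count.

```python
from collections import Counter


def fold_stacks(events):
    """Collapse a trace into folded-stack lines of the form 'a;b;op count'.

    `events` is a hypothetical trace format (an assumption, not a real API):
      ("enter", name)       - a procedure or block begins
      ("exit",)             - the innermost procedure or block ends
      ("op", name, cycles)  - one executed op and the cycles it cost
    """
    stack = []          # currently open procedures/blocks, outermost first
    totals = Counter()  # folded path -> accumulated cycles

    for event in events:
        kind = event[0]
        if kind == "enter":
            stack.append(event[1])
        elif kind == "exit":
            stack.pop()
        elif kind == "op":
            _, name, cycles = event
            # The op is the leaf frame, so the graph goes down to the op level.
            totals[";".join(stack + [name])] += cycles
        else:
            raise ValueError(f"unknown event kind: {kind!r}")

    return [f"{path} {count}" for path, count in sorted(totals.items())]


# Illustrative trace; names and cycle counts are placeholders.
trace = [
    ("enter", "main"),
    ("op", "push", 1),
    ("enter", "hash_block"),
    ("op", "hperm", 1),
    ("op", "hperm", 1),
    ("exit",),
    ("op", "add", 1),
    ("exit",),
]

for line in fold_stacks(trace):
    print(line)
```

The resulting lines could then be written to a file and rendered with an existing tool such as `flamegraph.pl` or `inferno-flamegraph`, so we would only need to produce the trace and not write our own renderer. Frame names containing `;` or spaces would need escaping, which this sketch ignores.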