
Regarding the calculation of the duration in the profiling phase #53

@doudouxx


Hello Vidur,

Thank you for sharing your work. While reading the code, I ran into a question.

I am analyzing the profiling code, which is divided into two parts: MLP and attention.
For MLP, the duration of an operation is computed by finding all of its child events, looking up the events correlated with each child, and summing the durations of those correlated events (the function get_operation_time_stats() in Vidur).
For attention, profiling goes through the sarathi module, which computes sum([e.cuda_time_total for e in trace.key_averages()]), i.e. it sums the CUDA time of every entry returned by trace.key_averages() (the function handle_trace() in Sarathi).
So the attention path sums all CUDA durations in the trace, while the MLP path sums only the events correlated with the operation being measured.
Is my understanding correct?
If so, the two methods seem inconsistent. What was the reasoning behind using different approaches?
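
To make the comparison concrete, here is a minimal sketch of the two strategies as I understand them. It is not the Vidur or Sarathi code: the `Event` class, the `kernel_durations` table, and both helper functions are hypothetical stand-ins with toy numbers, and the real implementations work on PyTorch profiler traces rather than on these simplified objects.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Event:
    """Hypothetical, simplified profiler event (not the real trace schema)."""
    name: str
    correlation: int  # id linking a CPU-side launch to its GPU kernel
    children: List["Event"] = field(default_factory=list)


# Toy GPU kernel durations in microseconds, keyed by correlation id.
kernel_durations: Dict[int, float] = {1: 10.0, 2: 5.0, 3: 7.0}


def per_operation_time(op: Event, durations: Dict[int, float]) -> float:
    """MLP-style: sum only the kernels correlated with this operation's children."""
    return sum(durations.get(child.correlation, 0.0) for child in op.children)


def whole_trace_time(durations: Dict[int, float]) -> float:
    """Attention-style: sum every GPU duration recorded in the trace."""
    return sum(durations.values())


# One operation that launched kernels 1 and 2; kernel 3 belongs to something else.
mlp_op = Event(
    name="mlp_up_proj",
    correlation=0,
    children=[Event("cudaLaunchKernel", 1), Event("cudaLaunchKernel", 2)],
)

per_op = per_operation_time(mlp_op, kernel_durations)  # 15.0
whole = whole_trace_time(kernel_durations)             # 22.0
print(per_op, whole)
```

In this toy trace the two numbers differ only because kernel 3 is not correlated with the operation. If a trace contains nothing but the kernels of the operation being profiled, both strategies would report the same total.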
