Delegate GPU task completion to a co-manager#509
josephjohnjj wants to merge 2 commits into ICLDisco:master from
- complete_mutex: tracks the number of tasks to be completed by the co-manager
- to_complete: list of tasks to be completed by the co-manager
- co_manager_mutex: ensures that there is only one co-manager per device
The second thread that submits a task to the GPU device is transitioned to a co-manager. If the co-manager has not yet been set, the manager completes the task itself. A task is freed by whichever thread completes it: the manager or the co-manager.
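For readers who want to picture the hand-off, here is a minimal sketch in C of the scheme described above. It is not the code from this PR: the struct layout, the spinlock, and every function name (`device_init`, `try_become_co_manager`, `manager_task_done`, `co_manager_drain`, `new_task`) are hypothetical and chosen only for illustration. Only the three field names `complete_mutex`, `to_complete`, and `co_manager_mutex` come from the description, and here they are modeled as plain fields guarded by one lock, which is an assumption.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical task record; a real runtime task carries far more state. */
typedef struct task_s {
    struct task_s *next;
    int id;
} task_t;

/* Hypothetical per-device state. Only the three field names taken from the
 * PR description are real; their types and the lock are assumptions. */
typedef struct device_s {
    atomic_flag lock;       /* spinlock guarding the three fields below */
    int co_manager_mutex;   /* 1 while a co-manager is attached, else 0 */
    int complete_mutex;     /* number of tasks queued for the co-manager */
    task_t *to_complete;    /* tasks queued for the co-manager */
    atomic_int completed;   /* tasks completed so far (illustration only) */
} device_t;

static void dev_lock(device_t *d) {
    while (atomic_flag_test_and_set_explicit(&d->lock, memory_order_acquire)) { }
}

static void dev_unlock(device_t *d) {
    atomic_flag_clear_explicit(&d->lock, memory_order_release);
}

static void device_init(device_t *d) {
    atomic_flag_clear(&d->lock);
    d->co_manager_mutex = 0;
    d->complete_mutex = 0;
    d->to_complete = NULL;
    atomic_init(&d->completed, 0);
}

static task_t *new_task(int id) {
    task_t *t = malloc(sizeof *t);
    assert(t != NULL);
    t->next = NULL;
    t->id = id;
    return t;
}

/* Stand-in for real completion (releasing successors); frees the task. */
static void complete_and_free(device_t *d, task_t *t) {
    atomic_fetch_add(&d->completed, 1);
    free(t);
}

/* A submitting thread calls this; at most one caller wins per device. */
static bool try_become_co_manager(device_t *d) {
    dev_lock(d);
    bool won = (d->co_manager_mutex == 0);
    if (won) d->co_manager_mutex = 1;
    dev_unlock(d);
    return won;
}

/* Called by the manager when a GPU task has finished executing.
 * Returns true if completion was delegated to the co-manager. */
static bool manager_task_done(device_t *d, task_t *t) {
    dev_lock(d);
    bool delegated = (d->co_manager_mutex == 1);
    if (delegated) {
        t->next = d->to_complete;
        d->to_complete = t;
        d->complete_mutex++;
    }
    dev_unlock(d);
    if (!delegated) complete_and_free(d, t); /* no co-manager: do it here */
    return delegated;
}

/* The co-manager completes queued tasks, then detaches once the list is
 * empty. Returns the number of tasks it completed. */
static int co_manager_drain(device_t *d) {
    int n = 0;
    for (;;) {
        dev_lock(d);
        task_t *t = d->to_complete;
        if (t == NULL) {
            d->co_manager_mutex = 0;
            dev_unlock(d);
            return n;
        }
        d->to_complete = t->next;
        d->complete_mutex--;
        dev_unlock(d);
        complete_and_free(d, t);
        n++;
    }
}
```

Because the manager checks `co_manager_mutex` and enqueues under the same lock the co-manager takes before detaching, a task cannot be left on `to_complete` after the co-manager has gone. The actual patch may resolve that race differently.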
What is the status of this? Any reason for not taking this in? @josephjohnjj Could you please rebase your branch?
@devreal There was no performance improvement when using a co-manager to complete the task. @bosilca suggested that this might be because a single task completion does not generate enough child tasks to make a noticeable impact. Unlike #566, in #509 the co-manager only completed the tasks and was not involved in task execution. I'm doubtful that rebasing the code would be helpful at this stage. In my codebase, all task offloading to GPU occurs in In the current codebase, this has been moved to I can implement the same in the current codebase if having a co-manager will be helpful. Also, the co-manager was controlled by an MCA parameter, so in the extreme case where there are just 2 cores we could choose not to use the co-manager.
Delegate GPU task completion to a co-manager using the MCA parameter device_cuda_delegate_task_completion.