We decided to cap the number of tasks mapped in each mapping phase, like the old Parla runtime.
This is reasonable because, without the cap, the mapping phase did not finish until all tasks were mapped, which means scheduling did not overlap with task execution. So I pushed the threshold mechanism to the current main.
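For reference, the idea can be sketched roughly as below. This is only an illustration of the mechanism, not the actual Parla code; the names `MAPPING_THRESHOLD`, `run_mapping_phase`, and `map_task` are made up here:

```python
from collections import deque

# Illustrative cap on how many tasks the mapper handles per phase,
# so mapping can interleave with the execution of already-launched tasks.
MAPPING_THRESHOLD = 32

def run_mapping_phase(mappable: deque, map_task) -> int:
    """Map at most MAPPING_THRESHOLD tasks, then return control to the
    scheduler loop instead of draining the whole queue in one phase."""
    mapped = 0
    while mappable and mapped < MAPPING_THRESHOLD:
        task = mappable.popleft()
        map_task(task)  # assign the task to a device (stubbed out here)
        mapped += 1
    return mapped

# With 100 pending tasks, one phase maps only the first 32;
# the remaining 68 wait for later phases.
tasks = deque(range(100))
count = run_mapping_phase(tasks, map_task=lambda t: None)
```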
However, in independent experiments, I noticed that it actually degrades performance for small-granularity tasks.
The case where the difference becomes noticeable is 10MB data movement + 0.5ms compute + 1000 tasks. It previously took 1.79s, 0.96s, 0.71s, and 0.56s for 4, 3, 2, and 1 GPU, respectively, but now takes 2.9s, 1.8s, 1.4s, and 1.1s. I am still waiting for other configurations, but 0.5ms + 500 tasks did not change noticeably like this. My current hypothesis is that the task granularity is so small that each task finishes almost immediately after it is launched, while the scheduler takes longer than the task execution itself, which may degrade the total execution time.
I am not sure which option would be better, but I am trying to collect all the results to check whether there is any consistent change.