Add scheduler exit-early feature to avoid long tail with sparse activity #6959
derekbruening
added a commit
that referenced
this issue
Aug 30, 2024
Reverts a fix for a bug in the scheduler where it let a thread going unscheduled continue running if there are no other non-running-now schedulable inputs. The fix triggered too-frequent all-unscheduled cases, and the current timeout for those is too high, causing tail delays. We'll reinstate the fix once we add an early exit feature for that scenario. Issue: #6959
derekbruening
changed the title
Add exit early feature if all remaining threads are unscheduled
Add scheduler exit-early feature to avoid long tail with sparse activity
Sep 26, 2024
derekbruening
added a commit
that referenced
this issue
Sep 28, 2024
derekbruening
added a commit
that referenced
this issue
Oct 1, 2024
When using the drmemtrace scheduler in an analyzer or other tool that does not track simulated time, the scheduler used to use wall-clock time. Here we change that to use the instruction count plus a scaled idle count. An idle counter is added, along with a new scale option scheduler_options_t.time_units_per_idle (and CLI -sched_time_units_per_idle) defaulting to 5. The time_units_per_us and sched_time_units_per_us defaults are set to 1000, reflecting a GHz machine with IPC=0.5.

Using counters provides a more reproducible result across different runs and machines.

Adds a test of the new option.

The default values of the options were tested on a large trace and found to produce a representative level of idle time during the main execution (and the whole run when combined with the forthcoming exit-early feature for #6959). This means that the clock going backward problem (#6966) is no longer seen in default runs. The analyzer still supports wall-clock time with the -sched_time option, so a check to avoid underflow is added.

Fixes #6971
Fixes #6966
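The counter-based notion of time described above can be illustrated with a minimal sketch. It assumes the "scaled idle count" is a simple multiplication of the idle count by time_units_per_idle; the class and method names are hypothetical and are not taken from the drmemtrace sources.

```python
from dataclasses import dataclass


@dataclass
class CounterClock:
    """Hypothetical counter-based clock; not the drmemtrace implementation."""

    # Default of 5 is the value stated in the commit message.
    time_units_per_idle: float = 5.0
    instructions: int = 0
    idles: int = 0

    def on_instruction(self, count: int = 1) -> None:
        self.instructions += count

    def on_idle(self, count: int = 1) -> None:
        self.idles += count

    def now(self) -> float:
        # Assumption: time is the instruction count plus the idle count
        # multiplied by the scale factor.
        return self.instructions + self.idles * self.time_units_per_idle


clock = CounterClock()
clock.on_instruction(100)
clock.on_idle(10)
print(clock.now())  # 100 + 10 * 5 = 150.0
```

Because the result depends only on the counters, two runs that see the same records report the same time regardless of machine speed, which is the reproducibility benefit the commit message mentions.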
derekbruening
added a commit
that referenced
this issue
Oct 2, 2024
Adds a new scheduler feature and CLI option exit_if_fraction_left. This applies to -core_sharded and -core_serial modes. When an input reaches EOF, if the number of non-EOF inputs left as a fraction of the original inputs is equal to or less than this value then the scheduler exits (sets all outputs to EOF) rather than finishing off the final inputs. This helps avoid long sequences of idles during staggered endings with fewer inputs left than cores and only a small fraction of the total instructions left in those inputs.

The default value in scheduler_options_t is 0 as simulators are typically already choosing to stop at some even point. For analyzers, however, via the command-line option, the default is 0.05 (i.e., 5%), which when tested on a large internal trace helps eliminate much of the final idle time from the cores (just about any value over 0.05 works well: it is not overly sensitive).

Compare the numbers below for today's default, which has a long idle tail and so distinct differences between the "cpu busy by time" and "cpu busy by time, ignoring idle past last instr" stats, on a 39-core schedule-stats run of a moderately large trace, with key stats and the first 2 cores (for brevity) shown here:

1567052521 instructions
878027975 idles
64.09% cpu busy by record count
82.38% cpu busy by time
96.81% cpu busy by time, ignoring idle past last instr
Core #0 schedule: CccccccOXHhUuuuuAaSEOGOWEWQqqqFffIiTETENWwwOWEeeeeeeACMmTQFfOWLWVvvvvFQqqqqYOWOooOWOYOYQOWO_O_W_O_W_O_W_O_WO_WO_O_O_O_O_O_OR_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_RY_YyyyySUuuOSISO_S_S_SOPpSOKO_KO_KCcDKWDB_B_____________________________________________
Core #1 schedule: KkLWSFUQPDddddddddXxSUSVRJWKkRNJBWUWwwTttGgRNKkkRWNTtFRWKkRNWUuuGULRFSRSYKkkkRYAYFffGSRYHRYHNWMDddddddddRYGgggggYHNWK_YAHYNnGYSNHWwwwwSWSNKSYyyWKNNWKNNGAKWGggNnNW_NNWE_E_EF__________________________________________________

And now with -exit_if_fraction_left 0.05, where we lose (1567052521 - 1564522227)/1567052521 = 0.16% of the instructions but drastically reduce the tail from 14% of the time to less than 1% of the time:

1564522227 instructions
120512812 idles
92.85% cpu busy by record count
96.39% cpu busy by time
97.46% cpu busy by time, ignoring idle past last instr
766.85user 6.33system 1:15.88elapsed 1018%CPU (0avgtext+0avgdata 4947364maxresident)k
Core #0 schedule: CccccccOXHKYEGGETRARrrPRTVvvvRrrNWwwOOKWVRRrPBbbXUVvvvvvOWKVLWVvvJjSOWKVUuTIiiiFPpppKAaaMFfffAHOKWAaGNBOWKAPPOABCWKPWOKWPCXxxxZOWKCccJSOSWKJUYRCOWKCcSOSUKkkkOROK_O_O_O_O_O
Core #1 schedule: KkLWSMmmFLSFffffffJjWBbGBUuuuuuuuuuuBDBJJRJWKkRNJWMBKkkRNWKkRNWKkkkRNWXxxxxxZOooAaUIiTHhhhSDNnnnHZzQNnnRNWXxxxxxRNWUuuRNWKXUuXRNKRWKNXxxRWKONNHRKWONURKWXRKXRKNW_KR_KkRK_KRKR_R_R_R_R_R_R_R_R_R_R_R__R__R__R___R___R___R___R___R

Fixes #6959
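The exit condition described in this commit message can be sketched as follows. This is only an illustration of the stated rule (exit when the non-EOF inputs, as a fraction of the original inputs, are at or below the threshold); the function names and the one-input-at-a-time loop are hypothetical and do not mirror the scheduler's actual code.

```python
def should_exit_early(live_inputs: int, total_inputs: int, fraction: float) -> bool:
    """Return True when the share of non-EOF inputs is at or below `fraction`."""
    if total_inputs <= 0:
        return False
    return live_inputs / total_inputs <= fraction


def run_until_exit(total_inputs: int, fraction: float) -> int:
    """Retire inputs one at a time; return how many were still live at exit."""
    live = total_inputs
    while live > 0:
        live -= 1  # One input reaches EOF, which is when the check is made.
        if should_exit_early(live, total_inputs, fraction):
            break  # The scheduler would set all outputs to EOF here.
    return live


print(run_until_exit(100, 0.05))  # exits with 5 inputs still unfinished
print(run_until_exit(100, 0.0))   # a threshold of 0 only exits when none are left
```

Under this reading, a threshold of 0 leaves behavior unchanged, since the condition can only hold once every input has reached EOF.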
derekbruening
added a commit
that referenced
this issue
Oct 4, 2024
Adds a new scheduler feature and CLI option exit_if_fraction_inputs_left. This applies to -core_sharded and -core_serial modes. When an input reaches EOF, if the number of non-EOF inputs left as a fraction of the original inputs is equal to or less than this value then the scheduler exits (sets all outputs to EOF) rather than finishing off the final inputs. This helps avoid long sequences of idles during staggered endings with fewer inputs left than cores and only a small fraction of the total instructions left in those inputs.

The default value in scheduler_options_t and the CLI option is 0.05 (i.e., 5%), which when tested on a large internal trace helps eliminate much of the final idle time from the cores without losing many instructions.

Compare the numbers below for today's default, which has a long idle tail and so distinct differences between the "cpu busy by time" and "cpu busy by time, ignoring idle past last instr" stats, on a 39-core schedule-stats run of a moderately large trace, with key stats and the first 2 cores (for brevity) shown here:

```
1567052521 instructions
878027975 idles
64.09% cpu busy by record count
82.38% cpu busy by time
96.81% cpu busy by time, ignoring idle past last instr
Core #0 schedule: CccccccOXHhUuuuuAaSEOGOWEWQqqqFffIiTETENWwwOWEeeeeeeACMmTQFfOWLWVvvvvFQqqqqYOWOooOWOYOYQOWO_O_W_O_W_O_W_O_WO_WO_O_O_O_O_O_OR_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_R_RY_YyyyySUuuOSISO_S_S_SOPpSOKO_KO_KCcDKWDB_B_____________________________________________
Core #1 schedule: KkLWSFUQPDddddddddXxSUSVRJWKkRNJBWUWwwTttGgRNKkkRWNTtFRWKkRNWUuuGULRFSRSYKkkkRYAYFffGSRYHRYHNWMDddddddddRYGgggggYHNWK_YAHYNnGYSNHWwwwwSWSNKSYyyWKNNWKNNGAKWGggNnNW_NNWE_E_EF__________________________________________________
```

And now with -exit_if_fraction_inputs_left 0.05, where we lose (1567052521 - 1564522227)/1567052521 = 0.16% of the instructions but drastically reduce the tail from 14% of the time to less than 1% of the time:

```
1564522227 instructions
120512812 idles
92.85% cpu busy by record count
96.39% cpu busy by time
97.46% cpu busy by time, ignoring idle past last instr
Core #0 schedule: CccccccOXHKYEGGETRARrrPRTVvvvRrrNWwwOOKWVRRrPBbbXUVvvvvvOWKVLWVvvJjSOWKVUuTIiiiFPpppKAaaMFfffAHOKWAaGNBOWKAPPOABCWKPWOKWPCXxxxZOWKCccJSOSWKJUYRCOWKCcSOSUKkkkOROK_O_O_O_O_O
Core #1 schedule: KkLWSMmmFLSFffffffJjWBbGBUuuuuuuuuuuBDBJJRJWKkRNJWMBKkkRNWKkRNWKkkkRNWXxxxxxZOooAaUIiTHhhhSDNnnnHZzQNnnRNWXxxxxxRNWUuuRNWKXUuXRNKRWKNXxxRWKONNHRKWONURKWXRKXRKNW_KR_KkRK_KRKR_R_R_R_R_R_R_R_R_R_R_R__R__R__R___R___R___R___R___R
```

Fixes #6959
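The arithmetic behind these figures can be reproduced with a short script. It assumes the "tail" is the gap between the "cpu busy by time, ignoring idle past last instr" and "cpu busy by time" stats; that reading and the helper names are assumptions made for illustration, not something the commit message defines.

```python
def percent_lost(before: int, after: int) -> float:
    """Percentage of instructions dropped by exiting early."""
    return 100.0 * (before - after) / before


def tail_points(busy_ignoring_tail: float, busy_by_time: float) -> float:
    """Assumed tail measure: gap between the two busy-by-time stats."""
    return busy_ignoring_tail - busy_by_time


lost = percent_lost(1567052521, 1564522227)
tail_before = tail_points(96.81, 82.38)
tail_after = tail_points(97.46, 96.39)

print(f"instructions lost: {lost:.2f}%")
print(f"tail before: {tail_before:.2f} points")
print(f"tail after:  {tail_after:.2f} points")
```

Under that assumption the gap shrinks from roughly 14.4 percentage points to roughly 1.1, in exchange for about 0.16% of the instructions.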
In PR #6955:
This fix is causing everyone-unscheduled issues at the end of some runs. That "bug" was not all that unreasonable if there are no live threads on other cores, in which case it's similar to today's all-unscheduled mechanism. This issue covers addressing the bug (going to revert the fix for now) along with adding an early exit feature.
For the early exit, if we had the remaining record count in the unscheduled threads and knew there wasn't much left, it would be an easier decision. Probably for now we'll put it under a flag.