Make sinter distribute work better when workers outnumber tasks #392

Strilanc · 2022-10-26T23:42:57Z

Start sinter on a 96-core machine with a single task. It barely uses any of the cores at all, because of the gradual spinup of the task.

The issue is basically that the task's manager is saying how many shots to send out, with no conception of how many workers will be used to do it.
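To make the mismatch concrete, here is a toy model. It is not sinter's actual scheduling code: `ramp_up_batches` and `split_across_workers` are hypothetical names, and the doubling schedule is only an assumed stand-in for "gradual spinup". The first function shows a manager choosing batch sizes with no knowledge of the worker count; the second shows what it could do if it knew how many workers were available.

```python
from typing import List


def ramp_up_batches(total_shots: int, first_batch: int = 100) -> List[int]:
    """Toy gradual spin-up (assumed schedule): each batch doubles the last.

    The batch sizes never depend on how many workers exist, which is the
    problem described above.
    """
    batches = []
    remaining = total_shots
    size = first_batch
    while remaining > 0:
        take = min(size, remaining)
        batches.append(take)
        remaining -= take
        size *= 2
    return batches


def split_across_workers(batch_shots: int, num_workers: int) -> List[int]:
    """Split one batch into near-equal chunks, at most one per worker.

    Chunks of size zero are dropped, so tiny batches use fewer workers.
    """
    base, extra = divmod(batch_shots, num_workers)
    chunks = [base + (1 if i < extra else 0) for i in range(num_workers)]
    return [c for c in chunks if c > 0]


if __name__ == "__main__":
    print(ramp_up_batches(1000))
    print(len(split_across_workers(1000, 96)))
```

In this model, a single task on a 96-core machine hands out one small batch at a time during the ramp, whereas a worker-aware split keeps every core busy as soon as the batch is large enough.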

@inmzhang

…#804)

- Add `sinter.Sampler` and `sinter.CompiledSampler` classes
    - They can go anywhere a Decoder would go, but they are responsible for all parts of the sampling instead of only prediction
- Add a new default sampler `perfectionist`, which discards anything with detection events and predicts the observables are not flipped
- Improved layout of the progress printouts when collect is running
- Sinter decoders can now flag that they want to discard shots by adding an extra byte to the returned observable data, with 0 meaning keep and not-0 meaning discard
- Change how `sinter collect` distributes work
    - Workers are now distributed as widely as possible, instead of all on one task
    - Workers are now never switched between tasks until their current task is done
- Add `sinter plot --point_label_func` argument for drawing text next to data points
- Augment `sinter plot --group_func` to support dictionaries with special keys controlling precise grouping behaviors
    - If group_func returns a dict with a `"color"` key, all items with the same `"color"` value are drawn with the same color
    - If group_func returns a dict with a `"linestyle"` key, all items with the same `"linestyle"` value are drawn with the same linestyle
    - If group_func returns a dict with a `"marker"` key, all items with the same `"marker"` value are drawn with the same marker
    - If group_func returns a dict with a `"label"` key, this forces the label shown in the legend
    - If group_func returns a dict with an `"order"` key, this takes priority for ordering the legend
- `sinter collect --processes` is no longer required (defaults to `"auto"`)
- `sinter plot --show` is no longer required (defaults to showing, unless `--out` is specified, unless `--show` is specified)
- Group some of sinter's code into private subpackages
- Show traditional error bars instead of a filled region for high/low fit when only one data point is present
- Add `sinter plot --preprocess_stats_func`
- Add `sinter.TaskStats.with_edits`
- Add safety error when adding stats that have equal strong ids but differing identifying information (json_metadata or decoder)

Some of the sampler design is adapted from @inmzhang's design in #735

Fixes #774
Fixes #682
Fixes #392

---------

Co-authored-by: Matt McEwen <[email protected]>
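
As a rough illustration of "distributed as widely as possible, instead of all on one task", the sketch below spreads a pool of workers over the pending tasks so that per-task counts differ by at most one. This is only an assumed reading of the commit message, not the code merged in #804; `spread_workers` is a hypothetical helper name.

```python
from typing import List


def spread_workers(num_workers: int, num_tasks: int) -> List[int]:
    """Return the number of workers given to each task.

    Workers are spread evenly; the first `num_workers % num_tasks` tasks
    receive one extra worker. Hypothetical helper, for illustration only.
    """
    if num_tasks <= 0:
        return []
    base, extra = divmod(num_workers, num_tasks)
    return [base + (1 if i < extra else 0) for i in range(num_tasks)]


if __name__ == "__main__":
    print(spread_workers(96, 1))  # the single-task case from this issue
    print(spread_workers(5, 3))
```

With one task and 96 workers, every worker lands on that task immediately, which is the case this issue was opened about.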

Strilanc mentioned this issue Jul 27, 2024

Add custom samplers, better collection, and better plotting to sinter #804

Merged

Strilanc closed this as completed in #804 Sep 10, 2024

Strilanc closed this as completed in 5e4f8b2 Sep 10, 2024
