Use fragment_stats to split chunks by cost #681
Merged
Closes #536. Attempts to divide the cell into chunks that take a similar amount of work. The work estimate is produced by `fragment_stats::cost`, which was obtained by running linear regression on training data from random simulations.

The default is still to use `split_by_effort`, which splits the chunks evenly without taking cost into account, but setting the `Simulation` constructor parameter `split_chunks_evenly=False` will use `split_by_cost`. Here are some benchmark results from the following simulation (a tall skinny cell with an expensive flux plane at the top):

The `meep::structure` is now created in C++ (`create_structure_and_set_materials`) instead of Python. Using the geometry in `structure::choose_chunkdivision` would require making copies, or passing it around, both of which would require more significant changes. `create_structure_and_set_materials` allows the Python geometry to be converted to C++ once, then deleted when it returns.

The number of processors is divided into prime factors and the cell is split into chunks starting from the largest prime factor towards the smallest in order to achieve a more even grid split for powers of 3. Some examples of the improvement are shown below.
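To make the prime-factor ordering concrete, here is a minimal Python sketch. It is not the code in this PR: `prime_factors_descending` and `toy_split_counts` are hypothetical helpers, and the rule of always cutting the currently longest chunk direction is an assumption made only for illustration.

```python
def prime_factors_descending(n):
    """Return the prime factors of n (with multiplicity), largest first."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)  # whatever remains is itself prime
    return sorted(factors, reverse=True)


def toy_split_counts(num_procs, cell_size):
    """Toy model (assumption): for each prime factor, largest first,
    cut the currently longest chunk direction into that many pieces.
    Returns the number of pieces per direction."""
    pieces = [1] * len(cell_size)
    for f in prime_factors_descending(num_procs):
        # Chunk extent in each direction after the cuts made so far.
        extents = [s / c for s, c in zip(cell_size, pieces)]
        longest = extents.index(max(extents))
        pieces[longest] *= f
    return tuple(pieces)


print(prime_factors_descending(9))       # [3, 3]
print(toy_split_counts(9, (10, 10)))     # (3, 3)
print(toy_split_counts(12, (10, 10)))    # (3, 4)
```

In this toy model, 9 processors on a square cell give a 3 x 3 grid of chunks, since both prime factors are 3.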
Left: `split_by_effort`. Right: `split_by_cost`.
Cell of air with 9 processors
Cell of air with 9 processors and PML
Tall skinny cell with expensive flux plane at the top, 3 processors (`split_by_effort` puts all the work on 1 processor).
Tall skinny cell with expensive flux plane at the top, 9 processors. This can be improved by splitting in the `X` direction, but the gain is not above the 30% threshold, so `split_by_cost` splits in the `Y` direction. Using machine learning to get the actual communication cost should improve this.

Remaining issues
- `split_by_cost` doesn't work with cylindrical coordinates yet, so it falls back to `split_by_effort`.
- `split_by_cost` breaks the `structure::dump` and `structure::load` features. When you run the initial simulation and dump the structure file, the chunks are split optimally based on the geometry, but when you attempt to load the file, there is no geometry passed in, so the chunks are split evenly, resulting in chunk size mismatches. Need to pass a list of volumes to these functions.
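The dump/load mismatch can be illustrated with a one-dimensional toy model. This is only a sketch under stated assumptions: `toy_split_evenly` and `toy_split_by_cost` are hypothetical stand-ins, not the Meep functions, and the per-row costs are made up to mimic a tall cell with an expensive region at the top.

```python
from itertools import accumulate


def toy_split_evenly(costs, n):
    """Cut positions that give each of n chunks about the same number of rows."""
    rows = len(costs)
    return [round(i * rows / n) for i in range(1, n)]


def toy_split_by_cost(costs, n):
    """Cut positions chosen so each of n chunks holds roughly total/n cost."""
    running = list(accumulate(costs))
    total = running[-1]
    cuts = []
    for i in range(1, n):
        target = i * total / n
        # First row count at which the running cost reaches the target.
        cuts.append(next(k + 1 for k, c in enumerate(running) if c >= target))
    return cuts


def chunk_sizes(cuts, rows):
    """Number of rows in each chunk for the given cut positions."""
    edges = [0] + list(cuts) + [rows]
    return [b - a for a, b in zip(edges, edges[1:])]


# Hypothetical costs: 90 cheap rows, then 10 expensive rows at the top.
costs = [1] * 90 + [10] * 10

dumped = toy_split_by_cost(costs, 3)   # layout when the geometry is known
loaded = toy_split_evenly(costs, 3)    # layout when no geometry is passed in

print(chunk_sizes(dumped, len(costs)))  # [64, 30, 6]
print(chunk_sizes(loaded, len(costs)))  # [33, 34, 33]
```

The two layouts disagree on chunk sizes, which is the kind of mismatch described above. Passing the chunk volumes explicitly when loading would let both sides use the same layout.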