bisection search for faster splitting #966

Merged
merged 3 commits into master from fastersplit
Jul 19, 2019

Conversation

stevengj
Collaborator

@stevengj stevengj commented Jul 16, 2019

This updates the split_by_cost function to do a bisection search along each axis to find a balanced splitting, rather than searching every possible splitting, which should speed things up for large simulations.

(In most cases the left and right costs should be monotonic functions of the splitting point, in which case a bisection search should be equivalent to checking every possible point.)
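
To illustrate the equivalence claimed above, here is a minimal Python sketch. It is not the actual split_by_cost code; the per-slice cost list and the function names are hypothetical. It assumes nonnegative slice costs along one axis, so the left cost is nondecreasing and the right cost is nonincreasing in the split point, and it minimizes the larger of the two costs.

```python
from itertools import accumulate

def split_costs(prefix, s):
    """Return (left, right) cost when the cut is placed before slice s."""
    return prefix[s], prefix[-1] - prefix[s]

def best_split_linear(costs):
    """Exhaustive search over every interior split point 1..n-1."""
    prefix = [0] + list(accumulate(costs))
    return min(range(1, len(costs)),
               key=lambda s: max(split_costs(prefix, s)))

def best_split_bisect(costs):
    """Bisection search; relies on the monotonicity assumption above."""
    prefix = [0] + list(accumulate(costs))
    lo, hi = 1, len(costs) - 1
    while lo < hi:
        mid = (lo + hi) // 2          # lo <= mid < hi
        left, right = split_costs(prefix, mid)
        if left < right:
            lo = mid + 1              # left side too cheap: move the cut right
        else:
            hi = mid                  # left side at least as costly: move left
    # lo is the first split with left >= right (or the last split point);
    # the split just before it can be the better of the two.
    candidates = [s for s in (lo - 1, lo) if 1 <= s <= len(costs) - 1]
    return min(candidates, key=lambda s: max(split_costs(prefix, s)))

# Hypothetical cost profile with one expensive slice near the end.
costs = [1, 1, 1, 1, 10, 1]
print(best_split_linear(costs), best_split_bisect(costs))
```

For n slices the exhaustive version evaluates n - 1 split points, while the bisection version evaluates about log2(n), which is where the speedup for large simulations would come from.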

split_into_three should have similar improvements, but that isn't touched by this PR.

@oskooi
Collaborator

oskooi commented Jul 16, 2019

This is failing python/tests/ring-cyl.py because it contains split_chunks_evenly=False. Since the fragment statistics (#681) do not support cylindrical coordinates, the fix is simple, e.g.:

structure.cpp:187-189

change

  if (meep_geom::fragment_stats::resolution == 0 ||
      meep_geom::fragment_stats::has_non_medium_material() ||
      meep_geom::fragment_stats::split_chunks_evenly) {

to

  if (meep_geom::fragment_stats::resolution == 0 ||
      meep_geom::fragment_stats::has_non_medium_material() ||
      meep_geom::fragment_stats::split_chunks_evenly ||
      gv.dim == Dcyl) {

@oskooi
Collaborator

oskooi commented Jul 17, 2019

python/tests/chunks.py (the only other test involving split_chunks_evenly=False) seems to hang indefinitely when run with 4, 6, or 8 processors, whereas with 2 processors it finishes in about a second:

$ mpirun -n 2 python/tests/chunks.py
Using MPI version 3.0, 2 processes
-----------
Initializing structure...
Splitting into 2 chunks by cost
time for choose_chunkdivision = 0.000123024 s

@stevengj
Collaborator Author

stevengj commented Jul 17, 2019

The ring-cyl.py test was working before, so I don't see why it should have suddenly stopped working? It looks like @ChristopherHogan added cylindrical support in #697.

I noticed a problem where my bisection search might not converge and hopefully fixed it.
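
The comment does not say what the convergence problem was, so the following is only a general illustration of how an integer bisection can be written so that it always terminates; it is not the fix made in this PR. The function name and the predicate are hypothetical. The key point is that both branches strictly shrink the half-open interval [lo, hi).

```python
def first_true(pred, lo, hi):
    """Smallest integer i in [lo, hi] with pred(i) true, or hi + 1 if none.

    Assumes pred is monotonic: False ... False True ... True.
    Also returns the number of iterations, to show the loop is bounded.
    """
    hi += 1                          # search the half-open range [lo, hi)
    steps = 0
    while lo < hi:
        mid = lo + (hi - lo) // 2    # guarantees lo <= mid < hi
        if pred(mid):
            hi = mid                 # shrinks because mid < hi
        else:
            lo = mid + 1             # shrinks because mid + 1 > lo
        steps += 1
    return lo, steps

index, steps = first_true(lambda i: i >= 7, 0, 100)
print(index, steps)
```

A common way for such a loop to hang is to update with `lo = mid` instead of `lo = mid + 1`: once `hi == lo + 1`, the midpoint equals `lo` and the interval stops shrinking.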

@oskooi
Collaborator

oskooi commented Jul 17, 2019

Output for python/tests/chunks.py from sim.visualize_chunks() using 8 processors/chunks for this PR:

chunk_layout_8

@oskooi
Collaborator

oskooi commented Jul 17, 2019

Output from master for 8 chunks/processors:

chunk_layout_master_8

@stevengj
Collaborator Author

Good, so it is coming up with the same (or almost the same…could differ by ±1 pixel) chunking.

@stevengj
Collaborator Author

Independent of this PR, it would be nice to understand why some of the corner chunks are a different process (color) than the surrounding chunks — to minimize communication costs, it seems like we would want to assign each process to a contiguous set of chunks where possible.
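
One possible way to get contiguous ownership, sketched here in Python under a strong assumption: the chunks are already listed in an order in which neighbors in the list are also neighbors in space (for example, the order in which a recursive splitting emits them). The function name is hypothetical and this is not how Meep currently assigns chunks to processes.

```python
def assign_contiguous(num_chunks, num_procs):
    """Give each process one consecutive run of chunk indices.

    Run lengths differ by at most one; the first `extra` processes
    receive the longer runs.
    """
    base, extra = divmod(num_chunks, num_procs)
    owners = []
    for p in range(num_procs):
        owners.extend([p] * (base + (1 if p < extra else 0)))
    return owners

print(assign_contiguous(10, 4))  # [0, 0, 0, 1, 1, 1, 2, 2, 3, 3]
```

Whether consecutive list positions really correspond to adjacent regions of the cell depends on how the chunk list is built, so this only helps if that ordering assumption holds.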

@oskooi
Collaborator

oskooi commented Jul 18, 2019

The following are the chunk layouts for a tall, skinny, 2d cell with an expensive flux region at one end for the split_by_cost algorithm from master and this PR for a range of processors/chunks: 4, 6, 8, 10, and 12. The results are similar but not identical. In particular, note that for the case of 4 and 6 processors/chunks, the bisection search of this PR produces 4/6 equally-sized chunks (which is correct) but that the linear search of master produces unequal-sized chunks.

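import meep as mp
import matplotlib.pyplot as plt

# cell dimensions (30 x 5)
sx = 30
sy = 5
# center frequency, bandwidth, and number of flux frequencies
fcen = 1.0
df = 0.1
nfreq = 500

# Gaussian-pulse Ez point source at the center of the cell
src = mp.Source(mp.GaussianSource(fcen,fwidth=df),
                component=mp.Ez,
                center=mp.Vector3())

# split_chunks_evenly=False selects the cost-based splitting (split_by_cost)
sim = mp.Simulation(cell_size=mp.Vector3(sx,sy),
                    sources=[src],
                    resolution=20,
                    verbose=True,
                    k_point=mp.Vector3(0.53,0.14,0),
                    split_chunks_evenly=False)

# expensive flux region (500 frequencies) near one end of the cell
flux = sim.add_flux(fcen,
                    df,
                    nfreq,
                    mp.FluxRegion(center=mp.Vector3(14),size=mp.Vector3(y=sy)))

# build the chunk layout and plot it; only the master process saves the figure
sim.init_sim()
sim.visualize_chunks()
if mp.am_master():
    plt.savefig('imbalanced_2d_chunk_layout.png')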

master

4 processors/chunks

imbalanced_2d_chunk_layout_master_4

6 processors/chunks

imbalanced_2d_chunk_layout_master_6

8 processors/chunks

imbalanced_2d_chunk_layout_master_8

10 processors/chunks

imbalanced_2d_chunk_layout_master_10

12 processors/chunks

imbalanced_2d_chunk_layout_master_12

this PR

4 processors/chunks

imbalanced_2d_chunk_layout_4

6 processors/chunks

imbalanced_2d_chunk_layout_6

8 processors/chunks

imbalanced_2d_chunk_layout_8

10 processors/chunks

imbalanced_2d_chunk_layout_10

12 processors/chunks

imbalanced_2d_chunk_layout_12

@stevengj stevengj merged commit 72a3eed into master Jul 19, 2019
@stevengj stevengj deleted the fastersplit branch July 19, 2019 01:13
bencbartlett pushed a commit to bencbartlett/meep that referenced this pull request Sep 9, 2021
* bisection search for faster splitting

* make sure bisection search terminates

* refactor split_by_cost to not repeat bisection search n times for split_into_n