Change checksum backend to h5py to improve CI performance #725

Merged: 5 commits into Hi-PACE:development on Mar 31, 2022

Conversation

@AlexanderSinn (Member) commented Mar 26, 2022

Based on #724

There seems to be something wrong with the openpmd-api backend in openPMD-viewer, so this PR switches the checksum analysis to the h5py backend for now.
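
For illustration, a minimal sketch of what an h5py-based checksum read can look like is given below. The file name, iteration number, field name, and checksum reduction are assumptions for this example, not the actual HiPACE++ checksum code.

# Minimal sketch (assumed names/paths, not the actual HiPACE++ checksum code):
# read one field from an openPMD HDF5 file directly with h5py and reduce it
# to a scalar checksum.
import h5py
import numpy as np

def field_checksum(filename, iteration, field):
    with h5py.File(filename, "r") as f:
        # standard openPMD HDF5 layout: /data/<iteration>/fields/<field>
        data = f[f"/data/{iteration}/fields/{field}"][()]
    return float(np.sum(np.abs(data)))

# hypothetical output file and field name
print(field_checksum("diags/hdf5/openpmd_000100.h5", 100, "Ez"))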

Default openpmd-api backend:

Test project /home/runner/work/hipace/hipace/build
      Start  1: blowout_wake.2Rank
 1/24 Test  #1: blowout_wake.2Rank ......................   Passed   12.82 sec
      Start  2: collisions.SI.1Rank
 2/24 Test  #2: collisions.SI.1Rank .....................   Passed    4.09 sec
      Start  3: ionization.2Rank
 3/24 Test  #3: ionization.2Rank ........................   Passed    3.34 sec
      Start  4: from_file.normalized.1Rank
 4/24 Test  #4: from_file.normalized.1Rank ..............   Passed    3.33 sec
      Start  5: from_file.SI.1Rank
 5/24 Test  #5: from_file.SI.1Rank ......................   Passed    3.41 sec
      Start  6: restart.normalized.1Rank
 6/24 Test  #6: restart.normalized.1Rank ................   Passed    1.45 sec
      Start  7: blowout_wake_explicit.2Rank
 7/24 Test  #7: blowout_wake_explicit.2Rank .............   Passed    5.13 sec
      Start  8: laser_blowout_wake_explicit.1Rank
 8/24 Test  #8: laser_blowout_wake_explicit.1Rank .......   Passed    6.61 sec
      Start  9: laser_blowout_wake_explicit.SI.1Rank
 9/24 Test  #9: laser_blowout_wake_explicit.SI.1Rank ....   Passed    6.86 sec
      Start 10: beam_evolution.1Rank
10/24 Test #10: beam_evolution.1Rank ....................   Passed   19.20 sec
      Start 11: adaptive_time_step.1Rank
11/24 Test #11: adaptive_time_step.1Rank ................   Passed   39.91 sec
      Start 12: grid_current.1Rank
12/24 Test #12: grid_current.1Rank ......................   Passed    1.88 sec
      Start 13: linear_wake.normalized.1Rank
13/24 Test #13: linear_wake.normalized.1Rank ............   Passed    5.01 sec
      Start 14: linear_wake.SI.1Rank
14/24 Test #14: linear_wake.SI.1Rank ....................   Passed    4.96 sec
      Start 15: gaussian_linear_wake.normalized.1Rank
15/24 Test #15: gaussian_linear_wake.normalized.1Rank ...   Passed   56.19 sec
      Start 16: gaussian_linear_wake.SI.1Rank
16/24 Test #16: gaussian_linear_wake.SI.1Rank ...........   Passed   54.87 sec
      Start 17: reset.2Rank
17/24 Test #17: reset.2Rank .............................   Passed    5.26 sec
      Start 18: beam_in_vacuum.SI.1Rank
18/24 Test #18: beam_in_vacuum.SI.1Rank .................   Passed    7.88 sec
      Start 19: beam_in_vacuum.normalized.1Rank
19/24 Test #19: beam_in_vacuum.normalized.1Rank .........   Passed    6.80 sec
      Start 20: next_deposition_beam.2Rank
20/24 Test #20: next_deposition_beam.2Rank ..............   Passed   13.76 sec
      Start 21: slice_IO.1Rank
21/24 Test #21: slice_IO.1Rank ..........................   Passed   11.44 sec
      Start 22: output_coarsening.2Rank
22/24 Test #22: output_coarsening.2Rank .................   Passed   14.89 sec
      Start 23: gaussian_weight.1Rank
23/24 Test #23: gaussian_weight.1Rank ...................   Passed   67.96 sec
      Start 24: beam_in_vacuum.normalized.2Rank
24/24 Test #24: beam_in_vacuum.normalized.2Rank .........   Passed   14.03 sec

h5py backend:

Test project /home/runner/work/hipace/hipace/build
      Start  1: blowout_wake.2Rank
 1/24 Test  #1: blowout_wake.2Rank ......................   Passed   10.01 sec
      Start  2: collisions.SI.1Rank
 2/24 Test  #2: collisions.SI.1Rank .....................   Passed    1.36 sec
      Start  3: ionization.2Rank
 3/24 Test  #3: ionization.2Rank ........................   Passed    2.58 sec
      Start  4: from_file.normalized.1Rank
 4/24 Test  #4: from_file.normalized.1Rank ..............   Passed    2.89 sec
      Start  5: from_file.SI.1Rank
 5/24 Test  #5: from_file.SI.1Rank ......................   Passed    2.97 sec
      Start  6: restart.normalized.1Rank
 6/24 Test  #6: restart.normalized.1Rank ................   Passed    1.21 sec
      Start  7: blowout_wake_explicit.2Rank
 7/24 Test  #7: blowout_wake_explicit.2Rank .............   Passed    2.08 sec
      Start  8: laser_blowout_wake_explicit.1Rank
 8/24 Test  #8: laser_blowout_wake_explicit.1Rank .......   Passed    4.26 sec
      Start  9: laser_blowout_wake_explicit.SI.1Rank
 9/24 Test  #9: laser_blowout_wake_explicit.SI.1Rank ....   Passed    4.20 sec
      Start 10: beam_evolution.1Rank
10/24 Test #10: beam_evolution.1Rank ....................   Passed    6.15 sec
      Start 11: adaptive_time_step.1Rank
11/24 Test #11: adaptive_time_step.1Rank ................   Passed    6.19 sec
      Start 12: grid_current.1Rank
12/24 Test #12: grid_current.1Rank ......................   Passed    1.35 sec
      Start 13: linear_wake.normalized.1Rank
13/24 Test #13: linear_wake.normalized.1Rank ............   Passed    3.25 sec
      Start 14: linear_wake.SI.1Rank
14/24 Test #14: linear_wake.SI.1Rank ....................   Passed    3.21 sec
      Start 15: gaussian_linear_wake.normalized.1Rank
15/24 Test #15: gaussian_linear_wake.normalized.1Rank ...   Passed    2.84 sec
      Start 16: gaussian_linear_wake.SI.1Rank
16/24 Test #16: gaussian_linear_wake.SI.1Rank ...........   Passed    2.88 sec
      Start 17: reset.2Rank
17/24 Test #17: reset.2Rank .............................   Passed    2.24 sec
      Start 18: beam_in_vacuum.SI.1Rank
18/24 Test #18: beam_in_vacuum.SI.1Rank .................   Passed    6.23 sec
      Start 19: beam_in_vacuum.normalized.1Rank
19/24 Test #19: beam_in_vacuum.normalized.1Rank .........   Passed    5.51 sec
      Start 20: next_deposition_beam.2Rank
20/24 Test #20: next_deposition_beam.2Rank ..............   Passed   10.89 sec
      Start 21: slice_IO.1Rank
21/24 Test #21: slice_IO.1Rank ..........................   Passed   10.06 sec
      Start 22: output_coarsening.2Rank
22/24 Test #22: output_coarsening.2Rank .................   Passed   13.50 sec
      Start 23: gaussian_weight.1Rank
23/24 Test #23: gaussian_weight.1Rank ...................   Passed   28.38 sec
      Start 24: beam_in_vacuum.normalized.2Rank
24/24 Test #24: beam_in_vacuum.normalized.2Rank .........   Passed   12.47 sec

@AlexanderSinn changed the title from "Test CI performance" to "Change checksum backend to h5py to improve CI performance" on Mar 26, 2022
@AlexanderSinn added the bug (Something isn't working) and CI (Continuous integration, checksum and analysis tests, GitHub Actions, etc.) labels on Mar 26, 2022
@AlexanderSinn requested a review from ax3l on Mar 26, 2022
@AlexanderSinn (Member, Author):
Does this fix #717?

@MaxThevenet (Member):

@AlexanderSinn thanks for this PR! Yes, it seems to address #717, although I originally expected that CI was slowed down because HiPACE++ itself was slower, not the Python analysis scripts. Could you merge dev into this branch?
@RemiLehe @ax3l do you see a reason why the post-processing should be slower with the openpmd-api backend than with the h5py backend of openPMD-viewer? In case it helps, here are the versions used in CI (a sketch of how the viewer backend is selected follows the log):

Collecting openpmd-viewer
  Downloading openPMD_viewer-1.3.0-py3-none-any.whl (85 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 85.2/85.2 KB 24.0 MB/s eta 0:00:00
Collecting openpmd-api
  Downloading openPMD_api-0.14.4-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (6.9 MB)

Successfully installed cycler-0.11.0 h5py-3.6.0 kiwisolver-1.4.1 matplotlib-3.2.2 numpy-1.22.3 openpmd-api-0.14.4 openpmd-viewer-1.3.0 python-dateutil-2.8.2 scipy-1.8.0 tqdm-4.63.1
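
For reference, the reader backend discussed above is selected when opening the time series in openPMD-viewer. A minimal sketch, where the diagnostics path is an assumption and the backend keyword argument reflects openPMD-viewer 1.x usage:

from openpmd_viewer import OpenPMDTimeSeries

# default reader when openpmd-api is installed
ts_api = OpenPMDTimeSeries("diags/hdf5", backend="openpmd-api")
# pure h5py reader, as used by this PR's checksum analysis
ts_h5 = OpenPMDTimeSeries("diags/hdf5", backend="h5py")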

@MaxThevenet (Member) left a review comment:

Thanks for this PR. There is still something fishy to look at in the dependencies, but the faster CI time will be appreciated for now.

@MaxThevenet merged commit 0339176 into Hi-PACE:development on Mar 31, 2022
@SeverinDiederichs (Member) commented Mar 31, 2022

The issue was traced to the upgrade of openPMD-viewer from 1.2 to 1.3:

The last fast CI was on PR #687 and the first slow CI was on PR #689.

The fast CI used openPMD-viewer 1.2 (from the log of #687):

Collecting openpmd-viewer
  Downloading openPMD_viewer-1.2.0-py3-none-any.whl (84 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 84.5/84.5 KB 26.7 MB/s eta 0:00:00
Collecting openpmd-api
  Downloading openPMD_api-0.14.4-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (6.9 MB)

The slow CI used openPMD-viewer 1.3 (from the log of #689):

Collecting openpmd-viewer
  Downloading openPMD_viewer-1.3.0-py3-none-any.whl (85 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 85.2/85.2 KB 26.3 MB/s eta 0:00:00
Collecting openpmd-api
  Downloading openPMD_api-0.14.4-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (6.9 MB)

@RemiLehe (Contributor):
Thanks for reporting this.
I wonder if this is related to this PR:
https://github.com/openPMD/openPMD-viewer/pull/332/files
It changes the way in which we read the fields. @SeverinDiederichs Are you able to tell whether the slow-down is related to how we read fields as opposed to particles?

@SeverinDiederichs (Member):

This will require some more testing, because all the tests that take significantly longer read out both fields and particles.
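
One way to separate the two is to time field reads and particle reads independently. A minimal sketch, where the diagnostics path, field name, and species name are assumptions:

import time
from openpmd_viewer import OpenPMDTimeSeries

ts = OpenPMDTimeSeries("diags/hdf5", backend="openpmd-api")

t0 = time.perf_counter()
for it in ts.iterations:
    ts.get_field(field="Ez", iteration=it)                  # field reads only
t1 = time.perf_counter()
for it in ts.iterations:
    ts.get_particle(["w"], species="beam", iteration=it)    # particle reads only
t2 = time.perf_counter()

print(f"fields: {t1 - t0:.2f} s, particles: {t2 - t1:.2f} s")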

@AlexanderSinn (Member, Author) commented Mar 31, 2022

https://github.com/openPMD/openPMD-viewer/pull/332/files does indeed seem to introduce the issue; however, my testing suggests that the slow-down occurs when reading particles with the get_data() function.
@ax3l @RemiLehe

Total time: 86.1999 s
File: /home/asinn/openPMD-Viewer/openPMD-viewer/openpmd_viewer/openpmd_timeseries/data_reader/io_reader/utilities.py
Function: get_data at line 29

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    29                                           @profile
    30                                           def get_data(series, record_component, i_slice=None, pos_slice=None,
    31                                                        output_type=np.float64):
    32                                               """
    33                                               Extract the data from a (possibly constant) dataset
    34                                               Slice the data according to the parameters i_slice and pos_slice
    35
    36                                               Parameters:
    37                                               -----------
    38                                               series: openpmd_api.Series
    39                                                   An open, readable openPMD-api series object
    40
    41                                               record_component: an openPMD.Record_Component
    42
    43                                               pos_slice: int or list of int, optional
    44                                                   Slice direction(s).
    45                                                   When None, no slicing is performed
    46
    47                                               i_slice: int or list of int, optional
    48                                                  Indices of slices to be taken.
    49
    50                                               output_type: a numpy type
    51                                                  The type to which the returned array should be converted
    52
    53                                               Returns:
    54                                               --------
    55                                               An np.ndarray (non-constant dataset) or a single double (constant dataset)
    56                                               """
    57                                               # For back-compatibility: Convert pos_slice and i_slice to
    58                                               # single-element lists if they are not lists (e.g. float
    59                                               # and int respectively).
    60        30         50.0      1.7      0.0      if pos_slice is not None and not isinstance(pos_slice, list):
    61                                                   pos_slice = [pos_slice]
    62        30         32.0      1.1      0.0      if i_slice is not None and not isinstance(i_slice, list):
    63                                                   i_slice = [i_slice]
    64
    65        30      57130.0   1904.3      0.1      chunks = record_component.available_chunks()
    66
    67        30         86.0      2.9      0.0      if pos_slice is None:
    68                                                   # mask invalid regions with zero
    69        30   20196545.0 673218.2     23.4          data = np.zeros_like(record_component)
    70        60        271.0      4.5      0.0          for chunk in chunks:
    71        30       1720.0     57.3      0.0              chunk_slice = chunk_to_slice(chunk)
    72                                                       # read only valid region
    73        30       2422.0     80.7      0.0              x = record_component[chunk_slice]
    74        30   65923020.0 2197434.0     76.5              series.flush()
    75        30      17208.0    573.6      0.0              data[chunk_slice] = x
    76                                               else:
    77                                                   full_shape = record_component.shape
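
For reference, a per-line profile like the one above can be produced with line_profiler, either by decorating get_data with @profile (as shown) and running the analysis script through kernprof -l -v, or programmatically. A minimal sketch of the programmatic route; the diagnostics path and species name are assumptions, and the utilities import path is taken from the file path shown in the profile:

from line_profiler import LineProfiler
from openpmd_viewer import OpenPMDTimeSeries
from openpmd_viewer.openpmd_timeseries.data_reader.io_reader import utilities

lp = LineProfiler()
lp.add_function(utilities.get_data)  # profile get_data line by line
ts = OpenPMDTimeSeries("diags/hdf5", backend="openpmd-api")
lp.runcall(ts.get_particle, ["w"], species="beam", iteration=ts.iterations[-1])
lp.print_stats()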

@ax3l (Member) commented Apr 5, 2022

Sorry for being so slow on GitHub pings; feel free to also ping me on Slack when I overlook something important :(

Thank you for triaging and finding that the root cause of this comes from openPMD/openPMD-viewer#332 / openPMD/openPMD-viewer#334.

You used HDF5 files, right? (Not JSON, which is for testing and very slow, of course.)

@AlexanderSinn (Member, Author):

Yes, HDF5 files.

@ax3l (Member) commented Apr 6, 2022

I am amazed that np.zeros_like already takes ~25% of the runtime to read this.
(In dev, we already changed this to np.full_like: openPMD/openPMD-viewer#334)

How large is the data set in question and how many MPI ranks wrote it?

@ax3l (Member) commented Apr 12, 2022

I drafted a potential fix in openPMD/openPMD-viewer#340

@ax3l (Member) commented Apr 15, 2022

Fixed via openPMD-viewer 1.4.0 🎉
https://github.com/openPMD/openPMD-viewer/releases/tag/1.4.0

Thank you @AlexanderSinn for the help! :)
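
For CI, picking up the fix should just be a matter of installing the newer viewer; the exact install step used in the workflow is an assumption here:

# assumed CI install step to pick up the fixed release
python -m pip install --upgrade "openpmd-viewer>=1.4.0"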
