make it possible to manually set chunks when loading dask arrays #1477

Merged on Aug 17, 2023 (3 commits)

@pordyna (Contributor) commented Jul 14, 2023

Make it possible to override the default chunking behavior when loading data into dask arrays. For PIConGPU, the default creates one chunk per GPU, covering that GPU's local domain. When using static load balancing with gridDist in PIConGPU, this results in very uneven chunks. For many operations, it seems to make more sense to chunk along the outermost direction only, since reading contiguous domains is faster. With this PR, this can be achieved with .to_dask_array(chunks={0: 'auto', 1: -1, 2: -1})
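
To illustrate what chunking along the outermost direction only amounts to, here is a minimal, standard-library-only sketch. It is not the code from this PR and not dask's actual 'auto' heuristic; the helper name outermost_chunks and the 128 MiB target_bytes default are assumptions made up for this example. It splits axis 0 into slabs and keeps every other axis whole, which is the kind of layout that chunks={0: 'auto', 1: -1, 2: -1} asks for.

```python
import math


def outermost_chunks(shape, itemsize, target_bytes=128 * 2**20):
    """Hypothetical helper: split only axis 0, keep all other axes whole.

    Returns a tuple of per-axis chunk-size tuples (the explicit form in
    which dask describes chunks). target_bytes is an assumed size budget.
    """
    # Bytes in one slab of thickness 1 along axis 0.
    slab_bytes = itemsize * math.prod(shape[1:])
    # Slabs per chunk: at least one, at most the full extent of axis 0.
    rows = max(1, min(shape[0], target_bytes // slab_bytes))
    n_full, rest = divmod(shape[0], rows)
    axis0 = (rows,) * n_full + ((rest,) if rest else ())
    # Every other axis becomes a single chunk spanning its full extent.
    return (axis0,) + tuple((n,) for n in shape[1:])


# Example: a 1024 x 512 x 512 float32 field (4 bytes per value).
chunks = outermost_chunks((1024, 512, 512), itemsize=4)
print(chunks)
```

With these numbers each slab along axis 0 is 1 MiB, so the sketch yields eight contiguous chunks of 128 slabs each, independent of how the simulation distributed the domain across GPUs.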

@pordyna pordyna force-pushed the topic-daskChunks branch from fb5da8b to 1fe460d Compare July 14, 2023 13:06
@pordyna pordyna force-pushed the topic-daskChunks branch from 0c924d9 to b6f0bcf Compare July 14, 2023 13:08
@franzpoeschel franzpoeschel requested a review from ax3l July 14, 2023 14:21
@ax3l ax3l self-assigned this Aug 9, 2023
@ax3l (Member) left a comment

Cool, great idea!!

I added a few suggestions to finalize the PR :)

src/binding/python/openpmd_api/DaskArray.py (outdated, resolved)
docs/source/analysis/dask.rst (outdated, resolved)
Copyright and typos in setting chunks for dask arrays.

Co-authored-by: Axel Huebl <[email protected]>

@pordyna (Contributor, Author) commented Aug 14, 2023

Thanks @ax3l, I applied your suggestions.

@ax3l (Member) left a comment

Thank you! :)

@ax3l ax3l added this to the 0.16.0 milestone Aug 17, 2023
@ax3l ax3l enabled auto-merge (squash) August 17, 2023 03:31
@ax3l ax3l merged commit 99daca7 into openPMD:dev Aug 17, 2023
eschnett added a commit to eschnett/openPMD-api that referenced this pull request Sep 5, 2023
* dev:
  Fix CMake: HDF5 Libs are PUBLIC (openPMD#1520)
  Fix `chmod` in `download_samples.sh` (openPMD#1518)
  CI: Old CTest (openPMD#1519)
  Python: Fix ODR Violation (openPMD#1521)
  replace extent in weighting and displacement (openPMD#1510)
  CMake: Warn and Continue on Empty HDF5_VERSION (openPMD#1512)
  Replace openPMD_Datatypes global with function (openPMD#1509)
  Streaming examples: Set WAN as default transport (openPMD#1511)
  TOML Backend (openPMD#1436)
  make it possible to manually set chunks when loading dask arrays (openPMD#1477)
  [pre-commit.ci] pre-commit autoupdate (openPMD#1504)
  Optional debugging output for AbstractIOHandlerImpl::flush() (openPMD#1495)
  Python: 3.8+ (openPMD#1502)

# Conflicts:
#	.github/workflows/linux.yml
#	src/binding/python/Series.cpp
Labels: "api: new additions to the API", "frontend: Python3"
3 participants