NetCDF slow writes when using the stride parameter. #1877

Open
agnishd opened this issue Oct 30, 2020 · 7 comments

Comments

@agnishd

agnishd commented Oct 30, 2020

NetCDF versions in which we have observed this issue:

  • Observed in 4.7.1, 4.6.3, and 4.3.3.1. There is a considerable improvement in 4.7.1, but strided writes are still slower than using a loop to write the same values to the same locations iteratively.

Platform information:

  • Tested on Linux lsdbgl28141glnxa64 4.19.0-9-amd64 #1 SMP Debian 4.19.118-2+deb10u1 (2020-06-07) x86_64 GNU/Linux

Description of the issue:

  • We have observed slower write speeds when using the "nc_put_vars_double()" function to write to a NetCDF4-classic file using the "stride" feature. I created a standalone file to reproduce the issue and tested it on versions 4.7.3, 4.6.1, and 4.3.3.1 of the NetCDF library. The following are some sample runtimes that I got both by using the stride feature and without it:

ver 4.3.3.1:
Strided: 184514115 microseconds
Looped: 336884 microseconds

ver 4.6.1:
Strided: 175383093 microseconds
Looped: 340540 microseconds

ver 4.7.3:
Strided: 5841009 microseconds
Looped: 370361 microseconds

The attached code writes a 5184 x 228 x 40 matrix to a NetCDF file with dimensions 5184 x 228 x (unlimited), using the nc_put_vars_double() function in two ways:
a. Using the stride parameter, i.e. specifying the stride vector as (1, 1, 5).
b. Looping along the z-axis and advancing the start of each write to (0, 0, i * 5) while keeping the stride vector as (1, 1, 1).
NetCDF_slow_stided_writes-main.zip
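
For readers without the attachment, here is a minimal Python sketch (no NetCDF library involved) of why the two patterns should be interchangeable: both address the same cells. The helper `hyperslab_indices` is hypothetical, the 5184 x 228 extents are scaled down to 6 x 4 to keep it quick, and the per-iteration count of (NX, NY, 1) for the looped case is an assumption, since the description above does not state it.

```python
from itertools import product


def hyperslab_indices(start, count, stride):
    """Return the set of coordinates addressed by a start/count/stride request."""
    axes = [
        [s + k * step for k in range(n)]
        for s, n, step in zip(start, count, stride)
    ]
    return set(product(*axes))


# Scaled-down stand-ins for 5184 x 228; the z settings match the report.
NX, NY, NZ, Z_STRIDE = 6, 4, 40, 5

# (a) One strided request with stride vector (1, 1, 5).
strided = hyperslab_indices((0, 0, 0), (NX, NY, NZ), (1, 1, Z_STRIDE))

# (b) NZ unit-stride requests, each starting at (0, 0, i * 5).
# Assumption: each request covers one z-slice, i.e. count = (NX, NY, 1).
looped = set()
for i in range(NZ):
    looped |= hyperslab_indices((0, 0, i * Z_STRIDE), (NX, NY, 1), (1, 1, 1))

print(strided == looped)  # do both patterns address the same cells?
print(len(strided), max(z for _, _, z in strided))
```

Since the target cells are identical, the runtime gap reported above comes from how each request is executed, not from what is written.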

@DennisHeimbigner
Collaborator

This has been an ongoing issue.
See, for example: issue #1380
If we can ever get a fix for that issue, it should significantly improve
strided performance for netcdf-4 files.

@edwardhartnett
Contributor

image

@agnishd
Author

agnishd commented Nov 12, 2020

Thanks for letting us know that this is an ongoing issue!

I also wanted to bring something else we found to your attention, since it may be related to this issue.

Before testing strided writes on NetCDF4-classic files, I had happened to run the same test on NetCDF3 files and got the following results:

  1. 4.3.3.1:
    Strided: 7538865 microseconds
    Looped: 2355040 microseconds

  2. 4.6.1:
    Strided: 7761987 microseconds
    Looped: 2531814 microseconds

  3. 4.7.3:
    Strided: 6834903 microseconds
    Looped: 1505734 microseconds

Notice that the difference across versions isn't as stark as in the case of NetCDF4-classic files but strided writes are still slower compared to their looped versions.
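
To put numbers on that comparison, the short Python sketch below computes the strided/looped ratio for each run. The timings are copied from the two result lists in this thread; the `slowdown` helper and the dictionary layout are just illustrative. The ratios come out at roughly 548x, 515x and 16x for NetCDF4-classic, against roughly 3.2x, 3.1x and 4.5x for NetCDF3.

```python
# Timings in microseconds as (strided, looped), copied from this thread.
timings = {
    "NetCDF4-classic": {
        "4.3.3.1": (184514115, 336884),
        "4.6.1": (175383093, 340540),
        "4.7.3": (5841009, 370361),
    },
    "NetCDF3": {
        "4.3.3.1": (7538865, 2355040),
        "4.6.1": (7761987, 2531814),
        "4.7.3": (6834903, 1505734),
    },
}


def slowdown(strided_us, looped_us):
    """How many times slower the strided write was than the looped one."""
    return strided_us / looped_us


ratios = {
    fmt: {ver: slowdown(*pair) for ver, pair in runs.items()}
    for fmt, runs in timings.items()
}

for fmt, runs in ratios.items():
    for ver, ratio in runs.items():
        print(f"{fmt:16s} {ver:8s} strided/looped = {ratio:.1f}x")
```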

I just had a few questions about this:

  1. Is this known?
  2. Is it a bug?
  3. Or is this behaviour as per current expectations?

@DennisHeimbigner
Collaborator

The problem is that when stride > 1, both netcdf-3 and netcdf-4 currently
devolve into walking the dataset one element at a time. This is where the time goes.
For netcdf-4, we have the option of using the HDF5 stride implementation, which
is optimized to improve this case. To date, however, we have not made such an
optimization available in netcdf-3. Hence it will likely remain slower than
non-strided writes until we can find the resources to fix it.
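
As a rough illustration of what walking one element at a time means for the reported test case, the Python sketch below models such a walk with an odometer-style counter and counts the resulting single-element operations. This is a model of the behaviour described above, not the library's actual code; `walk_elements` is a hypothetical name.

```python
from math import prod


def walk_elements(start, count, stride):
    """Yield each coordinate of a hyperslab in row-major order, one at a time."""
    ndim = len(count)
    if ndim == 0 or any(n == 0 for n in count):
        return
    idx = [0] * ndim
    while True:
        yield tuple(s + i * step for s, i, step in zip(start, idx, stride))
        # Odometer increment: the last dimension varies fastest.
        d = ndim - 1
        while d >= 0:
            idx[d] += 1
            if idx[d] < count[d]:
                break
            idx[d] = 0
            d -= 1
        if d < 0:
            return


# Small request, so the walk itself can be listed.
small = list(walk_elements((0, 0, 0), (2, 3, 4), (1, 1, 5)))
print(len(small), small[:3])

# Full-size request from the report: one operation per element in this model,
# versus one block request per z step in the looped variant.
full_count = (5184, 228, 40)
element_writes = prod(full_count)
looped_calls = full_count[2]
print(element_writes, looped_calls)
```

Under this model the strided request turns into about 47 million single-element operations, while the looped variant issues 40 unit-stride block requests over the same cells.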

@agnishd
Author

agnishd commented Jun 1, 2021

Hello @DennisHeimbigner,

I just wanted to check whether there have been any updates on this issue yet. If not, do you have anything planned for an upcoming release to address it?

@agnishd
Author

agnishd commented Oct 1, 2021

Hello @DennisHeimbigner,

We upgraded NetCDF to v4.8.1 and ran the repro code I sent you earlier on it and found that the issue is still there. I was wondering if there has been any update on this issue yet or if you have anything planned for a future release.

@DennisHeimbigner
Collaborator

I had hoped that Ed Hartnett would fix this. But he is swamped with other work.
I will try to get to it as soon as feasible.
