zfp: Cannot allocate memory in buffer_write #180
Hi,
Thanks for the report. There is indeed a bug in how the transformations
handle buffering. We have not yet figured out how to fix it; I just want
to let you know that we are looking into this issue.
On another note, your example would run out of memory at some point
anyway, even without transformations. ADIOS buffers all writes into
one buffer, which is flushed in adios_close().
One way to avoid running out of memory is to call adios_close() at
regular intervals and re-open with "u" (update mode), so that all writes
end up in one timestep.
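The close-and-reopen pattern described above can be sketched roughly as follows. This is only an illustration: writeInBatches is a hypothetical helper, not part of ADIOS, and it takes the actual write, close and reopen steps as callables so that the control flow can be shown without linking against ADIOS. The ADIOS calls named in the comments are the ones from the example in this thread, with "u" as the reopen mode.

```cpp
#include <cstddef>
#include <functional>

// Hypothetical helper (not an ADIOS function): writes numVars variables and
// performs a close/reopen cycle after every batchSize writes, so that the
// buffered data is flushed before the buffer can fill up.
// Returns the number of intermediate close/reopen cycles.
inline std::size_t writeInBatches(std::size_t numVars,
                                  std::size_t batchSize,
                                  const std::function<void(std::size_t)>& writeVar,
                                  const std::function<void()>& closeFile,
                                  const std::function<void()>& reopenForUpdate)
{
    std::size_t cycles = 0;
    for (std::size_t i = 0; i < numVars; ++i) {
        writeVar(i);  // e.g. adios_write_byid(handle, var, data.data())

        // batchSize == 0 is treated as "never flush in between".
        const bool batchFull = batchSize != 0 && (i + 1) % batchSize == 0;
        const bool moreToWrite = i + 1 < numVars;
        if (batchFull && moreToWrite) {
            closeFile();        // e.g. adios_close(handle), which flushes the buffer
            reopenForUpdate();  // e.g. adios_open(&handle, "data", "test.bp", "u", comm)
            ++cycles;
        }
    }
    return cycles;  // the caller still issues the final close itself
}
```

With 100 variables and a batch size of 10, this performs 9 intermediate close/reopen cycles, followed by the final adios_close() in the caller. A suitable batch size depends on the variable size and the buffer limit and has to be chosen for the case at hand.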
Another way is to use the POSIX method and set a maximum buffer size. The
POSIX transport method can handle writing more data than the buffer
allows, but the other transports cannot.
Thanks
On Tue, May 29, 2018 at 6:15 AM, jkelling wrote:
I encountered a problem when attempting to use the zfp transform in Adios
1.13.1. When trying to write larger amounts of data, an error is printed
and the program crashes with a SIGSEGV immediately afterwards.
Below you find a minimal example, which works if the "identity" transform
is used instead of zfp.
Expected behavior
The program runs to completion and writes to test.bp.
Encountered behavior
Depending on the total amount of data written and the size of the written
variables the following error message is printed:
Cannot allocate memory in buffer_write. Requested: 36783836, Maximum: 36777876
ERROR: Cannot allocate shared buffer of 21875024 bytes for ZFP transform for variable data/95
The point at which it fails depends on the size of the variables, but
there is no monotonic relation:
$ zfpExample 5000000
[...]
ERROR: Cannot allocate shared buffer of 21875024 bytes for ZFP transform for variable data/95
$ zfpExample 4000000
[...]
ERROR: Cannot allocate shared buffer of 17500024 bytes for ZFP transform for variable data/0
However, with smaller variables it is more likely that more variables can
be written before crashing, or the program might even complete, for
example with zfpExample 400000.
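For orientation, the sizes in the two error messages above can be related to the variable sizes with a bit of arithmetic. The sketch below is only an observation about the reported numbers: both shared-buffer sizes happen to equal 4.375 bytes per element plus 24 bytes, which is an inference from these two runs and not taken from the ADIOS or zfp sources. All function names are hypothetical.

```cpp
#include <cstdint>

// Raw size in bytes of one variable holding `count` single-precision floats.
constexpr std::uint64_t rawBytes(std::uint64_t count)
{
    return count * sizeof(float);
}

// Raw size in bytes of all variables written by the example loop.
constexpr std::uint64_t totalRawBytes(std::uint64_t numVars, std::uint64_t count)
{
    return numVars * rawBytes(count);
}

// ASSUMPTION: 35/8 = 4.375 bytes per element plus 24 bytes reproduces both
// "Cannot allocate shared buffer of N bytes" values quoted above. This is a
// fit to two data points, not a documented formula.
constexpr std::uint64_t observedZfpBufferBytes(std::uint64_t count)
{
    return count * 35 / 8 + 24;
}
```

For 5000000 elements this gives 20000000 raw bytes per variable and a requested transform buffer of 21875024 bytes, so the requested buffer is larger than the uncompressed variable. The 100 variables of that run add up to 2000000000 raw bytes, far more than the maximum of 36777876 bytes reported in the first error message.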
After the error message is printed, the program segfaults with the
following backtrace:
#0 __memcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:37
#1 0x0000000000483230 in adios_transform_zfp_apply ()
#2 0x0000000000481a06 in adios_transform_apply ()
#3 0x0000000000480853 in adios_transform_variable_data ()
#4 0x0000000000420c53 in common_adios_write_transform_helper ()
#5 0x000000000042121c in common_adios_write ()
#6 0x00000000004219a3 in common_adios_write_byid ()
#7 0x000000000041e482 in adios_write_byid ()
#8 0x0000000000417fe8 in main (argc=2, argv=0x7fffffffc878) at zfpExample.cpp:57
Example Code:
This assumes Adios was built with MPI, but does not use parallel writes.
Run it without MPI or with mpirun -n 1.
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <iostream>
#include <random>
#include <sstream>
#include <string>
#include <vector>
#include <mpi.h>
#include <adios.h>

// Print a message and abort if an Adios call returned a non-zero error code.
inline void exitOnError(const char* msg, int err) {
    if(err)
    {
        std::cerr << "[EE]" << msg << "\tAdios error code: " << err << '\n';
        std::exit(1);
    }
}

const char* TRANSFORM = "zfp:accuracy=0.0001";
// const char* TRANSFORM = "identity";

int main(int argc, char* argv[])
{
    // Number of floats per variable: first argument, or 10000000 by default.
    std::vector<float> data;
    if(argc != 2)
        data.resize(10000000, 0.f);
    else
        data.resize(std::atoi(argv[1]), 0.f);

    MPI_Init(0,0);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    adios_init_noxml(MPI_COMM_WORLD);
    //adios_set_max_buffer_size(100); // no effect

    int64_t grpid;
    exitOnError("Failed to declare group."
        , adios_declare_group(&grpid, "data", "idx", adios_stat_no));
    exitOnError("Failed to select method MPI."
        , adios_select_method(grpid, "MPI", "", ""));

    int64_t adiosHandle;
    exitOnError("Failed to open file."
        , adios_open(&adiosHandle, "data", "test.bp", "w", MPI_COMM_WORLD));

    std::ostringstream size;
    size << data.size();

    // Define and write 100 variables data/0 .. data/99 of identical size.
    for(int a = 0; a < 100; ++a)
    {
        std::ostringstream oname;
        oname << "data/" << a;
        std::cerr << oname.str() << ' ' << size.str() << '\n';
        auto var = adios_define_var(grpid, oname.str().c_str(), "", adios_real
            , size.str().c_str(), size.str().c_str(), "");
        exitOnError("Failed to set transform", adios_set_transform(var, TRANSFORM));
        exitOnError("Failed to write", adios_write_byid(adiosHandle, var, data.data()));
    }

    exitOnError("Failed to close", adios_close(adiosHandle));
    adios_finalize(rank);
    MPI_Finalize();
}