running error #191

wangwu1991 · 2022-05-27T09:07:31Z

[ERR] MemoryManager::Malloc1D error.
[mars01:187617:0:187617] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x37)
BFD: Dwarf Error: found dwarf version '5', this reader only handles version 2, 3 and 4 information.

ben-e-whitney · 2022-05-27T18:21:14Z

@JieyangChen7 This looks like an MGARD-X error, I think?

@wangwu1991 Thanks very much for the bug report. Can you tell us how you compiled MGARD and what command you were running when you got this segfault?

wangwu1991 · 2022-05-30T02:45:43Z

The segfault appeared when running './mgard-x'.
MGARD was compiled on a CentOS 7 HPC system with gcc/11.2.0 and cmake/3.23.1; zstd and protoc were installed using MGARD/build_scripts/build_mgard_serial.sh. (BTW, as a normal user without sudo, I get the same failure when running MGARD with my own installations of zstd and protoc.)

JieyangChen7 · 2022-06-01T18:42:35Z

@wangwu1991 Thank you for creating this issue. Would you be able to provide us with the input data and parameters you used for compression?

wangwu1991 · 2022-06-02T08:12:10Z

The input data is "testfloat_8_8_128.dat" in SZ2. The full command I used is "./mgard-x -z -i ./testfloat_8_8_128.dat -c testfloat_8_8_128.dat_z -r 1 -t s -n 3 8 8 128 -m rel -e 1e-3 -s 0 -l 3 -d serial -v". Thanks.

JieyangChen7 · 2022-08-22T17:44:19Z

@wangwu1991 Sorry about the late reply. We have fixed the problem in #203 and here is an example output of compressing "testfloat_8_8_128.dat". Also, please note that the dimensions should be represented in slowest-to-fastest order.

$ mgard-x -z -i testfloat_8_8_128.dat -c testfloat_8_8_128.dat.mgard -r 1 -t s -n 3 128 8 8 -m rel -e 1e-3 -s 0 -l 0 -d serial -v
[info] mode: compression
[info] original data: /home/jieyang/dev/data/testfloat_8_8_128.dat
[info] compressed data: /home/jieyang/dev/data/testfloat_8_8_128.dat.mgard
[info] data type: Single precision
[info] error bound mode: Relative
[info] error bound: 1.000000e-03
[info] s: 0
[info] lossless: Huffman
[info] device type: SERIAL
[info] Verbose: enabled
[info] Loading file: /home/jieyang/dev/data/testfloat_8_8_128.dat
[info] Select device: CPU
[time] Calculating norm time: 1.0503e-05s
[info] L_2 norm: 1.58708
[time] Decomposition time: 0.00586623s
[time] Quantization time: 0.000754992s
[info] Outlier ratio: 39/8192 (0.476074%)
[time] Level Linearizer type: 1 time: 0.000391853s
[time] Huffman Compress time: 4.13656s
[info] Huffman block size: 20480
[info] Huffman dictionary size: 8192
[info] Huffman compress ratio: 32768/40356 (0.811973)
[time] Overall Compress time: 4.14367s
[time] Compression Throughput: 7.90797e-06 GB/s
[info] Compression ratio: 0.810046
[info] Select device: CPU
[time] Level Linearizer type: 1 time: 0.000374763s
[time] Huffman Decompress time: 0.000380089s
[time] Dequantization time: 0.000674831s
[time] Recomposition time: 0.00604179s
[time] Overall Decompression time: 0.00719353s
[time] Decompression Throughput: 0.00455521 GB/s
[info] Relative L_2 error: 5.346481e-04 (Satisified)
[info] MSE: 7.200034e-07
[info] PSNR: 75.1234
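
To illustrate the note about slowest-to-fastest dimension order, here is a minimal Python sketch (not MGARD code). It assumes the input file is a raw, headerless array of 4-byte floats, which is consistent with the "Single precision" data type and the 8192 elements and 32768 bytes that appear in the log above. The helper names `expected_size` and `linear_offset` are hypothetical and exist only for this illustration.

```python
import math

# Dimensions as passed in the corrected command: -n 3 128 8 8
# (slowest-varying dimension first, fastest-varying dimension last).
DIMS = (128, 8, 8)
FLOAT32_BYTES = 4  # assumption: "-t s" data is stored as 4-byte floats


def expected_size(dims, itemsize=FLOAT32_BYTES):
    """Return (element count, byte count) for a raw array of this shape."""
    n_elements = math.prod(dims)
    return n_elements, n_elements * itemsize


def linear_offset(index, dims):
    """Element offset of `index` when the last dimension varies fastest."""
    offset = 0
    for i, n in zip(index, dims):
        if not 0 <= i < n:
            raise IndexError(f"index {index} out of range for dims {dims}")
        offset = offset * n + i
    return offset


n_elements, n_bytes = expected_size(DIMS)
print(n_elements, n_bytes)  # 8192 32768

# A step along the fastest (last) dimension moves by one element;
# a step along the slowest (first) dimension moves by 8 * 8 elements.
print(linear_offset((0, 0, 1), DIMS))  # 1
print(linear_offset((1, 0, 0), DIMS))  # 64
```

The total element count is the same for "8 8 128" and "128 8 8", so a file-size check alone cannot catch a wrong ordering; only the stride of each dimension changes.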

wangwu1991 · 2022-08-26T08:20:56Z

After compiling the updated code and running "mgard-x -z -i testfloat_8_8_128.dat -c testfloat_8_8_128.dat.mgard -r 1 -t s -n 3 128 8 8 -m rel -e 1e-3 -s 0 -l 0 -d serial -v", the output is:
./mgard-x -z -i testfloat_8_8_128.dat -c testfloat_8_8_128.dat.mgard -r 1 -t s -n 3 128 8 8 -m rel -e 1e-3 -s 0 -l 0 -d serial -v
[info] mode: compression
[info] original data: testfloat_8_8_128.dat
[info] compressed data: testfloat_8_8_128.dat.mgard
[info] data type: Single precision
[info] error bound mode: Relative
[info] error bound: 1.000000e-03
[info] s: 0
[info] lossless: Huffman
[info] device type: Serial
[info] Verbose: enabled
[info] Loading file: testfloat_8_8_128.dat
[time] Calculating norm time: 4.661e-05s
[info] L_2 norm: 1.58708
[time] Decomposition time: 0.00731331s
[time] Quantization time: 0.00113645s
[info] Outlier ratio: 39/8192 (0.476074%)
[ERR] MemoryManager::Malloc1D error.
Segmentation fault

wangwu1991 · 2022-08-26T08:27:49Z

When I compiled the same code on another computer of mine running Ubuntu 18.04 with gcc 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04), the program ran without problems and the results were correct (exactly the same as yours). Apart from the operating system and the compiler, everything else is basically the same. I'm very eager to know why.

wangwu1991 · 2022-08-26T08:33:36Z

Since I would like to incorporate this algorithm into my own programs, I hope a simplified version (for example, one without the protobuf dependency) can be provided; that would make the algorithm easier to adopt.

ben-e-whitney assigned JieyangChen7 May 27, 2022

ben-e-whitney added the bug label May 27, 2022
