PPC64le SegFaults #980

Closed
mmalohlava opened this issue Apr 26, 2018 · 2 comments · Fixed by #1010
mmalohlava commented Apr 26, 2018

To reproduce

  1. Log in to mr-0xp1
  2. cd /home/jenkins/slave_dir_from_mr-0xc1/workspace/h2o-py-data-table_rel-0.3.2-HSRX7AFTBJEESK7VGQBWNRPFHJGGB6NZY3F2GPECNPOQ2U7CP6VA
  3. Run docker run --ulimit core=-1 -it --rm -u $(id -u):$(id -g) -v $(pwd):$(pwd) -w $(pwd) docker.h2o.ai/opsh2oai/datatable-build-ppc64le_centos7
  4. Build datatable
    cd build
    . activate datatable-py36
    make debug
    
  5. Run tests
    cd ../test-ppc64le_linux-datatable-py36
    . activate datatable-py36
    make test MODULE=datatable
    
  6. Observe the segfault (it is non-deterministic)
    tests/munging/test_dt_rows.py .................................................................make: *** [test] Segmentation fault (core dumped)
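
Because the crash does not show up on every run, it can help to repeat the test run until it does. The following is a minimal sketch, not part of the project's tooling: the helper names and `PYTEST_CMD` are hypothetical, and it assumes a POSIX system where `subprocess` reports a child killed by a signal as a negative return code. It launches pytest directly rather than `make test`, since `make` would report its own exit status instead of the signal.

```python
import signal
import subprocess
import sys

# Assumed command: the same pytest invocation that is run under gdb in this thread.
PYTEST_CMD = [sys.executable, "-m", "pytest", "-x", "tests"]


def died_of_segfault(returncode):
    """Return True if a direct child process was killed by SIGSEGV.

    On POSIX, subprocess encodes death-by-signal as the negated signal number.
    """
    return returncode == -signal.SIGSEGV


def run_until_segfault(cmd, max_attempts=20):
    """Re-run cmd until it segfaults.

    Returns the 1-based attempt number of the first segfault,
    or None if no run crashed within max_attempts.
    """
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(cmd)
        if died_of_segfault(result.returncode):
            return attempt
    return None
```

Calling `run_until_segfault(PYTEST_CMD)` from the test directory would then stop at the first crashing run; with `ulimit -c unlimited` in effect, that run should also leave a core file to inspect.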
    

1. SegFault

From gdb:

(gdb) info frame
Stack level 0, frame at 0x3fffa004e070:
 pc = 0x3ffed917002c; saved pc 0x3fffa4ba4838
 called by frame at 0x3fffa004e1b0
 Arglist at 0x3fffa004e070, args:
 Locals at 0x3fffa004e070, Previous frame's sp is 0x3fffa004e070
(gdb)

The process mappings show that the saved PC falls inside the datatable shared library:

 0x3fffa4b00000     0x3fffa4cd0000   0x1d0000        0x0 /opt/h2oai/dai/python/envs/datatable-py36/lib/python3.6/site-packages/datatable/lib/_datatable.cpython-36m-powerpc64le-linux-gnu.so
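
The address check above can be reproduced mechanically. This is a small sketch with hypothetical helper names (`parse_mappings`, `locate`), assuming mapping lines in the `start end size offset path` layout shown above. It only uses the addresses quoted in this report.

```python
def parse_mappings(text):
    """Parse lines of the form 'start end size offset path' into (start, end, path)."""
    mappings = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5 or not fields[0].startswith("0x"):
            continue  # skip blank lines, headers, and mappings without a path
        start, end = int(fields[0], 16), int(fields[1], 16)
        mappings.append((start, end, fields[4]))
    return mappings


def locate(addr, mappings):
    """Return (path, offset from mapping start) for addr, or None if no mapping contains it."""
    for start, end, path in mappings:
        if start <= addr < end:
            return path, addr - start
    return None


# The single mapping line quoted in this report.
MAPPINGS = """
 0x3fffa4b00000     0x3fffa4cd0000   0x1d0000        0x0 /opt/h2oai/dai/python/envs/datatable-py36/lib/python3.6/site-packages/datatable/lib/_datatable.cpython-36m-powerpc64le-linux-gnu.so
"""
maps = parse_mappings(MAPPINGS)

saved_pc = 0x3FFFA4BA4838   # "saved pc" from `info frame`
fault_pc = 0x3FFED917002C   # "pc" from `info frame`

print(locate(saved_pc, maps))  # path of the datatable .so and the offset into it
print(locate(fault_pc, maps))  # None: the faulting pc is outside this mapping
```

Since the mapping's file offset is `0x0`, the offset returned for the saved PC is also its offset within the `.so` file. The faulting PC itself does not fall inside this mapping.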

2. SegFault

When the tests crash at this point:

tests/test_dt.py .............................  [  2%]
tests/test_dt_options.py .........  [  3%]
tests/test_dt.py ........................s....  [  6%]
tests/test_dt_create.py ............................................  [ 11%]
tests/test_dt_expr.py ........................make: *** [test] Segmentation fault (core dumped)

GDB reports the crash at a static_cast inside a section that is parallelized with OpenMP:

#1 0x00003fffa9b9d648 in .omp_outlined..38(void) (.global_tid.=0x3fffd3e30378, .bound_tid.=0x3fffd3e30380, this=0x3fffa9bb9000 <RowIndex::inverse(long) const+560>, xi=@0x3fffd3e30388: 0x3fffa51d5308, xo=@0x119237f30: 0x63, una=@0x118ffb070: 4054029733469093921,
    umin=@0x3fffa76951a0: 176767840160579700) at c/sort.cc:399
399	        xo[j] = t == una? 0 : static_cast<TO>(t - umin + 1);

mmalohlava commented Apr 26, 2018

Running tests in gdb:

gdb -ex r --args python -m pytest -x tests

1. SegFault

(gdb) info frame
Stack level 0, frame at 0x3fffb149e3b0:
 pc = 0x3ffe5a95002c in make_rowindex1; saved pc 0x3fffb6324838
 called by frame at 0x3fffb149e620
 Arglist at 0x3fffb149e270, args:
 Locals at 0x3fffb149e270, Previous frame's sp is 0x3fffb149e3b0
 Saved registers:
  r31 at 0x3fffb149e3a0, pc at 0x3fffb149e3c0, lr at 0x3fffb149e3c0
(gdb) bt
#0  0x00003ffe5a95002c in make_rowindex1 ()
#1  0x00003fffb6324838 in .omp_outlined.(void) (.global_tid.=0x3fffb149e68c, .bound_tid.=0x3fffb149e688, zrows_per_chunk=@0x3fffffff2e58: 65536, num_chunks=@0x3fffffff2e60: 1, this=0x100db2fe0, rows_per_chunk=@0x3fffffff2e68: 65536, n=@0x3fffffff2fb0: 11,
    ff=@0x3fffffff2fb8: 0x3ffe5a950000 <make_rowindex1>, out_length=@0x3fffffff2e70: 0) at c/rowindex_array.cc:181
#2  0x00003fffb5f4f488 in __kmp_invoke_microtask () from /opt/h2oai/dai/python/envs/datatable-py36/lib/python3.6/site-packages/datatable/lib/./libomp.so
#3  0x00003fffb5f1f5b0 in __kmp_invoke_task_func () from /opt/h2oai/dai/python/envs/datatable-py36/lib/python3.6/site-packages/datatable/lib/./libomp.so
Backtrace stopped: frame did not save the PC

2. SegFault

(gdb) bt
#0  0x00003ffd72de0018 in map1 ()
#1  0x00003fffb62fd648 in columns_from_mixed (spec=0x1013236f0, ncols=1, nrows=5, dt=0x1013c1ef0, mapfn=0x3ffd72de0000 <map1>) at c/columnset.cc:124
#2  0x00003fffb631d598 in pycolumnset::columns_from_mixed (args=0x3fffb198b9a8) at c/py_columnset.cc:120
#3  0x00003fffb631d318 in pycolumnset::columns_from_mixed_safe (self=0x3fffb659a278, args=0x3fffb198b9a8) at c/py_columnset.h:63
#4  0x00000001000d19c0 in _PyCFunction_FastCallDict ()
#5  0x000000010011e994 in _PyCFunction_FastCallKeywords ()
#6  0x00000001001bb14c in call_function ()
#7  0x00000001001f8194 in _PyEval_EvalFrameDefault ()
#8  0x00000001000ae3b4 in PyEval_EvalFrameEx ()
#9  0x00000001001afa38 in _PyEval_EvalCodeWithName ()
#10 0x00000001001b1144 in fast_function ()

@mmalohlava mmalohlava changed the title from "PPC64le Seg-faults" to "PPC64le SegFaults" Apr 26, 2018
@st-pasha st-pasha added the segfault Severe bugs that lead to crashes / seg.faults / process termination label Apr 27, 2018
@mmalohlava mmalohlava self-assigned this May 17, 2018
mmalohlava added a commit that referenced this issue Jul 18, 2018
* [BUILD] Upgrade to LLVM6

Includes:
  - Upgrade of LLVM to version 6.0.1 provided by dai-thirdparty-deps
  project
  - llvmlite upgrade
  - LLVM config path

The upgrade enables PPC64le tests and avoids segfaults reported by:
  - numba/numba#2451 
  - numba/numba#2848

Close #980
abal5 pushed a commit that referenced this issue Sep 13, 2018
@st-pasha st-pasha added this to the Release 0.4.0 milestone Jan 29, 2020