[llvm] Further improve LLVM 10 compatibility by using RTLD_DEEPBIND #1355

Merged

archibate
Collaborator

Related issue = #1326

[Click here for the format server]


OpenGL failed to start on a very old NVIDIA card that doesn't support compute shaders. RTLD_DEEPBIND seems to fix that issue.

@yuanming-hu yuanming-hu changed the title [llvm] Further improve libLLVM-10.so compatibility by using RTLD_DEEPBIND [llvm] Further improve LLVM 10 compatibility by using RTLD_DEEPBIND Jun 30, 2020
Member

@yuanming-hu yuanming-hu left a comment

Thanks for fixing this - I'm not sure whether this will break Taichi in other Linux/OS X environments, since it's unclear to me how RTLD_DEEPBIND works.

Shouldn't we at least do this only on Linux? Here RTLD_DEEPBIND is used for both Linux and Mac.

Also note that LLVM 10 is statically linked into libtaichi.so - there's no libLLVM-10.so.
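
A minimal sketch of the "Linux only" idea raised above, assuming the flag is combined in C++ before a dlopen call. The helper name `with_deep_binding` is hypothetical and is not taken from this PR. RTLD_DEEPBIND is a glibc extension, so the sketch only adds it when the platform is Linux and the macro is actually defined:

```cpp
#include <cassert>
#include <dlfcn.h>

// Hypothetical helper (not Taichi's code): add RTLD_DEEPBIND to a set of
// dlopen flags only where the flag exists. On macOS the macro is not defined,
// so the flags pass through unchanged.
int with_deep_binding(int base_flags) {
  // dlopen expects one of RTLD_LAZY or RTLD_NOW to be present.
  assert((base_flags & (RTLD_LAZY | RTLD_NOW)) != 0);
#if defined(__linux__) && defined(RTLD_DEEPBIND)
  return base_flags | RTLD_DEEPBIND;
#else
  return base_flags;
#endif
}
```

A caller would then pass `with_deep_binding(RTLD_NOW)` as the flags argument of `dlopen`, which keeps the behavior on other platforms exactly as it was before.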

@archibate
Collaborator Author

Well, but the stack trace tells me:

root@archlinux ~/taichi (git)-[master] # TI_ENABLE_OPENGL=1 ti example mpm128
[Taichi] mode=development
[Taichi] preparing sandbox at /tmp/taichi-x0wp6umu
[Taichi] <dev mode>, llvm 10.0.0, commit 67c6ea62, python 3.8.3

*******************************************
**      Taichi Programming Language      **
*******************************************
Running example mpm128 ...
python3: /mnt/llvm-10.0.0.src/include/llvm/Support/CommandLine.h:853: void llvm::cl::parser<DataType>::addLiteralOption(llvm::StringRef, const DT&, llvm::StringRef) [with DT = llvm::FunctionPass* (*)(); DataType = llvm::FunctionPass* (*)()]: Assertion `findOption(Name) == Values.size() && "Option already exists!"' failed.
[E 06/30/20 10:47:09.846] Received signal 6 (Aborted)


***********************************
* Taichi Compiler Stack Traceback *
***********************************
/tmp/taichi-x0wp6umu/taichi_core.so: taichi::Logger::error(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool)
/tmp/taichi-x0wp6umu/taichi_core.so: taichi::signal_handler(int)
/usr/lib/libc.so.6(+0x3c3e0) [0x7f8d05a623e0]
/usr/lib/libc.so.6: gsignal
/usr/lib/libc.so.6: abort
/usr/lib/libc.so.6(+0x25727) [0x7f8d05a4b727]
/usr/lib/libc.so.6(+0x34936) [0x7f8d05a5a936]
/tmp/taichi-x0wp6umu/taichi_core.so(+0x2030400) [0x7f8cf6de8400]
/usr/lib/libLLVM-10.so(+0x816679) [0x7f8cee1c1679]
/lib64/ld-linux-x86-64.so.2(+0x110f2) [0x7f8d05fa80f2]
/lib64/ld-linux-x86-64.so.2(+0x11201) [0x7f8d05fa8201]
/usr/lib/libc.so.6: _dl_catch_exception
/lib64/ld-linux-x86-64.so.2(+0x1549c) [0x7f8d05fac49c]
/usr/lib/libc.so.6: _dl_catch_exception
/lib64/ld-linux-x86-64.so.2(+0x149be) [0x7f8d05fab9be]
/usr/lib/libdl.so.2(+0x134c) [0x7f8d059ff34c]
/usr/lib/libc.so.6: _dl_catch_exception
/usr/lib/libc.so.6: _dl_catch_error
/usr/lib/libdl.so.2(+0x1b89) [0x7f8d059ffb89]
/usr/lib/libdl.so.2: dlopen
/usr/lib/libGLX_mesa.so.0(+0x56990) [0x7f8cf45ad990]
/usr/lib/libGLX_mesa.so.0(+0x4bf24) [0x7f8cf45a2f24]
/usr/lib/libGLX_mesa.so.0(+0x35724) [0x7f8cf458c724]
/usr/lib/libGLX_mesa.so.0(+0x364ee) [0x7f8cf458d4ee]
/tmp/taichi-x0wp6umu/taichi_core.so: _glfwInitGLX
/tmp/taichi-x0wp6umu/taichi_core.so: _glfwPlatformCreateWindow
/tmp/taichi-x0wp6umu/taichi_core.so: glfwCreateWindow
/tmp/taichi-x0wp6umu/taichi_core.so: taichi::lang::opengl::initialize_opengl(bool)
/tmp/taichi-x0wp6umu/taichi_core.so: taichi::lang::opengl::is_opengl_api_available()
/tmp/taichi-x0wp6umu/taichi_core.so(+0x8ebfc7) [0x7f8cf56a3fc7]
/tmp/taichi-x0wp6umu/taichi_core.so(+0x7b9eb8) [0x7f8cf5571eb8]
/usr/lib/libpython3.8.so.1.0: PyCFunction_Call
/usr/lib/libpython3.8.so.1.0: _PyObject_MakeTpCall
/usr/lib/libpython3.8.so.1.0: _PyEval_EvalFrameDefault
/usr/lib/libpython3.8.so.1.0: _PyFunction_Vectorcall
/usr/lib/libpython3.8.so.1.0: _PyEval_EvalFrameDefault
/usr/lib/libpython3.8.so.1.0: _PyFunction_Vectorcall
/usr/lib/libpython3.8.so.1.0: _PyEval_EvalFrameDefault
/usr/lib/libpython3.8.so.1.0: _PyFunction_Vectorcall
/usr/lib/libpython3.8.so.1.0: _PyEval_EvalFrameDefault
/usr/lib/libpython3.8.so.1.0: _PyEval_EvalCodeWithName
/usr/lib/libpython3.8.so.1.0: _PyFunction_Vectorcall
/usr/lib/libpython3.8.so.1.0: _PyEval_EvalFrameDefault
/usr/lib/libpython3.8.so.1.0: _PyEval_EvalCodeWithName
/usr/lib/libpython3.8.so.1.0: PyEval_EvalCode
/usr/lib/libpython3.8.so.1.0(+0x1d1e0d) [0x7f8d05dbee0d]
/usr/lib/libpython3.8.so.1.0(+0x12f098) [0x7f8d05d1c098]
/usr/lib/libpython3.8.so.1.0: _PyEval_EvalFrameDefault
/usr/lib/libpython3.8.so.1.0: _PyEval_EvalCodeWithName
/usr/lib/libpython3.8.so.1.0: _PyFunction_Vectorcall
/usr/lib/libpython3.8.so.1.0: _PyEval_EvalFrameDefault
/usr/lib/libpython3.8.so.1.0: _PyEval_EvalCodeWithName
/usr/lib/libpython3.8.so.1.0: _PyFunction_Vectorcall
/usr/lib/libpython3.8.so.1.0: _PyEval_EvalFrameDefault
/usr/lib/libpython3.8.so.1.0: _PyEval_EvalCodeWithName
/usr/lib/libpython3.8.so.1.0: _PyFunction_Vectorcall
/usr/lib/libpython3.8.so.1.0: _PyEval_EvalFrameDefault
/usr/lib/libpython3.8.so.1.0: _PyEval_EvalCodeWithName
/usr/lib/libpython3.8.so.1.0(+0x13e442) [0x7f8d05d2b442]
/usr/lib/libpython3.8.so.1.0: _PyEval_EvalFrameDefault
/usr/lib/libpython3.8.so.1.0: _PyFunction_Vectorcall
/usr/lib/libpython3.8.so.1.0: PyObject_Call
/usr/lib/libpython3.8.so.1.0: _PyEval_EvalFrameDefault
/usr/lib/libpython3.8.so.1.0: _PyEval_EvalCodeWithName
/usr/lib/libpython3.8.so.1.0: _PyObject_FastCallDict
/usr/lib/libpython3.8.so.1.0: _PyObject_Call_Prepend
/usr/lib/libpython3.8.so.1.0(+0x1f5e09) [0x7f8d05de2e09]
/usr/lib/libpython3.8.so.1.0: _PyObject_MakeTpCall
/usr/lib/libpython3.8.so.1.0: _PyEval_EvalFrameDefault
/usr/lib/libpython3.8.so.1.0: _PyFunction_Vectorcall
/usr/lib/libpython3.8.so.1.0: _PyEval_EvalFrameDefault
/usr/lib/libpython3.8.so.1.0: _PyEval_EvalCodeWithName
/usr/lib/libpython3.8.so.1.0: PyEval_EvalCode
/usr/lib/libpython3.8.so.1.0(+0x1d8248) [0x7f8d05dc5248]
/usr/lib/libpython3.8.so.1.0(+0x1d2483) [0x7f8d05dbf483]
/usr/lib/libpython3.8.so.1.0: PyRun_FileExFlags
/usr/lib/libpython3.8.so.1.0: PyRun_SimpleFileExFlags
/usr/lib/libpython3.8.so.1.0: Py_RunMain
/usr/lib/libpython3.8.so.1.0: Py_BytesMain
/usr/lib/libc.so.6: __libc_start_main
/usr/bin/python3: _start

Internal Error occurred, check this page for possible solutions:
https://taichi.readthedocs.io/en/stable/install.html#troubleshooting
255 root@archlinux ~/taichi (git)-[master] # 
/usr/lib/libc.so.6(+0x34936) [0x7f8d05a5a936]
/tmp/taichi-x0wp6umu/taichi_core.so(+0x2030400) [0x7f8cf6de8400]
/usr/lib/libLLVM-10.so(+0x816679) [0x7f8cee1c1679]
/lib64/ld-linux-x86-64.so.2(+0x110f2) [0x7f8d05fa80f2]
/lib64/ld-linux-x86-64.so.2(+0x11201) [0x7f8d05fa8201]
/usr/lib/libc.so.6: _dl_catch_exception
/lib64/ld-linux-x86-64.so.2(+0x1549c) [0x7f8d05fac49c]
/usr/lib/libc.so.6: _dl_catch_exception
/lib64/ld-linux-x86-64.so.2(+0x149be) [0x7f8d05fab9be]
/usr/lib/libdl.so.2(+0x134c) [0x7f8d059ff34c]
/usr/lib/libc.so.6: _dl_catch_exception
/usr/lib/libc.so.6: _dl_catch_error
/usr/lib/libdl.so.2(+0x1b89) [0x7f8d059ffb89]
/usr/lib/libdl.so.2: dlopen
/usr/lib/libGLX_mesa.so.0(+0x56990) [0x7f8cf45ad990]
/usr/lib/libGLX_mesa.so.0(+0x4bf24) [0x7f8cf45a2f24]

See? The stupid /usr/lib/libGLX_mesa.so.0 is trying to load /usr/lib/libLLVM-10.so on its own.
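
One way to confirm this without reading a stack trace is to ask the dynamic loader which shared objects are currently mapped. The sketch below uses glibc's `dl_iterate_phdr`, so it is Linux-specific. The function name `loaded_objects_matching` is made up for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <link.h>  // dl_iterate_phdr (glibc, Linux-specific)
#include <string>
#include <vector>

namespace {
struct Query {
  std::string needle;
  std::vector<std::string> hits;
};

// Called by the loader once per loaded object.
int collect(struct dl_phdr_info *info, std::size_t, void *data) {
  assert(data != nullptr);
  auto *query = static_cast<Query *>(data);
  // The main program is usually reported with an empty name.
  const std::string name = info->dlpi_name ? info->dlpi_name : "";
  if (name.find(query->needle) != std::string::npos) {
    query->hits.push_back(name);
  }
  return 0;  // a non-zero return value would stop the iteration
}
}  // namespace

// Returns the names of all loaded objects whose path contains `needle`.
std::vector<std::string> loaded_objects_matching(const std::string &needle) {
  Query query{needle, {}};
  dl_iterate_phdr(collect, &query);
  return query.hits;
}
```

Calling `loaded_objects_matching("LLVM")` before and after the GL context is created should show whether a system libLLVM gets pulled in along the way.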

@archibate archibate requested review from k-ye and yuanming-hu June 30, 2020 02:50

codecov bot commented Jun 30, 2020

Codecov Report

Merging #1355 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #1355   +/-   ##
=======================================
  Coverage   85.58%   85.58%           
=======================================
  Files          19       19           
  Lines        3371     3371           
  Branches      624      624           
=======================================
  Hits         2885     2885           
  Misses        356      356           
  Partials      130      130           


Member

k-ye commented Jul 1, 2020

> Also note that LLVM 10 is statically linked into libtaichi.so - there's no libLLVM-10.so.

Actually, this might be the reason for this failure. I found rust-lang/rust#18671 (comment) and https://xamarin.github.io/bugzilla-archives/57/57742/bug.html. Both report that a statically linked LLVM caused this problem.
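
A toy model of why two copies of LLVM in one process could trip the "Option already exists!" assertion, under the assumption that both copies end up registering into the same global table through symbol resolution. `OptionTable`, `run_initializers` and the option name are illustrative only. This is not LLVM's actual code:

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy stand-in for a process-wide option table (not LLVM's data structure).
class OptionTable {
 public:
  // Returns false if `name` was already registered.
  bool try_add(const std::string &name) { return names_.insert(name).second; }

  // Mirrors the spirit of the failing check: a duplicate name is a hard error.
  void add_or_abort(const std::string &name) {
    const bool inserted = try_add(name);
    assert(inserted && "Option already exists!");
    (void)inserted;  // avoids an unused-variable warning when NDEBUG is set
  }

 private:
  std::set<std::string> names_;
};

// Models the static initializers of one LLVM copy registering an option.
bool run_initializers(OptionTable &table) {
  return table.try_add("example-option");
}
```

If both copies resolve to one shared table, the second `run_initializers` call fails. If each copy binds to its own table, which is roughly what RTLD_DEEPBIND is meant to encourage, both calls succeed.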

Member

@yuanming-hu yuanming-hu left a comment

Looks good. Let's see how this works. Thank you!

taichi/ir/expr.cpp (outdated)
@archibate archibate merged commit 8faf2b2 into taichi-dev:master Jul 2, 2020
TI_ASSERT(var.snode()->num_active_indices == 0);
TI_ASSERT_INFO(
var.snode()->num_active_indices == 0,
"Please always use 'x[None]' (instead of simply 'x') to access any 0-D tensor."
Contributor

Suggested change
"Please always use 'x[None]' (instead of simply 'x') to access any 0-D tensor."
"Please always use 'x[None]' (instead of simply 'x') to access any 0-D tensor.");
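
For context, a rough sketch of how an assert-with-message macro of this shape might behave. `ASSERT_INFO` and `check_zero_dim` are hypothetical stand-ins written for this example. They are not Taichi's `TI_ASSERT_INFO` or the code in expr.cpp:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical macro: throws with the given message when `cond` is false.
// The do/while wrapper makes the macro behave like a single statement,
// which is why the call site must end with ");".
#define ASSERT_INFO(cond, msg)                                             \
  do {                                                                     \
    if (!(cond)) {                                                         \
      throw std::runtime_error(std::string("Assertion failed: ") + (msg)); \
    }                                                                      \
  } while (0)

// Hypothetical stand-in for the check in the diff above.
void check_zero_dim(int num_active_indices) {
  ASSERT_INFO(num_active_indices == 0,
              "Please always use 'x[None]' (instead of simply 'x') to access "
              "any 0-D tensor.");
}
```

With this sketch, `check_zero_dim(0)` returns normally, while any non-zero argument raises an error carrying the hint about `x[None]`.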

Collaborator Author

Sorry for merging without proofreading! Did an OFT fix in #1356.

archibate added a commit to archibate/taichi that referenced this pull request Jul 2, 2020
archibate added a commit that referenced this pull request Jul 3, 2020
* Revert "[skip ci] revert ti.chain_compare"

This reverts commit d1a356e.

* add func eval twice test according to @yuanming-hu

* fix #1355 (comment)

* [skip ci] enforce code format

Co-authored-by: Taichi Gardener <[email protected]>
@FantasyVR FantasyVR mentioned this pull request Jul 4, 2020
5 participants