Remove tensorflow::setLogging() as thread-unsafe #46065
The setLogging() calls setenv(), which is not required to be thread safe, and specifically in glibc leads to a race condition with any concurrent getenv() calls.
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-46065/41854
A new Pull Request was created by @makortel for master. It involves the following packages:
@Martin-Grunewald, @aloeliger, @antoniovagnerini, @cmsbuild, @epalencia, @jfernan2, @mandrenguyen, @mmusich, @nothingface0, @rvenditti, @srimanob, @subirsarkar, @syuvivida, @tjavaid, @valsdav, @y19y19 can you please review it and eventually sign? Thanks. cms-bot commands are listed here
@cmsbuild, please test
+1
Size: This PR adds an extra 84KB to repository
has this already happened?
@kandrosov FYI
+hlt
+ml Thanks for the fix.
Now in cms-sw/cmsdist#9418
@cmsbuild, please test with cms-sw/cmsdist#9418
I can make the backports after the next round of tests succeed.
@cmsbuild, please test with cms-sw/cmsdist#9418
@makortel, I have updated cms-sw/cmsdist#9418 to make TF_CPP_MIN_LOG_LEVEL a runtime variable
This PR + cms-sw/cmsdist#9418 seems to remove these printouts
+1
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @sextonkennedy, @antoniovilela, @rappoccio, @mandrenguyen (and backports should be raised in the release meeting by the corresponding L2)
The backports are in
+1
PR description:
The setLogging() calls setenv(), which is not required to be thread safe, and specifically in glibc leads to a race condition with any concurrent getenv() calls. For more information see #46002 (comment). There is circumstantial evidence these specific setenv() calls could be causing the rare crash reported in #44659.
This PR should probably be accompanied with a PR to cmsdist setting TF_CPP_MIN_LOG_LEVEL=3 in the Tensorflow toolfile.
Resolves cms-sw/framework-team#1030
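For illustration only, a minimal sketch of the safer pattern in code form: modify the environment once while the process is still single-threaded, and only read it afterwards. This is not what this PR does (the PR removes the function and relies on the environment being set externally), and the helper names initTfLogLevel() and tfLogLevel() are hypothetical, not part of CMSSW or TensorFlow. It assumes a POSIX system that provides setenv().

```cpp
#include <cassert>
#include <stdlib.h>  // POSIX setenv(), getenv()
#include <string>

// Hypothetical helper (not part of CMSSW or TensorFlow): call once from
// single-threaded startup code, before any worker thread exists.
// Returns true if setenv() reported success.
bool initTfLogLevel(const std::string& level) {
  assert(!level.empty());
  // overwrite = 0: keep a value that is already present in the environment,
  // for example one exported by a toolfile
  return ::setenv("TF_CPP_MIN_LOG_LEVEL", level.c_str(), 0) == 0;
}

// Hypothetical read-only accessor. Concurrent getenv() calls do not race
// with each other as long as no thread modifies the environment meanwhile.
std::string tfLogLevel() {
  const char* value = ::getenv("TF_CPP_MIN_LOG_LEVEL");
  return value != nullptr ? std::string(value) : std::string();
}
```

With overwrite set to 0, a value already provided by the environment wins over the value passed in the code, so the helper only supplies a default.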
PR validation:
Code compiles, and tracing setenv() calls with gdb in workflow 12861.0 step2 no longer shows any setenv() calls made in the framework's parallel section.
If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:
Good question whether this should be backported. The race condition exists in earlier releases, but we haven't seen crash reports from production. Maybe backports to 14_1_X and 14_0_X could still be useful?