restart worker during tests depending on resident memory size #13577
We ought to be able to go back up to 4 or 8 workers with this PR.
Oops! On Travis Linux the reported RSS size seems to be wrong. It was correct on my local OSX machine. Will work on that.
@printf(" in %6.2f seconds\n", tt)
nothing
end
@printf(" in %6.2f seconds, rss %7.2f MB\n", tt, Sys.get_rusage().ru_maxrss / 2^20)
ru_maxrss from getrusage is in kB on Linux, so this is off by a factor of 1024: http://linux.die.net/man/2/getrusage
edit: but apparently it's right on OSX, and useless on Windows. I wonder if recent libuv is any better.
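The unit mismatch described in this comment can be sketched as follows. This is a minimal illustration, not the PR's code: it assumes only what the thread states (kilobytes on Linux, bytes on OSX), the helper names `maxrss_to_bytes` and `peak_rss_bytes` are hypothetical, and Windows and BSD are not handled.

```c
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical helper: convert a raw ru_maxrss value to bytes.
 * OSX reports bytes; Linux reports kilobytes (see getrusage(2)).
 * Every non-Apple platform is treated as kilobytes here, which is
 * an assumption that has only been checked against Linux. */
size_t maxrss_to_bytes(long raw)
{
#if defined(__APPLE__)
    return (size_t)raw;          /* already in bytes */
#else
    return (size_t)raw * 1024;   /* kilobytes -> bytes */
#endif
}

/* Hypothetical helper: peak resident set size of the calling
 * process in bytes, or 0 if getrusage fails. */
size_t peak_rss_bytes(void)
{
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return 0;
    return maxrss_to_bytes(ru.ru_maxrss);
}
```

With a conversion like this, dividing the result by 2^20 would give megabytes on both platforms, which is what the `@printf` line under review appears to intend.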
There isn't much benefit in doing that, since AFAIK there are only 2 dedicated CPUs for each worker.
Given that there is some IO testing involved, it may be worth testing whether there is any improvement with 4 workers.
@@ -742,6 +742,30 @@ DLLEXPORT jl_sym_t* jl_get_ARCH()
    return ARCH;
}

DLLEXPORT size_t jl_maxrss()
is it worth using the libuv function directly from julia instead of this? (https://github.com/nodejs/node-v0.x-archive/blob/master/deps/uv/include/uv.h#L1020-L1039)
I had that in the first commit, but Windows is not supported in the libuv interface - it returns 0 for maxrss size. Hence decided to just have a simple solution that works on all 3 test platforms.
I feel like this shouldn't be on by default; it should be a CI-only env var. Especially in
I agree. Will make it configurable.
#elif defined(_OS_LINUX_) || defined(_OS_DARWIN_)
    struct rusage rusage;
    getrusage( RUSAGE_SELF, &rusage );
any idea about bsd?
(force-pushed from 9862102 to bcdd3ee)
Have updated the following:
(force-pushed from bcdd3ee to 17e2e1c)
Have squashed and rebased. Does #13569 render this PR moot?
No, not yet anyway. Docker experiments are showing something that would work for master, but would be a little more complicated to get working for PRs. This doesn't look like it's yet defaulting to a large out-of-the-way number for local builds.
That will have to be a different PR, one that raises the default value of JULIA_TEST_MAXRSS_MB. It is low (500MB) in this PR so that the worker restart stuff can be tested on all current CI platforms.
Okay, I guess that works. If you want to set an env var only for 64-bit Linux Travis, add an export roughly here: Lines 39 to 40 in 8a86270
CI is green. Will merge in a couple of hours if there are no objections.
restart worker during tests depending on resident memory size
This is an alternative take on #13567 (for fixing #11553)
It restarts a worker once its resident memory size is above 200MB (configurable via the ENV variable JULIA_TEST_MAXRSS_MB).
One other difference from the existing setup is that all tests are always run (even if some fail).
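The threshold check described above can be sketched in C as follows. This is an illustration under stated assumptions, not the PR's implementation: the function names are hypothetical, the 200 MB default is taken from the description above, and the fallback behaviour for malformed values is an assumption.

```c
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed default, taken from the description above. */
#define DEFAULT_MAXRSS_MB 200UL

/* Hypothetical helper: parse a limit given in MB and return it in bytes.
 * Falls back to the default when the string is missing, empty,
 * non-numeric, has trailing characters, or is zero. */
size_t parse_maxrss_limit(const char *s)
{
    unsigned long mb = DEFAULT_MAXRSS_MB;
    if (s != NULL && isdigit((unsigned char)s[0])) {
        char *end = NULL;
        errno = 0;
        unsigned long v = strtoul(s, &end, 10);
        if (errno == 0 && *end == '\0' && v > 0)
            mb = v;
    }
    return (size_t)mb * 1024 * 1024;
}

/* Read the limit from the JULIA_TEST_MAXRSS_MB environment variable. */
size_t maxrss_limit_from_env(void)
{
    return parse_maxrss_limit(getenv("JULIA_TEST_MAXRSS_MB"));
}

/* A worker is restarted only when its resident size is strictly
 * above the limit ("above 200MB" in the description). */
int should_restart_worker(size_t rss_bytes, size_t limit_bytes)
{
    return rss_bytes > limit_bytes;
}
```

Keeping the parser separate from the `getenv` call means the fallback rules can be exercised without touching the process environment.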