Added test timeout #211
Codecov Report
Coverage diff (master vs. #211):
             master     #211      +/-
- Coverage   34.16%   32.28%   -1.89%
  Files          17       17
  Lines        2409     2565     +156
+ Hits          823      828       +5
- Misses       1586     1737     +151
RLTest/__main__.py
for i in range(num_elements):
    if bar:
        bar.update(i)
    yield i
if bar:
    bar.update(num_elements)
Shouldn't this be indented? Also, maybe drop the "if bar:" check if ProgressBar cannot return None.
RLTest/__main__.py
@@ -558,15 +627,12 @@ def addFailure(self, name, failures=None):
         failures = [failures]
     if not failures:
         failures = []
-    self.testsFailed.append([name, failures])
+    self.testsFailed.setdefault(name, []).extend(failures)

 def getTotalFailureCount(self):
rename functions
RLTest/__main__.py
currPort += 30 # safe distance for cluster and replicas
processes.append(p)
p.start()
for _ in self.progressbar(n_jobs):
# for _ in range(n_jobs):
delete
RLTest/__main__.py
except Exception as e:
    if not has_live_processor:
        raise Exception('Failed to get job result and no more processors is alive')
_ = res['test_name']
delete
RLTest/__main__.py
for test_name, failures in res['failures'].items():
    self.testsFailed[test_name] = failures
Suggested change:
- for test_name, failures in res['failures'].items():
-     self.testsFailed[test_name] = failures
+ self.testsFailed.update(res['failures'])
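The suggestion works because dict.update replaces the value stored under each key, which is what the loop does. A small check with made-up test names (the data here is hypothetical, not taken from the PR) illustrates this, and also shows how it differs from the setdefault(...).extend(...) merge used in addFailure:

```python
# Hypothetical data: failures already recorded, plus one worker result.
recorded = {"test_a": ["assert 1"]}
res = {"failures": {"test_a": ["assert 2"], "test_b": ["timeout"]}}

# Original loop: assign per key, overwriting any existing entry.
by_loop = dict(recorded)
for test_name, failures in res["failures"].items():
    by_loop[test_name] = failures

# Suggested one-liner: same overwrite semantics.
by_update = dict(recorded)
by_update.update(res["failures"])

# addFailure-style merge: extends an existing list instead of replacing it.
merged = {name: list(items) for name, items in recorded.items()}
for test_name, failures in res["failures"].items():
    merged.setdefault(test_name, []).extend(failures)
```

If a test name could already be present in testsFailed when a result arrives, the two styles give different outcomes, so the choice between them depends on whether results for the same test are ever reported twice.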
The PR adds a new option, --test-timeout, that sets a test timeout (in seconds) after which the test is considered failed. The timeout works as follows:
... --verbose-information-on-failure was used).
... os._exit(1). Note that it is important to exit using os._exit(1): if we exit in any other way, Python might wait for active connections or threads to be closed. We are killing the process in the middle of its execution and have no idea in which state it hung, so we prefer to wait for nothing.
For backward compatibility, the default timeout is 0, which means no timeout.
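A minimal sketch of a thread-based watchdog of the kind described here, not the PR's actual code. The names run_with_timeout and _dump_and_exit are hypothetical, and dumping stacks with faulthandler is an assumption about how the hang location could be reported:

```python
import faulthandler
import os
import sys
import threading


def _dump_and_exit():
    # Show where every thread is stuck, then exit immediately.
    # os._exit skips atexit handlers and does not wait for threads or sockets.
    faulthandler.dump_traceback(file=sys.stderr, all_threads=True)
    os._exit(1)


def run_with_timeout(test_fn, timeout_sec, on_timeout=_dump_and_exit):
    """Run test_fn; call on_timeout if it runs longer than timeout_sec.

    A timeout of 0 means no timeout, matching the backward-compatible default.
    """
    if timeout_sec <= 0:
        return test_fn()
    timer = threading.Timer(timeout_sec, on_timeout)
    timer.daemon = True  # never keep the process alive just for the watchdog
    timer.start()
    try:
        return test_fn()
    finally:
        timer.cancel()  # test finished in time; disarm the watchdog
```

The on_timeout parameter exists only so the behavior can be exercised without killing the calling process; in a real runner the default would be used.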
Note that when choosing the best way to trigger a timeout, two approaches were tested:
Eventually, the first approach was chosen, mainly because if we use signal.setitimer, the test itself might also use it and override our timer and callback. I believe the thread approach is safer and more reliable.
Extra additions/fixes:
... --no-progress. The progress bar is automatically turned off if --no-output-catch was used or if stdout is not a terminal (output was redirected).
Technical Low Level Details on Progressbar
Until now, when running with parallelism greater than one, each process reported its own progress. The PR changes this so that only the main process reports progress, and each sub-process reports to the main process. This gives us two main advantages:
To achieve this, each sub-process captures its own stdout and sends the test's output to the main process on a new channel called results. When the main process gets a message on the results channel, it prints the message's content to stdout and advances the progress bar.
When running without parallelism, the output is printed to stdout right away.
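The reporting pattern can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: worker and collect are hypothetical names, the "output" key is invented for the example, and an in-process queue.Queue stands in for the inter-process results channel.

```python
import contextlib
import io
import queue
import sys


def worker(test_names, results):
    """Worker side: run each test with stdout captured, then report it."""
    for name in test_names:
        buf = io.StringIO()
        with contextlib.redirect_stdout(buf):
            print("running %s" % name)  # stand-in for the real test body
        results.put({"test_name": name, "output": buf.getvalue()})


def collect(results, n_tests, out=sys.stdout):
    """Main side: print each captured output, then advance the progress."""
    done = 0
    while done < n_tests:
        res = results.get()  # blocks until some worker reports a test
        out.write(res["output"])
        done += 1
        out.write("[%d/%d] %s\n" % (done, n_tests, res["test_name"]))
    return done


if __name__ == "__main__":
    channel = queue.Queue()
    worker(["test_a", "test_b"], channel)
    collect(channel, 2)
```

With real parallelism, the same two functions could be given a multiprocessing.Queue, with worker running in each sub-process. Because only collect writes to the terminal, output from different tests is not interleaved.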
To avoid code duplication between the parallel and non-parallel flows, we extracted the code that runs a single test into its own function, run_single_test, and we call it from the two flows: run_jobs_main_thread and run_jobs.