Work around https://github.com/dotnet/arcade/issues/7371 #7457
@ViktorHofer FYI
src/Microsoft.DotNet.Helix/Sdk/tools/azure-pipelines/reporter/run.py
- Avoid thread contention for stdout by only using stdout in the exception case; "starting..." adds minimal value, since one can assume that if the script is called, it starts.
- sleep() the main thread 5s before accessing stdout in case the exception case wants to happen.
- Remove most non-error-path logging, since it adds little value and multiple threads competing for stdout can cause an uncatchable failure.
9bc7c3d to 702da8b
@@ -146,26 +145,20 @@ def main():
worker.daemon = True
worker.start()

log.info("Beginning to read test results...")
# https://github.com/dotnet/arcade/issues/7371 - trying to avoid contention for stdout
time.sleep(5)
This sleep feels... very odd. The workers haven't actually done anything yet (we haven't even put anything into the queue). This is just adding 5 seconds to every job for no real reason.
I had a similar comment here: #7457 (comment). The sleep should give the threads time to start up and wait for work. I think the bug this works around was caused by not all threads having started by the time all the work completed, so the process shut down while the remaining threads were still starting and writing to the console. If the threads no longer write to the console except when handling work items, and we're confident in our work item synchronization, then maybe it doesn't help.
Another thing that comes to mind is that we can probably avoid starting worker threads we know we're not going to use. I wonder if a calculation can be done up front to determine the maximum number of worker threads?
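To make the suggestion above concrete, here is a minimal sketch of sizing the pool up front and waiting on the queue instead of sleeping. It is not the code in run.py: `process_all`, `handle`, and `max_workers` are hypothetical names chosen for illustration, and it assumes the full list of work items is known before any thread starts.

```python
import queue
import threading


def process_all(items, handle, max_workers=10):
    """Run handle(item) for each item on at most max_workers daemon threads.

    Hypothetical helper for illustration; returns (threads_started, failures).
    """
    items = list(items)
    # Never start more threads than there are work items.
    worker_count = min(max_workers, len(items))
    work = queue.Queue()
    errors = queue.Queue()  # thread-safe collection of (item, exception)

    def worker():
        while True:
            item = work.get()
            try:
                handle(item)
            except Exception as exc:
                # Record the failure instead of printing from the worker,
                # so threads never compete for stdout.
                errors.put((item, exc))
            finally:
                work.task_done()

    for _ in range(worker_count):
        threading.Thread(target=worker, daemon=True).start()
    for item in items:
        work.put(item)

    # Blocks until every queued item has been marked done; no sleep needed.
    work.join()

    failed = []
    while not errors.empty():
        failed.append(errors.get())
    return worker_count, failed
```

With this shape, only the main thread reports failures after `work.join()` returns, which sidesteps the question of idle threads writing to the console during shutdown. Whether that would have avoided the failure in #7371 is not something this sketch can confirm.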
All this code is being deleted in about a week, so I wouldn't bother with another PR unless this is critical. I'm pretty sure this sleep is unnecessary, but I'd like to avoid touching this dead-man-walking code any more than is necessary.
Ah, that conversation didn't make it to my inbox for some reason. Oh well, situation... somethinged?
It seems like just deleting the "starting" would have been enough, but a bit of waiting isn't the end of the world.
> All this code is being deleted in about a week, so I wouldn't bother with another PR unless this is critical. I'm pretty sure this sleep is unnecessary, but I'd like to avoid touching this dead-man-walking code any more than is necessary.

I had lots of conversations before you joined, and due to the nature of dependency flow in Arcade, it takes a very long time to try one thing, so I'm trying multiple things at once. Between the amount of deleted logging and the sleep, I think we have a very good chance of not dealing with this any more until it's all replaced on 6/10 ("about a week").
#7371
To double check: