iOS - Test suite crash despite all tests passing #11210
Comments
This is indicative of the test process OOMing and, while the repro may be inconsistent, it is not an infrastructure issue. This issue should probably go in the dotnet/runtime repo, not dotnet/arcade. |
We have been seeing these TCP issues since July. We've taken several steps (rebooting the devices, updating Xcode, etc.) to rule out some of the infrastructure causes, but since this started happening across all device queues at roughly the same time, and since we did not have these issues the previous year, I'm also very inclined to think that this is not of an infra nature. It could be what Matt says, or it could be a networking regression. cc @steveisok |
One other weird thing I noted is that the same errors are in the same run in the work items which passed too, e.g. in System.IO.MemoryMappedFiles test we see:
but somehow this coerces its exit code back to 0, reports tests, and considers itself succeeded. There may be "acceptable" and "unacceptable" crash scenarios here or something, but whatever this is merits investigation by someone with iOS / Xamarin expertise, for sure. |
That's a nice catch. The way it works is that we start the TCP connection, send some initial thing and then nothing really gets sent until the end when the whole XML with results is sent. It seems like the TCP connection flops in between and it's a matter of luck whether it's fine when we need it? We don't really have much code around keeping it alive / watching the status / retrying. Only thing we do is log whatever it outputs (with the The TCP tunnel is managed by |
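For illustration only, here is a minimal sketch of the kind of retry logic the comment above says is currently missing. It is not the actual harness code: `send_with_retry`, its parameters, and the choice to open a fresh connection per attempt are all assumptions made for this example, and a successful `sendall` only means the bytes were handed to the OS, not that the other side processed them.

```python
import socket
import time


def send_with_retry(host, port, payload, attempts=3, delay=0.5, timeout=5.0):
    """Deliver payload over TCP, reconnecting and retrying on failure.

    Hypothetical helper: returns the 1-based attempt number that succeeded,
    or raises ConnectionError once every attempt has failed.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            # Open a fresh connection for each attempt instead of relying on
            # one that may have dropped while it sat idle during the test run.
            with socket.create_connection((host, port), timeout=timeout) as sock:
                sock.sendall(payload)
            return attempt
        except OSError as exc:
            # Covers refused connections, resets, and timeouts.
            last_error = exc
            time.sleep(delay)
    raise ConnectionError(f"delivery failed after {attempts} attempts") from last_error
```

Something like `send_with_retry("127.0.0.1", 12345, xml_bytes)` (host and port are placeholders) would make the final results upload less dependent on the tunnel happening to be healthy at that exact moment, though it would not explain why the connection drops in the first place.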
We pulled a machine off of the iOS queue and are going to investigate further. I don't think this is an infrastructure issue either. |
@mdh1418 if this error is only occurring on your tests, we should move this issue to being a Test Known Issue in the dotnet/runtime repo instead of an infrastructure issue that the engineering services team would need to track. @ulisesh, is it possible to migrate this to dotnet/runtime to turn into a Test Known Issue, or would it be easier for Mitchell to open a new issue in the repo and for us to close this one? |
@missymessa I think I was being vague when I said I didn't think this was an infrastructure issue. I believe it may be some issue with the way we use usb-mux, which could be a problem in the mlaunch component we use to install / execute the mobile apps. Technically, that is an "infrastructure" component, but is not managed within arcade. The xamarin-ios team is in control of it. /cc @mandel-macaque |
@steveisok is the expectation that this issue should be tracked by dnceng until ownership of the error is determined? |
Also, given that this error message is in test logs, using an Infrastructure Known Issue isn't going to be able to find an error message in the test logs. It would work better if a Test Known Issue was created so that those errors can be found properly in the tests. For example: Test Known Issue: dotnet/runtime#74488 And what it looks like in Build Analysis when a Known Issue for a test is found: https://github.com/dotnet/runtime/pull/73263/checks |
For now, I think this is probably the best place. |
I'm going to move this to Tracking. There's nothing actionable for us (dnceng) to do on this issue, however, I would encourage y'all to turn this into a Test Known Issue in the dotnet/runtime repo so that it can be tracked properly with the Known Issue feature. |
Talked to @steveisok and his team is continuing to determine the error and next steps |
No progress from the team on this issue. |
Closing this issue as it is a duplicate of issue #10820 |
Build
https://dev.azure.com/dnceng-public/public/_build/results?buildId=44476&view=logs&j=a5078f86-b345-5a4a-85ee-f64916152c6f&t=b1d0531b-2b6c-5e64-cc59-e2e1ffcc72bf
Build leg reported
iOS arm64 Release AllSubsets_Mono
Pull Request
dotnet/runtime#76725
Action required for the engineering services team
To triage this issue (First Responder / @dotnet/dnceng):
If this is an issue that is causing build breaks across multiple builds and would get benefit from being listed on the build analysis check, follow the next steps:
Additional information about the issue reported
The net.dot* log also seems to contain the xml that should have been generated
https://gist.github.com/mdh1418/a47d8c99b49a200789e29ba3edcc11ce
Comparing with another build where the job passed, the same number of tests are run and reported; however, the xml is not injected in the net.dot*
https://gist.github.com/mdh1418/3b5fea30fdef61b6c5c5be86d9404107