Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Microsoft.Extensions.FileProviders.Physical.Tests crashing on CI in FileSystemInfo API #57221

Closed
ViktorHofer opened this issue Aug 11, 2021 · 10 comments · Fixed by #57551
Closed
Assignees
Milestone

Comments

@ViktorHofer
Copy link
Member

Build: https://dev.azure.com/dnceng/public/_build/results?buildId=1288068&view=ms.vss-test-web.build-test-results-tab&runId=38025328&resultId=194160&paneView=dotnet-dnceng.dnceng-build-release-tasks.helix-test-information-tab

Configuration: net6.0-OSX-Debug-x64-CoreCLR_release-OSX.1015.Amd64.Open

Dump is available: https://helixre8s23ayyeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-56974-merge-9155b3b7ae8e44c498/Microsoft.Extensions.FileProviders.Physical.Tests/1/how-to-debug-dump.md?sv=2019-07-07&se=2021-08-31T15%3A39%3A48Z&sr=c&sp=rl&sig=Z1XxeIsGVGHUGWG6t0bfozd05lMZIHaZ%2BpmuqDJbn08%3D, https://helixre8s23ayyeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-56974-merge-9155b3b7ae8e44c498/Microsoft.Extensions.FileProviders.Physical.Tests/1/core.22560?sv=2019-07-07&se=2021-08-31T15%3A39%3A48Z&sr=c&sp=rl&sig=Z1XxeIsGVGHUGWG6t0bfozd05lMZIHaZ%2BpmuqDJbn08%3D

/tmp/helix/working/A95708FE/p/dotnet exec --runtimeconfig Microsoft.Extensions.FileProviders.Physical.Tests.runtimeconfig.json --depsfile Microsoft.Extensions.FileProviders.Physical.Tests.deps.json xunit.console.dll Microsoft.Extensions.FileProviders.Physical.Tests.dll -xml testResults.xml -nologo -nocolor -notrait category=IgnoreForCI -notrait category=OuterLoop -notrait category=failing 
popd
===========================================================================================================
/private/tmp/helix/working/A95708FE/w/97EC089D/e /private/tmp/helix/working/A95708FE/w/97EC089D/e
  Discovering: Microsoft.Extensions.FileProviders.Physical.Tests (method display = ClassAndMethod, method display options = None)
  Discovered:  Microsoft.Extensions.FileProviders.Physical.Tests (found 76 of 92 test cases)
  Starting:    Microsoft.Extensions.FileProviders.Physical.Tests (parallel test collections = on, max threads = 6)
Unhandled exception. System.IO.IOException: Invalid argument
   at System.IO.FileStatus.ThrowOnCacheInitializationError(ReadOnlySpan`1 path) in /_/src/libraries/System.Private.CoreLib/src/System/IO/FileStatus.Unix.cs:line 421
   at System.IO.FileStatus.EnsureCachesInitialized(ReadOnlySpan`1 path, Boolean continueOnError) in /_/src/libraries/System.Private.CoreLib/src/System/IO/FileStatus.Unix.cs:line 400
   at System.IO.FileStatus.GetLastWriteTime(ReadOnlySpan`1 path, Boolean continueOnError) in /_/src/libraries/System.Private.CoreLib/src/System/IO/FileStatus.Unix.cs:line 284
   at System.IO.FileSystemInfo.get_LastWriteTimeCore() in /_/src/libraries/System.Private.CoreLib/src/System/IO/FileSystemInfo.Unix.cs:line 60
   at System.IO.FileSystemInfo.get_LastWriteTimeUtc() in /_/src/libraries/System.Private.CoreLib/src/System/IO/FileSystemInfo.cs:line 115
   at Microsoft.Extensions.FileProviders.Physical.PollingFileChangeToken.GetLastWriteTimeUtc() in /_/src/libraries/Microsoft.Extensions.FileProviders.Physical/src/PollingFileChangeToken.cs:line 57
   at Microsoft.Extensions.FileProviders.Physical.PollingFileChangeToken.get_HasChanged() in /_/src/libraries/Microsoft.Extensions.FileProviders.Physical/src/PollingFileChangeToken.cs:line 101
   at Microsoft.Extensions.FileProviders.Physical.PhysicalFilesWatcher.RaiseChangeEvents(Object state) in /_/src/libraries/Microsoft.Extensions.FileProviders.Physical/src/PhysicalFilesWatcher.cs:line 455
   at System.Threading.TimerQueueTimer.Fire(Boolean isThreadPool) in /_/src/libraries/System.Private.CoreLib/src/System/Threading/Timer.cs:line 673
   at System.Threading.TimerQueue.FireNextTimers() in /_/src/libraries/System.Private.CoreLib/src/System/Threading/Timer.cs:line 326
./RunTests.sh: line 162: 22560 Abort trap: 6           (core dumped) "$RUNTIME_PATH/dotnet" exec --runtimeconfig Microsoft.Extensions.FileProviders.Physical.Tests.runtimeconfig.json --depsfile Microsoft.Extensions.FileProviders.Physical.Tests.deps.json xunit.console.dll Microsoft.Extensions.FileProviders.Physical.Tests.dll -xml testResults.xml -nologo -nocolor -notrait category=IgnoreForCI -notrait category=OuterLoop -notrait category=failing $RSP_FILE
/private/tmp/helix/working/A95708FE/w/97EC089D/e
----- end Wed Aug 11 08:42:00 PDT 2021 ----- exit code 134 ----------------------------------------------------------
exit code 134 means SIGABRT Abort. Managed or native assert, or runtime check such as heap corruption, caused call to abort(). Core dumped.

cc @adamsitnik

@dotnet-issue-labeler dotnet-issue-labeler bot added area-Extensions-FileSystem untriaged New issue has not been triaged by the area owner labels Aug 11, 2021
@ghost
Copy link

ghost commented Aug 11, 2021

Tagging subscribers to this area: @maryamariyan, @dotnet/area-extensions-filesystem
See info in area-owners.md if you want to be subscribed.

Issue Details

Build: https://dev.azure.com/dnceng/public/_build/results?buildId=1288068&view=ms.vss-test-web.build-test-results-tab&runId=38025328&resultId=194160&paneView=dotnet-dnceng.dnceng-build-release-tasks.helix-test-information-tab

Configuration: net6.0-OSX-Debug-x64-CoreCLR_release-OSX.1015.Amd64.Open

Dump is available: https://helixre8s23ayyeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-56974-merge-9155b3b7ae8e44c498/Microsoft.Extensions.FileProviders.Physical.Tests/1/how-to-debug-dump.md?sv=2019-07-07&se=2021-08-31T15%3A39%3A48Z&sr=c&sp=rl&sig=Z1XxeIsGVGHUGWG6t0bfozd05lMZIHaZ%2BpmuqDJbn08%3D, https://helixre8s23ayyeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-56974-merge-9155b3b7ae8e44c498/Microsoft.Extensions.FileProviders.Physical.Tests/1/core.22560?sv=2019-07-07&se=2021-08-31T15%3A39%3A48Z&sr=c&sp=rl&sig=Z1XxeIsGVGHUGWG6t0bfozd05lMZIHaZ%2BpmuqDJbn08%3D

/tmp/helix/working/A95708FE/p/dotnet exec --runtimeconfig Microsoft.Extensions.FileProviders.Physical.Tests.runtimeconfig.json --depsfile Microsoft.Extensions.FileProviders.Physical.Tests.deps.json xunit.console.dll Microsoft.Extensions.FileProviders.Physical.Tests.dll -xml testResults.xml -nologo -nocolor -notrait category=IgnoreForCI -notrait category=OuterLoop -notrait category=failing 
popd
===========================================================================================================
/private/tmp/helix/working/A95708FE/w/97EC089D/e /private/tmp/helix/working/A95708FE/w/97EC089D/e
  Discovering: Microsoft.Extensions.FileProviders.Physical.Tests (method display = ClassAndMethod, method display options = None)
  Discovered:  Microsoft.Extensions.FileProviders.Physical.Tests (found 76 of 92 test cases)
  Starting:    Microsoft.Extensions.FileProviders.Physical.Tests (parallel test collections = on, max threads = 6)
Unhandled exception. System.IO.IOException: Invalid argument
   at System.IO.FileStatus.ThrowOnCacheInitializationError(ReadOnlySpan`1 path) in /_/src/libraries/System.Private.CoreLib/src/System/IO/FileStatus.Unix.cs:line 421
   at System.IO.FileStatus.EnsureCachesInitialized(ReadOnlySpan`1 path, Boolean continueOnError) in /_/src/libraries/System.Private.CoreLib/src/System/IO/FileStatus.Unix.cs:line 400
   at System.IO.FileStatus.GetLastWriteTime(ReadOnlySpan`1 path, Boolean continueOnError) in /_/src/libraries/System.Private.CoreLib/src/System/IO/FileStatus.Unix.cs:line 284
   at System.IO.FileSystemInfo.get_LastWriteTimeCore() in /_/src/libraries/System.Private.CoreLib/src/System/IO/FileSystemInfo.Unix.cs:line 60
   at System.IO.FileSystemInfo.get_LastWriteTimeUtc() in /_/src/libraries/System.Private.CoreLib/src/System/IO/FileSystemInfo.cs:line 115
   at Microsoft.Extensions.FileProviders.Physical.PollingFileChangeToken.GetLastWriteTimeUtc() in /_/src/libraries/Microsoft.Extensions.FileProviders.Physical/src/PollingFileChangeToken.cs:line 57
   at Microsoft.Extensions.FileProviders.Physical.PollingFileChangeToken.get_HasChanged() in /_/src/libraries/Microsoft.Extensions.FileProviders.Physical/src/PollingFileChangeToken.cs:line 101
   at Microsoft.Extensions.FileProviders.Physical.PhysicalFilesWatcher.RaiseChangeEvents(Object state) in /_/src/libraries/Microsoft.Extensions.FileProviders.Physical/src/PhysicalFilesWatcher.cs:line 455
   at System.Threading.TimerQueueTimer.Fire(Boolean isThreadPool) in /_/src/libraries/System.Private.CoreLib/src/System/Threading/Timer.cs:line 673
   at System.Threading.TimerQueue.FireNextTimers() in /_/src/libraries/System.Private.CoreLib/src/System/Threading/Timer.cs:line 326
./RunTests.sh: line 162: 22560 Abort trap: 6           (core dumped) "$RUNTIME_PATH/dotnet" exec --runtimeconfig Microsoft.Extensions.FileProviders.Physical.Tests.runtimeconfig.json --depsfile Microsoft.Extensions.FileProviders.Physical.Tests.deps.json xunit.console.dll Microsoft.Extensions.FileProviders.Physical.Tests.dll -xml testResults.xml -nologo -nocolor -notrait category=IgnoreForCI -notrait category=OuterLoop -notrait category=failing $RSP_FILE
/private/tmp/helix/working/A95708FE/w/97EC089D/e
----- end Wed Aug 11 08:42:00 PDT 2021 ----- exit code 134 ----------------------------------------------------------
exit code 134 means SIGABRT Abort. Managed or native assert, or runtime check such as heap corruption, caused call to abort(). Core dumped.

cc @adamsitnik

Author: ViktorHofer
Assignees: -
Labels:

untriaged, area-Extensions-FileSystem

Milestone: -

@ViktorHofer
Copy link
Member Author

Happed again in https://dev.azure.com/dnceng/public/_build/results?buildId=1291923&view=ms.vss-test-web.build-test-results-tab&runId=38124038&resultId=194269&paneView=dotnet-dnceng.dnceng-build-release-tasks.helix-test-information-tab.

Again there's a dump available.

``´
/private/tmp/helix/working/B5060956/w/AB6D08FA/e /private/tmp/helix/working/B5060956/w/AB6D08FA/e
Discovering: Microsoft.Extensions.FileProviders.Physical.Tests (method display = ClassAndMethod, method display options = None)
Discovered: Microsoft.Extensions.FileProviders.Physical.Tests (found 76 of 92 test cases)
Starting: Microsoft.Extensions.FileProviders.Physical.Tests (parallel test collections = on, max threads = 6)
Unhandled exception. System.IO.IOException: Invalid argument
at System.IO.FileStatus.ThrowOnCacheInitializationError(ReadOnlySpan1 path) in /_/src/libraries/System.Private.CoreLib/src/System/IO/FileStatus.Unix.cs:line 421 at System.IO.FileStatus.EnsureCachesInitialized(ReadOnlySpan1 path, Boolean continueOnError) in //src/libraries/System.Private.CoreLib/src/System/IO/FileStatus.Unix.cs:line 400
at System.IO.FileStatus.GetLastWriteTime(ReadOnlySpan`1 path, Boolean continueOnError) in /
/src/libraries/System.Private.CoreLib/src/System/IO/FileStatus.Unix.cs:line 284
at System.IO.FileSystemInfo.get_LastWriteTimeCore() in //src/libraries/System.Private.CoreLib/src/System/IO/FileSystemInfo.Unix.cs:line 60
at System.IO.FileSystemInfo.get_LastWriteTimeUtc() in /
/src/libraries/System.Private.CoreLib/src/System/IO/FileSystemInfo.cs:line 115
at Microsoft.Extensions.FileProviders.Physical.PollingFileChangeToken.GetLastWriteTimeUtc() in //src/libraries/Microsoft.Extensions.FileProviders.Physical/src/PollingFileChangeToken.cs:line 57
at Microsoft.Extensions.FileProviders.Physical.PollingFileChangeToken.get_HasChanged() in /
/src/libraries/Microsoft.Extensions.FileProviders.Physical/src/PollingFileChangeToken.cs:line 101
at Microsoft.Extensions.FileProviders.Physical.PhysicalFilesWatcher.RaiseChangeEvents(Object state) in //src/libraries/Microsoft.Extensions.FileProviders.Physical/src/PhysicalFilesWatcher.cs:line 455
at System.Threading.TimerQueueTimer.Fire(Boolean isThreadPool) in /
/src/libraries/System.Private.CoreLib/src/System/Threading/Timer.cs:line 673
at System.Threading.TimerQueue.FireNextTimers() in /_/src/libraries/System.Private.CoreLib/src/System/Threading/Timer.cs:line 326
./RunTests.sh: line 162: 45231 Abort trap: 6 (core dumped) "$RUNTIME_PATH/dotnet" exec --runtimeconfig Microsoft.Extensions.FileProviders.Physical.Tests.runtimeconfig.json --depsfile Microsoft.Extensions.FileProviders.Physical.Tests.deps.json xunit.console.dll Microsoft.Extensions.FileProviders.Physical.Tests.dll -xml testResults.xml -nologo -nocolor -notrait category=IgnoreForCI -notrait category=OuterLoop -notrait category=failing $RSP_FILE
/private/tmp/helix/working/B5060956/w/AB6D08FA/e
----- end Fri Aug 13 02:45:24 PDT 2021 ----- exit code 134 ----------------------------------------------------------
exit code 134 means SIGABRT Abort. Managed or native assert, or runtime check such as heap corruption, caused call to abort(). Core dumped.

@danmoseley
Copy link
Member

#57322 also OSX

@danmoseley
Copy link
Member

This failed 71 times in 3 days, we need to disable or work around.

WorkItems 
| where FriendlyName == "Microsoft.Extensions.FileProviders.Physical.Tests"
| where ExitCode == "134"
| join kind= inner (
   Jobs  | project  QueueName , JobId, Build, Type, Source,
    Branch,
  Pipeline = tostring(parse_json(Properties).DefinitionName),
  Pipeline_Configuration = tostring(parse_json(Properties).configuration),
  OS = QueueName,
  Arch = tostring(parse_json(Properties).architecture)
) on JobId
| where Queued > ago (7d)
| summarize count() by ExitCode, Pipeline, Pipeline_Configuration, Arch, bin(Finished, 1d)
| order by Finished asc
ExitCode Pipeline Pipeline_Configuration Arch Finished count_
134 runtime Debug x64 2021-08-10 00:00:00.0000000 15
134 runtime Release x64 2021-08-10 00:00:00.0000000 1
134 runtime Debug x64 2021-08-11 00:00:00.0000000 27
134 runtime Release x64 2021-08-11 00:00:00.0000000 1
134 runtime Release x64 2021-08-12 00:00:00.0000000 4
134 runtime Debug x64 2021-08-12 00:00:00.0000000 12
134 runtime Debug x64 2021-08-13 00:00:00.0000000 10
134 runtime Release x64 2021-08-13 00:00:00.0000000 1

@danmoseley
Copy link
Member

@jozkee thoughts? Do you know what test/s this is so we can disable? Or should we add some catch in the product code meantime instead?

@jozkee
Copy link
Member

jozkee commented Aug 13, 2021

@danmoseley This one is weird since the call-stack shows the failure coming from FileSystemInfo.LastWriteTimeUtc (not the new symlinks code) so I am afraid that it could be that one of the new tests may have discovered an actual issue in FileSystemInfo.

And the scenario is quite rare as well: Exists returns true but LastWriteTime throws IOException: Invalid argument which means that one of the syscalls from Exists (lstat or stat) got this error when it was called - and that is also rare since I haven't found anywhere that those methods may be able to return EINVAL.

I've been trying to investigate the dump but I'm not quite familiar with that process.

@danmoseley
Copy link
Member

@danmoseley
Copy link
Member

It's both 10.14 and 10.15

ExitCode Pipeline OS Pipeline_Configuration Arch Finished count_
134 runtime osx.1014.amd64.open Debug x64 2021-08-10 00:00:00.0000000 11
134 runtime osx.1015.amd64.open Debug x64 2021-08-10 00:00:00.0000000 4
134 runtime osx.1014.amd64.open Release x64 2021-08-10 00:00:00.0000000 1
134 runtime osx.1014.amd64.open Debug x64 2021-08-11 00:00:00.0000000 19
134 runtime osx.1015.amd64.open Debug x64 2021-08-11 00:00:00.0000000 8
134 runtime osx.1014.amd64.open Release x64 2021-08-11 00:00:00.0000000 1
134 runtime osx.1015.amd64.open Release x64 2021-08-12 00:00:00.0000000 3
134 runtime osx.1015.amd64.open Debug x64 2021-08-12 00:00:00.0000000 6
134 runtime osx.1014.amd64.open Debug x64 2021-08-12 00:00:00.0000000 6
134 runtime osx.1014.amd64.open Release x64 2021-08-12 00:00:00.0000000 1
134 runtime osx.1014.amd64.open Debug x64 2021-08-13 00:00:00.0000000 8
134 runtime osx.1015.amd64.open Debug x64 2021-08-13 00:00:00.0000000 7
134 runtime osx.1015.amd64.open Release x64 2021-08-13 00:00:00.0000000 1

@jozkee
Copy link
Member

jozkee commented Aug 13, 2021

Thanks, I wonder why it only fails in macOS, same code runs in Linux distros without issues.

@jozkee
Copy link
Member

jozkee commented Aug 17, 2021

I was able to confirm that this is an issue on FileSystemInfo and not on PhysicalFileProvider, it was introduced in #52235.
I was also able to made a local repro, but I was able to repro it only in macOS, perhaps that's the only OS that somehow throws EINVAL in stat with a very specific configuration that involves a race condition.
Anyway, here's the repro:

using System;
using System.IO;
using System.Threading.Tasks;

namespace test_writetime
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Hello World!");
            var info = new FileInfoWrapper();
            for (int i = 0; i < 1_000_000; i++){
                Parallel.Invoke(
                    () => info.PrintLastWriteTime(),
                    () => info.Delete()
                );
            }
        }

        class FileInfoWrapper
        {
            const string _linkPath = "/home/jozky/test-writetime/bar";
            const string _targetPath = "/home/jozky/test-writetime/foo";
            private FileInfo _fileInfo;
            public FileInfoWrapper()
            {
                CreateSymlink();
            }

            public void CreateSymlink()
            {
                _fileInfo = (FileInfo)File.CreateSymbolicLink(_linkPath, _targetPath);
            }

            public void Delete()
            {
                File.Delete(_linkPath);
            }

            public void PrintLastWriteTime()
            {
                // return _fileInfo.Exists ? _fileInfo.LastWriteTimeUtc : DateTime.MinValue;
                _fileInfo.Refresh();
                if (!_fileInfo.Exists)
                {
                    Console.WriteLine("Had to re-create");
                    CreateSymlink();
                }
                else
                {
                    Console.WriteLine("Already created");
                }

                Console.WriteLine(_fileInfo.LastWriteTimeUtc);
            }
        }
    }
}

Will send a PR that fixes the error.

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Aug 17, 2021
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Aug 17, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Sep 16, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants