Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up linker by 10% #1130

Merged
merged 3 commits into from
Apr 23, 2020
Merged

Conversation

MichalStrehovsky
Copy link
Member

...using one weird trick.

Cecil uses file I/O to read assemblies, but for our use cases this means we're spending a ton of time context switching between kernel mode and user mode.

Memory mapped I/O is more efficient for these access patterns.

The change is bigger than I would want it to be because:

  • Loading assemblies from disk is not centralized in linker (and in fact there are more extensibility points that encourage everyone to do their own loading)
  • Cecil violates OOP principles by giving certain kinds of streams preferential treatment (ModuleDefinition.FileName is useless if the assembly wasn't opened with the blessed stream kind and Cecil doesn't have a way to provide it through other means).

...using one weird trick.

Cecil uses file I/O to read assemblies, but for our use cases this means we're spending a ton of time context switching between kernel mode and user mode.

Memory mapped I/O is more efficient for these access patterns.

The change is bigger than I would want it to be because:
* Loading assemblies from disk is not centralized (and in fact there are more extensibility pointer that encourage everyone to do their own loading)
* Cecil violates OOP principles by giving certain kinds of streams preferential treatment (`ModuleDefinition.FileName` is useless if the assembly wasn't opened with the blessed stream kind and Cecil doesn't have a way to provide it through other means).
@MichalStrehovsky
Copy link
Member Author

Opening as draft mostly because I would like to get some feedback on the extensibility APIs that allow people to provide their own AssemblyDefinitions that they opened through whatever other means outside the primary AssemblyResolver. Do we need to keep supporting this or can we cut it off?

Also, I think I broke things outside FEATURE_ILLINK, but illink.sln has all tests passing.

@MichalStrehovsky MichalStrehovsky marked this pull request as draft April 21, 2020 13:26
Copy link
Contributor

@marek-safar marek-safar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing the 10% is on Windows only, right?

@@ -15,6 +16,10 @@ public abstract class DirectoryAssemblyResolver : IAssemblyResolver {

readonly Collection<string> directories;

public readonly Dictionary<AssemblyDefinition, string> AssemblyToPath = new Dictionary<AssemblyDefinition, string> ();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this field not be exposed directly?

@@ -40,7 +45,34 @@ protected AssemblyDefinition GetAssembly (string file, ReaderParameters paramete
if (parameters.AssemblyResolver == null)
parameters.AssemblyResolver = this;

return ModuleDefinition.ReadModule (file, parameters).Assembly;
FileStream fileStream = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be more readable with just using syntax

viewStreams.Add (viewStream);

// We transferred the ownership of the viewStream to the collection.
viewStream = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this needed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's needed because we want to Dispose the viewStream if ModuleDefinition.ReadModule above failed for any reason - such assembly is unusable and we might as well throw away the mapping now. If accessing the assembly from the module succeeded, we do not want to dispose it because it's used by Cecil and the viewStreams collection now manages the lifetime.

It's also a reason why I'm not using a using - the viewStream handling is odd. If you see a better way to do this, I can change it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might consider using using for fileStream and mappedFile and keep the only viewStream in custom finally to make the logic/flow more obvious

@mrvoorhe
Copy link
Contributor

Nothing about the API jumps out as being problematic for us to extend.

@MichalStrehovsky
Copy link
Member Author

I'm guessing the 10% is on Windows only, right?

I didn't measure Linux because I only have a WSL 1.0 instance that is famous for bad file I/O perf; any measurements would be meaningless. If someone who has a Linux box can help out on getting numbers for this, that would save me a bunch of time.

@@ -89,6 +121,13 @@ public void Dispose ()

protected virtual void Dispose (bool disposing)
{
if (disposing) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be also finalizer?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The class is not holding unmanaged resources so I don't think we need one.

@MichalStrehovsky MichalStrehovsky marked this pull request as ready for review April 22, 2020 10:29
@@ -7,6 +7,7 @@
using System.IO;
using Mono.Collections.Generic;
using Mono.Cecil;
using System.IO.MemoryMappedFiles;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Sort usings

@MichalStrehovsky MichalStrehovsky merged commit f5a9875 into dotnet:master Apr 23, 2020
@MichalStrehovsky MichalStrehovsky deleted the memmap branch April 23, 2020 11:00
tkapin pushed a commit to tkapin/runtime that referenced this pull request Jan 31, 2023
...using one weird trick.

Cecil uses file I/O to read assemblies, but for our use cases this means we're spending a ton of time context switching between kernel mode and user mode.

Memory mapped I/O is more efficient for these access patterns.

The change is bigger than I would want it to be because:
* Loading assemblies from disk is not centralized (and in fact there are more extensibility pointer that encourage everyone to do their own loading)
* Cecil violates OOP principles by giving certain kinds of streams preferential treatment (`ModuleDefinition.FileName` is useless if the assembly wasn't opened with the blessed stream kind and Cecil doesn't have a way to provide it through other means).

Commit migrated from dotnet/linker@f5a9875
jonathanpeppers added a commit to dotnet/java-interop that referenced this pull request Apr 25, 2023
…esolver

Context: dotnet/linker#1130
Context: https://devdiv.visualstudio.com/DevDiv/_workitems/edit/1783420

In dotnet/linker#1130, their version of `DirectoryAssemblyResolver`
was reworked to use `MemoryMappedFile` in the .NET 6 timeframe. This
commit ports these changes here, as we have MSBuild tasks in
xamarin/xamarin-android that use `DirectoryAssemblyResolver`.

Primarily, the `<GenerateJavaStubs />` MSBuild task uses
`DirectoryAssemblyResolver` to iterate over types in assembly to emit
Java source code and generate `AndroidManifest.xml`.

The results of these changes in a `dotnet new maui` project, an
initial clean build:

    GenerateJavaStubs = 1.318 s, 1 calls.
    GenerateJavaStubs = 1.254 s, 1 calls.

Saving ~64ms or about ~5% in this example.

The current version of the linker's resolver in .NET 8+ has moved to
the dotnet/runtime repo at:

https://github.com/dotnet/runtime/blob/cd7d006030a7feace9076fa275fb5bffc1bf4a90/src/tools/illink/src/linker/Linker/AssemblyResolver.cs
jonpryor pushed a commit to dotnet/java-interop that referenced this pull request May 2, 2023
…#1103)

Context: dotnet/linker#1130
Context: https://devdiv.visualstudio.com/DevDiv/_workitems/edit/1783420

In dotnet/linker#1130, their version of `DirectoryAssemblyResolver`
was reworked to use [`MemoryMappedFile`][0] in the .NET 6 time frame.
Port these changes to Java.Interop, as we have MSBuild tasks in
xamarin/xamarin-android that use `DirectoryAssemblyResolver`.

Primarily, the [`<GenerateJavaStubs/>` MSBuild task][1] uses
`DirectoryAssemblyResolver` to iterate over types in assembly to emit
Java source code and generate `AndroidManifest.xml`.

Note: `MemoryMappedFile` can *only* be used when we are *not* opening
the assembly as `.ReadWrite`.  Trying to use `MemoryMappedFile` for
read+write scenarios would result in an `InvalidOperationException`:

	error XAGJS7009: System.InvalidOperationException: Operation is not valid due to the current state of the object.
	error XAGJS7009:    at Mono.Cecil.Cil.DefaultSymbolWriterProvider.GetSymbolWriter(ModuleDefinition module, String fileName)
	error XAGJS7009:    at Mono.Cecil.ModuleWriter.GetSymbolWriter(ModuleDefinition module, String fq_name, ISymbolWriterProvider symbol_writer_provider, WriterParameters parameters)
	error XAGJS7009:    at Mono.Cecil.ModuleWriter.Write(ModuleDefinition module, Disposable`1 stream, WriterParameters parameters)
	error XAGJS7009:    at Mono.Cecil.ModuleWriter.WriteModule(ModuleDefinition module, Disposable`1 stream, WriterParameters parameters)
	error XAGJS7009:    at Mono.Cecil.ModuleDefinition.Write(String fileName, WriterParameters parameters)
	error XAGJS7009:    at Mono.Cecil.AssemblyDefinition.Write(String fileName, WriterParameters parameters)
	error XAGJS7009:    at Xamarin.Android.Tasks.MarshalMethodsAssemblyRewriter.Rewrite(DirectoryAssemblyResolver resolver, List`1 targetAssemblyPaths, Boolean brokenExceptionTransitions)
	error XAGJS7009:    at Xamarin.Android.Tasks.GenerateJavaStubs.Run(DirectoryAssemblyResolver res, Boolean useMarshalMethods)
	error XAGJS7009:    at Xamarin.Android.Tasks.GenerateJavaStubs.RunTask()


The results of these changes in a `dotnet new maui` project,
an initial clean build:

  * Before MemoryMappedFile: GenerateJavaStubs = 1.318 s, 1 calls.
  *  After MemoryMappedFile: GenerateJavaStubs = 1.254 s, 1 calls.

Saving ~64ms or about ~5% in this example.

Note: The current version of the linker's resolver in .NET 8+ has
[moved to the dotnet/runtime repo][2].

[0]: https://learn.microsoft.com/dotnet/api/system.io.memorymappedfiles.memorymappedfile?view=net-8.0
[1]: https://github.com/xamarin/xamarin-android/blob/e694ba52a0cf5f45b213c8902bdbfb0ed348374a/src/Xamarin.Android.Build.Tasks/Tasks/GenerateJavaStubs.cs#L131
[2]: https://github.com/dotnet/runtime/blob/cd7d006030a7feace9076fa275fb5bffc1bf4a90/src/tools/illink/src/linker/Linker/AssemblyResolver.cs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants