Single-file Publish

Design for publishing apps as a single file in .NET Core 3.0

Introduction

The goal of this effort is to enable .NET Core apps to be published and distributed as a single executable.

There are several strategies to implement this feature -- ranging from bundling the published files into a zip file (ex: Warp), to native compiling and linking all the binaries together (ex: CoreRT). These options, along with their cost/benefit analysis, are explored in this staging document.

Goals

In .Net Core 3.0, we plan to implement a solution that

  • Is widely compatible: Apps containing MSIL assemblies, ready-to-run assemblies, native binaries, configuration files, etc. can be packaged into one executable.
  • Can run framework dependent pure managed apps directly from bundle:
    • Executes IL assemblies, and processes configuration files directly from the bundled executable.
    • Extracts ready-to-run and native binaries to disk before loading them.
  • Usable with debuggers and tools: The single-file should be debuggable using the generated symbol file. It should also be usable with profilers and tools similar to a non-bundled app.

This feature-set is described as Stage 2 in the staging document, and can be improved upon in future releases.

Non Goals

  • Optimizing for development: The single-file publishing is typically not a part of the development cycle. It is typically used in release builds as a packaging step. Therefore, the single-file feature will be designed with focus on consumption rather than production.
  • Merging IL: Tools like ILMerge combine the IL from many assemblies into one, but lose assembly identity in the process. This is not a goal for the single-file feature.

Existing tools

Single-file packaging for .Net Core apps is currently supported by third-party tools such as Warp and Costura. We now consider the pros and cons of implementing this feature within .Net Core.

Advantages
  • A standardized experience available for all apps
  • Integration with dotnet CLI
  • A more transparent experience: for example, external tools may use public APIs such as the AssemblyLoadContext.Resolving or AssemblyLoadContext.ResolvingUnmanagedDll events to load files from a bundle. However, this may conflict with the app's own use of these APIs, which requires a cooperative resolution. Such a situation is avoided by providing inbuilt support for single-file apps.
Limitations
  • Inbox implementation of the feature adds complexity for customers with respect to the list of deployment options to choose from.
  • Independent tools can evolve faster on their own schedule.
  • Independent tools can provide a richer set of features tailored to a specific set of customers.

We believe that the advantages outweigh the limitations with respect to implementing this feature inbox.

Design

There are two main aspects to publishing apps as a self-extracting single file:

  • The Bundler: A tool that embeds the managed app, its dependencies, and the runtime into a single host executable.
  • The host: The "single-file" which facilitates the extraction and/or loading of embedded components.

The Bundler

Bundling Tool

The bundler is a tool that embeds the managed app and its dependencies into the native AppHost executable. The functional details of the bundler are explained in this document.

Build System Interface

Publishing to a single file can be triggered by adding the following property to an application's project file:

<PropertyGroup>
    <PublishSingleFile>true</PublishSingleFile>
</PropertyGroup>    
  • The PublishSingleFile property applies to both framework dependent and self-contained publish operations.
  • The PublishSingleFile property applies to platform-specific builds with respect to a given runtime-identifier. The output of the build is a native binary for the specified platform. When PublishSingleFile is set to true, it is an error to leave RuntimeIdentifier undefined, or to set UseAppHost to false.
  • Setting the PublishSingleFile property causes the managed app, managed dependencies, platform-specific native dependencies, configurations, etc. (basically the contents of the publish directory when dotnet publish is run without setting the property) to be embedded within the native apphost.

By default, the symbol files are not embedded within the single-file, but remain as separate files in the publish directory. This includes both the IL .pdb file, and the native .ni.pdb / app.guid.map files generated by the ready-to-run compiler. Setting the following property causes the symbol files to be included in the single-file.

<PropertyGroup>
    <IncludeSymbolsInSingleFile>true</IncludeSymbolsInSingleFile>
</PropertyGroup>

Certain files can be explicitly excluded from being embedded in the single-file by setting the following metadata:

<ExcludeFromSingleFile>true</ExcludeFromSingleFile>

For example, to place some files in the publish directory but not bundle them in the single-file:

<ItemGroup>
    <Content Update="*.xml">
        <CopyToPublishDirectory>PreserveNewest</CopyToPublishDirectory>
        <ExcludeFromSingleFile>true</ExcludeFromSingleFile>
    </Content>
</ItemGroup>

Interaction with other tools

Once the single-file-publish tooling is added to the publish pipeline, other static binary transformation tools may need to adapt to its presence. For example:

  • The MSBuild logic in dotnet SDK should be crafted such that IlLinker, crossgen, and the single-file bundler run in that order in the build/publish sequence.
  • External tools like Fody that use AfterBuild/AfterPublish targets may need to adapt to expect the significantly different output generated by publishing to a single file. The goal in this case is to provide sufficient documentation and guidance.

The Host

On startup, the AppHost checks if it has embedded files. If so, it

  • Memory maps the entire bundle file.
  • Extracts the necessary files (ex: native binaries), if any, to disk as explained in this document.
  • Sets up data-structures so that other components can access embedded files directly, as explained in this document.

Dependency Resolution

An app may choose to embed only some files (ex: due to licensing restrictions) and expect to pick up other dependencies from the application-launch directory, NuGet packages, etc. In order to resolve assemblies and native libraries, the embedded resources are probed first, followed by other probing paths.
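
The probing order described above can be illustrated with a minimal sketch. This is not the host's actual resolution code: the ProbeOrderSketch class, the ProbeSource enum, and the use of a plain set of names as a stand-in for the bundle's manifest are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical result type: where a requested file was found.
public enum ProbeSource { Bundle, Disk, NotFound }

public static class ProbeOrderSketch
{
    // bundledFiles is a stand-in for the bundle manifest (an assumption).
    // probingDirectories are the other locations, such as the launch directory.
    public static ProbeSource Resolve(
        string fileName,
        ISet<string> bundledFiles,
        IEnumerable<string> probingDirectories,
        out string resolved)
    {
        // 1. Embedded resources are probed first.
        if (bundledFiles.Contains(fileName))
        {
            resolved = fileName;
            return ProbeSource.Bundle;
        }

        // 2. Only then are the remaining probing paths consulted, in order.
        foreach (string directory in probingDirectories)
        {
            string candidate = Path.Combine(directory, fileName);
            if (File.Exists(candidate))
            {
                resolved = candidate;
                return ProbeSource.Disk;
            }
        }

        resolved = string.Empty;
        return ProbeSource.NotFound;
    }
}
```

With this ordering, a file that exists both in the bundle and next to the executable is always served from the bundle.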

New API

In this section, we propose adding a few APIs to facilitate common operations on bundled-apps.

The binaries that are published in the project are expected to be handled transparently by the host. However, explicit access to the embedded files is useful in situations such as:

  • Reading additional files packaged into the app (ex: data files).
  • Opening an assembly for reflection/inspection.
  • Loading plugins built as single-file class-libs using existing Loader APIs.

Therefore, we propose adding an API similar to GetManifestResourceStream to obtain a stream corresponding to an embedded file.

This is only a draft of the proposed APIs. The actual shape of the APIs will be decided via the API review process.

namespace System.Runtime.Loader
{
    public partial class Bundle
    {
        // Check whether an app is running from a single-file bundle
        public static bool IsBundle(Assembly assembly);
        
        // Get the location where contents of the bundle are extracted
        public static string GetContentRoot(Assembly assembly);
 
        // Open a file embedded in the bundle built for the specified assembly 
        public static System.IO.Stream GetFileStream(Assembly assembly, string name);
    }
}

We can also provide an abstraction over the physical location of a file (bundle or disk). For example, we could add a variant of the GetFileStream API that looks for a file in the bundle and, if it is not found, falls back to a disk lookup. However, such abstractions are also easy to build outside of the framework itself.
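
As a rough sketch of how such a fallback could be built outside the framework: the FileLocator class and its parameters are hypothetical, and the bundleLookup delegate stands in for the draft GetFileStream API, which does not exist yet. The sketch also assumes that the bundle lookup returns null for a file that is not embedded, which the draft does not specify.

```csharp
using System;
using System.IO;

public static class FileLocator
{
    // bundleLookup: stand-in for the draft Bundle.GetFileStream(assembly, name).
    //               Assumed (not specified by the draft) to return null on a miss.
    // diskRoot:     directory to search when the bundle has no such file.
    public static Stream OpenFile(string name, Func<string, Stream> bundleLookup, string diskRoot)
    {
        // Prefer the copy embedded in the bundle.
        Stream fromBundle = bundleLookup(name);
        if (fromBundle != null)
        {
            return fromBundle;
        }

        // Fall back to a regular file on disk; null signals "not found anywhere".
        string path = Path.Combine(diskRoot, name);
        return File.Exists(path) ? File.OpenRead(path) : null;
    }
}
```

Callers get a Stream either way, so code that reads the file does not need to know where it came from.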

Existing API

We need to determine the semantics of current APIs such as Assembly.Location that return the information about an assembly's location on disk.

Assembly.Location

There are a few options to consider for the Assembly.Location property of a bundled assembly:

  • A fixed literal (ex: null) indicating that no actual location is available.
  • The simple name of the assembly (with no path).
  • A special UNC notation such as <bundle-path>/:/asm.dll to denote files that come from the bundle.
  • The path the assembly would have had if it were not packaged into the single-file.
  • In the case of files spilled to disk (ex: ready-to-run assemblies) -- the actual location of the extracted file.
  • A configurable selection of the above, etc.

We propose keeping the default behavior of Assembly.Location. That is,

  • If the assembly is loaded directly from the bundle, return the empty string.
  • If the assembly is extracted to disk, return the location of the extracted file.

Most of the app development can be agnostic to whether the app is published as single-file or not. However, the parts of the app that deal with physical locations of files need to be aware of the single-file packaging.
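
For code that derives paths from Assembly.Location, being aware of single-file packaging mostly means handling the empty-string case. A minimal sketch, assuming the app is content to fall back to AppContext.BaseDirectory (the LocationHelper class and its method names are hypothetical):

```csharp
using System;
using System.IO;
using System.Reflection;

public static class LocationHelper
{
    // Returns the directory containing the assembly, or the fallback directory
    // when the assembly has no location on disk (ex: loaded from the bundle).
    public static string DirectoryOf(string assemblyLocation, string fallbackDirectory)
    {
        if (string.IsNullOrEmpty(assemblyLocation))
        {
            return fallbackDirectory;
        }
        return Path.GetDirectoryName(assemblyLocation);
    }

    // Convenience overload using the real Assembly.Location and
    // AppContext.BaseDirectory properties.
    public static string DirectoryOf(Assembly assembly)
    {
        return DirectoryOf(assembly.Location, AppContext.BaseDirectory);
    }
}
```

Code that instead passes Assembly.Location straight to Path.GetDirectoryName gets an unusable result for a bundled assembly, which is the kind of assumption that needs revisiting.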

AppContext.BaseDirectory

A few options to consider for the AppContext.BaseDirectory are:

  • The directory where embedded files are extracted out: This option enables easy access to content files spilled to disk. However, if no files need to be extracted to disk for an application bundle, there's no extraction directory.
  • The extraction-directory if files are extracted, the AppHost directory otherwise. The limitation to this approach is that the value of AppContext.BaseDirectory changes with the way applications are packaged/executed.
  • The directory where AppHost binary resides (always).

We propose that AppContext.BaseDirectory should always be set to the directory where the AppHost bundle resides. This scheme doesn't provide an obvious mechanism to access the contents of the extraction directory -- by design. The recommended methods for accessing content files are:

  • Do not bundle application data files into the single-exe; instead, place them next to the bundle. This way, the application binary is a single-file, but not the whole application.
  • Embed data files as managed resources into the application binary, and access them via resource management APIs.
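
The second recommendation can be sketched with the existing Assembly.GetManifestResourceStream API. The EmbeddedData class is a hypothetical helper; it assumes the data file was embedded at build time (ex: via an EmbeddedResource item in the project file) and that the caller knows the resulting resource name.

```csharp
using System;
using System.IO;
using System.Reflection;

public static class EmbeddedData
{
    // Reads a managed resource as text. Returns null when the assembly
    // contains no resource with the given name.
    public static string ReadText(Assembly assembly, string resourceName)
    {
        using (Stream stream = assembly.GetManifestResourceStream(resourceName))
        {
            if (stream == null)
            {
                return null;
            }

            using (var reader = new StreamReader(stream))
            {
                return reader.ReadToEnd();
            }
        }
    }
}
```

Because the resource lives inside the assembly, this works the same way whether the assembly is loaded from the bundle or from disk.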

Testing

  • Unit Tests to achieve good coverage on apps using managed, ready-to-run, native code
  • Tests for the newly added API
  • Tests to verify that framework-dependent and self-contained apps can be published as single file.
  • Tests to ensure that every app model template supported by .NET Core (console, wpf, winforms, web, etc.) can be published as a single file.
  • A subset of CoreCLR Tests
  • Tests to ensure that MSIL files with embedded PDBs are handled correctly
  • End-to-End testing on real world apps

Measurements

Measure publish size and run-time (first run, subsequent runs) for HelloWorld (framework-dependent and self-contained), Roslyn and MusicStore.

Telemetry

Collect telemetry for single-file published apps with respect to parameters such as:

  • Framework-dependent vs self-contained apps.
  • Whether the apps are pure managed apps, ready-to-run compiled apps, or have native dependencies.
  • Embedding of additional/data files, and use of file access API.

Further Work

Bundler Optimizations

Since all the files of an app published as a single-file live together, we can perform the following optimizations:

  • R2R compile the app and all of its dependent assemblies in a single version-bubble.

    Single-file apps compiled cross-platform may have this optimization disabled until the ready-to-run compiler (crossgen) supports cross-compilation.

  • Investigate whether collectively signing the files in an assembly saves space for certificates.

Compression

Currently the bundler does not compress the contents embedded at the end of the host binary. Compressing the bundled files and meta-data can significantly reduce the size of the single-file output (by about 30%-50% as determined by prototyping).

Single-file Plugins

The above design should be extended to seamlessly support single-file publish for plugins.

  • The bundler tool will mostly work as-is, regardless of whether we publish an application or class-lib. The binary blob with dependencies can be appended to both native and managed binaries.
  • For host/runtime support, the options are:
    • Implement plugins using existing infrastructure. For example: Take control of assembly/native binary loads via existing AssemblyLoadContext callbacks and events. Extract the files embedded within the single-file plugin using the GetFileStream() API and load them on demand.
    • Have new API to load a single-file plugin, for example: AssemblyLoadContext.LoadWithEmbeddedDependencies().
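
The first option can be sketched with the existing AssemblyLoadContext.Resolving event and LoadFromStream method. The PluginResolver class is hypothetical, and the openEmbedded delegate stands in for the draft GetFileStream() API; it is assumed to return null when the requested file is not embedded in the plugin.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Runtime.Loader;

public static class PluginResolver
{
    // Hooks assembly resolution on the given context so that dependencies
    // are loaded on demand from streams supplied by openEmbedded.
    public static void Register(AssemblyLoadContext context, Func<string, Stream> openEmbedded)
    {
        context.Resolving += (loadContext, assemblyName) =>
        {
            // Map the simple assembly name to a file name inside the plugin.
            using (Stream stream = openEmbedded(assemblyName.Name + ".dll"))
            {
                // Returning null lets other handlers and default probing continue.
                return stream == null ? null : loadContext.LoadFromStream(stream);
            }
        };
    }
}
```

This handler shares the Resolving event with any handlers the app itself registers, which is the cooperative-resolution concern raised earlier and one motivation for the dedicated API in the second option.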

VS Integration

Developers should be able to use the feature easily from Visual Studio. The feature will start with text-based support -- by explicitly setting the PublishSingleFile property in the project file. In the future, we may provide a single-file publish-profile, or other UI triggers.

User Experience

To summarize, here's the overall experience for creating a HelloWorld single-file app:

  • Create a new HelloWorld app: HelloWorld$ dotnet new console

Framework Dependent HelloWorld

  • Normal publish: dotnet publish

    • Publish directory contains the host HelloWorld.exe, the app HelloWorld.dll, the configuration files HelloWorld.deps.json and HelloWorld.runtimeconfig.json, and the PDB-file HelloWorld.pdb.
  • Single-file publish: dotnet publish -r win10-x64 --self-contained=false /p:PublishSingleFile=true

    • Publish directory contains: HelloWorld.exe HelloWorld.pdb

    • HelloWorld.dll, HelloWorld.deps.json, and HelloWorld.runtimeconfig.json are embedded within HelloWorld.exe.

  • Run: HelloWorld.exe

    • The app runs completely from the single-file, without the need for intermediate extraction to disk.

Self-Contained HelloWorld

  • Normal publish: dotnet publish -r win10-x64

    • Publish directory contains 221 files including the host, the app, configuration files, the PDB-file and the runtime.
  • Single-file publish: dotnet publish -r win10-x64 /p:PublishSingleFile=true

    • Publish directory contains: HelloWorld.exe HelloWorld.pdb
    • The remaining 219 files are embedded within the host HelloWorld.exe.
  • Run: HelloWorld.exe

    • The bundled app and configuration files are processed directly from the bundle.
    • Remaining 216 files will be extracted to disk at startup.
    • If reuse of extracted files is enabled, subsequent runs of the app may skip the extraction step.

Most applications are expected to work without any changes. However, apps with a strong expectation about absolute location of dependent files may need to be made aware of bundling and extraction aspects of single-file publishing. No difference is expected with respect to debugging and analysis of apps.

Related Work