Performance improvements #3808

Merged 21 commits on Nov 8, 2023

JohnMcPMS (Member) commented Oct 24, 2023

Change

This change has several performance improvements based on profiling.

Installed package caching

A global, process-wide cache stores the unfiltered install data. The update path is deliberately basic for now to keep it easy to service; future changes could examine the system data and apply only delta updates. The cache could also be saved to disk to improve the performance of future processes.

The cache itself is never used directly by callers; instead, each caller is handed an in-memory copy. The cache registers event listeners on the current data sources and waits for one of these events before it attempts to update again. A full cache hit requires only copying the in-memory database, which SQLite can do efficiently.
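A minimal sketch of the "hand out copies, refresh only after a change event" pattern described above. All names here (InstalledCache, Snapshot, NotifyChanged) are hypothetical, and a std::vector<std::string> stands in for the in-memory SQLite database; this illustrates the idea and is not the code in this PR.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the in-memory database of installed packages.
using Snapshot = std::vector<std::string>;

class InstalledCache
{
public:
    // `loader` performs the expensive full enumeration of installed packages.
    explicit InstalledCache(std::function<Snapshot()> loader) : m_loader(std::move(loader))
    {
        assert(m_loader != nullptr);
    }

    // Called by a (hypothetical) event listener when a data source reports a change.
    void NotifyChanged() { m_dirty.store(true); }

    // Callers always receive a private copy; the cached data itself is never shared.
    Snapshot GetCopy()
    {
        std::lock_guard<std::mutex> guard(m_lock);
        if (m_dirty.exchange(false))
        {
            // Only reload when an event arrived since the last load.
            m_data = m_loader();
        }
        return m_data;
    }

private:
    std::function<Snapshot()> m_loader;
    std::mutex m_lock;
    std::atomic<bool> m_dirty{ true }; // Start dirty so the first call loads.
    Snapshot m_data;
};
```

Because the dirty flag is cleared before the reload starts, an event that arrives during a reload sets it again and the next caller refreshes once more.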

The only performance gain on the update path in this implementation comes from reusing the names of MSIX packages. Because that accounts for a large share of the time spent creating the index, it yields a ~50% reduction in the time to create an updated installed source (when there are no MSIX changes).

ICU regex caching

The regular expressions that we use were being compiled repeatedly. Since this is a fixed set of expressions, they are all now cached and copies are used for actual operation.
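A sketch of what such a cache can look like, assuming std::regex as a stand-in for the ICU URegularExpression handles and std::shared_mutex in place of the lock type used in the PR. RegexCache, GetRegexCache and GetCompiledRegex are hypothetical names. The second search under the exclusive lock is there because another thread may have inserted the same pattern between releasing the shared lock and acquiring the exclusive one.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <regex>
#include <shared_mutex>
#include <string>

// Process-wide cache of compiled expressions, keyed by pattern text.
struct RegexCache
{
    std::shared_mutex lock;
    std::map<std::string, std::regex> map;
};

inline RegexCache& GetRegexCache()
{
    static RegexCache s_cache;
    return s_cache;
}

// Returns a private copy of the compiled expression, compiling at most once per pattern.
inline std::regex GetCompiledRegex(const std::string& pattern)
{
    RegexCache& cache = GetRegexCache();

    {
        // Fast path: many readers may search the map concurrently.
        std::shared_lock<std::shared_mutex> sharedLock(cache.lock);
        auto itr = cache.map.find(pattern);
        if (itr != cache.map.end())
        {
            return itr->second;
        }
    }

    // Slow path: take the exclusive lock, then search again before compiling.
    std::unique_lock<std::shared_mutex> exclusiveLock(cache.lock);
    auto itr = cache.map.find(pattern);
    if (itr == cache.map.end())
    {
        itr = cache.map.emplace(pattern, std::regex(pattern)).first;
    }
    return itr->second;
}
```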

Version lookup (string)

The version lookup by string (such as winget show FOO -v VER) was doing a full table scan. This change instead pulls out all of the versions for the given identifier and uses Version object comparisons to find the proper result. As a side effect, this fixes a somewhat annoying user interaction where trailing 0s in a version had to be typed on the command line for the version to be found. One can now provide the shorter version string and still find the expected package version (e.g. 1.2 will now find version 1.2.0).
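To illustrate why comparing parsed versions makes 1.2 match 1.2.0, here is a simplified sketch. ParseVersion and FindVersion are hypothetical helpers that handle only dotted non-negative integers; the real Version class handles more cases.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Splits a dotted numeric version such as "1.2.0" into parts and drops trailing zeros,
// so "1.2" and "1.2.0" produce the same sequence.
inline std::vector<unsigned long> ParseVersion(const std::string& text)
{
    std::vector<unsigned long> parts;
    std::istringstream stream(text);
    std::string part;
    while (std::getline(stream, part, '.'))
    {
        // Simplification: every part must consist of digits only.
        assert(part.find_first_not_of("0123456789") == std::string::npos);
        parts.push_back(part.empty() ? 0UL : std::stoul(part));
    }
    while (!parts.empty() && parts.back() == 0)
    {
        parts.pop_back();
    }
    return parts;
}

// Searches only the versions that belong to one identifier, comparing parsed values
// rather than raw strings.
inline std::optional<std::string> FindVersion(
    const std::vector<std::string>& versionsForId, const std::string& requested)
{
    const std::vector<unsigned long> target = ParseVersion(requested);
    auto itr = std::find_if(versionsForId.begin(), versionsForId.end(),
        [&target](const std::string& candidate) { return ParseVersion(candidate) == target; });
    if (itr == versionsForId.end())
    {
        return std::nullopt;
    }
    return *itr;
}
```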

Version lookup (available version)

The round trip from retrieving the available version information to getting the manifest has been improved by including the manifest id value directly in the available version data. The available version package object now also caches the mapping from version string to manifest id, meaning that any use of a PackageVersionKey that was retrieved from GetAvailableVersions will directly return the package version object without querying the index.
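A rough sketch of the caching idea, using hypothetical simplified types (VersionKey, AvailablePackage) rather than the real PackageVersionKey and package classes. A counter stands in for queries against the index so the saved round trip is visible.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified key; the manifest id travels with the version data.
struct VersionKey
{
    std::string version;
    std::int64_t manifestId;
};

class AvailablePackage
{
public:
    explicit AvailablePackage(std::vector<VersionKey> indexRows) : m_rows(std::move(indexRows)) {}

    // Returns the keys and remembers version -> manifest id for later lookups.
    std::vector<VersionKey> GetAvailableVersions()
    {
        for (const VersionKey& row : m_rows)
        {
            m_manifestIds[row.version] = row.manifestId;
        }
        return m_rows;
    }

    // Resolves from the cache when possible; otherwise simulates an index query.
    std::optional<std::int64_t> GetManifestId(const std::string& version)
    {
        auto itr = m_manifestIds.find(version);
        if (itr != m_manifestIds.end())
        {
            return itr->second;
        }
        ++m_indexQueries;
        for (const VersionKey& row : m_rows)
        {
            if (row.version == version)
            {
                return row.manifestId;
            }
        }
        return std::nullopt;
    }

    int IndexQueries() const { return m_indexQueries; }

private:
    std::vector<VersionKey> m_rows; // Stand-in for the index contents.
    std::map<std::string, std::int64_t> m_manifestIds;
    int m_indexQueries = 0;
};
```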

Let property lookups handle an unknown manifest id

The GetProperty code was enforcing that the manifest id was valid and then reusing existing code that did not handle a missing manifest. Creating new methods that can handle the missing manifest allows us to remove the initial check.
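The shape of this change can be sketched as follows, with a std::map standing in for the manifest table. ManifestTable, GetValueById and TryGetValueById are hypothetical names; the point is only that a lookup returning std::optional makes the separate validity check unnecessary.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical in-memory stand-in for a manifest table: row id -> property value.
using ManifestTable = std::map<std::int64_t, std::string>;

// Before: the lookup assumes the id exists, so the caller has to validate it first,
// which costs an extra query against a real index.
inline std::string GetValueById(const ManifestTable& table, std::int64_t manifestId)
{
    assert(table.count(manifestId) == 1); // Caller must have checked the id already.
    return table.at(manifestId);
}

// After: a single lookup that reports a missing manifest as an empty optional.
inline std::optional<std::string> TryGetValueById(const ManifestTable& table, std::int64_t manifestId)
{
    auto itr = table.find(manifestId);
    if (itr == table.end())
    {
        return std::nullopt;
    }
    return itr->second;
}
```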

Validation

Added tests where appropriate; regression testing covers many other cases.

@JohnMcPMS JohnMcPMS requested a review from a team as a code owner October 24, 2023 18:11

@JohnMcPMS JohnMcPMS changed the title from "Implement a simple cache for installed package data" to "Performance improvements" Nov 4, 2023
florelis previously approved these changes Nov 7, 2023
@@ -20,6 +21,66 @@ namespace AppInstaller::Regex
using uregex_ptr = wil::unique_any<URegularExpression*, decltype(uregex_close), uregex_close>;
using utext_ptr = wil::unique_any<UText*, decltype(utext_close), utext_close>;

static std::unique_ptr<impl> Create(std::string_view pattern, Options options)

Nit: can you add a comment stating this does caching to avoid recompiling the regex, please?

Comment on lines 64 to 76
auto itr = s_regex_cache.map.find(requested);
if (itr != s_regex_cache.map.end())
{
return std::make_unique<impl>(itr->second);
}
}

auto exclusiveLock = s_regex_cache.lock.lock_exclusive();

auto itr = s_regex_cache.map.find(requested);
if (itr != s_regex_cache.map.end())
{
return std::make_unique<impl>(itr->second);

Nit: please add a comment about why we need to search again with the exclusive lock, or anything else that saves the reader from having to figure out the concurrency details when reading the code.

static std::optional<typename Table::id_t> GetIdById(const SQLite::Connection& connection, SQLite::rowid_t id)
{
auto statement = details::ManifestTableGetIdsById_Statement(connection, id, { details::GetManifestTableColumnName<Table>() }, false);
if (statement.Step()) { return statement.GetColumn<typename Table::id_t>(0); } else { return std::nullopt; }

nit: linebreaks

@@ -40,6 +40,11 @@ namespace AppInstaller::SQLite
return Utility::ConvertUnixEpochToSystemClock(lastWriteTime);
}

std::string SQLiteStorageBase::GetDatabaseIdentifier()
{
return MetadataTable::GetNamedValue<std::string>(m_dbconn, s_MetadataValueName_DatabaseIdentifier);

Does this return empty when not set?

Comment on lines 71 to 72
// Creating the source reference for this is sufficient to cause the cache to be updated on next Open.
InstalledForceCacheUpdate,
florelis (Member) commented Nov 7, 2023

It's not clear to me if this is an actual source I could use, or if it is just to force the update. Maybe update the comment to "Same as Installed, but creating the source reference causes the cache to be updated on the next Open"

Edit: It was clear once I saw the usage, but I'd still appreciate a clarification here

var installedCatalogReference = this.packageManager.GetLocalPackageCatalog(LocalPackageCatalog.InstalledPackages);

// Ensure package is not installed
string targetPackageProductCode = "{A499DD5E-8DC5-4AD2-911A-BCD0263295E9}";

Nit: I had to look up what this GUID was. Can this be a constant somewhere else?

@@ -39,6 +39,9 @@ namespace AppInstaller
namespace Repository::Microsoft
{
void TestHook_SetPinningIndex_Override(std::optional<std::filesystem::path>&& indexPath);

using GetARPKeyFunc = Registry::Key(*)(void*, Manifest::ScopeEnum, Utility::Architecture);

Nit: Could we use std::function here?
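For comparison, a small sketch of the two styles being weighed here. The signatures are simplified and hypothetical (an int scope and a std::string result instead of the real Registry::Key, ScopeEnum and Architecture types). A plain function pointer has to carry state through an untyped context pointer, while std::function lets the callable capture its own state.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Function-pointer style: state travels through a void* context argument.
using GetKeyFuncPtr = std::string (*)(void* context, int scope);

// std::function style: the callable may capture whatever state it needs.
using GetKeyFunc = std::function<std::string(int scope)>;

// Hypothetical state that a test hook wants to use.
struct Hook
{
    std::string prefix;
};

inline std::string GetKeyFromContext(void* context, int scope)
{
    assert(context != nullptr);
    auto* hook = static_cast<Hook*>(context);
    return hook->prefix + std::to_string(scope);
}
```

The function-pointer form is cheaper and trivially copyable, which may be why a test hook would use it; std::function trades that for type safety and captures.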

auto result1 = source1->Search({});
REQUIRE(!result1.Matches.empty());

for (const auto& match : result1.Matches)

Nit: Could you add a comment about what this loop is doing?

auto result1 = source1->Search({});
REQUIRE(!result1.Matches.empty());

for (const auto& match : result1.Matches)

Could/should we check that there are no packages in result2 that aren't in result1?
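One way such a check could be expressed, assuming the package identifiers from each result are first collected into sorted sets. IsSubsetOf is a hypothetical helper, not part of the test code in this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>

// True when every identifier in `second` also appears in `first`.
// std::includes requires sorted ranges, which std::set guarantees.
inline bool IsSubsetOf(const std::set<std::string>& second, const std::set<std::string>& first)
{
    return std::includes(first.begin(), first.end(), second.begin(), second.end());
}
```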

Architecture architectureCallback = Architecture::Unknown;

ScopeEnum scopeTarget = ScopeEnum::User;
Architecture architectureTarget = Architecture::X64;

Will this test run on x86?

Comment on lines 300 to 311
# - task: VSTest@2
# displayName: 'Run tests: Microsoft.Management.Configuration.UnitTests (InProc)'
# inputs:
# testRunTitle: Microsoft.Management.Configuration.UnitTests (InProc)
# testSelector: 'testAssemblies'
# testAssemblyVer2: '**\Microsoft.Management.Configuration.UnitTests.dll'
# searchFolder: '$(buildOutDir)\Microsoft.Management.Configuration.UnitTests'
# codeCoverageEnabled: true
# platform: '$(buildPlatform)'
# configuration: '$(BuildConfiguration)'
# diagnosticsEnabled: true
# condition: succeededOrFailed()
florelis (Member) commented Nov 7, 2023

Use enabled: false instead of commenting it out

Suggested change
- task: VSTest@2
  displayName: 'Run tests: Microsoft.Management.Configuration.UnitTests (InProc)'
  inputs:
    testRunTitle: Microsoft.Management.Configuration.UnitTests (InProc)
    testSelector: 'testAssemblies'
    testAssemblyVer2: '**\Microsoft.Management.Configuration.UnitTests.dll'
    searchFolder: '$(buildOutDir)\Microsoft.Management.Configuration.UnitTests'
    codeCoverageEnabled: true
    platform: '$(buildPlatform)'
    configuration: '$(BuildConfiguration)'
    diagnosticsEnabled: true
  enabled: false
  condition: succeededOrFailed()

florelis previously approved these changes Nov 7, 2023
yao-msft previously approved these changes Nov 7, 2023
@@ -25,6 +25,9 @@ namespace AppInstaller::SQLite
// Gets the last write time for the database.
std::chrono::system_clock::time_point GetLastWriteTime();

// Gets the identifier written to the database when it was created.
std::string GetDatabaseIdentifier();
Contributor:

Is this method used only for testing purpose?

JohnMcPMS (Member, Author) replied:

I added it exclusively for test purposes at this time. But it seemed like an interesting enough idea to just leave it in for all databases.

@JohnMcPMS JohnMcPMS dismissed stale reviews from yao-msft and florelis via 20c8e88 November 7, 2023 23:31
@JohnMcPMS JohnMcPMS merged commit 296a53d into microsoft:master Nov 8, 2023
8 checks passed
@JohnMcPMS JohnMcPMS deleted the cache-installed branch November 8, 2023 03:05
Girofox commented Nov 8, 2023

Possible trigger for debugging

  • Manually starting a backup in the Windows Backup app instantly causes WindowsPackageManagerServer.exe to run at around 25% CPU usage for some time.
  • The process is still running even after the backup has finished, but there is a message that the apps backup may be incomplete.

The log file in %LocalAppData%\Packages\Microsoft.DesktopAppInstaller_8wekyb3d8bbwe\LocalState\DiagOutputDir grows very quickly to more than 1.5 Mb in my case too.

JohnMcPMS added a commit that referenced this pull request Nov 9, 2023
Cherry-pick of #3808, then fixing up the changes (mostly namespace
changes, but also bringing one extra function over) to account for the
SQLite base code having moved around.