-
Notifications
You must be signed in to change notification settings - Fork 905
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dependency resolution during install can be slow #3451
Comments
I installed all of the packages and the package versions, listed above, except:
You didn't supply Note to upgrade 23 packages, it took 17 minutes. So I cannot reproduce this and suspect the issue may lie with the packages I do not have. Without |
Thanks for your response. This does require access to a non-public repo with all the package versions. I've been reviewing the tests to see if I can create a test that reproduces the issue. Would you be interested in a PR that includes a unit test? |
We need a way to reproduce it here. The steps you provided don't allow me to do that. So everything you are doing and using is what is needed. If you want to supply the packages in private, I'm happy to set something up. A PR with a unit test isn't going to get us that, unfortunately. |
I can't provide access to the private repo or the packages we have. I will make packages that reproduce the issue that I can share. |
It turns out the dependency tree needs a kite tail to reproduce the graph below. I've attached a script and package files that generate the packages needed to reproduce this. I can send the packages I generated, but it's 3 MB so I did not attach it here. In the demo-projects folder, run GeneratePackages.ps1. This will generate a single package for package-a, package-b, and package-c. It will generate 1000 packages each for package-d and package-e. This shows up when running the upgrade even when no package is installed. I would guess it shows up when running install, but I haven't tested that yet. Here's the command I ran. choco upgrade package-a --verbose --yes --source C:\demo-projects The length of time it takes depends on the number of packages in the source. Change the first line in GeneratePackages.ps1 to $generateFewerPackages = true to see that it complete in shorter time (provided you don't have the full list of packages generated before). Graph of dependencies: graph TD;
A-->B;
A-->C;
B-->D;
C-->D;
D-->E
|
We need to reproduce your issue. The one you reported. Do you have 1000 packages in your repository, with inter-related dependencies, that are causing you day-to-day issues? What you've provided is a script to create up to 1000 packages with inter-related dependencies. This isn't a real-world use case and feels more hypothetical. If you have this in your repository, it's going to take a while to resolve. Chocolatey CLI uses NuGet libraries for package and dependency resolution, as you saw, so this would be better raised with them. |
Yes. We have packages with inter-related dependencies and push builds to an artifact server. The head package that pulls in all the dependencies is pushed twice daily, but the middle packages can be pushed more often than that. It is far from hypothetical, but is instead a simplified version of what we have. |
We have multiple packages for services. Each change a developer commits may update the configuration package and/or the service package. These are deployed continuously with automated testing and we can gather a very large number of package versions since they are generated with each build. I've been able to resolve this with a build of the choco source, but I'm uncertain of the implications of checking the dependency cache this way vs a regular .Contains call (see line 12 of each file). I see that the NuGet libraries are used for dependency resolution, but my hope is that the cache that is used currently can catch this case to avoid additional dependency checks. |
Following the instructions in this comment, the packages were installed in 2m 24s using Chocolatey CLI v2.2.2. This is a second run (after resetting the VM) without |
The local disk installs are significantly faster. When I run it with the fewer packages option I see an install of 2.991 seconds. There's a multiplier effect with the number of packages available and fetching those over the network is how you reach the times initially reported. |
I think I understand how the nuget.exe is avoiding this problem. For context, I'm looking at the following classes:
The key is in ResolverGather. It pulls in the primary target and finds its dependencies. A closure is built in a loop over the parent and child packages based on the Dependencies property of a SourcePackageDependencyInfo. This is projected down to just the package ids. Any packages that have not been searched by id goes through the dependency load again. The method used to find dependency information is DependencyInfoResource.ResolvePackages just like in the GetPackageDependencies method in NugetCommon (from chocolatey project). The difference is that the nuget client code will only call ResolvePackages once per id. My proposed fix considers the dependency VersionRange. Since ResolvePackages doesn't take version info, this may not be necessary. The key to avoiding the problem is reducing duplicate calls to ResolvePackages. |
Will this issue be fixed by #3433 ? |
#3433 looks to be a very similar issue, but the fix was not sufficient for this issue. I'm able to reproduce the slow resolution of packages with a build from the develop branch. HandleDependencies was added for #3433, and the GetPackageDependencies call in the else case doesn't consider the version info until after calling ResolvePackages. If version information is not considered in ResolvePackages, the cache will have more information than is being considered in HandleDependencies. The cache test works well for the first case in HandleDependencies, but should be more expansive for the second. |
I'm looking at the contributor requirements to offer up a possible solution. In the mean time, I'll offer this observation. GetPackageDependencies that takes a PackageIdentity uses availablePackages to avoid network requests if the results are already available. GetPackageDependencies that takes a packageId string needs to avoid duplicate network requests and could use availablePackages as well. |
This updates the dependency handling of packages to use availablePackages when a package from the same source satisfies the dependency. This improves packages that have dependencies with many versions available, and allows less data being requested from servers in the case we have already have available packages from the source. Without this fix, the installation will be delayed with repeated resolution of the same package dependencies.
This updates the dependency handling of packages to use availablePackages when a package from the same source satisfies the dependency. This improves packages that have dependencies with many versions available, and allows less data being requested from servers in the case we have already have available packages from the source. Without this fix, the installation will be delayed with repeated resolution of the same package dependencies. Use parentPackage version when finding the parent package.
@corbob, what is the impact of changing this from a bug to an improvement? 2.x is unusable for some of our private feeds. |
@josh-cooley That's a good question. The consensus we have come to internally is that this is an improvement as the current dependency resolution does work, just very slowly. The main difference of moving it from a bug to an improvement is that it moves the targetted milestone from 2.3.1 to 2.4.0. We do not currently have an ETA on any release, so it does not change the immediacy of a release in that regard, although a minor version change would typically be expected to take longer as it brings in more work, while a patch version change would potentially be sooner as it tends to be bugfixes only and narrower in scope. I was about to merge PR #3471 to develop which would earmark it for the 2.4.0 release, but based on your concern here, I'll hold off on merging it for now. And we can discuss it internally and merge it when we're ready for wherever we eventually target it to. Just to level set a little bit: regardless of which milestone we end up earmarking it against and ultimately merging it towards, I would not expect a release to be imminent. It is entirely likely that we would look to bring a few more things in before any release, and there are other priorities and work that is going on. |
I wrote my comments to prompt discussions on possibly targeting this to an earlier release. I understand that 2.4.0 is not imminent, so I hope there's time for those discussions without slipping this update past 2.4.0. |
@josh-cooley no worries, there has been some discussion internally about this and how to label it. Although the main delay in moving this issue and PR forward has been that I have been off since July 20th, just getting back today. I'll be getting it lined up today to merge into |
This updates the dependency handling of packages to use availablePackages when a package from the same source satisfies the dependency. This improves packages that have dependencies with many versions available, and allows less data being requested from servers in the case we have already have available packages from the source. Without this fix, the installation will be delayed with repeated resolution of the same package dependencies. Use parentPackage version when finding the parent package.
This updates the dependency handling of packages to use availablePackages when a package from the same source satisfies the dependency. This improves packages that have dependencies with many versions available, and allows less data being requested from servers in the case we have already have available packages from the source. Without this fix, the installation will be delayed with repeated resolution of the same package dependencies. Use parentPackage version when finding the parent package.
This commit adds pester tests and files required for the pester tests. The tests include ensuring that installing a package that contains a dependency tree that spans lots of versions does not take unnecessarily long. As well as duplicating an integration test into the pester tests to provide us with some belt and suspenders to help us catch issues at both the integration and the pester test levels.
(#3451) Use availablePackages in GetPackageDependencies
🎉 This issue has been resolved in version 2.4.0 🎉 The release is available on: |
Checklist
What You Are Seeing?
choco upgrade my-package command takes a very long time resolving dependencies. The verbose output shows that package resolution is occurring repeatedly for the same packages. Debugging the issue shows it processing multiple versions of the same dependency in NugetCommon.GetPackageDependencies and NugetCommon.HandleDependencies
Example dependencies
The dependency for B->C is similar to [1.0.1000, 2.0.0)
The dependency for C->D is similar to [1.0.0, 2.0.0)
All versions between 1.0.0 and 1.0.1000 are resolved and dependencies for D are then resolved for each instance.
Downgrading to 1.4 allows this to work.
What is Expected?
Expect resolution to the narrowest constrained dependencies. If 1.0.1000 or newer is required, no version less than 1.0.1000 should be considered.
How Did You Get This To Happen?
System Details
Installed Packages
Output Log
Additional Context
NugetCommon.HandleDependencies has a dependencyCache parameter. The cache is checked for exact matches with dependencyCache.Contains(dependency). Changing this to check the VersionRange with IsSubSetOrEqualTo allowed the upgrade to complete and target the narrowest range in the dependencies. The check looks like:
dependencyCache.Any(d => d.Id == dependency.Id && d.VersionRange.IsSubSetOrEqualTo(dependency.VersionRange));
Instead of the Contains call.
The text was updated successfully, but these errors were encountered: