-
Notifications
You must be signed in to change notification settings - Fork 643
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add logic for respecting the package auto-delete configuration #5415
Conversation
if (package.PackageRegistration.DownloadCount > _config.MaximumDownloadsForPackageId.Value) | ||
{ | ||
return false; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are cases when we can avoid _statisticsService.Refresh() which is expensive. Restructure the code to be cheaper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, I really only need it before the stale check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
var sinceCreated = now - package.Created; | ||
|
||
// Always refresh the statistics. | ||
await _statisticsService.Refresh(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_statisticsService.Refresh() [](start = 18, length = 28)
We will want to analyze the success of the feature later on and understand how to improve it. Add traces that help understand why we returned false. Consider structuring if in a way that will allow creating a table of input properties and result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great idea!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
everything looks fine to me
|
||
// Handle the "early" delete case, where the package version download count is not considered but total | ||
// download count on the entire ID (all versions) is considered. | ||
if (_config.StatisticsUpdateFrequencyInHours.HasValue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO the name StatisticsUpdateFrequencyInHours
is a bit misleading...potentially something along the lines of ExpectedStatisticsUpdateFrequencyInHours
would be more accurately convey that stats are not GUARANTEED to be updated every X hours.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Expected
is good. I'll add that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I decided to not make this change as it would require a config change in NuGetDeployment. This is not worth the effort.
var sinceCreated = now - package.Created; | ||
|
||
// Always refresh the statistics. | ||
await _statisticsService.Refresh(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we will refresh stats every time someone goes to the ReportMyPackage
page or submit the ReportMyPackage
form?
I'm not sure about the logistics of this method (does it always refresh or is it just if the cache is stale?) but if this refresh is not cached, somebody could create a lot of turmoil on the site by hitting either of these two methods repeatedly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The refresh is cached and the cache is a singleton.
&& !_config.StatisticsUpdateFrequencyInHours.HasValue; | ||
} | ||
|
||
private bool AreStatisticsStale(DateTime now) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sounds like a method that should be on the StatisticsService
itself, but if it's too much work to move it there it's fine here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stale is a relative thing. Different scenarios may or may not have different "stale" definitions. For now the concept only exists in the auto-delete workflow.
9a6afed
to
8d3bd1a
Compare
$"{nameof(_config.HourLimitWithMaximumDownloads)}.", | ||
nameof(config)); | ||
} | ||
} | ||
|
||
public Task<bool> CanPackageBeDeletedByUserAsync(Package package) | ||
public async Task<bool> CanPackageBeDeletedByUserAsync(Package package) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thoughts on only emitting telemetry when the support request form is actually submitted?
This would reduce the amount of de-duping we would have to do on the telemetry -- but not eliminate all duplicates, since a user could support multiple support requests for a package. I figured it would just be simpler to submit telemetry whenever this method is called and de-dupe later when we are analyzing the data.
As it stands today, telemetry is emitted before the support request form is loaded and after the form is submitted with a deletion request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds like we'll do the check before, and only show the link if it's possible? I think we should only submit telemetry on submit, not view rendering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds like we'll do the check before, and only show the link if it's possible?
Correct.
I think we should only submit telemetry on submit, not view rendering.
Yeah, that's where I'm leaning too.
// Handle the "early" delete case, where the package version download count is not considered but total | ||
// download count on the entire ID (all versions) is considered. | ||
if (_config.ExpectedStatisticsUpdateFrequencyInHours.HasValue | ||
&& details.SinceCreated < TimeSpan.FromHours(_config.ExpectedStatisticsUpdateFrequencyInHours.Value)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean owners of popular packages that want to delete will need to unlist immediately, then can delete after the hour limit has expired?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unlisted is not a factor here.
Owners of popular packages (IDs > 125,000 downloads) cannot delete their packages at all.
/// allowed since the download data is too stale. If <see cref="StatisticsUpdateFrequencyInHours"/> is null, | ||
/// <see cref="ExpectedStatisticsUpdateFrequencyInHours"/> hours. Also, if the package has not had its download | ||
/// statistics updated in the last <see cref="ExpectedStatisticsUpdateFrequencyInHours"/> hours, the delete is not | ||
/// allowed since the download data is too stale. If <see cref="ExpectedStatisticsUpdateFrequencyInHours"/> is null, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be configured based on the frequency of the stats jobs? Should we have a specific recommendation in comments, based on job?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be configured based on the frequency of the stats jobs?
That's not too necessary. As mentioned in the XML doc, this value is used for a stale check (so the value can be larger than the average frequency).
Should we have a specific recommendation in comments, based on job?
Yeah, I'll try to improve this.
StaleStatistics, | ||
TooManyVersionDatabaseDownloads, | ||
TooManyVersionReportDownloads, | ||
TooLate, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will the UI always show an option to delete, but present the error message if the criteria aren't met?
Or do we check before showing a link, and after / on submit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will the UI always show an option to delete, but present the error message if the criteria aren't met?
Nope, we only show the option if at the time of form load their package is eligible.
Or do we check before showing a link, and after / on submit?
Correct. And if the eligibility has changed since the form was presented, we just treat it as a normal support request.
$"{nameof(_config.HourLimitWithMaximumDownloads)}.", | ||
nameof(config)); | ||
} | ||
} | ||
|
||
public Task<bool> CanPackageBeDeletedByUserAsync(Package package) | ||
public async Task<bool> CanPackageBeDeletedByUserAsync(Package package) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds like we'll do the check before, and only show the link if it's possible? I think we should only submit telemetry on submit, not view rendering.
|
||
private bool HasVersion(StatisticsFact fact, string version) | ||
{ | ||
if (fact != null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if [](start = 12, length = 2)
Consider return xyz;
instead of if (xyz) { return true } return false;
,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure
} | ||
|
||
// If no time ranges are configured, allow downloads any time. | ||
if (_config.HourLimitWithMaximumDownloads.HasValue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if [](start = 12, length = 2)
Could we convert the chains of if
to else if
s? It'd be a little more clear I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure
} | ||
catch | ||
{ | ||
// logging failed, don't allow exception to escape |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// logging failed, don't allow exception to escape [](start = 16, length = 50)
Should we maybe log these exceptions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of scope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
var hours = details.SinceCreated.TotalHours; | ||
|
||
TrackMetric(Events.UserPackageDeleteAfterHours, hours, properties => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
properties [](start = 67, length = 10)
The action here seems unnecessary. I think this just creating a dictionary and using its initializer syntax would be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Existing pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(it saves the initialization of the dictionary)
3f38e5f
to
00e5a4e
Compare
00e5a4e
to
044866c
Compare
Address https://github.com/NuGet/Engineering/issues/921.
Could I get some crazy nitpick 👀s on this PR? There is risk associated with this change as it allows users to delete their own packages in some cases.
I check both the DB and the statistics reports for download counts. This is to improve resiliency as separate jobs write these two artifacts.