-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: allow artifact gc to delete directory. Fixes #12857 #13091
Conversation
Signed-off-by: Tianchu Zhao <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work figuring out the problem and fixing it!
Left a few modifications below and some optimization suggestions
workflow/artifacts/s3/s3.go
Outdated
isDir, err := s3cli.IsDirectory(artifact.S3.Bucket, artifact.S3.Key) | ||
if err != nil { | ||
return fmt.Errorf("failed to test if %s is a directory: %v", artifact.S3.Key, err) | ||
} | ||
|
||
if !isDir { | ||
return s3cli.Delete(artifact.S3.Bucket, artifact.S3.Key) | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this could potentially be optimized I'm thinking -- try deleting and if it doesn't work and gives the appropriate error, then proceed with the directory deletion
similar to what I did in #12974
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense, the current change doubles the API calls
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, looking deeper I realise
For
err = deleteFile
if isNotFound(err) {
deleteDir
}
to work, deleteFile
needs to return the appropriate error.
In s3, delete only adds a marker and will return 204
regardless if the key exist or not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated, use simple path suffix check instead of s3cli.ListDirectory
to determine if the path is a directory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh does the /
check actually work? Great workaround if so!
workflow/artifacts/s3/s3.go
Outdated
for _, objKey := range keys { | ||
err = s3cli.Delete(artifact.S3.Bucket, objKey) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can probably be parallelized.
I am also a bit concerned that for large directories this could potentially hit a rate limit or cause a lot of load, although I think in this case archive: none
is already an edge case and a large dir would be another edge case, so potentially not necessary to think about that. though maybe should leave a comment for future reference
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
err := retry.OnError(retry.DefaultBackoff, isTransientS3Err, func() error {
currently we have retry.DefaultBackoff
and isTransientS3Err
to handle the rate limit situation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right above this code/surrounding it, good point.
I do think this could still be parallelized, although that can be a separate PR as it seems we lack parallelization in several (most?) places for artifacts (related: #12442)
Co-authored-by: Anton Gilgur <[email protected]> Signed-off-by: Tianchu Zhao <[email protected]>
Co-authored-by: Anton Gilgur <[email protected]> Signed-off-by: Tianchu Zhao <[email protected]>
Signed-off-by: Tianchu Zhao <[email protected]>
Signed-off-by: Tianchu Zhao <[email protected]>
Signed-off-by: Anton Gilgur <[email protected]> Signed-off-by: Anton Gilgur <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM if the suffix check works. It might be good to have a test case for archive: none
as well
I'll leave it up to you if you want to do parallelization in this PR or separately
@tczhao do you want to re-trigger the CI, and then it sounds like this can be merged? |
CI doesn't need to be retrigerred as only the Windows tests are failing which are not required to pass. Can just hit merge |
ahh, thanks |
Signed-off-by: Tianchu Zhao <[email protected]> Signed-off-by: Anton Gilgur <[email protected]> Co-authored-by: Anton Gilgur <[email protected]> (cherry picked from commit a929c8f)
Fixes #12857
Motivation
We see error
The specified key does not exist
when gc artifact are not compressed and is directoryModifications
Determine if artifact is dir and delete accordingly (used in artifact gc)
Verification
Test log