Add -prune option to dockerregistry #14585
@@ -45,8 +47,41 @@ import (
	registryconfig "github.com/openshift/origin/pkg/dockerregistry/server/configuration"
)

var prune = flag.Bool("prune", false, "Prune blobs from the storage. DANGEROUS! Do it only in read only mode")
Can we check whether the registry is in read-only mode, and error out if it is not?
We can, but only if dockerregistry -prune is executed in dc/docker-registry's container.
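A minimal sketch of such a guard, assuming the storage section of the registry configuration has already been decoded into nested maps and that read-only mode is expressed as storage.maintenance.readonly.enabled (the key the upstream distribution configuration uses). The names `readOnlyEnabled`, `checkPrunePreconditions` and `errNotReadOnly` are hypothetical and not part of this PR. As noted above, the check only sees the configuration of the process that runs it, so it is meaningful only inside the registry's own container.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotReadOnly is a hypothetical sentinel error used only in this sketch.
var errNotReadOnly = errors.New("registry is not in read-only mode; refusing to prune")

// readOnlyEnabled reports whether maintenance.readonly.enabled is true in an
// already-decoded storage section. The nested-map shape is an assumption
// about how the configuration was decoded.
func readOnlyEnabled(storage map[string]interface{}) bool {
	maintenance, ok := storage["maintenance"].(map[string]interface{})
	if !ok {
		return false
	}
	readonly, ok := maintenance["readonly"].(map[string]interface{})
	if !ok {
		return false
	}
	enabled, ok := readonly["enabled"].(bool)
	return ok && enabled
}

// checkPrunePreconditions returns errNotReadOnly unless read-only mode is on.
func checkPrunePreconditions(storage map[string]interface{}) error {
	if !readOnlyEnabled(storage) {
		return errNotReadOnly
	}
	return nil
}

func main() {
	storage := map[string]interface{}{
		"maintenance": map[string]interface{}{
			"readonly": map[string]interface{}{"enabled": true},
		},
	}
	fmt.Println(checkPrunePreconditions(storage))
	fmt.Println(checkPrunePreconditions(map[string]interface{}{}))
}
```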
Also, if we could get a blob's modification time, pruning would be a relatively safe operation even without switching to read-only mode. On the other hand, it is a relatively unsafe operation on eventually consistent storage (like S3), even in read-only mode.

// ExecutePruner runs the pruner.
func ExecutePruner(configFile io.Reader) {
	log.Infof("prune version=%s", version.Version)
The distribution version doesn't say much. Can this output the OpenShift release/git version?
time="2017-06-14T10:39:56.705688255Z" level=info msg="start prune" distribution_version="v2.4.1+unknown" kubernetes_version=v1.6.1+5115d708d7 openshift_version=v3.6.0-alpha.1+bb45ac4-1014-dirty
Is that OK?
👍
pkg/dockerregistry/server/prune.go
Outdated
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func Prune(ctx context.Context, storageDriver driver.StorageDriver, registry distribution.Namespace, registryClient RegistryClient) {
Some godoc would be nice: what it does, what is missing, and what is required.
pkg/dockerregistry/server/prune.go
Outdated
logger.Printf("Deleting manifest: %s@%s", repoName, dgst)
err := manifestService.Delete(ctx, dgst)
if err != nil {
	return fmt.Errorf("delete manifest %s: %s", dgst, err)
Could these errors begin with something like "failed to …"? I'd like warnings and errors to be easily distinguishable from the "Deleting …" messages.
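A small sketch of what the suggested wording could look like. `pruneManifest` and its `del` callback are hypothetical stand-ins for the code around manifestService.Delete, used here only so the example is self-contained. The progress line keeps its "Deleting" prefix while the error starts with "failed to".

```go
package main

import (
	"errors"
	"fmt"
)

// errUnavailable is a made-up error used to exercise the failure path.
var errUnavailable = errors.New("storage unavailable")

// pruneManifest logs the progress message and wraps any deletion error with
// a "failed to" prefix so that it stands out from the "Deleting" lines.
// del is a hypothetical stand-in for the real delete call.
func pruneManifest(repoName, dgst string, del func(dgst string) error) error {
	fmt.Printf("Deleting manifest: %s@%s\n", repoName, dgst)
	if err := del(dgst); err != nil {
		return fmt.Errorf("failed to delete manifest %s: %v", dgst, err)
	}
	return nil
}

func main() {
	err := pruneManifest("ns/app", "sha256:abc", func(string) error { return errUnavailable })
	fmt.Println(err)
}
```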
pkg/dockerregistry/server/prune.go
Outdated
	return nil
}

err := vacuum.RemoveBlob(string(dgst))
Could you add logger.Printf("Deleting blob: %s", dgst)?
It is already logged from inside RemoveBlob.
INFO[0000] Deleting blob: /docker/registry/v2/blobs/sha256/f6/f60ecfcf302b2378acfc896904b637497acb7863ded1c8c867c92e97593ae412 go.version=go1.8.3 instance.id=68dc9918-abae-47cc-b118-8f0f161c5cca
Should I log it one more time, but with the digest only?
Oh I see. It's OK then.
Would it be possible to treat
as just |
Actually, the output looks like junk:
By default, some human readable output should be produced. Could you ignore the loglevel from the config file up to WARNING and print the |
@miminar I can ignore the loglevel from the config file, but then how should the log level be configured? |
e6738bd to c930106
pkg/dockerregistry/server/prune.go
Outdated
	return nil
})
if e, ok := err.(driver.PathNotFoundError); ok {
	logger.Warnf("No repositories are found: %s", e)
s/are found/found/
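For readers unfamiliar with the pattern in the snippet above, here is a self-contained sketch of downgrading a "path not found" error to a warning while letting other errors through. `pathNotFoundError` is a local stand-in for the storage driver's error type and `classifyEnumerateError` is a hypothetical helper. The message uses the wording suggested in this comment.

```go
package main

import (
	"errors"
	"fmt"
)

// pathNotFoundError is a local stand-in for the storage driver's error type.
type pathNotFoundError struct{ Path string }

func (e pathNotFoundError) Error() string { return "path not found: " + e.Path }

// classifyEnumerateError turns a missing repositories path into a warning
// message and returns every other error unchanged as fatal.
func classifyEnumerateError(err error) (warning string, fatal error) {
	if err == nil {
		return "", nil
	}
	if e, ok := err.(pathNotFoundError); ok {
		return fmt.Sprintf("No repositories found: %s", e), nil
	}
	return "", err
}

func main() {
	fmt.Println(classifyEnumerateError(pathNotFoundError{Path: "/docker/registry/v2/repositories"}))
	fmt.Println(classifyEnumerateError(errors.New("permission denied")))
}
```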
I still think we should print "Keeping" and "Deleting" messages to stdout without any other garbage. All the context attributes like Could there be a different logger used to output these messages? |
pkg/dockerregistry/server/prune.go
Outdated
for _, repoName := range reposToDelete {
	err = vacuum.RemoveRepository(repoName)
	if err != nil {
		logger.Fatal("Failed to remove the repository %s: %v", repoName, err)
s/"F/"f/
In which cases should we start a log message with a lowercase letter?
All the other Fatal statements here print either lowercase messages or errors that are lower-cased.
No, "Deleting repo" and "Deleting blob" come from the distribution code. And distribution can use only logrus as a logger. If we make another logrus instance that writes to stdout and ignores fields, we might end up with errors on stdout. |
@mfojtik This looks good to me except for:
If you are fine with these, you are free to merge. |
Since there are two out of three people against my pleas, let's call it a democracy and merge it. @legionus |
[Test]ing while waiting on the merge queue |
So we waited for 8 hours just to get the "Branch is Closed for Non-Bugs" error?! |
@legionus, the branch is closed for merging; you need to say [merge][severity: blocker] or something like this. |
This needs to wait until 3.6 is closed, since hard prune is scheduled for 3.7. |
@miminar @dmage @legionus @mfojtik I am in favor of a dry-run flag on this command that shows what will be deleted. It provides administrators the ability to verify that the orphan prune will actually solve their issue. In the use case of low space alerts the administrator will only be able to guess that this will help their problem without really knowing it will. There is a chance that this command will do nothing at all and they really need to add disk space. It is also in line with our other commands. |
I'd also like to see a |
@pweil- you aware that running this with So basically the That said, since this is only for a limited number of customers (normal users should be fine with regular prune), I would punt on dry-run here. We can do it as a follow-up, but I would rather unblock the customer first. Having said that, we have to be sure which blobs we are removing so that we do not break existing images; I guess @dmage has that covered in a test or something :-) |
Yes and that is not useful 😄 Summary of discussion: If we're going to recommend to folks that they only use this command when they absolutely have to but cannot tell them when they have to then we're not providing the information we need to. Especially given the fact that we're going to require that they shut down uploads cluster-wide. @dmage is going to see if we can provide the amount of space that this command will free up if it is run. An admin can see that they're running out of space and |
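One possible shape for the dry-run and "how much space would this free" ideas discussed above, as a sketch only. The `orphan` type, the `dryRun` parameter and the `remove` callback are hypothetical and are not the interface this PR implements. The sketch assumes the size of each orphaned blob is known before deletion.

```go
package main

import "fmt"

// orphan is a hypothetical record of an unreferenced blob and its size.
type orphan struct {
	Digest string
	Size   int64
}

// prune removes each orphan via remove unless dryRun is set. It returns the
// number of bytes freed, or the number that would be freed in a dry run.
func prune(orphans []orphan, dryRun bool, remove func(digest string) error) (int64, error) {
	var freed int64
	for _, o := range orphans {
		if dryRun {
			fmt.Printf("Would delete blob: %s (%d bytes)\n", o.Digest, o.Size)
		} else {
			fmt.Printf("Deleting blob: %s\n", o.Digest)
			if err := remove(o.Digest); err != nil {
				return freed, fmt.Errorf("failed to delete blob %s: %v", o.Digest, err)
			}
		}
		freed += o.Size
	}
	return freed, nil
}

func main() {
	orphans := []orphan{
		{Digest: "sha256:aaa", Size: 1024},
		{Digest: "sha256:bbb", Size: 2048},
	}
	total, err := prune(orphans, true, func(string) error { return nil })
	fmt.Println(total, err)
}
```

In dry-run mode the callback is never invoked, so an administrator could compare the reported total against the low-space alert before deciding whether to stop uploads and run the real prune.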
|
/test all |
Either you can wait for #15423 or you can include it in your PR. |
Anyway, this needs a rebase. Please, include #14509 when you are at it. |
Signed-off-by: Oleg Bulatov <[email protected]>
Signed-off-by: Michal Minář <[email protected]>
Signed-off-by: Oleg Bulatov <[email protected]>
/approve Let's wait for green tests. I'd like to run |
Opened a new issue: openshift-eng/aos-cd-jobs#483 |
Good isolation here - I really like how cleanly this is encapsulated in the image and the lifecycle. |
/test end_to_end |
I'm not an assignee, but let's see if it works: Update: Looks like it does ... and suddenly, I've become an assignee. |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dmage, mfojtik, miminar Associated issue: 1467340 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
Automatic merge from submit-queue |
…anes Automatic merge from submit-queue [3.6][Backport] Prune orphaned blobs on registry storage Resolves [bz#1479340](https://bugzilla.redhat.com/show_bug.cgi?id=1479340) Backports #14585
Resolves bz#1467340