fix store: handle invalid cache block dir #1505

Merged
merged 2 commits into thanos-io:master from fus-store-handle-invalid-block
Sep 15, 2019

Conversation

FUSAKLA
Member

@FUSAKLA FUSAKLA commented Sep 9, 2019

fixes #1504

  • CHANGELOG entry if change is relevant to the end user.

Changes

  • store: delete block dir with missing meta.json on startup

@FUSAKLA
Member Author

FUSAKLA commented Sep 9, 2019

CI failing in master, will rebase once fixed

@FUSAKLA FUSAKLA force-pushed the fus-store-handle-invalid-block branch from cff24a3 to 0b60b5f Compare September 10, 2019 13:14
@FUSAKLA FUSAKLA requested a review from bwplotka September 10, 2019 13:17
Member

@povilasv povilasv left a comment

LGTM

Member

@GiedriusS GiedriusS left a comment

Hi there, thanks for the contribution! The code itself looks okay and I get the idea, but I have taken a look at your original ticket. As I understand it, all of the data looks okay in the remote object storage, but you have empty directories on your local disk. However, how could this be if we run SyncBlocks() beforehand? It seems the correct place to fix this is here: https://github.com/thanos-io/thanos/blob/master/pkg/store/bucket.go#L1191 - we should probably add the check there. @FUSAKLA WDYT?

Member

@bwplotka bwplotka left a comment

Sorry for the lag, @FUSAKLA

I think I fully agree with @GiedriusS

It looks like our loadMeta is just broken, as it assumes that if the dir is present then meta.json is present as well. We should rather check for meta.json directly.

This happens because if you restart store at the wrong moment (while it is syncing the blocks' meta.json), the directory can be created but meta.json not downloaded. Thanks for fixing this bug - it's quite an edge-case bug, but a serious one, as we skip data!

Thanks!

@FUSAKLA
Member Author

FUSAKLA commented Sep 13, 2019

Hi, yes @GiedriusS, that totally makes sense. Thanks for the suggestion, will refactor right away.

Just to add to the rarity: it's happening quite often in our production since upgrading to 0.7.0, which is weird, but this should solve it.

@FUSAKLA FUSAKLA force-pushed the fus-store-handle-invalid-block branch from 0b60b5f to c58038a Compare September 13, 2019 15:07
@FUSAKLA
Member Author

FUSAKLA commented Sep 13, 2019

@GiedriusS It should now be moved as you suggested, PTAL and check whether that is what you meant.

@@ -153,3 +153,10 @@ func IsBlockDir(path string) (id ulid.ULID, ok bool) {
	id, err := ulid.Parse(filepath.Base(path))
	return id, err == nil
}

func HasMetaFile(blockPath string) bool {
Member

I am against this shallow function. Such a helper is not needed; the one-liner below is exactly enough for this.

In the same way, there is no integer Max(a, b) function in the Go standard library. You just write that one if on your own (:
What do you think, can we just inline this logic?

Member

We could do what @bwplotka suggests, it might be neater. 👍

Member Author

@FUSAKLA FUSAKLA Sep 14, 2019

I eventually simplified it to checking directly for the meta file, which also covers the check for the directory's existence. Creating the dir works whether or not it already exists, and newly downloaded files should overwrite those already in the block dir.

Thanks @bwplotka for pointing this out. I just saw the IsBlockDir func, so I somehow followed the pattern.

PTAL if this is ok with you.

Member

@GiedriusS GiedriusS left a comment

LGTM. Thanks for this fix! The logic seems correct to me now. Even if other errors occur while doing a stat(2) on the meta file, it's not a problem since they will be caught later by the reading routine.

Member

@bwplotka bwplotka left a comment

Nice, good choice checking the file directly!

LGTM, thanks!

@bwplotka bwplotka merged commit 1389096 into thanos-io:master Sep 15, 2019
wbh1 pushed a commit to wbh1/thanos that referenced this pull request Sep 17, 2019
* fix store: handle invalid cache block dir

Signed-off-by: Martin Chodur <[email protected]>

* CR: simplify empty store cache block validation logic

Signed-off-by: Martin Chodur <[email protected]>
Development

Successfully merging this pull request may close these issues.

store: missing index.cache.json
4 participants