Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compact: add label when no labels left after deduplication #2526

Merged
merged 10 commits into from
May 18, 2020

Conversation

sepich
Copy link
Contributor

@sepich sepich commented Apr 26, 2020

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

Fixes #2550
Per #1014 (comment) i'm adding new flag --deduplication.label-if-empty=label=value to handle situation, when there are no labels left after deduplication.

Verification

Basic tests added.
Tested manually on our data, collected by 3 thanos-receive with only --label=replica="$(NAME)"

Alexander Ryabov added 3 commits April 26, 2020 21:54
Signed-off-by: Alexander Ryabov <[email protected]>
Signed-off-by: Alexander Ryabov <[email protected]>
Signed-off-by: Alexander Ryabov <[email protected]>
@sepich sepich force-pushed the deduplication-label-if-empty branch from 504a43b to 092ba1e Compare April 26, 2020 18:55
Alexander Ryabov added 2 commits April 28, 2020 19:25
Copy link
Member

@kakkoyun kakkoyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks great. Thanks for fixing it.

However, to simplify flow rather than introducing a new CLI flag, could we just add an automated external label? Something like given_replica_label=compacted|deduped or constant_replica_label=compacted=deduped. WDYT?

@sepich
Copy link
Contributor Author

sepich commented May 12, 2020

Sure, will rewrite then to use first label name from list set in --deduplication.replica-label, and then automatically add first_label_name=deduped in this case.
I think it is better than constant_replica_label, because user probably do not have this constant in --query.replica-label on querier. Which will lead to metrics have additional label at some point in time (corresponding to compaction start)

Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm.. It's brutal but think it makes sense. Thanks!

Still no unique external labels will affect people. I would also add leve.Warn log line mentioning this. WDYT? (:

Thanks! looking forward to update and then we can merge for 0.13 release.

@bwplotka
Copy link
Member

Good idea @kakkoyun !

@sepich
Copy link
Contributor Author

sepich commented May 18, 2020

I can add level.Warn log, but for us it is usual case - so this would just spam the logs.
Could you please clarify how it "affect people", so we consider adding additional label to current replica.

@bwplotka
Copy link
Member

I can add level.Warn log, but for us it is usual case - so this would just spam the logs.

Really? It's every compaction so it's fine. And also you should have unique label per each "stream" as it will easier to scale to more streams some day. That's why log .

Signed-off-by: Alexander Ryabov <[email protected]>
Copy link
Member

@kakkoyun kakkoyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@bwplotka bwplotka merged commit 1859833 into thanos-io:master May 18, 2020
@bwplotka
Copy link
Member

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

compact: fails to dedup data from thanos-receive
3 participants