Add storage 'middleware' to do scrape time aggregation. #393
Conversation
Signed-off-by: Tom Wilkie <[email protected]>
```go
func (b *batch) Add(l labels.Labels, t int64, v float64) (uint64, error) {
	// Buffer the sample so the aggregation rules can see it at Commit
	// time, then forward it to the underlying appender as usual.
	b.samples = append(b.samples, sample{l, t, v})
	return b.appender.Add(l, t, v)
}

func (b *batch) AddFast(ref uint64, t int64, v float64) error {
	// Not buffered: we only have a ref here, not the labels (see the
	// notes to the reviewer below).
	return b.appender.AddFast(ref, t, v)
}

func (b *batch) Commit() error {
	// Run every aggregation rule over the buffered samples before
	// committing the batch.
	for _, r := range b.aggregator.rules {
		if err := b.execute(r); err != nil {
			return err
		}
	}

	return b.appender.Commit()
}
```
Very cool. If I understand correctly, this is going to commit both aggregated and non-aggregated samples to the WAL, right? I think this will end up with a usage pattern of the aggregation rules always being combined with metric_relabeling_rules to drop the original non-aggregated samples.
It would be cool if we could defer writes to the underlying appender and only write aggregated samples and samples that weren't part of an aggregation rule, but I fear that might be more complicated than it's worth.
I was originally thinking that we’d use relabelling rules to drop them, yeah. We could “taint” samples that were touched by a rule; that doesn’t sound too hard. I’d make it an option though, just in case users want both...
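To make that option concrete, here is a minimal sketch of the “taint and remove” idea, reusing the batch type from this PR. The sample field names, the `ruleMatches` helper, and the `keepOriginals` flag are assumptions for illustration, not code from this branch; in this variant, `Add` and `AddFast` would only buffer samples instead of forwarding them immediately.

```go
// Sketch only: defer raw writes until Commit, mark ("taint") samples that
// an aggregation rule consumed, and drop them unless keepOriginals is set.
// ruleMatches and keepOriginals are assumed names, not part of this PR.
func (b *batch) Commit() error {
	tainted := make([]bool, len(b.samples))
	for _, r := range b.aggregator.rules {
		if err := b.execute(r); err != nil { // writes the aggregated series
			return err
		}
		for i, s := range b.samples {
			if ruleMatches(r, s.labels) { // hypothetical matcher
				tainted[i] = true
			}
		}
	}
	// Flush the raw samples last, skipping tainted ones unless the user
	// asked to keep both the originals and the aggregates.
	for i, s := range b.samples {
		if tainted[i] && !b.keepOriginals {
			continue
		}
		if _, err := b.appender.Add(s.labels, s.t, s.v); err != nil {
			return err
		}
	}
	return b.appender.Commit()
}
```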
I keep revisiting this thought and the underlying approach again and again, so just to persist it: I would strongly argue in favor of allowing both "keep" and "taint and remove".
As part of our bug scrub, we've decided to close this for now. We're planning on adding support for this eventually, where we'll use this PR as a base. I've opened grafana/alloy#554 to track this going forward.
Signed-off-by: Tom Wilkie <[email protected]>
PR Description
This PR adds a storage "middleware" that wraps the existing storage and allows you to run quasi-recording rules against commit batches.
The idea here is that you can use these rules to aggregate away certain labels that lead to high cardinality.
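For example (illustrative, not a rule shipped with this PR), a rule like `sum without (pod) (http_requests_total)` would produce a single aggregate series at scrape time alongside the per-pod ones, so the high-cardinality `pod` label never needs to be queried downstream.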
Which issue(s) this PR fixes
Notes to the Reviewer
NB this isn't wired in anywhere, and is very much just a proof of concept of an idea - that you can reuse the PromQL engine to do these scrape-time aggregations, and don't need to invent another way of doing it.
Right now this won't work as `AddFast` isn't implemented: we need the labels to be able to execute the PromQL, so we need some cache that maps refs back to labels.
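One possible shape for that cache, as a sketch: everything below mirrors the code in this PR except the `refCache` field, which is an assumed `map[uint64]labels.Labels` on `batch`.

```go
// Sketch: remember the labels for each ref handed out by the underlying
// appender, so AddFast can buffer samples for the aggregation rules too.
// refCache is an assumed field, not part of this PR as written.
func (b *batch) Add(l labels.Labels, t int64, v float64) (uint64, error) {
	b.samples = append(b.samples, sample{l, t, v})
	ref, err := b.appender.Add(l, t, v)
	if err == nil {
		b.refCache[ref] = l
	}
	return ref, err
}

func (b *batch) AddFast(ref uint64, t int64, v float64) error {
	// If we've seen this ref before, we know its labels and can buffer it;
	// either way, forward it to the underlying appender.
	if l, ok := b.refCache[ref]; ok {
		b.samples = append(b.samples, sample{l, t, v})
	}
	return b.appender.AddFast(ref, t, v)
}
```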
Also, the batch doesn't use any form of index, making select queries more expensive than they should be.
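Even a naive inverted index over the buffered samples would let a select intersect postings lists instead of scanning everything. A sketch, with the type and field names made up for illustration:

```go
// Sketch: map each label name/value pair to the positions of buffered
// samples carrying it. batchIndex and its fields are illustrative only.
type batchIndex struct {
	postings map[string][]int // "name\xffvalue" -> indices into samples
}

func (ix *batchIndex) add(i int, ls labels.Labels) {
	for _, l := range ls {
		key := l.Name + "\xff" + l.Value
		ix.postings[key] = append(ix.postings[key], i)
	}
}

func (ix *batchIndex) lookup(name, value string) []int {
	return ix.postings[name+"\xff"+value]
}
```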
PR Checklist