Document possible race conditions during federation #514
Comments
This isn't specific to federation, this is the case with any query. |
It's much more relevant for federation because now you can get the situation where some metrics of a target are a scrape interval ahead of other metrics of the same target for a long while and that gets persisted. |
The same can happen if a recording rule is running during a scrape, which is a more supported use case. |
Good point. Maybe this should be documented for both cases then... |
It's a more general execution model thing, nothing is atomic. |
Yeah, but federation and rules are the only places I can think of where there can be such a long delay which is persisted, no? In normal graphs, you could only have that kind of glitch on the far right (now) end, where data is still coming in. But if you use that far right end to persist new data, then you have this problem. |
You have the exact same with recording rules. |
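To make the glitch discussed above concrete, here is a minimal toy model, not Prometheus code. It assumes a scrape appends its samples one at a time and that a federation scrape or rule evaluation reads the latest sample of every series partway through. `ToyStorage`, `scrape_with_read_midway`, and the metric names are hypothetical and exist only for illustration.

```python
from typing import Dict, List, Tuple

Sample = Tuple[int, float]  # (timestamp in seconds, value)


class ToyStorage:
    """Hypothetical in-memory storage; ingestion is per sample, not per scrape."""

    def __init__(self) -> None:
        self.series: Dict[str, List[Sample]] = {}

    def append(self, name: str, ts: int, value: float) -> None:
        self.series.setdefault(name, []).append((ts, value))

    def latest(self) -> Dict[str, Sample]:
        # What a federation scrape or an instant query would see "now".
        return {name: samples[-1] for name, samples in self.series.items()}


def scrape_with_read_midway(
    storage: ToyStorage, ts: int, values: Dict[str, float], read_after: int
) -> Dict[str, Sample]:
    """Ingest one scrape and take a reader snapshot after `read_after` samples."""
    snapshot: Dict[str, Sample] = {}
    for i, (name, value) in enumerate(values.items()):
        if i == read_after:
            snapshot = storage.latest()  # the reader races with ingestion here
        storage.append(name, ts, value)
    return snapshot


storage = ToyStorage()

# The first scrape at t=0 completes before anyone reads.
for name, value in {"requests_total": 100.0, "errors_total": 5.0}.items():
    storage.append(name, 0, value)

# The second scrape at t=15 is read after only one of its two samples landed.
snapshot = scrape_with_read_midway(
    storage, 15, {"requests_total": 130.0, "errors_total": 9.0}, read_after=1
)

# requests_total is already at t=15 while errors_total is still at t=0.
print(snapshot)
```

If that snapshot is what gets federated or written by a recording rule, the mismatch between the two series is persisted downstream rather than disappearing on the next graph refresh.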
Idea to solve the issue for both cases:
If you think this is feasible, we can open an issue in prometheus/prometheus. |
In the end this goes back to SampleAppender accepting a slice of samples (…)
|
Sadly, changing the interface doesn't magically make ingestion atomic. My point in the discussion was that such an interface would suggest atomicity, which we cannot provide at the moment. Should we be able to do so, I'm all for such an interface.
The fast path is already there. |
Hence the second part of the sentence. Of course it needs support by the … Yes, but if we added support for watermarks/atomic scrapes, I doubt it …
|
This may fall out of prometheus/prometheus#398 as the data required is the same. |
Atomicity is on the one hand a much stronger requirement than what I proposed here (e.g. in terms of handling an error halfway through the slice of samples). On the other hand, it wouldn't solve the problem at hand. The watermark would need to be wired to the setting of a timestamp when a scrape is initiated, long before samples from that scrape hit the storage layer. In any case, the idea doesn't seem to be completely insane, so I'll file an issue in prometheus/prometheus about it. We can have further discussions there. Here it's pretty much off-topic, and the discussion of an interface change as a possible result of a possible solution is yet another level away, or two… |
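One possible reading of the watermark idea, sketched as a toy model: the timestamp is registered when a scrape is initiated, and readers only see samples older than the oldest scrape still in flight. This is an assumption about how it could work, not the actual proposal or any Prometheus API. `WatermarkedStorage` and its methods are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

Sample = Tuple[int, float]  # (timestamp in seconds, value)


class WatermarkedStorage:
    """Hypothetical storage that hides samples from scrapes still in flight."""

    def __init__(self) -> None:
        self.series: Dict[str, List[Sample]] = {}
        self.in_flight: Set[int] = set()  # timestamps of scrapes being ingested

    def begin_scrape(self, ts: int) -> None:
        # Registered when the scrape starts, before any sample reaches storage.
        self.in_flight.add(ts)

    def append(self, name: str, ts: int, value: float) -> None:
        self.series.setdefault(name, []).append((ts, value))

    def end_scrape(self, ts: int) -> None:
        self.in_flight.discard(ts)

    def watermark(self) -> Optional[int]:
        # Samples at or after this timestamp may still be incomplete.
        return min(self.in_flight) if self.in_flight else None

    def latest_visible(self) -> Dict[str, Sample]:
        wm = self.watermark()
        out: Dict[str, Sample] = {}
        for name, samples in self.series.items():
            visible = [s for s in samples if wm is None or s[0] < wm]
            if visible:
                out[name] = visible[-1]
        return out


storage = WatermarkedStorage()

# A complete scrape at t=0.
storage.begin_scrape(0)
storage.append("requests_total", 0, 100.0)
storage.append("errors_total", 0, 5.0)
storage.end_scrape(0)

# A scrape at t=15 is halfway through ingestion when a reader arrives.
storage.begin_scrape(15)
storage.append("requests_total", 15, 130.0)
during = storage.latest_visible()  # both series still reported at t=0

storage.append("errors_total", 15, 9.0)
storage.end_scrape(15)
after = storage.latest_visible()  # both series now reported at t=15
```

The sketch uses a single global watermark for simplicity. A real design would presumably track it per target, and, as noted above, it gives consistency for readers without making ingestion atomic (an error halfway through a scrape is not handled here).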
Hang on, weren't upstream recording rules the proposed workaround to avoid the race in federation vs sample ingestion? But you seem to be saying that they have the exact same problem. |
I said it was less likely to be a problem, not that it'd solve it. |
See prometheus/prometheus#1887 (comment)