Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

distributor: do inflight checks before reading request (for httpgrpc requests) #6300

Merged
merged 12 commits into from
Oct 24, 2023

Conversation

pstibrany
Copy link
Member

@pstibrany pstibrany commented Oct 9, 2023

What this PR does

This PR implements inflight checks in Distributor by using gRPC method limiter introduced in grafana/dskit#377. This PR relies on unmerged dskit PR grafana/dskit#392 (waiting for new gRPC-go release).

This PR also relies on clients passing the correct metadata headers for httpgrpc requests (using methods introduced in grafana/dskit#391).

To be rebased on top of #6301, after it's merged.

Checklist

  • Tests updated
  • [na] Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@pracucci pracucci self-requested a review October 12, 2023 08:51
httpUrl := getSingleMetadata(md, httpgrpc.MetadataURL)
msgSize := getMessageSize(md, grpcutil.MetadataMessageSize)

if httpMethod == http.MethodPost && (httpUrl == api.PrometheusPushEndpoint || httpUrl == api.OTLPPushEndpoint) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure checking the URL for equality will work. Reason is that the endpoint is exposed under a configurable prefix (and I think the prefix is still the URL received here). What if we check if the URL ends with the requested endpoint? That's what we do in the query-frontend.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, done in 5cd002d.

inflight := d.inflightPushRequests.Inc()
defer func() {
if cleanupInDefer {
d.inflightPushRequests.Dec()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if to keep the code more future proof, we should decrease d.inflightPushRequestsBytes here. My take is that in in the future we'll add more checks after d.checkRequestSize() then we'll miss subtracting it. The check here could be done on rs.requestSize.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually we could just change the defer function into a call to cleanupAfterPushFinished().

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in adfcca0.

pkg/distributor/distributor.go Show resolved Hide resolved
@pstibrany pstibrany force-pushed the distributor-push-grpc-check branch from dbb1f77 to 25e9f1f Compare October 23, 2023 12:32
@pstibrany pstibrany marked this pull request as ready for review October 23, 2023 14:06
@pstibrany pstibrany requested review from a team as code owners October 23, 2023 14:06
@pstibrany
Copy link
Member Author

I think this is now ready for review. Compared to previous version of the PR, we now check size of httpgrpc.Request and mimirpb.WriteRequest separately, and they both contribute to inflight request size, because they are both in memory while request is being handled.

Extension of this (for future PR) could be to check for OTLP Request size too in OTLP Push handler.

Copy link
Collaborator

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job Peter! I can't see any issue. I left a couple of nits and questions (don't block on me re-reviewing it).

@@ -49,6 +49,7 @@
* [ENHANCEMENT] Querier: reduce memory consumed for queries that hit store-gateways. #6348
* [ENHANCEMENT] Ruler: include corresponding trace ID with log messages associated with rule evaluation. #6379
* [ENHANCEMENT] Querier: clarify log messages and span events emitted while querying ingesters, and include both ingester name and address when relevant. #6381
* [ENHANCEMENT] Ingester, Distributor: added experimental support for rejecting push requests received via gRPC before reading them into memory, if ingester or distributor is unable to accept the request. This is activated by using `-ingester.limit-inflight-requests-using-grpc-method-limiter` for ingester, and `-distributor.limit-inflight-requests-using-grpc-method-limiter` for distributor. #5976 #6300
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mention the experimental flags in docs/sources/mimir/configure/about-versioning.md too, please.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Let's check httpgrpc metadata, if present.
httpMethod := getSingleMetadata(md, httpgrpc.MetadataMethod)
httpUrl := getSingleMetadata(md, httpgrpc.MetadataURL)
msgSize := getMessageSize(md, grpcutil.MetadataMessageSize)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] We could move this inside the if branch.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


ctx, err = l.RPCCallStarting(context.Background(), httpgrpcHandleMethod, metadata.New(map[string]string{
httpgrpc.MetadataMethod: "POST",
httpgrpc.MetadataURL: api.PrometheusPushEndpoint,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] Would be nice to add a prefix to ensure it matches anyway.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

il := d.getInstanceLimits()
if il.MaxInflightPushRequests > 0 && inflight > int64(il.MaxInflightPushRequests) {
d.rejectedRequests.WithLabelValues(reasonDistributorMaxInflightPushRequests).Inc()
return ctx, nil, errMaxInflightRequestsReached
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you confirm the reason why we removed the wrapping to util_log.DoNotLogError{} is because all errors returned by this function will not be logged anyway? If so, I would add a comment to clarify it.

Copy link
Member Author

@pstibrany pstibrany Oct 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added comment in 00e660f. Wrapping is still done in limitsMiddleware.


func (d *Distributor) checkWriteRequestSize(rs *requestState, writeRequestSize int64) error {
// If writeRequestSize was already checked, don't check it again.
if rs.writeRequestSize > 0 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: I'm good with this check, but could it ever happen that we call checkWriteRequestSize() twice for the same request? If yes, in which case? I can't see it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how that would happen. It's only called from limitsMiddleware, and not from grpc method check. (We only call checkHttpgrpcRequestSize from there).

@pstibrany pstibrany force-pushed the distributor-push-grpc-check branch from acc3661 to 304de04 Compare October 24, 2023 12:07
@pstibrany
Copy link
Member Author

Good job Peter! I can't see any issue. I left a couple of nits and questions (don't block on me re-reviewing it).

Thanks for review. I've addressed your feedback, setting auto-merge. 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants