Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

first draft of rate limit API #2028

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

kyessenov
Copy link
Contributor

Signed-off-by: Kuat Yessenov [email protected]

Signed-off-by: Kuat Yessenov <[email protected]>
@google-cla google-cla bot added the cla: yes Set by the Google CLA bot to indicate the author of a PR has signed the Google CLA. label Jun 17, 2021
@istio-testing istio-testing added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jun 17, 2021
Signed-off-by: Kuat Yessenov <[email protected]>
@kyessenov kyessenov added the release-notes-none Indicates a PR that does not require release notes. label Jun 17, 2021
@@ -24,7 +24,8 @@ buf generate \
--path analysis \
--path authentication \
--path meta \
--path telemetry
--path telemetry \
--path policy
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a bit of a nitpick but do we want a new group? "policy" feels very abstract - everything is policy. Maybe we should just it in the networking group?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 - we already had security use broad 'policy', let's keep it in networking ( it's not telemetry ).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything is networking in istio. I don't like policy either, but please not networking. Open to suggestions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not aware of any good reason to not be in networking. It's not about like or not like here - it's not security nor telemetry, and I don't think we're planning to create a 'rate limiting' WG or API group.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I stated one good reason - this is not a universal rate limiting API. This is addressing a specific need for Istio users who do not want to use EnvoyFilters. It is not a core networking API.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we're rate limiting the use of features on the network I'd be inclined to keep it in networking

mesh/v1alpha1/config.proto Show resolved Hide resolved
// The behaviour in case the rate limiting service does not respond back.
// When it is set to true, the proxy will not allow traffic in case of
// communication failure between rate limiting service and the proxy.
bool failure_mode_deny = 4;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this may be more natural as an Enum. This would match k8s webhooks, as prior art. Maybe we can change this to failurePolicy: {Fail,Ignore} to exactly match their api.

failureModeDeny: false feels awkward

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe fail_open?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Matching webhooks is good.

Avoid booleans - not very nice when merging, nor extensible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enums should have default UNKNOWN forcing the users to think about which way to go. We want false to be used in most cases (fail open) so boolean matches that better. Using enum is inconsistent with fail_open field in authorization provider, which has the inverted semantics.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have used fail_open in extauthz and I don’t think there is a 3rd option normally.
When there are only two options possible, bool is recommended per https://cloud.google.com/apis/design/design_patterns#bool_vs_enum_vs_string

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think 'forcing users' is a good strategy. We should have reasonable defaults, it's part of the job of defining an API. The problem is we are not designing a gRPC or imperative API - but a CR that will be used with yaml and may need merging semantics. The API design patterns are great for gRPC/imperative - but not a perfect match for yaml.

But as I mentioned in few other comments - and I'll keep asking for - let's look at patterns used by other APIs doing rate limiting, and try to keep it similar.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree failureModeDeny: false seems odd. how about denyOnFailed: false or denyWhenFailed: false? would be good to add false is the default (assuming it is).

// Optional. Specifies a set of descriptors that are evaluated against the
// rate limit policy. The rate limit applies if any of the generated
// descriptors trigger a limit.
repeated RateLimitDescriptor descriptors = 3;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The word "descriptor" feels very odd to me. I know its what Envoy uses, but we are not tied to envoy at all. Is there a more fitting description if we forget about envoy-isms? I noticed nginx and haproxy don't have any "descriptor" concept

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a RFC that lists the APIs used by nginx/haproxy/etc - we should look for commonality.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd be aiming to integrate Envoy rate limit service implementations (derivations of https://github.com/envoyproxy/ratelimit) which will use this terminology. It would be confusing to call it something else when performing the integration.

Copy link
Member

@howardjohn howardjohn Jun 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't that just one of the implementation that will be supported? Maybe the only for now but I thought I had seen mentions of other backend?

If it is just envoy then I agree

Copy link
Contributor Author

@kyessenov kyessenov Jun 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This version of global is just envoy. We can add other policies later. Maybe it's better to be more specific but nothing else uses "envoy" as qualification in providers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haproxy has a rate limiting dsl, nginx is simpler but all of them have a notion of domain, which is a set of related counters.
Re: envoy specific, it is an api that anyone can implement.
However we don’t want too much client side specifics if we can avoid it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be great to have a list with services ( including managed ) and APIs - and what notions they have. If notion of domain is common - I'm all for it.

The model in K8S Gateway is to identify 'core' features, that can be adopted by multiple vendors ( both server and client ) - and keep the rest optional. I would like to focus first on finding the 'core' - the simplest viable API
that can have more than one implementation. We can follow up with extensions, but the API as specified in this PR is far too large, and claiming that anyone can implement Envoy XDS and API is not realistic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This API is not trying to define a universal cross-vendor rate limiting API. I don't think it's realistic to expect that without involvement of other vendors (haproxy/nginx). This is the reason why I don't want to combine this with "networking" group because it's not really a core API. It is more similar to Wasm API where implementations may choose to implement or not without losing core functions of a mesh. Right now, there is a clear demand for a first class large API for rate limiting, not an extension, because doing the same with EnvoyFilter is extremely error prone, given the complexity of the concepts. Trying to make this into some core API will either not address user's need or leave too many details opaque, making it no better than EnvoyFilter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it doesn't appear this API is better than EnvoyFilter. At least with EnvoyFilter you have the Envoy API and you can directly apply and map to envoy. We can make it easier to apply the EnvoyFilter - but having a mangled EnvoyFilter provided as an Istio API doesn't serve either people who use Istio nor Envoy APIs. Same complexity and even more 'error prone' - since you also need to guess how this API maps to the EnvoyFilter.

If it was implemented as WASM - I suspect it wouldn't be a spaghetti module with all concepts mixed in, but probably one
WASM module focused on local rate limits with a smaller API, etc.

By vendor I also mean Redis, Google, AWS and others who may have quota servers. And we don't really need 'involvement' from haproxy/nginx - just to understand their capabilities and API, and be able to answer the question 'why does our core API needs to be more complicated'.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@costinm There is a difference between this API and runtime API that proxies use. For runtime APIs, my understanding is that Google would like to use RLS. gRPC as well. I can't speak for AWS or Redis, and we can only speculate about whether RLS is a good fit without them at the table.

Now if we agree RLS is an appropriate runtime API, we need a way to enable its usage in Istio. Because RLS uses "descriptors" as the core concept, it's only appropriate to use the same nomenclature in Istio API. The choice we have is the binding and the methods of computing descriptors. I'd like to focus on those two concepts instead of questioning why we need descriptors.

// Optional. The workload selector decides where to apply the server-side rate limit in the same namespace.
istio.type.v1beta1.WorkloadSelector workload = 1;
// Optional. The route selector decides which client route rules enforce the client-side rate limit.
RouteSelector route = 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also it would be good to make sure that doc is approved before we approve this otherwise we will have some problems..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had a lot of confusion with server side vs client side - I think it would be best to not make it explicit.

The rate limit defines the user intent - we may implement it in different ways ( including in middle boxes ), and
we may need to implement it for non-envoy cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't client vs server side not an implementation detail but an extremely important aspect? both are valid for different use cases but it's not a transparent decision?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, this cannot ratelimit the below route as it has two route destination .

// apiVersion: networking.istio.io/v1beta1
// kind: VirtualService
// metadata:
//   name: reviews-route-two-domains
// spec:
//   hosts:
//   - reviews.com
//   http:
//   - route:
//     - destination:
//         host: dev.reviews.com
//       weight: 25
//     - destination:
//         host: reviews.com
//       weight: 75


Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can't do that given how we translate to Envoy xDS. You would need a separate virtual service for dev.reviews.com.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding server/client side: my point was that we normally decide where to apply by selector ( i.e. if the CR is applied to a workload, it is typically a server-side API, etc).

For 'client side' APIs - we do have a lot of issues, including 'who decides' ( producer or consumer ), etc.
I would focus first on defining a clean and simple API that solves one specific use case - for example global rate limit applied on a gateway or server ( a very typical use case and IMO most useful ).

Like Telemetry, we can have a number of different, smaller APIs that are easier to understand and implement with multiple vendors - and for advanced cases either use EnvoyFilter or have extension APIs that are really
Envoy APIs in disguise. But we do need the 'common APIs' clearly separated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine leaving client-side specifically for gateways. The Istio APIs do not differentiate between virtual services for gateways vs virtual services for sidecars, so it's not possible to define it as such.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

K8S Gateway API will apply to Sidecars as well, and the goal is not to differentiate, but to simplify the API and focus on most important and common use cases - and make it as simple as possible.

Enforcing rate limit on each client side is not really possible - and we already have DestinationRule 'maxConnections, etc.
We can extend DestinationRule with more un-enforceable rate limiting.

Having an ingress gateway - or the backend's sidecar - protect the application with an enforced rate limit is far more common and important.

@mandarjog - for rate limit applied on a backend, the outcome is not only 'fail'/'allow' - but can also be 'spill over to other backends' ( we don't support this yet, but it can be quite valuable and can be implemented ).

Rate limits on an ingress using a quota/RL service are likely the most common case ( for RL applied to the backend -the main purpose is to protect the app behind the sidecar, that can't take more than x qps - so RL service doesn't make a lot of sense for common use case).

Copy link
Member

@linsun linsun Jun 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kyessenov does this API allow users to configure rate limit from a namespace (as the clients)? seems only at the client level which could be too detailed as users may have v1/v2/canary and other services in that namespace which are clients.

policy/v1alpha1/ratelimit.proto Outdated Show resolved Hide resolved
// +genclient
// +k8s:deepcopy-gen=true
// -->
message RateLimit {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could use some more yaml examples - both e2e examples at the top level and for each section.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack. I'd like to wait for initial bike shedding since it is a lot of work to rewrite samples.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add some initial examples? It is easier to envision how the API works for our users with examples.

// processed by this filter. Each request processed by the filter consumes a
// single token. If the token is available, the request will be allowed. If
// no tokens are available, the proxy responds with 429.
TokenBucket token_bucket = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems a bit complex and ties to a single implementation. Does this get us any benefits that could not be expressed by something simpler like `limit: 10 QPS" or something? Looking around at other proxies they seem to offer a similar simple API

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pure qps can not express the token bucket algorithm.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Token buckets have subtle behavioral changes. I think this is mostly to be explicit rather than let the user second-guess the underlying behavior.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sound like a perfect candidate to not include in the core API, not until we have clear evidence it can have more than one implementation and clear semantics.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Token buckets are much easier to understand and re-implement than arbitrary QPS semantics. I do not understand the point about clear evidence.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

k8s or golang do not deal with the same performance envelope as envoy. Envoy can go up to 1 million qps where even microseconds of CPU time matter. We don't have enough evidence to know what good fill interval is for Istio so instead of trying to hard code it now, it's better to leave it open.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can't figure it out how can a user be expected to?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 - and even there was a case for 'expert' users to fine tune - that doesn't need to be in the same API and confuse the 99.9% of users that just understand "rps: 10". Advanced users that understand Envoy enough to configure this stuff can use EnvoyFilter, and if needed we can provide a second 'advanced' API or a 'blessed' filter.

Can we start again, with the common case first - 'request per second: 10' attached just to a backend and hostname ( per-route can be added later as well ). There is a huge difference between a real API intended for users, and calling a particular implementation with all the internals 'the API'.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Token buckets are a standard concept in rate limiting. See https://cloud.google.com/architecture/rate-limiting-strategies-techniques#techniques-enforcing-rate-limits, for example. Would wrapping this into "algorithm" proto with a sole field address the concern?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QPS as a single rate limit spec is going to be confusing. None of the technique achieve perfect QPS, and we can go over the limit with bursty traffic. Do I need to remind you about rate limit mixer test with sliding window not hitting rate limit time-to-time?

// no tokens are available, the proxy responds with 429.
TokenBucket token_bucket = 1;

// Per-descriptor local rate limit override.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what this means. Can you clarify a bit in the comments how this is supposed to be used?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately the rls documentation has to be read fully to understand this in absence of detailed examples .

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another 'smell' that it doesn't belong in the core API. Can we have a simple API, focused on 'local rate limits' - and another simple one focused only on 'remote' ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The descriptors are shared between the two. There is soon a quota based rate limiting coming which will mix the two so it's not good to separate them.

mesh/v1alpha1/config.proto Show resolved Hide resolved
Copy link
Contributor

@costinm costinm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Is the rate applied on server side, or client side ?
  2. MeshConfig is intended for istiod configurations that are global. Pretty much everything in it needs to move - and would need a very strong reason why it must be set for all the workloads in the mesh.

I would suggest adding it to ProxyConfig - so it can be used for specific workloads, with different settings. It can also be added to DestinationRule, if it's client-side.

Or it can be a separate proto, that we can start in ProxyConfig and move to a standalone CRD.

But please not in MeshConfig

// +k8s:deepcopy-gen=true
// -->
message RateLimit {
oneof binding {
Copy link
Member

@hzxuzhonghu hzxuzhonghu Jun 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How could i limit one specific workload to access the route? I think they need to combine

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you asking for a client-side rate limit restricted to a particular workload? That can be added.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But please not in MeshConfig

Rate limit feels like a similar category as telemetry,tracing,and ext authz backend. So why not follow the same pattern?

If we don't like that we should change everything IMO. It may be better to he consistently "wrong" than inconsistent?

And we are changing - I hope. Telemetry, tracing are getting promoted to CRDs, hopefully Authz will graduate too. The hope is that after all features graduate, the MeshConfig will be almost empty shell.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also for specifying a particular backend - we have pretty good patterns ( BackendRef, hostname, etc).
Configurations related to a service/feature should be in the actual CRD - not in MeshConfig.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't Telemetry API still referring to mesh config providers? Almkst all of the providers in mesh config are tracing providers added for the Telemetry api

// kind: VirtualService
// route: blah
// ```
message RouteSelector {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, it looks very hard to understand

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is less direct. However, it allows us to deprecate networking API and keep this API around. The decoupling is important for future compatibility.

// Resource kind.
string kind = 3;
// Optional. Individual route rule name within the resource.
string route = 4;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For vs, it is the HTTPRoute.Name? This field can be optional now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VirtualService has many routes inside. This is referring to a route within a virtual service. It could be /special_path, for example.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the common use case for the user to want to have a rate limit for a URL, or is it really to have a rate limit for requests going to a particular backend ? With the route just a convoluted way to identify the cluster ?

And is this common across other rate limit services and APIs ? Once again, a doc with examples for all existing common rate limit services and APIs would make this much easier to review.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 and the doc to document and agree on common user cases. Is tying Ratelimit to route the right method to configure rate limit for client side?

@howardjohn
Copy link
Member

But please not in MeshConfig

Rate limit feels like a similar category as telemetry,tracing,and ext authz backend. So why not follow the same pattern?

If we don't like that we should change everything IMO. It may be better to he consistently "wrong" than inconsistent?

uint32 port = 2;

// The rate limit domain to use when calling the service.
string domain = 3;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it mean the whole mesh rl is under one domain?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's per provider. You can have multiple rate limit providers with different domains.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a typical use for the field. E.g. prod vs test ?

The Envoy docs don't suggest much which calls into question the value of asking the user to configure it vs having a defgault.

@kyessenov
Copy link
Contributor Author

@costinm

  1. All variations are possible:
  • local rate limit on outbound route at ingress;
  • global rate limit on outbound route at ingress;
  • local rate limit on app sidecar
  • global rate limit on app sidecar
  1. Usage of provider is for consistency with the rest of APIs (e.g. telemetry API). This was strongly requested AFAIR, and I don't want to have a special way just for rate limit. Let's not use this PR to resurrect that discussion and keep it separate.

Signed-off-by: Kuat Yessenov <[email protected]>
Signed-off-by: Kuat Yessenov <[email protected]>
Signed-off-by: Kuat Yessenov <[email protected]>
// Constant entry key and value.
ConstantEntry constant = 3;
// CEL expression entry.
Expression expression = 4;
Copy link
Contributor

@bianpengyuan bianpengyuan Jun 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

identity from jwt token and spiffee are going to be very commonly used, maybe we should make them first class instead of using expression?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could. I think some reviewers want this API to be minimal though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I take minimum as including identity in the descriptor but not expression.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can be prescriptive in the examples as opposed to enumerating cases in the API. I want to hear what others think since I am fine with adding identity as an option.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One reason to include identity is that dynamic metadata key for jwt token identity is implementation detail and it would be better to not let user configure expression with that. So does mTLS, relies on envoy CEL implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's easy to add domain-specific actions here as long as we have agreement to broaden the API surface.

}
oneof entry_type {
// Entry obtained from a header value.
RequestHeader header = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does RequestHeader apply to local rate limit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does. The local rate limit has descriptor version, too.

Copy link
Contributor

@bianpengyuan bianpengyuan Jun 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I am a bit confused. I thought local rate limit applies with condition match semantic (i.e. if header x has value y, then apply this rate limit). IIUC RequestHeader is used to get value of a header instead of matching it with some value, which I am not sure how does it work with local. I think I will need some example to understand it better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You would have something like this in local rate limit:

token_bucket: { general rate limit, can be set very high }
descriptors:
  token_bucket: { rate limit for matching header x to y}
  entries:
  - key: x
    value: y

Then add a rate limit action for header x separately.

// Entry obtained by testing a header value against exact value, prefix, etc.
RequestHeaderMatch header_match = 2;
// Constant entry key and value.
ConstantEntry constant = 3;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, I guess constant does not apply to local rate limit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I think descriptors are the same in both. Local rate limit gets a descriptor, matches against any one in the policy, then uses the corresponding token bucket instead of the global token bucket.

// Defines configuration for a global rate limiting service provider.
// The service must implement Rate Limit Service (RLS):
// https://www.envoyproxy.io/docs/envoy/latest/api-v3/service/ratelimit/v3/rls.proto.
message RateLimitProvider {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a usecase for multiple rls? Or can everything be expressed as one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not aware of a use case of multiple global RLS.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • canary services using a different RLS ( for example a canary version of the RLS )
  • regional RLS
  • different implementations or instances of RLS ( for example different redis servers, dedicated to one or a set of apps)

The other question is - do we have a use case for a namespace-configured RLS instead of mesh wide, mesh-admin configured one. And I think the answer is yes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that is covered by multiple named provider instances no?

// to unambiguously resolve a service in the service registry. The <Hostname> is a fully qualified host name of a
// service defined by the Kubernetes service or ServiceEntry.
//
// Example: "rls.default.svc.cluster.local" or "bar/rls.example.com".
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The examples specifically make it sounds like they have to be in cluster, they don’t have to be.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are copied verbatim from all other providers. I don't have the context why we use this naming scheme in providers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The service implementations don't have to be in-cluster but we do prefer the Service/ServiceEntry to be a declared KRM resource available to the control plane. Not strictly required though...

// The behaviour in case the rate limiting service does not respond back.
// When it is set to true, the proxy will not allow traffic in case of
// communication failure between rate limiting service and the proxy.
bool failure_mode_deny = 4;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have used fail_open in extauthz and I don’t think there is a 3rd option normally.
When there are only two options possible, bool is recommended per https://cloud.google.com/apis/design/design_patterns#bool_vs_enum_vs_string

// Optional. The workload selector decides where to apply the server-side rate limit in the same namespace.
istio.type.v1beta1.WorkloadSelector workload = 1;
// Optional. The route selector decides which client route rules enforce the client-side rate limit.
RouteSelector route = 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TargetRef is a good idea. @howardjohn is that document nearing approval?
Besides ref is a better description than selector.

re the example that Zhonghu gave, how would limit per destination?

// Optional. Specifies a set of descriptors that are evaluated against the
// rate limit policy. The rate limit applies if any of the generated
// descriptors trigger a limit.
repeated RateLimitDescriptor descriptors = 3;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haproxy has a rate limiting dsl, nginx is simpler but all of them have a notion of domain, which is a set of related counters.
Re: envoy specific, it is an api that anyone can implement.
However we don’t want too much client side specifics if we can avoid it.


// Route selector references a client networking resource to overlay the rate
// limit filter application. This can be a service, a virtual service, or a
// gateway resource. Route name designation is optional.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

named route: clairify if it is the name of the xds resource or the configuration resource?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we get to control the route name in the portable way?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Route is the k8s resource inner name. Routes are not separate k8s resource, the hosts are, so we have to add a separate qualifier to select a route within. It's portable since it's not using Envoy route name.

// no tokens are available, the proxy responds with 429.
TokenBucket token_bucket = 1;

// Per-descriptor local rate limit override.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately the rls documentation has to be read fully to understand this in absence of detailed examples .

@costinm
Copy link
Contributor

costinm commented Jun 23, 2021 via email

@mandarjog
Copy link
Contributor

Token buckets are standard for implementing rate limiting. The question I suspect is if they should be exposed as part of the API, and if users should be expected ( or allowed ) to mess with this.

The alternative is that we have no istio api, but controllers for backends to program envoy. What this means is that if someone uses Google cloud armor to configure their rate limits we will use that config to correctly configure the proxy. There will be a similar rls controller that will read back end rls config and program envoy. These cotrollers may require istiod extensions or output an EnvoyFilter.

Managed grpc may or may not implement it, but are Likely to implement rls anyway.

The uses cases that customers need, rate limiting based on service account or based on presence and values of multiple headers cannot be expressed in a simpler api.
But using the alternative above we can solve that problem for specific backend without adding an api.

a least common denominator istio api can be orthogonally added .

@kyessenov
Copy link
Contributor Author

kyessenov commented Jun 23, 2021 via email

Comment on lines +76 to +78
// If no route name is provided, the rate limit applies to all routes. However,
// the rate limit for a named route takes priority over the rate limit for the
// entire resource.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we give an example on the priority thing?


option go_package = "istio.io/api/policy/v1alpha1";

package istio.policy.v1alpha1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High level concern with rate limiting in general - I think we need to ensure it works well with our existing APIs. For example, retries and locality lb. If we have a single pod at rate limit (constant 429) in our zone, and 1000s of pods outside the zone returning 200, we must fail over. Similarly with retries, we probably don't want to retry the pod returning 429s (if we retry on 429 at all)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it depends. We don't want cascading failures so 429 might need to be propagated downstream to shed load. This would be the primary purpose for this version of rate limiting since that appears what users want. Fail-over is a different feature.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The behavior of rate limiting and locality load balancing can be extremely dangerous though.

Without rate limiting: pod gets overloaded and starts returning 5xx, triggering outlier detection and failover
With rate limiting: pod sheds load with 429, not triggering outlier detection. 100% of requests are now failing, even though there are plenty of pods that could handle the load.

This is exacerbated when there is uneven distribution between pods in different zones (ie 1 pod in priority 0, 1000s in priority 1)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

503 and 429 are similar, and envoy treats x-envoy-retry-grpc-on / resource exhausted as retryable.

@kyessenov
Copy link
Contributor Author

kyessenov commented Jun 28, 2021 via email

@istio-testing istio-testing added the needs-rebase Indicates a PR needs to be rebased before being merged label Jul 2, 2021
@istio-testing
Copy link
Collaborator

@kyessenov: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jwendell jwendell mentioned this pull request Jul 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes Set by the Google CLA bot to indicate the author of a PR has signed the Google CLA. needs-rebase Indicates a PR needs to be rebased before being merged release-notes-none Indicates a PR that does not require release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants