From 50c8737b2ab432f8c183bf187fdbf60a4ec46ab7 Mon Sep 17 00:00:00 2001
From: Alex Snaps
Date: Thu, 22 Sep 2022 11:08:08 -0400
Subject: [PATCH] Phase one

---
 doc/how-it-works.md                        |   6 +-
 doc/server/configuration.md                | 216 +++++++++++++++++++--
 doc/topologies.md                          |  20 +-
 limitador-server/README.md                 |   2 +-
 limitador/src/storage/redis/redis_async.rs |   2 +-
 5 files changed, 216 insertions(+), 30 deletions(-)

diff --git a/doc/how-it-works.md b/doc/how-it-works.md
index 24a7d097..dfcefe9a 100644
--- a/doc/how-it-works.md
+++ b/doc/how-it-works.md
@@ -34,10 +34,10 @@ The counter's condition will match. Then, the counter will be increased and the
 If the limit is exceeded, the request will be rejected with `429 Too Many Requests`,
 otherwise accepted.
 
-Note that the counter is being activated eventhough it does not match *all* the entries of the
+Note that the counter is being activated even though it does not match *all* the entries of the
 descriptor. The same rule applies for the *variables* field.
 
-Currently *condition* implementation only allows *equal* operator.
+Currently, *condition* implementation only allows *equal* operator.
 More operators can be implemented if there are use cases.
 
 The *variables* field is a list of keys.
@@ -71,7 +71,7 @@ Reason: conditions key does not exist
 ```yaml
 conditions:
   - "KEY_A == 'VALUE_A'"
-  - "OTHER_KEY = 'WRONG_VALUE'"
+  - "OTHER_KEY == 'WRONG_VALUE'"
 max_value: 1
 seconds: 60
 variables: []
diff --git a/doc/server/configuration.md b/doc/server/configuration.md
index e1854336..8ac13c86 100644
--- a/doc/server/configuration.md
+++ b/doc/server/configuration.md
@@ -1,7 +1,187 @@
-# Configuration using environment variables
-
-The Limitador server has some options that can be configured with environment
+# Limitador configuration
+
+## Command line configuration
+
+The preferred way of starting and configuring the Limitador instance is using the command line:
+
+```
+USAGE:
+    limitador-server [OPTIONS] [STORAGE]
+
+ARGS:
+    The limit file to use
+
+OPTIONS:
+    -b, --rls-ip                  The IP to listen on for RLS [default: 0.0.0.0]
+    -p, --rls-port                The port to listen on for RLS [default: 8081]
+    -B, --http-ip                 The IP to listen on for HTTP [default: 0.0.0.0]
+    -P, --http-port               The port to listen on for HTTP [default: 8080]
+    -l, --limit-name-in-labels    Include the Limit Name in prometheus label
+    -v                            Sets the level of verbosity
+        --validate                Validates the LIMITS_FILE and exits
+    -h, --help                    Print help information
+    -V, --version                 Print version information
+
+STORAGES:
+    memory          Counters are held in Limitador (ephemeral)
+    redis           Uses Redis to store counters
+    redis_cached    Uses Redis to store counters, with an in-memory cache
+```
+
+The values used are authoritative over any [environment variables](#configuration-using-environment-variables) independently set.
+
+### Limit definitions
+
+The `LIMITS_FILE` provided is the source of truth for all the limits that will be enforced. The file location will be
+monitored by the server for any changes and be hot reloaded. If the changes are invalid, they will be ignored on hot
+reload, or the server will fail to start.
+
+#### The `LIMITS_FILE`'s format
+
+When starting the server, you point it to a `LIMITS_FILE`, which is expected to be a _yaml_ file with an array of
+`limit` definitions, with the following format:
+
+```json
+{
+  "$schema": "http://json-schema.org/draft-04/schema#",
+  "type": "object",
+  "properties": {
+    "name": {
+      "type": "string"
+    },
+    "namespace": {
+      "type": "string"
+    },
+    "seconds": {
+      "type": "integer"
+    },
+    "max_value": {
+      "type": "integer"
+    },
+    "conditions": {
+      "type": "array",
+      "items": [
+        {
+          "type": "string"
+        }
+      ]
+    },
+    "variables": {
+      "type": "array",
+      "items": [
+        {
+          "type": "string"
+        }
+      ]
+    }
+  },
+  "required": [
+    "namespace",
+    "seconds",
+    "max_value",
+    "conditions",
+    "variables"
+  ]
+}
+```
+
+Here is an example of such a limit definition:
+
+```yaml
+namespace: example.org
+max_value: 10
+seconds: 60
+conditions:
+  - "req.method == 'GET'"
+variables:
+  - user_id
+```
+
+ - `namespace` namespaces the limit; it will generally be the domain, [see here](../how-it-works.md)
+ - `seconds` is the duration for which the limit applies, in seconds: e.g. `60` is a span of time of one minute
+ - `max_value` is the actual limit, e.g. `100` would limit to 100 requests
+ - `name` lets the user _optionally_ name the limit
+ - `variables` is an array of variables, which once resolved, will be used to qualify counters for the limit,
+   e.g. `api_key` to limit per API key
+ - `conditions` is an array of conditions, which once evaluated will decide whether to apply the limit or not
+
+#### `condition` syntax
+
+Each `condition` is an expression producing a boolean value (`true` or `false`). All `conditions` _must_ evaluate to
+`true` for the `limit` to be applied on a request.
+
+Expressions use the following syntax: `$IDENTIFIER $OP $STRING_LITERAL`, where:
+
+ - `$IDENTIFIER` will be used to resolve the value at evaluation time, e.g. `role`
+ - `$OP` is an operator, either `==` or `!=`
+ - `$STRING_LITERAL` is a literal string value, `"` or `'` demarcated, e.g. `"admin"`
+
+So that `role != "admin"` would apply the limit on requests from all users except those with the `admin` role.
+
+### Counter storages
+
+Limitador will load all the `limit` definitions from the `LIMITS_FILE` and keep these in memory. To enforce these
+limits, Limitador needs to track requests in the form of counters. There would be at least one counter per limit, but
+that number grows when `variables` are used to qualify counters per some arbitrary values.
+
+#### `memory`
+
+As the name implies, Limitador will keep all counters in memory. This yields the best results in terms of latency as
+well as accuracy. By default, only up to `1000` "concurrent" counters will be kept around, evicting the oldest entries.
+"Concurrent" in this context means counters that need to exist at the "same time", based on the period of the limit,
+as "expired" counters are discarded.
+
+This storage is ephemeral: if the process is restarted, all the counters are lost, which effectively "resets" all the
+limits as if no traffic had been rate limited. This can be fine for short-lived limits, less so for longer-lived ones.
+
+#### `redis`
+
+When you want persistence of your counters, such as for DR or across restarts, using `redis` will store the counters in
+a redis instance using the provided `URL`. Increments to _individual_ counters are made within redis itself, providing
+accuracy over these. Races can still occur, though, when multiple limitador servers are used against a single redis with
+"stacked" limits (i.e. over different periods). Latency is also impacted, as it results in one additional hop to talk
+to redis and maintain the counters.
+
+#### `redis_cached`
+
+In order to avoid some communication overhead to redis, `redis_cached` adds an in-memory caching layer within the
+limitador servers. This lowers the latency, but sacrifices some accuracy as it will not only cache counters, but also
+coalesce counter updates to redis over time. See [this configuration](#redis_local_cache_enabled) option for more
+information.
+
+For an in-depth coverage of the different topologies supported and how they affect the behavior, see the
+[topologies' document](../topologies.md).
+
+#### `infinispan` optional storage - _experimental_
+
+The default binary will _not_ support [Infinispan](https://infinispan.org/) as a storage backend for counters. If you
+want to give it a try, you would need to build your own binary of the server using:
+
+```commandline
+cargo build --release --features=infinispan
+```
+
+This will add `infinispan` to the supported `STORAGES`.
+
+```
+USAGE:
+    limitador-server infinispan [OPTIONS]
+
+ARGS:
+    Infinispan URL to use
+
+OPTIONS:
+    -n, --cache-name     Name of the cache to store counters in [default: limitador]
+    -c, --consistency    The consistency to use to read from the cache [default:
+                         Strong] [possible values: Strong, Weak]
+    -h, --help           Print help information
+```
+
+## Configuration using environment variables
+
+The Limitador server has some options that can be configured with environment variables. These will override the
+_default_ values the server uses. [Any argument](#command-line-configuration) used when starting the server will prevail over the
+environment variables.
 
 #### `ENVOY_RLS_HOST`
 
@@ -24,14 +204,14 @@ variables:
 - Format: `string`.
 
 
-### `HTTP_API_PORT`
+#### `HTTP_API_PORT`
 
 - Port where the HTTP API listens.
 - Optional. Defaults to `8080`.
 - Format: `integer`.
 
 
-### `LIMITS_FILE`
+#### `LIMITS_FILE`
 
 - YAML file that contains the limits to create when limitador boots. If the
 limits specified already have counters associated, limitador will not delete them.
@@ -40,7 +220,7 @@ Changes to the file will be picked up by the running server.
 - Format: `string`, file path.
 
 
-### `LIMIT_NAME_IN_PROMETHEUS_LABELS`
+#### `LIMIT_NAME_IN_PROMETHEUS_LABELS`
 
 - Enables using limit names as labels in Prometheus metrics. This is disabled by
 default because for a few limits it should be fine, but it could become a
@@ -49,7 +229,7 @@ docs](https://prometheus.io/docs/practices/naming/#labels)
 - Optional. Disabled by default.
 - Format: `bool`, set to `"1"` to enable.
 
-### `REDIS_LOCAL_CACHE_ENABLED`
+#### `REDIS_LOCAL_CACHE_ENABLED`
 
 - Enables a storage implementation that uses Redis, but also caches some data in
 memory. The idea is to improve throughput and latencies by caching the counters
@@ -63,7 +243,7 @@ sacrifices some rate-limit accuracy. This mode does two things:
     other instances will not become aware of the counter updates until they're
     committed to Redis.
     - Caches counters. Instead of fetching the value of a counter every time
-    it's needed, the value is cached for a configurable period. The trade off is
+    it's needed, the value is cached for a configurable period. The trade-off is
     that when running several instances of Limitador, an instance will not
     become aware of the counter updates other instances do while the value is
     cached. When a counter is already at 0 (limit exceeded), it's cached until
@@ -86,7 +266,7 @@ sacrifices some rate-limit accuracy. This mode does two things:
 - Note: "REDIS_URL" needs to be set.
 
 
-### `REDIS_LOCAL_CACHE_FLUSHING_PERIOD_MS`
+#### `REDIS_LOCAL_CACHE_FLUSHING_PERIOD_MS`
 
 - Used to configure the local cache when using Redis. See
 [`REDIS_LOCAL_CACHE_ENABLED`](#redis_local_cache_enabled). This env only applies
@@ -95,7 +275,7 @@ when `"REDIS_LOCAL_CACHE_ENABLED" == 1`.
 - Format: `integer`. Duration in milliseconds.
 
 
-### `REDIS_LOCAL_CACHE_MAX_TTL_CACHED_COUNTERS_MS`
+#### `REDIS_LOCAL_CACHE_MAX_TTL_CACHED_COUNTERS_MS`
 
 - Used to configure the local cache when using Redis. See
 [`REDIS_LOCAL_CACHE_ENABLED`](#redis_local_cache_enabled). This env only applies
@@ -104,7 +284,7 @@ when `"REDIS_LOCAL_CACHE_ENABLED" == 1`.
 - Format: `integer`. Duration in milliseconds.
 
 
-### `REDIS_LOCAL_CACHE_TTL_RATIO_CACHED_COUNTERS`
+#### `REDIS_LOCAL_CACHE_TTL_RATIO_CACHED_COUNTERS`
 
 - Used to configure the local cache when using Redis. See
 [`REDIS_LOCAL_CACHE_ENABLED`](#redis_local_cache_enabled). This env only applies
@@ -113,7 +293,7 @@ when `"REDIS_LOCAL_CACHE_ENABLED" == 1`.
 - Format: `integer`.
 
 
-### `REDIS_URL`
+#### `REDIS_URL`
 
 - Redis URL. Required only when you want to use Redis to store the limits.
 - Optional. By default, Limitador stores the limits in memory and does not
@@ -121,16 +301,16 @@ require Redis.
 - Format: `string`, URL in the format of `"redis://127.0.0.1:6379"`.
 
 
-### `RUST_LOG`
+#### `RUST_LOG`
 
 - Defines the log level.
 - Optional. Defaults to `"error"`.
 - Format: `enum`: `"debug"`, `"error"`, `"info"`, `"warn"`, or `"trace"`.
 
 
-## When built with the `infinispan` feature
+### When built with the `infinispan` feature - _experimental_
 
-### `INFINISPAN_CACHE_NAME`
+#### `INFINISPAN_CACHE_NAME`
 
 - The name of the Infinispan cache that Limitador will use to store limits and
 counters. This variable applies only when [`INFINISPAN_URL`](#infinispan_url) is
@@ -139,7 +319,7 @@ require Redis.
 - Format: `string`.
 
 
-### `INFINISPAN_COUNTERS_CONSISTENCY`
+#### `INFINISPAN_COUNTERS_CONSISTENCY`
 
 - Defines the consistency mode for the Infinispan counters created by Limitador.
 This variable applies only when [`INFINISPAN_URL`](#infinispan_url) is set.
@@ -147,10 +327,10 @@ require Redis.
 - Format: `enum`: `"Strong"` or `"Weak"`.
 
 
-### `INFINISPAN_URL`
+#### `INFINISPAN_URL`
 
 - Infinispan URL. Required only when you want to use Infinispan to store the
 limits.
 - Optional. By default, Limitador stores the limits in memory and does not
 require Infinispan.
-- Format: `URL`, inthe format of `http://username:password@127.0.0.1:11222`.
+- Format: `URL`, in the format of `http://username:password@127.0.0.1:11222`.
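The `condition` syntax documented in `doc/server/configuration.md` (`$IDENTIFIER $OP $STRING_LITERAL`, with `==` or `!=`) can be illustrated with a small Rust sketch. This is not Limitador's actual parser: `Op`, `Condition`, `parse_condition` and `applies` are hypothetical names, the parser assumes whitespace around the operator, and treating an unresolved identifier as a non-match is an assumption based on the "conditions key does not exist" case in `doc/how-it-works.md`.

```rust
use std::collections::HashMap;

/// The two operators the documentation lists for conditions.
#[derive(Debug, PartialEq)]
enum Op {
    Eq,
    NotEq,
}

/// One parsed `$IDENTIFIER $OP $STRING_LITERAL` expression (hypothetical type).
#[derive(Debug, PartialEq)]
struct Condition {
    identifier: String,
    op: Op,
    literal: String,
}

/// Parses a single condition; returns `None` when it does not fit the syntax.
/// Assumption: identifier, operator and literal are separated by whitespace.
fn parse_condition(input: &str) -> Option<Condition> {
    let (identifier, rest) = input.trim().split_once(char::is_whitespace)?;
    let (op, raw) = rest.trim_start().split_once(char::is_whitespace)?;
    let op = match op {
        "==" => Op::Eq,
        "!=" => Op::NotEq,
        _ => return None, // e.g. a single `=` is not a valid operator
    };
    // The literal must be wrapped in a matching pair of `"` or `'`.
    let raw = raw.trim();
    let quote = raw.chars().next()?;
    if !(quote == '"' || quote == '\'') || raw.len() < 2 || !raw.ends_with(quote) {
        return None;
    }
    Some(Condition {
        identifier: identifier.to_string(),
        op,
        literal: raw[1..raw.len() - 1].to_string(),
    })
}

/// A limit applies only if every condition evaluates to `true`.
/// Assumption: an identifier with no resolved value makes the condition false.
fn applies(conditions: &[Condition], values: &HashMap<String, String>) -> bool {
    conditions.iter().all(|c| match values.get(&c.identifier) {
        Some(value) => match c.op {
            Op::Eq => *value == c.literal,
            Op::NotEq => *value != c.literal,
        },
        None => false,
    })
}
```

Under these assumptions, `role != "admin"` parses into an identifier `role`, the `!=` operator and the literal `admin`, while the single-`=` form that this patch corrects in `doc/how-it-works.md` is rejected by `parse_condition`.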
diff --git a/doc/topologies.md b/doc/topologies.md
index 3ff640c3..28e29787 100644
--- a/doc/topologies.md
+++ b/doc/topologies.md
@@ -1,16 +1,17 @@
-# Redis active-active storage
+# Deployment topologies
+
+## In-memory
+
+## Redis
+
+### Redis active-active storage
 
 The RedisLabs version of Redis supports [active-active
 replication](https://docs.redislabs.com/latest/rs/concepts/intercluster-replication/).
 Limitador is compatible with that deployment mode, but there are a few things to
 take into account regarding limit accuracy.
 
-## Set up
-
-In order to try active-active replication, you can follow this [tutorial from
-RedisLabs](https://docs.redislabs.com/latest/rs/getting-started/getting-started-active-active/).
-
-## Considerations
+#### Considerations
 
 With an active-active deployment, the data needs to be replicated between
 instances. An update in an instance takes a short time to be reflected in the
@@ -25,3 +26,8 @@ hand, if we have defined limits with a high number of hits and a long period,
 the effect will be basically negligible. For example, if we define a limit of
 one hour, and we know that the data takes around one second to be replicated,
 the accuracy loss is going to be negligible.
+
+#### Set up
+
+In order to try active-active replication, you can follow this [tutorial from
+RedisLabs](https://docs.redislabs.com/latest/rs/getting-started/getting-started-active-active/).
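The accuracy argument in `doc/topologies.md` (a one-hour limit with roughly one second of replication delay) can be made concrete with a rough calculation. The sketch below is only a heuristic for illustration: `stale_fraction` is a hypothetical helper, and the ratio of delay to period is an assumed way of sizing the effect, not a formula taken from Limitador or Redis.

```rust
/// Rough share of a limit's period during which another instance may not yet
/// see an update, given the replication delay. Illustrative heuristic only
/// (an assumption, not a figure measured on Limitador or Redis).
fn stale_fraction(replication_delay_secs: f64, period_secs: f64) -> f64 {
    // Cap at 1.0: a delay at least as long as the period covers the whole window.
    (replication_delay_secs / period_secs).min(1.0)
}
```

For the example in the text, `stale_fraction(1.0, 3600.0)` is 1/3600, roughly 0.03% of the period, whereas a one-second limit with the same delay gives `1.0`, which matches the observation that short-period limits are the ones affected.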
diff --git a/limitador-server/README.md b/limitador-server/README.md
index 0143777c..d7ffc754 100644
--- a/limitador-server/README.md
+++ b/limitador-server/README.md
@@ -10,7 +10,7 @@ can be configured with these ENVs: `ENVOY_RLS_HOST`, `ENVOY_RLS_PORT`,
 
 Or using the command line arguments:
 
-```commandline
+```
 Limitador Server
 The Kuadrant team - github.com/Kuadrant
 Rate Limiting Server
diff --git a/limitador/src/storage/redis/redis_async.rs b/limitador/src/storage/redis/redis_async.rs
index 69c33393..72235d1e 100644
--- a/limitador/src/storage/redis/redis_async.rs
+++ b/limitador/src/storage/redis/redis_async.rs
@@ -13,7 +13,7 @@ use std::collections::HashSet;
 use std::str::FromStr;
 use std::time::Duration;
 
-// Note: this implementation does no guarantee exact limits. Ensuring that we
+// Note: this implementation does not guarantee exact limits. Ensuring that we
 // never go over the limits would hurt performance. This implementation
 // sacrifices a bit of accuracy to be more performant.