Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
tracing: make full redaction of trace messages configurable
When we discovered that tenants could "trace into" KV, we considered this a security problem since no mechanism ensured that tenants could receive data they are allowed to see. The KV team reacted to this with cockroachdb#70562, which makes all verbose traces *redactable* and by redacting them at the KV-Tenant boundary. Serverless has shipped with this redactability in place in order to prevent data leakage across the boundary with KV and now needs to get back on the `release-21.2` branch to sync with the rest of development. Unfortunately, making traces redactable had a [significant] performance overhead. On simple selects, this overhead is in the 10-20% area (when verbose tracing is enabled). This presents a stability risk, since some customers run with verbose tracing enabled for all queries. Once they switch to 21.2, their system may be underprovisioned as a result of the newly grown tracing overhead in the `log.{Ve,E}vent{,f}` family of methods. The peformance hit comes via the additional costs of redacting log strings via the `redact.Sprintf` function. This PR changes makes trace redactability a choice. Because we needed to support mixed-version clusters in cockroachdb#70562 this is not as big a lift as might have been expected. The approach taken is that tracers and the spans they generate can be marked as "redactable". If the flag is enabled, logging events to the span will use full redaction via `redact.Sprintf`. If the flag is not enabled, we implement a manual "coarse redaction" of the entire message by surrounding it with redaction markers. We take a "fast path" by checking to see if there are any existing redaction markers to escape in the message, making the assumption that usually there aren't. This approach maintains redactability (to varying degrees) of traces while reducing the performance hit of tracing in the case where we don't want detailed redactability. A further commit will enable redactability on tenant-bound traces since we do want to maintain that functionality in multi-tenant deployments. Original issue investigation that kicked off concerns about performance impact of redaction: cockroachdb#70110 (comment) ---- Benchmarks The base for all 3 below benchmarks that's used is the code on `release-21.2` with the "drop traces for tenants" commit reverted. This reproduces a reasonable "base" state where tracing redaction isn't really introduced at all. Additionally, all benchmarks are run with modifications introduced in the first two commits in cockroachdb#71610 which force tracing on in benchmarks by running `SET CLUSTER SETTING trace.debug.enable = true` in the regular and multinode scenarios shown below. Below are tracing benchmarks for the conditional redaction that is in **this** commit. ``` ~/go/src/github.com/cockroachdb/cockroach make-traces-redactable-and-default-to-off--v2* ≡ 19:36:43 ❯ benchstat 83fc452-no-tracing-bench.txt 2bea5a2ac5-conditional-redaction-simple.txt name old time/op new time/op delta Tracing/Cockroach/tracing=f/Scan1-8 612µs ± 9% 591µs ±16% ~ (p=0.393 n=10+10) Tracing/Cockroach/tracing=f/Insert-8 785µs ± 4% 767µs ± 6% ~ (p=0.360 n=8+10) Tracing/Cockroach/tracing=t/Scan1-8 597µs ± 1% 542µs ± 1% -9.23% (p=0.000 n=10+9) Tracing/Cockroach/tracing=t/Insert-8 762µs ± 3% 739µs ± 8% ~ (p=0.315 n=10+10) Tracing/MultinodeCockroach/tracing=f/Scan1-8 1.05ms ±30% 0.92ms ± 5% -12.57% (p=0.000 n=9+9) Tracing/MultinodeCockroach/tracing=f/Insert-8 1.19ms ±15% 1.13ms ± 9% -5.39% (p=0.035 n=10+10) Tracing/MultinodeCockroach/tracing=t/Scan1-8 979µs ± 3% 945µs ± 6% -3.42% (p=0.024 n=9+9) Tracing/MultinodeCockroach/tracing=t/Insert-8 1.54ms ±48% 1.19ms ±16% -22.76% (p=0.011 n=10+10) name old alloc/op new alloc/op delta Tracing/Cockroach/tracing=f/Scan1-8 86.5kB ± 1% 90.2kB ± 3% +4.29% (p=0.000 n=9+9) Tracing/Cockroach/tracing=f/Insert-8 109kB ± 3% 111kB ± 0% +2.68% (p=0.005 n=9+7) Tracing/Cockroach/tracing=t/Scan1-8 86.0kB ± 1% 89.8kB ± 0% +4.42% (p=0.000 n=10+9) Tracing/Cockroach/tracing=t/Insert-8 108kB ± 3% 112kB ± 3% +3.05% (p=0.000 n=9+9) Tracing/MultinodeCockroach/tracing=f/Scan1-8 156kB ±37% 149kB ± 1% ~ (p=0.050 n=9+9) Tracing/MultinodeCockroach/tracing=f/Insert-8 154kB ±10% 154kB ± 3% ~ (p=0.382 n=8+8) Tracing/MultinodeCockroach/tracing=t/Scan1-8 145kB ± 1% 150kB ± 5% +3.61% (p=0.000 n=9+9) Tracing/MultinodeCockroach/tracing=t/Insert-8 157kB ±13% 183kB ±40% ~ (p=0.237 n=8+10) name old allocs/op new allocs/op delta Tracing/Cockroach/tracing=f/Scan1-8 1.44k ± 0% 1.50k ± 0% +4.22% (p=0.000 n=9+8) Tracing/Cockroach/tracing=f/Insert-8 1.64k ± 0% 1.72k ± 0% +4.39% (p=0.000 n=9+9) Tracing/Cockroach/tracing=t/Scan1-8 1.44k ± 0% 1.50k ± 0% +4.33% (p=0.000 n=8+7) Tracing/Cockroach/tracing=t/Insert-8 1.64k ± 0% 1.72k ± 0% +4.34% (p=0.000 n=9+9) Tracing/MultinodeCockroach/tracing=f/Scan1-8 2.41k ±11% 2.38k ± 1% ~ (p=0.150 n=10+9) Tracing/MultinodeCockroach/tracing=f/Insert-8 2.15k ± 1% 2.22k ± 1% +3.35% (p=0.000 n=8+8) Tracing/MultinodeCockroach/tracing=t/Scan1-8 2.31k ± 1% 2.40k ± 4% +4.19% (p=0.000 n=9+9) Tracing/MultinodeCockroach/tracing=t/Insert-8 2.17k ± 6% 2.33k ±12% +7.43% (p=0.012 n=8+10) ``` These statistics are the *base* hit we take on the redaction work **prior** to any conditional redaction being added. This is what we're trying to improve on. ``` ~/go/src/github.com/cockroachdb/cockroach make-traces-redactable-and-default-to-off--v2* ≡ 19:36:49 ❯ benchstat 83fc452-no-tracing-bench.txt dc42e37-after-redaction.txt name old time/op new time/op delta Tracing/Cockroach/tracing=f/Scan1-8 612µs ± 9% 693µs ± 2% +13.22% (p=0.000 n=10+10) Tracing/Cockroach/tracing=f/Insert-8 785µs ± 4% 878µs ± 1% +11.86% (p=0.000 n=8+9) Tracing/Cockroach/tracing=t/Scan1-8 597µs ± 1% 687µs ± 3% +15.05% (p=0.000 n=10+9) Tracing/Cockroach/tracing=t/Insert-8 762µs ± 3% 878µs ± 1% +15.35% (p=0.000 n=10+8) Tracing/MultinodeCockroach/tracing=f/Scan1-8 1.05ms ±30% 1.06ms ± 2% ~ (p=0.050 n=9+9) Tracing/MultinodeCockroach/tracing=f/Insert-8 1.19ms ±15% 1.28ms ±11% +7.85% (p=0.035 n=10+10) Tracing/MultinodeCockroach/tracing=t/Scan1-8 979µs ± 3% 1066µs ± 1% +8.87% (p=0.000 n=9+9) Tracing/MultinodeCockroach/tracing=t/Insert-8 1.54ms ±48% 1.26ms ± 2% ~ (p=0.515 n=10+8) name old alloc/op new alloc/op delta Tracing/Cockroach/tracing=f/Scan1-8 86.5kB ± 1% 98.0kB ± 1% +13.33% (p=0.000 n=9+9) Tracing/Cockroach/tracing=f/Insert-8 109kB ± 3% 125kB ± 1% +15.09% (p=0.000 n=9+9) Tracing/Cockroach/tracing=t/Scan1-8 86.0kB ± 1% 97.9kB ± 1% +13.90% (p=0.000 n=10+9) Tracing/Cockroach/tracing=t/Insert-8 108kB ± 3% 125kB ± 1% +15.53% (p=0.000 n=9+9) Tracing/MultinodeCockroach/tracing=f/Scan1-8 156kB ±37% 162kB ± 1% ~ (p=0.059 n=9+8) Tracing/MultinodeCockroach/tracing=f/Insert-8 154kB ±10% 197kB ±43% +28.54% (p=0.000 n=8+10) Tracing/MultinodeCockroach/tracing=t/Scan1-8 145kB ± 1% 162kB ± 1% +11.81% (p=0.000 n=9+9) Tracing/MultinodeCockroach/tracing=t/Insert-8 157kB ±13% 172kB ± 5% +9.84% (p=0.007 n=8+8) name old allocs/op new allocs/op delta Tracing/Cockroach/tracing=f/Scan1-8 1.44k ± 0% 1.66k ± 0% +15.59% (p=0.000 n=9+9) Tracing/Cockroach/tracing=f/Insert-8 1.64k ± 0% 1.95k ± 0% +18.48% (p=0.000 n=9+8) Tracing/Cockroach/tracing=t/Scan1-8 1.44k ± 0% 1.66k ± 0% +15.46% (p=0.000 n=8+9) Tracing/Cockroach/tracing=t/Insert-8 1.64k ± 0% 1.95k ± 1% +18.52% (p=0.000 n=9+9) Tracing/MultinodeCockroach/tracing=f/Scan1-8 2.41k ±11% 2.58k ± 1% ~ (p=0.105 n=10+8) Tracing/MultinodeCockroach/tracing=f/Insert-8 2.15k ± 1% 2.58k ±12% +20.04% (p=0.000 n=8+10) Tracing/MultinodeCockroach/tracing=t/Scan1-8 2.31k ± 1% 2.59k ± 1% +12.09% (p=0.000 n=9+9) Tracing/MultinodeCockroach/tracing=t/Insert-8 2.17k ± 6% 2.47k ± 1% +13.76% (p=0.000 n=8+8) ``` Touches cockroachdb#70562. Release note: None
- Loading branch information