Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: fix race condition on audit logging CCL hook #105227

Merged

Conversation

THardy98
Copy link

@THardy98 THardy98 commented Jun 20, 2023

Fixes: #104660, #104838

This change fixes a race condition that can occur when multiple
SQLServers are created simultaneously, causing simultaneous writes to an
unprotected global variable used to configure a CCL audit logging
feature.

This change removes the global variable and instead invokes the hook to
check enterprise status at log time.

Release note (bug fix): This change fixes a race condition that
can occur when multiple SQLServers are created simultaneously, causing
simultaneous writes to an unprotected global variable used to configure
a CCL audit logging feature.

@THardy98 THardy98 added the backport-23.1.x Flags PRs that need to be backported to 23.1 label Jun 20, 2023
@THardy98 THardy98 requested review from knz and a team June 20, 2023 21:12
@blathers-crl
Copy link

blathers-crl bot commented Jun 20, 2023

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@THardy98 THardy98 marked this pull request as draft June 20, 2023 21:13
Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 5 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @THardy98)


-- commits line 4 at r1:
Don't forget to make the PR description reflect the release note.


-- commits line 8 at r1:
Something is off with your URLs and issue/epic references here. They seem totally unrelated.


-- commits line 11 at r1:
Something very wrong happened with these release notes.


pkg/server/server_sql.go line 1385 at r1 (raw file):

	initialized, _ := auditlogging.ReadEnterpriseParamsHook()
	if !initialized {
		auditlogging.WriteEnterpriseParamsHook(ctx, base.CheckEnterpriseEnabled(execCfg.Settings, cfg.ClusterIDContainer.Get(), "role-based auditing"))

This way of capturing the settings and the cluster ID container is very "not OK".

The only dependency you can preserve in a global variable is a reference to the function base.CheckEnterpriseEnabled. Everything else needs to be passed as argument to validateAuditLogConfig.

@THardy98 THardy98 force-pushed the fix_audit_enterprise_hook_data_race branch 5 times, most recently from b42dfaa to a809c2f Compare June 21, 2023 14:15
Copy link
Author

@THardy98 THardy98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz)


-- commits line 4 at r1:

Previously, knz (Raphael 'kena' Poss) wrote…

Don't forget to make the PR description reflect the release note.

Done.


-- commits line 8 at r1:

Previously, knz (Raphael 'kena' Poss) wrote…

Something is off with your URLs and issue/epic references here. They seem totally unrelated.

Fixed.


-- commits line 11 at r1:

Previously, knz (Raphael 'kena' Poss) wrote…

Something very wrong happened with these release notes.

Fixed.


pkg/server/server_sql.go line 1385 at r1 (raw file):

Previously, knz (Raphael 'kena' Poss) wrote…

This way of capturing the settings and the cluster ID container is very "not OK".

The only dependency you can preserve in a global variable is a reference to the function base.CheckEnterpriseEnabled. Everything else needs to be passed as argument to validateAuditLogConfig.

It's difficult to pass arguments to validateAuditLogConfig as the cluster setting validation function has an expected signature of: (st *settings.Values, input string) error, and doesn't have the necessary arguments to call CheckEnterpriseEnabled. I'm not sure how to pass server state into a cluster setting's validation function aside from a hook like this.

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @THardy98)


pkg/server/server_sql.go line 1385 at r1 (raw file):

Previously, THardy98 (Thomas Hardy) wrote…

It's difficult to pass arguments to validateAuditLogConfig as the cluster setting validation function has an expected signature of: (st *settings.Values, input string) error, and doesn't have the necessary arguments to call CheckEnterpriseEnabled. I'm not sure how to pass server state into a cluster setting's validation function aside from a hook like this.

This should teach you that validateAuditLogConfig is not the right place to do an enterprise check.

Generally, we do not do license checks on cluster settings. This is undesirable anyway; what would it mean if the value is set properly, and then the enterprise license expires? We want the feature to be disabled in that case.

The proper approach wrt license check is to do it in the SQL session. Either once when the session is opened or when we actually need to do audit logging, when/if the log config is non-empty.

Copy link
Author

@THardy98 THardy98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz)


pkg/server/server_sql.go line 1385 at r1 (raw file):

Previously, knz (Raphael 'kena' Poss) wrote…

This should teach you that validateAuditLogConfig is not the right place to do an enterprise check.

Generally, we do not do license checks on cluster settings. This is undesirable anyway; what would it mean if the value is set properly, and then the enterprise license expires? We want the feature to be disabled in that case.

The proper approach wrt license check is to do it in the SQL session. Either once when the session is opened or when we actually need to do audit logging, when/if the log config is non-empty.

I had thought about doing the enterprise check elsewhere, but doesn't it seem odd to allow users to change the cluster setting value even if they can't use its underlying feature?

@knz
Copy link
Contributor

knz commented Jun 21, 2023

doesn't it seem odd to allow users to change the cluster setting value even if they can't use its underlying feature?

We have precedent: using kerberos authentication in the HBA config.

@THardy98 THardy98 force-pushed the fix_audit_enterprise_hook_data_race branch from a809c2f to da1afd6 Compare June 21, 2023 19:29
Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is better. LGTM with nit.

Reviewed 7 of 7 files at r4, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @THardy98)


pkg/sql/audit_logging.go line 122 at r4 (raw file):

	// Do not emit audit events for reserved users/roles. This does not omit the root user.
	// Do not emit audit events for internal planners.
	return auditlogging.UserAuditEnabled(p.execCfg.Settings, p.EvalContext().ClusterID) || p.User().IsReserved() || p.isInternalPlanner

nit: consider checking whether the audit log config is empty first and take a shortcut. The enterprise license check is comparatively more expensive.

@THardy98 THardy98 force-pushed the fix_audit_enterprise_hook_data_race branch from da1afd6 to d6e0211 Compare June 23, 2023 20:31
@THardy98 THardy98 changed the title sql: fix race/deadlock condition on audit logging CCL hook sql: fix racecondition on audit logging CCL hook Jun 23, 2023
@THardy98 THardy98 changed the title sql: fix racecondition on audit logging CCL hook sql: fix race condition on audit logging CCL hook Jun 23, 2023
@THardy98 THardy98 force-pushed the fix_audit_enterprise_hook_data_race branch from d6e0211 to 329c7c1 Compare June 23, 2023 20:34
@THardy98 THardy98 requested a review from knz June 23, 2023 20:34
@THardy98 THardy98 marked this pull request as ready for review June 23, 2023 20:34
@THardy98
Copy link
Author

THardy98 commented Jun 23, 2023

I'm having a lot of trouble trying to determine the cause of #105174.

It seems difficult to reproduce, running ./dev test pkg/sql -f TestQueryCache --stress -- --define gotags=bazel,gss,deadlock locally and on gceworker doesn't reproduce reliably (I sometimes add --stress-args='-p=4' to not blow up my/gceworker machine, not sure if that has a huge impact on reproducability).

It often produces what looks like the test hanging:
0 runs so far, 0 failures, over <'x' amount of time>
but very rarely an actual deadlock detection:
POTENTIAL DEADLOCK: ...

even when running for over 30mins.

I've tried running the above on commits prior to the introduction of any audit logging logic (I'm not certain if the CCL commit tagged in the issue is the cause of the deadlock, I wanted to bisect without much success) and seem to get the same result (sometimes runs will complete, sometimes not).

As such, this PR will only address the race condition bug. I'll mention this on the deadlock issue as well.

Fixes cockroachdb#104660.

This change fixes a race condition that can occur when multiple
SQLServers are created simulatenously, causing simultaneous write to an
unprotected global variable used to configure a CCL audit logging
feature.

This change removes the global variable and instead invokes the hook to
check enterprise status at log time.

Release note (bug fix): This change fixes a race  condition that
can occur when multiple SQLServers are created simulatenously, causing
simultaneous writes to an unprotected global variable used to configure
a CCL audit logging feature.
@THardy98 THardy98 force-pushed the fix_audit_enterprise_hook_data_race branch from 329c7c1 to a5bbd98 Compare June 26, 2023 14:10
@knz
Copy link
Contributor

knz commented Jun 26, 2023

I'm having a lot of trouble trying to determine the cause of #105174.

It's mildly possible that the race condition (the issue you're fixing here) is a different issue from the deadlock. So I would be OK merging this (and closing the race condition issue) while keeping the deadlock issue investigation separate.

@THardy98
Copy link
Author

THardy98 commented Jun 27, 2023

Coming across another data race:
/dev test pkg/server -f TestAdminAPITableStats --stress --stress-args='-p=6' --race

Stack trace:

==================
WARNING: DATA RACE
Write at 0x00c019613f28 by goroutine 37415:
  internal/reflectlite.Swapper.func3()
      GOROOT/src/internal/reflectlite/swapper.go:42 +0xf2
  sort.insertionSort_func()
      GOROOT/src/sort/zsortfunc.go:13 +0x94
  sort.pdqsort_func()
      GOROOT/src/sort/zsortfunc.go:73 +0x3cb
  sort.Slice()
      GOROOT/src/sort/slice.go:23 +0xc4
  github.com/cockroachdb/cockroach/pkg/kv/kvserver/kvflowcontrol/kvflowhandle.(*Handle).connectStreamLocked()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/kvflowcontrol/kvflowhandle/kvflowhandle.go:238 +0x3ca
  github.com/cockroachdb/cockroach/pkg/kv/kvserver/kvflowcontrol/kvflowhandle.(*Handle).ConnectStream()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/kvflowcontrol/kvflowhandle/kvflowhandle.go:228 +0x1a4
  github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*replicaFlowControlIntegrationImpl).tryReconnect()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/flow_control_replica_integration.go:423 +0x518
  github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*replicaFlowControlIntegrationImpl).refreshStreams()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/flow_control_replica_integration.go:289 +0x84
  github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*replicaFlowControlIntegrationImpl).onRaftTicked()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/flow_control_replica_integration.go:257 +0x8b
  github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).updateProposalQuotaRaftMuLocked()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_proposal_quota.go:268 +0xe7c
  github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).handleRaftReadyRaftMuLocked()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:1213 +0x2344
  github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).handleRaftReady()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:814 +0x20f
  github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).processReady()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:646 +0x1cf
  github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*raftSchedulerShard).worker()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:418 +0x2f5
  github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*raftScheduler).Start.func2()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:321 +0x9a
  github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2()
      github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:484 +0x1f6

Previous read at 0x00c019613f28 by goroutine 1882489:
  github.com/cockroachdb/cockroach/pkg/kv/kvserver/kvflowcontrol/kvflowhandle.(*Handle).Admit()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/kvflowcontrol/kvflowhandle/kvflowhandle.go:110 +0x2a4
  github.com/cockroachdb/cockroach/pkg/kv/kvserver/kvadmission.(*controllerImpl).AdmitKVWork()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/kvadmission/kvadmission.go:310 +0x5ec
  github.com/cockroachdb/cockroach/pkg/server.(*Node).batchInternal()
      github.com/cockroachdb/cockroach/pkg/server/node.go:1267 +0x4ac
  github.com/cockroachdb/cockroach/pkg/server.(*Node).Batch()
      github.com/cockroachdb/cockroach/pkg/server/node.go:1463 +0x3bd
  github.com/cockroachdb/cockroach/pkg/kv/kvpb._Internal_Batch_Handler.func1()
      github.com/cockroachdb/cockroach/pkg/kv/kvpb/bazel-out/k8-fastbuild/bin/pkg/kv/kvpb/kvpb_go_proto_/github.com/cockroachdb/cockroach/pkg/kv/kvpb/api.pb.go:10059 +0x88
  github.com/cockroachdb/cockroach/pkg/util/tracing/grpcinterceptor.ServerInterceptor.func1()
      github.com/cockroachdb/cockroach/pkg/util/tracing/grpcinterceptor/grpc_interceptor.go:96 +0x1d4
  google.golang.org/grpc.chainUnaryInterceptors.func1.1()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:1162 +0x125
  github.com/cockroachdb/cockroach/pkg/rpc.NewServerEx.func3()
      github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:166 +0xf8
  google.golang.org/grpc.chainUnaryInterceptors.func1.1()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:1165 +0x22a
  github.com/cockroachdb/cockroach/pkg/rpc.kvAuth.unaryInterceptor()
      github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/auth.go:105 +0x322
  github.com/cockroachdb/cockroach/pkg/rpc.kvAuth.unaryInterceptor-fm()
      <autogenerated>:1 +0xe6
  google.golang.org/grpc.chainUnaryInterceptors.func1.1()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:1165 +0x22a
  github.com/cockroachdb/cockroach/pkg/rpc.NewServerEx.func1.1()
      github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:133 +0x70
  github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunTaskWithErr()
      github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:336 +0x147
  github.com/cockroachdb/cockroach/pkg/rpc.NewServerEx.func1()
      github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:131 +0x137
  google.golang.org/grpc.chainUnaryInterceptors.func1.1()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:1165 +0x22a
  google.golang.org/grpc.chainUnaryInterceptors.func1()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:1167 +0x298
  github.com/cockroachdb/cockroach/pkg/kv/kvpb._Internal_Batch_Handler()
      github.com/cockroachdb/cockroach/pkg/kv/kvpb/bazel-out/k8-fastbuild/bin/pkg/kv/kvpb/kvpb_go_proto_/github.com/cockroachdb/cockroach/pkg/kv/kvpb/api.pb.go:10061 +0x1dd
  google.golang.org/grpc.(*Server).processUnaryRPC()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:1340 +0x15da
  google.golang.org/grpc.(*Server).handleStream()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:1713 +0xff8
  google.golang.org/grpc.(*Server).serveStreams.func1.2()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:965 +0xec

Goroutine 37415 (running) created at:
  github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx()
      github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:475 +0x619
  github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*raftScheduler).Start()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:313 +0x406
  github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).processRaft()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:714 +0xc4
  github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).Start()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store.go:2063 +0x127a
  github.com/cockroachdb/cockroach/pkg/server.(*Node).start()
      github.com/cockroachdb/cockroach/pkg/server/node.go:604 +0x1539
  github.com/cockroachdb/cockroach/pkg/server.(*Server).PreStart()
      github.com/cockroachdb/cockroach/pkg/server/server.go:1800 +0x349a
  github.com/cockroachdb/cockroach/pkg/server.(*TestServer).Start()
      github.com/cockroachdb/cockroach/pkg/server/testserver.go:622 +0x8f
  github.com/cockroachdb/cockroach/pkg/testutils/testcluster.(*TestCluster).startServer()
      github.com/cockroachdb/cockroach/pkg/testutils/testcluster/testcluster.go:593 +0xd4
  github.com/cockroachdb/cockroach/pkg/testutils/testcluster.(*TestCluster).Start()
      github.com/cockroachdb/cockroach/pkg/testutils/testcluster/testcluster.go:389 +0x890
  github.com/cockroachdb/cockroach/pkg/testutils/testcluster.StartTestCluster()
      github.com/cockroachdb/cockroach/pkg/testutils/testcluster/testcluster.go:233 +0x9a
  github.com/cockroachdb/cockroach/pkg/testutils/testcluster.NewTestCluster()
      github.com/cockroachdb/cockroach/pkg/testutils/testcluster/testcluster.go:329 +0xcec
  github.com/cockroachdb/cockroach/pkg/testutils/testcluster.NewTestCluster()
      github.com/cockroachdb/cockroach/pkg/testutils/testcluster/testcluster.go:329 +0xcec
  github.com/cockroachdb/cockroach/pkg/testutils/testcluster.StartTestCluster()
      github.com/cockroachdb/cockroach/pkg/testutils/testcluster/testcluster.go:232 +0x7d
  github.com/cockroachdb/cockroach/pkg/server_test.TestAdminAPITableStats()
      github.com/cockroachdb/cockroach/pkg/server_test/pkg/server/admin_cluster_test.go:86 +0x134
  testing.tRunner()
      GOROOT/src/testing/testing.go:1446 +0x216
  testing.(*T).Run.func1()
      GOROOT/src/testing/testing.go:1493 +0x47

Goroutine 1882489 (running) created at:
  google.golang.org/grpc.(*Server).serveStreams.func1()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:963 +0x4dd
  google.golang.org/grpc/internal/transport.(*http2Server).operateHeaders()
      google.golang.org/grpc/internal/transport/external/org_golang_google_grpc/internal/transport/http2_server.go:599 +0x30c9
  google.golang.org/grpc/internal/transport.(*http2Server).HandleStreams()
      google.golang.org/grpc/internal/transport/external/org_golang_google_grpc/internal/transport/http2_server.go:644 +0x378
  google.golang.org/grpc.(*Server).serveStreams()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:949 +0x275
  google.golang.org/grpc.(*Server).handleRawConn.func1()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:891 +0x64
==================
I230627 02:06:18.994367 1 (gostd) testmain.go:534  [-] 1  Test //pkg/server:server_test exited with error code 66


ERROR: exit status 66

This one is less clear and I don't have a clear idea of how to tackle this. Would love to pair on this with someone who has a hunch of what's going on and what's causing this.

@knz
Copy link
Contributor

knz commented Jun 27, 2023

This data race is not related to your work. O believe we have an issue for it already, and it's for the kv team. Try looking for that stack trace in the issue tracker.

@THardy98
Copy link
Author

This data race is not related to your work. O believe we have an issue for it already, and it's for the kv team. Try looking for that stack trace in the issue tracker.

Ah, here it is: #104837
thank you!

Copy link
Contributor

@j82w j82w left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 5 files at r6, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @knz)


pkg/sql/audit_logging.go line 122 at r4 (raw file):

Previously, knz (Raphael 'kena' Poss) wrote…

nit: consider checking whether the audit log config is empty first and take a shortcut. The enterprise license check is comparatively more expensive.

Doesn't it do that shortcut in the method?

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @j82w)


pkg/sql/audit_logging.go line 122 at r4 (raw file):

Previously, j82w (Jake) wrote…

Doesn't it do that shortcut in the method?

The first time around I thought the call to shouldNotRoleBasedAudit came first. If that's not the case, please ignore my comment.

@THardy98
Copy link
Author

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @j82w)

pkg/sql/audit_logging.go line 122 at r4 (raw file):

Previously, j82w (Jake) wrote…

Doesn't it do that shortcut in the method?

The first time around I thought the call to shouldNotRoleBasedAudit came first. If that's not the case, please ignore my comment.

It did :) I made the change.

@THardy98
Copy link
Author

@knz are there still blocking changes?

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@THardy98
Copy link
Author

THardy98 commented Jun 28, 2023

Wanted to post a benchmark with these changes before merging.

sql.log.user_audit cluster setting is non-empty, causing checks for the enterprise license, but the enterprise license check fails so there is no overhead of actual logging. Consequently, the benchmark is just the cost of the enterprise checks:

Number of reads that didn't return any results: 118.
Write sequence could be resumed by passing --write-seq=R702190 to the next run.

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
 1800.0s        0       13333509         7407.5      1.0      1.0      1.6      2.0     83.9  read

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
 1800.0s        0         702188          390.1      1.9      1.9      2.6      3.7     79.7  write

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
 1800.0s        0       14035697         7797.6      1.0      1.0      1.8      2.4     83.9

Negligible loss (~2%) compared to results in #101658. Could be accounted for by different runs not using the same data initialization. This is only experienced by CCL users.

@THardy98
Copy link
Author

bors r+

@craig
Copy link
Contributor

craig bot commented Jun 28, 2023

Build succeeded:

@craig craig bot merged commit 7d54fc8 into cockroachdb:master Jun 28, 2023
@blathers-crl
Copy link

blathers-crl bot commented Jun 28, 2023

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from a5bbd98 to blathers/backport-release-23.1-105227: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 23.1.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained


pkg/sql/audit_logging.go line 122 at r4 (raw file):

Previously, THardy98 (Thomas Hardy) wrote…

It did :) I made the change.

Just wanted to callout that p.isInternalPlanner - confusingly - is likely to not be set to true when the query is run via the InternalExecutor, so there may a bug in here. It is similar to a9b83b5 in nature, and I'll look into fixing this isInternalPlanner on master (but not might be backportable).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-23.1.x Flags PRs that need to be backported to 23.1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql: TestQueryCache failed with data race
5 participants