Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transaction Details page has an empty statements table, caused by the transaction fingerprint ID for most statements not being populated in the endpoint response (and possibly deeper) #81470

Closed
jocrl opened this issue May 18, 2022 · 5 comments · Fixed by #83616
Assignees
Labels
A-sql-console-general SQL Observability issues on the DB console spanning multiple areas. Includes Cockroach Cloud Console C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.

Comments

@jocrl
Copy link
Contributor

jocrl commented May 18, 2022

Describe the problem
First reported by @dbist.

The first symptom is that the Statements table in the Transactions Details page is empty, but all other parts of the page are populated.

Upon further inspection, it turns out that for the statements involved, the transaction_fingerprint_id column in the statement_statistics table is 0 instead of the transaction id. Accordingly, in the endpoint key.key_data.transaction_fingerprint_id is 0 instead of the transaction id.
image

This is is despite that in the endpoint response transactions -> index 5 -> stats_data -> statement_fingerprint_ids is correctly populated with statement_fingerprint_ids.

To Reproduce

This is the case in the roachprod cluster on 22.1-rc linked in this slack thread (the cluster might be gone/a different version later). With Redux devtools installed, go to the transactions details page and check cachedData -> statements -> data -> statements -> the statement at index 5 -> key -> key_data -> transaction_fingerprint_id, and that should be "0". The slack thread also contains an attachment of the Redux state.

This is not reproducible by checking out 5b78463ed2e7106a8477b63fa837564ad02bb510 (22.1-rc), building it, and running the demo database.

Additional data / screenshots
Note that as of reporting we currently only have a repro on a customer cluster, so be wary of screenshots.

Environment:

  • CockroachDB version: 22.1-rc, roachprod

Jira issue: CRDB-15168

@jocrl jocrl added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-observability A-sql-console-general SQL Observability issues on the DB console spanning multiple areas. Includes Cockroach Cloud Console labels May 18, 2022
@jlinder jlinder added sync-me and removed sync-me labels May 20, 2022
@jocrl
Copy link
Contributor Author

jocrl commented May 25, 2022

This happened a second time, with another of @dbist's clusters

@xinhaoz
Copy link
Member

xinhaoz commented May 31, 2022

Is it possible to get the value of sql.stats.associate_stmt_with_txn_fingerprint.enabled in the cluster? That cluster setting being set to false causes the txnfingerprintid to be set to 0 for stmts in explicit transactions.

@jocrl
Copy link
Contributor Author

jocrl commented May 31, 2022

Ah, that's a good point!
The cluster has since been taken down, but that will be useful to see the next time this is reported. I'll add this to our runbook known issues section.
@kevin-v-ngo, my understanding is that sql.stats.associate_stmt_with_txn_fingerprint.enabled should usually be true, correct? I was wondering if we need some sort of UI messaging for when it is set to false (that the table should say hey, this is empty because this is turned off). But maybe not if it sounds like sql.stats.associate_stmt_with_txn_fingerprint.enabled would only be modified for troubleshooting purposes.

@dbist
Copy link
Contributor

dbist commented May 31, 2022

this is a vanilla cluster, why would this property be off, we didn't touch it.

root@34.75.1.14:26000/aci> SHOW CLUSTER SETTING sql.stats.assoociate_stmt_with_txn_fingerprint;
ERROR: unknown setting: "sql.stats.assoociate_stmt_with_txn_fingerprint"
root@34.75.1.14:26000/aci> SHOW CLUSTER SETTING sql.stats.assoociate_stmt_with_txn_fingerprint.enabled;
ERROR: unknown setting: "sql.stats.assoociate_stmt_with_txn_fingerprint.enabled"
root@34.75.1.14:26000/aci> SHOW CLUSTER SETTING sql.stats.assoociate_stmt_with_txn_fingerprint_enabled;
ERROR: unknown setting: "sql.stats.assoociate_stmt_with_txn_fingerprint_enabled"
root@34.75.1.14:26000/aci> SHOW SETTING sql.stats.assoociate_stmt_with_txn_fingerprint_enabled;
invalid syntax: statement ignored: at or near "setting": syntax error
SQLSTATE: 42601
DETAIL: source SQL:
SHOW SETTING sql.stats.assoociate_stmt_with_txn_fingerprint_enabled
     ^
HINT: try \h SHOW
root@34.75.1.14:26000/aci> show sql.stats.associate_stmt_with_txn_fingerprint.enabled;
invalid syntax: statement ignored: at or near "sql": syntax error
SQLSTATE: 42601
DETAIL: source SQL:
show sql.stats.associate_stmt_with_txn_fingerprint.enabled
     ^
HINT: try \h SHOW
root@34.75.1.14:26000/aci> show associate_stmt_with_txn_fingerprint.enabled;
ERROR: unrecognized configuration parameter "associate_stmt_with_txn_fingerprint.enabled"
SQLSTATE: 42704
root@34.75.1.14:26000/aci> SHOW CLUSTER SETTING sql.stats.assoociate_stmt_with_txn_fingerprint_enabled;
ERROR: unknown setting: "sql.stats.assoociate_stmt_with_txn_fingerprint_enabled"

@xinhaoz
Copy link
Member

xinhaoz commented Jun 28, 2022

An update on this:
The issue was improper statement stats recording for stmts in txns that encounter txnUpgradeToExplicit, which would appear in situations where you create an explicit txn in a batch.
Releasing a fix this week.

craig bot pushed a commit that referenced this issue Jul 6, 2022
83616: sql, sqlstats: create temporary stats container for all txns r=xinhaoz a=xinhaoz

# Commit 1
### sql: add txn_fingerprint_id to node_statement_statistics

This commit adds the `txn_fingerprint_id` column to
`crdb_internal.node_statement_statistics`.

Release note (sql): `txn_fingerprin_id` has been added to
`crdb_internal.node_statement_statistics`. The type of the
column is NULL or STRING.

------------------------------
# Commit 2
### sql, sqlstats: create temporary stats container for all txns

Fixes: #81470

Previously, the stats collector followed different procedures for stats
collection depending on whether or not the txn was explicit.

For explicit transactions, all the stmts in the txn must be recorded with
the same `transactionFingerprintID`, which is  only known after all stmts
in the txn have been executed. In order to record the correct
txnFingerprintID, a temporary stats container was created for stmts in the
current transaction. The `transactionFingerprintID` was then populated for
all stmts in the temp container, and the temp container was merged with
the parent.

For implict transactions, the assumption was there would only be a single
stmt in the txn, and so no temporary container was created, with stmts
being written directly to the application stats.

This assumption was incorrect, as it is possible for implicit txns to have
multiple stmts, such as stmts sent in a batch.

This commit ensures that stats are properly collected for implicit txns
with multiple stmts. The stats collector now follows the same procedure
for both explicit and implicit txns, creating a temporary container for
local txn stmts and merging on txn finish.

Release note (bug fix): Statement and transaction stats are now properly
recorded for implicit txns with multiple stmts.

83762: docs: add MVCC range tombstones tech note r=sumeerbhola,jbowens,nicktrav a=erikgrinaker

[Rendered version](https://github.com/erikgrinaker/cockroach/blob/mvcc-range-tombstones-tech-note/docs/tech-notes/mvcc-range-tombstones.md)

---

This describes the current state, but the details will likely change as
we iterate on the implementation.

Resolves #83406.

Release note: None

83873: tenantsettingswatcher: remove the version gate r=yuzefovich a=yuzefovich

This commit removes the version gate for the tenant settings.

Release note: None

83902: server: remove TLS cert data retrieval over HTTP r=catj-cockroach a=knz

Back in CockroachDB v1.1 (v17.2 in the new calver scheme), we
introduced a certificate rotation mechanism. To help
teach/troubleshoot that feature, we also provided a way for the
operator to view the certificate details in the DB Console (expiration
time, addresses, etc.)

This work was done in PR #16087, to solve issues #15027/#1674.

However, as part of that PR, the implementation of the back-end API
also included the *data* of the cert (including the cert signature
and the signature chain) in the response payload.

This additional payload was never used in a user-facing feature: the
DB Console does not display it nor does it contain a link to "download
the cert file". The back-end API is not public either, so we are not
expecting end-users to have legitimate uses for this feature.

Meanwhile, leaking cert data through an API runs dangerously close
to violating PCI guidelines (not quite, since keys are not exposed,
but still...).

So in order to avoid a remark on this during PCI review cycles, and to
remove the chance this will be misused, this patch removes the
data payload from the cert response.

The DB Console screen corresponding to the original work remains
unaffected.

For reference here's how the console screen looks: 
![image](https://user-images.githubusercontent.com/642886/177591040-f554fdf0-2b04-48f6-af36-0b94c0bcaf4c.png)


Co-authored-by: Xin Hao Zhang <[email protected]>
Co-authored-by: Erik Grinaker <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Raphael 'kena' Poss <[email protected]>
@craig craig bot closed this as completed in 607026a Jul 6, 2022
xinhaoz added a commit to xinhaoz/cockroach that referenced this issue Jul 7, 2022
Fixes: cockroachdb#81470

Previously, the stats collector followed different procedures for stats
collection depending on whether or not the txn was explicit.

For explicit transactions, all the stmts in the txn must be recorded with
the same `transactionFingerprintID`, which is  only known after all stmts
in the txn have been executed. In order to record the correct
txnFingerprintID, a temporary stats container was created for stmts in the
current transaction. The `transactionFingerprintID` was then populated for
all stmts in the temp container, and the temp container was merged with
the parent.

For implict transactions, the assumption was there would only be a single
stmt in the txn, and so no temporary container was created, with stmts
being written directly to the application stats.

This assumption was incorrect, as it is possible for implicit txns to have
multiple stmts, such as stmts sent in a batch.

This commit ensures that stats are properly collected for implicit txns
with multiple stmts. The stats collector now follows the same procedure
for both explicit and implicit txns, creating a temporary container for
local txn stmts and merging on txn finish.

Release note (bug fix): Statement and transaction stats are now properly
recorded for implicit txns with multiple stmts.
xinhaoz added a commit to xinhaoz/cockroach that referenced this issue Jul 7, 2022
Fixes: cockroachdb#81470

Previously, the stats collector followed different procedures for stats
collection depending on whether or not the txn was explicit.

For explicit transactions, all the stmts in the txn must be recorded with
the same `transactionFingerprintID`, which is  only known after all stmts
in the txn have been executed. In order to record the correct
txnFingerprintID, a temporary stats container was created for stmts in the
current transaction. The `transactionFingerprintID` was then populated for
all stmts in the temp container, and the temp container was merged with
the parent.

For implict transactions, the assumption was there would only be a single
stmt in the txn, and so no temporary container was created, with stmts
being written directly to the application stats.

This assumption was incorrect, as it is possible for implicit txns to have
multiple stmts, such as stmts sent in a batch.

This commit ensures that stats are properly collected for implicit txns
with multiple stmts. The stats collector now follows the same procedure
for both explicit and implicit txns, creating a temporary container for
local txn stmts and merging on txn finish.

Release note (bug fix): Statement and transaction stats are now properly
recorded for implicit txns with multiple stmts.
xinhaoz added a commit to xinhaoz/cockroach that referenced this issue Jul 7, 2022
Fixes: cockroachdb#81470

Previously, the stats collector followed different procedures for stats
collection depending on whether or not the txn was explicit.

For explicit transactions, all the stmts in the txn must be recorded with
the same `transactionFingerprintID`, which is  only known after all stmts
in the txn have been executed. In order to record the correct
txnFingerprintID, a temporary stats container was created for stmts in the
current transaction. The `transactionFingerprintID` was then populated for
all stmts in the temp container, and the temp container was merged with
the parent.

For implict transactions, the assumption was there would only be a single
stmt in the txn, and so no temporary container was created, with stmts
being written directly to the application stats.

This assumption was incorrect, as it is possible for implicit txns to have
multiple stmts, such as stmts sent in a batch.

This commit ensures that stats are properly collected for implicit txns
with multiple stmts. The stats collector now follows the same procedure
for both explicit and implicit txns, creating a temporary container for
local txn stmts and merging on txn finish.

Release note (bug fix): Statement and transaction stats are now properly
recorded for implicit txns with multiple stmts.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-console-general SQL Observability issues on the DB console spanning multiple areas. Includes Cockroach Cloud Console C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants