Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ui: transaction time reporting extremely large number #82894

Closed
THardy98 opened this issue Jun 14, 2022 · 2 comments · Fixed by #83590
Closed

ui: transaction time reporting extremely large number #82894

THardy98 opened this issue Jun 14, 2022 · 2 comments · Fixed by #83590
Assignees
Labels
A-sql-observability Related to observability of the SQL layer C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.

Comments

@THardy98
Copy link

THardy98 commented Jun 14, 2022

The console has been seen to report extremely large numbers for a transaction's time on the transactions tab page. Here is an example:

transactions_aggregate_statement_time

This was seen looking at the transaction times on one of Twitter’s clusters. Auto-commit for transactions was enabled, and these transactions are implicit transactions. The cluster is question was self-hosted, running v22.1.0.

Jira issue: CRDB-16714

@THardy98 THardy98 added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-sql-observability Related to observability of the SQL layer T-sql-observability labels Jun 14, 2022
@rafiss
Copy link
Collaborator

rafiss commented Jun 22, 2022

I have a way to reproduce the large UI time.

  1. Start a fresh v22.1.2 cluster. (I used single-node insecure.)
  2. Run the following program.
package main

import (
        "context"
        "fmt"
        // Import driver
        "github.com/jackc/pgx/v4"
)

func main() {
    ctx := context.Background()

    c, err := pgx.Connect(ctx, "postgresql://root@localhost:26257/defaultdb?sslmode=disable")
    fmt.Println(err)

    var i int
    err = c.QueryRow(ctx, "SELECT $1::INT8", 1).Scan(&i)
    fmt.Println(i, err)

}
  1. Go to the Transactions page: http://localhost:8081/#/sql-activity?tab=Transactions

image


It seems specific to the extended protocol. I changed the last bit to

    var i int
    err = c.QueryRow(ctx, "SELECT 1::INT8",  pgx.QuerySimpleProtocol(true)).Scan(&i)
    fmt.Println(i, err)

and could not reproduce.


I wonder if after #76792 went in, it’s no longer correct to unconditionally call statsCollector.Reset during each phase of the extended protocol (e.g. prepare and bind). All of these phases normally (but not always) happen in the same implicit transaction

@rafiss
Copy link
Collaborator

rafiss commented Jun 23, 2022

After discussion on Slack, I think we want this fix:

diff --git a/pkg/sql/conn_executor_exec.go b/pkg/sql/conn_executor_exec.go
index be845b3231..34d3af3992 100644
--- a/pkg/sql/conn_executor_exec.go
+++ b/pkg/sql/conn_executor_exec.go
@@ -2168,8 +2168,7 @@ func (ex *connExecutor) recordTransactionStart(txnID uuid.UUID) {

     // Transaction received time is the time at which the statement that prompted
     // the creation of this transaction was received.
-    ex.phaseTimes.SetSessionPhaseTime(sessionphase.SessionTransactionReceived,
-        ex.phaseTimes.GetSessionPhaseTime(sessionphase.SessionQueryReceived))
+    ex.phaseTimes.SetSessionPhaseTime(sessionphase.SessionTransactionReceived, txnStart)
     ex.phaseTimes.SetSessionPhaseTime(sessionphase.SessionFirstStartExecTransaction, timeutil.Now())
     ex.phaseTimes.SetSessionPhaseTime(sessionphase.SessionMostRecentStartExecTransaction,
         ex.phaseTimes.GetSessionPhaseTime(sessionphase.SessionFirstStartExecTransaction))

The reason is that now, the implicit transaction starts before SessionQueryReceived is populated. Luckily, we already had the txnStart time available here, and this is the correct time to use anyway.

THardy98 pushed a commit to THardy98/cockroach that referenced this issue Jun 29, 2022
Resolves: cockroachdb#82894

Due to a change from cockroachdb#76792, implicit transactions can start before
`SessionQueryReceived` session phase time is set by the sqlstats system.
In turn, this caused the `SessionTransactionReceived` session phase time
to be recorded incorrectly, causing extremely large transactions times
on the UI. This change fixes this mistake by setting the actual
transaction start time as the `SessionTransactionReceived` session phase
time, instead of `SessionQueryReceived`.

Release note (bug fix): The `SessionTransactionReceived` session phase
time is no longer recorded incorrectly, fixing large transaction times
from appearing on the UI.
THardy98 pushed a commit to THardy98/cockroach that referenced this issue Jun 30, 2022
Resolves: cockroachdb#82894

Due to a change from cockroachdb#76792, implicit transactions can start before
`SessionQueryReceived` session phase time is set by the sqlstats system.
In turn, this caused the `SessionTransactionReceived` (now renamed as
`SessionTransactionStarted`) session phase time to be recorded
incorrectly, causing extremely large transactions times on the UI. This
change fixes this mistake by setting the actual transaction start time
as the `SessionTransactionStarted` session phase time, instead of
`SessionQueryReceived`.

Release note (bug fix): The `SessionTransactionReceived` session phase
time is no longer recorded incorrectly, fixing large transaction times
from appearing on the UI, also renamed to `SessionTransactionStarted`.
THardy98 pushed a commit to THardy98/cockroach that referenced this issue Jun 30, 2022
Resolves: cockroachdb#82894

Due to a change from cockroachdb#76792, implicit transactions can start before
`SessionQueryReceived` session phase time is set by the sqlstats system.
In turn, this caused the `SessionTransactionReceived` (now renamed as
`SessionTransactionStarted`) session phase time to be recorded
incorrectly, causing extremely large transactions times on the UI. This
change fixes this mistake by setting the actual transaction start time
as the `SessionTransactionStarted` session phase time, instead of
`SessionQueryReceived`.

Release note (bug fix): The `SessionTransactionReceived` session phase
time is no longer recorded incorrectly, fixing large transaction times
from appearing on the UI, also renamed to `SessionTransactionStarted`.
THardy98 pushed a commit to THardy98/cockroach that referenced this issue Jul 6, 2022
Resolves: cockroachdb#82894

Due to a change from cockroachdb#76792, implicit transactions can start before
`SessionQueryReceived` session phase time is set by the sqlstats system.
In turn, this caused the `SessionTransactionReceived` (now renamed as
`SessionTransactionStarted`) session phase time to be recorded
incorrectly, causing extremely large transactions times on the UI. This
change fixes this mistake by setting the actual transaction start time
as the `SessionTransactionStarted` session phase time, instead of
`SessionQueryReceived`.

Release note (bug fix): The `SessionTransactionReceived` session phase
time is no longer recorded incorrectly, fixing large transaction times
from appearing on the UI, also renamed to `SessionTransactionStarted`.
craig bot pushed a commit that referenced this issue Jul 7, 2022
83123: sql/stats: conversion of datums to and from quantile function values r=yuzefovich,rytaft,mgartner a=michae2

To predict histograms in statistics forecasts, we will use linear
regression over quantile functions. (Quantile functions are another
representation of histogram data, in a form more amenable to statistical
manipulation.)

The conversion of histograms to quantile functions will require
conversion of histogram bounds (datums) to quantile values (float64s).
And likewise, the inverse conversion from quantile functions back to
histograms will require the inverse conversion of float64 quantile
values back to datums. These conversions are a little different from our
usual SQL conversions in `eval.PerformCast`, so we add them to a new
quantile file in the `sql/stats` module.

This code was originally part of #77070 but has been pulled out to
simplify that PR. A few changes have been made:
- `histogramValue` has been renamed to `FromQuantileValue`.
- Support for `DECIMAL`, `TIME`, `TIMETZ`, and `INTERVAL` has been
  dropped. Clamping these types in `FromQuantileValue` was too complex
  for the first iteration of statistics forecasting. We expect the
  overwhelming majority of ascending keys to use `INT` or `TIMESTAMP`
  types.
- Bugs in `FLOAT4`, `TIMESTAMP` and `TIMESTAMPTZ` conversions have been
  fixed.
- We're now clamping timestamps to slightly tighter bounds to avoid the
  problems with infinite timestamps (see #41564).

Assists: #79872

Release note: None

83590: sql: fix and rename sql stats session transaction received time r=THardy98 a=THardy98

Resolves: #82894

Due to a change from #76792, implicit transactions can start before
`SessionQueryReceived` session phase time is set by the sqlstats system.
In turn, this caused the `SessionTransactionReceived` (now renamed as
`SessionTransactionStarted`) session phase time to be recorded
incorrectly, causing extremely large transactions times on the UI. This
change fixes this mistake by setting the actual transaction start time
as the `SessionTransactionStarted` session phase time, instead of
`SessionQueryReceived`.

Release note (bug fix): The `SessionTransactionReceived` session phase
time is no longer recorded incorrectly, fixing large transaction times
from appearing on the UI, also renamed to `SessionTransactionStarted`.

83943: outliers: protocol buffer adjustments r=matthewtodd a=matthewtodd

This handful of commits slightly re-shapes the protocol buffer messages we use for tracking outliers, making some of the work in #81021 a little easier to express.

83961: cmd/dev: add generate execgen subcommand r=ajwerner a=yuzefovich

Release note: None

83975: streamingccl: unskip TestTenantStreaming r=miretskiy a=stevendanna

This makes a couple of changes aimed at unskipping
TestTenantStreaming:

- Fix a nil pointer access in our stream status verification
  function. We changed the name of key that this function was
  accessing. This NPE was hidden by another panic along the unclean
  shutdown path in the server.

- Lower various intervals so that this test doesn't take 90 seconds.

I've run this under stress for a few hundred iterations without error.

Release note: None

83979: streamingccl: small logging and tracing cleanups r=miretskiy a=stevendanna

- Add a tracing span to ingestion job cutover.
- Move a particularly noisy log message to VInfo(3).
- Prefer log.VInfo to `if log.V(n) {}` in cases where we aren't doing
  expensive argument construction.

Release note: None

Co-authored-by: Michael Erickson <[email protected]>
Co-authored-by: Thomas Hardy <[email protected]>
Co-authored-by: Matthew Todd <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Steven Danna <[email protected]>
@craig craig bot closed this as completed in f8f7459 Jul 7, 2022
blathers-crl bot pushed a commit that referenced this issue Jul 7, 2022
Resolves: #82894

Due to a change from #76792, implicit transactions can start before
`SessionQueryReceived` session phase time is set by the sqlstats system.
In turn, this caused the `SessionTransactionReceived` (now renamed as
`SessionTransactionStarted`) session phase time to be recorded
incorrectly, causing extremely large transactions times on the UI. This
change fixes this mistake by setting the actual transaction start time
as the `SessionTransactionStarted` session phase time, instead of
`SessionQueryReceived`.

Release note (bug fix): The `SessionTransactionReceived` session phase
time is no longer recorded incorrectly, fixing large transaction times
from appearing on the UI, also renamed to `SessionTransactionStarted`.
THardy98 pushed a commit to THardy98/cockroach that referenced this issue Jul 7, 2022
Resolves: cockroachdb#82894

Due to a change from cockroachdb#76792, implicit transactions can start before
`SessionQueryReceived` session phase time is set by the sqlstats system.
In turn, this caused the `SessionTransactionReceived` (now renamed as
`SessionTransactionStarted`) session phase time to be recorded
incorrectly, causing extremely large transactions times on the UI. This
change fixes this mistake by setting the actual transaction start time
as the `SessionTransactionStarted` session phase time, instead of
`SessionQueryReceived`.

Release note (bug fix): The `SessionTransactionReceived` session phase
time is no longer recorded incorrectly, fixing large transaction times
from appearing on the UI, also renamed to `SessionTransactionStarted`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-observability Related to observability of the SQL layer C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants