Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: use an implicit txn during the extended protocol #76792

Merged
merged 3 commits into from
Feb 23, 2022

Conversation

rafiss
Copy link
Collaborator

@rafiss rafiss commented Feb 18, 2022

fixes #71665

In the Postgres wire protocol, entering the extended protocol also
starts an implicit transaction. This transaction is committed when the
Sync message is received.

Previously, we were starting 3 separate "implicit" transactions for the
Prepare, Bind, and ExecPortal commands. (Implicit is in scare quotes
because they were created entirely ad-hoc.)

Now, the same implicit txn is used for the duration of the extended
protocol.

If BEGIN is executed as a prepared statement, then the implicit
trasnaction is upgraded to an explicit one. This behavior matches
Postgres.

Release note (bug fix): Fixed a bug where the different stages of
preparing, binding, and executing a prepared statement would use
different implicit transactions. Now these stages all share the same
implicit transaction.

@rafiss rafiss added the do-not-merge bors won't merge a PR with this label. label Feb 18, 2022
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@rafiss rafiss force-pushed the implicit-txn-prepare branch from 915b396 to fafa04a Compare February 20, 2022 22:16
@rafiss rafiss changed the title [WIP, DNM] sql: start implicit txn for Prepare message sql: use an implicit txn during the extended protocol Feb 20, 2022
@rafiss
Copy link
Collaborator Author

rafiss commented Feb 20, 2022

Here's a test that fails before this PR, which I would have expected this PR to fix. But it still fails. The behavior of the last few lines is quite confusing -- preparing and executing the same statement with almost the same AOST timestamps gets different results. Maybe something to do with leases?

Anyway, unless someone feels strongly or has an idea for how to fix, I'll file that as a separate issue and leave it outside of this PR.

func TestResolveTypesAOST(t *testing.T) {
	srv, db, _ := serverutils.StartServer(t, base.TestServerArgs{Insecure: true})
	defer srv.Stopper().Stop(context.Background())
	defer db.Close()

	time.Sleep(2 * time.Second)

	_, err := db.Exec("CREATE TYPE typ AS ENUM('hi', 'hello')")
	require.NoError(t, err)

	var s string

	err = db.QueryRow("SELECT 'hi'::typ FROM generate_series(1,1) AS OF SYSTEM TIME '-1s'").Scan(&s)
	// Fails as expected
	require.EqualError(t, err, "pq: type \"typ\" does not exist")

	err = db.QueryRow("SELECT $1::typ FROM generate_series(1,1) AS OF SYSTEM TIME '-1s'", "hi").Scan(&s)
	// No error happens even though it should
	require.NoError(t, err)

	err = db.QueryRow("SELECT $1::typ FROM generate_series(1,1) AS OF SYSTEM TIME '-1s'", "hi").Scan(&s)
	// The error happens here though.
	require.EqualError(t, err, "pq: type \"typ\" does not exist")

	_, err = db.Exec("CREATE TABLE tab(a TEXT)")
	require.NoError(t, err)

	err = db.QueryRow("SELECT * FROM tab AS OF SYSTEM TIME '-1s'").Scan(&s)
	// No error happens even though it should
	require.NoError(t, err)

	err = db.QueryRow("SELECT * FROM tab AS OF SYSTEM TIME '-1s'").Scan(&s)
	// The error happens here though.
	require.EqualError(t, err, "pq: type \"typ\" does not exist")
}

@rafiss rafiss removed the do-not-merge bors won't merge a PR with this label. label Feb 20, 2022
@rafiss rafiss force-pushed the implicit-txn-prepare branch from fafa04a to 4deb540 Compare February 20, 2022 22:47
@rafiss rafiss force-pushed the implicit-txn-prepare branch 2 times, most recently from af5416e to 60c425a Compare February 21, 2022 22:50
@rafiss rafiss requested review from ajwerner, otan and a team February 21, 2022 22:58
@otan
Copy link
Contributor

otan commented Feb 22, 2022


	err = db.QueryRow("SELECT * FROM tab AS OF SYSTEM TIME '-1s'").Scan(&s)
	// No error happens even though it should
	require.NoError(t, err)

	err = db.QueryRow("SELECT * FROM tab AS OF SYSTEM TIME '-1s'").Scan(&s)
	// The error happens here though.
	require.EqualError(t, err, "pq: type \"typ\" does not exist") 

why isn't the error "tab does not exist?"

Copy link
Contributor

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

took a look, changes makes sense to me. will caveat i'm not 100% on the extended protocol. i trust ORM tests to break if something is off ;)

the error above is suspicious to me though.

Parse {"Name": "S_4", "Query": "INSERT INTO t VALUES(1)"}
Bind {"PreparedStatement": "S_4"}
Execute
Query {"String": "SELECT 1"}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a comment that the prepared statement is re-executed and we assume a constraint error from that? that confused me for a second.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah for sure

@rafiss
Copy link
Collaborator Author

rafiss commented Feb 22, 2022

why isn't the error "tab does not exist?"

oops my bad -- copy paste error. also, it seems that one of the other fixes I made to clean up the PR made this test (with typos fixed) pass now, so we're all good!

@rafiss
Copy link
Collaborator Author

rafiss commented Feb 22, 2022

i trust ORM tests to break if something is off ;)

I'll try running the ORM tests before merging this

In the Postgres wire protocol, entering the extended protocol also
starts an implicit transaction. This transaction is committed when the
Sync message is received.

Previously, we were starting 3 separate "implicit" transactions for the
Prepare, Bind, and ExecPortal commands. (Implicit is in scare quotes
because they were created entirely ad-hoc.)

Now, the same implicit txn is used for the duration of the extended
protocol.

If BEGIN is executed as a prepared statement, then the implicit
trasnaction is upgraded to an explicit one. This behavior matches
Postgres.

Release note (bug fix): Fixed a bug where the different stages of
preparing, binding, and executing a prepared statement would use
different implicit transactions. Now these stages all share the same
implicit transaction.
@rafiss rafiss force-pushed the implicit-txn-prepare branch from 60c425a to afb6fa2 Compare February 22, 2022 17:28
@blathers-crl blathers-crl bot requested a review from otan February 22, 2022 17:28
@rafiss rafiss force-pushed the implicit-txn-prepare branch from afb6fa2 to 312b3a7 Compare February 22, 2022 18:51
@rafiss rafiss force-pushed the implicit-txn-prepare branch from 312b3a7 to 4d18a12 Compare February 22, 2022 20:57
@rafiss
Copy link
Collaborator Author

rafiss commented Feb 23, 2022

tftr!

bors r=otan

@craig
Copy link
Contributor

craig bot commented Feb 23, 2022

Build succeeded:

THardy98 pushed a commit to THardy98/cockroach that referenced this pull request Jul 6, 2022
Resolves: cockroachdb#82894

Due to a change from cockroachdb#76792, implicit transactions can start before
`SessionQueryReceived` session phase time is set by the sqlstats system.
In turn, this caused the `SessionTransactionReceived` (now renamed as
`SessionTransactionStarted`) session phase time to be recorded
incorrectly, causing extremely large transactions times on the UI. This
change fixes this mistake by setting the actual transaction start time
as the `SessionTransactionStarted` session phase time, instead of
`SessionQueryReceived`.

Release note (bug fix): The `SessionTransactionReceived` session phase
time is no longer recorded incorrectly, fixing large transaction times
from appearing on the UI, also renamed to `SessionTransactionStarted`.
craig bot pushed a commit that referenced this pull request Jul 7, 2022
83123: sql/stats: conversion of datums to and from quantile function values r=yuzefovich,rytaft,mgartner a=michae2

To predict histograms in statistics forecasts, we will use linear
regression over quantile functions. (Quantile functions are another
representation of histogram data, in a form more amenable to statistical
manipulation.)

The conversion of histograms to quantile functions will require
conversion of histogram bounds (datums) to quantile values (float64s).
And likewise, the inverse conversion from quantile functions back to
histograms will require the inverse conversion of float64 quantile
values back to datums. These conversions are a little different from our
usual SQL conversions in `eval.PerformCast`, so we add them to a new
quantile file in the `sql/stats` module.

This code was originally part of #77070 but has been pulled out to
simplify that PR. A few changes have been made:
- `histogramValue` has been renamed to `FromQuantileValue`.
- Support for `DECIMAL`, `TIME`, `TIMETZ`, and `INTERVAL` has been
  dropped. Clamping these types in `FromQuantileValue` was too complex
  for the first iteration of statistics forecasting. We expect the
  overwhelming majority of ascending keys to use `INT` or `TIMESTAMP`
  types.
- Bugs in `FLOAT4`, `TIMESTAMP` and `TIMESTAMPTZ` conversions have been
  fixed.
- We're now clamping timestamps to slightly tighter bounds to avoid the
  problems with infinite timestamps (see #41564).

Assists: #79872

Release note: None

83590: sql: fix and rename sql stats session transaction received time r=THardy98 a=THardy98

Resolves: #82894

Due to a change from #76792, implicit transactions can start before
`SessionQueryReceived` session phase time is set by the sqlstats system.
In turn, this caused the `SessionTransactionReceived` (now renamed as
`SessionTransactionStarted`) session phase time to be recorded
incorrectly, causing extremely large transactions times on the UI. This
change fixes this mistake by setting the actual transaction start time
as the `SessionTransactionStarted` session phase time, instead of
`SessionQueryReceived`.

Release note (bug fix): The `SessionTransactionReceived` session phase
time is no longer recorded incorrectly, fixing large transaction times
from appearing on the UI, also renamed to `SessionTransactionStarted`.

83943: outliers: protocol buffer adjustments r=matthewtodd a=matthewtodd

This handful of commits slightly re-shapes the protocol buffer messages we use for tracking outliers, making some of the work in #81021 a little easier to express.

83961: cmd/dev: add generate execgen subcommand r=ajwerner a=yuzefovich

Release note: None

83975: streamingccl: unskip TestTenantStreaming r=miretskiy a=stevendanna

This makes a couple of changes aimed at unskipping
TestTenantStreaming:

- Fix a nil pointer access in our stream status verification
  function. We changed the name of key that this function was
  accessing. This NPE was hidden by another panic along the unclean
  shutdown path in the server.

- Lower various intervals so that this test doesn't take 90 seconds.

I've run this under stress for a few hundred iterations without error.

Release note: None

83979: streamingccl: small logging and tracing cleanups r=miretskiy a=stevendanna

- Add a tracing span to ingestion job cutover.
- Move a particularly noisy log message to VInfo(3).
- Prefer log.VInfo to `if log.V(n) {}` in cases where we aren't doing
  expensive argument construction.

Release note: None

Co-authored-by: Michael Erickson <[email protected]>
Co-authored-by: Thomas Hardy <[email protected]>
Co-authored-by: Matthew Todd <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Steven Danna <[email protected]>
blathers-crl bot pushed a commit that referenced this pull request Jul 7, 2022
Resolves: #82894

Due to a change from #76792, implicit transactions can start before
`SessionQueryReceived` session phase time is set by the sqlstats system.
In turn, this caused the `SessionTransactionReceived` (now renamed as
`SessionTransactionStarted`) session phase time to be recorded
incorrectly, causing extremely large transactions times on the UI. This
change fixes this mistake by setting the actual transaction start time
as the `SessionTransactionStarted` session phase time, instead of
`SessionQueryReceived`.

Release note (bug fix): The `SessionTransactionReceived` session phase
time is no longer recorded incorrectly, fixing large transaction times
from appearing on the UI, also renamed to `SessionTransactionStarted`.
THardy98 pushed a commit to THardy98/cockroach that referenced this pull request Jul 7, 2022
Resolves: cockroachdb#82894

Due to a change from cockroachdb#76792, implicit transactions can start before
`SessionQueryReceived` session phase time is set by the sqlstats system.
In turn, this caused the `SessionTransactionReceived` (now renamed as
`SessionTransactionStarted`) session phase time to be recorded
incorrectly, causing extremely large transactions times on the UI. This
change fixes this mistake by setting the actual transaction start time
as the `SessionTransactionStarted` session phase time, instead of
`SessionQueryReceived`.

Release note (bug fix): The `SessionTransactionReceived` session phase
time is no longer recorded incorrectly, fixing large transaction times
from appearing on the UI, also renamed to `SessionTransactionStarted`.
ajwerner added a commit to ajwerner/cockroach that referenced this pull request Nov 22, 2022
For legacy reasons, we were resetting the descriptor collection state in Bind
if we thought we were not in a transaction. Since cockroachdb#76792, we're always in a
transaction. You might think that'd mean that the logic would not run. Sadly,
for other still unclear reasons, when in an implicit transaction
`(*connExecutor).getTransactionState()` returns `NoTxnStateStr`. The end result
was that we'd erroneously reset our descriptor state in the middle of a multi-
statement implicit transaction if bind was invoked.

Fixes cockroachdb#82921

Release note (bug fix): Fixed a bug which could lead to errors when running
multiple schema change statements in a single command using a driver that
uses the extended pgwire protocol internally (Npgsql in .Net as an example).
These errors would have the form "attempted to update job for mutation 2, but
job already exists with mutation 1".
craig bot pushed a commit that referenced this pull request Nov 22, 2022
92300: sql: fix bug with multi-statement implicit txn schema changes and Bind r=ajwerner a=ajwerner

For legacy reasons, we were resetting the descriptor collection state in Bind if we thought we were not in a transaction. Since #76792, we're always in a transaction. You might think that'd mean that the logic would not run. Sadly, for other still unclear reasons, when in an implicit transaction `(*connExecutor).getTransactionState()` returns `NoTxnStateStr`. The end result was that we'd erroneously reset our descriptor state in the middle of a multi- statement implicit transaction if bind was invoked.

Fixes #82921

Release note (bug fix): Fixed a bug which could lead to errors when running multiple schema change statements in a single command using a driver that uses the extended pgwire protocol internally (Npgsql in .Net as an example). These errors would have the form "attempted to update job for mutation 2, but job already exists with mutation 1".

Co-authored-by: Andrew Werner <[email protected]>
blathers-crl bot pushed a commit that referenced this pull request Nov 22, 2022
For legacy reasons, we were resetting the descriptor collection state in Bind
if we thought we were not in a transaction. Since #76792, we're always in a
transaction. You might think that'd mean that the logic would not run. Sadly,
for other still unclear reasons, when in an implicit transaction
`(*connExecutor).getTransactionState()` returns `NoTxnStateStr`. The end result
was that we'd erroneously reset our descriptor state in the middle of a multi-
statement implicit transaction if bind was invoked.

Fixes #82921

Release note (bug fix): Fixed a bug which could lead to errors when running
multiple schema change statements in a single command using a driver that
uses the extended pgwire protocol internally (Npgsql in .Net as an example).
These errors would have the form "attempted to update job for mutation 2, but
job already exists with mutation 1".
blathers-crl bot pushed a commit that referenced this pull request Nov 22, 2022
For legacy reasons, we were resetting the descriptor collection state in Bind
if we thought we were not in a transaction. Since #76792, we're always in a
transaction. You might think that'd mean that the logic would not run. Sadly,
for other still unclear reasons, when in an implicit transaction
`(*connExecutor).getTransactionState()` returns `NoTxnStateStr`. The end result
was that we'd erroneously reset our descriptor state in the middle of a multi-
statement implicit transaction if bind was invoked.

Fixes #82921

Release note (bug fix): Fixed a bug which could lead to errors when running
multiple schema change statements in a single command using a driver that
uses the extended pgwire protocol internally (Npgsql in .Net as an example).
These errors would have the form "attempted to update job for mutation 2, but
job already exists with mutation 1".
ajwerner added a commit that referenced this pull request Dec 14, 2022
For legacy reasons, we were resetting the descriptor collection state in Bind
if we thought we were not in a transaction. Since #76792, we're always in a
transaction. You might think that'd mean that the logic would not run. Sadly,
for other still unclear reasons, when in an implicit transaction
`(*connExecutor).getTransactionState()` returns `NoTxnStateStr`. The end result
was that we'd erroneously reset our descriptor state in the middle of a multi-
statement implicit transaction if bind was invoked.

Fixes #82921

Release note (bug fix): Fixed a bug which could lead to errors when running
multiple schema change statements in a single command using a driver that
uses the extended pgwire protocol internally (Npgsql in .Net as an example).
These errors would have the form "attempted to update job for mutation 2, but
job already exists with mutation 1".
ajwerner added a commit that referenced this pull request Dec 14, 2022
For legacy reasons, we were resetting the descriptor collection state in Bind
if we thought we were not in a transaction. Since #76792, we're always in a
transaction. You might think that'd mean that the logic would not run. Sadly,
for other still unclear reasons, when in an implicit transaction
`(*connExecutor).getTransactionState()` returns `NoTxnStateStr`. The end result
was that we'd erroneously reset our descriptor state in the middle of a multi-
statement implicit transaction if bind was invoked.

Fixes #82921

Release note (bug fix): Fixed a bug which could lead to errors when running
multiple schema change statements in a single command using a driver that
uses the extended pgwire protocol internally (Npgsql in .Net as an example).
These errors would have the form "attempted to update job for mutation 2, but
job already exists with mutation 1".
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql/pgwire: Sync command should destroy named portals when outside of explicit txn
3 participants