roachtest: support tpc-c under read committed #100176

Closed
nvanbenschoten opened this issue Mar 30, 2023 · 2 comments · Fixed by #113834
Labels
A-read-committed: Related to the introduction of Read Committed
A-testing: Testing tools and infrastructure
C-enhancement: Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-sql-queries: SQL Queries Team

Comments


nvanbenschoten commented Mar 30, 2023

As part of the Read Committed implementation, nightly testing of TPC-C will be adapted to run at the Read Committed isolation level. When doing so, transaction retry loops will be removed from the workload to validate that retry errors are rare or non-existent under weak isolation.
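
For context, a client-side retry loop of the kind described above typically re-runs a transaction when it fails with SQLSTATE `40001` (serialization failure). The following is a minimal sketch of that pattern, not the workload's actual code: `sqlError` and `runWithRetry` are hypothetical names, and a real driver exposes the SQLSTATE through its own error type.

```go
package main

import (
	"errors"
	"fmt"
)

// sqlError is a hypothetical stand-in for a driver error that carries a
// SQLSTATE code. Real drivers expose this through their own types.
type sqlError struct{ code string }

func (e *sqlError) Error() string { return "SQLSTATE " + e.code }

// serializationFailure is the SQLSTATE used for transaction retry errors.
const serializationFailure = "40001"

// runWithRetry re-runs txn while it fails with a serialization failure, up to
// maxAttempts times. It returns the number of attempts made and the last error.
func runWithRetry(maxAttempts int, txn func() error) (int, error) {
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = txn()
		var se *sqlError
		if err == nil || !errors.As(err, &se) || se.code != serializationFailure {
			// Success, or an error that retrying will not fix.
			return attempt, err
		}
	}
	return maxAttempts, err
}

func main() {
	calls := 0
	attempts, err := runWithRetry(3, func() error {
		calls++
		if calls == 1 {
			// Simulate one retry error before succeeding.
			return &sqlError{code: serializationFailure}
		}
		return nil
	})
	fmt.Println(attempts, err)
}
```

Removing such a loop means the first `40001` error surfaces directly to the workload, which is what lets the test observe how often retry errors occur.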

TPC-C provides a useful sandbox to test weaker isolation levels because it contains three moderately complex read-write transactions, two read-only transactions, a diverse schema with referential integrity constraints, and twelve post-workload consistency checks.

Jira issue: CRDB-26569

Epic CRDB-26548

@nvanbenschoten nvanbenschoten added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-testing Testing tools and infrastructure A-read-committed Related to the introduction of Read Committed labels Mar 30, 2023
@exalate-issue-sync exalate-issue-sync bot added the T-sql-queries SQL Queries Team label Apr 5, 2023
@mgartner mgartner moved this to 23.2 Release in SQL Queries Jul 24, 2023
nvanbenschoten commented

When doing this, we will also need to ensure that the transaction procedures can tolerate weak isolation. Compared to benchbase, the one place where it looks like we'll need to add explicit row-level locking is:

diff --git a/pkg/workload/tpcc/new_order.go b/pkg/workload/tpcc/new_order.go
index 5152a2841a6..1ed02a5199b 100644
--- a/pkg/workload/tpcc/new_order.go
+++ b/pkg/workload/tpcc/new_order.go
@@ -302,7 +302,8 @@ func (n *newOrder) run(ctx context.Context, wID int) (interface{}, error) {
                                        SELECT s_quantity, s_ytd, s_order_cnt, s_remote_cnt, s_data, s_dist_%02[1]d
                                        FROM stock
                                        WHERE (s_i_id, s_w_id) IN (%[2]s)
-                                       ORDER BY s_i_id`,
+                                       ORDER BY s_i_id
+                                       FOR UPDATE`,
                                        d.dID, strings.Join(stockIDs, ", "),
                                ),
                        )

It would be interesting to see whether the consistency checks at the end of the workload would catch this.


nvanbenschoten commented Oct 7, 2023

They do, nice! `workload check tpcc --expensive-checks` fails on checks 3.3.2.10 and 3.3.2.12 under Read Committed but not under Serializable. Adding just that FOR UPDATE doesn't fix the issue, though, so something else is going wrong.

The checks also don't fail under Repeatable Read.

nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Nov 4, 2023
Informs cockroachdb#100176.

This commit adds an `--isolation-level` flag to tpcc, which controls the
isolation level to run the workload transactions under. If unset, the
workload will run with the default isolation level of the database.

Release note: None
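
One way such a flag could be wired up is sketched below. This is an assumption-laden illustration, not the commit's implementation: it uses the standard library `flag` package, `isolationStmt` is a hypothetical helper, the accepted spellings are guesses, and applying the level through the `default_transaction_isolation` session variable is only one possible mechanism.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// isolationStmt returns the SQL statement to run on each new connection for
// the requested level. An empty level yields "", meaning the database default
// is left in place. The accepted spellings are an assumption of this sketch.
func isolationStmt(level string) (string, error) {
	if level == "" {
		return "", nil
	}
	// Accept "read-committed", "read_committed", and "read committed".
	norm := strings.ToLower(strings.NewReplacer("-", " ", "_", " ").Replace(level))
	switch norm {
	case "read committed", "repeatable read", "serializable":
		return fmt.Sprintf("SET default_transaction_isolation = '%s'", norm), nil
	default:
		return "", fmt.Errorf("unknown isolation level %q", level)
	}
}

func main() {
	level := flag.String("isolation-level", "",
		"isolation level for workload transactions (empty: database default)")
	flag.Parse()

	stmt, err := isolationStmt(*level)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if stmt == "" {
		fmt.Println("using the database default isolation level")
		return
	}
	fmt.Println(stmt)
}
```

Leaving the flag unset produces no statement at all, which matches the described behavior of falling back to the database's default isolation level.
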
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Nov 5, 2023
Informs cockroachdb#100176.

This commit adds SELECT FOR UPDATE locking in two places to ensure that
the workload avoids anomalies when run under Read Committed isolation.

The first of these is in the NewOrder transaction, when querying the
"stock" table in preparation for updating quantities and order counts
for the items in an order. There are no consistency checks which fail
without this, but the locking is present in benchbase (https://github.com/cmu-db/benchbase/blob/546afa60dae4f8a6b00b84b77c77ff7684e494ad/src/main/java/com/oltpbenchmark/benchmarks/tpcc/procedures/NewOrder.java#L88)
and makes sense to do.

The second of these is in the Delivery transaction, when querying the
"new_order" table to select an order to deliver. The order selected is
processed by the transaction, including updating counters in the
corresponding "customer" row, so it's important to have full isolation.
Without this, consistency checks `3.3.2.10` and `3.3.2.12` (`workload
check tpcc --expensive-checks`) do fail, presumably because a customer's
row is updated twice for a single order.

This use of SELECT FOR UPDATE in the Delivery transaction is an
alternative to a patch like 36709df, which would probably be more
efficient than the approach we have here, but would not exercise the
database in an interesting way. We opt to use SELECT FOR UPDATE.

Release note: None
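
The presumed Delivery anomaly can be illustrated with a deterministic toy model. This is a sketch under stated assumptions, not the workload or a real database: `store`, `pickOldest`, `deliver`, and `runPair` are hypothetical names, the per-order counter stands in for the "customer" row updates, and the locked case simply serializes the two transactions to mimic the second reader waiting on the first one's row lock.

```go
package main

import "fmt"

// store is a toy model of the tables involved: newOrders stands in for the
// "new_order" table and deliveries counts customer-side updates per order.
type store struct {
	newOrders  []int       // pending order IDs, oldest first
	deliveries map[int]int // customer updates applied per order ID
}

func newStore(orders ...int) *store {
	return &store{newOrders: append([]int(nil), orders...), deliveries: map[int]int{}}
}

// pickOldest returns the oldest pending order without removing it.
func (s *store) pickOldest() (int, bool) {
	if len(s.newOrders) == 0 {
		return 0, false
	}
	return s.newOrders[0], true
}

// deliver removes the order if it is still pending and then records a
// customer update regardless, like an UPDATE that runs after a DELETE
// that matched no rows.
func (s *store) deliver(id int) {
	for i, o := range s.newOrders {
		if o == id {
			s.newOrders = append(s.newOrders[:i], s.newOrders[i+1:]...)
			break
		}
	}
	s.deliveries[id]++
}

// runPair runs two Delivery transactions. Without locking, both read the
// oldest order before either writes. With locking, the second read happens
// only after the first transaction has finished.
func runPair(s *store, lock bool) {
	if lock {
		for i := 0; i < 2; i++ {
			if id, ok := s.pickOldest(); ok {
				s.deliver(id)
			}
		}
		return
	}
	id1, ok1 := s.pickOldest()
	id2, ok2 := s.pickOldest()
	if ok1 {
		s.deliver(id1)
	}
	if ok2 {
		s.deliver(id2)
	}
}

func main() {
	unlocked := newStore(1, 2)
	runPair(unlocked, false)
	fmt.Println("without locking:", unlocked.deliveries, "pending:", unlocked.newOrders)

	locked := newStore(1, 2)
	runPair(locked, true)
	fmt.Println("with locking:", locked.deliveries, "pending:", locked.newOrders)
}
```

In the unlocked run both transactions pick order 1, so its counter is bumped twice while order 2 stays pending. In the locked run each order is delivered exactly once.
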
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Nov 5, 2023
Closes cockroachdb#100176.

This commit adds the following two roachtest variants:
```
tpcc-nowait/isolation-level=read-committed/nodes=3/w=1
tpcc/headroom/isolation-level=read-committed/n4cpu16
```

It also ensures that the `tpcc-nowait` tests run the full set of expensive
consistency checks at the end. The "nowait" variant runs a more heavily
contended version of tpcc, but with few warehouses, so the checks should
still be fast.

Release note: None
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Nov 15, 2023
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Nov 15, 2023
craig bot pushed a commit that referenced this issue Nov 15, 2023
113719: kvstreamer: adjust recently added tracing r=yuzefovich a=yuzefovich

This commit makes some minor adjustments to the recently added tracing in the streamer:
- `singleRangeBatch.String()` now has a more sane behavior when it contains many requests (previously, we would truncate the requests but would keep everything else, now we include the full information only about the first 5 and the last 5 "sub-requests")
- that method also no longer includes `r.reqsKeys` because this field is redundant with `r.reqs` and is likely to be empty anyway
- the "exit" message in `GetResults` now specifies the number of results and the error if present
- redundant "incomplete Get" and "incomplete Scan" messages are removed (they add very little additional information - the number of incomplete Gets is already printed elsewhere, plus the KV layer already specifies whether each Get / Scan request resulted in a "resume span" meaning it was incomplete).

Epic: None

Release note: None

113834: workload/tpcc: support Read Committed isolation r=nvanbenschoten a=nvanbenschoten

Closes #100176.

This PR consists of a series of commits which together add support for Read Committed isolation to the TPC-C workload and then use it to add new roachtest variants.

See individual commits, including an interesting change to explicit row-level locking in TPC-C transactions to avoid concurrency anomalies.

Release note: None

Co-authored-by: Yahor Yuzefovich
Co-authored-by: Nathan VanBenschoten
@craig craig bot closed this as completed in b9bdc43 Nov 15, 2023
@github-project-automation github-project-automation bot moved this from 23.2 Release to Done in SQL Queries Nov 15, 2023
blathers-crl bot pushed a commit that referenced this issue Jan 12, 2024
blathers-crl bot pushed a commit that referenced this issue Jan 12, 2024
nvanbenschoten added a commit that referenced this issue Jan 17, 2024