Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

release-19.1: pgerror: avoid wrapping special roachpb errors #35939

Merged
merged 2 commits into from
Mar 19, 2019

Conversation

knz
Copy link
Contributor

@knz knz commented Mar 19, 2019

Backport 1/1 commits from #35937.
Backport 1/1 commits from #35944.

/cc @cockroachdb/release


Fixes #35883.
Fixes #35943

Prior to this patch, the following scenario was possible:

1. some SQL function A() calls client.Txn() to call B() on its behalf
   with a retry loop.
2. client.Txn() calls B()
3. B() runs KV operation, encounters some *roachpb.RetryError
4. B() calls `pgerror.Wrap()` on that retry error, turns it into pg
   error with code 40001
5. B() returns into client.Txn()
6. client.Txn() tests the type of the error expecting to find a
   roachpb error for retries. Because it's a pgerror now, the test
   fails and the retry error is not handled automatically. It flows
   out.
7. A() receives a retry error when it was not expecting any.

The error would incur spurious DDL failures, with the following
example:

```
--- FAIL: TestLogic/fakedist-opt/computed (2.66s)
    logic.go:2346:

        testdata/logic_test/computed:687:
        expected:
        division by zero

        got:
        pq: column "d" being dropped, try again later
```

(This particular test would fail after just a few attempts during a
stress run.)

This patch fixes this scenario by ensuring that retry and ambiguous
result errors flow through `pgerror.Wrapf` without being flattened
into a `pgerror.Error`, and instead remain wrapped as the original
cause via `errors.Wrapf`.

The patch is verified to work by stress testing the logic tests and
verifying that the various error symptoms do not occur any more.

Additionally, two missing calls to `errors.Cause()` were added to
ensure any enclosed roachpb error is visible to a type check.

Release note: None
@knz knz requested review from bobvawter and tbg March 19, 2019 13:55
@knz knz requested review from a team March 19, 2019 13:55
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@tbg
Copy link
Member

tbg commented Mar 19, 2019 via email

@knz
Copy link
Contributor Author

knz commented Mar 19, 2019

LGTM but wait until the other one has merged please.

Works for me. I assume you also want a green master build?

@tbg
Copy link
Member

tbg commented Mar 19, 2019 via email

@knz
Copy link
Contributor Author

knz commented Mar 19, 2019

No, just for it to have merged to master (there have been cases where we merged the backport before the original, and then some change had to be made to the original, etc...)

Oh I see. Well it's merged now.

@knz
Copy link
Contributor Author

knz commented Mar 19, 2019

... and we're getting another totally unrelated failure here. I'm going to check whether my patch is responsible for it.

This patch ensures that the full error message (pre-unwrap) is
included for errors that flow out of distsql process towards the
gateway.

This is important because some other internal mechanisms inside
CockroachDB are expecting to see the prefixes introduced by
errors.Wrap to decide other things (a bad idea, but not fixable in the
short term), for example see issue cockroachdb#35942.

Release note: None
@knz
Copy link
Contributor Author

knz commented Mar 19, 2019

The new failure is #35943.
I fixed it in #35944, and added that commit to this backport PR.

I tested locally that stress testing the import code does not yield the error any more.

@RaduBerinde
Copy link
Member

LGTM

@RaduBerinde RaduBerinde merged commit f4fa260 into cockroachdb:release-19.1 Mar 19, 2019
@knz knz deleted the backport19.1-35937 branch March 20, 2019 06:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants