sql: validating FK constraint uses extremely slow hash join #33452
Labels
A-schema-changes
C-performance
Perf of queries or internals. Solution not expected to change functional behavior.
Currently `VALIDATE CONSTRAINT` for foreign keys performs a hash join on the source and target columns, which can be extremely slow. `SCRUB` has a much faster implementation of FK validation that uses a merge join: I ran the `SCRUB` FK validation on a tpcc-100 table on a roachprod cluster, and it only took a few seconds, compared to >1 hour (before I killed it) for the query that would be used in `VALIDATE CONSTRAINT`. We should try to reimplement `VALIDATE CONSTRAINT` to use the same approach as `SCRUB`.

There's no straightforward way to do this via SQL because null values need to be handled correctly, which requires an `ON` clause that contains a bunch of `a=b OR (a IS NULL AND b IS NULL)` clauses (as in the current implementation, https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/check.go#L115-L121). The way `SCRUB` gets around this is to create a plan that uses a merge join, and then have a separate step to set `NullEquality` to true on all `MergeJoinerSpec`s in the plan.

This will partially address #32118, but the unrelated issue of long transactions restarting still needs to be fixed.