Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for CHECK constraint in INSERT statement in engine and SPI #14964

Merged
merged 2 commits into from
Jan 25, 2023

Conversation

ebyhr
Copy link
Member

@ebyhr ebyhr commented Nov 9, 2022

Description

Add support for CHECK constraint in INSERT statement to engine and SPI.

Release notes

(x) Release notes are required, with the following suggested text:

# General
* Add support for `CHECK` constraints in `INSERT` statement. ({issue}`14964`)

@ebyhr
Copy link
Member Author

ebyhr commented Nov 9, 2022

CI hit #14441

@ebyhr ebyhr requested review from findepi and martint November 10, 2022 01:49
@ebyhr ebyhr changed the title Add support for CHECK constraint in INSERT statement Add support for CHECK constraint in INSERT statement in engine and SPI Nov 10, 2022
@@ -21,14 +21,17 @@
{
private final SchemaTableName table;
private final List<ColumnSchema> columns;
private final List<String> checkConstraints;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this here? Why?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To use from StatementAnalyzer. I would add a new method in Metadata if it's better than here. Or are you suggesting moving to the under column?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels like not belonging here at all. I'd like to understand this more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels like it needs to stay here (or we would have to remove almost all usages of getTableSchema and migrate them to getTableMetadata)

.matches("SELECT BIGINT '1'");

assertThatThrownBy(() -> assertions.query("INSERT INTO mock.tiny.nation VALUES (26, 'POLAND', 0, 'No comment')"))
.hasMessage("Cannot insert row that does not match to a check constraint: (\"nationkey\" > CAST(100 AS bigint))");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also have planner tests that assert actual plans?

@martint @kasiafi would it be a good idea here?

@electrum
Copy link
Member

Do we want to enforce that the check expression does not contain subqueries?

@ebyhr ebyhr force-pushed the ebi/core-check-constraint branch from 61cc472 to 1666575 Compare November 17, 2022 08:48
@ebyhr
Copy link
Member Author

ebyhr commented Nov 17, 2022

I think so. Changed to deny an expression having subqueries.

@@ -21,14 +21,17 @@
{
private final SchemaTableName table;
private final List<ColumnSchema> columns;
private final List<String> checkConstraints;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels like it needs to stay here (or we would have to remove almost all usages of getTableSchema and migrate them to getTableMetadata)

}
for (String checkConstraint : checkConstraints) {
ViewExpression filter = new ViewExpression(session.getIdentity().getUser(), Optional.of(name.getCatalogName()), Optional.of(name.getSchemaName()), checkConstraint);
Expression expression = analyzeRowFilter(session.getIdentity().getUser(), table, name, accessControlScope, filter);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

analyzeRowFilter technically does more or less what we want -- it takes a String expression and produces an Expression.
However, the method name and checks done in that method (e.g. analysis.hasRowFilter(name, currentIdentity) or analysis.registerTableForRowFiltering(name, currentIdentity)) are not applicable to check constraints.

Let's rollback changes to analyzeRowFilter.
Let's extract common code for converting connector-provided String into Expression (with due verifications).

// Predicate evaluates to UNKNOWN (e.g. NULL > 100) should not violate check constraint
new CoalesceExpression(predicate, TRUE_LITERAL),
TRUE_LITERAL,
new Cast(failFunction(metadata, session, CONSTRAINT_VIOLATION, "Cannot insert row that does not match to a check constraint: " + predicate), toSqlType(BOOLEAN)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Check constraint violation: " + predicate ?

what does eg PostgreSQL print?
what does MySQL print?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the error message.

PostgreSQL

tpch=# create table test (c1 int check (c1 > 0));
tpch=# insert into test values (-1);
ERROR:  new row for relation "test" violates check constraint "test_c1_check"
DETAIL:  Failing row contains (-1).

MySQL

mysql> create table test (c1 int, check (c1 > 0));
mysql> insert into test values (-1);
ERROR 3819 (HY000): Check constraint 'test_chk_1' is violated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like ours better, since it mentions the check formula. (we don't have a check name, and maybe we don't need one for error reporting)
I don't think we could do this for row filters, but i think we can do this for check constraints.


public RelationPlan addRowFilters(List<Expression> filters, Table node, RelationPlan plan, Function<Expression, Expression> predicateTransformation, Function<Table, Scope> accessControlScope)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for inserts/update/merge, the name addRowConstraints would be better.
Row filters become "row constraints", that's fine.
(the opposite, calling "check constraints" a "row filter" doesn't sound right)

@@ -210,6 +210,11 @@
</item>
<!-- Backwards incompatible changes since the previous release -->
<!-- Any exclusions below can be deleted after each release -->
<item>
<code>java.method.numberOfParametersChanged</code>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we simply keep the old constructor as deprecated?

@ebyhr ebyhr force-pushed the ebi/core-check-constraint branch from 1666575 to f361888 Compare November 18, 2022 04:57
@ebyhr
Copy link
Member Author

ebyhr commented Nov 18, 2022

CI core/trino-main hit #12627. Going to push an empty commit.

// Predicate evaluates to UNKNOWN (e.g. NULL > 100) should not violate check constraint
new CoalesceExpression(predicate, TRUE_LITERAL),
TRUE_LITERAL,
new Cast(failFunction(metadata, session, CONSTRAINT_VIOLATION, "Cannot insert row that does not match to a check constraint: " + predicate), toSqlType(BOOLEAN)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like ours better, since it mentions the check formula. (we don't have a check name, and maybe we don't need one for error reporting)
I don't think we could do this for row filters, but i think we can do this for check constraints.

@@ -23,6 +23,12 @@
private final List<ColumnSchema> columns;
private final List<String> checkConstraints;

@Deprecated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make the new one experimental and remove deprecation. For now. We will add deprecation when the new one is no longer experimental.

@ebyhr ebyhr force-pushed the ebi/core-check-constraint branch 2 times, most recently from a43263c to d5193c1 Compare November 21, 2022 02:38
@ebyhr
Copy link
Member Author

ebyhr commented Nov 21, 2022

Addressed comments and added tests for other statements, UPDATE, MERGE and DELETE.

@ebyhr ebyhr force-pushed the ebi/core-check-constraint branch from d11a4d6 to 850f38c Compare December 6, 2022 07:28
@ebyhr ebyhr force-pushed the ebi/core-check-constraint branch from 850f38c to f68c4a9 Compare December 14, 2022 02:47
@ebyhr
Copy link
Member Author

ebyhr commented Dec 14, 2022

Rebased on upstream to resolve conflicts.

}

private void analyzeFiltersAndMasks(Table table, QualifiedObjectName name, Optional<TableHandle> tableHandle, RelationType relationType, String authorization)
private void analyzeFiltersAndMasks(Table table, QualifiedObjectName name, Optional<TableHandle> tableHandle, RelationType relationType, String authorization, List<String> checkConstraints)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was checkConstraints added to this method? They don't belong here. This method is specifically about masks and filters on a source table, and check constraints are none of that. They don't even apply to a source table, only to the target of an insert or update.

catch (ParsingException e) {
throw new TrinoException(INVALID_ROW_FILTER, extractLocation(table), format("Invalid row filter for '%s': %s", name, e.getErrorMessage()), e);
}
Expression expression = parseRowConstraintExpression(table, name, filter, INVALID_ROW_FILTER, "Invalid row filter");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert this. This is spurious reuse of functionality. We don't need to add an abstraction for a "row constraint expression". It's a bit of a stretch to call row filters and check constraints "constraint expressions"


analysis.recordSubqueries(expression, expressionAnalysis);

private void addCoercionForRowConstraint(Table table, QualifiedObjectName name, Expression expression, ExpressionAnalysis expressionAnalysis, String constraintType)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's nothing special about a "row constraint" in this method. If anything, it should be called coerceToBoolean, but it also feels like spurious reuse of code.

@@ -251,26 +251,30 @@ protected RelationPlan visitTable(Table node, Void context)
}
}

plan = addRowFilters(node, plan);
plan = addRowConstraints(node, plan);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned above, row filters and check constraints are two difference concepts. Let's not mix them up.

@ebyhr
Copy link
Member Author

ebyhr commented Dec 15, 2022

Fixing CI failures.

@ebyhr ebyhr force-pushed the ebi/core-check-constraint branch from e6f2f22 to a390038 Compare December 21, 2022 02:25
@ebyhr
Copy link
Member Author

ebyhr commented Dec 21, 2022

@martint Thanks for your review. Addressed comments.

@ebyhr
Copy link
Member Author

ebyhr commented Jan 4, 2023

Rebased on upstream to resolve conflicts.

@ebyhr
Copy link
Member Author

ebyhr commented Jan 4, 2023

@martint Gentle reminder.

1 similar comment
@ebyhr
Copy link
Member Author

ebyhr commented Jan 9, 2023

@martint Gentle reminder.

@ebyhr ebyhr force-pushed the ebi/core-check-constraint branch from 58e4c54 to 7daee6c Compare January 12, 2023 03:39
@ebyhr
Copy link
Member Author

ebyhr commented Jan 12, 2023

Rebased on upstream to resolve conflicts.

@ebyhr ebyhr force-pushed the ebi/core-check-constraint branch from 7daee6c to 9e28882 Compare January 12, 2023 05:36
public List<Expression> getRowFilters(Table node)
{
return rowFilters.getOrDefault(NodeRef.of(node), ImmutableList.of());
}

public List<Expression> getCheckConstraints(Table node)
{
return checkConstraints.getOrDefault(NodeRef.of(node), ImmutableList.of());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrap in unmodifiableList

Comment on lines +3200 to +3202
// TODO https://github.com/trinodb/trino/issues/15411 Add support for CHECK constraint to UPDATE statement
throw semanticException(NOT_SUPPORTED, update, "Updating a table with a check constraint is not supported");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

* List of constraints data in a table is expected to satisfy.
* Engine ensures rows written to a table meet these constraints.
* A check constraint is satisfied when it evaluates to True or Unknown.
* @return List of string representation of a Trino SQL scalar expression that can refer to table columns by name and produces a result coercible to boolean
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intellij formatter would put an empty line before @return line. Would yours too?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

* List of constraints data in a table is expected to satisfy.
* Engine ensures rows written to a table meet these constraints.
* A check constraint is satisfied when it evaluates to True or Unknown.
* @return List of string representation of a Trino SQL scalar expression that can refer to table columns by name and produces a result coercible to boolean
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intellij formatter would put an empty line before @return line. Would yours too?

throw new TrinoException(TYPE_MISMATCH, extractLocation(table), format("Expected check constraint for '%s' to be of type BOOLEAN, but was %s", name, actualType), null);
}

analysis.addCoercion(expression, BOOLEAN, coercion.isTypeOnlyCoercion(actualType, BOOLEAN));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this coercion getting applied?
(see #15744)

throw semanticException(EXPRESSION_NOT_IN_DISTINCT, expression, "Check constraint expression should be deterministic");
}
if (containsCurrentTimeFunctions(expression)) {
throw semanticException(EXPRESSION_NOT_IN_DISTINCT, expression, "Check constraint expression should not contain CURRENT_DATE, CURRENT_TIME, CURRENT_TIMESTAMP, LOCALTIME or LOCALTIMESTAMP");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"should not contain temporal expression"
or "should not be time-dependent"

}

// Ensure that the expression doesn't contain non-deterministic functions. This should be "retrospectively deterministic" per SQL standard.
if (!DeterminismEvaluator.isDeterministic(expression, this::getResolvedFunction)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can import isDeterministic statically

@ebyhr ebyhr force-pushed the ebi/core-check-constraint branch from 9e28882 to 8f35174 Compare January 18, 2023 23:48
@ebyhr
Copy link
Member Author

ebyhr commented Jan 19, 2023

trino-redis hit #13779

@ebyhr
Copy link
Member Author

ebyhr commented Jan 19, 2023

I will merge this PR tomorrow if there's no objection.

@@ -2193,17 +2211,8 @@ private void checkStorageTableNotRedirected(QualifiedObjectName source)
});
}

private void analyzeFiltersAndMasks(Table table, QualifiedObjectName name, Optional<TableHandle> tableHandle, List<Field> fields, String authorization)
private void analyzeFiltersAndMasks(Table table, QualifiedObjectName name, RelationType relationType, Scope accessControlScope)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that makes sense. Please add that short explanation to the commit message so it doesn't get lost.

expression = sqlParser.createExpression(filter.getExpression(), createParsingOptions(session));
}
catch (ParsingException e) {
throw new TrinoException(INVALID_ROW_FILTER, extractLocation(table), format("Invalid check constraint for '%s': %s", name, e.getErrorMessage()), e);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the wrong error code. Check constraints are not row filters.

}

// Ensure that the expression doesn't contain non-deterministic functions. This should be "retrospectively deterministic" per SQL standard.
if (!isDeterministic(expression, this::getResolvedFunction)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EXPRESSION_NOT_IN_DISTINCT is the wrong error code for this.

@@ -4646,6 +4676,62 @@ private void analyzeRowFilter(String currentIdentity, Table table, QualifiedObje
analysis.addRowFilter(table, expression);
}

private void analyzeCheckConstraint(Table table, QualifiedObjectName name, Scope scope, ViewExpression filter)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using filter for the argument name is a little misleading. Rename it to constraint

@@ -291,6 +291,29 @@ public RelationPlan addRowFilters(Table node, RelationPlan plan, Function<Expres
return new RelationPlan(planBuilder.getRoot(), plan.getScope(), plan.getFieldMappings(), outerContext);
}

public RelationPlan addCheckConstraints(List<Expression> filters, Table node, RelationPlan plan, Function<Expression, Expression> predicateTransformation, Function<Table, Scope> accessControlScope)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's "predicate transformation"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a function inserting an expression to fail when the constraint is violated. Do you have any suggestions for the new variable name?

    private static Function<Expression, Expression> failIfCheckConstraintIsNotMet(Metadata metadata, Session session)
    {
        return predicate -> new IfExpression(
                // When predicate evaluates to UNKNOWN (e.g. NULL > 100), it should not violate the check constraint.
                new CoalesceExpression(predicate, TRUE_LITERAL),
                TRUE_LITERAL,
                new Cast(failFunction(metadata, session, CONSTRAINT_VIOLATION, "Check constraint violation: " + predicate), toSqlType(BOOLEAN)));
    }

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that abstraction is even necessary. Why not inline that logic directly in addCheckConstraints?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I.e., just modify addCheckConstraints to do this:

public RelationPlan addCheckConstraints(List<Expression> constraints, Table node, RelationPlan plan, Function<Table, Scope> accessControlScope)
{
    ...
    for (Expression constraint : constraints) {
        ...

        Expression predicate = new IfExpression(
                // When predicate evaluates to UNKNOWN (e.g. NULL > 100), it should not violate the check constraint.
                new CoalesceExpression(coerceIfNecessary(analysis, constraint, planBuilder.rewrite(constraint)), TRUE_LITERAL),
                TRUE_LITERAL,
                new Cast(failFunction(plannerContext.getMetadata(), session, CONSTRAINT_VIOLATION, "Check constraint violation: " + predicate), toSqlType(BOOLEAN)));

        planBuilder = planBuilder.withNewRoot(new FilterNode(
                idAllocator.getNextId(),
                planBuilder.getRoot(),
                predicate));
    }

    ...
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martint Agree with inline. Updated.

@@ -291,6 +291,29 @@ public RelationPlan addRowFilters(Table node, RelationPlan plan, Function<Expres
return new RelationPlan(planBuilder.getRoot(), plan.getScope(), plan.getFieldMappings(), outerContext);
}

public RelationPlan addCheckConstraints(List<Expression> filters, Table node, RelationPlan plan, Function<Expression, Expression> predicateTransformation, Function<Table, Scope> accessControlScope)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly as above, the name filters is misleading. Rename it to constraints.

@Test
public void testInsert()
{
assertions.query("INSERT INTO mock.tiny.nation VALUES (101, 'POLAND', 0, 'No comment')")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertThat(assertions.query(...))
    .matches(...)


assertThatThrownBy(() -> assertions.query("INSERT INTO mock.tiny.nation VALUES (26, 'POLAND', 11, 'No comment')"))
.hasMessage("Check constraint violation: (\"regionkey\" < CAST(10 AS bigint))");
assertThatThrownBy(() -> assertions.query("INSERT INTO mock.tiny.nation VALUES "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use text blocks

public void testMergeInsert()
{
// Within allowed row filter
assertThatThrownBy(() -> assertions.query("""
Copy link
Member

@martint martint Jan 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Format as:

       assertThatThrownBy(() -> assertions.query("""
                MERGE INTO mock.tiny.nation 
                USING (VALUES 42) t(dummy) 
                ON false
                WHEN NOT MATCHED THEN 
                    INSERT VALUES (101, 'POLAND', 0, 'No comment')
                """))
                .hasMessage("line 1:1: Cannot merge into a table with check constraints");

@ebyhr ebyhr force-pushed the ebi/core-check-constraint branch from 8f35174 to 20319f5 Compare January 19, 2023 23:06
@ebyhr
Copy link
Member Author

ebyhr commented Jan 19, 2023

@martint Addressed comments.

@ebyhr ebyhr force-pushed the ebi/core-check-constraint branch from 20319f5 to aa09214 Compare January 20, 2023 22:13
@ebyhr ebyhr requested a review from martint January 24, 2023 03:37
{
// Predicate evaluates to UNKNOWN (e.g. NULL > 100) should not violate check constraint
assertions.query("INSERT INTO mock.tiny.nation(nationkey) VALUES (null)")
.assertThat()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertThat(assertions.query(...), here and everywhere else in this file

The same Scope instance should be used between row filters and
check constraints in StatementAnalyzer. Otherwise, PlanSanityChecker
throws "Unexpected identifier in logical plan" error.
@ebyhr ebyhr force-pushed the ebi/core-check-constraint branch from aa09214 to cf39155 Compare January 24, 2023 23:15
@ebyhr ebyhr merged commit d2003fc into trinodb:master Jan 25, 2023
@ebyhr ebyhr deleted the ebi/core-check-constraint branch January 25, 2023 02:21
@ebyhr ebyhr mentioned this pull request Jan 25, 2023
@github-actions github-actions bot added this to the 406 milestone Jan 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

6 participants