Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support primary key alternate syntax #7160

Merged
merged 5 commits into from
Aug 23, 2023

Conversation

parkma99
Copy link
Contributor

@parkma99 parkma99 commented Jul 31, 2023

Which issue does this PR close?

Closes #7152

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Yes , add a test case in datafusion/sql/tests/sql_integration.rs

Are there any user-facing changes?

@parkma99 parkma99 changed the title support primary key aother syntax feat: support primary key aother syntax Jul 31, 2023
@jonahgao
Copy link
Member

jonahgao commented Aug 1, 2023

Perhaps we should consider scenarios where primary keys are defined multiple times.

For example:

psql=> create table foo(
        a int primary key, 
        b int, 
        c int, 
        primary key(b,c)
);

ERROR:  multiple primary keys for table "foo" are not allowed
LINE 5:         primary key(b,c)

It will fail in PostgreSQL.

@parkma99 parkma99 marked this pull request as draft August 1, 2023 03:00
@parkma99
Copy link
Contributor Author

parkma99 commented Aug 1, 2023

Thanks @jonahgao , I will update this scenarios recently.

@mustafasrepo
Copy link
Contributor

mustafasrepo commented Aug 1, 2023

Perhaps we should consider scenarios where primary keys are defined multiple times.

For example:

psql=> create table foo(
        a int primary key, 
        b int, 
        c int, 
        primary key(b,c)
);

ERROR:  multiple primary keys for table "foo" are not allowed
LINE 5:         primary key(b,c)

It will fail in PostgreSQL.

I think we may choose not to fail, for these cases as long as both a and (b, c) satisfies primary key constraint.

@parkma99 parkma99 marked this pull request as ready for review August 1, 2023 13:12
@github-actions github-actions bot added the sql SQL Planner label Aug 1, 2023
@parkma99
Copy link
Contributor Author

parkma99 commented Aug 1, 2023

cc @mustafasrepo @jonahgao @alamb

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 looks good to me -- thank you @parkma99 (for this and all your other contributions)

I had one suggestion about adding a "not implemented" error (and maybe some tests that cover the unsupported syntax) but since this PR is better than what is on master I think it could also be merged as is

is_primary,
})
}
// TODO (parkma99) handle ForeignKey etc.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add code here that returns a Err(DataFusionError::NotImplemented) for other types of constraints -- I think this code will silently ignore them

Maybe it could be something like

match option.option {
  ast::ColumnOption::Unique { is_primary } => ... 
  option => return Err(DataFusionError::NotImplemented(format!("Constraint {} not implemented", option))),
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, given that the current code will also ignore such errors I don't think this change is required

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should not return Error because of without any constraints is ok.


#[test]
fn plan_create_table_with_multi_pk() {
let sql = "create table person (id int, name string primary key, primary key(id))";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure this means a multi part primary key -- postgres errors on this syntax

postgres=# create table person (id int, name varchar primary key, primary key(id));
ERROR:  multiple primary keys for table "person" are not allowed
LINE 1: ...e table person (id int, name varchar primary key, primary ke...

I think the way this is supposed to be expressed is like

create table person (id int, name varchar,  primary key(name,  id));

Copy link
Contributor

@mustafasrepo mustafasrepo Aug 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure this means a multi part primary key -- postgres errors on this syntax

postgres=# create table person (id int, name varchar primary key, primary key(id));
ERROR:  multiple primary keys for table "person" are not allowed
LINE 1: ...e table person (id int, name varchar primary key, primary ke...

I think the way this is supposed to be expressed is like

create table person (id int, name varchar,  primary key(name,  id));

I guess it is better to follow postgres in this case. Postgres supports multiple unique constraints. Hence following query

create table person (id int, name string primary key, primary key(id))

can be written as

create table person (id int, name string unique not null, primary key(id))

to define unique constraint. For writing composite primary key one have to use following syntax

create table person (id int, name varchar,  primary key(name,  id));

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@parkma99 is this comment still outstanding?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I am at home this week.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no worries and no pressure! I was just trying to understand what the state of this PR was -- it looks like you are aware of the feedback and just haven't had a chance to implement it. I wanted to make sure you weren't waiting on additional feedback

I'll mark it as draft so it doesn't appear on the list of PRs needing review

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think two primary keys on the same column doesn't really make sense, but we can always make this check more strict as a follow on PR I think

.trim();
quick_test(sql, plan);

let sql = "create table person (id int primary key, name string)";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about adding unique constraints as well

create table person (id int unique, name string)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I add a test case about unique constraints.

@alamb alamb changed the title feat: support primary key aother syntax feat: support primary key alternate syntax Aug 1, 2023
@alamb
Copy link
Contributor

alamb commented Aug 8, 2023

Marking as draft per #7160 (comment)

@alamb alamb marked this pull request as draft August 8, 2023 20:26
@parkma99 parkma99 marked this pull request as ready for review August 19, 2023 14:45
@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Aug 19, 2023
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR looks good to me now; What do you think @mustafasrepo ?


#[test]
fn plan_create_table_with_multi_pk() {
let sql = "create table person (id int, name string primary key, primary key(id))";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think two primary keys on the same column doesn't really make sense, but we can always make this check more strict as a follow on PR I think

@alamb alamb merged commit 727e8c6 into apache:main Aug 23, 2023
@mustafasrepo
Copy link
Contributor

Sorry for the late response. I was at vacation, and didn't have time to look. Thanks @parkma99 for this PR. Thanks @alamb for the review.

@parkma99 parkma99 deleted the feat/support_primary_key branch August 31, 2023 09:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sql SQL Planner sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for primary key parsing.
4 participants