Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support to identify different set operators & allow chain of multiple set operators #138

Merged
merged 29 commits into from
Feb 18, 2020

Conversation

Dencrash
Copy link
Contributor

No description provided.

@Dencrash
Copy link
Contributor Author

Dencrash commented Jan 6, 2020

Thanks for the comments and spotting some of the oversights!

…erator struct, remove opt_limit and opt_order change
@mrks
Copy link
Member

mrks commented Jan 11, 2020

What's the status on this? Once the current master is merged, is this ready for a final review?

@Dencrash
Copy link
Contributor Author

@mrks that depends. There is one bug still in there that is not yet closed. The bug is not something we added but already exists for the current system. The solution for that is relatively easy, but ugly as sin and we were stalling last week to see, if we can find a better solution.

We could do the following:
Let every query have a vector of set operation instead of only allowing one linked and nested set operation. This would mean that we don't overwrite old set operations anymore when specific parenthesis are used, but it means that we mix a vector with linked lists.

This is the only thing that is currently not done.

@mrks
Copy link
Member

mrks commented Jan 13, 2020

Can't you iterate over the chained list of select statements and add the set operation to the end? Something like

$$ = $1;
while ($$->setOperator) $$ = $$->setOperator;
$$->setOperator = $2;

No idea if bison likes that though.

Also, could you add some tests with similar (multiple, different, nested, ...) set operations in the WITH part of the query?

@Dencrash
Copy link
Contributor Author

Hanging onto the event would mean that we have the wrong order of operation. We also cannot use the other Select statement because the set operation there could also be overwritten.

We can add test with WITH Statements and more nested queries

@mrks
Copy link
Member

mrks commented Jan 13, 2020

Let's discuss this on Wednesday.

@Dencrash
Copy link
Contributor Author

@mrks I have added an explanation to the evaluation order of SetOperators as a comment. From our side this should be ready for your final review

Copy link
Member

@mrks mrks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing fundamental here, just trying to gain favors with posterity.

@Dencrash
Copy link
Contributor Author

Dencrash commented Feb 17, 2020

@mrks I think that I added all the requested changes. Let me know, if you think this is ready to merge

@mrks mrks merged commit e8ce1c4 into hyrise:master Feb 18, 2020
@Daniel-Xu
Copy link

Hi, @Dencrash. From the test case, I don't see that the setOperations has multiple children. Is there any case that requires the setOperations to be a vector instead of a selectOption pointer?

@Daniel-Xu
Copy link

From the comment in the source code.
I believe (SELECT * FROM students INTERSECT SELECT * FROM students_2) UNION SELECT * FROM students_3 ORDER BY grade ASC; this is the case I was talking about ↑

@mweisgut
Copy link
Contributor

mweisgut commented Jan 7, 2022

Hi, @Dencrash. From the test case, I don't see that the setOperations has multiple children. Is there any case that requires the setOperations to be a vector instead of a selectOption pointer?

Do you mean a setOperation pointer rather than a selectOption pointer?

I think you're right. Thank's for pointing to that issue. #199 contains a patch addressing it.

@Daniel-Xu
Copy link

Hi, @Dencrash. From the test case, I don't see that the setOperations has multiple children. Is there any case that requires the setOperations to be a vector instead of a selectOption pointer?

Do you mean a setOperation pointer rather than a selectOption pointer?

I think you're right. Thank's for pointing to that issue. #199 contains a patch addressing it.

yea, selectOption was a typo

@klauck
Copy link
Contributor

klauck commented Jan 17, 2022

Hi @Daniel-Xu , hi @mweisgut,
if setOperations is only a pointer, we cannot express the order of set operations anymore,
e.g., there is no way to express the difference between (Q1 EXCEPT Q2) EXCEPT Q3 and Q1 EXCEPT (Q2 EXCEPT Q3).
Using this example: For (Q1 EXCEPT Q2) EXCEPT Q3, Q1 must have two children and setOperations must be a vector. For Q1 EXCEPT (Q2 EXCEPT Q3), Q1 and Q2 have one child each.

@klauck
Copy link
Contributor

klauck commented Jan 17, 2022

Nevertheless, in the current version Q1 EXCEPT Q2 EXCEPT Q3 == Q1 EXCEPT (Q2 EXCEPT Q3), which I think is a bug.
It should be Q1 EXCEPT Q2 EXCEPT Q3 == (Q1 EXCEPT Q2) EXCEPT Q3.

@klauck
Copy link
Contributor

klauck commented Jan 17, 2022

@mweisgut pointed out that INTERSECT binds more tightly than UNION and EXCEPT. [https://www.postgresql.org/docs/14/queries-union.html]
This is not yet implemented in the parser.

@Daniel-Xu
Copy link

Hi @Daniel-Xu , hi @mweisgut, if setOperations is only a pointer, we cannot express the order of set operations anymore, e.g., there is no way to express the difference between (Q1 EXCEPT Q2) EXCEPT Q3 and Q1 EXCEPT (Q2 EXCEPT Q3). Using this example: For (Q1 EXCEPT Q2) EXCEPT Q3, Q1 must have two children and setOperations must be a vector. For Q1 EXCEPT (Q2 EXCEPT Q3), Q1 and Q2 have one child each.

Thanks for the response, this makes sense now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants