Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expanding Derived Columns and Expression Syntax #3782

Merged
merged 59 commits into from
Nov 8, 2022

Conversation

jdunkerley
Copy link
Member

@jdunkerley jdunkerley commented Oct 7, 2022

Pull Request Description

  • Added expression ANTLR4 grammar and sbt based build.
  • Added expression support to set and filter on the Database and InMemory Table.
  • Added expression support to aggregate on the Database and InMemory Table.
  • Removed old aggregate functions (sum, max, min and mean) from Column types.
  • Adjusted database Column + operator to do concatenation (||) when text types.
  • Added power operator ^ to both Column types.
  • Adjust iif to allow for columns to be passed for when_true and when_false parameters.
  • Added is_present to database Column type.
  • Added coalesce, min and max functions to both Column types performing row based operation.
  • Added support for Date, Time_Of_Day and Date_Time constants in database.
  • Added read method to InMemory Column returning self (or a slice).

Important Notes

  • Moved approximate type computation to SQL_Type.
  • Fixed issue in LongNumericOp where it was always casting to a double.
  • Removed head from InMemory Table (still has first method).

Checklist

Please include the following checklist in your PR:

  • The documentation has been updated if necessary.
  • All code conforms to the
    Scala,
    Java,
    and
    Rust
    style guides.
  • All code has been tested:
    • Unit tests have been written where possible.
    • If GUI codebase was changed: Enso GUI was tested when built using BOTH
      ./run ide build and ./run ide watch.

@jdunkerley jdunkerley force-pushed the wip/jd/expression-idea branch 3 times, most recently from dd1eecb to 4b52532 Compare October 11, 2022 12:10
@jdunkerley jdunkerley force-pushed the wip/jd/expression-idea branch 2 times, most recently from af9f9aa to d4fcc24 Compare October 18, 2022 14:40
@jdunkerley jdunkerley force-pushed the wip/jd/expression-idea branch 5 times, most recently from 8bca32f to ea997e4 Compare October 31, 2022 09:45
@jdunkerley jdunkerley changed the title Prototyping derived expressions... Expanding Derived Columns and Expression Syntax Nov 1, 2022
@jdunkerley jdunkerley force-pushed the wip/jd/expression-idea branch 2 times, most recently from 63cef2c to 06fdc78 Compare November 4, 2022 11:18
Comment on lines +871 to +877
_ : Text ->
table_at = self.at column
if table_at.is_error.not then self.filter table_at filter on_problems else
expression = self.evaluate column
if expression.is_error.not then self.filter expression filter on_problems else
pick_error = expression.catch Expression_Error.Syntax_Error (_->table_at)
on_problems.handle_errors pick_error fallback=self
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems exactly the same as in DB, right? I'm wondering if we should extract it to some Table_Helpers to DRY. Not entirely sure though.

std-bits/table/src/main/antlr4/Expression.g4 Outdated Show resolved Hide resolved
std-bits/table/src/main/antlr4/Expression.g4 Outdated Show resolved Hide resolved
std-bits/table/src/main/antlr4/Expression.g4 Show resolved Hide resolved
@jdunkerley jdunkerley force-pushed the wip/jd/expression-idea branch from 9c07170 to cc37a0e Compare November 4, 2022 16:36
@jdunkerley jdunkerley marked this pull request as ready for review November 4, 2022 17:35
test/Table_Tests/src/Expression_Spec.enso Outdated Show resolved Hide resolved
test/Table_Tests/src/Expression_Spec.enso Outdated Show resolved Hide resolved
test/Table_Tests/src/Expression_Spec.enso Outdated Show resolved Hide resolved
test/Table_Tests/src/Expression_Spec.enso Show resolved Hide resolved
test/Table_Tests/src/Expression_Spec.enso Show resolved Hide resolved
test/Table_Tests/src/Expression_Spec.enso Show resolved Hide resolved
Copy link
Member

@radeusgd radeusgd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The results are really amazing! 🎉

In general it looks good to me, I have just some style comments.
I'm thinking if it would be worth to add some more comments explaining how ExpressionVisitorImpl is meant to work - I think such documentation will be helpful in maintaining it in the future.

Comment on lines +114 to +117
return metaObject != null && metaObject.asHostObject() instanceof Class<?>
? makeConstantColumn.apply(value)
: value;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if I understand this condition - I understand that if an object comes from Java, we pass it over to makeConstantColumn, but otherwise we just pass it as-is. Why is that so? The name wrapAsColumn is not clear enough to me to say what should be expected here.
Can we add some comment to this function explaining what it is meant to do?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After looking forward at the code I think I now understand what is the meaning, but now I'm afraid of this logic.
It will make a constant column from any Java value but pass Enso values as-is. I guess for now it is OK, but if in the future we allow somehow in the expression to take hold of a singleton Enso value, this part of the logic will break - we'd expect it to wrap the value to a column, but it would be passed as-is instead.

Can we change the logic to detect the column? Maybe we can look at the MetaObject and see if it matches the moduleName (when Jaroslav's changes with MetaObjects for Atoms are integrated).

If not, let's at least add a comment warning that we may need to be careful in case of any changes.

@jdunkerley jdunkerley force-pushed the wip/jd/expression-idea branch from 773371d to f40dc73 Compare November 6, 2022 17:36
Copy link
Member

@JaroslavTulach JaroslavTulach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I understand the reasons why this DSL is being created, I am sad to see the Enso language itself was not found capable enough to provide similar (and better) user experience. My hope is that it is just a temporary state and the Enso language improves to provide good enough support to turn this DSL into an internal DSL.

If you share a similar hope, then it'd make sense to package this DSL as a separate library that extends the Standard.Table one. That would allow for a smoother replacement with an internal DSL in the future.

expression_test "[B] / [A]" [1, 0.75, 0.8333333333333334, 1, 1.2]
expression_test "[A] ^ [B]" [1, 2.8284271247461903, 15.588457268119896, 256, 15625]
expression_test "[A] % [B]" [0, 0.5, 0.5, 0, 5]
expression_test "[A] + [B] + [A]" [3, 5.5, 8.5, 12, 16]
Copy link
Member

@JaroslavTulach JaroslavTulach Nov 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ten years ago I had a DSL vs. API design shootout with a friend at a conference. We had to write our notes down to even understand each other. Given our classification, what we have here is an external DSL with its own parser embedded in an Enso string.

A string based DSL approach will increase the complexity of our tooling. We will not be able to benefit from Enso type system, use Enso suggestions DB, etc. We will have to find ways for the IDE to understand this DSL syntax.

Given the sole purpose of Enso was to work effectively with tables, this feels as a side step. A recognition that Enso isn't good enough at manipulating tables yet. I sincerely hope that over the time Enso improves and provides good enough framework for dealing with tables as well.

Ideally we should create an internal DSL, reuse the Enso parser, intermix the DSL in between other Enso constructs. Make sure the Enso type system straightens the DSL and the DSL is 1st class citizen in the Enso code and can be naturally manipulated by the IDE.

@jdunkerley jdunkerley force-pushed the wip/jd/expression-idea branch from e623bc0 to 5db850f Compare November 7, 2022 14:35
@jdunkerley jdunkerley force-pushed the wip/jd/expression-idea branch from 014a4a5 to 448e514 Compare November 8, 2022 08:56
@jdunkerley jdunkerley added the CI: Ready to merge This PR is eligible for automatic merge label Nov 8, 2022
@JaroslavTulach JaroslavTulach self-requested a review November 8, 2022 15:57
@mergify mergify bot merged commit 45276b2 into develop Nov 8, 2022
@mergify mergify bot deleted the wip/jd/expression-idea branch November 8, 2022 15:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI: Ready to merge This PR is eligible for automatic merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants