Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved polyglot Date support #3559

Merged
merged 67 commits into from
Jul 21, 2022

Conversation

JaroslavTulach
Copy link
Member

@JaroslavTulach JaroslavTulach commented Jul 1, 2022

Pull Request Description

Significantly improves the polyglot Date support (as introduced by #3374). It enhances the Date_Spec to run it in four flavors:

  • with Enso Date (as of now)
  • with JavaScript Date
  • with JavaScript Date wrapped in (JavaScript) array
  • with Java LocalDate allocated directly

The code is then improved by necessary modifications to make the Date_Spec pass.

Important Notes

James has requested in #181755990 - e.g. Review and improve InMemory Table support for Dates, Times, DateTimes, BigIntegers the following program to work:

foreign js dateArr = """
    return [1, new Date(), 7]

main =
    IO.println <| (dateArr.at 1).week_of_year

the program works with here in provided changes and prints 27 as of today.

@jdunkerley has provided tests for proper behavior of date in Table and Column. Those tests are working as of f16d07e. One just needs to accept List<Value> and then query Value for isDate() when needed.

Last round of changes is related to exception handling. 8b686b1 makes sure makePolyglotError accepts only polyglot values. Then it wraps plain Java exceptions into WrapPlainException with has_type method - 60da5e7 - the remaining changes in the PR are only trying to get all tests working in the new setup.

The support for Time isn't part of this PR yet.

Checklist

Please include the following checklist in your PR:

  • [ x ] The documentation has been updated if necessary.
  • [ x ] All code conforms to the Java style guide
  • All code has been tested:
    • [ x ] Unit tests have been written where possible.

@jdunkerley jdunkerley force-pushed the wip/jtulach/PolyglotDateSupport_181755990 branch from b764759 to 4c98949 Compare July 7, 2022 07:30
@jdunkerley
Copy link
Member

Generally this works really well.

I added some tests for Table.

The one part that doesn't work is when we send an Enso Data to Java it is represented as a PolyglotMap
image

This means we don't use DateFormatter and end up calling back to Enso.
Ideally when we pass the object into Java it would be a LocalDate rather than a PolyglotMap

@JaroslavTulach JaroslavTulach self-assigned this Jul 11, 2022
@JaroslavTulach JaroslavTulach force-pushed the wip/jtulach/PolyglotDateSupport_181755990 branch from 111a003 to 69320d4 Compare July 11, 2022 16:32
@JaroslavTulach
Copy link
Member Author

JaroslavTulach commented Jul 11, 2022

I added some tests for Table.

Great. Thanks for the use-case.

The one part that doesn't work is when we send an Enso Data to Java it is represented as a PolyglotMap

I've got that part working in JaroslavTulach@a4edf9f however that is unlikely the preferred solution - it requires one to know the type of the Column ahead of time and request conversion to the array of right objects.

This means we don't use DateFormatter and end up calling back to Enso. Ideally when we pass the object into Java it would be a LocalDate rather than a PolyglotMap

The problem is that Column.fromItems uses List<Object> argument - GraalVM properly converts the Enso Array to List and notes that elements of that list should be java.lang.Object - however they already are! E.g. when one requests list.get(i) there is no conversion requested and Atom is returned as Atom wrapped into PolyglotMap.

My workaround patch fixes that by having two conversion methods. The other one requests List<LocalDate> - then the conversion happens.

Solution exists, but it is very ugly. Maybe we are not doing the conversions correctly! Also most of the current List conversions is hidden behind @TruffleBoundary at HostToTypeNode.asJavaObject - e.g. it is magnitudes slower than it could be.

To get a solution quickly, I believe we shall explore e9b0800 - e.g. obtain Value representing the array and use it i Java code to query its type and convert it into any format we need. Opinions @kustosz, @hubertp, @jdunkerley?

@JaroslavTulach
Copy link
Member Author

JaroslavTulach commented Jul 12, 2022

Without e9b0800 I am getting following output when I execute the test:

enso$ enso --run test/Table_Tests/src/Table_Date_Spec.enso --in-project test/Table_Tests/
File.read (Delimited) should work with Dates:
    - should be able to read in a table with dates
    - should be able to treat a single value as a Date
    - should be able to compare columns and table
Should be able to serialise a table with Dates to Text:
    - should serialise back to input
    - [FAILED] should serialise dates with format
        Reason: 'From\n1979-05-04\n1990-11-28\n1997-05-02\n2007-06-27\n2010-05-11\n2016-07-13\n2019-07-24\n' did not equal 'From\n04.05.1979\n28.11.1990\n02.05.1997\n27.06.2007\n11.05.2010\n13.07.2016\n24.07.2019\n' (at enso/test/Table_Tests/src/Table_Date_Spec.enso:51:13-48).

with the e9b0800 change the should serialise dates with format error goes away:

enso$ enso --run test/Table_Tests/src/Table_Date_Spec.enso --in-project test/Table_Tests/
Listening for transport dt_socket at address: 8000
File.read (Delimited) should work with Dates:
    - should be able to read in a table with dates
    - should be able to treat a single value as a Date
    - [FAILED] should be able to compare columns and table
        Reason: [1979-05-04, 1990-11-28, 1997-05-02, 2007-06-27, 2010-05-11, 2016-07-13, 2019-07-24] did not equal [1979-05-04, 1990-11-28, 1997-05-02, 2007-06-27, 2010-05-11, 2016-07-13, 2019-07-24] (at enso/test/Table_Tests/src/Util.enso:16:5-50).
Should be able to serialise a table with Dates to Text:
    - should serialise back to input
    - should serialise dates with format

however another problem appears should be able to compare columns and table - e.g. the solution has some collateral damage which has to be avoided - done in f16d07e

Copy link
Contributor

@kustosz kustosz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a step backwards, away from proper date support: now every single date object will get wrapped eagerly, instead of being wrapped on demand. Bad for performance, bad for semantics (since now we always mangle polyglot dates even if no such need exists), I can't see a single thing this is actually good for. Also, still only focused on dates, meaning we'll get 4x the mess when trying to support the whole set of time-like types. The right way to do this is to follow the lead of Text – convert on-demand, and expose methods normally, like for any other builtin.

Which leads me to another point: I was told by the Truffle team that Env.asGuestValue is discouraged and poised for deprecation. Therefore introducing new uses into the codebase should be motivated by a tangible difficulty of implementing another solution. And with Hubert's changes to builtin objects, doing it right is trivial.

@ExportMessage
boolean isMemberInvocable(String member, @CachedLibrary(limit="1") InteropLibrary delegate) {
if (
"has_type".equals(member) ||
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd expect a piece of documentation which describes how has_type works (unless I missed it) and how its semantics differ from Java.is_instance. As otherwise, I don't understand why the change is made and what it means.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we have discussed with @JaroslavTulach, there seems to be no good place to document this has_type method, because it is added essentially to any Java type extending Throwable.
But IIRC only if the Throwable is caught, ones created with new may not necessarily have it - @JaroslavTulach is that the case? That's how I currently understand the wrapping mechanism - that it happens only at 'catch-site' - but I may be wrong. Thus, that may too be worth clarifying.

Thus, there is no Enso supertype common for exceptions that we could attach the doc to.

Because of that, we have decided the best course of action may be to add a has_type (or a similar name) method to the type Polyglot_Error cause Atom which would then delegate to the underlying wrapped exception's has_type implementation. Then the base has_type extension will not be directly documented, but all of its behaviour can be explained in the Polyglot_Error.has_type "facade".

Does that sound good?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because it is added essentially to any Java type extending Throwable

No, not really. One has to construct Polyglot_Error atom via Error.makePolyglotError - then the first argument is AbstractTruffleException instance which responds to has_type invokeMember message.

there is no Enso supertype common for exceptions that we could attach the doc to.

No, there is not any common super type. The first argument of Polyglot_Error is of type Dynamic.

Polyglot_Error.has_type sounds OKish.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, will it simplify anything? If there is:

internal_pattern = maybe_java_pattern.map_error case _ of
            Polyglot_Error err ->
                if err.has_type PatternSyntaxException . not then err else
                    Regex.Syntax_Error err.getMessage
            other -> other

how shall one invoke Polyglot_Error.has_type instance method?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you'd need

internal_pattern = maybe_java_pattern.map_error error-> case error of
            Polyglot_Error cause ->
                if error.has_type PatternSyntaxException . not then error else
                    Regex.Syntax_Error cause.getMessage
            other -> other

Copy link
Member Author

@JaroslavTulach JaroslavTulach Jul 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the end I got a better idea: ec0b51d - there is Any.is_a, so let's reuse that method! It may not be absolutely perfect, but it avoids defining new method name and supports following usecase:

  polyglot java import PatternSyntaxException
  # ...
  if error.is_a PatternSyntaxException....

Polemic: Looking at Meta.enso I see:

is_a : Any -> Any -> Boolean
is_a value typ = if typ == Any then True else
    if is_error value then typ == Base.Error else
        case value of
            # ....
            Base.Polyglot -> typ == Base.Polyglot

my goal with this is_a implementation is to reuse getMetaObject. In case we get a guest object representing Java host object then getMetaObject will return the guest object representing Java class - e.g. exactly the one we get by doing polyglot java import PatternSyntaxException. Then following code seems natural:

  polyglot java import PatternSyntaxException
  # ...
  if error.is_a PatternSyntaxException....

Copy link
Collaborator

@hubertp hubertp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • requests made by James and Radek

Co-authored-by: Hubert Plociniczak <[email protected]>
@JaroslavTulach JaroslavTulach force-pushed the wip/jtulach/PolyglotDateSupport_181755990 branch from 2c1e85c to 9fbcd1c Compare July 19, 2022 07:15
Copy link
Contributor

@kustosz kustosz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's all right. One thing that should be noted is that we should get rid of Time_Utils ASAP – migrate everything to builtins instead. We can do it in a separate issue tho.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI: Ready to merge This PR is eligible for automatic merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants