Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Arrow builder is not an Array #9358

Merged
merged 5 commits into from
Mar 13, 2024
Merged

Conversation

hubertp
Copy link
Collaborator

@hubertp hubertp commented Mar 11, 2024

Pull Request Description

Follow up on #9150 - making sure that Arrow builder is not accidentally treated as an Array by disallowing reading elements.

Important Notes

Also making sure that the length of the resulting Arrow Array is consistent with what user requested.

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

  • All code follows the
    Scala,
    Java,
    and
    Rust
    style guides. In case you are using a language not listed above, follow the Rust style guide.
  • All code has been tested:
    • Unit tests have been written where possible.

@hubertp hubertp added the CI: No changelog needed Do not require a changelog entry for this PR. label Mar 11, 2024
@radeusgd
Copy link
Member

Shouldn't we make it non-array completely by returning false in hasArrayElements?

I think, reading the PR description, we should not match:

case arrow_builder of
    _ : Array -> "still an array"
    _ -> "not an array"

and as far as I remember, if we respond hasArrayElements=true, then it will match such a case (I could be wrong here, anyway it feels like it may be worth adding such a test just to understand what the behaviour is supposed to be there).

@hubertp
Copy link
Collaborator Author

hubertp commented Mar 11, 2024

Shouldn't we make it non-array completely by returning false in hasArrayElements?

I think, reading the PR description, we should not match:

case arrow_builder of
    _ : Array -> "still an array"
    _ -> "not an array"

and as far as I remember, if we respond hasArrayElements=true, then it will match such a case (I could be wrong here, anyway it feels like it may be worth adding such a test just to understand what the behaviour is supposed to be there).

Depends. I would also have to manually enable all the related methods, e.g. length. Catch-22. Or as @JaroslavTulach put it, cheating ;P

@radeusgd
Copy link
Member

Shouldn't we make it non-array completely by returning false in hasArrayElements?
I think, reading the PR description, we should not match:

case arrow_builder of
    _ : Array -> "still an array"
    _ -> "not an array"

and as far as I remember, if we respond hasArrayElements=true, then it will match such a case (I could be wrong here, anyway it feels like it may be worth adding such a test just to understand what the behaviour is supposed to be there).

Depends. I would also have to manually enable all the related methods, e.g. length. Catch-22. Or as @JaroslavTulach put it, cheating ;P

What do you mean manually enable?

I don't see any Catch-22 scenario here. The builder is a separate type with its own API. Why would we pretend it is an array when we don't really use it that way anyway?

@JaroslavTulach
Copy link
Member

Why would we pretend it is an array when we don't really use it that way anyway?

That's basically the same question I asked here. Why?

@@ -14,6 +14,7 @@
import com.oracle.truffle.api.library.ExportLibrary;
import com.oracle.truffle.api.library.ExportMessage;
import org.enso.interpreter.arrow.LogicalLayout;
import org.graalvm.collections.Pair;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

though Jaroslav said org.graalvm.collections was going away?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great observation. Thank you James. The question is how Pair gets on classpath? Probably as a result of some transitive dependencies. We should update the dependencies of Arrow language to not use a dependency with such a huge transitive closure, but rather more direct one which doesn't contain org.graalvm.collections as transitive dependency.

I don't know off-hand how such change should look like, but it'd be a good change.

@hubertp
Copy link
Collaborator Author

hubertp commented Mar 11, 2024

Why would we pretend it is an array when we don't really use it that way anyway?

That's basically the same question I asked here. Why?

In that case I misunderstood the conversation during the meeting because I got the impression that all the complaints were about at.
Sure, I can make the change.

}

@ExportMessage
Object readArrayElement(long index)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation (together with isArrayElementReadable) guarantees that builder.at 1 is going to yield an exception.

@JaroslavTulach
Copy link
Member

Why would we pretend it is an array when we don't really use it that way anyway?

That's basically the same question I asked here. Why?

... I got the impression that all the complaints were about at.

I believe the complaints were about at, length and all other methods derived from those base two - e.g. map, fold and everything else provided by Array.enso.

I believe you effectively disabled at and all other methods that call at behind the scene. I believe that neither map nor fold can succeed. All these methods yield an exception as soon as they make a call to at. Maybe one can call builder.length, but that's probably the only method that can be called without yielding an exception.

Why would we pretend it is an array when we don't really use it that way anyway?

Given no method defined by Array.enso can succeed (except length), it makes little sense to pretend builder : Array. At least neither I nor Radek see a reason why builder should pretend it is an Array when 99% methods defined on Array yield an exception.

@hubertp hubertp requested a review from JaroslavTulach March 12, 2024 16:17
@GenerateInline(value = false)
abstract class WriteToBuilderNode extends Node {

public abstract void executeWrite(ArrowFixedSizeArrayBuilder receiver, long index, Object value)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this looks OK.

@hubertp hubertp added the CI: Ready to merge This PR is eligible for automatic merge label Mar 13, 2024
@mergify mergify bot merged commit f82e802 into develop Mar 13, 2024
37 of 40 checks passed
@mergify mergify bot deleted the wip/hubert/9118-builder-array branch March 13, 2024 14:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI: No changelog needed Do not require a changelog entry for this PR. CI: Ready to merge This PR is eligible for automatic merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants