Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Vector.build #9304

Closed
5 tasks
radeusgd opened this issue Mar 6, 2024 · 9 comments
Closed
5 tasks

Implement Vector.build #9304

radeusgd opened this issue Mar 6, 2024 · 9 comments
Assignees
Labels
-libs Libraries: New libraries to be implemented

Comments

@radeusgd
Copy link
Member

radeusgd commented Mar 6, 2024

@JaroslavTulach suggested introducing a new way of using the Vector Builder:

For a while I am asking myself, if this is the right pattern to follow? Shouldn't the Vector.new_builder and other builders be changed to receive a callback lambda function to perform the building in a clearly defined scope? E.g.

v = Vector.new_builder builder->
    builder.set 8 127
    builder.set 1 10

it looks the similar and feels more functional. Moreover such a lambda based approach is going to work fine in the IDE (unlike the current unclosed builder).

As far as I can say the lambda approach supports all use-cases except creating builder and leaving it open - however that shall be an anti-pattern anyway, I believe/hope.

As we have all agreed this is a very good idea. It will allow us to use the builder in the IDE safely - the current approach would be unsafe to use in the GUI due to being to relying on side-effects.

We probably will want to call the operation Vector.build.

  • Introduce the static method Vector.build that runs the provided lambda with a newly created builder, then seals it and returns as its result the built vector.
    • The result of the lambda is ignored. But if it is a dataflow error, it should still be propagated (let's pro-actively avoid bugs like Vector.each ignores dataflow errors raised in the function #9250).
    • Currently it is not in scope of this task to implement set for the builder. The builder relies on append and that is enough for now. We can add set if needed, but not as part of this ticket.
  • Mark Vector.new_builder as deprecated.
  • Scan existing usages of Vector.new_builder.
    • Convert, at least the easy instances, into Vector.build instead.
    • If there are more problematic instances where it is not easy to convert to the new idiom, let's create a follow up ticket to refactor them later.
    • If all instances were removed, remove Vector.new_builder altogether. If not, let's remove it as part of the followup ticket.
@radeusgd radeusgd added the -libs Libraries: New libraries to be implemented label Mar 6, 2024
@github-project-automation github-project-automation bot moved this to ❓New in Issues Board Mar 6, 2024
@AdRiley AdRiley moved this from ❓New to 📤 Backlog in Issues Board Mar 6, 2024
@GregoryTravis GregoryTravis moved this from 📤 Backlog to 🔧 Implementation in Issues Board Apr 16, 2024
@GregoryTravis
Copy link
Contributor

A couple questions:

  • In what sense does this new Vector.build not rely on side-effects? If the lambda must use .set or .append to provide values, is that somehow less of a side-effect than the current .new_builder approach?
  • Are there any cases where we should not just use Vector.new with a value-returning lambda? Are there efficiency issues for Vector.new?

@somebody1234
Copy link
Contributor

@GregoryTravis I think the idea is that .build always starts from a fresh instance, and always produces a certain result (since it always performs a certain set of steps on the fresh instance).

IOW: It doesn't mutate an existing builder that's coming in from another node

As for why we can't use a value-returning lambda - I imagine it's because sometimes we want to be able to append to a lambda, e.g. when iterating:

builder = Vector.new_builder
go has_next = if has_next.not then Nothing else
builder.append (result_set.getObject index)
@Tail_Call go result_set.next
go result_set.next
builder.to_vector

@enso-bot
Copy link

enso-bot bot commented Apr 16, 2024

Greg Travis reports a new STANDUP for today (2024-04-16):

Progress: Dataflow Error stack traces; sent final Decimal PRs; implemented Vector.build and started converting uses of new_builder It should be finished by 2024-04-19.

Next Day: convert uses of .new_builder to .build

@radeusgd
Copy link
Member Author

  • In what sense does this new Vector.build not rely on side-effects? If the lambda must use .set or .append to provide values, is that somehow less of a side-effect than the current .new_builder approach?

Good point, I guess I used wrong phrasing.

Both have side effects, but with Vector.build all side effects are encapsulated inside of the inner lambda, whereas with new_builder they escape the current expression - which is problematic in the IDE because the nodes can be re-run a bit randomly in interactive mode - so builder.append nodes scattered around the graph may be executed in unpredictable order; whereas Vector.build kind of forces the operation to be trapped inside of a single node - removing the problems of interactive execution.

It is a bit like the difference between Haskell's STRef that lives only inside of runST VS IORef that has no encapsulation on its side effects. With the STRef the algorithm is still pure on the outside, even if using mutability on the inside.

  • Are there any cases where we should not just use Vector.new with a value-returning lambda? Are there efficiency issues for Vector.new?

Not efficiency but practicality. I think for some algorithms it is just simpler to have a builder.

For example, if I want to compute a running sum over a vector - tbh I'm not sure how I could do that with Vector.new. The problem is with Vector.new the lambda just depends on the index i - but in our case we also need to depend on the previous result.

I guess the usual functional pattern would be to use fold with a List data structure that allows O(1) append. And Enso does have the List, but still Vector is used 99% of the time so I imagine users will feel better using the more familiar pattern.

@JaroslavTulach
Copy link
Member

JaroslavTulach commented Apr 17, 2024

* In what sense does this new `Vector.build` not rely on side-effects? 

If the lambda must use .set or .append to provide values,
is that somehow less of a side-effect than the current new_builder approach?

Create Vector.new_builder node in the IDE. Draw some lines from it and append. Then to_vector. Play with the nodes and observe the results: aren't the values accumulating in the builder strangely? Isn't there more values than there should be in the Vector?

chaos and disorder

That's because IDE is intercepting individual method invocations and caching their results. That is not going to happen in case of the lambda.

@JaroslavTulach
Copy link
Member

  • Are there any cases where we should not just use Vector.new with a value-returning lambda? Are there efficiency issues for Vector.new?

Good question. I believe with the proposed Vector.build you don't have to specify size in advance.

@GregoryTravis
Copy link
Contributor

@jdunkerley What is an example of using fold with a Builder in the GUI? I ask because I imagine it would require a multi-line lambda with calls to .set and I'm not sure how to do that in the GUI.

@enso-bot
Copy link

enso-bot bot commented Apr 17, 2024

Greg Travis reports a new STANDUP for today (2024-04-17):

Progress: Converting Base/ uses of .new_builder to .build It should be finished by 2024-04-19.

Next Day: convert uses of .new_builder to .build?

@enso-bot
Copy link

enso-bot bot commented Apr 18, 2024

Greg Travis reports a new STANDUP for today (2024-04-18):

Progress: hash/equals consistency It should be finished by 2024-04-19.

Next Day: convert uses of .new_builder to .build?

@GregoryTravis GregoryTravis moved this from 🔧 Implementation to 👁️ Code review in Issues Board Apr 19, 2024
@github-project-automation github-project-automation bot moved this from 👁️ Code review to 🟢 Accepted in Issues Board May 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
-libs Libraries: New libraries to be implemented
Projects
Archived in project
Development

No branches or pull requests

4 participants