Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Vector.build: implementation and some tests #9725

Merged
merged 13 commits into from
Apr 18, 2024
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -650,6 +650,7 @@
- [Added `Decimal.min` and `.max`.][9663]
- [Added `Decimal.round`.][9672]
- [Implemented write support for Enso Cloud files.][9686]
- [Added `Vector.build`.][9725]

[debug-shortcuts]:
https://github.com/enso-org/enso/blob/develop/app/gui/docs/product/shortcuts.md#debug
Expand Down Expand Up @@ -950,6 +951,7 @@
[9663]: https://github.com/enso-org/enso/pull/9663
[9672]: https://github.com/enso-org/enso/pull/9672
[9686]: https://github.com/enso-org/enso/pull/9686
[9725]: https://github.com/enso-org/enso/pull/9725

#### Enso Compiler

Expand Down
63 changes: 55 additions & 8 deletions distribution/lib/Standard/Base/0.0.0-dev/src/Data/Vector.enso
Original file line number Diff line number Diff line change
Expand Up @@ -91,11 +91,13 @@ type Vector a
Vector.collect (List.Cons 1 <| List.Cons 2 <| List.Nil) .x .xs stop_at=(_==List.Nil)
collect : Any -> (Any -> Any) -> (Any -> Any) -> Integer | Nothing -> (Any -> Boolean) -> Vector Any
collect seq element:(Any -> Any) next:(Any -> Any) limit:(Integer | Nothing)=Nothing stop_at:(Any -> Boolean)=(_==Nothing) =
b = Vector.new_builder (if limit.is_nothing then 10 else limit)
iterate item remaining = if remaining == 0 || (stop_at item) then b.to_vector else
b.append <| element item
@Tail_Call iterate (next item) (if remaining.is_nothing then Nothing else remaining-1)
iterate seq limit
Vector.build capacity=(if limit.is_nothing then 10 else limit) builder->
iterate item remaining =
done = remaining == 0 || (stop_at item)
if done.not then
builder.append <| element item
@Tail_Call iterate (next item) (if remaining.is_nothing then Nothing else remaining-1)
iterate seq limit

## PRIVATE
ADVANCED
Expand Down Expand Up @@ -145,14 +147,15 @@ type Vector a

## PRIVATE
ADVANCED
DEPRECATED
Creates a new vector builder instance.

A vector builder is a mutable data structure, that allows for gathering
A vector `Builder` is a mutable data structure, that allows for gathering
a number of elements and then converting them into a vector. This is
particularly useful when the number of elements is not known upfront.

A vector allows to store an arbitrary number of elements in linear memory. It
is the recommended data structure for most applications.
A vector allows to store an arbitrary number of elements in linear
memory. It is the recommended data structure for most applications.

Arguments:
- capacity: Initial capacity of the Vector.Builder
Expand All @@ -171,6 +174,50 @@ type Vector a
new_builder : Integer -> Builder
new_builder (capacity=10) = Builder.new capacity

## PRIVATE
ADVANCED
Creates a new `Vector` by providing a `Builder` to the provided function.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rephrase this sentence as providing X to the provided Y sounds weird.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to "passing a Builder to the provided function"


A vector `Builder` is a mutable data structure, that allows for gathering
a number of elements and then converting them into a vector. This is
particularly useful when the number of elements is not known upfront.

`.build` creates a new `Builder`, passes it to the provided function,
which can add elements using the `Builder`'s `.append` method. When the
function is done, `.build` then closes the `Builder and returns the
resulting `Vector`.

The provided function should call `.append` to add new elements to the
`Builder`. The return value of the provided function is not used, unless
it is a dataflow error, in which case the `Vector` is not built, and the
dataflow error is propagted instead.

A vector allows to store an arbitrary number of elements in linear
memory. It is the recommended data structure for most applications.

Arguments:
- function: a function taking a `Builder` and using it to add elements.
- capacity: Initial capacity of the Vector.Builder

! Error Conditions

- If the provided function throws a dataflow error, the `Vector` is not
built, and the error is propagted instead.

> Example
Construct a vector using a builder that contains the items 1 to 5.

Vector.build builder->
builder.append 1
builder.append 2
builder.append 3
# => [1, 2, 3]
build : (Builder -> Any) -> Integer -> Vector
build (function : Builder -> Any) (capacity : Integer = 10) -> Vector =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
build (function : Builder -> Any) (capacity : Integer = 10) -> Vector =
build (function : Builder -> Any) (initial_capacity : Integer = 10) -> Vector =

I think maybe adding initial_ would clarify the meaning of this variable; without initial it could sound like it's the max capacity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

builder = Builder.new capacity
result = function builder
result.if_not_error builder.to_vector

## PRIVATE
ADVANCED

Expand Down
58 changes: 37 additions & 21 deletions test/Base_Tests/src/Data/Vector_Spec.enso
Original file line number Diff line number Diff line change
Expand Up @@ -420,27 +420,25 @@ type_spec suite_builder name alter = suite_builder.group name group_builder->

group_builder.specify "should allow applying a function to each element" <|
vec = alter [1, 2, 3, 4]
vec_mut = Vector.new_builder
vec.each vec_mut.append
vec_mut.to_vector . should_equal vec
result = Vector.build builder->
vec.each builder.append
result . should_equal vec

group_builder.specify "should accept changed elements" <|
vec_mut = Vector.new_builder
vec_mut.append 1
vec_mut.append 1.1
vec_mut.append Nothing
vec = alter <| Vector.build builder->
builder.append 1
builder.append 1.1
builder.append Nothing

vec = alter vec_mut.to_vector
vec.length . should_equal 3
vec.at 0 . should_equal 1
vec.at 1 . should_equal 1.1
vec.at 2 . should_equal Nothing

group_builder.specify "should accept Nothing" <|
vec_mut = Vector.new_builder
vec_mut.append Nothing
vec = alter <| Vector.build builder->
builder.append Nothing

vec = alter vec_mut.to_vector
vec.length . should_equal 1
vec.at 0 . should_equal Nothing

Expand Down Expand Up @@ -778,21 +776,19 @@ type_spec suite_builder name alter = suite_builder.group name group_builder->

group_builder.specify "should correctly propagate state through each" <|
v = State.run Number 77 <|
b = Vector.new_builder
(alter ["A", "B"]).each x->
b.append x
b.append (State.get Number).to_text
b.to_vector
Vector.build builder->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd assume this is a general even for complicated computations. Just wrap your computation in Vector.build builder-> block and that's it.

(alter ["A", "B"]).each x->
builder.append x
builder.append (State.get Number).to_text

v.should_equal ['A', '77', 'B', '77']

group_builder.specify "should correctly propagate state through map" <|
v = State.run Number 55 <|
b = Vector.new_builder
(alter ["X", "Y"]).map x->
b.append x
b.append (State.get Number).to_text
b.to_vector
Vector.build builder->
(alter ["X", "Y"]).map x->
builder.append x
builder.append (State.get Number).to_text

v.should_equal ['X', '55', 'Y', '55']

Expand Down Expand Up @@ -1034,6 +1030,26 @@ add_specs suite_builder =
v = Vector.collect l .x .xs limit=30 stop_at=(_==List.Nil)
v . should_equal [1, 2, 3]

suite_builder.group "Vector.build" group_builder->
group_builder.specify "Should be able to build a vector" <|
v = Vector.build builder->
builder.append 1
builder.append 2
builder.append Nothing
builder.append 3
v . should_equal [1, 2, Nothing, 3]

v2 = Vector.build builder->
_ = builder # Do nothing
v2 . should_equal []

group_builder.specify "Should propagate dataflow errors" <|
v = Vector.build builder->
builder.append 1
builder.append 2
Error.throw (Illegal_Argument.Error "asdf")
v . should_fail_with (Illegal_Argument.Error "asdf")
Comment on lines +1050 to +1055
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great test.

Not directly related, but while we are at it we could consider solving a long outstanding problem we used to have:

        group_builder.specify "Should propagate the first dataflow error encountered in `append`" <|
            v = Vector.build builder->
                builder.append 1
                builder.append (Error.throw (Illegal_Argument.Error "ERROR2")
                builder.append 3
                builder.append (Error.throw (Illegal_Argument.Error "ERROR4")
                builder.append 5
            v . should_fail_with (Illegal_Argument.Error "ERROR2")

Unfortunately, as far as I know currently this will actually not fail but produce a vector [1, 3, 5], essentially swallowing the dataflow errors.

We used to have a pattern of 'folding' the builder i.e. append returns a builder and this would have to be rewritten as:

                builder . append 1
                        . append (Error.throw (Illegal_Argument.Error "ERROR2")
                        . append 3
                        . append (Error.throw (Illegal_Argument.Error "ERROR4")
                        . append 5

in which case the error would be correctly propagated.

Unfortunately, the 'folding the builder' pattern did not get much traction. In many places it was just not too practical - the builder was inherently mutable object so this more 'functional' folding was often not fitting its usages.

Do you think we could amend the builder to detect that append got a dataflow error and avoid swallowing it?

This could be implemented by holding a encountered_error : Ref (Any | Nothing) inside of Builder, then append can check item.is_error and set this Ref the first time an error is encountered; then Builder.to_vector would first check if encountered_error is set and if so, propagate that error instead of building the vector.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this is a bit out of scope of this PR, so I guess we can also do it as a separate ticket/PR instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good idea. #9733


suite_builder.group "Vector/Array equality" group_builder->
v1 = [1, 2, 3]
a1 = v1.to_array
Expand Down
Loading