allow slurping in any position #42902

simeonschaub · 2021-11-02T03:11:21Z

This extends the current slurping syntax by allowing the slurping to not
only occur at the end, but anywhere on the lhs. This allows syntax like
`a, b..., c = x` to work as expected.
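
For illustration, a minimal sketch of what the extended syntax allows. It assumes a Julia build that includes this change; the variable names are made up for the example.

```julia
# Assumes a Julia version that contains this change.
x = [1, 2, 3, 4, 5]

# Slurp in the middle: `first_el` takes the first element, `last_el` the last,
# and `middle` collects everything in between.
first_el, middle..., last_el = x

# Slurp at the front: everything but the final element goes into `init`.
init..., tail_el = x

println((first_el, middle, last_el))
println((init, tail_el))
```

For a `Vector` on the right-hand side, the slurped parts come out as new vectors.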

The feature is implemented using a new function called `split_rest`
(definitely open to better names), which takes as arguments the
iterator, the number of trailing variables at the end as a `Val` and
possibly a previous iteration state. It then returns a vector
containing all slurped arguments and a tuple with the n values that get
assigned to the rest of the variables. The plan would be to customize
this for different finite collections, so that the first return value
won't always be a vector, but that has not been implemented yet.

`split_rest` differs from `rest` of course in that it always needs to be
eager, since the trailing values need to be known immediately. This is
why the slurped part has to be a vector for most iterables, instead of a
lazy iterator as is the case for `rest`.
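
To make the eagerness concrete, here is a rough sketch of what a generic fallback could look like. `sketch_split_rest` is a hypothetical helper written for this illustration, not the code in this PR: it takes the trailing count as a plain `Int` rather than a `Val`, and returns the trailing values as a vector rather than a tuple, purely to keep the example short.

```julia
# Hypothetical helper, NOT Base's implementation: split the remainder of `itr`
# into (everything except the last `n` elements, the last `n` elements).
function sketch_split_rest(itr, n::Int, state...)
    # Materialize the remainder eagerly; the trailing values cannot be known
    # without consuming the whole iterator.
    rest = collect(Base.rest(itr, state...))
    length(rest) >= n ||
        throw(ArgumentError("expected at least $n remaining elements"))
    return rest[begin:end-n], rest[end-n+1:end]
end

gen = (i^2 for i in 1:5)      # yields 1, 4, 9, 16, 25
head, state = iterate(gen)    # consume the first element, keep the state
slurped, trailing = sketch_split_rest(gen, 2, state)
```

Here `slurped` holds the middle elements and `trailing` the last two. With `rest` alone the remainder could stay lazy, but then the last two values would not be available up front.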

Mainly opening this to get some feedback on the proposed API here.

Closes #43413

base/tuple.jl

simeonschaub · 2021-11-04T05:00:31Z

Ok, I think this should be mostly working now. Before I add some tests, add some more overloads and finish this off, I would like some feedback on the proposed API here from triage:

- Is this the behavior people would expect, i.e. always return a vector in the fallback case? (I do still mean to add some overloads for the more common collections.)
- Is the use of `Val` here warranted? I didn't trust constant propagation quite enough to achieve type stability in more complex cases, but I know it has improved significantly lately, so maybe I'm just too pessimistic. Since the number of variables should always be static, I don't believe this use of `Val` is too problematic though.
- I did say above that the second return value of `split_rest` would always be a tuple, but the current lowering implementation does not actually assume that, so it could be any iterator. In fact, maybe the generic fallback should not return a tuple but an array instead? That would always result in a small additional allocation, but is probably preferable for iterators with non-concrete element types.
- The current implementation tries to be smart in cases like `a, b..., c = 1, 2, 3, 4` and treats it as `a = 1; b..., = 2, 3; c = 4`. This does mean it can be quite difficult to tell semantically whether destructuring is using `rest`, `split_rest` or neither, as in this example. Since this only affects tuples, for which those should all behave exactly the same, I don't really expect this to be an issue though.
- What do people think of the name? I'm certainly not set on it.
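
As a small sketch of the tuple case from the list above (assuming a Julia build that includes this change), the slurped part of a tuple right-hand side stays a tuple, while a vector right-hand side yields a vector:

```julia
# Tuple on the right-hand side: the middle part is itself a tuple.
a, b..., c = 1, 2, 3, 4

# Vector on the right-hand side: the middle part is a vector.
x, y..., z = [1, 2, 3, 4]

println((a, b, c))
println((x, y, z))
```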

Seelengrab · 2021-11-04T07:47:43Z

> What do people think of the name? I'm certainly not set on it.

Maybe middle_slurp, since this allows for slurping in the middle of the LHS? split_rest sounds like splitting the rest of the iterator and to me doesn't really make it clear what it's for. Another idea: split_slurp_rest, to make it clear that it's the rest of the slurp that gets split.

> The current implementation tries to be smart in cases like

Can this be smart for things that return a (Abstract)Vector on the RHS as well? I suspect this convenient syntax would be used for getting the first & last element of a vector at the same time, which (as I understand it) would be costly with the general iterator fallback.

Other than that, this looks nice!

timholy · 2021-11-04T11:03:14Z

I do like this. I suppose it's predictable the next request will be to support a, b::Symbol..., c::Int... = itr, so it might be worth thinking a bit about where this kind of functionality will require limits.

simeonschaub · 2021-11-04T14:52:35Z

> Can this be smart for things that return a (Abstract)Vector on the RHS as well? I suspect this convenient syntax would be used for getting the first & last element of a vector at the same time, which (as I understand it) would be costly with the general iterator fallback.

No, array literals do promotion, so we can't do that in this case. We can definitely add a specialized version of split_rest for arrays, although I don't think the current version will perform that badly.

> I suppose it's predictable the next request will be to support a, b::Symbol..., c::Int... = itr, so it might be worth thinking a bit about where this kind of functionality will require limits.

I'm afraid it might already be a little late for that, since we already support type annotations for slurping in the last argument. There, however, the annotation applies to the type of the slurped iterator itself, not the element type. In this PR, we get the same behavior since lowering always works recursively.

Seelengrab · 2021-11-04T15:08:38Z

> No, array literals do promotion, so we can't do that in this case. We can definitely add a specialized version of split_rest for arrays, although I don't think the current version will perform that badly.

Ah, I didn't even think about array literals, only about arrays returned from functions! That should still be an O(1) case instead of O(n) in theory, right? The question is whether to view or not to view the slurped part. I think the former would be better, to keep allocations low for people wanting to use the higher abstraction.

simeonschaub · 2021-11-04T15:27:32Z

> That should still be an O(1) case instead of O(n) in theory, right?

Yeah, probably.

> The question is to view or not to view the slurped part

We already decided against using views in #37410, so we probably don't want to rehash that discussion.

Seelengrab · 2021-11-04T15:37:03Z

> We already decided against using views in #37410, so we probably don't want to rehash that discussion.

I may be reading the PR wrong or misunderstanding you, but I was thinking that e.g.

a, slurp..., b = f() # f returns a Vector{Int}

could result in Tuple{Int, SubArray{...}, Int}, which would destructure to the a, slurp, b part (the size is known statically after all, just like the indices required for the view in the slurped part). If I'm misunderstanding and this was exactly what was talked about, sorry for the noise!

simeonschaub · 2021-11-04T15:46:45Z

> could result in Tuple{Int, SubArray{...}, Int}, which would destructure to the a, slurp, b part

That tuple is never created, but semantically you can probably think about it this way. What I'm saying is that slurp should never just be a view as in this case, since it would be non-obvious that mutating slurp can have undesired side effects. We discussed that on the triage call I believe for #37410.
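
A short sketch of the aliasing concern, assuming a Julia build that includes this change. Because the slurped part is a copy, mutating it leaves the original vector untouched; anyone who does want aliasing can still ask for a view explicitly.

```julia
v = [10, 20, 30, 40]

# The slurped middle part is a fresh vector, not a view into `v`.
a, slurp..., b = v
slurp[1] = -1                 # does not touch `v`

# Opting in to aliasing explicitly: writes through to `v_aliased`.
v_aliased = [10, 20, 30, 40]
middle_view = @view v_aliased[2:end-1]
middle_view[1] = -1           # modifies `v_aliased[2]`

println(v)
println(v_aliased)
```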

StefanKarpinski · 2021-11-04T19:24:49Z

Triage approves of the concept. @JeffBezanson would like to review the implementation.

simeonschaub · 2022-02-12T19:15:45Z

Bump

JeffBezanson · 2022-04-06T20:44:27Z

base/strings/basic.jl

+        @assert _check_length_split_rest(length(s), n)
+    end
+    last_n = SubString(s, nextind(s, i), lastind)
+    front = s[begin:i]


Is there a reason not to use either indexing or SubString for both values?

simeonschaub · 2022-04-06

I think my reasoning was that the type of last_n doesn't really matter as long as it iterates correctly, so we might as well avoid the extra allocation. I guess we could make front a SubString as well, but that might be wasteful for small strings if n is not much smaller than the string length, since we can't free the original string.

JeffBezanson · 2022-04-07T19:14:42Z

src/julia-syntax.scm

+                                          (list assigns)))
+                              (n (length lhss-))
+                              (st (gensy))
+                              (end (list after))


IIUC, would be clearer to make this a mutable variable than a 1-element list.

Oh I see, it is also passed to destructure- 😭

simeonschaub · 2022-04-08

Yeah, I agree it's a little bit awkward, but I couldn't think of anything more elegant.

sprmnt21 · 2022-07-20T07:01:22Z

I don't know if this is the right place to make the following remark / request, but this is where I saw (coming from here) that the new split_rest function has been defined.
I have had quite a few occasions where I wanted to split a collection by position.
I think it would be useful in some cases to have a more general function, such as split_at_pos(itr, pos...)

simeonschaub added the feature and compiler:lowering labels Nov 2, 2021

Seelengrab reviewed Nov 2, 2021

simeonschaub added the triage label Nov 4, 2021

StefanKarpinski requested a review from JeffBezanson November 4, 2021 19:24

simeonschaub mentioned this pull request Dec 13, 2021

Allow ... in non-final assignment location #43413

Closed

simeonschaub force-pushed the sds/extended_slurp branch from 54b2938 to fc075da Compare December 20, 2021 23:44

JeffBezanson reviewed Jan 6, 2022

simeonschaub added 3 commits January 10, 2022 16:16

don't force tail to be static

d4cefef

Merge remote-tracking branch 'origin/master' into sds/extended_slurp

4fce905

fix tests, add some new ones

a1ada06

simeonschaub changed the title ~~RFC: allow slurping in any position~~ allow slurping in any position Jan 11, 2022

simeonschaub added the needs docs and needs news labels and removed the triage label Jan 11, 2022

simeonschaub marked this pull request as ready for review January 11, 2022 17:50

simeonschaub added this to the 1.8 milestone Jan 11, 2022

simeonschaub and others added 3 commits January 18, 2022 09:46

Merge remote-tracking branch 'origin/master' into sds/extended_slurp

d984b60

add some docs

5ab22cd

add NEWS entry

9d1e5b7

simeonschaub added 2 commits January 20, 2022 15:45

fix docs

6c2eedb

Merge remote-tracking branch 'origin/master' into sds/extended_slurp

29e002e

simeonschaub removed the needs docs and needs news labels Jan 20, 2022

fix whitespace

0b438c4

simeonschaub force-pushed the sds/extended_slurp branch from 205a381 to 0b438c4 Compare January 20, 2022 22:40

simeonschaub added 2 commits January 27, 2022 22:50

Merge remote-tracking branch 'origin/master' into sds/extended_slurp

3ec586a

Merge remote-tracking branch 'origin/master' into sds/extended_slurp

b14182b

vchuravy removed this from the 1.8 milestone Feb 14, 2022

JeffBezanson self-assigned this Apr 6, 2022

JeffBezanson reviewed Apr 6, 2022

JeffBezanson reviewed Apr 6, 2022

simeonschaub added 4 commits April 6, 2022 17:23

Merge remote-tracking branch 'origin/master' into sds/extended_slurp

a2f1cad

update compat annotations

0d9d04b

remove nonsense assert

34a24c3

fix bad rebase n NEWS [ci skip]

73c8155

simeonschaub closed this Apr 6, 2022

simeonschaub reopened this Apr 6, 2022

retrigger CI

ae13239

JeffBezanson reviewed Apr 7, 2022

simeonschaub added 2 commits April 8, 2022 00:28

explain destructure-

e3c1066

Merge remote-tracking branch 'origin/master' into sds/extended_slurp

d20ba83

JeffBezanson approved these changes Apr 8, 2022

simeonschaub merged commit 385762b into master Apr 8, 2022

simeonschaub deleted the sds/extended_slurp branch April 8, 2022 21:50

simeonschaub mentioned this pull request Apr 19, 2022

Allow slurping at the beginning rather than just the end (for destructuring and method definitions) #42036

Closed

adienes mentioned this pull request Nov 29, 2022

add an option to intersect arguments passed to Cols JuliaData/DataFrames.jl#3224

Merged
