Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CombinationsSequence should be a Collection #219

Open
dabrahams opened this issue Jan 6, 2024 · 2 comments
Open

CombinationsSequence should be a Collection #219

dabrahams opened this issue Jan 6, 2024 · 2 comments

Comments

@dabrahams
Copy link
Contributor

It's multipass (like almost every other Sequence in practice). Not being a collection limits its applicability in silly ways.

@natecook1000
Copy link
Member

@dabrahams Could you give some examples of the limitations that you're running into here?

IIRC, the reason that CombinationSequence is only a sequence is that its iterator stores and mutates a single copy of the base collection's indices as it iterates. By contrast, a collection would need to provide an index type that either stores a separate copy of the indices (potentially very heavyweight) or stores just a position, and then recalculates the indices when used (which should be doable, but might be costly). While the Collection protocol doesn't explicitly rule out such a heavy index type, it's very different from the way most collections work, and having O(N) work happening in each subscript would probably be unexpected for users.

That said, CombinationSequence might already be falling down on the efficiency job. It looks like the iterator currently copies each combination into a new array, instead of using a collection wrapper to combine the base collection and the array of indices. Such a wrapper would be independently useful, as well (and indeed I thought we already included it):

let numbers = [10, 20, 30, 40, 50]
for x in numbers.indexed(by: [4, 2, 0]) {
    print(x)
}
// 50
// 30
// 10

@dabrahams
Copy link
Contributor Author

Could you give some examples of the limitations that you're running into here?

I didn't run into a limitation, but I know such limitations exist. It's the whole reason I wrote Zip2Collection: there are algorithms that are defined only on Collection because it is multipass.

IIRC, the reason that CombinationSequence is only a sequence is that its iterator stores and mutates a single copy of the base collection's indices as it iterates.

That's no reason it can't be a collection, as you yourself know well and this implementation demonstrates.

having O(N) work happening in each subscript would probably be unexpected for users.

I don't know what your N is in this case; it's certainly not the length of the resulting collection, which is what people usually mean when they say O(N).

The work of dereferencing this thing is proportional to the data footprint of the elements you get, which is no worse than someStrings.lazy.map($0 + ".").

"It's surprisingly expensive" is subjective and not a reason for non-conformance to a protocol. The implementation either meets the efficiency bounds (and we even play loosey-goosey with that criterion in places) or it does not.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants