Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change collect behaviour for maps #17111

Closed
wants to merge 2 commits into from
Closed

Conversation

Ryman
Copy link
Contributor

@Ryman Ryman commented Sep 9, 2014

Current collect behavior for the Map types is lossy, where additional values per key are thrown away and only the last value is kept. I think the user should make an explicit decision regarding this behavior.

Perhaps this warrants a different trait, or another static method to initialize maps from an iterator of pairs (with the lossy behaviour), i.e. Hashmap::from_iter_lossy(..)?

I think it's worth looking closely at the changes in the 2nd commit as some of them may benefit from assertions that duplicates don't exist.

Also from an implementation standpoint, I'm sure I'm overlooking a better way to get an iterator from a single value, rather than Some(v).move_iter().

Collecting to a map is now a grouping operation, so the value
type will need to implement `Extendable`.

`from_iter` and `extend` were lossy before this change, which
can be unexpected behaviour. The user should have to make a
decision about how to handle adding values when there is
already a value for a given key.

To fix code broken by this, use a fold or a for loop.

e.g.
let map = iter.map(|x| (x, x.y)).collect::<HashMap<T, U>>();

Can become:
let map = iter.fold(HashMap::new(), |mut map, x| {
    // You may want to use find_mut to update existing values.
    // This will overwrite an old value for a key with a new one.
    map.insert(x, x.y);
    map
});

[breaking-change]
@thestinger
Copy link
Contributor

I think the current design is sensible, and it's consistent with other languages (C++, D, Python, [...]). The standard iterator methods only exist to provide sugar for common cases and being explicit about every algorithm choice is counter to that goal. The zip method is another example of the practical design, as it stops when the first iterator does rather than trying to bubble up the condition as an error by yielding Option<(K, V)> or (Option<K>, Option<V>). I don't think grouping is the common case.

@huonw
Copy link
Member

huonw commented Sep 9, 2014

I wonder if we could have this via collect wrappers, e.g. a GroupedCollect type used, like iter.collect::<GroupedCollect<HashMap<uint, Vec<...>>>>().

BTW, we don't necessarily want to be using Extendable for this purpose, at least, not extend, since it is appending a single element a time but extend is designed for the case of appending many, i.e. it can have large overhead. At some point, we possibly want a trait that offers a "add one" method. Furthermore, in this context, makes a lot of sense to allow, e.g. collecting an Iterator<(K, uint)> into a HashMap<(K, uint)> by summing up the uints (this would imply passing a closure to control the behaviour precisely).

Also, btw, rewriting the .collect call into a for+insert loop will be less efficient; most collect implementation use the iterator's size_hint to preallocate a best-guess size for the resulting data structure (where this makes sense) and thus save effort (and for .iter().map(...).collect()-like statements, this lower bound is actually exact, so no reallocations are necessary). The naive loop versions do not do this.

cc @aturon

@Gankra
Copy link
Contributor

Gankra commented Sep 9, 2014

I agree with thestinger here, the current behaviour matches my intuition that extend is just sugar for pushing each of the elements individually (with room for optimization). For Sets and Maps, that means dropping duplicates. If you want to keep duplicates, you should be using a MultiMap (which we don't provide, but it isn't too hard to write one).

@aturon
Copy link
Member

aturon commented Sep 9, 2014

I agree that the current semantics of collect for maps is the expected one, and I wouldn't want to force additional constraints (Extendable) on the value type as proposed here.

@gankro to be clear, the use case here is more general than for multimaps, because the idea is that the value type has a way to "merge in" additional values.

I could imagine adding this functionality as a separate constructor for maps -- something like from_iter_merge and introduce a separate Merge trait. (For that matter, the Add trait could be an appropriate choice.)

But in general, we're trying to pare down libstd to focus on core functionality that is widely used; we hope that more experimental APIs can live in the Cargo system first. Before considering something like this for std, it'd be good to have some very clear use cases and experience with it.

@Ryman
Copy link
Contributor Author

Ryman commented Sep 10, 2014

If current behavior is desired/expected, then I agree with @aturon that this could live in an external crate. Closing.

@Ryman Ryman closed this Sep 10, 2014
lnicola pushed a commit to lnicola/rust that referenced this pull request Jul 11, 2024
… r=Veykril

fix: don't remove parentheses for calls of function-like pointers that are members of a struct or union

Fixes rust-lang#17111
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants