std: Add retain method for HashMap and HashSet #39560

F001 · 2017-02-05T09:41:44Z

rust-highfive · 2017-02-05T09:41:50Z

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @bluss (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

bluss · 2017-02-05T13:24:47Z

src/libstd/collections/hash/map.rs

+    /// map.retain(|&k, _| k % 2 == 0);
+    /// assert_eq!(map.len(), 3);
+    /// ```
+    #[stable(feature = "retain_hash_map", since = "1.17.0")]


This should enter the source tree as an unstable feature. The feature name is good. I'll just give you an initial review; the libs team will ultimately decide whether to accept this.

What we learned from the restrictions of Vec::retain is that mutable access while retaining is useful. In this case, the closure should therefore be passed &K, &mut V.
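To illustrate why the `&K, &mut V` signature is useful, here is a minimal sketch that updates values and filters in a single pass. The helper name `halve_and_prune` and the map contents are made up for illustration; only `HashMap::retain` itself is the API under discussion.

```rust
use std::collections::HashMap;

// Hypothetical helper: halve every value, then drop entries that reach zero.
fn halve_and_prune(map: &mut HashMap<&'static str, u32>) {
    map.retain(|_key, value| {
        // Mutation is possible because the closure receives `&mut V`.
        *value /= 2;
        *value > 0
    });
}
```

With a `&V` closure, the same effect would need a separate pass over `values_mut()` before filtering.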

bluss · 2017-02-05T13:26:03Z

src/libstd/collections/hash/set.rs

+    /// set.retain(|&k| k % 2 == 0);
+    /// assert_eq!(set.len(), 3);
+    /// ```
+    #[stable(feature = "retain_hash_set", since = "1.17.0")]


Same here (unstable). The two methods can use the same feature name, which makes it simpler; they will almost surely be stabilized later as a unit.

bluss · 2017-02-05T14:09:36Z

src/libstd/collections/hash/map.rs

+        for (h, k, v) in old_table.into_iter() {
+            if f(&k, &v) {
+                self.insert_hashed_nocheck(h, k, v);
+            }


The implementation looks good. It could also use the ordered insertion loop that resize is using, since we're iterating in hash order and preserving the order.

Neither of those is in place, and ideally this operation should be in place. I wonder if that can be done efficiently? I'd prefer to have this API even if it is not implemented in place. Since it's only ever removing keys, I don't see any traps (cases where it runs very slowly).
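For comparison, the rebuild (not in place) strategy can be sketched with the public `HashMap` API alone. This is only an approximation of the idea: the helper name `retain_by_rebuild` is hypothetical, and the PR's code works on the internal table with precomputed hashes rather than calling `insert`.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::mem;

// Hypothetical helper, not the PR's code: moves the entries out and
// reinserts only the ones the predicate accepts.
fn retain_by_rebuild<K, V, F>(map: &mut HashMap<K, V>, mut keep: F)
where
    K: Eq + Hash,
    F: FnMut(&K, &V) -> bool,
{
    // Take the old contents, leaving an empty map behind.
    let old = mem::replace(map, HashMap::new());
    for (k, v) in old {
        if keep(&k, &v) {
            // Reinsertion may allocate, which an in-place retain avoids.
            map.insert(k, v);
        }
    }
}
```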

Iterating the map buckets in reverse order and deleting when needed should be efficient (fewer backward shifts), but I'm not sure if the current XXXBucket interface supports that.

The current bucket table interface doesn't support reverse iteration over buckets. Adding that support is as simple as writing a prev method, though. For deletion, you should be able to use code similar to the existing backward shift.

Anyway, @F001's implementation is a good starting point.
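The reverse-iteration idea can be sketched on a toy Robin Hood table. Everything below is an assumption made for illustration (the `Toy` type, `u64` keys hashed as `key % capacity`, and starting the walk just before an empty slot instead of using the real bucket interface); it is not the code in this PR.

```rust
// Toy Robin Hood table, NOT the std implementation. All names are hypothetical.
struct Toy {
    slots: Vec<Option<(u64, i32)>>,
}

impl Toy {
    fn new(cap: usize) -> Self {
        Toy { slots: vec![None; cap] }
    }

    // How far the entry stored at `idx` sits from its ideal slot.
    fn displacement(&self, key: u64, idx: usize) -> usize {
        let cap = self.slots.len();
        (idx + cap - (key % cap as u64) as usize) % cap
    }

    // Robin Hood insertion. Assumes distinct keys and that at least one
    // slot stays free afterwards.
    fn insert(&mut self, key: u64, val: i32) {
        let cap = self.slots.len();
        let mut entry = (key, val);
        let mut idx = (key % cap as u64) as usize;
        let mut disp = 0;
        loop {
            let current = self.slots[idx];
            match current {
                None => {
                    self.slots[idx] = Some(entry);
                    return;
                }
                Some(other) => {
                    let other_disp = self.displacement(other.0, idx);
                    if other_disp < disp {
                        // The resident is closer to home: swap and keep probing.
                        self.slots[idx] = Some(entry);
                        entry = other;
                        disp = other_disp;
                    }
                }
            }
            idx = (idx + 1) % cap;
            disp += 1;
        }
    }

    // Backward-shift deletion: empty `idx`, then pull displaced successors back.
    fn remove_at(&mut self, idx: usize) {
        let cap = self.slots.len();
        let mut gap = idx;
        self.slots[gap] = None;
        loop {
            let next = (gap + 1) % cap;
            let moved = self.slots[next];
            match moved {
                Some((k, _)) if self.displacement(k, next) > 0 => {
                    self.slots[gap] = moved;
                    self.slots[next] = None;
                    gap = next;
                }
                _ => return,
            }
        }
    }

    // In-place retain: walk backwards from just before an empty slot, so a
    // backward shift only ever moves entries that were already visited.
    fn retain<F: FnMut(&u64, &mut i32) -> bool>(&mut self, mut keep: F) {
        let cap = self.slots.len();
        let start = self.slots.iter().position(|s| s.is_none())
            .expect("toy table must keep one free slot");
        for step in 1..cap {
            let idx = (start + cap - step) % cap;
            let remove = match self.slots[idx].as_mut() {
                Some(entry) => !keep(&entry.0, &mut entry.1),
                None => false,
            };
            if remove {
                self.remove_at(idx);
            }
        }
    }

    // Linear-probe lookup, included only so the result can be inspected.
    fn get(&self, key: u64) -> Option<i32> {
        let cap = self.slots.len();
        let mut idx = (key % cap as u64) as usize;
        for _ in 0..cap {
            match self.slots[idx] {
                Some((k, v)) if k == key => return Some(v),
                None => return None,
                _ => idx = (idx + 1) % cap,
            }
        }
        None
    }
}
```

Walking backwards means that deleting an entry only shifts entries from slots the walk has already passed, so the predicate runs once per entry and no entry is skipped.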

I updated this function to use an in-place algorithm with reverse iteration.

However, I have not collected benchmark data yet, so I don't know whether it is more efficient.

bluss · 2017-02-05T14:11:57Z

cc @pczarn @arthurprs

arthurprs

This is a great addition.

Even if the non-in-place algorithm performs better in some cases, I think the in-place one is the way to go, because it allows the caller to intentionally avoid the memory allocator.

arthurprs · 2017-02-07T14:06:47Z

src/libstd/collections/hash/table.rs

+    // during insertion. We must skip forward to a bucket that won't
+    // get reinserted too early and won't unfairly steal others spot.
+    // This eliminates the need for robin hood.
+    pub fn buffer_head(table: M) -> Bucket<K, V, M> {


What about head_bucket here?

arthurprs · 2017-02-07T14:19:22Z

src/libstd/collections/hash/table.rs

+                }
+                Empty(b) => {
+                    // Encountered a hole between clusters.
+                    b.into_bucket()


Now that we use it for retain we should early return on an empty bucket as well, avoiding walking the buckets more than needed.

I don't understand this. Empty buckets and occupied buckets are not in separate groups. I have to walk all the buckets without missing any, just like the clone method does.

Could you please explain more about this?

Due to the way Robin Hood hashing works, as soon as you encounter an empty bucket, the next non-empty bucket is guaranteed to have displacement = 0. Right now the code goes all the way to the first full bucket with displacement = 0, which is fine for both use cases but may mean extra work for retain. Changing it to also return on an empty bucket changes nothing for resize and possibly saves this bit of work in retain. Micro, but still...
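The displacement property can be checked on a toy model. This is a sketch under stated assumptions: each entry is represented only by its ideal slot index, the table always keeps at least one free slot, and both function names are hypothetical rather than taken from table.rs.

```rust
// Toy Robin Hood insertion; an entry is just its ideal slot index.
// Assumes at least one slot is free, otherwise the loop never ends.
fn robin_hood_insert(slots: &mut [Option<usize>], ideal: usize) {
    let cap = slots.len();
    let mut entry = ideal;
    let mut idx = ideal;
    let mut disp = 0;
    loop {
        let current = slots[idx];
        match current {
            None => {
                slots[idx] = Some(entry);
                return;
            }
            Some(other) => {
                let other_disp = (idx + cap - other) % cap;
                if other_disp < disp {
                    // The resident is closer to home: swap and keep probing.
                    slots[idx] = Some(entry);
                    entry = other;
                    disp = other_disp;
                }
            }
        }
        idx = (idx + 1) % cap;
        disp += 1;
    }
}

// True if every full slot that directly follows an empty slot holds an
// entry sitting at its ideal index (displacement 0).
fn first_after_gap_is_home(slots: &[Option<usize>]) -> bool {
    let cap = slots.len();
    (0..cap).all(|i| match (slots[i], slots[(i + 1) % cap]) {
        (None, Some(ideal)) => ideal == (i + 1) % cap,
        _ => true,
    })
}
```

The property follows from probing itself: an entry only lands past its ideal slot when every slot in between is occupied, so it cannot sit directly after a hole.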

Never mind this, it's probably not worth it.

I'm still struggling to fully understand the algorithm...

Thank you all the same.

Is there any other issue there?

arthurprs · 2017-02-07T14:21:22Z

src/libstd/collections/hash/table.rs

@@ -1061,6 +1129,45 @@ impl<K: Clone, V: Clone> Clone for RawTable<K, V> {
    }
 }

+impl<K: Debug, V: Debug> Debug for RawTable<K, V> {


Should we keep these out to avoid unnecessary code? The end user can't use these anyway.

arthurprs · 2017-02-07T14:40:28Z

src/libstd/collections/hash/map.rs

@@ -416,22 +416,26 @@ fn search_hashed<K, V, M, F>(table: M, hash: SafeHash, mut is_match: F) -> Inter
    }
 }

-fn pop_internal<K, V>(starting_bucket: FullBucketMut<K, V>) -> (K, V) {
+fn pop_internal<K, V>(starting_bucket: FullBucketMut<K, V>)
+    -> (K, V, &mut RawTable<K, V>)


Any reason not to return Bucket<K, V, ...> instead?

arthurprs

Looks good to me, just one small issue.

arthurprs · 2017-02-08T13:03:56Z

src/libstd/collections/hash/map.rs

@@ -416,22 +416,26 @@ fn search_hashed<K, V, M, F>(table: M, hash: SafeHash, mut is_match: F) -> Inter
    }
 }

-fn pop_internal<K, V>(starting_bucket: FullBucketMut<K, V>) -> (K, V) {
+fn pop_internal<K, V>(starting_bucket: FullBucketMut<K, V>)


Could you revert this to return the &mut Table (reverting my previous suggestion)? Sorry about that.
Right now it isn't consistent regarding what bucket it returns. Fixing that would increase the diff even more for no good reason.

Ok. Should I squash all the commits together?

Fix #36648

F001 · 2017-02-09T01:06:00Z

@bluss @pczarn , do you have other concerns?

arthurprs

👍

alexcrichton · 2017-02-13T16:07:49Z

@bors: r+

bors · 2017-02-13T16:07:50Z

📌 Commit d90a7b3 has been approved by alexcrichton

bors · 2017-02-13T23:48:15Z

⌛ Testing commit d90a7b3 with merge cccf756...

steveklabnik · 2017-02-13T23:58:21Z

@bors: retry

bors · 2017-02-14T14:47:37Z

⌛ Testing commit d90a7b3 with merge 9344cd3...

arthurprs · 2017-02-14T15:11:12Z

I overlooked it before, but we may want to have &V instead of &mut V for consistency with Vec and friends (#25477), even if it's strictly less useful.

alexcrichton · 2017-02-14T15:24:45Z

Ah no that's intentional, we'd like to change Vec but we're unfortunately unable to :(

bors · 2017-02-14T16:28:05Z

💔 Test failed - status-travis

alexcrichton · 2017-02-14T17:44:37Z

@bors: retry * #38878

bors · 2017-02-15T07:30:14Z

⌛ Testing commit d90a7b3 with merge ea8c629...

bors · 2017-02-15T10:22:30Z

☀️ Test successful - status-appveyor, status-travis
Approved by: alexcrichton
Pushing ea8c629 to master...

Implement BTreeMap::retain and BTreeSet::retain Adds new methods `BTreeMap::retain` and `BTreeSet::retain`. These are implemented on top of `drain_filter` (rust-lang#70530). The API of these methods is identical to `HashMap::retain` and `HashSet::retain`, which were implemented in rust-lang#39560 and stabilized in rust-lang#36648. The docs and tests are also copied from HashMap/HashSet. The new methods are unstable, behind the `btree_retain` feature gate, with tracking issue rust-lang#79025. See also rust-lang/rfcs#1338.

rust-highfive assigned bluss Feb 5, 2017

bluss reviewed Feb 5, 2017

alexcrichton added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Feb 6, 2017

arthurprs reviewed Feb 7, 2017

arthurprs suggested changes Feb 8, 2017

std: Add retain method for HashMap and HashSet

d90a7b3

Fix #36648

arthurprs approved these changes Feb 9, 2017

frewsxcv added a commit to frewsxcv/rust that referenced this pull request Feb 14, 2017

Rollup merge of rust-lang#39560 - F001:retainHashMap, r=alexcrichton

ced87c6

std: Add retain method for HashMap and HashSet Fix rust-lang#36648 r? @bluss

frewsxcv mentioned this pull request Feb 14, 2017

Rollup of 10 pull requests #39800

Closed

frewsxcv added a commit to frewsxcv/rust that referenced this pull request Feb 14, 2017

Rollup merge of rust-lang#39560 - F001:retainHashMap, r=alexcrichton

6e3c54f

std: Add retain method for HashMap and HashSet Fix rust-lang#36648 r? @bluss

frewsxcv mentioned this pull request Feb 14, 2017

Rollup of 9 pull requests #39805

Closed

frewsxcv added a commit to frewsxcv/rust that referenced this pull request Feb 14, 2017

Rollup merge of rust-lang#39560 - F001:retainHashMap, r=alexcrichton

d9e3131

std: Add retain method for HashMap and HashSet Fix rust-lang#36648 r? @bluss

frewsxcv mentioned this pull request Feb 14, 2017

Rollup of 10 pull requests #39816

Closed

frewsxcv added a commit to frewsxcv/rust that referenced this pull request Feb 15, 2017

Rollup merge of rust-lang#39560 - F001:retainHashMap, r=alexcrichton

5a85e56

std: Add retain method for HashMap and HashSet Fix rust-lang#36648 r? @bluss

frewsxcv mentioned this pull request Feb 15, 2017

Rollup of 6 pull requests #39833

Closed

bors added a commit that referenced this pull request Feb 15, 2017

Auto merge of #39560 - F001:retainHashMap, r=alexcrichton

ea8c629

std: Add retain method for HashMap and HashSet Fix #36648 r? @bluss

bors merged commit d90a7b3 into rust-lang:master Feb 15, 2017

mbrubeck mentioned this pull request Apr 17, 2017

Map::retain rust-lang/rfcs#1338

Closed

2 tasks

mbrubeck mentioned this pull request Aug 21, 2017

Add retain function servo/rust-smallvec#59

Merged

F001 deleted the retainHashMap branch November 16, 2017 08:44

mbrubeck mentioned this pull request Nov 13, 2020

Implement BTreeMap::retain and BTreeSet::retain #79026

Merged

finnbear mentioned this pull request Aug 7, 2023

Tracking Issue for linked_list_retain #114135

Open

3 tasks
