Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement "entry" API for maps #17320

Closed
aturon opened this issue Sep 16, 2014 · 8 comments
Closed

Implement "entry" API for maps #17320

aturon opened this issue Sep 16, 2014 · 8 comments
Labels
B-RFC-approved Blocker: Approved by a merged RFC but not yet implemented.

Comments

@aturon
Copy link
Member

aturon commented Sep 16, 2014

Tracking issue for rust-lang/rfcs#216

@aturon aturon added A-libs B-RFC-approved Blocker: Approved by a merged RFC but not yet implemented. labels Sep 16, 2014
Gankra added a commit to Gankra/rust that referenced this issue Sep 18, 2014
Deprecates the `find_or_*` family of "internal mutation" methods on `HashMap` in
favour of the "external mutation" Entry API as part of RFC 60. Part of rust-lang#17320,
although this still needs to be done on `TreeMap` and `BTree`. Work on `TreeMap`
is TBD, and `BTree`'s work is part of the complete rewrite in rust-lang#17334.

The implemented API deviates from the API described in the RFC in two key places:

* `VacantEntry.set` yields a mutable reference to the inserted element to avoid code
duplication where complex logic needs to be done *regardless* of whether the entry
was vacant or not.
* `OccupiedEntry.into_mut` was added so that it is possible to return a reference
into the map beyond the lifetime of the Entry itself, providing functional parity
to `VacantEntry.set`.

[breaking-change]
Gankra added a commit to Gankra/rust that referenced this issue Sep 25, 2014
Deprecates the `find_or_*` family of "internal mutation" methods on `HashMap` in
favour of the "external mutation" Entry API as part of RFC 60. Part of rust-lang#17320,
although this still needs to be done on the rest of the maps, they don't have
any internal mutation methods defined, so they can be done without deprecating
or breaking anything. Work on `BTree`'s is part of the complete rewrite in rust-lang#17334.

The implemented API deviates from the API described in the RFC in two key places:

* `VacantEntry.set` yields a mutable reference to the inserted element to avoid code
duplication where complex logic needs to be done *regardless* of whether the entry
was vacant or not.
* `OccupiedEntry.into_mut` was added so that it is possible to return a reference
into the map beyond the lifetime of the Entry itself, providing functional parity
to `VacantEntry.set`.

This allows the full find_or_insert functionality to be implemented using this API.
A PR will be submitted to the RFC to amend this.

[breaking-change]
bors added a commit that referenced this issue Sep 25, 2014
Deprecates the `find_or_*` family of "internal mutation" methods on `HashMap` in
favour of the "external mutation" Entry API as part of RFC 60. Part of #17320,
but this still needs to be done on the rest of the maps. However they don't have
any internal mutation methods defined, so they can be done without deprecating
or breaking anything. Work on `BTree` is part of the complete rewrite in #17334.

The implemented API deviates from the API described in the RFC in two key places:

* `VacantEntry.set` yields a mutable reference to the inserted element to avoid code
duplication where complex logic needs to be done *regardless* of whether the entry
was vacant or not.
* `OccupiedEntry.into_mut` was added so that it is possible to return a reference
into the map beyond the lifetime of the Entry itself, providing functional parity
to `VacantEntry.set`.

This allows the full find_or_insert functionality to be implemented using this API.
A PR will be submitted to the RFC to amend this.

[breaking-change]
@Gankra
Copy link
Contributor

Gankra commented Sep 28, 2014

Do we want to implement this on SmallntMap? It borders on useless there (in theory you can avoid an index check, and maybe some other absolutely minor branching). It's pretty trivial to implement, but it just seems like a waste.

@alexcrichton
Copy link
Member

I've found it's often very convenient to have a consistent API across maps (HashMap/TreeMap/SmallIntMap/BtreeMap, etc). It's really useful when migrating from one container to another!

@Gankra
Copy link
Contributor

Gankra commented Sep 28, 2014

Fair enough.

@Gankra
Copy link
Contributor

Gankra commented Oct 9, 2014

Status Report:

  • HashMap and BTreeMap are done.
  • SmallIntMap is in the queue, although I'm super unhappy with where it is atm: implement Entry API for SmallIntMap #17879
  • @pczarn has offered to look into TreeMap, though they're quite busy (and I'd rather they focus on adapative hashing if anything)
  • @michaelsproul has offered to look into TrieMap

@gamazeps
Copy link
Contributor

Looking into TreeMap ;)

@michaelsproul
Copy link
Contributor

TrieMap is nearly done, I'll have a PR up in the next few days :)

bors added a commit that referenced this issue Nov 10, 2014
…=bstrie

I've implemented the new collection views API for TrieMap. I more or less followed the approach set out by @gankro in BTreeMap, by using a `SearchStack`. There's quite a bit of unsafe code, but I've wrapped it safely where I think is appropriate. I've added tests to ensure everything works, and performance seems quite good.

```
test trie::bench_map::bench_find                           ... bench:     67879 ns/iter (+/- 4192)
test trie::bench_map::bench_find_entry                     ... bench:    186814 ns/iter (+/- 18748)
test trie::bench_map::bench_insert_large                   ... bench:    716612 ns/iter (+/- 160121)
test trie::bench_map::bench_insert_large_entry             ... bench:    851219 ns/iter (+/- 20331)
test trie::bench_map::bench_remove                         ... bench:    838856 ns/iter (+/- 27998)
test trie::bench_map::bench_remove_entry                   ... bench:    981711 ns/iter (+/- 53046)
```

Using an entry is slow compared to a plain find, but is only ~15% slower for inserts and removes, which is where this API is most useful. I'm tempted to remove the standalone `remove` function in favour of an entry-based approach (to cut down on complexity).

I've added some more comments to the general part of the code-base, which will hopefully help the next person looking over this. I moved the three key structures to the top of the file so that the nesting structure is clearly visible, and renamed `Child<T>` to `TrieNode<T>` and `TrieNode<T>` to `InternalNode<T>` to improve clarity. If these changes are creeping, I'm happy to revert them.

Let me know if my use of `fail!` is ok, I was a little unsure of how specific to be. Some of the data-structures have various invariants that shouldn't be broken, so using `fail!` seemed appropriate.

## Still to do

* Modernise iterators (make them double-ended).
* Make the keys generic, or rename this data-structure (see: #14902).
* Possibly move this code out of libcollections. [Searching Github for TrieMap turns up very few real results.][triemap-search]

Related issues: #18009 and #17320

[triemap-search]: https://github.com/search?utf8=%E2%9C%93&q=TrieMap+language%3ARust&type=Code&ref=searchresults
@steveklabnik
Copy link
Member

@gankro what's the status of this today?

@Gankra
Copy link
Contributor

Gankra commented Feb 15, 2015

Resolved!

@Gankra Gankra closed this as completed Feb 15, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
B-RFC-approved Blocker: Approved by a merged RFC but not yet implemented.
Projects
None yet
Development

No branches or pull requests

6 participants