Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

some possible inefficiences #28

Open
jaykrell opened this issue Apr 11, 2021 · 3 comments
Open

some possible inefficiences #28

jaykrell opened this issue Apr 11, 2021 · 3 comments

Comments

@jaykrell
Copy link

rangemap

This is one my favorate data structures.
Thank you for implementing it.

I believe ranges are under utilized in practise in general.

I implemented the "same" thing in C++ years ago.

My implementation has some efficiency gaps and I think yours does too.

I'm not good at Rust, but when I went to reimplement
my C++ open and perfected, I ran into, something
I think you "messed up" too.

Specifically I think there need not be any linear searches.

Including here:

https://github.com/jeffparsons/rangemap/blob/master/src/map.rs#L155

    while let Some((stored_range_start_wrapper, stored_value)) = self

while I have not finished my code, I believe you can limit yourself
to two binary searches -- for the start and the end.

Also here, I think your implementation sometimes does an insert
where it can reuse an existing range.

I realize this structure has surprisingly many cases to handle and
making it perfect is not trivial.

e.g. you should avoid this:

// Delete the stored range, and then add back between

    // Delete the stored range, and then add back between
    // 0 and 2 subranges at the ends of the range to insert.
    self.btm.remove(&stored_range_start_wrapper);
@jeffparsons
Copy link
Owner

Hi Jay, thanks for getting in touch! 👋

https://github.com/jeffparsons/rangemap/blob/master/src/map.rs#L155

Unfortunately the Rust BTreeMap API is quite limited at the moment, so I think removing all entries between the start and end of the range we're interested isn't possible. I'm sure it's possible to optimise the current implementation a bit, but rather than doing that and making it significantly more complex now, I'd rather wait until features like rust-lang/rust#81075 are stabilized and I can optimise it and make it simpler at the same time. 😊

The Holy Grail in my mind would be a proper BTreeMap cursor API that would let you navigate the tree in whatever direction(s) you want, and delete/insert at whatever point you're at. A handy extension would be methods that consume two non-mutating cursors to do things like removing that whole range from the map, which would let you efficiently find your start and end points however you want and then remove all elements in the range without having to re-find those elements you already found.

Also here, I think your implementation sometimes does an insert
where it can reuse an existing range.

I absolutely agree, and this is one of the things I'm least happy with. But as above, I'm hesitant to optimise it right now and make it much more complex in the process given that I know there are some better upstream APIs on the horizon that should give me the best of both worlds.

In short, yepp, I completely agree with you, but I don't really have the time right now to implement and maintain those optimizations.

@jaykrell
Copy link
Author

Yeah, C++ maybe has advantages here, it already has the iterators (cursor) and the range delete, though there also inconsistencies between global upper_bound / lower_bound and set/map::upper_bound/lower_bound -- set/map force you to always start at the root, but ideal here is, I think, you do upper or lower bound of the range start, and then from there for the end.

And to further agree, I put more time into this in C++ and it is, tricky. There are so many cases, overlap, adjacent, on each side (of each side), and the data matching or not. A few cases can be combined.

Also to be clear, in terms of insert/delete with cursor, it should be a hint probably...depending on if the underlying storage is a vector whose order you maintain, or a tree that maintains its own ordering. I generally prefer the tree/set/map underlying except for that nagging problem in STL where upper_bound/lower_bound in set/map cannot take a starting point.

@Rua
Copy link

Rua commented Mar 7, 2022

While working on #40 I noticed that there are a few places where a key is cloned. This seems like it could be inefficient depending on what's being used as the key, but I'm not sure what could be used as an alternative.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants