-
Notifications
You must be signed in to change notification settings - Fork 254
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(store): Add new TreeMap
implementation
#665
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have taken just a quick look through the code so far. Will try to dive deeper later this week. Let's finish these collections 😉
key: K, // key stored in a node | ||
lft: Option<FreeListIndex>, // left link of a node | ||
rgt: Option<FreeListIndex>, // right link of a node | ||
ht: u32, // height of a subtree at a node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know you have inherited the AVL implementation from the old collection and this optimization is probably not worth the trouble, but instead of storing tree height, you can actually just store the balance factor (-1, 0, 1). It is quite easy to see that it would be enough by looking at all current usages of ht
. From what I remember upholding the balance factor invariant is a bit tricky though and the gain is minuscule (if any at all), but I thought I would share my thoughts just in case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's tackle this in a following PR. This is under unstable
so let's try to refactor and optimize once we have a framework to compare the optimizations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I don't mind. Just thought I would point this out before I forget :)
/// - `insert`/`remove`: O(log(N)) | ||
/// - `min`/`max`: O(log(N)) | ||
/// - `above`/`below`: O(log(N)) | ||
/// - `range` of K elements: O(Klog(N)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar thing here - if we make iterators actually hold Node
instances, we would be able to answer range
queries with a better complexity of O(K + log(N))
. Every time you need next
element, you would just follow a deterministic algorithm of finding the next/previous value in a binary tree. This might be a little more worthwhile than my last point, but I leave it to your judgement.
I am not 100% sure on my assessment of complexity as each next-element-lookup takes O(log(N))
, so if we judge the complexity naively it would still be O(Klog(N))
. However, if we actually do amortized analysis here, I think in a BST with inner nodes it should be O(K + log(N))
. I can uncover some existing formal proof on this if you want or, if not, probably can come up with one myself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's definitely possible and should be done, but requires a decent refactor as with the current impl we don't have a way to traverse up to the parent for these operations. Perhaps like the previous point we should implement this optimization in a future PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did a mini pass on this, with some minor comments. I don't understand enough about AVL-Trees so will do another pass later on when I up my understanding
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation looks good to go. The interface mainly looks the same as the older one, and we should be able to refine it based on what Daniyar mentioned
near-sdk/src/store/tree_map/mod.rs
Outdated
let prefix = prefix.into_storage_key(); | ||
let mut vec_key = prefix.into_storage_key(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the extra into_storage_key()
intended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just some documentation nitpicks and a potential thing to look into
K: Borrow<Q>, | ||
Q: BorshSerialize + ToOwned<Owned = K> + Ord, | ||
{ | ||
self.values.contains_key(k) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I would personally rename values
to something else as lines like this one look a little odd
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't personally see the issue, but don't have a preference, so I'm happy to hear if you have a name that might be more clear?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just meant that it looks weird that the map has a field named values
that also hold keys. Feel free to ignore my comment, I don't really mind either
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, technically it doesn't hold the keys, it just is generic on the key type to do lookups
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point too
} else { | ||
if key.lt(&node.key) { | ||
let idx = match node.lft { | ||
Some(lft) => self.insert_at(expect(self.node(lft)).clone(), lft, key), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you explored whether it is possible to get rid of all these clones by writing another auxiliary method like this one:
fn node_mut(&mut self, id: FreeListIndex) -> Option<&mut Node<K>> {
self.nodes.get_mut(id)
}
And then just operating with &mut Node<K>
in this method (and maybe some others too)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, absolutely did, the issue with the current impl is you can't borrow this mutably and borrow other parts of the vec immutably. Yes we know there isn't overlap, but would need unsafe for this to happen. This is why clones currently happen in the implementation, but is still an improvement over the existing implementation because it was loading and deserializing the value where the clone was.
This can be optimized in the future without breaking changes with the current API, but would require almost a full re-write.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I thought something like that might be the case, thanks for confirming.
Closes #588
So this is somewhat based on the original impl, but I changed quite a bit, so probably would have been better to just start from scratch. That was incredibly painful to make the old impl more efficient and under the new API of the underlying data structures.
All the APIs are changed to match
std::collections::BTreeMap
.I'll make a pass-through tonight or tomorrow to make sure everything we can match we do, and then I will open this up. Otherwise, all current functionality will remain the same if anyone wants to start understanding the changes. I tried doing changes in place, to more clearly see differences (586f0f0) but this ended up not being very helpful because a lot of things changed.