Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: caching LookupSet implementation #654

Merged
merged 2 commits into from
Dec 9, 2021
Merged

feat: caching LookupSet implementation #654

merged 2 commits into from
Dec 9, 2021

Conversation

itegulov
Copy link
Contributor

@itegulov itegulov commented Dec 1, 2021

Closes #586

The interface matches std::collections::HashSet without the functions that require being aware of all values in the set. The implementation relies on LookupMap for all the heavy lifting.

Copy link
Contributor

@austinabell austinabell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably fine to come in if you make that one change to not expose the lookup_set module, but we will probably want to open an issue to optimize this and complete it soon (because there might be a format change, but probably not because the value shouldn't serialize as any bytes)

near-sdk/src/store/lookup_map/mod.rs Show resolved Hide resolved
T: Borrow<Q>,
Q: BorshSerialize + ToOwned<Owned = T> + Ord,
{
self.map.contains_key(value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noting that one unintentional property that it will only cache the result if the map does not contain the key. The reason the map can't cache the fact that the key does exist, is because it does not actually read the value from storage.

where
T: Clone,
{
self.map.insert(value, ()).is_none()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will note that for things like this, it will incur a larger storage cost. If you look at https://github.com/near/nearcore/blob/d0087f6c5006c1bf581e2eb30ab430e89cb1bc74/nearcore/res/mainnet_genesis.json#L157 you can see storage_read_base (and per-key byte) is slightly larger than storage_has_key_base. When doing this remove, it is doing a read of the bytes rather than just checking if it exists.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the link, I will use it as a reference in the future

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, maybe I am misunderstanding something here, but I am confused what do storage_read_base and storage_has_key_base have to do with LookupMap::insert. From what I see it just uses env::storage_write (that happens in flush()) and nothing else. I don't see a more efficient way of writing a key to the trie other than writing the key itself and 0 bytes as the value (from what I understand writing charges a fixed base cost + per byte rate for both key and value data).

T: Borrow<Q>,
Q: BorshSerialize + ToOwned<Owned = T> + Ord,
{
self.map.remove(value).is_some()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as before

near-sdk/src/store/mod.rs Outdated Show resolved Hide resolved
@austinabell
Copy link
Contributor

Also, do you mind adding a CHANGELOG.md entry for this? We try to keep up with breaking or feature changes like this with the PR

@itegulov
Copy link
Contributor Author

itegulov commented Dec 2, 2021

@austinabell judging by your remarks (which are fair), I am guessing we can't actually avoid implementing LookupSet properly (and not use LookupMap internally as I am doing right now) if we want to have a (reasonably) optimal solution?

@ChaoticTempest
Copy link
Member

@itegulov Think you could just override the map.flush() call and implement your own for LookupSet. That's where the env::storage_write and env::storage_remove calls happens. Maybe this requires you exposing LookupMap.cache to pub(crate) though. This is probably the nice part about everything being deferred till Drop::drop

@itegulov
Copy link
Contributor Author

itegulov commented Dec 2, 2021

@ChaoticTempest this might help with insert/remove cost (although, as I mentioned in another thread, I am confused how exactly that would work), but I don't think it will help with lookup efficiency since that actually happens in LookupSet::contains and not LookupSet::flush.

@austinabell
Copy link
Contributor

@austinabell judging by your remarks (which are fair), I am guessing we can't actually avoid implementing LookupSet properly (and not use LookupMap internally as I am doing right now) if we want to have a (reasonably) optimal solution?

Yeah, let's just do in a follow up PR though, this is a valid solution for now and is under an unstable API

@itegulov Think you could just override the map.flush() call and implement your own for LookupSet. That's where the env::storage_write and env::storage_remove calls happens. Maybe this requires you exposing LookupMap.cache to pub(crate) though. This is probably the nice part about everything being deferred till Drop::drop

The extra gas costs are on reads though, not writes. These calls will be the same regardless, there just might be slightly more computation to serialize zero bytes through borsh, but this probably gets optimized away anyway. Am I misunderstanding the suggestion?

Copy link
Contributor

@austinabell austinabell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just need to resolve conflicts and we can pull this in

@itegulov itegulov merged commit 1a37d85 into master Dec 9, 2021
@itegulov itegulov deleted the lookup-set-v2 branch December 9, 2021 02:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

v2 LookupSet
3 participants