Simplify staker sampling #2056
tests/acceptance/blockchain/agents/test_sampling_distribution.py
be9cb3f to 6416894
@fjarri - Can the integration test failure be fixed via rebase over master?
d332dd2 to 0f8b8dc
Evidently, yes :)
Codecov Report
@@            Coverage Diff             @@
##             main    #2056      +/-   ##
==========================================
- Coverage   83.66%   83.64%   -0.02%
==========================================
  Files         103      103
  Lines       15039    15058      +19
==========================================
+ Hits        12582    12595      +13
- Misses       2457     2463       +6
88d6eec to b3a031f
3856658 to 1292692
Still digesting the PR - but the sampling algorithm is much easier to understand, and I like the reservoir abstraction.
👍 - Looking good - I invite reviews from @cygnusv, @GhadaAlmashaqbeh, and @michwill!
In general, looking very good. It definitely improves the codebase, getting rid of old and ugly code and replacing it with a much simpler yet elegant alternative. Great stuff!
I have some comments where I'd like your opinion, though.
selected_addresses.update(sampled_addresses)
found_ursulas = self.__find_ursulas(sampled_addresses, quantity)
return found_ursulas
return set(found_ursulas)
If the output is a set, I don't know if the shuffling is necessary as the iteration order over sets is arbitrary.
Hm, don't sets share the ordered behaviour with dicts since Py3.7?
🎸
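For reference on the ordering question in this thread: Python guarantees insertion order for dict since 3.7, but that guarantee does not extend to set, whose iteration order remains an implementation detail. A minimal sketch, using made-up address strings and a hypothetical helper name, of how the two differ when order needs to be kept:

```python
from typing import Iterable, List


def dedupe_keep_order(items: Iterable[str]) -> List[str]:
    """Remove duplicates while keeping first-seen order.

    Relies on dict preserving insertion order (guaranteed since Python 3.7).
    """
    return list(dict.fromkeys(items))


# Illustrative values only; these are not real staker addresses.
drawn = ["ursula_c", "ursula_a", "ursula_b", "ursula_a"]

ordered = dedupe_keep_order(drawn)   # first-seen order is preserved
unordered = set(drawn)               # same members, but no order guarantee
```

So if the draw order carries meaning (for example, priority), returning a set discards it, and any shuffle applied before the conversion has no observable effect.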
nucypher/policy/policies.py
handpicked_ursulas = handpicked_ursulas or set()
selected_ursulas = set(handpicked_ursulas)
Feels like the previous one-liner was simpler
Fair enough, not sure how it ended up this way. Fixed.
More logical weighted sampling for stakers, with the sampler extracted to a separate class for ease of unit testing.
additional_ursulas and attempts parameters removed from sample().
The second commit exposes the staker sampler as an iterator-like object.
BlockchainPolicy.sample_essential() now works as follows:
[...] quantity stakers is needed
[...] handpicked_ursulas right away)
[...] quantity of stakers, see which ones are already known, and send the rest to the learner
The nodes that were drawn first have the priority. The returned result is shuffled (or should it be shuffled in sample() instead, that is, shuffle the known stakers as well?)
Advantages over the previous sampling method:
[...] sleep() and let the learner do its thing
There are more advanced sampling methods, e.g. "Weighted random sampling with a reservoir" by Efraimidis and Spirakis, but they require floats. I'll stick to integers for now, in case we ever want it back in a contract.
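To make the integer-only approach concrete, here is a minimal sketch of weighted sampling without replacement that uses nothing but integer arithmetic. It is not the code from this PR: the class name IntegerWeightedSampler, the method name sample_no_replacement, and the example weights are all assumptions made for illustration.

```python
import random
from bisect import bisect_right
from itertools import accumulate
from typing import Dict, List


class IntegerWeightedSampler:
    """Hypothetical sampler: draws keys without replacement, with
    probability proportional to positive integer weights."""

    def __init__(self, weights: Dict[str, int]):
        # Keys with zero weight can never be drawn, so drop them up front.
        self._keys = [key for key, weight in weights.items() if weight > 0]
        self._weights = [weights[key] for key in self._keys]

    def __len__(self) -> int:
        return len(self._keys)

    def sample_no_replacement(self, rng: random.Random, quantity: int) -> List[str]:
        if quantity > len(self):
            raise ValueError("not enough elements left to sample from")
        drawn = []
        for _ in range(quantity):
            # Cumulative sums are strictly increasing because weights are positive.
            totals = list(accumulate(self._weights))
            # Uniform integer in [0, total weight); no floats involved.
            point = rng.randrange(totals[-1])
            # First index whose cumulative weight exceeds the drawn point.
            idx = bisect_right(totals, point)
            drawn.append(self._keys.pop(idx))
            self._weights.pop(idx)
        return drawn


# Illustrative usage; names and stake values are made up.
sampler = IntegerWeightedSampler({"alice": 10, "bob": 30, "carol": 60, "dave": 0})
picked = sampler.sample_no_replacement(random.Random(), 2)
```

Recomputing the cumulative sums on every draw costs O(n) per element, which keeps the sketch short; a tree of partial sums would reduce that if the staker set were large. The returned list is in draw order, so the first-drawn elements can be given priority by the caller.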