Simplify staker sampling #2056

fjarri · 2020-05-28T23:28:06Z

More logical weighted sampling for stakers, with the sampler extracted into a separate class for ease of unit testing. The additional_ursulas and attempts parameters are removed from sample().

The second commit exposes the staker sampler as an iterator-like object. BlockchainPolicy.sample_essential() now works as follows:

quantity stakers are needed
Get the stakers reservoir based on the list of stakers returned by the contract (excluding handpicked_ursulas right away)
Draw quantity of stakers, see which ones are already known, and send the rest to the learner
Wait a little, check for newly known nodes. If it's still not enough (some nodes are being checked), draw more stakers from the reservoir and send them to the learner.
Loop until time is expired or we have enough nodes
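
The steps above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the PR: `StubReservoir`, the free function `sample_essential`, and the `get_known` / `learn` callbacks are hypothetical stand-ins for the stakers reservoir, the policy method, and the learner, and the stub draws uniformly rather than by stake.

```python
import random
import time
from typing import Callable, Iterable, List, Optional, Set


class StubReservoir:
    """Hypothetical stand-in for the stakers reservoir (uniform, not weighted)."""

    def __init__(self, addresses: Iterable[str], rng: Optional[random.Random] = None):
        self._pool = list(addresses)
        (rng or random.Random()).shuffle(self._pool)

    def __len__(self) -> int:
        return len(self._pool)

    def draw(self, quantity: int) -> List[str]:
        """Remove and return up to `quantity` addresses."""
        drawn, self._pool = self._pool[:quantity], self._pool[quantity:]
        return drawn


def sample_essential(
    reservoir: StubReservoir,
    quantity: int,
    get_known: Callable[[], Set[str]],    # assumed: addresses the learner has verified
    learn: Callable[[List[str]], None],   # assumed: hands a batch to the learner
    timeout: float = 5.0,
    poll: float = 0.1,
    rng: Optional[random.Random] = None,
) -> List[str]:
    if quantity > len(reservoir):
        raise ValueError("not enough stakers in the reservoir")
    rng = rng or random.Random()

    # Draw the first batch; only the unknown addresses go to the learner.
    drawn = reservoir.draw(quantity)
    known = get_known()
    learn([a for a in drawn if a not in known])

    def found_so_far() -> List[str]:
        current = get_known()
        # `drawn` keeps draw order, so earlier draws have priority.
        return [a for a in drawn if a in current]

    deadline = time.monotonic() + timeout
    found = found_so_far()
    while len(found) < quantity:
        if time.monotonic() >= deadline:
            raise TimeoutError("could not find enough known nodes in time")
        time.sleep(poll)  # let the learner do its thing
        found = found_so_far()
        missing = quantity - len(found)
        if missing > 0:
            # Still short: draw replacements and send them as one batch.
            extra = reservoir.draw(missing)
            drawn.extend(extra)
            learn(extra)

    result = found[:quantity]
    rng.shuffle(result)
    return result
```

In this sketch a slow learner causes a top-up on every poll, so a real implementation would likely need to decide how eagerly to draw replacements.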

The nodes that were drawn first have priority. The returned result is shuffled (or should the shuffling happen in sample() instead, so that the known stakers are shuffled as well?)

Advantages over the previous sampling method:

no need to "pre-sample" 1.5x of Ursulas; we draw samples as we need them.
simpler algorithm structure
we give addresses to the learner in a batch instead of one by one
do not run the loop again and again; sleep() and let the learner do its thing

There are more advanced sampling methods, e.g. "Weighted random sampling with a reservoir" by Efraimidis and Spirakis, but they require floats. I'll stick to integers for now, in case we ever want it back in a contract.
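
As an illustration of the integer-only approach, here is a minimal sketch of weighted sampling without replacement using cumulative sums and an integer random point. It is not the Efraimidis and Spirakis algorithm and not necessarily what the PR implements; the class name `IntegerWeightedSampler` and its method are hypothetical.

```python
import random
from bisect import bisect_right
from itertools import accumulate
from typing import Dict, List, Optional


class IntegerWeightedSampler:
    """Hypothetical sampler: the chance of a pick is proportional to an integer weight."""

    def __init__(self, weights: Dict[str, int], rng: Optional[random.Random] = None):
        if any(not isinstance(w, int) or w <= 0 for w in weights.values()):
            raise ValueError("weights must be positive integers")
        self._elements = list(weights)
        self._weights = list(weights.values())
        self._rng = rng or random.Random()

    def __len__(self) -> int:
        return len(self._elements)

    def sample_no_replacement(self, quantity: int) -> List[str]:
        if quantity > len(self):
            raise ValueError("cannot sample more elements than remain")
        picked = []
        for _ in range(quantity):
            # Running totals: element i owns the range [totals[i-1], totals[i]).
            totals = list(accumulate(self._weights))
            # Integer point in [0, total weight).
            point = self._rng.randrange(totals[-1])
            index = bisect_right(totals, point)
            # Remove the winner so it cannot be drawn again.
            picked.append(self._elements.pop(index))
            self._weights.pop(index)
        return picked
```

Recomputing the totals on every draw costs O(n) per pick, which keeps the sketch short; a version meant for large staker lists would probably update the totals incrementally.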

KPrasch · 2020-06-03T18:26:03Z

@fjarri - Can the integration test failure be fixed via rebase over master?

fjarri · 2020-06-03T20:20:01Z

Evidently, yes :)

codecov · 2020-06-03T20:21:29Z

Codecov Report

Merging #2056 into main will decrease coverage by 0.01%.
The diff coverage is 86.95%.

@@            Coverage Diff             @@
##             main    #2056      +/-   ##
==========================================
- Coverage   83.66%   83.64%   -0.02%     
==========================================
  Files         103      103              
  Lines       15039    15058      +19     
==========================================
+ Hits        12582    12595      +13     
- Misses       2457     2463       +6

Impacted Files	Coverage Δ
nucypher/blockchain/eth/agents.py	`92.50% <85.41%> (-0.38%)`	⬇️
nucypher/policy/policies.py	`90.87% <86.48%> (+0.35%)`	⬆️
nucypher/blockchain/eth/actors.py	`86.56% <100.00%> (-0.02%)`	⬇️
nucypher/network/nodes.py	`81.84% <100.00%> (+0.02%)`	⬆️
nucypher/cli/commands/alice.py	`85.51% <0.00%> (-0.94%)`	⬇️
nucypher/characters/base.py	`89.59% <0.00%> (ø)`

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 76eea57...449139e.

derekpierre

Still digesting the PR - but the sampling algorithm is much easier to understand, and I like the reservoir abstraction.

KPrasch

👍 - Looking good - I invite reviews from @cygnusv, @GhadaAlmashaqbeh, and @michwill !

cygnusv

In general, looking very good. It definitely improves the codebase, getting rid of old and ugly code and replacing it with a much simpler yet elegant alternative. Great stuff!
I have some comments where I'd like your opinion, though.

cygnusv · 2020-08-07T18:12:20Z

nucypher/policy/policies.py

-        selected_addresses.update(sampled_addresses)
-        found_ursulas = self.__find_ursulas(sampled_addresses, quantity)
-        return found_ursulas
+        return set(found_ursulas)


If the output is a set, I don't know if the shuffling is necessary as the iteration order over sets is arbitrary.

fjarri · 2020-08-08

Hm, don't sets share the ordered behaviour with dicts since Py3.7?
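
For reference, a small sketch of the distinction raised here: the language guarantees insertion order for dict since Python 3.7, but makes no ordering guarantee for set. If a random order matters, one option is to shuffle a list built from the set. The helper name `shuffled` and the sample values are hypothetical.

```python
import random
from typing import Iterable, List, Optional


def shuffled(items: Iterable[str], rng: Optional[random.Random] = None) -> List[str]:
    """Return the items as a list in explicitly randomized order."""
    result = list(items)
    (rng or random.Random()).shuffle(result)
    return result


ursulas = ["ursula-3", "ursula-1", "ursula-2"]

# dict preserves insertion order (a language guarantee since Python 3.7).
ordered = list(dict.fromkeys(ursulas))

# set only guarantees membership; its iteration order is unspecified,
# so an explicit shuffle is what makes the order random by design.
randomized = shuffled(set(ursulas))
```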

derekpierre

🎸

derekpierre · 2020-08-11T17:39:27Z

nucypher/policy/policies.py

+        handpicked_ursulas = handpicked_ursulas or set()
+        selected_ursulas = set(handpicked_ursulas)


Feels like the previous one-liner was simpler

fjarri · 2020-08-12

Fair enough, not sure how it ended up this way. Fixed.

fjarri requested review from cygnusv, jMyles and KPrasch May 28, 2020 23:29

fjarri commented May 28, 2020

KPrasch requested review from michwill and vzotova May 28, 2020 23:31

fjarri force-pushed the sampling branch from 4fd8101 to 2463eae Compare May 28, 2020 23:36

KPrasch reviewed May 28, 2020

cygnusv reviewed May 28, 2020

fjarri force-pushed the sampling branch 2 times, most recently from be9cb3f to 6416894 Compare May 29, 2020 01:57

fjarri added the Enhancement New or improved features label May 29, 2020

mswilkison added this to the Before activating StakersEscrow milestone Jun 1, 2020

fjarri force-pushed the sampling branch 2 times, most recently from d332dd2 to 0f8b8dc Compare June 3, 2020 20:12

fjarri force-pushed the sampling branch 12 times, most recently from 88d6eec to b3a031f Compare June 9, 2020 23:09

fjarri force-pushed the sampling branch 3 times, most recently from 3856658 to 1292692 Compare June 10, 2020 00:55

fjarri marked this pull request as ready for review July 6, 2020 16:18

KPrasch requested review from KPrasch, cygnusv, derekpierre, mswilkison, tuxxy and vepkenez July 7, 2020 15:58

mswilkison reviewed Jul 7, 2020

KPrasch requested a review from GhadaAlmashaqbeh July 8, 2020 18:31

derekpierre reviewed Jul 13, 2020

KPrasch approved these changes Jul 13, 2020

KPrasch changed the base branch from master to main July 21, 2020 00:41

fjarri force-pushed the sampling branch from 1292692 to 251d0e9 Compare July 29, 2020 23:40

cygnusv reviewed Aug 7, 2020

fjarri force-pushed the sampling branch from 251d0e9 to d3960a4 Compare August 8, 2020 01:46

cygnusv approved these changes Aug 10, 2020

fjarri added 4 commits August 10, 2020 12:33

Simplify staker sampling and add unit tests for proper sampling distribution

31715e5

Expose staker sampling as an iterator (ish)

9ca6c6d

Implement RFCs

c76b8f5

Implement RFCs, part 2

f8c562a

fjarri force-pushed the sampling branch from d3960a4 to f8c562a Compare August 10, 2020 19:36

derekpierre approved these changes Aug 11, 2020

fjarri added 2 commits August 11, 2020 22:27

Implement RFCs, part 3

3237176

Fix a logical mistake in sample()

449139e

KPrasch merged commit 697a6d3 into nucypher:main Aug 13, 2020

fjarri deleted the sampling branch August 18, 2020 00:23
