Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

avoid memory allocations and copies when loading states #937

Closed
wants to merge 3 commits into from

Conversation

arnetheduck
Copy link
Member

  • rolls back some of the ref changes
  • adds utility to calculate stack sizes

@arnetheduck
Copy link
Member Author

incidentally, this also fixes most new warnings

proc decode(data: openArray[byte]) =
try:
# TODO can't write to output directly..
outputAddr[] = SSZ.decode(data, BeaconState)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's easy to create a helper that can write to a var directly. I'll look into after the PR is merged.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the serialization library would have to be rewritten to not use exceptions and assume a blank slate (empty seqs etc) which would complicate it somewhat - on the plus side it could reuse seq memory etc instead of allocating new seqs - in general though, it's uglier and less safe - specially in the presence of exceptions - this BeaconChainDB acts as an exception barrier so it can do tricks like this somewhat safely as long as it has the rollback workaround

the bigger issue here is the lifetime of outputAddr - I wish there was a safe construct for this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that if BeaconState becomes a ref, the compiler might move it somewhere else on the heap and invalidate the address.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, I put unsafeAddr here so that we can grep for it easily - it should be safe though since the scope is local

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

besides, nim isn't fancy enough to do compacting, is it?

@@ -122,11 +122,15 @@ proc getStateFromSnapshot(conf: BeaconNodeConf): NilableBeaconStateRef =
quit 1

try:
result = SSZ.decode(snapshotContents, BeaconStateRef)
let res = BeaconStateRef()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you consider it necessary to rewrite the code here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to have only one code path for loading BeaconState avoiding subtle differences and bugs - the most important use case is loading states efficiently in the BlockPool that manages state rewind which is used all over the place in security critical contexts thus this use case guides the tradeoffs in the API

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow. My question was why did you decide to rewrite:

result = SSZ.decode(snapshotContents, BeaconStateRef)

into

let res = BeaconStateRef()
res[] = SSZ.decode(snapshotContents, BeaconState)
result = res

I see no practical benefits of doing this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this way, we don't hit any ref code paths in SSZ that implicitly allocate references - indeed, it means we only use one set of code paths to load BeaconState across the whole codebase - allocate-then-set.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this argument unconvincing. The ref code paths are already used at will in the test suite. One can argue they are better tested :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well - not really - there are no ref inside any SSZ objects any more so the ref support in SSZ could be taken away, simplifying it - in fact, this was one of the points raised on the eth1 branch, that loading ref:s inside SSZ is better avoided since it's not a major use case and the code would be simpler without it - loading ref with value semantics as is implemented right now is questionable: they're no longer refs then, but values that behave like refs which is weird and unexpected.

Copy link
Contributor

@zah zah Apr 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Simplify" is an overstatement here. Since SSZ is based on generic functions, you'll compile the ref support only if you actually use it. And if you use it, it will produce just a single additional function that will have exactly the same body as the 3 lines written here.

Since writing 1 line is simpler than writing 3 lines, I'd say the having the ref support simplifies things. I'm not sure why you seem to be fighting the idea that we could employ data sharing between the SSZ objects, but this would require supporting refs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd flip the question and ask why it was changed to begin - this wasn't a necessary change and broke away from what the other code was doing already - as of now, hiding allocations inside ssz doesn't bring any benefits, while it does add costs - there was code designed specifically to avoid it, so instead of having 2 styles, it seems simpler to have one made out of simpler pieces.

we can change this one back, but ideally we should be moving towards hashedbeaconstate/statedata consistently to decrease the surface of options going around.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Elegance? I wouldn't insist if you feel so strongly about it, but the 3 line code is definitely triggering my OCD tendencies :)

Copy link
Member Author

@arnetheduck arnetheduck Apr 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok - pushed a newClone update which should satisfy the 1-line OCD, keeps the ref allocation visible and separate from SSZ and exploit RVO at the same time - I view the hidden memory allocation as a risk and would prefer that it was removed from SSZ as far as is possible, until there's a motivation that gives a return for that risk - how's that for a compromise?

If yes, I'd go ahead and remove some of the ref changes that were introduced with eth1 to further de-risk SSZ - again, until there's an actual benefit that can be evaluated on its own, and not forced in as part of an unrelated feature.

doAssert stateOpt.isSome, "failed to obtain latest state. database corrupt?"
let tmpState = stateOpt.get
let tmpState = BeaconStateRef()
let stateOpt = db.getState(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this usage pattern is repeated few times, maybe a helper like db.getNewState() would be useful.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can have the added bonus that it avoids the allocation in the "not found" case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is init - it crashes if it doesn't find the state - see above, about code paths. that said, it could be rewritten to load the state directly into the result - I'm hesitant to do so though until more of the BeaconChainDB interface has been refactored - in particular, it should save the state and its stateroot index atomically and ditto for loading - that would make init a lot more clear.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've flagged 4 usages of this potential API. 3 of them are in non-testing code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the intent is that the BlockPool and BeaconStateDB don't work with BlockState directly because this causes some non-atomic behaviour: BeaconState should always be saved with a state root - I've updated the code to reflect this more clearly, but that refactoring is slightly out of scope here - it would need transaction support in the kv store which is still todo

Added getStateRef to the db tests which simplifies them a bit, but sticks with the same loading as block pool

@@ -8,16 +8,17 @@ import
cli do(kind: string, file: string):

template printit(t: untyped) {.dirty.} =
let v =
if cmpIgnoreCase(ext, ".ssz") == 0:
SSZ.loadFile(file, t)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the code was better before. Why do you insist on avoiding printit(BeaconStateRef)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same reason as elsewhere: one way of decoding states

else:
echo "Unknown file type: ", ext
quit 1
let v = new t
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kills the warning, has the same practical effect

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, you can kill the warnings with printit(NilableBeaconStateRef) too.

blckX = SSZ.loadFile(blck, SignedBeaconBlock)
flags = if verifyStateRoot: {skipStateRootValidation} else: {}

var stateY = HashedBeaconState(data: stateX, root: hash_tree_root(stateX))
if not state_transition(stateY, blckX, flags, noRollback):
stateY.data = SSZ.loadFile(pre, BeaconState)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be fixed as well. It could be SSZ.loadFile(pre, stateY.data) (var version)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a second thought, RVO is already doing the var transformation, so the only gain is type inference.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed - which is why I think it's sort of not worth it

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well - there is one more reason: RVO in nim is buggy and slow - but to write it correctly is hard, so it'll likely be buggy anyway

@@ -10,23 +10,23 @@ import

type
AttestationInput = object
state: BeaconStateRef
state: BeaconState
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the benefit of this change. We are just creating more and more types that are unsafe to use as stack variables.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto - one code path for loading states to maintain the same behaviour when fuzzing and when running rewinds

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the way I view it is that we consistently give the caller the chance to control the allocation instead making that choice for them - in the 90% case, I'd agree that the caller should not be bothered, but in this case we've identified a particular bottleneck that warrants more careful allocation

else:
# TODO re-check crash here in mainnet
true
not db.getState(Eth2Digest(), tmpState[], noRollback)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another usage of db.getNewState

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer not to add test-only helpers in the the prod code - it seems better to stress the code that's used in real scenarios - is it motivated to add it to testutils? a helper there could call the prod code.

if state.isSome:
return jsonResult(state.get)
let tmp = BeaconStateRef() # TODO use tmpState - but load the entire StateData!
let state = node.db.getState(root.get, tmp[], noRollback)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another usage of db.getNewState()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the idea is not to allocate here but rather reuse an existing instance like tmpState - that requires more rewriting though which was planned for a later PR

beacon_chain/block_pool.nim Outdated Show resolved Hide resolved
proc decode(data: openArray[byte]) =
try:
# TODO can't write to output directly..
outputAddr[] = SSZ.decode(data, BeaconState)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that if BeaconState becomes a ref, the compiler might move it somewhere else on the heap and invalidate the address.

research/stackSizes.nim Show resolved Hide resolved
* rolls back some of the ref changes
* adds utility to calculate stack sizes
* works around bugs in nim exception handling and rvo
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants