
feat(cannon): Binary serialization for snapshots #7559

Closed
wants to merge 11 commits

Conversation

clabby
Member

@clabby clabby commented Oct 5, 2023

Overview

Moves the cannon state snapshots to a streamable binary serialization format, usable by types implementing the Serializable interface:

// Serializable defines functionality for a type that may be serialized to raw bytes.
type Serializable interface {
	// Serialize encodes the type as raw bytes.
	Serialize(out io.Writer) error

	// Deserialize decodes raw bytes into the type.
	Deserialize(in io.Reader) error
}

The encoding scheme is plain flat binary: when a fixed-size type is written, it's written raw, and when a dynamically sized piece of data is written, it's preceded by a length prefix.

Rationale

As it stands, the serialization format of state snapshots is somewhat convoluted: we first base64-encode the pages within Cannon's Memory, JSON-encode the full State in memory, then gzip the JSON. This change moves to a streamable binary format with a single compression pass over the State, which is typically very large toward the tail end of Cannon execution.

@clabby clabby force-pushed the cl/cannon-ser branch 2 times, most recently from 883c57e to 37ee20f Compare October 5, 2023 14:55
@clabby clabby marked this pull request as ready for review October 5, 2023 15:38
@clabby clabby requested a review from a team as a code owner October 5, 2023 15:38
@clabby clabby self-assigned this Oct 5, 2023
@clabby clabby added A-cannon Area: cannon A-op-challenger Area: op-challenger labels Oct 5, 2023
@Inphi
Contributor

Inphi commented Oct 5, 2023

Alternatively, if we don't care about compatibility with other VMs, it'll be much easier to use Go's built-in codec. gob alleviates the need to write and maintain the serde routines when new fields are added. It uses reflection to figure out the encoding.

@clabby
Member Author

clabby commented Oct 5, 2023

Alternatively, if we don't care about compatibility with other VMs, it'll be much easier to use Go's built-in codec. gob alleviates the need to write and maintain the serde routines when new fields are added. It uses reflection to figure out the encoding.

One of the primary reasons for this change is to define an easy-to-implement codec for the state snapshots so that other VM implementations, namely cannon-rs, can be a drop-in option for the op-challenger. Currently the op-challenger directly relies on code from Cannon for serde ops, so any Go-specific codec would be a pain to implement in other languages or require an added dependency (which I've been trying to minimize at all costs).

@Inphi
Contributor

Inphi commented Oct 5, 2023

Alternatively, if we don't care about compatibility with other VMs, it'll be much easier to use Go's built-in codec. gob alleviates the need to write and maintain the serde routines when new fields are added. It uses reflection to figure out the encoding.

One of the primary reasons for this change is to define an easy-to-implement codec for the state snapshots so that other VM implementations, namely cannon-rs, can be a drop-in option for the op-challenger. Currently the op-challenger directly relies on code from Cannon for serde ops, so any Go-specific codec would be a pain to implement in other languages or require an added dependency (which I've been trying to minimize at all costs).

VMs that represent program memory differently, including cannon-rs, will have to massage the data quite a bit to integrate the snapshot. It follows that the snapshots should be in a standard format so they're easy to parse. We sort of had this with the JSON serializer, but an unstructured binary representation will be cumbersome and error-prone to parse. I suggest using something like protobuf, flatbuffers, or even RLP (ok, maybe not), if the goal is to allow other VMs to use the same snapshot format.

@Inphi
Contributor

Inphi commented Oct 5, 2023

Also, I don't know if it's a realistic use case to support loading and dumping snapshots across different VM implementations. It's probably good to have something like this as a test vector, but not worth it as a user-facing feature imo.

@ajsutton
Contributor

ajsutton commented Oct 5, 2023

I'd potentially be open to changing op-challenger to not read cannon snapshots as well. We could run a cannon subcommand to convert a snapshot to a claim hash instead. We'd have to review the use cases to know what that would take but it would be good to keep them more decoupled if possible.

@@ -70,6 +72,214 @@ func (s *State) EncodeWitness() StateWitness {
return out
}

func (s *State) Serialize(out io.Writer) error {
Contributor

The state is so small (< 500 bytes) that it may just be worth serializing/deserializing the state witness, and writing/reading it as one blob into the writer, to avoid many different write/read calls.

@protolambda
Contributor

protolambda commented Oct 5, 2023

See above suggestion regarding state serialization. Another benefit would also be that if we include the memory-hash as part of the state witness in the snapshot, then you can determine the claim-hash by just parsing and hashing the witness part of the full snapshot, which can be very fast, as you don't have to load the full state.

@protolambda
Contributor

If we prefix the state snapshot with the claim hash of the snapshot, maybe we don't have to invoke any binary at all, and are still able to generalize it, since the challenger would just always read the first 32 bytes as claim hash, and ignore the rest of the snapshot?

@clabby
Member Author

clabby commented Oct 6, 2023

If we prefix the state snapshot with the claim hash of the snapshot, maybe we don't have to invoke any binary at all, and are still able to generalize it, since the challenger would just always read the first 32 bytes as claim hash, and ignore the rest of the snapshot?

This would definitely be nice if we stick with this format - good thinking. Stepping back a bit, this change can be slow-rolled, as there's no immediate need for it. What if we thought about making a common FFI schema for all of the VM implementations? The abstraction problem seems to be the binary plus the op-challenger using Go Cannon-specific types for serialization; if not for that, the VM impls wouldn't have to care about the serialization format at all.

With a nice FFI schema for the VMs to implement, we could interact with VM implementations as a library in the op-challenger and galadriel (in the future) + a binary that supports it, and the user of the interface could decide if they’d like to even make snapshots, etc., taking that responsibility off of a specific VM implementation's types. Each one could decide how it wants to do this, and the interface could just have a standardized binary format similar to this one, but with a bit more rigidity. This would make it easier to use several different VM implementations in a drop-in way for services that utilize the interfaces (which can be implemented in multiple languages) and then leave everything else to the consumer, which would be nice.

@ajsutton
Contributor

ajsutton commented Oct 6, 2023

I like the idea of prefixing the snapshot with the key data as a simple but very effective step forward. I'm unclear on how much effort an FFI type schema would take to setup. The challenger has a few requirements that may not be obvious here though:

  1. It needs to get data at a specific trace index OR the hash of the final trace index (if the requested one is from a later step). Currently the challenger is reading the output snapshot to get the last step's data and is jumping through some hoops to keep track of what the last step index actually is. Would be nice if the VM could manage that more itself
  2. The data it needs for that step is the claim hash, preimage key and value (if any), state witness (pre-image of the claim hash) and the proof data
  3. It needs to compute the claim hash from the pre-state for verification
  4. It needs to manage disk allocations - it provides a directory to write data to and is responsible for deleting it when the game is no longer relevant.

So we'd need to be able to easily get both the claim hash and the preimage (the state witness). We don't really want to just get the state witness and hash it ourselves as the hashing process requires setting the status byte which would be good to keep inside cannon if possible. Could be hash :: witness :: whatever other data though.

It would be fine for the VM to be just given a directory for its own use and it can manage snapshots etc itself, though the challenger assumes fetching a previously generated proof is fast currently. It is useful for the challenger to have a flag to control how often snapshots are stored to let the user trade off execution time vs disk space.

Beyond that I'm a fan of simplifying the CannonTraceProvider and having cannon manage more of the details itself. That would also give more flexibility to alternate VMs.

@github-actions
Contributor

This PR is stale because it has been open 14 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@codecov

codecov bot commented Oct 28, 2023

Codecov Report

Merging #7559 (b7ab721) into develop (62d4457) will decrease coverage by 1.08%.
Report is 672 commits behind head on develop.
The diff coverage is 27.71%.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #7559      +/-   ##
===========================================
- Coverage    53.48%   52.41%   -1.08%     
===========================================
  Files          162      163       +1     
  Lines         6048     6250     +202     
  Branches       970      970              
===========================================
+ Hits          3235     3276      +41     
- Misses        2691     2798     +107     
- Partials       122      176      +54     
Flag Coverage Δ
cannon-go-tests 58.13% <27.71%> (-5.36%) ⬇️
chain-mon-tests 26.95% <ø> (ø)
common-ts-tests 26.74% <ø> (ø)
contracts-bedrock-tests 61.26% <ø> (ø)
contracts-ts-tests 100.00% <ø> (ø)
core-utils-tests 44.03% <ø> (ø)
sdk-next-tests 42.18% <ø> (ø)
sdk-tests 42.18% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
cannon/mipsevm/page.go 88.88% <ø> (+22.67%) ⬆️
cannon/cmd/load_elf.go 0.00% <0.00%> (ø)
cannon/cmd/witness.go 0.00% <0.00%> (ø)
cannon/cmd/run.go 0.00% <0.00%> (ø)
cannon/cmd/binary.go 40.47% <40.47%> (ø)
cannon/mipsevm/memory.go 71.86% <34.54%> (-9.01%) ⬇️
cannon/mipsevm/state.go 37.79% <23.31%> (-51.34%) ⬇️

@@ -70,6 +72,214 @@ func (s *State) EncodeWitness() StateWitness {
return out
}

func (s *State) Serialize(out io.Writer) error {
Contributor

I'd like a quick spec on the format before we merge

@ajsutton
Contributor

ajsutton commented Nov 1, 2023

Consensus seems to be that this is a worthwhile change and should be fine to merge, but it's worth documenting the binary format first. The aim is just to make the format easier to understand, not to set it as an official standard; we may well make further changes to this file format and break compatibility before fault proofs ship to mainnet.

Contributor

This PR is stale because it has been open 14 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the Stale label Nov 16, 2023
@github-actions github-actions bot closed this Nov 21, 2023
@sebastianst sebastianst removed the Stale label Nov 21, 2023
@sebastianst sebastianst reopened this Nov 21, 2023
Comment on lines +82 to +88
serMemBuf := new(bytes.Buffer)
err := s.Memory.Serialize(serMemBuf)
if err != nil {
return err
}
serMemBytes := serMemBuf.Bytes()
serMemLen := uint32(len(serMemBytes))
Contributor

I'd missed this before, but this is really very unfortunate since we now need to hold a copy of the entire memory in RAM while serializing. That was causing OOM failures originally until we started gzipping the pages individually to reduce the size.

I think we need to find a way to avoid this and actually stream the memory content. The simplest thing would be for the memory serialisation to prefix its content with the number of pages to read and then read exactly those pages back in rather than reading until EOF. Net result is the prefix becomes the number of pages (ie list length) rather than number of bytes in the memory, but now you can fully stream the content when reading and writing.

As this currently stands I think we'll run into memory issues again.

@ajsutton ajsutton requested a review from a team as a code owner November 24, 2023 00:13
@ajsutton
Contributor

Pushed a couple of commits to update this with the latest changes from develop, fix the devnet failure and update how atomic file writes are done.

Contributor

github-actions bot commented Dec 8, 2023

This PR is stale because it has been open 14 days with no activity. Remove stale label or comment or this will be closed in 5 days.
