Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add marlin repo sha to snark key identity #6208

Merged
merged 10 commits into from
Oct 3, 2020
Merged

Conversation

nholland94
Copy link
Member

We hit a bug in a testnet recently where the snark keys change from underneath the network in s3, causing nodes which were restarted by kubernetes to download different snark keys and end up in an incompatible state where they cannot verify proofs from the network. As a fix to this problem, I'm not adding the marlin repo sha to the snark key's s3 storage key so that any updates to the marlin repo (which would make incompatible keys) will create a different keypair in s3, avoiding the problem in the future.

2 notes about these changes:

  1. I'm not totally sure I didn't miss somewhere in the CI process where we calculate the expected s3 keypair storage key from outside of the ocaml process. If you have details about how the s3 keypair upload from CI works, please double check I didn't miss anything.
  2. It's possible we may be able to get rid of the hard coded _v2 in the keypair name now that the marlin repo sha is in there. I didn't do that for now, but if that's the case, I can make that change.

@nholland94 nholland94 added ci-build-me Add this label to trigger a circle+buildkite build for this branch ready-to-merge Adding this label will trigger mergify and trigger CI to run and merge the PR labels Sep 30, 2020
@@ -14,4 +14,5 @@
snarky.backendless
sponge
snarkette
coda_version
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds a transitive dependency to everything above this library on universe. I can't think of any easy fix for it, but we should try to come up with one ASAP, or push forward with moving these things out of the build process altogether.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not ideal, but if we change the zexe interface so that we thread the marlin repo sha down to it instead of baking it in at that layer, we could hack around this. It's not a great solution imo, but it would fix rebuilding zexe_backend every single build, which seems worth it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be able to do a hack where I depend on the source tree of the marlin repo in external and generate a file internally to zexe_backend that contains the sha for it. If I do it right, it will only trigger a rebuild when the marlin repo changes, but we'll see if I can get dune to play nice like that. I'll test it out some before I merge this.

@lk86
Copy link
Contributor

lk86 commented Sep 30, 2020

I've been working on caching these keys in the docker image before even building the main daemon (and ran into issues because the v2 keys aren't static), a tool in the marlin repo for consistently doing that would be nice and is probably the better place to standardize. Does this also help avoid the universe dependency?

@lk86
Copy link
Contributor

lk86 commented Sep 30, 2020

To add some clarity on what we do in CI:

The build artifact job runs make build_or_download_pv_keys followed by ./scripts/publish-pvkeys.sh to push them to s3, neither of these jobs explicitly references the file name. Anything you store in /tmp/coda_cache_dir during build_or_download_pv_keys ends up in s3 under snark-keys.o1test.net

@mrmr1993
Copy link
Member

mrmr1993 commented Oct 1, 2020

I've been working on caching these keys in the docker image before even building the main daemon (and ran into issues because the v2 keys aren't static), a tool in the marlin repo for consistently doing that would be nice and is probably the better place to standardize.

@lk86 any changes to the snark code will require new keys, so sticking them in the docker image only works until the constraint system next changes. The real fix for this would be to move key generation out of the coda.exe build entirely and manage them separately.

@netlify
Copy link

netlify bot commented Oct 1, 2020

Deploy preview for mina-staging ready!

Built with commit a1b60c3

https://deploy-preview-6208--mina-staging.netlify.app

@nholland94
Copy link
Member Author

The real fix for this would be to move key generation out of the coda.exe build entirely and manage them separately.

I think we should start moving towards this soon. This seems like the best solution. The docker solution is not great since the keypairs can change too frequently (and they are pretty big), and having the daemon generate it is messy.

@nholland94 nholland94 removed the ready-to-merge Adding this label will trigger mergify and trigger CI to run and merge the PR label Oct 1, 2020
@deepthiskumar deepthiskumar added the ready-to-merge Adding this label will trigger mergify and trigger CI to run and merge the PR label Oct 2, 2020
@nholland94
Copy link
Member Author

@mrmr1993 I managed to remove the new zexe dependency on (universe). Would you mind doing a quick re-review of the new implementation?

@mergify mergify bot merged commit 1a40fc4 into develop Oct 3, 2020
@mergify mergify bot deleted the fix/snark-key-s3-storage branch October 3, 2020 00:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci-build-me Add this label to trigger a circle+buildkite build for this branch ready-to-merge Adding this label will trigger mergify and trigger CI to run and merge the PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants