Add API for generic data directory #335

winston-h-zhang · 2024-02-20T07:00:12Z

This PR implements a minimal generic data directory for arecibo to read/write from. It is similar the the data directory used in Lurk. Besides the immediate need to have some way of managing witness/cross-term vectors for our MSM data analysis, there has been a need for this type of infrastructure for a while.

See, for example, microsoft/Nova#270, which can build upon this PR. There is also argumentcomputer/lurk-beta#1086, which may need to move some of Lurk's caching logic into arecibo.

This PR doesn't intend to address all the issues of these PRs, but it would be nice to have the first increment of work streamlined into the codebase, so that we can slowly build up to a robust solution.

huitseeker

This PR seems to solve an issue that does not exist.

[LUR-33] Refactor common behavior between Nova/Supernova in public_parameters caching lurk-beta#1086 is about not duplicating some of the Public Parameter logic between Supernova and Nova. The present approach seems unrelated, and in fact you have taken steps towards solving that issue in the unfinished Organize public param creation with assorted structs #288 that seem to go in a very different direction,
Load KZG setup parameters from file #270 is certainly a high-priority issue, but it is not addressed here.

So this is solving an issue that isn't specified. Could we write it down and open that issue, or more clearly explain what problem this tries to solve in the PR description?

Moreover, on the shape of the addition: this is writing the witness and cross-term by inserting itself in the RecursiveSNARK. This is obviously invasive not only from a protocol standpoint (which might be justified by unspecified goals), but also from a code organization standpoint: we would now need to maintain writing configuration logic in the SuperNova RecursiveSNARK, at least, and possibly in the CompressedSNARKs (x 4).

huitseeker · 2024-02-26T13:59:39Z

src/lib.rs

@@ -489,6 +502,13 @@ where
      T: r1cs::default_T::<Dual<E1>>(r1cs_secondary.num_cons),
    };

+    let config = RecursiveSNARKConfig {
+      #[cfg(any(target_arch = "x86_64", target_arch = "aarch64"))]


Why the configuration override? does my Raspberry pi (arm7) not have a hard drive?

wwared

Not commenting on whether this PR solves the issues around loading/writing data in the way we want to or not, but wanted to point out a few things regarding this code regardless of that:

Since we call set or get_or_init from within the prove_step function (by calling write_data()), isn't it possible that ARECIBO_CONFIG could be initialized from different threads? (even possibly from different threads proving different, unrelated things) It might be better to use OnceLock instead if that's the case (OnceCell is not thread-safe) (it should not be a big deal to change because it may only block during initialization)
The cfg overrides feel weird to me, shouldn't it be cfg(target_arch = "wasm32") instead of cfg(not(any(target_arch = "x86_64", target_arch = "aarch64")))?
I feel like it'd be simpler if the data mod was present on every arch, but for the arches we disable it (wasm) just replace with dummy functions instead. This prevents the #[cfg] directives from polluting the callers of the data module. One way this could be implemented is moving the real implementation to a sub-module, and re-exporting inside data with #[cfg] checks
Usability-wise, it would be good to support overriding the ~/.arecibo_data path with an env var like ARECIBO_DATA_DIR or similar
I know it's meant to be extended in the future, but it's weird how write_data is duplicated on the global, per-process DataConfig stored in ARECIBO_CONFIG as well as in the RecursiveSNARKConfig present in every RecursiveSNARK. I think for now it would be better to not have the RecursiveSNARKConfig at all and keep configuration (enabling/disabling the write data) on the data module, possibly controlled by other env vars

winston-h-zhang · 2024-03-18T21:22:11Z

Thank you for all the comments! I've written a starting issue here: #371. It's more of a set of questions that I have for how to proceed, rather than what I think needs to be done, so I would be happy to discuss and create a plan for what's next.

winston-h-zhang requested a review from huitseeker February 20, 2024 07:00

winston-h-zhang mentioned this pull request Feb 20, 2024

A cache for saving public params and commitment keys to local disk #79

Closed

winston-h-zhang force-pushed the cache-witness branch 3 times, most recently from a7901e3 to 6d44f7d Compare February 22, 2024 18:07

huitseeker suggested changes Feb 26, 2024

View reviewed changes

wwared suggested changes Mar 15, 2024

View reviewed changes

winston-h-zhang mentioned this pull request Mar 18, 2024

Generic data directory for arecibo #371

Open

winston-h-zhang force-pushed the cache-witness branch from 9c76e57 to 37615c7 Compare March 19, 2024 16:41

Hanting Zhang added 5 commits March 28, 2024 07:45

add API for data directory

6206b8c

use home, non-wasm builds

e5bc63c

write r1cs_primary

cbd15e3

cleanup

ef869a7

separate different sparse matrices

7110611

winston-h-zhang force-pushed the cache-witness branch from 11bc72c to 7110611 Compare March 28, 2024 07:46

winston-h-zhang force-pushed the cache-witness branch from 7e645a4 to 233cf2f Compare April 18, 2024 08:38

write results too

e129704

winston-h-zhang force-pushed the cache-witness branch from 233cf2f to e129704 Compare April 18, 2024 08:50

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add API for generic data directory #335

Add API for generic data directory #335

winston-h-zhang commented Feb 20, 2024 •

edited

Loading

huitseeker left a comment

huitseeker Feb 26, 2024

wwared left a comment •

edited

Loading

winston-h-zhang commented Mar 18, 2024

Add API for generic data directory #335

Are you sure you want to change the base?

Add API for generic data directory #335

Conversation

winston-h-zhang commented Feb 20, 2024 • edited Loading

huitseeker left a comment

Choose a reason for hiding this comment

huitseeker Feb 26, 2024

Choose a reason for hiding this comment

wwared left a comment • edited Loading

Choose a reason for hiding this comment

winston-h-zhang commented Mar 18, 2024

winston-h-zhang commented Feb 20, 2024 •

edited

Loading

wwared left a comment •

edited

Loading