-
Notifications
You must be signed in to change notification settings - Fork 303
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove chopped notes from logs #1641
Comments
@dbanks12 @suyash67 @jeanmon Given how involved are the potential solutions and how much complexity they add to the code I think we didn't give enough thought to just leaving the code as is. I was just discussing this with @iAmMichaelConnor and leaving the code as is would have 2 downsides:
I think point 2 is not that big of a deal because we can easily filter the logs out in AztecRPC. Regarding point 1 this waste occurs whenever one function creates a commitment, and another function nullifies that commitment, all within the same tx. So it will happen if a recursive function is called, or if re-entrancy happens. In my discussion with Mike it's not clear how common this scenario would be. I think it would be a good idea to first try to log these transient note occurrences. What I propose is submitting a log with the note spending info over http from NoteProcessor whenever this happens. It should be straightforward to implement and if we get enough sandbox users we might get an idea regarding whether the issue is worth dealing with. |
also referenced in #4891 |
Closes #5017 This PR adds side effect counters to logs, tracks them throughout the kernels, sorts them (only in private kernel tail, public kernel tail is todo), and hashes them to one value in the tail. ## Private path Encrypted and unencrypted logs flow via a private call: - A private fn calling `emit_(un)encrypted_log` now actually pushes the hash of the log to the `PrivateContext` as a `SideEffect` - These logs hashes are maintained through `PrivateCircuitPublicInputs` - They are collected in `init` and `inner` kernels - Like note hashes, we now provide index hints in the `tail` and sort the logs hashes - In the final step of `tail`, we either run `to_public`, keeping individual logs hashes, or `finish_tail` (converting `.end` to `CombinedAccumulatedData`), where logs are hashed to a single value - As before, the single value is rolled up and recalculated in L1 ## Public path Unencrypted logs flow via a public call: - A private fn calling `emit_unencrypted_log` now actually pushes the hash of the log to the `PublicContext` as a `SideEffect` - These logs hashes are maintained through `PublicCircuitPublicInputs` - They are collected in `init` and `inner` kernels - Like note hashes, we currently don't sort in the public tail (I've added a ts hack like the note hash one) - In the final step of `tail`, we hash logs to a single value - As before, the single value is rolled up and recalculated in L1 ## TODOs - Since individual logs are not hashed by the circuits (this is part of #1165 which should be done once #1641 is finished), instead a hash is passed via the oracles, the context circuits can't calculate the length of each log. So the length is still passed in. - In the kernels, logs currently sit as `SideEffects` but should be some special counted logs struct. It makes sense to me to first separate logs which are linked to note preimages (no issue, but part of kernel epic), then assign each type of logs its own struct. - I added some hacks/qs (see comments below). --------- Co-authored-by: Leila Wang <[email protected]>
## A follow-up to #5718: - [x] - Hash logs inside the circuit (closing #1165) Complete, with some caveats. In most cases, whatever log is being broadcast can be converted to bytes with one of the traits (in `log_traits`) and sha hashed. Note we now need these traits because `sha256_slice` has (understandably) been removed, so we must define a fixed length input for generic types. The one exception to this is broadcasting a contract class which ends up emitting up to 518,400 bytes. This is still hashed in ts via a new oracle method `emit_contract_class_unencrypted_log`. I believe it's fine to hash this outside of the circuit, because correctness will be guaranteed by the bytecode commitment (as opposed to a generic log, where we need the hash to come from the circuit to trust it). - [x] - Accumulate logs length inside the circuit Complete, perhaps we should track lengths of each log to more easily split them into revertible/non-revertible later on? - [x] - Flat hash the logs in `tail` + update documentation Now that `sha256_slice` is removed, we would have to flat hash all the empty space up to max logs - unsure whether this would give us any benefit? EDIT: After some testing, it is more efficient for larger numbers of logs (~9+) in both the circuit and L1, so I have implemented flat hashing. - [x] - Add a `logsCache`, like `noteCache`, to track logs + ordering in ts in nested executions Note that the `logsCache` is only implemented (and required) in the private context for now. Public calls don't require squashing and removal of logs representing nullified notes, so an array (`allUnencryptedLogs`) will do. It currently just keeps track of ordering when we have nested calls, but will be useful for removing transient logs (#1641). - [x] - Investigate + solve issue with `tx.ts` error not throwing I'm not sure why this check exists - the comment: ``` // This check is present because each private function invocation creates encrypted FunctionL2Logs object and // both public and private function invocations create unencrypted FunctionL2Logs object. Hence "num unencrypted" // >= "num encrypted". ``` implies that functions must emit both types of logs? A tx with one private call, which emits one encrypted log, should fail here, and I don't see why. EDIT: Have removed the check as it seems redundant. --- Note that in nested calls that have more than one side effect, logs will have duplicate side effect counters and so cannot be sorted correctly (#6052). Currently the logs collected in `allUnencryptedLogs` are the only place that have correctly ordered logs in this case.
Closes #1641 ## This PR: ### Logs as `SideEffect`s -> `LogHash` or `NoteLogHash` Adds new structs to track logs in the kernel circuits, so we can track individual log lengths (and assign gas accordingly) and separate logs linked to notes from generic logs. The length changes contribute to #4712 but this PR does not close it - we need some more work in ts to ensure logs from the non-revertible side actually persist to the tx. ### Adds `note_encrypted_logs` To aid the kernels/pxe/note dbs in managing note encrypted logs, most of the changed files are adding a new array of note logs to anything already containing `.encrypted_logs_<hashes/hash>`. Since all our current contracts/tests only emit encrypted logs for notes, the values expected in `.encrypted_logs` have basically moved over to `.note_encrypted_logs`. A later PR will add code for generic encrypted logs (see below). Includes adding logs to the note cache in `client_execution_context` and removing `LogsCache`, as the note cache handles note specific logs. ### Squashes transient note logs Removes any logs linked to note hashes nullified in the same tx. This involves adding note logs to transient hints for the private kernel tail and chopping them inside the circuit. ### TODOs/Notes - This PR ended up pretty large, so while we can silo log hashes now, it would be cleaner to do in a follow up PR. - There are some small changes to public/AVM contexts which are mostly to replace `SideEffect` with `LogHash` and a couple of things addressed in comments below. - Currently, the only encrypted logs that are emitted are linked to notes. To allow for generic values emitted as encrypted logs, I'll need to add a new oracle call and methods, which makes sense to do once we encrypt inside the circuit (#1139). - Total logs length (e.g. `encrypted_log_preimages_length` is no longer required now we track indvidual lengths (it's also not trusted since it comes from contexts), so it can be removed from the kernels. - We originally tried to track log lengths the same way as `ts`, which adds 4 bytes to each nested 'group' of logs: `L2BlockL2Logs.txLogs = TxL2Logs -> .functionLogs = FunctionL2Logs -> .logs = <Un>EncryptedL2Log[]` but this never actually matched up because we accumulate and sort logs regardless of what call they came from. E.g. 3 groups of `FunctionL2Logs` would each have +4 length when added to a tx, but we take the contents, sort them, then add to a single `FunctionL2Logs.logs` array before assigning to a tx, so end up with a single +4 length. This length value is only used on L1 to destructure the blob of bytes representing logs. Any incorrect lengths would lead to a logs hash mismatch, so we don't need to match them in the circuit anyway. For this reason, I've removed a few unnecessary +4s from the circuits.
Problem statement: Commitment/notes which are created and nullified within the same transaction are chopped so that no entry in the private data tree is ever created. However, the current state of the code still emits a log for such commitments (at least for value note). When the note processor is processing such a log it will not find a corresponding entry in the tree and will log a warning. In addition, there might be other cases where this could cause confusion or issues, for instance, when a chopped note that could be spent by more than one user is received by such a user and is not aware that note is actually already nullified.
Remark: Once this done, it is important to revert/adapt work done as part of issue #1603
References
Original Design on Discourse: https://discourse.aztec.network/t/proposal-forcing-the-sequencer-to-actually-submit-data-to-l1/426
Solution Design
Key Properties
### HackMd Solution Design
The text was updated successfully, but these errors were encountered: