Migrate to element-addressable memory #1084

Fumuran · 2025-01-17T22:33:33Z

This PR changes the memory model from word-addressable to element-addressable to be compatible with a new version of the Miden VM.

PhilippGackstatter

Looks good to me! Thanks for also updating the docs in memory.rs 🙌.

Left some optional comments.

PhilippGackstatter · 2025-01-20T10:35:41Z

miden-lib/asm/kernels/transaction/lib/memory.masm

-const.INPUT_NOTE_SECTION_OFFSET=1048576
+const.INPUT_NOTE_SECTION_OFFSET=4194304

 # The memory address at which the input note data section begins.
-const.INPUT_NOTE_DATA_SECTION_OFFSET=1064960
+const.INPUT_NOTE_DATA_SECTION_OFFSET=4259840

 # The memory address at which the number of input notes is stored.
-const.NUM_INPUT_NOTES_PTR=1048576
+const.NUM_INPUT_NOTES_PTR=INPUT_NOTE_SECTION_OFFSET


Would it be perhaps better if we had:

# The memory address at which the number of input notes is stored. INPUT_NOTE_SECTION_NUM_PTR=4194304 # The memory address at which the first input note begins. INPUT_NOTE_SECTION_PTR=4194308

Then we don't have to add 4 to the note section offset wherever we use it, making those procedures a bit easier to read. This might apply to more of these constants here.

In general I wonder if having pointers defined in terms of offsets is helpful. I'm wondering if we should have just pointers defined explicitly and offsets stated in comments or something like that. Otherwise we will always have code that interprets an offset as a pointer, and then we might as well have defined it as a pointer in the first place.

Any thoughts on this, @Fumuran?

(None of this has to be addressed in this PR.)

I think that is what we should have now, I think "offset" here (in most paces) is actually a pointer: I think it should be INPUT_NOTE_SECTION_PTR and so on. The only place where we use (or at least should afaik) is offsets inside the structures which should have more than one instance, for example the data of the whole account (since we can load foreign accounts) or the values inside the node data, since we have a lot of nodes.

So, in other words, I think that it is a good idea to change the suffixes those constants from _OFFSET to _PTR, because they are essentially pointers.

To have a different pointer for the number of elements seems like a good idea, although I'm not sure whether it will be that practically helpful, let me look through its usage.

Maybe we can make an issue and investigate whether we use these section pointers in such a way that the first entry should be the length and the remaining entries the elements, or if we mostly use it independently.

miden-lib/asm/kernels/transaction/lib/prologue.masm

miden-lib/build.rs

PhilippGackstatter · 2025-01-21T11:47:07Z

miden-lib/src/transaction/memory.rs

+// | Section           | Start address | End address | Start address (words) |  End address (words) |
+// | ----------------- | :-----------: | :---------: | :--------------------:| :-------------------:|
+// | Bookkeeping       | 0             | 31          | 0                     | 7                    |


Suggested change

// | Section | Start address | End address | Start address (words) | End address (words) |

// | ----------------- | :-----------: | :---------: | :--------------------:| :-------------------:|

// | Bookkeeping | 0 | 31 | 0 | 7 |

// | Section | Element start address (word) | Element end address (word) |

// | ----------------- | :-----------: | :---------: |

// | Bookkeeping | 0 (0) | 31 (7) |

Could we include the word in parentheses instead of separate columns? Or alternatively explain outside the table headers that the elemnt comes first and the corresponding word in parentheses.

Yes, good idea. Honestly first I left word addresses just to make sure I didn't confuse anything, but probably it will be more convenient to check addresses in both element and word forms.

Although I'm not sure which format will be the best. In the technical literature which I read the unit of measurement in the tables is placed after the column name, separated with comma. Will it be clear for the user? Something like that:

// | Section | Start address, pointer (word pointer) | End address, pointer (word pointer) | // | ----------------- | :-----------------------------------: | :---------------------------------: | // | Bookkeeping | 0 (0) | 31 (7) |

Yeah I think what you suggested is fine. We don't export this documentation anyway, so it's only for the miden-base devs.

PhilippGackstatter · 2025-01-21T11:47:35Z

miden-lib/src/transaction/memory.rs

+// | Block header      | 800           | 835         | 200                   | 208                  |
+// | Chain MMR         | 1_200         | 1_331?      | 300                   | 332?                 |
+// | Kernel data       | 1_600         | 1_739       | 400                   | 434                  |
+// | Accounts data     | 8_192         | 532_479     | 2048                  | 133_119              | 64 foreign accounts max


Nit: Should we add a proper comment table column for "64 foreign accounts max"?

Great Idea! I think it will be generally helpful to have something like Max amount column for every section

Edit: taking into account your next comment, I think it will be better just to have a comment column.

PhilippGackstatter · 2025-01-21T11:56:17Z

miden-lib/src/transaction/memory.rs

+// | Global inputs     | 400           | 423         | 100                   | 105                  |
+// | Block header      | 800           | 835         | 200                   | 208                  |
+// | Chain MMR         | 1_200         | 1_331?      | 300                   | 332?                 |
+// | Kernel data       | 1_600         | 1_739       | 400                   | 434                  |


Suggested change

// | Kernel data | 1_600 | 1_739 | 400 | 434 |

// | Kernel data | 1_600 | 1_744 | 400 | 434 |

Nit: Is this not 1604+(4*35) = 1744?

I think it would be great to explain some of these calculations (like this one or the Chain MMR one) that do not have directly corresponding constants in memory.masm. Could just be in a separate table comment column here saying something like "35 kernel procedures hashes (4 elements each)".

End pointer here is the last memory pointer occupied by the corresponding data. So to get the last pointer we need to calculate 1600 + (34 * 4 + 3). It is easy to see this pattern on the simple example. Let's say we need to calculate the end pointer in case we have start and end word pointers as 100 and 101:

┌─┬─┬─┬─┬─┬─┬─┬─┐ |0|1|2|3|4|5|6|7| └─┴─┴─┴─┴─┴─┴─┴─┘ | 100 | 101 |

In that case it will be 1 full word (4 elements, which will get us to the first element of the last word, index 4) and also 3 elements in the last word

Thanks for the explanation. I think my main point was that the kernel procedures start at 1604 rather than 1600, since the word at 1600..=1603 contains the number of procedures, or not?

You're right though that when writing it as an inclusive range, the last address would indeed be 1 less than what I wrote. Then we'd have 1604+(34*4+3) = 1743.

By the Kernel data here we assume both number of kernel procedures and their hashes, so this brings us back to the question of whether it is worth creating a separate pointer to the length of the data length of the section ahead.

As far as I can see, in most places we already have these constants for lengths, they just aren't shown. For example, we have this constant for kernel procedures.
Although we don't have such constants for input notes, so I think it makes sense to add a new constant for it to make a nullifier obtaining more simple (well, we are back to your initial question about a new constant for input notes :) )

miden-lib/src/transaction/memory.rs

miden-tx/src/tests/kernel_tests/test_prologue.rs

bobbinth

Looks good! Thank you. Not a full review yet (I covered like 80% of the non-test code) - but I left some comments inline.

miden-lib/asm/kernels/transaction/lib/tx.masm

miden-lib/asm/note_scripts/P2ID.masm

miden-lib/asm/note_scripts/P2IDR.masm

miden-lib/asm/note_scripts/SWAP.masm

miden-tx/src/host/account_procs.rs

miden-tx/src/testing/tx_context/builder.rs

miden-lib/src/transaction/outputs.rs

bobbinth · 2025-01-21T19:17:34Z

miden-lib/asm/kernels/transaction/main.masm

@@ -140,11 +140,13 @@ proc.main.1
        # => []
    end

+    push.0 drop


Let's use random numbers for push operation - otherwise, there is a chance we'll run into collisions. Let's also make sure to do this for all trace instructions.

Lastly, let's add comments explaining which it is needed (a single-line explanation will do).

bobbinth

Looks good! Thank you! I left a couple of small comments inline.

The plan for merging this is roughly as follows:

I'll merge the element-addressable memory PR in miden-vm.
Then, I'll release the new version of miden-vm.
After this, we can update the Cargo.toml file in this PR to use crates.io versions of miden-vm and then we merge it.

bobbinth · 2025-01-22T07:05:13Z

miden-lib/asm/note_scripts/SWAP.masm

+    # store the note inputs to memory starting at address 12
+    push.12 exec.note::get_assets 


Could we add a comment with the state of the stack after this line?

bobbinth · 2025-01-22T07:13:09Z

miden-lib/src/transaction/outputs.rs

+    let nonce = elements[ACCT_ID_AND_NONCE_OFFSET as usize + ACCT_NONCE_IDX];
+    let vault_root = TryInto::<[Felt; 4]>::try_into(
+        &elements[ACCT_VAULT_ROOT_OFFSET as usize..ACCT_VAULT_ROOT_OFFSET as usize + WORD_SIZE],
+    )
+    .unwrap()
+    .into();
+    let storage_commitment = TryInto::<[Felt; 4]>::try_into(
+        &elements[ACCT_STORAGE_COMMITMENT_OFFSET as usize
+            ..ACCT_STORAGE_COMMITMENT_OFFSET as usize + WORD_SIZE],
+    )
+    .unwrap()
+    .into();
+    let code_commitment = TryInto::<[Felt; 4]>::try_into(
+        &elements[ACCT_CODE_COMMITMENT_OFFSET as usize
+            ..ACCT_CODE_COMMITMENT_OFFSET as usize + WORD_SIZE],
+    )
+    .unwrap()
+    .into();


This looks a bit more complicated than ideal. Maybe create a helper function - something like get_word() or parse_word(), which could look like so:

fn parse_word(data: &[Felt], offset: usize) -> Word

Ideally this would return an error or Option if the conversion fails, so that at call sites we can do parse_word(slice[0..4], 0).expect("we should have sliced off exactly 4 bytes"). This seems better than panicking unconditionally inside the function, because it makes the caller less aware of this issue.

bobbinth · 2025-01-22T08:25:06Z

miden-lib/src/transaction/memory.rs

@@ -275,41 +279,41 @@ pub const NOTE_MEM_SIZE: MemoryAddress = 512;
 // │ NOTE │ SERIAL │ SCRIPT │ INPUTS │ ASSETS │ META │ NOTE  │   NUM  │ ASSET │ ... │ ASSET │ PADDING │
 // │  ID  │  NUM   │  ROOT  │  HASH  │  HASH  │ DATA │ ARGS  │ ASSETS │   0   │     │   n   │         │
 // ├──────┼────────┼────────┼────────┼────────┼──────┼───────┼────────┼───────┼─────┼───────┼─────────┤
-//    0        1       2        3        4       5       6       7      8 + n
+//    0        4       8        12       16      20     24       28    32 + 4n


nit: previously, the addresses referred to words and so they were centered. Now, they refer to the first element of the word - so, I'd probably shift to be "left-aligned". For example:

// ┌──────┬────────┬────────┬────────┬────────┬──────┬───────┬────────┬───────┬─────┬───────┬─────────┬ // │ NOTE │ SERIAL │ SCRIPT │ INPUTS │ ASSETS │ META │ NOTE │ NUM │ ASSET │ ... │ ASSET │ PADDING │ // │ ID │ NUM │ ROOT │ HASH │ HASH │ DATA │ ARGS │ ASSETS │ 0 │ │ n │ │ // ├──────┼────────┼────────┼────────┼────────┼──────┼───────┼────────┼───────┼─────┼───────┼─────────┤ // 0 4 8 12 16 20 24 28 32 + 4n

PhilippGackstatter · 2025-01-22T09:39:35Z

miden-lib/asm/kernels/transaction/main.masm

+    # use `push.* drop` instructions before `trace` to make sure that MAST root will be unique
+    push.3456069754 drop
    trace.EPILOGUE_END


My understanding was that with 0xPolygonMiden/miden-vm#1122 fixed we no longer needed these. You added explanations that this is for MAST root uniqueness, but what is the larger problem here?

As far as I know, we fixed emit instruction and advice injectors, but for the trace instruction seems like everything remained the same. I'm not sure I fully understand the case, but for now I created a template-ish issue for the trace handling to help investigate further: #1094

bobbinth · 2025-01-22T11:21:02Z

I'll merge the element-addressable memory PR in miden-vm.

Then, I'll release the new version of miden-vm.

After this, we can update the Cargo.toml file in this PR to use crates.io versions of miden-vm and then we merge it.

The first two items from this list are now done. @Fumuran could you address remaining comments and let's merge.

@PhilippGackstatter - once this is merged, could you merge #1090 and then update #1091 so that we could merge it later today as well?

Fumuran and others added 10 commits January 16, 2025 13:19

refactor: update toml files, update memory.rs

54c76e6

refactor: update memory.masm, fix kernel data end address

491aacc

refactor: building state -- update miden-tx tests

fbe136e

recator: update unit test, updating integration tests

174207f

refactor: migrate to the VM next branch

c7844de

chore: Add shared modules before adding kernel lib

f52c818

refactor: update dependencies, fix formatting

6b40798

feat: add handling of the FalconSigToStack event to the Host

20c39ad

feat: Make EVENT_ID_PREFIX an associated constant

2b2c346

chore: Return error if event is an unexpected system event

c4dc749

bobbinth mentioned this pull request Jan 21, 2025

Migrate to the VM next branch (v0.12.0) #1086

Merged

Merge branch 'andrew-migrate-to-vm-next' into element-addressable-memory

d22d20a

Fumuran force-pushed the element-addressable-memory branch from 08ddb24 to d22d20a Compare January 21, 2025 10:14

Fumuran marked this pull request as ready for review January 21, 2025 10:14

chore: update changelog

621202a

PhilippGackstatter mentioned this pull request Jan 21, 2025

Turn miden-vm and miden-crypto errors into source errors #1090

Merged

Merge branch 'next' into element-addressable-memory

5e88c85

Fumuran changed the base branch from next to andrew-migrate-to-vm-next January 21, 2025 10:50

PhilippGackstatter approved these changes Jan 21, 2025

View reviewed changes

refactor: fix the last test

8488c1a

Base automatically changed from andrew-migrate-to-vm-next to next January 21, 2025 17:56

Merge branch 'next' into element-addressable-memory

c37498c

Fumuran linked an issue Jan 21, 2025 that may be closed by this pull request

Migrate to element-addressable memory #1059

Closed

Fumuran requested a review from bobbinth January 21, 2025 18:35

chore: update benchmark results

f13e7ae

bobbinth reviewed Jan 21, 2025

View reviewed changes

Fumuran added 3 commits January 21, 2025 23:55

chore: slightly improve the doc comments format

82163ce

refactor: update inline comments, small improvements

c407e03

refactor: update inline comments in SWAP script

54ce3e5

bobbinth approved these changes Jan 22, 2025

View reviewed changes

PhilippGackstatter reviewed Jan 22, 2025

View reviewed changes

Fumuran added 2 commits January 22, 2025 14:55

refactor: update docs, impl parse_word helper func

8bca52f

chore: update dependencies

2da141a

Fumuran merged commit 155e610 into next Jan 22, 2025
12 checks passed

Fumuran deleted the element-addressable-memory branch January 22, 2025 12:25

Fumuran mentioned this pull request Jan 22, 2025

Migrate to element-addressable memory #1059

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Migrate to element-addressable memory #1084

Migrate to element-addressable memory #1084

Fumuran commented Jan 17, 2025 •

edited

Loading

PhilippGackstatter left a comment

PhilippGackstatter Jan 20, 2025

Fumuran Jan 21, 2025

PhilippGackstatter Jan 22, 2025

PhilippGackstatter Jan 21, 2025

Fumuran Jan 21, 2025

Fumuran Jan 21, 2025

PhilippGackstatter Jan 22, 2025

PhilippGackstatter Jan 21, 2025

Fumuran Jan 21, 2025 •

edited

Loading

PhilippGackstatter Jan 21, 2025

Fumuran Jan 21, 2025 •

edited

Loading

PhilippGackstatter Jan 22, 2025

Fumuran Jan 22, 2025

bobbinth left a comment

bobbinth Jan 21, 2025

bobbinth left a comment

bobbinth Jan 22, 2025

bobbinth Jan 22, 2025

PhilippGackstatter Jan 22, 2025

bobbinth Jan 22, 2025

PhilippGackstatter Jan 22, 2025

Fumuran Jan 22, 2025

bobbinth commented Jan 22, 2025

	// \| Kernel data \| 1_600 \| 1_739 \| 400 \| 434 \|
	// \| Kernel data \| 1_600 \| 1_744 \| 400 \| 434 \|

		# store the note inputs to memory starting at address 12
		push.12 exec.note::get_assets

Migrate to element-addressable memory #1084

Migrate to element-addressable memory #1084

Conversation

Fumuran commented Jan 17, 2025 • edited Loading

PhilippGackstatter left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Fumuran Jan 21, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Fumuran Jan 21, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

bobbinth left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

bobbinth left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

bobbinth commented Jan 22, 2025

Fumuran commented Jan 17, 2025 •

edited

Loading

Fumuran Jan 21, 2025 •

edited

Loading

Fumuran Jan 21, 2025 •

edited

Loading