Fix the runtime-version choice. #11209

RGafiyatullin · 2022-04-12T20:58:34Z

When called in context of a block, the runtime-version, defined before that block's computation, is used; with a little exception for the cases when the context is the genesis-block. (example for such a call)

When running in a constrained pruning mode (i.e. when a limited quantity of state-db snapshots is available) such an invocation will be available for all the non-pruned blocks except for the last one.

…rate-9997

skunert · 2022-04-13T07:50:07Z

Basically this is Solution #3 from the issue. I am not sure I understand this comment:

When running in a constrained pruning mode (i.e. when a limited quantity of state-db snapshots is available) such an invocation will be available for all the non-pruned blocks except for the last one.

If I call the runtime on some non-pruned block, why are we guaranteed to have the runtime from the previous block available and not pruned?

wigy-opensource-developer · 2022-04-13T07:53:58Z

If I call the runtime on some non-pruned block, why are we guaranteed to have the runtime from the previous block available and not pruned?

There is no guarantee for that at all.

skunert · 2022-04-13T07:58:30Z

If I call the runtime on some non-pruned block, why are we guaranteed to have the runtime from the previous block available and not pruned?

There is no guarantee for that at all.

I think I misinterpreted the comment. I read "last one" as "best block", but it is the opposite, the first available non-pruned block. Got it 👍 . So we would accept that for these blocks you just can not call.

client/state-db/src/lib.rs

andresilva

I left some comments. In general there are two things that we need to think about the current approach:

We will need to make the distinction between block construction and general runtime calls, to make sure we take the :code from the correct block
We might need to make some changes to pruning, of the top of my head I was thinking that we might want to make sure we keep around the state of the parent of the last finalized block (currently we'll also prune on finality regardless of window iirc)

Solution 4 from the original will also require that we do the first point above (i.e. distinguish between block execution and other calls), since we need to pick the code from different storage entries, but it would not require any changes to pruning.

andresilva · 2022-04-13T12:10:07Z

client/service/src/client/call_executor.rs

+	fn state_with_runtime_code(
+		&self,
+		block_id: BlockId<Block>,
+	) -> sp_blockchain::Result<<B as backend::Backend<Block>>::State> {


As discussed earlier I think here we should fetch the header first (either by block number or block hash, depending on the block id that's passed), and then we fetch the parent hash from it. The assumption we make is that the caller of this method will make the right decision on whether to use a block number or a hash.

andresilva · 2022-04-13T13:11:36Z

client/service/src/client/call_executor.rs

+		let state_data = self.backend.state_at(*at)?;
+		let state_code = self.state_with_runtime_code(*at)?;
+
+		let state_runtime_code = sp_state_machine::backend::BackendRuntimeCode::new(&state_code);


This executor will be used in both the context of general runtime calls as well as block construction (correct me if I'm wrong here @bkchr). We will need to somehow make that distinction because in the case of block construction we don't want to use the logic of picking the code from the parent, otherwise we'd end up in a situation where: we want to build block N, therefore we use the state from N-1 and the code from N-2. In the case of block construction we basically want to keep the logic as is, i.e. to create block N we already use the code from block N-1 (since N doesn't even exist in the first place).

RGafiyatullin · 2022-04-13T13:23:11Z

Disclaimer

I might have slightly oversimplified the things, but nonetheless the following is how I understand it.

@skunert , @wigy-opensource-developer , please feel free to correct me if I am criminally wrong.

Preface

When a block is computed — it yields:

its hash;
the State (conceptually — the State-trie, technically — the State-trie's root node's hash).

Once a block is canonicalised, the StateDb's "Pruning RefWindow" is notified.
The "Pruning RefWindow" appends that Block to the queue (aptly named death_rows).
The overall length of death_rows is kept under the specified max_blocks.

That as I understand it gives us "sort of guarantee" (a bit oxymoronic, hence the quotes) that

the states produced by the computation of non-canonicalised blocks are not pruned;
the states produced by the computation of the N last canonicalised blocks are not pruned.

The corner case here is when the death_rows queue is in fact shorter than the max_blocks:

the whole history is shorter (the whole chain has just been started);
the node has just restarted with the value for max_blocks larger than it used to run before the restart.

Option # 3

There is an existing API sc-client-api::Backend::state_at(&self, block_id: BlockId), it returns the State that is available as the result of the computation of block block_id.

So in order to
a) compute that block again (CallExecutor::prove_execution(...));
b) query the data of that state (CallExecutor::call(...));
c) ... erm... (CallExecutor::contextual_call(...) — please help me out with this one, what is that?)

we would need to execute the version of the runtime available just before that block was computed, i.e. state_at(block_id - 1).get(WellKnownKeys::CODE).

What states can be queried?

In essence all states to which both of the following is true:
a) the state with the data is not pruned;
b) the state with the runtime is not pruned.

That is (for all max_blocks > 0):

for every non-canonicalised block;
for almost every non-pruned block (except for the oldest one).

bkchr · 2022-04-13T13:32:54Z

Disclaimer, I have written this before @RGafiyatullin and @andresilva commented, so this may contains duplicate information. Just posting it for posterity.

I'm personally not convinced that using the wasm blob from the parent block is the best way to move forward (yes I know that I proposed it initially). When we do it that way we will need to change the pruning to keep one more block in a hidden way inside the unpruned blocks to always have the parent wasm blob around.
We also will need to "tag" the block production and block import to use the wasm blob from block X and not X - 1. Otherwise we would use the wrong wasm blob. Imagine we applied a runtime upgrade at block X with a migration. This migration needs to run while building/importing block X + 1. Building/importing a block is always done on the state of the parent block from which we currently take the runtime blob as well. With your changes we would take the runtime blob from the parent block of the parent of the block we are trying to build/import. If we would always have done it this way, it would have been no problem. However, now we need to stay backwards compatible and this means when you applied a runtime upgrade at block X, it needs to be used in block X + 1. (I hope you get what I want to say, if not please ask)

If we go with proposal 4, we would also need to annotate block production/import to switch to the new wasm file.

wigy-opensource-developer · 2022-04-13T13:45:27Z

Correct me if I am mistaken, but we cannot stay completely backward compatible for each of the following cases:

block importing
RPC runtime calls
block authoring

I really like the simplicity of this solution opposed to the really overengineered idea I had that also would allow multi-block migrations. If the dead_row keeps the state of a few finalized blocks alive in storage, I really do not see the issue with this direction.

wigy-opensource-developer · 2022-04-13T13:48:34Z

substrate/client/service/src/client/call_executor.rs

Lines 130 to 133 in 8ce5095

    
           let block_num_code = if block_num.is_zero() { block_num } else { block_num - One::one() }; 
        
           let block_id_code = BlockId::<Block>::Number(block_num_code); 
        
           let state_code = self.backend.state_at(block_id_code)?;

Also, @RGafiyatullin, I think André tried to get you navigate here using the previous block hash instead of just decreasing the block number. That implementation would be fork-aware.

bkchr · 2022-04-13T14:00:10Z

Correct me if I am mistaken, but we cannot stay completely backward compatible for each of the following cases:
1. block importing

2. RPC runtime calls

3. block authoring

These are basically all kind of interactions with the runtime ;)

We will need to stay backwards compatible anyway, because otherwise we can not import all the old blocks.

RGafiyatullin · 2022-04-13T14:31:22Z

@bkchr

My changes are based upon the assumption that the following is true:

pub trait Backend<Block: BlockT>: AuxStore + Send + Sync {
	// >8 >8 >8 >8 >8 >8 >8 >8 >8 >8 >8 >8

         /// Returns state backend with post-state of given block.
	fn state_at(&self, block: BlockId<Block>) -> sp_blockchain::Result<Self::State>;

Which gave me the following:

block N is calculated based on the following inputs:
- code taken from state_at(N - 1).get(CODE) ;
- state yielded by the execution of block N - 1, i.e. state_at(N - 1);
the State resulting from the execution of Block N contains the data compatible with the code kept in state_at(N - 1).get(CODE).

With your changes we would take the runtime blob from the parent block of the parent of the block we are trying to build/import.

Sorry, I do not get how that is true.

RGafiyatullin · 2022-04-13T14:34:25Z

In that if block N - 1 invoked system::set_code (or has set the state-db value under the key ":code" directly), then this block's post-state (i.e. backend.state_at(N - 1)) contains the runtime to compute block N.

RGafiyatullin · 2022-04-13T14:42:59Z

@bkchr please disregard the above... I traced a little bit more, I misunderstood what the input argument at: BlockId for CallExecutor's method meant.

RGafiyatullin · 2022-04-13T14:53:27Z

I found a critical mistake in the very base of the reasoning that stands behind that PR-draft, therefore I will close it.

I will draw several doodles reflecting my up-to-date vision of the problem and post it in paritytech/polkadot-sdk#64 .

bkchr · 2022-04-13T15:00:53Z

Ty @RGafiyatullin

RGafiyatullin added 4 commits April 8, 2022 23:42

Define code-for-block(n) = state(n - 1).get(CODE)

97a55d2

staet-db: a note on how the pruning-configuration is implemented

de4e6e3

cargo fmt

6f0a404

Merge remote-tracking branch 'origin/master' into rgafiyatullin-subst…

4c4078d

…rate-9997

RGafiyatullin added A0-please_review Pull request needs code review. B3-apinoteworthy C1-low PR touches the given topic and has a low impact on builders. D3-trivial 🧸 PR contains trivial changes in a runtime directory that do not require an audit labels Apr 12, 2022

RGafiyatullin requested review from andresilva, koute, wigy-opensource-developer, skunert, bkchr and davxy April 12, 2022 20:58

cargo fmt

b0fc796

koute reviewed Apr 13, 2022

View reviewed changes

client/state-db/src/lib.rs Outdated Show resolved Hide resolved

Typo: s/constraitns/constraints/

8ce5095

andresilva reviewed Apr 13, 2022

View reviewed changes

RGafiyatullin closed this Apr 13, 2022

RGafiyatullin deleted the rgafiyatullin-substrate-9997 branch April 13, 2022 17:40

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix the runtime-version choice. #11209

Fix the runtime-version choice. #11209

RGafiyatullin commented Apr 12, 2022

skunert commented Apr 13, 2022

wigy-opensource-developer commented Apr 13, 2022

skunert commented Apr 13, 2022 •

edited

Loading

andresilva left a comment

andresilva Apr 13, 2022

andresilva Apr 13, 2022

RGafiyatullin commented Apr 13, 2022

bkchr commented Apr 13, 2022 •

edited

Loading

wigy-opensource-developer commented Apr 13, 2022

wigy-opensource-developer commented Apr 13, 2022 •

edited

Loading

bkchr commented Apr 13, 2022

RGafiyatullin commented Apr 13, 2022

RGafiyatullin commented Apr 13, 2022

RGafiyatullin commented Apr 13, 2022

RGafiyatullin commented Apr 13, 2022

bkchr commented Apr 13, 2022

Fix the runtime-version choice. #11209

Fix the runtime-version choice. #11209

Conversation

RGafiyatullin commented Apr 12, 2022

skunert commented Apr 13, 2022

wigy-opensource-developer commented Apr 13, 2022

skunert commented Apr 13, 2022 • edited Loading

andresilva left a comment

Choose a reason for hiding this comment

andresilva Apr 13, 2022

Choose a reason for hiding this comment

andresilva Apr 13, 2022

Choose a reason for hiding this comment

RGafiyatullin commented Apr 13, 2022

Disclaimer

Preface

Option # 3

What states can be queried?

bkchr commented Apr 13, 2022 • edited Loading

wigy-opensource-developer commented Apr 13, 2022

wigy-opensource-developer commented Apr 13, 2022 • edited Loading

bkchr commented Apr 13, 2022

RGafiyatullin commented Apr 13, 2022

RGafiyatullin commented Apr 13, 2022

RGafiyatullin commented Apr 13, 2022

RGafiyatullin commented Apr 13, 2022

bkchr commented Apr 13, 2022

skunert commented Apr 13, 2022 •

edited

Loading

bkchr commented Apr 13, 2022 •

edited

Loading

wigy-opensource-developer commented Apr 13, 2022 •

edited

Loading