Reference scripts for PlutusV1 #3965

Closed
lehins opened this issue Jan 3, 2024 · 19 comments · Fixed by #4059
Labels
conway plutus Anything related to Plutus Scripts

Comments

@lehins
Collaborator

lehins commented Jan 3, 2024

It has been brought to our attention that PlutusV1 scripts occupy around 40% of the whole blockchain since Alonzo. A natural solution to this problem is to enable reference scripts for PlutusV1 in Conway.

Here is the relevant CIP: cardano-foundation/CIPs#679

The proposed solution in the above CIP is too complicated for no good reason. We can just enable the reference scripts. However, this will require #3952, since we need to mitigate the worries that @WhatisRT mentioned, where one could use the fee field inside a PlutusV1 script to infer whether the script was provided in a witness set or in a reference input.

In order to confirm the report and the proposed solution of using reference scripts, we did some investigation. We wrote a program that replays the chain and counts up the sizes of all of the Plutus scripts in all of the blocks per epoch, starting from the beginning of Alonzo until Christmas 2023. Here is the plot for the calculated data:

image

  • Blue is the total size of the blockchain per Epoch
  • Red is the total size of PlutusV1 scripts per epoch that were used as witnesses in transactions
  • Yellow (barely visible) is the total size of PlutusV2 scripts per epoch that were used as witnesses in transactions together with PlutusV2 scripts that were placed into TxOuts for future use as Reference Scripts. In other words total contribution of PlutusV2 scripts to on-chain data.
  • Green is the total size of PlutusV2 scripts that were actually used as reference scripts. In other words, this would have been the overhead added to the chain if we didn't have the reference script capability for PlutusV2 scripts.

It is clear that reference scripts are a great solution to the problem.
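
The per-epoch tally described above could be sketched roughly as follows. This is not the program that produced the plots. It is a minimal illustration in Python that assumes blocks have already been decoded into the hypothetical `Block` and `Tx` records below, with script sizes given in bytes.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Tx:
    # Hypothetical, simplified transaction: (language, size in bytes) for
    # every Plutus script carried in the witness set.
    witness_scripts: list[tuple[str, int]] = field(default_factory=list)


@dataclass
class Block:
    epoch: int
    size: int  # total serialized block size in bytes
    txs: list[Tx] = field(default_factory=list)


def tally_per_epoch(blocks):
    """Sum block bytes and witness-script bytes per epoch and language."""
    totals = defaultdict(lambda: defaultdict(int))
    for block in blocks:
        totals[block.epoch]["chain"] += block.size
        for tx in block.txs:
            for language, size in tx.witness_scripts:
                totals[block.epoch][language] += size
    return {epoch: dict(counts) for epoch, counts in totals.items()}


def share(totals, epoch, language):
    """Fraction of an epoch's block bytes taken up by one script language."""
    counts = totals[epoch]
    return counts.get(language, 0) / counts["chain"]


# Made-up sample data, only to show the shape of the computation.
sample_blocks = [
    Block(epoch=300, size=1000, txs=[Tx([("PlutusV1", 400)])]),
    Block(epoch=300, size=1000, txs=[Tx([("PlutusV2", 50)]), Tx([("PlutusV1", 100)])]),
]
sample_totals = tally_per_epoch(sample_blocks)
print(share(sample_totals, 300, "PlutusV1"))
```

Plotting `share` per epoch for each language would give curves of the same kind as the percentage plots in this issue.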

This plot is just the zoomed version of only the PlutusV2 sizes:

image

The pie chart below depicts the percentage of the on-chain data occupied by PlutusV1 (red), PlutusV2 (yellow) and the rest of the data (blue):

image

It was noted on GitHub that a 60 GB overhead from scripts is not that big when compared to other blockchains, which is true, but 41% of the total blockchain data is too much.

@lehins lehins added the conway label Jan 3, 2024
@lehins
Collaborator Author

lehins commented Jan 3, 2024

A couple more useful plots depicting the continuing popularity of PlutusV1 scripts:

The total size of PlutusV1 scripts with respect to the total size of all blocks over time as a percentage. If, on this plot, we saw a trend of PlutusV1 scripts being used less and less over time, then we could have argued that with time this problem could have resolved itself. However, it does not seem to be the case.

image

PlutusV1 is still a very popular language version, and from the plot below it is clear that it is still used even more often than PlutusV2.

image

@Quantumplation

@lehins Excellent analysis, thank you!

I'm a bit disappointed that we'll have to artificially inflate prices (by around 0.3 ADA per transaction, by my calculations) just to avoid the (IMO very) remote chance that someone somewhere deployed a plutus v1 script that depends on the fee being large (which is already a dangerous script, since the protocol parameters could be updated regardless)... but if that's what is needed to get this merged, I'm more than happy to deal with it 😛

@conraddit

Does this mean that we'll see the blockchain database shrink by 60GB or is this a solution for future transactions?

@lehins
Collaborator Author

lehins commented Jan 3, 2024 via email

We can't rewrite history 😁 Future transactions only. And only when users choose to switch to using reference scripts.

@conraddit

Thank you! But Cardano Node could, somehow, find a way of compressing this data, right? I.e., all redundant data (the repeated smart contracts) could be compressed to save space, no?

@AndrewWestberg
Contributor

@conraddit Probably not worth compressing it although it's possible. Other solutions like pruning and mithril can help us with the on-disk requirements as the chain continues to grow.

@Mercurial

I would say the best thing to focus on now is to prevent further growing the chain unnecessarily and improving performance by allowing Plutus v1 reference scripts

@Quantumplation

@conraddit Also, see this comment regarding the on-disk space: cardano-foundation/CIPs#679 (comment)

TL;DR: Deduplicating things on disk may be an additional local optimization the node can apply (though any time you "break up" the data in a block and require it to be re-assembled, you introduce a lot of complexity and risk) irrespective of the on-chain semantics described by the linked CIP. The far more urgent issue is the limitations for data on the wire.

@GeorgeFlerovsky

Great analysis!

@lehins
Collaborator Author

lehins commented Jan 24, 2024

We did come to a conclusion that it is feasible and desirable to implement this feature for Conway

@colll78

colll78 commented Jan 24, 2024

since we need to mitigate the worries that @WhatisRT mentioned, where one could use the fee field inside a PlutusV1 script to infer whether the script was provided in a witness set or in a reference input.

I hope this change has more to do with ensuring that SPOs get paid for the work the node has to do to look up the reference input and deserialize the script (as the linked CIP suggests), and has nothing to do with using the fee field to infer inside a PlutusV1 script whether the script is in the witness set or provided via reference inputs. If it is the second case, as suggested here, then this change does not prevent Plutus V1 scripts from being capable of inferring whether they were provided in a witness set or in a reference input. You can still write a Plutus V1 script that achieves this by constructing the expected txbody from the script context with the missing components passed in via redeemers / token names, then hashing the result and validating that it matches the txId in the script context. You can use the same approach to validate against tx metadata or any other part of the transaction body that is inaccessible directly from the script context.

@lehins
Collaborator Author

lehins commented Jan 24, 2024

I hope this change has more to do with ensuring that SPOs get paid for the work the node has to do to look up the reference input and deserialize the script (as the linked CIP suggests), and has nothing to do with using the fee field to infer inside a PlutusV1 script whether the script is in the witness set or provided via reference inputs

Correct.

You can still write a Plutus V1 script that achieves this by constructing the expected txbody from the script context with the missing components passed in via redeemers / token names, then hashing the result and validating that it matches the txId in the script context.

That is not possible in general, since CBOR is non-canonical, so even if you have all of the information from the transaction, the hash of the body will depend on how that information was encoded. An example would be whether variable-length or exact-length encoding was used when encoding inputs in the transaction body. There is just no way to know this without looking at the actual binary version of the transaction, which Plutus does not have access to.
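
To make the encoding point concrete, here is a small sketch in Python. It hand-encodes the same array in two valid CBOR forms and hashes both with blake2b-256, the hash used for transaction ids. The array `[1, 2, 3]` and the helper names are illustrative stand-ins, not a real transaction body.

```python
import hashlib


def blake2b_256(data: bytes) -> bytes:
    return hashlib.blake2b(data, digest_size=32).digest()


def decode_small_array(data: bytes) -> list:
    """Decode a CBOR array of small unsigned ints (0..23) in either form."""
    head, rest = data[0], data[1:]
    if head == 0x9F:
        # Indefinite-length array: items follow until the 0xFF "break" byte.
        assert rest[-1] == 0xFF
        return list(rest[:-1])
    # Definite-length array with 0..23 items: the length lives in the header.
    assert 0x80 <= head <= 0x97
    count = head - 0x80
    return list(rest[:count])


# The array [1, 2, 3], encoded by hand in two valid CBOR forms.
definite = bytes([0x83, 0x01, 0x02, 0x03])
indefinite = bytes([0x9F, 0x01, 0x02, 0x03, 0xFF])

same_content = decode_small_array(definite) == decode_small_array(indefinite)
same_hash = blake2b_256(definite) == blake2b_256(indefinite)
print(same_content, same_hash)
```

Both byte strings decode to the same list, yet their hashes differ, which is why knowing the decoded content alone is not enough to recompute the hash.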

There is a chance that you could reconstruct the original body of the transaction from the Plutus context and verify it against the hash of the body available to Plutus, but if you can't, it doesn't mean you are missing some information. You might just be encoding the transaction in a different way. You could try all possibilities, but that would be too expensive.

In any case, I don't think this is that big of a deal, because a Plutus script would really have to go out of its way to try to infer where the script is coming from (witness set vs. reference input), which is not the case we really care about. Our concern is primarily ensuring that we do not accidentally change the PlutusContext in a way that would affect execution of normal PlutusV1 scripts, and that does not seem to be a problem for enabling reference scripts for PlutusV1.

@Quantumplation

That is not possible in general, since CBOR is non-canonical, so even if you have all of the information from the transaction, the hash of the body will depend on how that information was encoded. An example would be whether variable-length or exact-length encoding was used when encoding inputs in the transaction body. There is just no way to know this without looking at the actual binary version of the transaction, which Plutus does not have access to.

You could pass in those encoding decisions in the redeemer.

That being said, I think @colll78's point was more to highlight that it's a bit impractical to use "can we come up with any script that could conceivably be written to differentiate between the before and after" as a standard, and hopefully the standard we're using is at least a little closer to "is it very very unlikely that this will break any existing scripts?". It's encouraging to get confirmation that it is indeed closer to the latter.

@lehins
Collaborator Author

lehins commented Jan 25, 2024

You could pass in those encoding decisions in the redeemer.

You can pass the whole transaction body in serialized form in the redeemer 😄

I was merely trying to explain that there is no guarantee that a script can reliably reconstruct the transaction body just from the PlutusContext.

hopefully the standard we're using is at least a little closer to "is it very very unlikely that this will break any existing scripts?". It's encouraging to get confirmation that it is indeed closer to the latter.

Absolutely.

@colll78

colll78 commented Jan 25, 2024

You can pass the whole transaction body in serialized form in the redeemer 😄

You cannot, there is a cyclic relationship between the redeemer and the script_data_hash. If you pass in the serialized tx body in the redeemer, it would change script_data_hash and it would no longer be the same tx body. I agree that you cannot reliably reconstruct the transaction body from just the Plutus Context, thus "with the missing components passed in via redeemers / token names".
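
The cycle can be illustrated with a toy model in Python. This is a sketch under simplifying assumptions: the "body" here is just a payload followed by a hash of the redeemer, standing in for script_data_hash (which in reality also covers datums and cost models), and all names are hypothetical.

```python
import hashlib


def blake2b_256(data: bytes) -> bytes:
    return hashlib.blake2b(data, digest_size=32).digest()


def make_body(payload: bytes, redeemer: bytes) -> bytes:
    # Toy transaction body: the payload plus a hash that commits to the
    # redeemer (a stand-in for script_data_hash).
    return payload + blake2b_256(redeemer)


payload = b"inputs|outputs|fee"
body = make_body(payload, b"some redeemer")

# Try to place the serialized body inside the redeemer. Each attempt changes
# the redeemer hash, which changes the body, so the copy in the redeemer is
# always one step behind.
bodies_seen = [body]
for _ in range(5):
    redeemer = bodies_seen[-1]          # redeemer := current body
    bodies_seen.append(make_body(payload, redeemer))

reached_fixed_point = any(
    earlier == later for earlier, later in zip(bodies_seen, bodies_seen[1:])
)
print(reached_fixed_point)
```

A fixed point would require a redeemer whose hash is embedded in itself, which is why the missing components have to reach the script through some channel that the hash does not cover.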

Without this method, the original method of a script checking the txFee field to determine whether it is present in the reference inputs or the witness set does not work, because the on-chain transaction size affects the fee, and the tx metadata affects the size, so there isn't any way to distinguish between whether the fee change difference is a result of tx metadata being present or if it is the result of scripts being included in the witness set.

@lehins
Collaborator Author

lehins commented Jan 25, 2024

there isn't any way to distinguish between whether the fee change difference is a result of tx metadata being present or if it is the result of scripts being included in the witness set.

Moreover, there is also the minFeeA protocol parameter, which by definition affects the fee. So, I 100% agree with the argument that it makes no sense for us to worry about providing guarantees of the fee not affecting Plutus script execution.
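
A rough sketch of why the fee carries no usable signal, in Python. It assumes the linear minimum fee rule `minFeeA * size + minFeeB`. The default values below are illustrative assumptions and can change through protocol parameter updates, and `tx_size` is a hypothetical helper that simply adds byte counts.

```python
# Illustrative values only (assumed for this example); the real parameters
# are set by the protocol and can be updated.
MIN_FEE_A = 44        # lovelace per byte of transaction size
MIN_FEE_B = 155_381   # constant term in lovelace


def tx_size(base_bytes, witness_script_bytes=0, metadata_bytes=0):
    # Hypothetical helper: every byte counts the same, wherever it sits.
    return base_bytes + witness_script_bytes + metadata_bytes


def min_fee(size_bytes, a=MIN_FEE_A, b=MIN_FEE_B):
    return a * size_bytes + b


# 500 extra bytes of witness script vs. 500 extra bytes of metadata.
fee_with_script = min_fee(tx_size(300, witness_script_bytes=500))
fee_with_metadata = min_fee(tx_size(300, metadata_bytes=500))

# The same small transaction before and after a change to minFeeA.
fee_before = min_fee(tx_size(300))
fee_after_param_change = min_fee(tx_size(300), a=MIN_FEE_A + 1)

print(fee_with_script == fee_with_metadata, fee_before != fee_after_param_change)
```

The first comparison shows that script bytes and metadata bytes are indistinguishable through the fee, and the second shows that the fee can move without the transaction changing at all.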

You cannot, there is a cyclic relationship between the redeemer and the script_data_hash. If you pass in the serialized tx body in the redeemer, it would change script_data_hash and it would no longer be the same tx body.

Oh, yeah, I keep forgetting about scriptIntegrityHash 😄
So, theoretically it would be possible for PlutusV1 to reconstruct the transaction body hash, on two conditions:

  • It would have to be either the only script included in the transaction, or it would also have to accept all of the other transaction redeemers in its own redeemer, since PlutusV1 does not have access to the redeemers in the context.
  • The whole of the CostModel would have to be supplied as an argument as well.

All of this would be necessary because the script would have to first reconstruct the scriptIntegrityHash, before it could attempt to reconstruct the body, even if the whole transaction body was supplied in two chunks with the scriptIntegrityHash being the only thing missing.

That would be one complex script for no good reason. 🙂
That was a fun mental experiment, thank you for that 😉

@Quantumplation

As a side note, if you really had to, you could use this trick to validate the transaction metadata inside of a script heh

@colll78

colll78 commented Jan 25, 2024

Also the whole of CostModel would have to be supplied in the argument as well.
All of this would be necessary because the script would have to first reconstruct the scriptIntegrityHash, before it could attempt to reconstruct the body, even if the whole transaction body was supplied in two chunks with the scriptIntegrityHash being the only thing missing.

You can actually just pass script_data_hash in as the name of a minted token, since you can serialize the mint field on-chain (using the serialization config passed in via the redeemer). This works because mint is totally independent of script_data_hash.
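
The token name trick can be sketched with the same kind of toy model, again in Python and again under simplifying assumptions. The "body" is a plain concatenation rather than real CBOR, `script_data_hash` is a stand-in that hashes only the redeemers and a cost model blob, and every function name is hypothetical. The only point is that the mint field does not feed into the hash, so the hash can be copied into a token name without creating a cycle.

```python
import hashlib


def blake2b_256(data: bytes) -> bytes:
    return hashlib.blake2b(data, digest_size=32).digest()


def script_data_hash(redeemers: bytes, cost_model: bytes) -> bytes:
    # Toy stand-in: depends on redeemers and cost model, NOT on the mint field.
    return blake2b_256(redeemers + cost_model)


def make_body(other_fields: bytes, mint_token_name: bytes, sdh: bytes) -> bytes:
    # Toy body: plain concatenation instead of a real CBOR map.
    return other_fields + mint_token_name + sdh


# Off-chain: build the transaction.
redeemers = b"encoding hints and missing fields"
sdh = script_data_hash(redeemers, b"cost model")
token_name = sdh                      # a 32-byte digest fits in a token name
body = make_body(b"inputs|outputs|fee", token_name, sdh)
tx_id = blake2b_256(body)


def script_check(other_fields: bytes, token_name_from_mint: bytes, tx_id_from_context: bytes) -> bool:
    # "On-chain": the script reads the token name from the mint field, uses
    # it as script_data_hash, rebuilds the body and compares hashes.
    rebuilt = make_body(other_fields, token_name_from_mint, token_name_from_mint)
    return blake2b_256(rebuilt) == tx_id_from_context


print(script_check(b"inputs|outputs|fee", token_name, tx_id))
```

In this model the script no longer needs the other redeemers or the cost model, because it never recomputes the hash itself. It only reads the value back from the mint field.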

As a side note, if you really had to, you could use this trick to validate the transaction metadata inside of a script heh

That was the exact intended purpose for which this solution came to exist haha, shout out to @MicroProofs for his help iterating on this (script_data_hash via token name in mint field was his idea).

@SebastienGllmt
Contributor

FYI the discussion in this GitHub issue has been implemented:

  • Twitter thread here
  • Blog post (more detailed) here

Projects
Status: Done

9 participants