Proposal: Cross-Implementation Benchmarking Dataset for Plutus Performance #6626
Comments
I suppose this will give you the most popular scripts rather than the most diverse ones. But I think that's fine. So what do you want from us? A thumbs up? That all sounds great. I think whatever code we might have provided you with would just be skewed for our implementation, and who cares about our implementation when people pay for the scripts on the mainnet. So just take it from the mainnet as per your plan; I think it's representative enough. I'm not sure how to triage this issue, so I'll triage it as "Low priority".
@effectfully, this task has two stages with the following needs:
Regarding the representation of scripts: yes, with random selection, more popular scripts have a stronger influence on the results than less popular ones. However, that's exactly how they influence the Node's run time on the mainnet. Regarding approved scripts for each implementation, one of the benchmark's goals is to measure the parallel efficiency of each implementation (does the performance scale linearly with the number of workers?). As I understand it, that may require some fine-tuning of the Haskell runtime parameters for optimal performance, and I don't want to risk misconfiguring it. Another reason is that you could provide several scripts representing experimental optimizations. For example, the Rust and C++ implementations seem to benefit drastically from optimizations related to memory-allocation patterns. A strong impact from certain optimizations may help advocate for their earlier inclusion in the mainnet release of Cardano Node.
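To make the "parallel efficiency" goal concrete, here is a minimal sketch of the metric being described: speedup relative to a single worker, divided by the worker count, where 1.0 means perfectly linear scaling. The timings below are made-up illustrative numbers, not measurements from any implementation.

```python
# Sketch: computing parallel efficiency from benchmark wall-clock timings.
# Efficiency = speedup / workers; 1.0 means perfectly linear scaling.

def speedup(t_serial: float, t_parallel: float) -> float:
    """Ratio of single-worker to N-worker wall-clock time."""
    return t_serial / t_parallel

def parallel_efficiency(t_serial: float, t_parallel: float, workers: int) -> float:
    """Fraction of ideal linear scaling achieved with `workers` workers."""
    return speedup(t_serial, t_parallel) / workers

# Hypothetical timings (seconds) for evaluating the same script batch:
timings = {1: 120.0, 2: 63.0, 4: 33.0, 8: 19.0}
for n, t in timings.items():
    print(f"{n} workers: speedup {speedup(timings[1], t):.2f}, "
          f"efficiency {parallel_efficiency(timings[1], t, n):.2f}")
```

A benchmark that reports this number per worker count makes it easy to see where an implementation stops scaling, which is exactly the kind of comparison misconfigured runtime parameters would distort.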
@sierkov everything you said makes perfect sense to me. @rvcas @MicroProofs are you folks interested in any of what's been discussed here?
@effectfully, @rvcas, here are the links:
The README includes detailed information, such as the latest performance results of the C++ Plutus implementation and step-by-step instructions for reproducing the transaction sampling and dataset creation. The performance of the C++ implementation already meets our internal target: it validates all transaction witnesses in under an hour on a high-end laptop. However, we believe there is room for further optimization and are eager to collect feedback and exchange ideas with other implementations. Feedback on the dataset, benchmarking script, and performance results is welcome. Let me know if you have questions or need support in preparing implementation-specific scripts.

@Unisay, thank you for sharing the statistics. To generate the benchmarking dataset, mainnet data up to epoch 521 was analyzed. At the time of generation, the number of unique observed Plutus scripts was 95,459. However, the number of observed Plutus redeemers (and therefore Plutus script evaluations) was only 40,525,056. This figure is significantly lower than the number you reported. Could you kindly double-check your results? If your figures are correct, I'd greatly appreciate it if you could share your methodology so we can better understand the discrepancy. In our case, the number of redeemers was calculated as the number of (non-unique) sub-entries of transaction witnesses of type 5 across all blockchain blocks. This analysis was performed by directly examining the raw blockchain data, allowing us to trace each number back to a specific block and transaction.
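The counting rule described above can be sketched as follows. This assumes each transaction witness set has been decoded from CBOR into a dict keyed by its integer map keys, with key 5 holding the list of redeemers; the toy data is invented for illustration, not real chain data.

```python
# Sketch of the redeemer-counting rule: the redeemer count is the number
# of (non-unique) sub-entries under witness-set map key 5, summed across
# all transactions. Witness sets are assumed pre-decoded into dicts.

def count_redeemers(witness_sets):
    """Total (non-unique) redeemer entries across all witness sets."""
    return sum(len(ws.get(5, [])) for ws in witness_sets)

# Toy stand-ins for decoded transaction witness sets:
witness_sets = [
    {5: [("spend", 0), ("spend", 1)]},  # two redeemers
    {3: [b"script-bytes"]},             # scripts but no redeemers
    {5: [("mint", 0)]},                 # one redeemer
]
print(count_redeemers(witness_sets))  # 3
```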
@sierkov Interesting discrepancy, indeed. The code I've used to extract Plutus script evaluations from Mainnet is currently in a private repo. I am working on making it public; once it's ready, I'll share a link with you. In the meantime, I can describe what is done there:
@Unisay, thank you for the prompt follow-up. To better understand the issue, we analyzed a small random sample of transactions with Plutus witnesses by manually comparing redeemer counts against Cexplorer.io. All analyzed transactions matched precisely. Examples:
To help find the root cause of the discrepancy, I’m attaching two files with the following statistics:
Would it be possible for you to prepare tables with the same contents for your dataset? That would allow us to trace the causes of the discrepancies down to individual epochs and scripts. Then we can confirm each case by manually analyzing the respective raw data. P.S. The epochs table reports both unique and non-unique redeemers. That's because a small fraction of transactions contain multiple redeemers with the same id (purpose tag + reference index), while Cardano Node evaluates only the final entry. I'm reporting this for completeness.
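The deduplication rule mentioned in the P.S. can be sketched like this: when several redeemers in one transaction share the same id (purpose tag + reference index), only the final entry survives. The redeemer representation and values below are simplified placeholders, not the actual CBOR structure.

```python
# Sketch of the stated dedup rule: keep only the last redeemer per
# (purpose_tag, ref_index) id, as Cardano Node evaluates only the final
# entry when a transaction repeats a redeemer id.

def dedupe_redeemers(redeemers):
    """Keep the last redeemer for each (purpose_tag, ref_index) id."""
    last = {}
    for tag, idx, data in redeemers:
        last[(tag, idx)] = (tag, idx, data)  # later entries overwrite earlier
    return list(last.values())

tx_redeemers = [
    (0, 0, "datum-a"),  # spend purpose, input 0
    (0, 1, "datum-b"),
    (0, 0, "datum-c"),  # duplicate id; this final entry wins
]
print(dedupe_redeemers(tx_redeemers))
```

This is why the unique count can be slightly smaller than the raw sub-entry count reported per epoch.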
@Unisay, your last message mentions '40929053' records (~40 million), whereas your first message mentions '403_665_609' (~400 million, 10x more). The numbers we observe are ~40 million. Could you confirm which is correct according to your dataset? |
I apologize for the confusion: I gave you the wrong number the first time. It's 40 million, not 400.
@Unisay, thank you for sharing the code. Two quick questions:
We don't test computed stats currently 🤷🏼‍♂️
The majority of the time is spent indexing from Genesis, and we did that quite some time ago; IIRC it took very roughly 1 to 1.5 days to reach the "immutable tip". Since then we've been running a daily cron job to catch up.
Describe the feature you'd like
I'm working on a C++ implementation of Plutus aimed at optimizing batch synchronization. We'd like to benchmark our implementation against existing open-source Plutus implementations to foster cross-learning and understand their relative performance. This issue is a request for feedback on the proposed benchmark dataset, as well as for approved code samples representing your implementation to include in our benchmarks. Detailed information is provided below.
The proposed benchmark dataset is driven by the following considerations:
The procedure for creating the proposed benchmark dataset is as follows:
`<mainnet-epoch>/<transaction-id>-<script-hash>-<redeemer-idx>.flat`.
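The naming scheme above can be sketched as a small path builder. The function name and the example values are hypothetical, chosen only to show how one evaluation maps to one flat-encoded file under its epoch directory.

```python
from pathlib import Path

# Sketch of the dataset layout: one flat-encoded script evaluation per
# file, following the pattern
#   <mainnet-epoch>/<transaction-id>-<script-hash>-<redeemer-idx>.flat
# The identifiers below are made-up placeholders, not real mainnet data.

def dataset_path(root, epoch, tx_id, script_hash, redeemer_idx):
    """Build the per-evaluation file path under the dataset root."""
    name = f"{tx_id}-{script_hash}-{redeemer_idx}.flat"
    return Path(root) / str(epoch) / name

p = dataset_path("dataset", 521, "ab12cd", "ef34ab", 0)
print(p.as_posix())  # dataset/521/ab12cd-ef34ab-0.flat
```

Keying files by transaction id, script hash, and redeemer index makes every benchmark input traceable back to the exact on-chain evaluation it came from.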
To gather performance data across open-source Plutus implementations, I am reaching out to the projects listed below. If there are any other implementations not listed here, please let me know, as I’d be happy to include them in the benchmark analysis. The known Plutus implementations:
I look forward to your feedback on the proposed benchmark dataset and to your support in providing code that can represent your project in this benchmark.
Describe alternatives you've considered
No response