Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

get code refactor for stateless #8298

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

matkt
Copy link
Contributor

@matkt matkt commented Feb 13, 2025

PR description

This PR is necessary for the future transition to stateless. The idea behind this PR is to wrap the code in a specific Bytecode class. Why is this necessary? Because today, when we access the code, we retrieve the entire bytecode, which won’t be possible in a stateless environment where the code is fragmented in different places in the tree and only partially available in the witnesses. To address this, we need to try to retrieve only the portions needed during the execution of the EVM.

So, there will be a generic Bytecode class, which will be subdivided into two classes: FullBytecode when we can have the entire bytecode, and ChunkedBytecode when we won’t necessarily have it all. ChunkedBytecode will have the particularity of loading only the necessary parts depending on the get calls we make. It will manage slices, etc., in a way that it won’t load anything unless there’s an explicit get on a specific bytecode. This retrieval won’t involve disk access because the code will come from memory (since the witness will be the data source) and there will be no need for disk access during EVM execution.

ChunkedBytecode is not present in this PR but will be part of the "verkle" branch. However, some things had to be changed in the code, such as calculateJumpDest, which will now focus only on the necessary part. The EVM code will also load only the necessary portions, along with other minor changes.

To make this work, the call to toArrayUnsafe() has been replaced by a call to getRawByteArray, which will be a class in two versions: chunked and full (this limits inheritance to avoid performance issues). RawByteArray will simply be a wrapper for the byte array, as it is now, and should not affect performance, while ChunkedBytecode will have its own logic.

So, we have two levels:

Bytecode as the replacement for Bytes
RawByteArray as the replacement for the byte array obtained with toArrayUnsafe()

This will allow for different logic for these two levels depending on the mode used.

Why is RawByteArray important? Because Bytecode is a class that inherits from Bytes, it is difficult to remove this, so I made the choice for Bytecode to extend Bytes. We observed performance impacts when calling methods of classes that extend Bytes due to the multitude of classes inheriting from it, which slows down the JVM when calling methods of Bytes classes.By using RawByteArray for EVM execution, this class does not inherit from Bytes, so it will not have any significant performance impact when called during the execution of the EVM.

Fixed Issue(s)

Thanks for sending a pull request! Have you done the following?

  • Checked out our contribution guidelines?
  • Considered documentation and added the doc-change-required label to this PR if updates are required.
  • Considered the changelog and included an update if required.
  • For database changes (e.g. KeyValueSegmentIdentifier) considered compatibility and performed forwards and backwards compatibility tests

Locally, you can run these tests to catch failures early:

  • spotless: ./gradlew spotlessApply
  • unit tests: ./gradlew build
  • acceptance tests: ./gradlew acceptanceTest
  • integration tests: ./gradlew integrationTest
  • reference tests: ./gradlew ethereum:referenceTests:referenceTests

Signed-off-by: Karim Taam <[email protected]>
Signed-off-by: Karim Taam <[email protected]>
@matkt matkt added the verkle label Feb 13, 2025
Signed-off-by: Karim Taam <[email protected]>
Signed-off-by: Karim Taam <[email protected]>
@matkt matkt force-pushed the bytecode-refactor branch from f29d313 to adf9301 Compare February 13, 2025 15:27
Signed-off-by: Karim Taam <[email protected]>
@matkt matkt force-pushed the bytecode-refactor branch from 979051c to c31a579 Compare February 13, 2025 16:32
@matkt matkt marked this pull request as ready for review February 21, 2025 08:19
@matkt
Copy link
Contributor Author

matkt commented Feb 21, 2025

We just tested the performance @shemnon :
4 nodes were launched on mainnet (2 based on the main branch and 2 on this branch). The result is that with this branch, no issues were detected with snapsync and no issues occurred while following the head for several days without any performance problems.

A benchmark was also performed on several opcodes, comparing main vs branch, and no performance issues were noticed. I'll let @ahamlat add more information if necessary.

Signed-off-by: Karim Taam <[email protected]>
@ahamlat
Copy link
Contributor

ahamlat commented Feb 21, 2025

I can confirm that benchmarking some of the opcodes, and doing Wall Clock profiling haven't shown any regressions related to this byte code refactoring.
We can see part of the results below, Besu 25.2-develop-5826567 represents this PR

image image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants