Beacon Fuzz - Progress Update #4
Structural Fuzzing & Architecture Redesign
Sigma Prime is leading the development and maintenance of beacon-fuzz, a differential fuzzing solution for Eth2 clients. This post is part of our series of status updates where we discuss current progress, interesting challenges encountered, and direction for future work. See #00 and the repository's README for more context.
As client teams are ramping up their efforts to match the latest version of the Eth2 specification (v0.11.2), the Beacon Fuzz crew has been pushing hard to uncover new bugs in these implementations. Key achievements and points of interest include:
- Structural Fuzzing using
- First bugs in Teku
- New bugs in Nimbus
- Update on Golang integration
- Challenges in replaying interesting samples across implementations
- Beacon Fuzz Redesign Proposal
Structural fuzzing using
When using a naive fuzzing strategy, raw bytes are passed to the target functions, with the expectation that coverage-guided fuzzing engines will instrument the relevant code and leverage their mutation algorithms to produce samples that will reach a high number of code blocks.
Some of the types used in Eth2 can be quite complex. For example, let's take a look at the BeaconBlockBody container:
class BeaconBlockBody(Container):
    randao_reveal: BLSSignature
    eth1_data: Eth1Data  # Eth1 data vote
    graffiti: Bytes32  # Arbitrary data
    # Operations
    proposer_slashings: List[ProposerSlashing, MAX_PROPOSER_SLASHINGS]
    attester_slashings: List[AttesterSlashing, MAX_ATTESTER_SLASHINGS]
    attestations: List[Attestation, MAX_ATTESTATIONS]
    deposits: List[Deposit, MAX_DEPOSITS]
    voluntary_exits: List[SignedVoluntaryExit, MAX_VOLUNTARY_EXITS]
Each of the complex types forming the BeaconBlockBody SSZ container is also defined in the specification. For example, Eth1Data is represented as follows:
class Eth1Data(Container):
    deposit_root: Root
    deposit_count: uint64
    block_hash: Bytes32
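To make the layout concrete, here is a minimal Python sketch of how an Eth1Data value serializes, assuming the usual SSZ rules for fixed-size containers (fields concatenated in declaration order, uint64 encoded as 8 little-endian bytes). The helper names are illustrative only and do not come from any client or from the specification's code.

```python
from dataclasses import dataclass

# Root (32 bytes) + uint64 (8 bytes) + Bytes32 (32 bytes)
ETH1_DATA_SIZE = 32 + 8 + 32


@dataclass
class Eth1Data:
    deposit_root: bytes  # Root, 32 bytes
    deposit_count: int   # uint64
    block_hash: bytes    # Bytes32


def encode_eth1_data(value: Eth1Data) -> bytes:
    """Concatenate the three fixed-size fields in declaration order."""
    if len(value.deposit_root) != 32 or len(value.block_hash) != 32:
        raise ValueError("deposit_root and block_hash must be 32 bytes each")
    count = value.deposit_count.to_bytes(8, "little")
    return value.deposit_root + count + value.block_hash


def decode_eth1_data(data: bytes) -> Eth1Data:
    """Reject anything that is not exactly ETH1_DATA_SIZE bytes long."""
    if len(data) != ETH1_DATA_SIZE:
        raise ValueError(f"expected {ETH1_DATA_SIZE} bytes, got {len(data)}")
    return Eth1Data(
        deposit_root=data[:32],
        deposit_count=int.from_bytes(data[32:40], "little"),
        block_hash=data[40:72],
    )
```

Because every field here is fixed-size, any 72-byte buffer decodes successfully. Containers that hold lists, such as BeaconBlockBody, also carry length-dependent offsets, which is one reason raw byte mutation rarely produces a well-formed sample.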
This means that if we want to efficiently fuzz the state transition functions that take a BeaconBlock as input, we need to provide samples that ideally follow the structure described above. This is where structural fuzzing (a.k.a. structure-aware or grammar-based fuzzing) comes in handy.
Previously, we were only making sure that the inputs passed to our state transition functions were valid SSZ containers. This, however, does not ensure that the SSZ containers are relevant to the state transition in the context of the Eth2 specification.
By leveraging the latest update to the cargo_fuzz crates, we're now able to write fuzz targets that take well-formed instances of custom types by deriving and implementing the Arbitrary trait, which allows us to create structured inputs from raw byte buffers.
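The idea of building a typed value out of a raw fuzzer buffer can be sketched in Python as follows. This is only an analogue of the approach, not the Rust Arbitrary API: ByteSource, the arbitrary class methods, MiniBody and MAX_ITEMS are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List

MAX_ITEMS = 16  # hypothetical bound, standing in for limits such as MAX_DEPOSITS


class ByteSource:
    """Hands out slices of a fuzzer-provided buffer, zero-padding once it runs dry."""

    def __init__(self, data: bytes) -> None:
        self._data = data
        self._pos = 0

    def take(self, n: int) -> bytes:
        chunk = self._data[self._pos:self._pos + n]
        self._pos += n
        return chunk.ljust(n, b"\x00")

    def uint64(self) -> int:
        return int.from_bytes(self.take(8), "little")

    def int_in_range(self, low: int, high: int) -> int:
        """Map the next 8 bytes onto the inclusive range [low, high]."""
        return low + self.uint64() % (high - low + 1)


@dataclass
class Eth1Data:
    deposit_root: bytes
    deposit_count: int
    block_hash: bytes

    @classmethod
    def arbitrary(cls, src: ByteSource) -> "Eth1Data":
        return cls(src.take(32), src.uint64(), src.take(32))


@dataclass
class MiniBody:
    """Simplified stand-in for BeaconBlockBody (not the real container)."""
    graffiti: bytes
    eth1_data: Eth1Data
    exit_epochs: List[int]

    @classmethod
    def arbitrary(cls, src: ByteSource) -> "MiniBody":
        graffiti = src.take(32)
        eth1_data = Eth1Data.arbitrary(src)
        # The list length is itself drawn from the buffer, but kept within bounds.
        count = src.int_in_range(0, MAX_ITEMS)
        return cls(graffiti, eth1_data, [src.uint64() for _ in range(count)])
```

Whatever bytes the engine supplies, `MiniBody.arbitrary(ByteSource(data))` yields a structurally valid value, so mutations explore field contents rather than being rejected at the decoding stage.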
These fuzzing targets live in the arbitrary-fuzzing-fuzzer branch of the Lighthouse repository.
These fuzzers have been running for a few days (at the time of writing) and have not detected any panics on Lighthouse. We'll be using the generated samples as new inputs to our differential processor (see section Beacon Fuzz Redesign Proposal).
First Bugs in Teku
As mentioned in our previous update, we've been working on integrating Teku, the Java Eth2 implementation, into our fuzzing processes.
In doing so, the team has identified the following issues/hardening opportunities that were quickly addressed by the PegaSys Engineering team:
- Infinite loop when decoding SSZ BitList without "end-of-list" marker bit (see Issue #1674)
- IndexOutOfBoundsException when SSZ decoding 0-byte Bitlist (see Issue #1678)
These last two issues are not exploitable in the normal operation of the Teku client, as the related exceptions are caught by the client in full processing and handled gracefully. We'd like to thank Adrian Sutton and Ben Edgington for their help in triaging these bugs!
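Both Teku findings involve the same pair of edge cases in Bitlist decoding. The sketch below shows one way a decoder can reject them explicitly, assuming the SSZ convention that bits are packed least-significant-bit first and that a single marker bit set to 1 follows the last data bit. It is an illustration of the rule, not Teku's code, and the function names are ours.

```python
from typing import List


def encode_bitlist(bits: List[bool]) -> bytes:
    """Pack bits LSB-first and append the end-of-list marker bit."""
    out = bytearray(len(bits) // 8 + 1)
    for i, bit in enumerate(bits):
        if bit:
            out[i // 8] |= 1 << (i % 8)
    out[len(bits) // 8] |= 1 << (len(bits) % 8)  # marker bit
    return bytes(out)


def decode_bitlist(data: bytes, limit: int) -> List[bool]:
    """Decode a Bitlist, rejecting empty input and input without a marker bit."""
    if len(data) == 0:
        # Even an empty list serializes to one byte (0x01), so 0 bytes is invalid.
        raise ValueError("empty input: a Bitlist needs at least the marker bit")
    last = data[-1]
    if last == 0:
        # The marker must live in the final byte; never scan past the buffer for it.
        raise ValueError("missing end-of-list marker bit")
    length = 8 * (len(data) - 1) + last.bit_length() - 1
    if length > limit:
        raise ValueError(f"bitlist length {length} exceeds limit {limit}")
    return [bool((data[i // 8] >> (i % 8)) & 1) for i in range(length)]
```

Checking the final byte once keeps the decoder free of any loop that searches for the marker, which avoids both the non-terminating scan and the out-of-bounds read.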
New Bugs in Nimbus
By replaying some of the samples generated from fuzzing Lighthouse, a few new bugs affecting Nimbus have been uncovered:
- Segmentation fault due to a stack allocation/overflow bug in process_final_updates (see Issue #921)
- AssertionError in state transition (see Issue #922)
- IndexError triggered when parsing an empty Attestation SSZ container (see Issue #931)
- IndexErrors triggered when decoding invalid BeaconStates (empty container and variable list reporting 0 length) (see Issue #896 and Issue #920)
Similarly, the issues affecting the parsing of a malformed BeaconState are not exploitable in the normal operation of the Nimbus client. Kudos to Dustin Brody from the Nimbus team for fixing these bugs so quickly.
Update on Golang Integration
While updating beacon-fuzz to newer spec versions, we encountered new issues with the existing Golang build process.
Our process failed for ZRNT which, as of v0.10.1, had started relying on Herumi's cgo-based BLS implementation.
While go-fuzz doesn't support cgo (see go-fuzz#101), it can normally build successfully without attempting cgo instrumentation.
Although a PR could provide a workaround, there would still remain the outstanding complications dealing with multiple Golang clients (discussed in detail in previous posts).
Prysm has started performing standalone fuzzing using the Go Compiler's built-in coverage instrumentation (experimentally released in v1.14).
Initial experiments have shown this is a promising way to remove our reliance on a modified go-fuzz, and resolve many complications.
As before, the "out of the box" go114-fuzz-build tool is insufficient for our needs, but implementing our own build tool is much simpler with the built-in instrumentation.
Our go-bfuzz-build tool implements an FFI interface that returns the bytes needed for differential comparison, instruments cgo code, and can be easily extended to export interfaces for multiple clients and harnesses within a single, static
This is effectively an implementation of option "D) Building without go-fuzz", as described in our Beacon Fuzz #01 post.
With this, we avoid the need for hacky solutions that combine multiple c-shared libraries (each containing their own Go runtime) into a single executable.
There are still some outstanding complications with this approach, but it is much more promising with regard to ongoing maintainability and functionality. Some issues still in development include:
- Integrating Prysm's libraries built by Bazel:
A good solution could have been to build the Prysm harnesses with Bazel as a binary-only package, then combine it with the rest to build a single c-archive, but support for binary-only packages has been dropped as of go1.13. Other possibilities include building the Prysm harness with a shared build mode (different from c-shared) and linking it.
- Programmatically accessing cgo link-time settings (e.g. herumi/bls.go#L5-6).
Because the Go build tool does not perform linking when producing a c-archive, we need to extract the link-time settings for use with our external linker, so it knows paths to relevant static libraries etc.
Challenges in Replaying Interesting Samples
We've worked on another tool, eth2diff, that allows us to replay interesting state transitions (i.e. inputting a BeaconState along with a BeaconBlock) across different implementations, by leveraging the following utilities provided by client teams:
This has allowed us to identify a large portion of the bugs listed above. However, these utilities do not include some of the checks and verification steps implemented by Eth2 nodes. Specifically, most of these utilities assume that the blocks have passed the checks described in the Eth2 P2P networking specification (so are associated with the "current" slot), and that the provided states are valid, i.e. internally maintained, trusted objects.
This has led to some confusion and some interesting conversations, as captured in this issue.
Beacon Fuzz redesign proposal
Sigma Prime has been building Beacon Fuzz upon Guido Vranken's great work since late 2019. As the project has evolved, its current architecture faces the following challenges:
- Difficult to evaluate fuzzing coverage;
- The project is developed in C++, which we don't have extensive experience in;
- The project is designed to support libFuzzer exclusively as a fuzzing engine.
Additionally, we currently preprocess all corpora to combine the SSZ container input with a referenced BeaconState, which are passed as beacon-fuzz-testcases to each client. The conversion from corpus to test case can be represented as follows:
+-------------------+-------------------------+     +-------------------------+-------------------------+
| state_id (uint16) | Attestation (container) | --> | BeaconState (container) | Attestation (container) |
+-------------------+-------------------------+     +-------------------------+-------------------------+
state_id represents a BeaconState integer filename from our corpora.
This additional serialization step consumes a large amount of time during fuzzing execution, significantly slowing down the overall process.
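The conversion step can be sketched in Python as below. Two details are assumptions rather than facts from the project: that state_id is little-endian (as SSZ integers are), and that the resulting test case is laid out as an SSZ container with two variable-size fields, i.e. two 4-byte offsets followed by the payloads. The function names and the load_state callback are hypothetical.

```python
import struct
from typing import Callable, Tuple


def split_corpus_entry(entry: bytes) -> Tuple[int, bytes]:
    """Split a corpus entry into (state_id, serialized container)."""
    if len(entry) < 2:
        raise ValueError("corpus entry is too short to hold a uint16 state_id")
    (state_id,) = struct.unpack_from("<H", entry)  # assumed little-endian
    return state_id, entry[2:]


def build_testcase(entry: bytes, load_state: Callable[[int], bytes]) -> bytes:
    """Swap the state_id prefix for the full serialized BeaconState."""
    state_id, container = split_corpus_entry(entry)
    state = load_state(state_id)  # e.g. reads the corpus file named after state_id
    # Assumed layout: two offsets (8 bytes in total), then both payloads.
    fixed_part = 8
    offsets = struct.pack("<II", fixed_part, fixed_part + len(state))
    return offsets + state + container
```

For example, `build_testcase(b"\x01\x00" + attestation_bytes, states.__getitem__)` looks up state 1 in an in-memory dictionary. Since a serialized BeaconState is far larger than the two-byte identifier it replaces, repeating this lookup and copy on every execution adds up quickly.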
We propose the following architecture for a new, modular version of
Tool #1 - eth2fuzz - Coverage Guided Fuzzer To Generate Samples
To generate interesting samples, we'll use a dedicated tool leveraging explicit code coverage, allowing us to flag SSZ containers that are of interest, i.e. those that trigger new code paths. This tool can use multiple different fuzzing engines (AFL++, HonggFuzz, libFuzzer, etc.). In fact, we've already built this tool, which lives here. The next step is to integrate the work done on the structural fuzzing into
Tool #2 - eth2diff - Replaying Samples Across All Implementations
As mentioned above, we have built a tool that leverages the various state transition execution utilities (lci, etc.) to replay all samples generated from eth2fuzz. We've created dedicated Docker containers for each implementation, and one central Docker container to orchestrate the execution of eth2diff. The goal of this tool is to detect crashes and differences across all supported implementations, for any given set of inputs (
This tool can be found here.
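The comparison logic at the heart of such a replay tool can be sketched in a few lines of Python. In this sketch each implementation is a plain Python callable standing in for a containerized client utility, and every name (run_all, find_disagreements, the "ok"/"rejected" labels) is hypothetical rather than taken from eth2diff.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Transition = Callable[[bytes, bytes], bytes]  # (pre_state, block) -> post_state
Outcome = Tuple[str, bytes]


def run_all(impls: Dict[str, Transition], pre_state: bytes, block: bytes) -> Dict[str, Outcome]:
    """Run the same input through every implementation and record the outcome."""
    results: Dict[str, Outcome] = {}
    for name, transition in impls.items():
        try:
            results[name] = ("ok", transition(pre_state, block))
        except Exception:
            # Error messages differ between clients, so only the rejection is recorded.
            results[name] = ("rejected", b"")
    return results


def find_disagreements(results: Dict[str, Outcome]) -> List[List[str]]:
    """Group implementations by outcome; more than one group means they disagree."""
    groups: Dict[Outcome, List[str]] = defaultdict(list)
    for name, outcome in results.items():
        groups[outcome].append(name)
    return list(groups.values()) if len(groups) > 1 else []
```

An empty result from `find_disagreements` means every implementation either produced the same post-state or rejected the input. Any other result lists the camps that need manual triage.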
Tool #3 - beacon-fuzz-2 - Differential Fuzzing with FFI Bindings
This tool is the successor of the existing Beacon Fuzz C++ project. It will be developed in Rust (for ease of maintainability) and will leverage Foreign Function Interface (FFI) bindings. This will inevitably result in slower processing and fuzzing (compared to eth2fuzz) but should enable the identification of more complex logic bugs.
We're very keen to get feedback on our new approach and are quite excited to continue helping the community ship safe and secure Eth2 clients. We've also updated the Trophies section of Beacon Fuzz, which shows that our fuzzing efforts have helped identify 16 unique bugs/hardening opportunities across 4 implementations.