Beacon Fuzz - Progress Update #9:
Critical consensus bug and DevOps endeavours.
Sigma Prime is leading the development and maintenance of beacon-fuzz, a differential fuzzing solution for Eth2 clients. This post is part of our series of status updates where we discuss current progress, interesting challenges encountered and direction for future work. See #00 and the repository's README for more context.
Summary
- 2 consensus-related bugs identified on Prysm with beaconfuzz_v2
- 1 minor/unexploitable discrepancy identified on Teku
- DevOps update
Differential Structural Fuzzing: 2 New Critical Vulnerabilities
The latest round of testing has once again showcased the effectiveness of our structural differential fuzzer. Over the last couple of weeks, we've identified two consensus bugs affecting Prysm, and one minor spec deviation (unexploitable) on Teku.
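To make the idea of differential fuzzing concrete, here is a minimal sketch of the comparison step: the same input is fed to every implementation, and any disagreement in the outcome is flagged. The two toy "clients", the `MAX_SLOT` bound and the `find_disagreement` helper are hypothetical names used for illustration only; they are not part of beaconfuzz_v2 or of any eth2 client.

```python
from typing import Callable, Dict, Optional

# A hypothetical "client" takes a slot number and returns a post-state
# (modelled here as a string), or None when it rejects the input.
Client = Callable[[int], Optional[str]]

MAX_SLOT = 8  # illustrative bound, not a spec constant


def strict_client(slot: int) -> Optional[str]:
    """Toy implementation that enforces the bounds check."""
    return f"post-state-{slot}" if slot <= MAX_SLOT else None


def lenient_client(slot: int) -> Optional[str]:
    """Toy implementation that deliberately skips the bounds check."""
    return f"post-state-{slot}"


def find_disagreement(
    clients: Dict[str, Client], fuzz_input: int
) -> Optional[Dict[str, Optional[str]]]:
    """Run one input through every client; return all results if they differ."""
    results = {name: run(fuzz_input) for name, run in clients.items()}
    return results if len(set(results.values())) > 1 else None


CLIENTS = {"strict": strict_client, "lenient": lenient_client}

print(find_disagreement(CLIENTS, 3))  # both clients agree
print(find_disagreement(CLIENTS, 9))  # only the lenient client accepts
```

In a real differential fuzzer the inputs come from a coverage-guided engine and the compared values are the resulting post-states (or rejection errors), but the comparison logic follows the same shape.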
Prysm: Off-by-one bug in process_attestation
The target fuzz_process_attestation-struct (structural differential fuzzer exercising the Attestation processing functions as part of beaconfuzz_v2) raised a difference between Lighthouse, Nimbus and Teku on one hand, and Prysm on the other, triggered with the following Attestation:
{
  "aggregation_bits": "AwE=",
  "data": {
    "slot": 0,
    "index": 1,
    "beacon_block_root": "0x008ffefefefe0000000000000000000000000000000000000000000000000000",
    "source": {
      "epoch": 0,
      "root": "0x0000000000000000000000000000000000000000000000000000000000000000"
    },
    "target": {
      "epoch": 0,
      "root": "0x0000000000000000000000000000000000000000000000000000000000000000"
    }
  },
  "signature": "0x000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
}
The interesting part is the index field of this attestation, which represents the committee index for that particular slot. In eth2, validators are shuffled into committees, and each slot is assigned one or more committees, i.e. sets of validators expected to attest at that slot (the actual number of committees depends on the number of active validators).
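As a rough illustration of how the committee count depends on the number of active validators, the sketch below mirrors the arithmetic of the specification's get_committee_count_per_slot(). It is simplified on purpose: the hypothetical `committee_count_per_slot` helper takes the active validator count directly instead of a BeaconState and an epoch, and the constants are the mainnet preset values as we understand them, so treat them as assumptions if you are running a different configuration.

```python
# Mainnet preset values from the phase 0 specification (assumed here;
# other configurations use different numbers).
MAX_COMMITTEES_PER_SLOT = 64
SLOTS_PER_EPOCH = 32
TARGET_COMMITTEE_SIZE = 128


def committee_count_per_slot(active_validator_count: int) -> int:
    """Simplified stand-in for the spec's get_committee_count_per_slot()."""
    return max(
        1,
        min(
            MAX_COMMITTEES_PER_SLOT,
            active_validator_count // SLOTS_PER_EPOCH // TARGET_COMMITTEE_SIZE,
        ),
    )


for count in (64, 4096, 16384, 1_000_000):
    committees = committee_count_per_slot(count)
    # Valid committee indices are 0 .. committees - 1.
    print(count, committees, list(range(committees))[:4])
```

With a small validator set the count bottoms out at 1, which means 0 is the only valid committee index for every slot.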
This crash was produced for a BeaconState that only has 1 committee (i.e. the only valid committee index is 0). Let's take a look at the attestation processing function, as per the eth2 specification:
def process_attestation(state: BeaconState, attestation: Attestation) -> None:
    data = attestation.data
    assert data.target.epoch in (get_previous_epoch(state), get_current_epoch(state))
    assert data.target.epoch == compute_epoch_at_slot(data.slot)
    assert data.slot + MIN_ATTESTATION_INCLUSION_DELAY <= state.slot <= data.slot + SLOTS_PER_EPOCH
    assert data.index < get_committee_count_per_slot(state, data.target.epoch)
    committee = get_beacon_committee(state, data.slot, data.index)
    assert len(attestation.aggregation_bits) == len(committee)
    pending_attestation = PendingAttestation(
        data=data,
        aggregation_bits=attestation.aggregation_bits,
        inclusion_delay=state.slot - data.slot,
        proposer_index=get_beacon_proposer_index(state),
    )
    if data.target.epoch == get_current_epoch(state):
        assert data.source == state.current_justified_checkpoint
        state.current_epoch_attestations.append(pending_attestation)
    else:
        assert data.source == state.previous_justified_checkpoint
        state.previous_epoch_attestations.append(pending_attestation)
    # Verify signature
    assert is_valid_indexed_attestation(state, get_indexed_attestation(state, attestation))
As we can see, the Attestation committee index is validated in the fourth assertion (assert data.index < get_committee_count_per_slot(state, data.target.epoch)). Let's now take a look at how Prysm handles this check (ProcessAttestationNoVerify() function):
...
c := helpers.SlotCommitteeCount(activeValidatorCount)
if att.Data.CommitteeIndex > c {
	return nil, fmt.Errorf("committee index %d >= committee count %d", att.Data.CommitteeIndex, c)
}
...
As we can see, the inequality check does not follow the spec: att.Data.CommitteeIndex > c should be att.Data.CommitteeIndex >= c (off-by-one bug). As a result, Prysm accepts invalid attestations whose committee index is equal to the committee count (i.e. one past the last valid index), and produces a post-BeaconState when other implementations reject them (Lighthouse, for example, throws a BadCommitteeIndex error).
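The difference between the two comparisons can be reproduced in a few lines. The sketch below is a toy model of the check only, not Prysm's or the specification's actual code: `spec_accepts` and `buggy_accepts` are hypothetical helper names, and the committee count of 1 matches the BeaconState described above.

```python
def spec_accepts(committee_index: int, committee_count: int) -> bool:
    """Spec behaviour: assert data.index < committee count."""
    return committee_index < committee_count


def buggy_accepts(committee_index: int, committee_count: int) -> bool:
    """Off-by-one behaviour: only reject when index > committee count."""
    return not (committee_index > committee_count)


COMMITTEE_COUNT = 1  # the only valid committee index is 0

for index in (0, 1, 2):
    print(index, spec_accepts(index, COMMITTEE_COUNT), buggy_accepts(index, COMMITTEE_COUNT))
```

The two checks only disagree when the index equals the committee count, which is exactly the index: 1 case carried by the attestation that the fuzzer produced.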
Please refer to this GitHub issue for more details. This critical vulnerability was patched by Prysmatic Labs in this PR.
Props to @Daft-Wullie for reporting this!
Prysm: Incorrect epoch in process_proposer_slashing
Another consensus-related discrepancy was identified by the target fuzz_proces_proposer_slashing-struct (structural differential fuzzer exercising the ProposerSlashing processing functions as part of beaconfuzz_v2). The following ProposerSlashing object triggered this bug:
{
  "signed_header_1": {
    "message": {
      "slot": 0,
      "proposer_index": 0,
      "parent_root": "0x000000000000000000ffc33bffff0a0a00000000000000000000000000000000",
      "state_root": "0x0000000000000000000000000000000000000000000000000000000000000000",
      "body_root": "0x0000000000000000000000000000000000000000000000000000000000000000"
    },
    "signature": "0x000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
  },
  "signed_header_2": {
    "message": {
      "slot": 0,
      "proposer_index": 0,
      "parent_root": "0x0000000000000000000000000000000000000000000000000000000000000000",
      "state_root": "0x0000000000000000000000000000000000000000000000000000000000000000",
      "body_root": "0x0000000000000000000000000000000000000000000000000000000000000000"
    },
    "signature": "0x000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
  }
}
While Lighthouse, Nimbus and Teku reject this ProposerSlashing (with Lighthouse throwing a ProposerNotSlashable error), Prysm accepts it as a valid slashing and produces a post-BeaconState.
Let's take a look at how these slashings are expected to be validated. The is_slashable_validator() function called in the ProposerSlashing processing is defined as:
def is_slashable_validator(validator: Validator, epoch: Epoch) -> bool:
    """
    Check if ``validator`` is slashable.
    """
    return (not validator.slashed) and (validator.activation_epoch <= epoch < validator.withdrawable_epoch)
which is called as follows in the process_proposer_slashing function (fourth assertion):
def process_proposer_slashing(state: BeaconState, proposer_slashing: ProposerSlashing) -> None:
    header_1 = proposer_slashing.signed_header_1.message
    header_2 = proposer_slashing.signed_header_2.message
    # Verify header slots match
    assert header_1.slot == header_2.slot
    # Verify header proposer indices match
    assert header_1.proposer_index == header_2.proposer_index
    # Verify the headers are different
    assert header_1 != header_2
    # Verify the proposer is slashable
    proposer = state.validators[header_1.proposer_index]
    assert is_slashable_validator(proposer, get_current_epoch(state))
    # Verify signatures
    for signed_header in (proposer_slashing.signed_header_1, proposer_slashing.signed_header_2):
        domain = get_domain(state, DOMAIN_BEACON_PROPOSER, compute_epoch_at_slot(signed_header.message.slot))
        signing_root = compute_signing_root(signed_header.message, domain)
        assert bls.Verify(proposer.pubkey, signing_root, signed_header.signature)
    slash_validator(state, header_1.proposer_index)
Again, let's compare the specification with the Prysm implementation (VerifyProposerSlashing() function in proposer_slashing.go):
if !helpers.IsSlashableValidatorUsingTrie(proposer, helpers.SlotToEpoch(hSlot)) {
	return fmt.Errorf("validator with key %#x is not slashable", proposer.PublicKey())
}
The IsSlashableValidatorUsingTrie() function is defined in validators.go as follows:
// IsSlashableValidatorUsingTrie checks if a read only validator is slashable.
func IsSlashableValidatorUsingTrie(val stateTrie.ReadOnlyValidator, epoch uint64) bool {
	return checkValidatorSlashable(val.ActivationEpoch(), val.WithdrawableEpoch(), val.Slashed(), epoch)
}

func checkValidatorSlashable(activationEpoch, withdrawableEpoch uint64, slashed bool, epoch uint64) bool {
	active := activationEpoch <= epoch
	beforeWithdrawable := epoch < withdrawableEpoch
	return beforeWithdrawable && active && !slashed
}
As we can see, Prysm is using an incorrect epoch when validating the ProposerSlashing. Instead of using the current epoch from the BeaconState (get_current_epoch(state) in the specification), the Golang client relies on the epoch derived from the slot provided in the ProposerSlashing (helpers.SlotToEpoch(hSlot)). As such, Prysm accepts this particular slashing when it should be rejected, since the related validator/proposer is in fact not slashable at the current epoch.
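To see how the choice of epoch changes the outcome, consider the sketch below. The validator fields and the state slot are hypothetical values chosen for illustration (the actual BeaconState that triggered the bug is not reproduced in this post), and `Validator` is a stripped-down stand-in for the spec type. Only the header slot of 0 is taken from the ProposerSlashing above, and SLOTS_PER_EPOCH is assumed to be the mainnet value of 32.

```python
from dataclasses import dataclass

SLOTS_PER_EPOCH = 32  # mainnet preset (assumed)


@dataclass
class Validator:
    """Minimal stand-in for the spec's Validator container."""
    slashed: bool
    activation_epoch: int
    withdrawable_epoch: int


def compute_epoch_at_slot(slot: int) -> int:
    return slot // SLOTS_PER_EPOCH


def is_slashable_validator(validator: Validator, epoch: int) -> bool:
    return (not validator.slashed) and (
        validator.activation_epoch <= epoch < validator.withdrawable_epoch
    )


# Hypothetical scenario: the proposer became withdrawable at epoch 3,
# and the state is now in epoch 5.
proposer = Validator(slashed=False, activation_epoch=0, withdrawable_epoch=3)
state_slot = 5 * SLOTS_PER_EPOCH
header_slot = 0  # slot carried by both headers in the ProposerSlashing

# Spec behaviour: evaluate slashability at the state's current epoch.
spec_result = is_slashable_validator(proposer, compute_epoch_at_slot(state_slot))
# Deviating behaviour: evaluate slashability at the header's epoch.
header_epoch_result = is_slashable_validator(proposer, compute_epoch_at_slot(header_slot))

print(spec_result, header_epoch_result)  # False True
```

Under these assumed values, the spec-style check rejects the slashing while the header-epoch check accepts it, which is the same kind of split the fuzzer observed between the clients.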
Please refer to this GitHub issue for more details. This critical vulnerability was patched by Prysmatic Labs in this PR.
Teku: Minor difference in process_proposer_slashing
A minor/unexploitable specification deviation affecting Teku was identified by the target fuzz_proposer_slashing-struct (structural differential fuzzer exercising the ProposerSlashing processing functions as part of beaconfuzz_v2).
This same discrepancy was previously identified on Prysm (refer to our previous update), whereby the ProposerSlashing processing compares the two SignedBeaconBlockHeaders instead of the BeaconBlockHeaders.
This discrepancy is not directly exploitable, as it would require a malicious actor to be able to produce two different, valid BLS signatures for the same message (BeaconBlockHeader). Nonetheless, this was quickly fixed by the Teku team in this PR.
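The distinction between the two comparisons can be sketched with plain dataclasses. The types below are simplified stand-ins for the spec containers, and the signatures are placeholder bytes rather than valid BLS signatures, which is precisely why the deviation cannot be exploited in practice: the later signature verification step would still reject them.

```python
from dataclasses import dataclass

ZERO_ROOT = b"\x00" * 32


@dataclass(frozen=True)
class BeaconBlockHeader:
    slot: int
    proposer_index: int
    parent_root: bytes
    state_root: bytes
    body_root: bytes


@dataclass(frozen=True)
class SignedBeaconBlockHeader:
    message: BeaconBlockHeader
    signature: bytes


# Same header signed "twice" with different placeholder signatures.
header = BeaconBlockHeader(0, 0, ZERO_ROOT, ZERO_ROOT, ZERO_ROOT)
signed_1 = SignedBeaconBlockHeader(header, b"\x01" * 96)
signed_2 = SignedBeaconBlockHeader(header, b"\x02" * 96)

# Spec check: the two *headers* must differ.
headers_differ = signed_1.message != signed_2.message
# Deviating check: the two *signed headers* must differ.
signed_headers_differ = signed_1 != signed_2

print(headers_differ, signed_headers_differ)  # False True
```

Comparing the signed containers lets this pair through the "headers are different" check, whereas comparing the messages rejects it immediately, as the specification requires.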
Please refer to the relevant GitHub issue and our previous blog post for more details.
Cloud Fuzzing Infrastructure
We're delighted to be working with a future Ethereum Foundation team member who will be joining the eth2 DevOps effort imminently and helping to automate the deployment and monitoring of our various fuzzers on our dedicated AWS fuzzing infrastructure. They have already produced handy Ansible scripts, and we look forward to leveraging their expertise to define and implement processes that streamline the execution of our fuzzing targets.