Beacon Fuzz - Update #09

Critical consensus bugs and DevOps endeavours.

Sigma Prime is leading the development and maintenance of beacon-fuzz, a differential fuzzing solution for Eth2 clients. This post is part of our series of status updates where we discuss current progress, interesting challenges encountered and direction for future work. See #00 and the repository's README for more context.

Summary

  • 2 consensus-related bugs identified on Prysm with beaconfuzz_v2
  • 1 minor/unexploitable discrepancy identified on Teku
  • DevOps update

Differential Structural Fuzzing: 2 New Critical Vulnerabilities

The latest round of testing has once again showcased the effectiveness of our structural differential fuzzer. Over the last couple of weeks, we've identified two consensus bugs affecting Prysm, and one minor spec deviation (unexploitable) on Teku.

Prysm: Off-by-one bug in process_attestation

The target fuzz_process_attestation-struct (structural differential fuzzer exercising the Attestation processing functions as part of beaconfuzz_v2) flagged a discrepancy between Lighthouse, Nimbus and Teku on the one hand, and Prysm on the other, triggered by the following Attestation:

{
  "aggregation_bits": "AwE=",
  "data": {
    "slot": 0,
    "index": 1,
    "beacon_block_root": "0x008ffefefefe0000000000000000000000000000000000000000000000000000",
    "source": {
      "epoch": 0,
      "root": "0x0000000000000000000000000000000000000000000000000000000000000000"
    },
    "target": {
      "epoch": 0,,
      "root": "0x0000000000000000000000000000000000000000000000000000000000000000"
    }
  },
  "signature": "0x000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
}

The interesting part of this attestation is its index field, which represents the committee index for that particular slot. In eth2, validators are shuffled into committees, and each slot is assigned one or more committees, i.e. sets of validators expected to attest at that slot (the actual number of committees per slot depends on the number of active validators), as the sketch below illustrates.
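To put numbers on this, here is a minimal, runnable sketch mirroring the spec's get_committee_count_per_slot() helper, simplified to take the active validator count directly (the constants are the mainnet values):

# Simplified mirror of get_committee_count_per_slot() from the phase 0 spec,
# taking the active validator count directly. Constants are mainnet values.
SLOTS_PER_EPOCH = 32
TARGET_COMMITTEE_SIZE = 128
MAX_COMMITTEES_PER_SLOT = 64

def committee_count_per_slot(active_validator_count: int) -> int:
    return max(1, min(
        MAX_COMMITTEES_PER_SLOT,
        active_validator_count // SLOTS_PER_EPOCH // TARGET_COMMITTEE_SIZE,
    ))

assert committee_count_per_slot(64) == 1       # tiny state: a single committee
assert committee_count_per_slot(16_384) == 4   # more validators, more committees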

This crash was produced for a BeaconState that only has one committee (i.e. the only valid committee index is 0). Let's take a look at the attestation processing function, as per the eth2 specification:

def process_attestation(state: BeaconState, attestation: Attestation) -> None:
    data = attestation.data
    assert data.target.epoch in (get_previous_epoch(state), get_current_epoch(state))
    assert data.target.epoch == compute_epoch_at_slot(data.slot)
    assert data.slot + MIN_ATTESTATION_INCLUSION_DELAY <= state.slot <= data.slot + SLOTS_PER_EPOCH
    assert data.index < get_committee_count_per_slot(state, data.target.epoch)

    committee = get_beacon_committee(state, data.slot, data.index)
    assert len(attestation.aggregation_bits) == len(committee)

    pending_attestation = PendingAttestation(
        data=data,
        aggregation_bits=attestation.aggregation_bits,
        inclusion_delay=state.slot - data.slot,
        proposer_index=get_beacon_proposer_index(state),
    )

    if data.target.epoch == get_current_epoch(state):
        assert data.source == state.current_justified_checkpoint
        state.current_epoch_attestations.append(pending_attestation)
    else:
        assert data.source == state.previous_justified_checkpoint
        state.previous_epoch_attestations.append(pending_attestation)

    # Verify signature
    assert is_valid_indexed_attestation(state, get_indexed_attestation(state, attestation))

As we can see, the Attestation committee index is validated by the fourth assertion (assert data.index < get_committee_count_per_slot(state, data.target.epoch)). Let's now take a look at how Prysm handles this check (in the ProcessAttestationNoVerify() function):

...
c := helpers.SlotCommitteeCount(activeValidatorCount)
if att.Data.CommitteeIndex > c {
  return nil, fmt.Errorf("committee index %d >= committee count %d", att.Data.CommitteeIndex, c)
}
...

As we can see, the inequality check does not follow the spec: att.Data.CommitteeIndex > c should be att.Data.CommitteeIndex >= c (off-by-one bug). As a result, Prysm accepts invalid attestations whose committee index is out of range and produces a post-BeaconState, while other implementations reject them (Lighthouse, for example, throws a BadCommitteeIndex error).
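To make the off-by-one concrete, here is a minimal sketch using the values from the crashing input (a state with a single committee, an attestation with committee index 1):

# Values mirroring the crashing input: the state has a single committee
# per slot, so the only valid committee index is 0.
committee_count = 1
attestation_index = 1  # out of range

prysm_rejects = attestation_index > committee_count   # 1 > 1  -> False
spec_rejects = attestation_index >= committee_count   # 1 >= 1 -> True

assert not prysm_rejects  # Prysm processes the invalid attestation
assert spec_rejects       # spec-compliant clients reject it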

Please refer to this GitHub issue for more details. This critical vulnerability was patched by Prysmatic Labs in this PR.

Props to @Daft-Wullie for reporting this!

Prysm: Incorrect epoch in process_proposer_slashing

Another consensus-related discrepancy was identified by the target fuzz_process_proposer_slashing-struct (structural differential fuzzer exercising the ProposerSlashing processing functions as part of beaconfuzz_v2). The following ProposerSlashing object triggered this bug:

{
  "signed_header_1": {
    "message": {
      "slot": 0,
      "proposer_index": 0,
      "parent_root": "0x000000000000000000ffc33bffff0a0a00000000000000000000000000000000",
      "state_root": "0x0000000000000000000000000000000000000000000000000000000000000000",
      "body_root": "0x0000000000000000000000000000000000000000000000000000000000000000"
    },
    "signature": "0x000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
  },
  "signed_header_2": {
    "message": {
      "slot": 0,
      "proposer_index": 0,
      "parent_root": "0x0000000000000000000000000000000000000000000000000000000000000000",
      "state_root": "0x0000000000000000000000000000000000000000000000000000000000000000",
      "body_root": "0x0000000000000000000000000000000000000000000000000000000000000000"
    },
    "signature": "0x000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
  }
}

While Lighthouse, Nimbus and Teku reject this ProposerSlashing (with Lighthouse throwing a ProposerNotSlashable error), Prysm accepts it as a valid slashing and produces a post-BeaconState.

Let's take a look at how these slashings are expected to be validated. The is_slashable_validator() function called in the ProposerSlashing processing is defined as:

def is_slashable_validator(validator: Validator, epoch: Epoch) -> bool:
    """
    Check if ``validator`` is slashable.
    """
    return (not validator.slashed) and (validator.activation_epoch <= epoch < validator.withdrawable_epoch)

which is called as follows in the process_proposer_slashing function (fourth assertion):

def process_proposer_slashing(state: BeaconState, proposer_slashing: ProposerSlashing) -> None:
    header_1 = proposer_slashing.signed_header_1.message
    header_2 = proposer_slashing.signed_header_2.message

    # Verify header slots match
    assert header_1.slot == header_2.slot
    # Verify header proposer indices match
    assert header_1.proposer_index == header_2.proposer_index
    # Verify the headers are different
    assert header_1 != header_2
    # Verify the proposer is slashable
    proposer = state.validators[header_1.proposer_index]
    assert is_slashable_validator(proposer, get_current_epoch(state))
    # Verify signatures
    for signed_header in (proposer_slashing.signed_header_1, proposer_slashing.signed_header_2):
        domain = get_domain(state, DOMAIN_BEACON_PROPOSER, compute_epoch_at_slot(signed_header.message.slot))
        signing_root = compute_signing_root(signed_header.message, domain)
        assert bls.Verify(proposer.pubkey, signing_root, signed_header.signature)

    slash_validator(state, header_1.proposer_index)

Again, let's compare the specification with the Prysm implementation (VerifyProposerSlashing() function in proposer_slashing.go):

if !helpers.IsSlashableValidatorUsingTrie(proposer, helpers.SlotToEpoch(hSlot)) {
  return fmt.Errorf("validator with key %#x is not slashable", proposer.PublicKey())
}

The IsSlashableValidatorUsingTrie() function is defined in validators.go as follows:

// IsSlashableValidatorUsingTrie checks if a read only validator is slashable.
func IsSlashableValidatorUsingTrie(val stateTrie.ReadOnlyValidator, epoch uint64) bool {
    return checkValidatorSlashable(val.ActivationEpoch(), val.WithdrawableEpoch(), val.Slashed(), epoch)
}

func checkValidatorSlashable(activationEpoch, withdrawableEpoch uint64, slashed bool, epoch uint64) bool {
    active := activationEpoch <= epoch
    beforeWithdrawable := epoch < withdrawableEpoch
    return beforeWithdrawable && active && !slashed
}

As we can see, Prysm uses an incorrect epoch when validating the ProposerSlashing: instead of the current epoch from the BeaconState (get_current_epoch(state) in the specification), the Golang client derives the epoch from the slot referenced in the slashing's header (helpers.SlotToEpoch(hSlot)). As a result, Prysm accepts this particular slashing when it should be rejected, since the related validator/proposer is in fact not slashable.
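Here is a minimal sketch (with hypothetical epochs) of why the choice of epoch matters:

# Hypothetical validator: activated at epoch 0, withdrawable from epoch 10,
# not yet slashed. The slashing's headers reference slot 0 (epoch 0).
activation_epoch, withdrawable_epoch, slashed = 0, 10, False

def is_slashable(epoch: int) -> bool:
    # Mirrors is_slashable_validator() from the spec.
    return (not slashed) and (activation_epoch <= epoch < withdrawable_epoch)

current_epoch = 12  # get_current_epoch(state): what the spec mandates
header_epoch = 0    # helpers.SlotToEpoch(hSlot): what Prysm used

assert not is_slashable(current_epoch)  # spec: reject the slashing
assert is_slashable(header_epoch)       # Prysm: wrongly accept it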

Please refer to this GitHub issue for more details. This critical vulnerability was patched by Prysmatic Labs in this PR.

Teku: Minor difference in process_proposer_slashing

A minor/unexploitable specification deviation affecting Teku was identified by the target fuzz_process_proposer_slashing-struct (structural differential fuzzer exercising the ProposerSlashing processing functions as part of beaconfuzz_v2).

The same discrepancy was previously identified on Prysm (refer to our previous update): the ProposerSlashing processing compares the two SignedBeaconBlockHeaders instead of the underlying BeaconBlockHeaders.
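Here is a minimal sketch (with simplified, hypothetical stand-ins for the SSZ container types) of why the compared object matters:

from dataclasses import dataclass

# Simplified stand-ins for the SSZ containers (field subset only).
@dataclass
class BeaconBlockHeader:
    slot: int
    proposer_index: int

@dataclass
class SignedBeaconBlockHeader:
    message: BeaconBlockHeader
    signature: bytes

header = BeaconBlockHeader(slot=0, proposer_index=0)
signed_1 = SignedBeaconBlockHeader(header, signature=b"\x01" * 96)
signed_2 = SignedBeaconBlockHeader(header, signature=b"\x02" * 96)

# Spec: compare the messages -- they are identical, so the slashing
# must be rejected.
assert signed_1.message == signed_2.message
# Comparing the signed wrappers instead: the signatures differ, so the
# "headers are different" check wrongly passes.
assert signed_1 != signed_2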

This discrepancy is not directly exploitable, as it would require a malicious actor to be able to produce two different, valid BLS signatures for the same message (BeaconBlockHeader). Nonetheless, this was quickly fixed by the Teku team in this PR.

Please refer to the relevant GitHub issue and our previous blog post for more details.

Cloud Fuzzing Infrastructure

We're delighted to be working with a future Ethereum Foundation team member who will imminently be joining the eth2 DevOps effort to help automate the deployment and monitoring of our various fuzzers on our dedicated AWS fuzzing infrastructure. They have already produced handy Ansible scripts, and we look forward to leveraging their expertise to define and implement processes that streamline the execution of our fuzzing targets.