NEAR: Sharding & Cross Contract Calls

Introduction

NEAR Protocol implements a sharding design called "Nightshade" that aims to improve scalability while maintaining security and usability. This article explores the technical details of NEAR's sharding implementation and how cross-contract calls work within this architecture, with a focus on security implications.

The Magic of Nightshade Sharding

Sharding in distributed systems such as blockchains increases a network’s speed and capacity by dividing it into multiple “shards”. Each shard can then process transactions independently of the others, which in theory increases transaction throughput.

Nightshade Sharding

NEAR implements its own sharding method, called “Nightshade”. Each shard processes a subset of the network’s transactions in parallel. Instead of running a separate chain per shard, Nightshade splits each block into per-shard pieces (“chunks”). It thus maintains a single chain and allows for asynchronous (“cross-shard”) transactions.

Here is a simplified illustration of what Nightshade looks like compared to the Beacon Chain:

Figure 1. Nightshade sharding illustrated (Beacon Chain compared to Nightshade). Source: LiNEAR Protocol blog
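To make the idea of splitting work across shards more concrete, here is a minimal sketch in plain Rust of range-based shard assignment. It assumes that shards are delimited by a sorted list of boundary account IDs. The function names and the boundary values used with them are hypothetical and chosen purely for illustration; they are not taken from NEAR's codebase.

```rust
/// Returns the index of the shard responsible for `account_id`.
///
/// Assumption: `boundaries` is sorted in ascending order. Shard 0 holds every
/// account below the first boundary, shard 1 holds accounts from the first
/// boundary up to (but excluding) the second, and so on.
fn shard_for_account(account_id: &str, boundaries: &[&str]) -> usize {
    // Number of boundaries that are <= account_id (byte-wise string order).
    boundaries.partition_point(|boundary| *boundary <= account_id)
}

/// A call is cross-shard when sender and receiver map to different shards.
fn is_cross_shard(sender: &str, receiver: &str, boundaries: &[&str]) -> bool {
    shard_for_account(sender, boundaries) != shard_for_account(receiver, boundaries)
}
```

With the hypothetical boundaries `["g", "n", "t"]` there are four shards: `alice.near` falls into shard 0 and `sam.near` into shard 2, so a call between them would be cross-shard.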

Dynamic Resharding

NEAR’s sharding design includes “Dynamic Resharding”, which lets the network adjust the number of shards automatically based on demand. NEAR validators also only need to track the state of their assigned shard.

Cross-Contract Calls: The Bridge Between Shards

Because of the nature of NEAR’s sharding technology, its cross-contract calls are asynchronous and independent: the call and its callback execute in separate blocks. This has important implications for how callbacks should be handled.

The Building Blocks

Cross-contract calls in NEAR operate through a promise system. Every cross-contract interaction involves two key components: making the call and handling the result. Here is what happens under the hood:

When Contract A needs to interact with Contract B, it creates a promise using NEAR's SDK. This promise represents the future result of the call to Contract B. The process is asynchronous: Contract A's method finishes executing once the promise has been scheduled, and the result of the call to Contract B is delivered later to a callback. In the meantime, Contract A remains free to process other incoming calls.

Here is an example of how this works in code:

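// Assumes the near-sdk 4.x API (Gas as a tuple struct, deposit as a u128).
use near_sdk::serde_json::json;
use near_sdk::{env, near_bindgen, Gas, Promise, PromiseError};

#[near_bindgen]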
impl MyContract {
    pub fn initiate_cross_contract_call(&mut self) -> Promise {
        Promise::new("contract-b.near".parse().unwrap())
            .function_call(
                "process_request".to_string(),
                json!({
                    "data": "example"
                }).to_string().into_bytes(),
                0, // attached deposit
                Gas(5_000_000_000_000) // attached gas
            )
            .then(
                Promise::new(env::current_account_id())
                    .function_call(
                        "callback_function".to_string(),
                        Vec::new(),
                        0,
                        Gas(2_000_000_000_000)
                    )
            )
    }

    #[private]
    pub fn callback_function(&mut self, #[callback_result] call_result: Result<String, PromiseError>) {
        match call_result {
            Ok(result) => {
                // Handle successful result
            }
            Err(_) => {
                // Handle error case
            }
        }
    }
}

Security Considerations

The asynchronous nature of these cross-contract calls introduces potential vulnerabilities that must be carefully considered.

Race conditions present one of the most significant security challenges. During the period between a cross-contract call and its callback execution (typically 1-2 blocks), your contract remains active and callable. This means a malicious user could potentially exploit this window to manipulate the contract state.

Let us examine two concrete scenarios that illustrate both proper implementation and potential vulnerabilities.

Scenario 1: Normal Operation Flow

[Figure: Normal Operation Flow Diagram]

In this diagram, we see a legitimate user, "Alice", interacting with the smart contract:

  1. Alice initiates by depositing 100N
  2. The contract calls stake() with 100N to receive 200 tokens
  3. The external contract processes the swap
  4. A promise result returns from the external contract
  5. After 2 blocks, the callback executes to decrease the internal NEAR balance

The final state shows Alice with 0N balance and 200 tokens received. This is the expected, secure behavior where the contract maintains proper state management throughout the asynchronous operation.

Scenario 2: Potential Exploit

However, without proper protections, a malicious user could exploit the asynchronous nature of cross-contract calls:

[Figure: Exploit Operation Flow Diagram]

In this diagram, we see how an attacker could potentially exploit the system:

  1. Attacker deposits 100N
  2. They call stake() multiple times (2 times in this example) with 100N before the first callback completes
  3. The external contract processes multiple swap requests
  4. Promise results return from the external contract
  5. Callbacks execute after 2 blocks, leading to two possible outcomes:
    • Successful case: NEAR balance decreases to 0, but the attacker receives 400 tokens
    • Error case: A panic occurs due to underflow when trying to decrease the balance after the second callback

This vulnerability exists because the original implementation does not prevent multiple stake() calls during the processing period.
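The exploit flow can be reproduced with a small plain-Rust model that does not use the NEAR SDK. This is a sketch under simplifying assumptions: the `VulnerableContract` type and its methods are hypothetical, the external contract's token ledger is folded into the same struct for brevity, the 2-tokens-per-NEAR rate is taken from the scenario above, and the queue of pending callbacks stands in for the delay of a couple of blocks.

```rust
use std::collections::{HashMap, VecDeque};

/// A callback scheduled to run in a later block.
struct PendingCallback {
    account: String,
    amount: u128,
}

#[derive(Default)]
struct VulnerableContract {
    near_balance: HashMap<String, u128>,
    tokens: HashMap<String, u128>, // stands in for the external contract's ledger
    pending: VecDeque<PendingCallback>,
}

impl VulnerableContract {
    fn deposit(&mut self, account: &str, amount: u128) {
        *self.near_balance.entry(account.to_string()).or_insert(0) += amount;
    }

    /// Vulnerable: the balance is checked here but only decreased in the callback.
    fn stake(&mut self, account: &str, amount: u128) -> Result<(), String> {
        let balance = self.near_balance.get(account).copied().unwrap_or(0);
        if balance < amount {
            return Err("insufficient balance".to_string());
        }
        // The external swap pays out 2 tokens per NEAR (100N -> 200 tokens).
        *self.tokens.entry(account.to_string()).or_insert(0) += amount * 2;
        self.pending.push_back(PendingCallback { account: account.to_string(), amount });
        Ok(())
    }

    /// Runs the oldest pending callback, decreasing the internal NEAR balance.
    fn run_next_callback(&mut self) -> Result<(), String> {
        let cb = self.pending.pop_front().ok_or("no pending callback")?;
        let balance = self.near_balance.entry(cb.account).or_insert(0);
        // An underflow is reported as an error, where a contract would panic.
        *balance = balance.checked_sub(cb.amount).ok_or("balance underflow in callback")?;
        Ok(())
    }
}
```

In this model, depositing 100 and calling `stake` twice before any callback runs leaves the attacker with 400 tokens. The first callback brings the balance to 0, and the second fails on underflow, matching the two outcomes listed above.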

Mitigating the Vulnerability

To protect against such exploits, implement these essential security measures:

  • The callback function needs to be public, but callable only by the contract itself
    • Add the #[private] attribute above the function. This ensures that only the contract itself can call the callback function, not external users.
  • Ensure the contract is not in an exploitable state between call and callback
  • Manually roll back any state changes in the callback if the external call failed
    • Ensure enough gas is attached to the callback for it to transfer the funds back

Here is an example of how this works in code:

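// Assumes the near-sdk 4.x API, as in the earlier example.
use near_sdk::{env, near_bindgen, Gas, Promise, PromiseError};

#[near_bindgen]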
impl StakingContract {
    pub fn stake(&mut self) -> Promise {
        // Add state lock to prevent multiple calls
        assert!(!self.processing, "Already processing a stake operation");
        self.processing = true;

        // Store initial state for potential rollback
        self.last_stake_amount = self.stake_balance;

        Promise::new(self.staking_target.clone())
            .function_call(
                "stake_tokens".to_string(),
                // ... stake parameters ...
                Gas(5_000_000_000_000)
            )
            .then(Self::ext(env::current_account_id())
                .with_static_gas(Gas(2_000_000_000_000))
                .stake_callback())
    }

    #[private]
    pub fn stake_callback(&mut self, #[callback_result] call_result: Result<(), PromiseError>) {
        // Always reset processing flag in callback
        let processing = std::mem::replace(&mut self.processing, false);
        assert!(processing, "Callback called without active processing");

        match call_result {
            Ok(_) => {
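                // Verify state changes are valid. Note that a panic here reverts
                // this callback's own state changes, including the flag reset above.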
                assert!(
                    self.stake_balance >= self.last_stake_amount,
                    "Invalid state change detected"
                );
            }
            Err(_) => {
                // Rollback any state changes
                self.stake_balance = self.last_stake_amount;
                env::log_str("Stake operation failed, state rolled back");
            }
        }
    }
}
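A complementary approach to a contract-wide lock is to debit the user's balance in the initial call and refund it in the callback if the external call fails. The following plain-Rust model (again without the NEAR SDK) sketches that idea. The `SaferContract` type, the call-id bookkeeping, and the `external_call_succeeded` flag are assumptions made for illustration; in a real contract the account and amount would typically be passed to the callback as arguments, and the flag would come from the promise result.

```rust
use std::collections::HashMap;

#[derive(Default)]
struct SaferContract {
    near_balance: HashMap<String, u128>,
    /// Amounts already debited but not yet confirmed, keyed by call id.
    in_flight: HashMap<u64, (String, u128)>,
    next_id: u64,
}

impl SaferContract {
    fn deposit(&mut self, account: &str, amount: u128) {
        *self.near_balance.entry(account.to_string()).or_insert(0) += amount;
    }

    /// Debits the balance immediately, so a repeated call made before the
    /// callback runs already sees the reduced balance.
    fn stake(&mut self, account: &str, amount: u128) -> Result<u64, String> {
        let balance = self.near_balance.entry(account.to_string()).or_insert(0);
        *balance = balance.checked_sub(amount).ok_or("insufficient balance")?;
        let id = self.next_id;
        self.next_id += 1;
        self.in_flight.insert(id, (account.to_string(), amount));
        Ok(id)
    }

    /// Callback: finalizes the debit on success, refunds it on failure.
    fn stake_callback(&mut self, id: u64, external_call_succeeded: bool) -> Result<(), String> {
        // Each in-flight entry can be settled only once.
        let (account, amount) = self.in_flight.remove(&id).ok_or("unknown call id")?;
        if !external_call_succeeded {
            // Manual rollback: return the debited amount to the user.
            *self.near_balance.entry(account).or_insert(0) += amount;
        }
        Ok(())
    }
}
```

With this ordering, a second `stake` of 100N issued before the first callback is rejected because the balance is already 0, and a failed external call restores the original 100N. Unlike the global `processing` flag, it does not block other users while one stake is in flight.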

Conclusion

This post has covered the essentials of sharding in the NEAR blockchain and touched on some of the intricacies of cross-contract calls and their security implications.

Remember that this type of vulnerability is not unique to our example of staking operations. Any cross-contract call that modifies contract state could be vulnerable to similar race conditions if not properly protected. It is wise to assume that malicious users will attempt to exploit the time window between call and callback execution.

Testing and understanding these security measures is important for developers and security reviewers alike. Doing so strengthens protocol security, protects users' assets, and keeps the overall ecosystem safe and trusted. We encourage all developers and security reviewers to stay informed and be proactive in finding and mitigating these types of vulnerabilities.

At Sigma Prime, we are committed to securing and hardening Blockchain networks and protocols of all kinds. If you are building solutions and want to harness our cutting-edge security expertise in this area, get in touch!