In this scenario, we’ll explore Transfers, one of the core events in any blockchain, as it encapsulates the flow of tokens. We’ll be also working through a specific query to extract business value for a given question.

Scenario

Let’s say you’ve just joined a brand-new EVM chain, one of quite a few now:

Untitled

Your first task as a “data analyst”, “web3 analyst”, “blockchain analyst” or insert any other cool sounding title for “data analyst” here, is to create a token transfers table to be able to track the flow of erc20 tokens in and out of the network. You need to do this to serve their other request which is to know how many USDC and USDT tokens are inflows/outflows (minted/burned) on chain in the last month.

You are extremely lucky because the data engineering team has already set up a pipeline for ingesting data directly from a node and has provided you with the tables we discussed earlier, blocks, transactions, logs and traces.

Assuming the chain you’re working with follows the Ethereum Improvement Proposal standard for ERC20 tokens, then the event (emitted in logs) you’re looking for is the following:

event Transfer(address indexed _from, address indexed _to, uint256 _value)

From reading the reading material here we know that the Transfer event gets hexed and can then be found in the logs table. More specifically in topics as the first topic (topics[0] or topics[1], depending on your SQL engines indexing preferences). If you are using BigQuery as we will in this tutorial, the indices start from 0 onwards, so it would be found in topics[0].

Too long didn’t read

If you’ve not read the reading material in the earlier section, then you may ask google or even ChatGPT…

Untitled

It will give you some implementation using the web3.js library and the abi of the contract. The problem is that open-source projects, verify their code. Therefore others usually fork it and modify it, without necessarily verifying the contract, or even modify the emitted events, usually parameter names and what is indexed. For this reason, we’ll focus on extracting value out of the hex of an event directly instead of decoding it using the ABI. This gives us the flexibility of also capturing copy-cats.

Here is how hexing an event is done in Python:

Which in turn will return the following hex:

import sha3
k = sha3.keccak_256() # Hashing algorithm
event_emitted = 'Transfer(address,address,uint256)'
k.update(event_emitted.encode('utf-8'))
hexed_event = '0x' + k.hexdigest()

or if you prefer…

from web3 import Web3
event_emitted = 'Transfer(address,address,uint256)'
hash_object = Web3.keccak(text=event_emitted) # Hashing algorithm
hexed_event = hash_object.hex()

'0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef'

If you’re observant you may say, wait a minute, we’re only using the types of arguments, not the names or whatever this indexed thing is. You would be correct! This is how the events get hexed in the EVM, to make them more deterministic. In other words, I can also create an event like so: