Ethereum Transaction Data: Principles and Construction

Understanding Blockchain Transactions

Writing data to the Ethereum blockchain is called a transaction; reading data without changing state is called a call. Unlike a traditional database, Ethereum does not accept free-form records: data passed to a contract is ABI-encoded into bytes, which are conventionally displayed as a hexadecimal string. The encoding is built from basic types such as bytes32 and address. Reading the data back means decoding those bytes according to their declared types, for example interpreting a string or bytes value as UTF-8 text.
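As a small illustration of the text-to-hex step and back, the Python sketch below round-trips a short string through its hexadecimal form. The helper names are made up for this example and are not part of any Ethereum library.

```python
def text_to_hex(text: str) -> str:
    """Encode text as UTF-8 and render the bytes as a 0x-prefixed hex string."""
    return "0x" + text.encode("utf-8").hex()


def hex_to_text(hex_string: str) -> str:
    """Strip an optional 0x prefix and decode the bytes as UTF-8."""
    raw = hex_string[2:] if hex_string.startswith("0x") else hex_string
    return bytes.fromhex(raw).decode("utf-8")


encoded = text_to_hex("Alice")
print(encoded)               # 0x416c696365
print(hex_to_text(encoded))  # Alice
```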

Example: ETH Transfer Transaction

A common transaction is an ETH transfer. Consider Alice sending ETH to Bob on the Ethereum network. The transaction includes these fields:

From: Sender's address
To: Recipient's address
Value: Transfer amount, denominated in wei (1 ETH = 10^18 wei)
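The value field is an integer number of wei rather than a decimal ETH amount. A minimal sketch of that conversion follows; the function name and the placeholder addresses are hypothetical and chosen only for illustration.

```python
from decimal import Decimal

WEI_PER_ETH = 10 ** 18  # 1 ETH = 10^18 wei


def eth_to_wei(amount_eth: str) -> int:
    """Convert a decimal ETH amount (given as a string) to an integer wei value."""
    return int(Decimal(amount_eth) * WEI_PER_ETH)


# Illustrative field layout only; the addresses are placeholders, not real accounts.
transfer = {
    "from": "<Alice's address>",
    "to": "<Bob's address>",
    "value": hex(eth_to_wei("1")),
}
print(transfer["value"])  # 0xde0b6b3a7640000
```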

This structure is straightforward for ETH transfers. However, ERC20 token transfers via smart contracts require more complex data construction.

ERC20 Token Transfer Data Structure

For ERC20 transfers:

From: Sender's address
To: ERC20 token contract address
Input Data: Contains the core transfer details encoded for the EVM.

Key Components of Input Data

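MethodID: The 4-byte function selector, i.e. the first four bytes of the Keccak-256 hash of the canonical signature transfer(address,uint256).
Parameters:
- Recipient address (left-padded with zeros to 32 bytes, i.e. 64 hex characters).
- Token amount in the token's smallest unit (left-padded with zeros to 32 bytes, i.e. 64 hex characters).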
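To show where a MethodID such as 0xa9059cbb comes from, here is a sketch that hashes the canonical signature and keeps the first four bytes. Python's hashlib.sha3_256 uses different padding from Ethereum's Keccak-256, so the sketch carries its own small pure-Python Keccak-256, modeled on the compact reference implementation. It is meant for illustration, not for production use, and the function names are this example's own.

```python
MASK64 = (1 << 64) - 1


def _rol(value: int, n: int) -> int:
    """Rotate a 64-bit lane left by n bits."""
    n %= 64
    return ((value << n) | (value >> (64 - n))) & MASK64


def _keccak_f(lanes):
    """Keccak-f[1600] permutation on a 5x5 matrix of 64-bit lanes."""
    r = 1
    for _ in range(24):
        # theta
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi
        x, y = 1, 0
        cur = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            cur, lanes[x][y] = lanes[x][y], _rol(cur, (t + 1) * (t + 2) // 2)
        # chi
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota (round constants generated by an LFSR)
        for j in range(7):
            r = ((r << 1) ^ ((r >> 7) * 0x71)) % 256
            if r & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def keccak256(data: bytes) -> bytes:
    """Keccak-256 as used by Ethereum (domain padding byte 0x01)."""
    rate = 136  # bytes absorbed per permutation for a 256-bit digest
    padded = bytearray(data) + bytes(rate - len(data) % rate)
    padded[len(data)] ^= 0x01
    padded[-1] ^= 0x80
    lanes = [[0] * 5 for _ in range(5)]
    for off in range(0, len(padded), rate):
        block = padded[off:off + rate]
        for i in range(rate // 8):
            lanes[i % 5][i // 5] ^= int.from_bytes(block[8 * i:8 * i + 8], "little")
        lanes = _keccak_f(lanes)
    return b"".join(lanes[i % 5][i // 5].to_bytes(8, "little") for i in range(4))


def selector(signature: str) -> str:
    """First four bytes of the hash of a canonical signature, as hex."""
    return "0x" + keccak256(signature.encode("ascii"))[:4].hex()


print(selector("transfer(address,uint256)"))  # 0xa9059cbb
```

Note that the canonical signature contains only the function name and parameter types, with no parameter names and no spaces.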

Example ERC20 transfer data (selector, recipient address, and an amount of 10^18 base units, one per line):

0xa9059cbb
000000000000000000000000d0292fc87a77ced208207ec92c3c6549565d84dd
0000000000000000000000000000000000000000000000000de0b6b3a7640000
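The payload above can be reproduced by hand. The sketch below concatenates the selector, the padded recipient, and the padded amount, then decodes the result again. The function names are hypothetical, and the selector is taken as a constant from the example rather than computed.

```python
TRANSFER_SELECTOR = "a9059cbb"  # MethodID of transfer(address,uint256), from the example


def encode_transfer(recipient: str, amount: int) -> str:
    """Build the Input Data for an ERC20 transfer as a 0x-prefixed hex string."""
    addr = recipient[2:] if recipient.startswith("0x") else recipient
    if len(addr) != 40:
        raise ValueError("address must be 20 bytes (40 hex characters)")
    if not 0 <= amount < 2 ** 256:
        raise ValueError("amount must fit in uint256")
    # Both parameters are left-padded with zeros to 64 hex characters.
    return "0x" + TRANSFER_SELECTOR + addr.lower().rjust(64, "0") + format(amount, "064x")


def decode_transfer(calldata: str):
    """Split transfer Input Data back into (recipient, amount)."""
    body = calldata[2:] if calldata.startswith("0x") else calldata
    if body[:8] != TRANSFER_SELECTOR or len(body) != 8 + 64 + 64:
        raise ValueError("not a transfer(address,uint256) payload")
    recipient = "0x" + body[32:72]   # last 20 bytes of the first 32-byte word
    amount = int(body[72:136], 16)   # second 32-byte word
    return recipient, amount


data = encode_transfer("0xd0292fc87a77ced208207ec92c3c6549565d84dd", 10 ** 18)
print(data)
print(decode_transfer(data))
```

Whether 10^18 base units corresponds to one whole token depends on the token's decimals setting; 18 decimals is common but not guaranteed.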

Why Left-Padding?

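The ABI encoding used for contract calls pads every value to a whole number of 32-byte words (64 hex characters each):

Numeric and address types (e.g., uint, int, bool, address): Left-padded to 32 bytes, so the value sits in the low-order bytes of the word.
Byte-sequence types (bytes32 and other fixed bytesN, plus the contents of dynamic string and bytes values): Right-padded with zeros to a multiple of 32 bytes. Array elements are padded according to their own element type.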
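A minimal sketch of the two padding directions, using only the Python standard library. The helper names are illustrative; the padding behavior follows the ABI convention that numbers and addresses are left-padded while byte sequences are right-padded.

```python
WORD = 32  # ABI word size in bytes


def encode_uint(value: int) -> bytes:
    """Unsigned integers are left-padded: big-endian, value in the low-order bytes."""
    return value.to_bytes(WORD, "big")


def encode_address(addr: str) -> bytes:
    """A 20-byte address is left-padded with 12 zero bytes."""
    raw = bytes.fromhex(addr[2:] if addr.startswith("0x") else addr)
    return raw.rjust(WORD, b"\x00")


def encode_bytes32(raw: bytes) -> bytes:
    """Fixed-size byte values are right-padded with zeros."""
    if len(raw) > WORD:
        raise ValueError("value does not fit in bytes32")
    return raw.ljust(WORD, b"\x00")


def pad_right(raw: bytes) -> bytes:
    """Dynamic byte content is right-padded to the next multiple of 32 bytes."""
    return raw + bytes(-len(raw) % WORD)


print(encode_uint(7).hex())          # zeros first, value at the end
print(encode_bytes32(b"Bob").hex())  # value first, zeros at the end
print(pad_right(b"Alice").hex())
```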

Advanced Transaction Data Construction

For contract methods that take dynamic types (bytes, string, dynamic arrays), the encoding uses placeholders: 32-byte offset words that record where each dynamic value's data begins.

Example Function:

function analysisHex(
    bytes memory name,          // Dynamic
    bool b,                     // Static
    uint[] memory data,         // Dynamic array
    address addr,               // Static
    bytes32[] memory testData   // Dynamic array
) public pure {}

Transaction Data Breakdown

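Function Signature: 0x4b6112f8 (the first four bytes of the hash of the function name and parameter types).
Placeholders: One 32-byte offset per dynamic parameter, counted in bytes from the start of the parameter block (just after the function signature). They indicate where each dynamic value starts (here 0xa0, 0xe0, and 0x180).
Static Values: Directly embedded in their parameter position (bool, address).
Dynamic Values: Appended after the five fixed-position words, each as a length word followed by its data:
- bytes: length and value ("Alice").
- uint[]: array length and values ([9,8,7,6]).
- bytes32[]: array length and values (["张三","Bob","老王"]).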
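The placeholder values can be checked with a short calculation. The sketch below encodes only the parameter block (the 4-byte function signature is left out) and reports the three offsets. The source does not give values for b and addr, so True and the address from the ERC20 example are assumed here; the helper names are also this example's own.

```python
WORD = 32


def word(n: int) -> bytes:
    """One 32-byte big-endian word."""
    return n.to_bytes(WORD, "big")


def enc_bytes(raw: bytes) -> bytes:
    """Length word followed by the content, right-padded to a 32-byte multiple."""
    return word(len(raw)) + raw + bytes(-len(raw) % WORD)


def enc_uint_array(values) -> bytes:
    """Length word followed by one left-padded word per element."""
    return word(len(values)) + b"".join(word(v) for v in values)


def enc_bytes32_array(items) -> bytes:
    """Length word followed by one right-padded word per element."""
    return word(len(items)) + b"".join(i.ljust(WORD, b"\x00") for i in items)


def encode_args(name, flag, data, addr, test_data):
    """Encode (bytes, bool, uint[], address, bytes32[]) and return (payload, offsets)."""
    tails = [enc_bytes(name), enc_uint_array(data), enc_bytes32_array(test_data)]
    position = 5 * WORD  # the head holds one word per parameter
    offsets = []
    for tail in tails:
        offsets.append(position)
        position += len(tail)
    head = (word(offsets[0]) + word(int(flag)) + word(offsets[1])
            + word(int(addr, 16)) + word(offsets[2]))
    return head + b"".join(tails), offsets


payload, offsets = encode_args(
    b"Alice",
    True,                                          # assumed value
    [9, 8, 7, 6],
    "0xd0292fc87a77ced208207ec92c3c6549565d84dd",  # assumed value
    [s.encode("utf-8") for s in ["张三", "Bob", "老王"]],
)
print([hex(o) for o in offsets])  # ['0xa0', '0xe0', '0x180']
for i in range(0, len(payload), WORD):
    print(payload[i:i + WORD].hex())
```

Each offset is the previous offset plus the size of the previous dynamic value: 0xa0 is the 160-byte head, the bytes value takes 64 bytes (length plus one padded word), and the uint[] value takes 160 bytes (length plus four elements).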


Static vs. Dynamic Arrays

Static Array Example

function analysisHex(
    bytes32 name,               // Static
    bool b,                     // Static
    uint[4] memory data,        // Static array
    address addr,               // Static
    bytes32[3] memory testData  // Static array
) public pure {}

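No placeholders needed: Values are written directly in declaration order, and each fixed-size array occupies one 32-byte word per element with no length prefix.
Construction is simpler, but less flexible because the array lengths are fixed in the function signature.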
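For comparison, the all-static variant can be encoded as a flat sequence of words. This sketch again covers only the parameter block; the values for b and addr are assumed, since the source does not specify them, and the helper names are illustrative.

```python
WORD = 32


def word(n: int) -> bytes:
    """One 32-byte big-endian word (left-padded number)."""
    return n.to_bytes(WORD, "big")


def bytes32(raw: bytes) -> bytes:
    """One 32-byte word holding right-padded bytes."""
    if len(raw) > WORD:
        raise ValueError("value does not fit in bytes32")
    return raw.ljust(WORD, b"\x00")


def encode_static_args(name, flag, data, addr, test_data) -> bytes:
    """Encode (bytes32, bool, uint[4], address, bytes32[3]) with no offsets or lengths."""
    if len(data) != 4 or len(test_data) != 3:
        raise ValueError("fixed-size arrays must have exactly 4 and 3 elements")
    words = [bytes32(name), word(int(flag))]
    words += [word(v) for v in data]           # uint[4]: four consecutive words
    words.append(word(int(addr, 16)))
    words += [bytes32(t) for t in test_data]   # bytes32[3]: three consecutive words
    return b"".join(words)


payload = encode_static_args(
    b"Alice",
    True,                                          # assumed value
    [9, 8, 7, 6],
    "0xd0292fc87a77ced208207ec92c3c6549565d84dd",  # assumed value
    [s.encode("utf-8") for s in ["张三", "Bob", "老王"]],
)
print(len(payload) // WORD)  # 10 words: 1 + 1 + 4 + 1 + 3
```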

Building Transactions with Web3j

Web3j simplifies transaction data construction by:

Generating function signatures.
Calculating dynamic array offsets.
Automatically padding values.

Key methods:

encode(): Constructs the full payload (function signature hash + encoded parameters).
encodeParameters(): Handles static/dynamic type encoding.


FAQs

1. Why is hexadecimal used for Ethereum transactions?

The EVM operates on raw bytes. Hexadecimal is simply the conventional human-readable notation for those bytes, with two hex characters per byte.

2. How are dynamic arrays stored in transaction data?

Each dynamic array gets a placeholder (a 32-byte offset) in its parameter position. The offset points to the array's data, which consists of a length word followed by the element values.

3. What’s the difference between `bytes` and `bytes32`?

bytes: Dynamic length (requires length prefix).
bytes32: Fixed 32-byte length (treated as static).

4. Can I manually construct transaction data?

Yes, but tools like Web3j reduce errors and handle complex encoding rules automatically.

5. How does Ethereum verify transaction data?

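Nodes first check the transaction's signature and basic validity, then execute the encoded data in the EVM. Block producers (miners under proof of work, validators since the move to proof of stake) include the transaction in a block, and every other node re-executes it to confirm the result.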

6. Are there gas costs for dynamic arrays?

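Yes. Larger arrays produce more transaction data, and every byte of that data is charged gas (zero bytes cost less than non-zero bytes). Any computation or storage the contract performs on the array adds further cost.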
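A rough sketch of the data portion of that cost, assuming the per-byte prices set by EIP-2028 (4 gas per zero byte, 16 gas per non-zero byte). It ignores the base transaction cost and all execution costs, and later network upgrades may adjust how data is priced, so treat the numbers as illustrative.

```python
ZERO_BYTE_GAS = 4       # per zero byte of transaction data (EIP-2028)
NONZERO_BYTE_GAS = 16   # per non-zero byte of transaction data (EIP-2028)


def calldata_gas(calldata_hex: str) -> int:
    """Gas charged for the transaction data alone, excluding base and execution costs."""
    raw = calldata_hex[2:] if calldata_hex.startswith("0x") else calldata_hex
    data = bytes.fromhex(raw)
    zeros = data.count(0)
    return zeros * ZERO_BYTE_GAS + (len(data) - zeros) * NONZERO_BYTE_GAS


# The ERC20 transfer payload from the earlier example (68 bytes).
transfer = ("0xa9059cbb"
            + "0" * 24 + "d0292fc87a77ced208207ec92c3c6549565d84dd"
            + "0" * 49 + "de0b6b3a7640000")
print(calldata_gas(transfer))            # 632

# One extra uint element with the value 9: 31 zero bytes and 1 non-zero byte.
print(calldata_gas(format(9, "064x")))   # 140
```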

Conclusion

Constructing Ethereum transaction data requires understanding the encoding rules: 32-byte words, the padding direction for each type, the split between static and dynamic types, and the offsets (placeholders) that locate dynamic values. Tools like Web3j streamline this process, especially for dynamic arrays. Mastery of these principles is essential for developers interacting with smart contracts.
