The business of sending transactions on Ethereum

Published in Coinmonks, Dec 1, 2018

This post is a guide to the techniques, patterns and mechanisms used to send transactions across the Ethereum ecosystem. It is intended as an evolving resource, to be updated as more techniques emerge, and should be considered a work in progress.

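Included in this admittedly broad topic are:

Gas Usage and GasToken
Metatransactions
Submarine Sends
Counterfactual Instantiation
Zero Confirmation Transactions
Batched Transfers
SMS-based Payments
Subscription Payments
Oracle Enabled Calls
Multisends with one-time addresses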

Introduction

Ethereum is an account-based system. There are two types of account: externally owned accounts and contract accounts. Both have an associated address, nonce and balance. A contract account additionally has immutable code and its own storage. This is a good resource for a detailed explanation of the basics.

A transaction has the following fields:

nonce: the number of transactions already sent from the account, starting at 0
gasPrice: the price paid per unit of gas, which determines how much ether the transaction will cost
gasLimit: the maximum amount of gas that may be spent processing the transaction
to: the account the transaction is sent to; if empty, the transaction creates a contract
value: the amount of ether to send
data: an arbitrary message, a function call to a contract, or the code to create a contract

Crucially, there is no from field. The sender is instead recovered from the signature, which the sender's private key produces over the hash of the RLP-encoded transaction.
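The field list above can be sketched as a small data structure. This is only an illustration: the class name, the snake_case field names and the helper methods are assumptions made for this example and do not come from any Ethereum library.

```python
from dataclasses import dataclass
from typing import Optional

WEI_PER_ETHER = 10**18
GWEI = 10**9


@dataclass(frozen=True)
class Transaction:
    """Illustrative container for the six fields; note there is no 'from'."""
    nonce: int            # transactions already sent by the signing account
    gas_price: int        # wei paid per unit of gas
    gas_limit: int        # maximum gas the transaction may consume
    to: Optional[bytes]   # 20-byte recipient, or None to create a contract
    value: int            # wei to transfer
    data: bytes = b""     # message, call data, or contract creation code

    def is_contract_creation(self) -> bool:
        return self.to is None

    def max_fee_wei(self) -> int:
        # Upper bound on the fee: gasPrice * gasLimit.
        return self.gas_price * self.gas_limit


# A plain ether transfer: empty data, 21,000 gas.
transfer = Transaction(nonce=0, gas_price=5 * GWEI, gas_limit=21_000,
                       to=bytes(20), value=WEI_PER_ETHER)
print(transfer.max_fee_wei())
```

The sender is deliberately absent from the structure, since it is only known once the signature has been checked.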

Gas Usage and GasToken

From a very removed perspective, a blockchain can be described as a shared database. Each read and/or write costs gas in order to prevent spam attacks. More specifically, each computation step costs gas, so that no transaction can run forever (the halting problem). The cost of each opcode is outlined in the yellow paper and is the subject of ongoing debate, as the community examines the possibility of introducing storage rent and even dynamic gas/opcode pricing.

Writing state can be very expensive: creating a new non-zero storage slot costs 20,000 gas, almost the same as the base transaction cost of 21,000 gas, which covers a simple Ether transfer (with the data field above left blank). As an incentive to mitigate blockchain bloat, the Ethereum protocol refunds 15,000 gas (at the time of writing) for clearing a storage slot that is no longer required.

This refund can pay for up to half of the gas used by a contract transaction (simple transfers are not eligible for a refund, since they already use the minimum amount of gas; batched sends to contracts, however, can benefit from this refund mechanism). GasToken allows developers to take advantage of this in a simple and optimised way by tokenizing gas: storing gas when it is cheap and spending it when it is expensive.
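The half-of-gas-used cap can be made concrete with a little arithmetic. The function below is a sketch, not protocol code: its name is made up for this example, and the per-slot refund is passed in as a parameter. The usage lines assume 15,000 gas per cleared slot, the Yellow Paper's R_sclear value at the time of writing.

```python
def gas_charged(gas_used: int, slots_cleared: int, refund_per_slot: int) -> int:
    """Gas paid after the storage refund, which is capped at half of gas used."""
    refund = min(slots_cleared * refund_per_slot, gas_used // 2)
    return gas_used - refund


REFUND_PER_SLOT = 15_000  # assumption: R_sclear at the time of writing

# Two cleared slots: the full 30,000 refund fits under the 50,000 cap.
print(gas_charged(100_000, 2, REFUND_PER_SLOT))
# Ten cleared slots: the 150,000 refund is capped at half of the gas used.
print(gas_charged(100_000, 10, REFUND_PER_SLOT))
```

The cap is why freeing storage pays off most in transactions that do plenty of other work, such as batched sends.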

Indeed, a vulnerability was recently discovered at a few exchanges which did not set gasLimit appropriately on withdrawals. The attack was simple: request a withdrawal from an exchange, directing the transfer to a contract you (the attacker) have deployed, which has a fallback function that mints new GasToken at the exchange's expense.

Metatransactions

Metatransactions are a sending pattern which allows the sender to sign a valid Ethereum transaction and forward it off-chain to a relayer that is willing to pay the associated gas costs and propagates the signed transaction on the network.

Ethereum transaction wizards love abstraction

This is useful since the sender no longer needs to keep Ether associated with a key pair, which has many benefits in terms of user experience. I have previously written about metatransactions and their effect on UX here.

The transaction's destination is typically a smart contract that in some sense understands that the account submitting the incoming transaction is not its true originator. msg.sender will return the relayer's address, which likely will not have appropriate permissions to act on the signer's behalf, and thus is not overly useful in this scenario. Instead, many metatransactions rely upon on-chain signature validation (using ecrecover) and on checking that the signer is on an appropriate whitelist for whatever action the transaction intends to execute.
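The authorisation logic described above can be modelled off-chain in a few lines. This is a toy sketch: MetaTxTarget and fake_recover are hypothetical names, and fake_recover merely stands in for ecrecover (real signature recovery needs secp256k1, which the Python standard library does not provide).

```python
from typing import Callable, Set

# (message, signature) -> signer address; a stand-in for ecrecover
Recover = Callable[[bytes, bytes], str]


class MetaTxTarget:
    """Toy model of a contract that authorises by recovered signer."""

    def __init__(self, whitelist: Set[str], recover: Recover) -> None:
        self.whitelist = whitelist
        self.recover = recover

    def execute(self, msg_sender: str, message: bytes, signature: bytes) -> str:
        # msg_sender is the relayer that paid for gas; it is deliberately
        # ignored when deciding whether the action is allowed.
        signer = self.recover(message, signature)
        if signer not in self.whitelist:
            raise PermissionError(f"{signer} is not whitelisted")
        return signer


def fake_recover(message: bytes, signature: bytes) -> str:
    # Assumption for the demo only: the "signature" is just the signer's name.
    return signature.decode()


target = MetaTxTarget(whitelist={"alice"}, recover=fake_recover)
authorised = target.execute("relayer", b"transfer 1 token", b"alice")
print(authorised)
```

A real contract would also need replay protection, for example a per-signer nonce included in the signed message.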

Submarine Sends

(not to be confused with Submarine Swaps!)

Miner frontrunning is a fundamental problem in blockchain-based markets in which miners reorder, censor, and/or insert their own transactions to directly profit from markets running on blockchain economic mechanisms. Submarine sends tackle the problem of miner frontrunning.

Submarine sends aim at a strong confidentiality property. Rather than just concealing transaction amounts, submarine sends conceal the very existence of a transaction. Of course, a permanently concealed transaction isn’t very useful. Submarine sends thus also permit a transaction to be surfaced by the sender at any desired time in the future, hence the term “submarine”.

Using submarine sends, the user’s transaction cannot be front-run by miners.

The commit transaction contains a cryptographic commitment to whatever application-specific data the user wishes to submit to the smart contract, and also locks up any associated ether or tokens in the Submarine Address, which is indistinguishable from a fresh address. Any value locked up in this address can only be unlocked by the smart contract. By attaching monetary value to the commit transaction (value that is burned unless the user reveals), we can create strong incentives that prevent a malicious user from selectively revealing commits. Once the commit transaction is safely included in the blockchain, the user reveals her commit to the smart contract, and the smart contract executes its application-specific logic.
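The commit-then-reveal step can be sketched with a plain hash commitment. This illustrates only the commitment, not the Submarine Address construction. The function names are made up for this example, and SHA-256 stands in for the keccak256 that an Ethereum contract would normally use.

```python
import hashlib
import hmac
import secrets


def commit(data: bytes, blinding: bytes) -> bytes:
    """Hash commitment: hides data until (data, blinding) is revealed."""
    if len(blinding) != 32:
        raise ValueError("blinding factor must be 32 bytes")
    return hashlib.sha256(blinding + data).digest()


def verify_reveal(commitment: bytes, data: bytes, blinding: bytes) -> bool:
    """Check that the revealed values match the earlier commitment."""
    return hmac.compare_digest(commit(data, blinding), commitment)


# Commit phase: only the digest is published.
blinding = secrets.token_bytes(32)
bid = b"buy 10 tokens at 0.5 ether"
published = commit(bid, blinding)

# Reveal phase: the user discloses the data and the blinding factor.
print(verify_reveal(published, bid, blinding))
```

The random blinding factor matters: without it, anyone could guess likely values of the data and test them against the published digest.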

Counterfactual Instantiation

The term counterfactual stems from a concept in philosophy and logic. A counterfactual statement is generally a valid chain of reasoning with an intentionally untrue premise and a conclusion. Despite the premise being false, the overall claim is true, because if the premise were true then the conclusion would be too. In relation to blockchain transactions, this logic takes into account how the chain might look, not just how it currently looks, for deployed contracts.

More concretely, the pattern of obtaining the address of a contract before actually deploying it is known as counterfactual instantiation, and was made popular by L4 in their Counterfactual State Channels paper.

Currently, new contract addresses are generated deterministically with the CREATE opcode according to the address of its creator (sender) and how many transactions the creator has sent (nonce). The sender and nonce are RLP encoded and then hashed with Keccak256 to give the new address.

Skinny CREATE2 improves the situation by allowing interactions with addresses that do not yet exist on-chain, but that can be relied upon to only ever contain code created by a particular piece of init code. Rather than the usual sender-and-nonce hash, CREATE2 uses keccak256(0xff ++ address ++ salt ++ keccak256(init_code))[12:] as the address at which the contract is initialized.
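Both derivations can be sketched as follows. The Python standard library has no keccak256 (hashlib.sha3_256 is the NIST variant and produces different digests), so the hash is passed in as a parameter. The demo uses sha3_256 purely as a stand-in, which means the printed address will not match what a real chain computes. All function names are made up for this example, and the RLP helpers only handle the short payloads needed here.

```python
import hashlib
from typing import Callable, List, Union

Hash = Callable[[bytes], bytes]
Item = Union[bytes, int]


def rlp_item(item: Item) -> bytes:
    """RLP-encode a byte string or non-negative integer (up to 55 bytes)."""
    if isinstance(item, int):
        item = item.to_bytes((item.bit_length() + 7) // 8, "big")  # 0 -> b""
    if len(item) == 1 and item[0] < 0x80:
        return item
    if len(item) > 55:
        raise ValueError("long strings are out of scope for this sketch")
    return bytes([0x80 + len(item)]) + item


def rlp_list(items: List[Item]) -> bytes:
    payload = b"".join(rlp_item(i) for i in items)
    if len(payload) > 55:
        raise ValueError("long lists are out of scope for this sketch")
    return bytes([0xC0 + len(payload)]) + payload


def create_address(sender: bytes, nonce: int, keccak256: Hash) -> bytes:
    # CREATE: last 20 bytes of keccak256(rlp([sender, nonce]))
    return keccak256(rlp_list([sender, nonce]))[12:]


def create2_address(sender: bytes, salt: bytes, init_code: bytes,
                    keccak256: Hash) -> bytes:
    # CREATE2: keccak256(0xff ++ address ++ salt ++ keccak256(init_code))[12:]
    if len(sender) != 20 or len(salt) != 32:
        raise ValueError("sender must be 20 bytes and salt 32 bytes")
    return keccak256(b"\xff" + sender + salt + keccak256(init_code))[12:]


def stand_in_hash(data: bytes) -> bytes:
    # NOT keccak256: swap in a real implementation for on-chain results.
    return hashlib.sha3_256(data).digest()


deployer = bytes(range(1, 21))  # hypothetical 20-byte address
print(create2_address(deployer, bytes(32), b"\x60\x00", stand_in_hash).hex())
```

The CREATE2 address depends only on the deployer, the salt and the init code, so it can be computed and shared long before any deployment transaction exists.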

This pattern is particularly important for state-channel use cases that involve counterfactual interactions with contracts, allowing the Ethereum root chain to be used as a dispute layer, potentially without incurring the cost of deploying the contract. Similarly, it can be used in situations where a fresh address is generated with known functionality, such as a loan repayment address.

Zero Confirmation Transactions

Zero-conf transactions are an interesting yet currently unproven area of research stemming from the Bitcoin Cash community, since the block times are even more UX-inhibiting on that network. Zero-conf transaction senders post a bond, such that a doublespend results in the loss of the bond. With Bitcoin Cash, doublespends are detected by reuse of UTXO inputs. Anyone (presumably miners) can submit the two transactions and collect the bond.

In the account-based Ethereum network, instead of UTXOs, we look for reuse of the same nonce by the same sender. A contract is deployed with a reportDoubleSpend function that accepts two competing signed transactions, checks their senders and nonces and, if they match, awards the deposit/bond to the reporter (the function caller). The theory is that, if the bond is sufficient in size, it deters the transaction sender from cheating (doublespending), as they risk losing their deposit. It is thought that this type of transaction is most suited to one-off payments of low value, since an array of attack vectors is possible with this pattern.
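The check performed by reportDoubleSpend can be modelled off-chain like this. It is a sketch under simplifying assumptions: the class and method names are invented for this example, and each transaction's sender is assumed to have already been recovered from its signature.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class SignedTx:
    sender: str  # assumed to be already recovered from the signature
    nonce: int
    to: str
    value: int


class BondRegistry:
    """Toy model of the bond contract described above."""

    def __init__(self) -> None:
        self.bonds: Dict[str, int] = {}
        self.payouts: Dict[str, int] = {}

    def post_bond(self, sender: str, amount: int) -> None:
        self.bonds[sender] = self.bonds.get(sender, 0) + amount

    def report_double_spend(self, reporter: str, a: SignedTx, b: SignedTx) -> int:
        if a.sender != b.sender or a.nonce != b.nonce:
            raise ValueError("transactions do not share sender and nonce")
        if a == b:
            raise ValueError("the same transaction was submitted twice")
        # Same sender and nonce but different contents: a doublespend.
        bond = self.bonds.pop(a.sender, 0)
        self.payouts[reporter] = self.payouts.get(reporter, 0) + bond
        return bond


registry = BondRegistry()
registry.post_bond("alice", 500)
paid = SignedTx("alice", nonce=7, to="shop", value=10)
conflicting = SignedTx("alice", nonce=7, to="alice2", value=10)
print(registry.report_double_spend("miner", paid, conflicting))
```

Rejecting two identical transactions matters, since otherwise anyone could claim a bond by submitting the same honest payment twice.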

Batched Transfers

One of the main problems with ERC-20 token interactions is that they require two separate transactions: one for approve, and one for a doSomethingForTokens call to an arbitrary contract (which internally calls transferFrom). This introduces all the problems of non-atomic transactions. The simplest is that even if your doSomethingForTokens transaction fails, your approve does not, and your allowance (set with approve) still stands.

ERC-20 approve() and transferFrom() non-atomicity

This particular flavour of batched transfer is implemented by Limechain. Using the principle of on-chain signature validation borrowed from metatransactions, a failed doSomethingForTokens() now also reverts the approve() call, removing the non-atomicity of the ERC-20 approve() and transferFrom() pattern.
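The all-or-nothing behaviour can be illustrated with a toy token model. This is not Limechain's implementation: the Token class and approve_and_call helper are hypothetical, and a state snapshot stands in for the way the EVM reverts everything done inside a failed transaction.

```python
from typing import Callable, Dict, Tuple


class Token:
    """Minimal in-memory model of ERC-20 balances and allowances."""

    def __init__(self, balances: Dict[str, int]) -> None:
        self.balances = dict(balances)
        self.allowances: Dict[Tuple[str, str], int] = {}

    def approve(self, owner: str, spender: str, amount: int) -> None:
        self.allowances[(owner, spender)] = amount

    def transfer_from(self, spender: str, owner: str, to: str, amount: int) -> None:
        if self.allowances.get((owner, spender), 0) < amount:
            raise ValueError("allowance too low")
        if self.balances.get(owner, 0) < amount:
            raise ValueError("balance too low")
        self.allowances[(owner, spender)] -= amount
        self.balances[owner] -= amount
        self.balances[to] = self.balances.get(to, 0) + amount


def approve_and_call(token: Token, owner: str, spender: str, amount: int,
                     action: Callable[[Token], None]) -> None:
    """Run approve and the follow-up action as one unit, undoing both on failure."""
    snapshot = (dict(token.balances), dict(token.allowances))
    try:
        token.approve(owner, spender, amount)
        action(token)
    except Exception:
        token.balances, token.allowances = snapshot  # models an EVM revert
        raise


def failing_action(token: Token) -> None:
    raise RuntimeError("doSomethingForTokens failed")


token = Token({"alice": 100})
try:
    approve_and_call(token, "alice", "dex", 50, failing_action)
except RuntimeError:
    pass
print(token.allowances)  # no allowance is left behind
```

With two separate transactions, the same failure would have left alice's allowance of 50 in place.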

SMS-based Payments

CoinText is arguably the best known SMS cryptocurrency payments provider focusing on Bitcoin Cash transactions. These payment mechanisms are particularly useful for resource constrained devices in emerging economies. Eth2 have built and deployed similar technology on the Ethereum platform, which operates in conjunction with a traditional app-based Ethereum wallet such as Trust wallet.

eth2.io SMS-based cryptocurrency payments

This particular construction uses an escrow smart contract. The sender generates a transit private-public key pair, deposits ether to the escrow smart contract and associates the transit public key with the deposit. The private key is encrypted with a randomly generated symmetric key, and this ciphertext is sent (via email, SMS, WhatsApp) to the centralised verification server. On withdrawal, after verification of the receiver's phone number, the verification server sends the receiver the ciphertext; the receiver decrypts it and signs a withdraw message, which the escrow smart contract verifies was signed by the transit private key.

A centralized server is used for phone verification and keystore transfer, though the Eth2 server does not have control over Ether locked in the escrow smart contract. If the server is compromised, the transaction will simply fail and the Ether will remain in escrow. To get the Ether back, the sender can cancel the transfer with a call to the escrow smart contract.

Subscription Payments

Opt-out subscription-based payments dominate the way services are compensated in the Web 2.0 world: Spotify, Netflix, Headspace, Tinder and others all base their income models on subscription payments.

The concept of cryptocurrency subscription payments is not new. In Bitcoin, nLockTime can be used to ensure a pre-signed transaction cannot be included in a block until a particular block height. Pre-signing Ethereum transactions for future payments is of limited effectiveness: if any other transaction is sent from the account in the meantime, the account nonce increments past the nonce of the pre-signed transaction, rendering it invalid.
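The nonce problem can be stated in a few lines. The helper below is purely illustrative (its name and return values are made up). It classifies a pre-signed transaction against the account's current nonce, meaning the nonce the account will use for its next transaction.

```python
def presigned_status(tx_nonce: int, account_nonce: int) -> str:
    """Classify a pre-signed transaction against the account's next nonce."""
    if tx_nonce < account_nonce:
        return "invalid"      # that nonce has already been used
    if tx_nonce == account_nonce:
        return "executable"   # next in line for this account
    return "queued"           # must wait for the gap to be filled


# A payment pre-signed with nonce 5 while the account's next nonce is 5 ...
print(presigned_status(5, 5))
# ... becomes invalid as soon as the account sends any other transaction.
print(presigned_status(5, 6))
```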

However, the Turing-completeness of Ethereum comes to the rescue: there are several architectures for repeated subscription-type transactions. These architectures have various tradeoffs in terms of staking (liquidity), UX complexity, optionality, gas use (cost) and extensibility.

Oracle Enabled Calls

Another more unusual method to send a transaction is to utilise an oracle service such as Oraclize to reduce gas consumption at the cost of relative centralisation, as outlined in this post.

Using Oraclize to reduce gas consumption for a constant function call

This pattern works when a non-transactional (constant) function call is required. Such a function can be called from a node synced to the Ethereum network using the eth_call JSON-RPC method. The query is made through Oraclize’s oraclize_query(), which is available after inheriting usingOraclize into your contract. Additionally, a __callback(bytes32 queryId, string results) method must be defined in this contract, since the returned query calls this method with the results. Retrieving and computing these constant state results through direct on-chain calls may be more expensive than a call to Oraclize.
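For reference, the body of an eth_call JSON-RPC request can be built as below. The sketch only constructs the request and does not contact a node. The contract address is a placeholder, and 0x18160ddd is the selector for the ERC-20 totalSupply() function, used here as an example of a constant call.

```python
import json


def eth_call_request(to: str, data: str, block: str = "latest",
                     request_id: int = 1) -> str:
    """Build the JSON body of an eth_call request (no state is changed)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "method": "eth_call",
        "params": [{"to": to, "data": data}, block],
        "id": request_id,
    })


CONTRACT = "0x" + "11" * 20        # hypothetical contract address
TOTAL_SUPPLY_SELECTOR = "0x18160ddd"  # totalSupply()

body = eth_call_request(CONTRACT, TOTAL_SUPPLY_SELECTOR)
print(body)
```

The body would be sent as an HTTP POST to a node's JSON-RPC endpoint. Because eth_call executes locally on that node, it costs no gas.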

Multisends with one-time addresses

As mentioned in the introduction above, there is no from field present on transactions. The sender is instead calculated using ecrecover. This poses the question “What if we just filled in whatever we liked for the transaction signature?” It turns out that half of all signatures are valid, in the sense that ecrecover returns a public key (and thus an address). Since we have no control over what address that is, what we’ve just done is to build a transaction that can spend funds from a seemingly random address.

If we create such a transaction, then fund the generated address with some Ether, the transaction will be able to execute just like a regular one. We’ve effectively created a “single use address”, where funds can only be spent by one transaction. If we publish the transaction, and choose values for the signature field in some predictable fashion, we can prove to anyone that funds sent to that address will be usable only by that transaction, and nothing else.

In this scheme, which aims to send ether to 11,440 destinations, a series of transactions is generated, each sending ether to 110 addresses, and a one-time address is generated for each transaction using the process described above. For the signature fields, we fill in ‘0xDA0DA0DA0…’, a predictable value, so that you can be certain nobody actually has a private key for the generated address.

This produces a series of transactions with ‘one time addresses’ that can be used to fund them. 104 transactions is still a few too many for the trustees to easily sign, so we repeat the process one more time forming a cascade pattern, building another transaction that sends ether to the 104 transactions we generated in the first step, producing a single transaction with its own unique address. After verification that the code performs as expected, anyone can submit the transactions to the network, starting the whole process that will result in sending ether to everyone on the list — all with only a single signature required.
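The shape of the cascade can be checked with a short batching sketch. It assumes a list of 11,440 destinations, which is the count that yields exactly 104 transactions of 110 recipients each. The function names are made up, and the code only models the grouping; it does not build or sign any transactions.

```python
from typing import List, Sequence


def chunk(items: Sequence, size: int) -> List[list]:
    """Split items into consecutive batches of at most `size` elements."""
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def cascade(destinations: Sequence, fan_out: int) -> List[List[list]]:
    """Group destinations into levels until a single root transaction remains.

    Level 0 pays the final recipients. Each later level pays the one-time
    addresses of the transactions in the level before it.
    """
    levels = [chunk(destinations, fan_out)]
    while len(levels[-1]) > 1:
        previous_txs = list(range(len(levels[-1])))  # indices of txs to fund
        levels.append(chunk(previous_txs, fan_out))
    return levels


recipients = [f"recipient-{i}" for i in range(11_440)]
levels = cascade(recipients, fan_out=110)
print([len(level) for level in levels])  # transactions per level
```

Only the single root transaction needs a real signature; every transaction beneath it spends from a one-time address.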