RSK Research News > Storage rent
This is the first of a series of articles I will be posting during 2017. I want developers and users to understand the benefits and the uniqueness of the RSK platform design. The topics I plan to cover in this and subsequent posts relate to improvements to full node technologies: consensus, security, privacy, scalability, auditability, and more. Some of the improvements will be present in the first public release of RSK (code-name Ginger) and others will be introduced in following releases. However, as we’re still working on the details, the final functionality may differ slightly from the current platform state I’m presenting today.
Every blockchain must make non-trivial trade-offs to achieve the designer’s goals. The information created during the design process, the dead-ends, the mistakes, is never present in cryptocurrency whitepapers. Personally, I’m skeptical when I read about a blockchain-based platform that tries to do everything, from connecting all other blockchains to serving every IoT device on earth. The search for the perfect blockchain is futile. Mimblewimble will probably become highly scalable and private but cannot support smart contracts, Zcash is anonymous but not very scalable, and DAG-based cryptocurrencies could be highly decentralized but cannot support contract state. New cryptocurrency platforms have the benefit of being able to innovate faster, and RSK tries to maximize this opportunity. However, we prioritize our innovation efforts toward clear goals. At RSK we aim for financial inclusion. Our main value contribution is an open, cheap and scalable system that can be compliant with regulations. Anonymity and absolute decentralization are second in our priority list. Every technical decision in RSK serves RSK’s purpose and priorities.
In this article we’ll discuss the need for storage rent, and how RSK satisfies this need. In a nutshell, storage rent is a fee users pay in order to have their accounts, contracts and memory live on the network at any time, so their data can be accessed fast and at a low cost. As we’ll see, storage rent is required to assure the long-term viability of the platform, but the implementation of this feature is very tricky. In many cases the amount of memory a user persists in a contract is so small that the rent becomes a micro-transaction, and the cost of processing the rent payment through a micro-transaction is higher than the amount paid itself. A seemingly reasonable implementation of storage rent may fail to consider this tricky cost imbalance. One also has to consider the CPU and space costs of storage accounting and the cost of managing misbehaving contracts. These tasks can easily become bottlenecks to scalability. At RSK we considered several designs, with their pros and cons, until we settled on the current approach. We must note that the Ethereum community also discussed adding storage rent to their platform, but the proposal was dismissed.
New kinds of Full Nodes
Originally, a cryptocurrency full node was a networked computer that maintained the full state of the blockchain (or more precisely, the best-chain) and had a CPU that was capable of verifying the correctness of new blocks. In contrast, an SPV node didn’t need to store the full blockchain nor to verify the complete correctness of blocks. With the advent of sharding protocols and pruning algorithms we now need to differentiate between different kinds of full nodes. Most partial nodes only store part of the blockchain history but still fully verify new blocks by storing an additional cache, the “blockchain state” or more precisely the “best block state”. Generally the last N block states are also stored in this cache to allow efficient reorganizations of the blockchain. We’ll call a “selfish full node” any node that is capable of verifying the correctness of new blocks and maintaining the best-chain up to a certain maximum depth, but may or may not store historic blocks. While the original definition of full node related to a service provided to the network (bootstrapping new nodes), the selfish definition of full node refers to the capability of a node to be in sync with the rest of the network.
As the best block state grows, the cost of maintaining a full node comes more from the cost of fast access to the state than from accessing historic blockchain data. Access to the best block state must be fast to increase the throughput and security of the network. If verifying a new block takes too much time, then miners must either mine empty blocks or empty uncles. However, there is no need for fast access to historic blockchain data, as it can be downloaded slowly in the background even while the full node verifies the latest blocks, if trusted checkpoints are used. Therefore the most valuable resource any cryptocurrency tries to protect from bloat is the best block state, and not the blockchain.
The best block state in RSK grows when new accounts or contracts are created, and when contracts request additional contract storage, so those operations must be examined.
Who Should Pay for Blockchain State Storage
One of the unsolved problems of Ethereum is that state storage can be acquired at a low cost, or even zero cost (if there is no transaction backlog), and never released, forcing all full nodes to store that state information forever. There are almost no examples in real-world commerce where users acquire, through a single non-recurring payment, eternal rights over a property that requires continued maintenance performed by third parties. But that is the case for blockchain state storage in Ethereum and, to a lesser extent, UTXO storage in Bitcoin. Maintaining the state space requires paying for electricity and the amortization cost of storage hardware, and the cost must be multiplied by the number of full nodes in existence. It can be argued that full nodes are altruistic, and therefore they are willing to incur any state storage cost the cryptocurrency users demand. While this may have been partially true for Bitcoin nodes in the past, altruistic behavior has declined greatly as the blockchain size has grown. The number of Bitcoin nodes is steady while the number of Bitcoin users has increased considerably. It is expected that block pruning and sharding techniques will allow new users to commit a smaller amount of historic blockchain storage, yet the state must still be maintained in full. Requesting all state information required to verify a block (read or written branches of the current state trie) is generally not possible in real time without huge bandwidth and peers willing to provide such data.
If the use of state storage is not protected from abuse, we risk pricing out full nodes. Controlling the state size reduces the centralization pressure while maintaining a free market. Considering the long-term risks of state bloat and the uncertainty of Moore’s law and similar trends in the future, it is clear that users should pay a state storage rent as a preventive measure. These central economic decisions cannot be applied later without breaking the community contract.
To Whom Should Storage Rent be Paid
At first glance, as full nodes store the blockchain and the state, it seems that storage rent should be paid to full nodes. However, HDD storage costs keep decreasing at a rate of 40% per year (this trend is known as Kryder’s law), so under this trend the real cost of storage is bounded. A similar trend exists for SSD storage. The electricity cost of storage is roughly fixed while the memory is unmodified, and otherwise depends on the number of read or write accesses per second. However, this is not what storage rent accounts for. The bloat of state mainly affects miners, who cannot start mining a child block containing transactions before the parent block has been fully validated. If the state does not fit in RAM, or on SSD, then state access is greatly slowed down, and miners must mine empty blocks until they fully verify a block. This establishes strong incentives for centralization, as bigger pools do not suffer this delay. Miners have incentives to use faster and more reliable storage and faster CPUs to reduce the verification delay. Therefore RSK pays most storage rent to miners. Even if we would like to redirect part of the fees to full nodes, there is no tested secure protocol to perform this payment. In RSK we’re testing the Proof of Unique Blockchain Storage (PoUBS) protocol, designed by Sergio Lerner, which is currently the only protocol that has the potential to solve the full-node reward problem without trusted parties.
The Problem of Storage Rent on DApps
For contracts that are controlled by a central entity, it is clear who must pay the storage rent. But for some use cases it is not clear who should pay this rent. Many contracts (and probably the most interesting ones) are crowd-contracts: programs that are fueled and used by the crowd, without any manager. Crowd-contracts can consume a lot of contract storage, but no single user is in a position to carry the burden of paying the rent, and no single user will be willing to pay for everyone. Luckily, crowd-contracts that consume large amounts of storage tend to be frequently accessed.
One can imagine that a well designed crowd-contract should have a revenue generation method for paying for the storage rent. For example, each crowd-contract method call should be accompanied by a payment in bitcoins to a special rent sub-account where the crowd-contract collects all rent-oriented income. However, this approach has several problems:
- Most crowd-contracts are immutable, so such a revenue-collecting method must be defined before the contract is deployed. If there is a direct relation between a user and the memory it consumes, each user can pay a partial rent independently. But if this is not the case, and most users only receive a service, then it will be highly unclear what proportion should be paid by each operation to keep the contract from missing the rent deadline.
- The cost in gas required to manage the rent collection process may be so high that it makes the service offered too expensive. The collection process involves several steps such as computing the amount of rent each user must pay, collecting rents, keeping a registry of which users have or haven’t paid the rent, removing the data of users who did not pay the rent, etc.
In an ideal world, a DNS-like contract would manage an independent balance to pay a rent for each name registered. However, as previously stated, this approach may be highly inefficient, as each payment for a small chunk of memory may represent a hundredth of a US cent.
The Problems with Collecting Rents with Fixed Periods
There are several complications that arise when trying to implement storage rent as a payment for a fixed period, ending at a certain specific time. The rent-paying contract execution could be scheduled using a crontab-like method. But one of the design choices of RSK/Ethereum is to avoid crontab-like scheduling, because the CPU consumed by contract execution has a direct effect on block propagation, so the crontab system could be used to perform a DoS attack on certain miners (e.g. by using an expensive computation whose result is known to the attacker, but not to the honest miner). Since a contract cannot schedule an action at a specific time, the rent-paying code would need to be triggered by a message coming from the outside world, before the rent deadline arrives. The amount paid must be specified in gas, because all prices are relative to gas units, so the transaction gasPrice applies. A contract cannot easily determine the adequate gasPrice to be paid unless the price is semi-fixed. Even if the minimum gas price is published by miners, miners are not forced to accept the rent-paying contract execution, nor any other transaction. A miner can try to censor a transaction that tries to pay a rent. Therefore the period to pay the contract rent should be long enough that no single miner can censor a rent payment. To prevent further censorship, the payment should be carried out by the execution of an opcode, and not by an explicit kind of transaction, so that a censoring miner must bear the cost of executing the transaction (e.g. halting problem). If deterministic deadlines are set some fixed time after the contract creation time, then the simultaneous creation of multiple contracts in the same block can lead to a high number of contracts requesting deadline checks at the same time. Therefore rent-deadline events should be chosen randomly at contract creation time, so the event is not foreseeable.
Short deadline intervals may be fine for long-term savings accounts, where there will be no balance changes for years. However, short deadlines are overkill for other kinds of contracts. Monthly payments put too much pressure on users. As a comparison, owners of domain names prefer to pay an annual fee rather than worry about a monthly fee. Short deadlines also add uncertainty, because the gas price may vary from month to month, preventing users from planning ahead. Also, for accounts, it’s suspected that the execution of contract rent payments consumes computing resources that are more expensive than the micro-rent that is being paid for a month.
Considering all these difficulties, we tested a design based on fixed rent periods and realized it increases the complexity and cost of contract programming.
Therefore we decided that the RSK platform would use a simpler approach and collect rent for the intervals between uses, rather than for long fixed periods of time.
Solutions to Micro-rents
One way to tackle the problem of micro-rents is by using a probabilistic approach. A random user of the smart contract is pseudo-randomly chosen, from a seed derived from the block hash, to pay the rent for all the users for a given period. But the VM does not know who the contract’s users are: only the contract knows. Another probabilistic approach is to select one in every 100 transactions that call a specific contract and force it to pay the full rent. The winning transaction can also be chosen by some lottery system, using a pseudo-random generator based on the block hash. However, this implies that the result cannot be reflected in the world-state of the current block, only in that of the next. If the parent block hash is used as the random seed instead, miners can re-order transactions so that certain users never pay the rent.
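To make the second lottery variant concrete, here is a minimal sketch in Python. It assumes the calls to a contract are numbered consecutively and that one slot in each window of 100 calls is picked from a seed hash. The function names, the use of SHA-256 and the window size are illustrative assumptions, not part of the RSK protocol.

```python
import hashlib

WINDOW = 100  # assumed: one call out of every 100 pays the full rent


def winning_slot(seed_hash: bytes, contract: bytes, window_no: int) -> int:
    """Derive the paying slot (0..99) for one window of calls to a contract."""
    data = seed_hash + contract + window_no.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % WINDOW


def pays_full_rent(seed_hash: bytes, contract: bytes, call_index: int) -> bool:
    """True if the call with this running index is the lottery winner."""
    window_no, slot = divmod(call_index, WINDOW)
    return slot == winning_slot(seed_hash, contract, window_no)
```

The sketch also shows why the choice of seed matters: anyone who knows `seed_hash` before ordering the transactions (for example a miner using the parent block hash) can compute `winning_slot` in advance and place a chosen transaction outside it.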
A better way to avoid these problems is for every operation on a contract to pay a rent proportional to the amount of storage the contract has acquired multiplied by the length of the last period during which the contract was inactive. This is not entirely fair, as a user who uses the contract a single time is forced to pay for all memory previously acquired. However, this system is fair assuming:
- miners do not reorder transactions in a block just to bias rent micro-payments,
- we weight every operation equally, and not based on the gas consumed,
- users’ contract calls are spaced evenly over time, and not concentrated in a few blocks.
Under these assumptions, users will pay a share of the contract rent in proportion to their usage rate.
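The pay-on-use rule can be sketched in a few lines of Python. The rate constant, the `Contract` fields and the function name are hypothetical, chosen only to illustrate "storage size multiplied by inactive time". They are not RSK’s actual gas schedule.

```python
from dataclasses import dataclass

# Hypothetical rate: 1 unit of gas per byte for every 10,000 blocks of inactivity.
RENT_RATE_NUM = 1
RENT_RATE_DEN = 10_000


@dataclass
class Contract:
    storage_bytes: int    # storage currently held by the contract
    last_paid_block: int  # block height of the last call that paid rent


def charge_rent(contract: Contract, current_block: int) -> int:
    """Return the rent (in gas) owed by the current call and reset the clock."""
    idle_blocks = current_block - contract.last_paid_block
    rent = contract.storage_bytes * idle_blocks * RENT_RATE_NUM // RENT_RATE_DEN
    contract.last_paid_block = current_block
    return rent


c = Contract(storage_bytes=1_000, last_paid_block=0)
first = charge_rent(c, 5_000)    # caller after 5,000 idle blocks
second = charge_rent(c, 10_000)  # next caller, another 5,000 blocks later
```

With evenly spaced calls, as in the usage lines, both callers are charged the same amount, which is the fairness property stated in the assumptions.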
To Kill or not to Kill
One has to decide what to do if a contract does not pay the rent. Killing the offending contract seems an outrageous action: the user’s asset balances would be burned if the user forgets to pay the rent. A softer alternative is required. At RSK we decided that a misbehaving contract should be hibernated, which means that all contract state is replaced by a single hash digest of it. The block height at which the contract was hibernated is also stored. Later, the user can recover the contract, including its balance, by providing the missing pre-image information. A user can inspect the stored hibernation block height to query a peer and obtain the missing data, if the peer can supply it.
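The idea of replacing the state with a digest and later accepting the pre-image can be sketched as follows. The serialization format, the choice of SHA-256 and all names here are assumptions made for illustration. The article does not specify how RSK encodes or hashes the hibernated state.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict


@dataclass
class HibernatedStub:
    state_digest: bytes  # hash of the removed contract state
    hibernated_at: int   # block height at which hibernation happened


def serialize(balance: int, code: bytes, storage: Dict[bytes, bytes]) -> bytes:
    """Assumed canonical encoding: length-prefixed fields, storage keys sorted."""
    parts = [balance.to_bytes(32, "big"), len(code).to_bytes(4, "big"), code]
    for key in sorted(storage):
        value = storage[key]
        parts += [len(key).to_bytes(4, "big"), key,
                  len(value).to_bytes(4, "big"), value]
    return b"".join(parts)


def hibernate(balance: int, code: bytes, storage: Dict[bytes, bytes],
              block_height: int) -> HibernatedStub:
    """Keep only a digest of the state plus the hibernation height."""
    digest = hashlib.sha256(serialize(balance, code, storage)).digest()
    return HibernatedStub(digest, block_height)


def preimage_matches(stub: HibernatedStub, balance: int, code: bytes,
                     storage: Dict[bytes, bytes]) -> bool:
    """Accept supplied data only if it hashes to the stored digest."""
    digest = hashlib.sha256(serialize(balance, code, storage)).digest()
    return digest == stub.state_digest
```

Because the stub keeps only a fixed-size digest and a block height, the balance is not lost: anyone who can supply data matching the digest can restore the contract exactly as it was.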
Should Accounts be Hibernated?
The RSK platform, similar to Ethereum, has two types of addresses: contracts and accounts. Contracts are controlled by code, and only the code can dictate an outgoing transfer. Accounts are controlled by a single ECDSA private key. Because accounts have no code, they occupy very little space in the state: about 40 bytes in compact form. Hibernation can only remove about 16 of those bytes, while it adds a new hash digest consuming at least 20 bytes. Therefore hibernating accounts increases the size of the state instead of decreasing it. This can also happen with contracts having a few bytes of code and no storage, but such contracts are rare. Hibernation of accounts can still be performed if we allow cascaded hibernation of nodes in a tree. This is an interesting feature we’ve researched that makes it possible to collapse any number of contracts into a single hash digest. However, this feature requires an additional binary tree to uniquely number all leaf nodes corresponding to contracts/accounts, and we’ll explain it in a following post.
To prevent spam using accounts, the platform mandates that a minimum amount of smart bitcoins be transferred to an account upon creation.
Most Rent Payments are Micro-transactions
Because of the small cost of HDD and SSD storage, some rent payments will represent micro-transactions, such as 3000 units of gas per 32 bytes of storage per year (assuming the same USD/gas cost). If paying the rent requires sending a separate transaction, the fees paid for that transaction (21K gas) far exceed the amount transacted. This is a huge overhead for the network and it’s against the common good. Therefore it is preferable to piggyback the payment on a transaction that performs another operation. This saves bandwidth and reduces computation time to a single digital signature verification.
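Using only the two figures quoted above (3000 gas of rent per 32-byte slot per year, and 21K gas for a standalone transaction), the overhead can be worked out directly. The function name is illustrative.

```python
RENT_GAS_PER_SLOT_YEAR = 3_000  # figure quoted above, per 32-byte slot per year
BASE_TX_GAS = 21_000            # cost of a standalone transaction, as quoted above


def overhead_ratio(slots: int, years: int = 1) -> float:
    """Gas spent on the transaction itself per unit of gas paid as rent."""
    return BASE_TX_GAS / (RENT_GAS_PER_SLOT_YEAR * slots * years)


single_slot = overhead_ratio(1)  # one slot, one year of rent
seven_slots = overhead_ratio(7)  # rent for seven slots in one payment
```

For a single slot the transaction costs seven times the rent it carries, and a dedicated payment only reaches parity with its own fee once it covers seven slot-years of rent.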
There are alternatives that we discarded for other reasons, for example:
- aggregating many rent payments in a single transaction that uses a contract to pay multiple (maybe thousands of) rents,
- having multi-output transactions, and using one output to pay rents,
- adding a new field in transactions specifically for specifying rents.
All these solutions, while technically possible, incur more overhead and violate the principle of simplicity. Therefore RSK simply re-uses the transaction gas system to pay storage rent, by consuming gas from every CALL operation in proportion to the target contract size and the time elapsed since the last contract use.
RSK has assigned new costs to SSTORE and to persisting a contract code byte. The initial cost of persistence is 50% cheaper than in Ethereum (assuming equal gas/USD cost). Afterwards, the annual cost is about 25% of the initial cost. Therefore RSK storage is cheaper than Ethereum for any application that runs for less than 3 years.
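As a rough check of this comparison, the sketch below takes the approximate ratios at face value: an Ethereum one-off cost normalized to 1, an RSK initial cost of half that, and an annual rent of 25% of the RSK initial cost. These normalized values are a simplification made for illustration; the real gas constants are not given in this article.

```python
ETH_ONE_OFF = 1.0               # Ethereum persistence cost, normalized to 1
RSK_INITIAL = 0.5               # "50% cheaper" initial cost
RSK_ANNUAL = 0.25 * RSK_INITIAL  # "about 25% of the initial cost" per year


def rsk_cumulative_cost(years: float) -> float:
    """Initial cost plus rent accrued over the given number of years."""
    return RSK_INITIAL + RSK_ANNUAL * years


def rsk_is_cheaper(years: float) -> bool:
    return rsk_cumulative_cost(years) < ETH_ONE_OFF
```

Under this simplified reading the two costs meet after about four years, so applications running for less than three years fall well inside the region where RSK is cheaper. The exact crossover depends on the real constants, which are only approximated here.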
How and When to Hibernate
If one year passes without a contract being successfully called, the contract is ready to be hibernated. Hibernation is a very lightweight operation in RSK. Therefore users are encouraged, for the common good, to hibernate contracts that did not pay the rent, at no cost.
To hibernate a contract, RSK has the new opcode “HIBERNATE”. The only argument of HIBERNATE is the contract address, or a prefix of an address, to hibernate. The actual hibernation is delayed until all transactions in the block have been executed. If an address prefix is given as argument to HIBERNATE, the platform will scan the addresses with that prefix and hibernate those that can be hibernated, stopping at the first one that cannot. Therefore a single HIBERNATE call can hibernate an unlimited number of contracts. Hibernation freezes a contract until an external wake-up is performed. During hibernation, a contract or an account cannot receive payments. Any call raises an exception and no action is performed. Contract hibernation can be self-inflicted or inflicted by a third party. Self-inflicted hibernation can be used to reduce the maintenance cost of a contract that lives long but is used very sporadically. However, this is generally only economically rational if the hibernation period is several years, because to wake up a contract the removed data must be provided back. Third-party hibernation means that any user (including miners) can hibernate misbehaving contracts.
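The prefix scan and the deferred execution described above can be sketched like this. Addresses are modeled as hex strings, and the grace period constant, class and function names are hypothetical stand-ins (the real rule is "one year without a successful call"). This is not the actual opcode implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

GRACE_BLOCKS = 100  # hypothetical stand-in for "one year" of inactivity


@dataclass
class Block:
    height: int
    pending: List[str] = field(default_factory=list)  # prefixes queued by HIBERNATE


def op_hibernate(block: Block, prefix: str) -> None:
    """HIBERNATE only queues the request; nothing changes during execution."""
    block.pending.append(prefix)


def finalize_block(block: Block, last_call: Dict[str, int],
                   hibernated: Set[str]) -> None:
    """Run after all transactions: apply the queued hibernations."""
    for prefix in block.pending:
        candidates = sorted(a for a in last_call
                            if a.startswith(prefix) and a not in hibernated)
        for addr in candidates:
            if block.height - last_call[addr] < GRACE_BLOCKS:
                break  # stop at the first address that cannot be hibernated
            hibernated.add(addr)
    block.pending.clear()
```

Deferring the work to `finalize_block` mirrors the rule that hibernation takes effect only after every transaction in the block has run, so a contract called earlier in the same block is judged on its updated last-call height.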
Wake me up, before you go-go.
To bring the contract back to life, the WAKEUP opcode is used. The WAKEUP opcode receives as arguments the address of the contract to wake up and a pointer to all the missing data. However, the cost of waking up a contract also comprises the cost of supplying the missing data (including transfer, temporary storage or re-computing costs). Users can always recover the contract data by re-processing the blockchain from an old checkpoint (or from the genesis block) or by asking a peer for the state prior to hibernation.
Conclusions
Storage rent is not a protection against short-term spam attacks. Storage rent is needed to protect the blockchain users of the future from market-driven measures, such as a reduction in the gas price, that miners may take selfishly or strategically. It can be expected that during certain periods miners will lower the gas price to increase user adoption and to outperform the competition. Without storage rent, these market-driven decisions turn into populism: sacrificing resources for short-term gains and preventing long-term success. Storage rent also protects the blockchain against miscalculations and erroneous predictions about technology or the blockchain adoption rate. However, implementing storage rent is complex: as many rent payments are micro-transactions, the system implemented must make sure the rent collection cost does not introduce a new limiting factor to scaling.
Summary and What’s Next
In this post we’ve discussed storage rent and concluded that:
- there is a need for storage rent,
- it entails several design and implementation trade-offs,
- rent payments are generally micro-transactions,
- rent payments need to use side-channels on existing transaction fees,
- RSK implements storage rent using two new opcodes: HIBERNATE and WAKEUP,
- RSK storage costs 50% less than Ethereum when storing short-lived data, but costs more for long-lived data.
In a following post we’ll investigate other related improvements, such as survive and temporary memory spaces to allow more efficient persistent memory management, and how a new binary tree structure can be used in RSK to allow the collapse of an unlimited number of hibernated contracts into higher-level hashes, and therefore reduce state storage even more.