The story before the launch of Ethereum
Author: Vitalik Buterin
Original Title: 《A Prehistory of the Ethereum Protocol》
Publication Date: September 14, 2017
Although the ideas behind the current Ethereum protocol have been largely stable for two years, Ethereum did not appear all at once, nor was it fully formed in its current conception. Before the blockchain launched, the protocol underwent many significant developments and design decisions. The purpose of this article is to browse through the various evolutions of the protocol from its inception; the countless work done on protocol implementations such as Geth, Cppethereum, Pyethereum, and Ethereumj, as well as the history of applications and businesses in the Ethereum ecosystem, are intentionally beyond the scope.
The history of Casper and Sharding Research also appears. While we could certainly publish more blog posts discussing all the ideas of Vlad, Gavin, myself, and others, including "proof of work," "hub-and-spoke chains," "hyperlegislation," shadow chains (arguably a precursor to Plasma), chain fibers, and various iterations of Casper, we will omit Vlad's contributions in a post.
Let’s start with the earliest version that eventually became Ethereum, when it wasn’t even called Ethereum. When I visited Israel in October 2013, I spent a lot of time with the Mastercoin team and even provided them with some features. After thinking about what they were doing a few times, I sent the team a suggestion to make the protocol more general and support more types of contracts without adding the same large and complex features:
https://web.archive.org/web/20150627031414/http://vbuterin.com/ultimatescripting.html
Note that this is far from the later broader vision of Ethereum: it was purely what Mastercoin was trying to specialize in, namely two-party contracts, where parties A and B would both put in funds, and then they would later receive funds according to certain formulas specified in the contract (for example, a bet might say: "If X happens, give all the funds to A, otherwise give all the funds to B"). The scripting language was not complete.
The Mastercoin team was impressed, but they were not interested in abandoning everything they were doing to move in this direction, and I increasingly believed this was the right choice. Thus, here is version 2 from December 2:
https://web.archive.org/web/20131219030753/http://vitalik.ca/ethereum.html
Here, you can see the result of a substantial rethinking, primarily due to my realization that smart contracts have the potential to be fully general, a result of my long trip to San Francisco. Rather than merely describing the terms of a relationship between two parties, the contract itself is a complete account, capable of holding, sending, and receiving assets, and even maintaining permanent storage (at that time, permanent storage was referred to as "memory," and the only temporary "memory" was 256 registers). The language transitioned from a stack-based computer to a register-based computer of my own design; aside from cases that seemed more complex, I had almost no argument against it.
Additionally, note that there is now a built-in fee mechanism:
At this point, Ether is essentially gas. After each computational step, the balance of the contract requested by the transaction will decrease, and if the contract runs out of currency to execute, it will stop. Note that this "receiver pays" mechanism means that the contract itself must require the sender to pay fees to the contract and exit immediately in the absence of such fees; the protocol allocates an allowance of 16 free execution steps to allow contracts to reject non-paying transactions.
This was the time when the Ethereum protocol was entirely my own creation. However, from here, new participants began to join the fold. The most prominent aspect of the protocol at this point was Gavin Wood, who reached out to me in December 2013.
The chief developer of the GO client, Jeffrey Wilcke (then known as "Ethereal"), also reached out around the same time, although his contributions were more in client development than protocol research.
"Hey, Jeffrey, great to see you interested in Ethereum…"
Gavin's initial contributions were twofold. First, you might notice that the contract call model in the initial design was asynchronous: although contract A could create "internal transactions" for contract B (the term "internal transactions" is from Etherscan; initially, they were just called "transactions" and later "message calls" or "calls"), the execution of internal transactions would only start after the execution of the first transaction was fully completed. This meant that transactions could not use internal transactions as a way to obtain information from other contracts; the only way was through the Extro OpCode (similar to what you could use to read the storage of other contracts with Sload), which was subsequently removed with the support of Gavin and others.
In implementing my initial specifications, Gavin naturally synchronized the implementation of internal transactions without even realizing the intent was different - that is, in Gavin's implementation, when a contract calls another contract, the internal transactions are executed immediately, and once the execution is complete, the VM returns to the contract that created the internal transaction and continues with the next OPCODE. To both of us, this approach seemed superior, so we decided to adopt it as part of the specifications.
Secondly, discussions between him and myself (the exact details of which will forever be lost to the winds of history, with perhaps one or two copies buried in the NSA's deep archives) shifted the transaction fee model from a "contract payment" approach to a "sender pays" approach, and also switched to a "gas" architecture. The transaction sender does not immediately take away each individual transaction step, but rather allocates some "gas" (roughly a counter for computational steps), and the computational steps are taken from this gas allowance. If a transaction runs out of gas, the gas will still be forfeited, but the entire execution will be reverted; this seemed to be the safest thing, as it removed the entire course.
Gavin can also be largely credited with the vision of viewing Ethereum as a platform for building programmable money, with blockchain-based contracts capable of holding digital assets and transferring them according to preset rules, evolving into a general-purpose computing platform. This began with subtle shifts in focus and terminology, which later became more pronounced with the increasing emphasis on the "Web 3" ensemble, which views Ethereum as a set of decentralized technologies, with the other two being Whisper and Swarm.
Others suggested changes around early 2014 as well. After Andrew Miller and others proposed the idea, we eventually returned to a stack-based architecture.
Charles Hoskinson suggested switching from Bitcoin's SHA256 to the new SHA3 (or more accurately, Keccak256). Despite some controversy for a time, discussions with Gavin, Andrew, and others led to the determination that the size of values on the stack should be limited to 32 bytes. Another alternative considered was unlimited-sized integers, which posed the problem of how to figure out how much gas, addition, multiplication, and other operations would cost.
The initial mining algorithm we conceived as early as January 2014 was a device called Dagger:
https://github.com/ethereum/wiki/blob/master/Dagger.md
Dagger is named after the "Directed Acyclic Graph" (DAG) mathematical structure used in the algorithm. The idea was that every n blocks, a new DAG would be generated pseudo-randomly from a seed, and the underlying DAG would be a collection of nodes that would require several gigabytes to store. However, generating any individual value in the DAG required computing thousands of entries. "Dagger computation" involves obtaining some values from random positions in this bottom dataset and then combining them. This means there is a fast way to perform Dagger computation - already having the data stored in memory, rather than a memory-intensive method - from each value you need to obtain from scratch.
The goal of the algorithm was to have the same "memory-hard" properties as algorithms that were popular at the time, such as Scrypt, but still be light client-friendly. Miners would use the fast method, so their mining would be limited by memory bandwidth (theoretically, consumer-grade RAM was already very valued, making it difficult to further optimize it via ASIC), but light clients could verify using a non-memory but slower version. The fast method might take a few microseconds, and the non-memory way would be slow without milliseconds, making it still very feasible for light clients.
From here, the algorithm underwent several changes during the Ethereum development process. The next idea we explored was "adaptive proof of work." Here, proof of work would involve executing randomly selected Ethereum contracts, and there was a clever rationale for how this would be ASIC-resistant: if ASICs were developed, competing miners would have the incentive to create and publish many contracts that demonstrated the contracts were not well-suited for ASIC execution. This story did not apply to something like general-purpose computing ASICs, as that was just CPU, so we could use this adversarial incentive mechanism to essentially create a proof of work that executed general computations.
The reason it failed was simple: remote attacks. An attacker could start a chain from block 1, filling it with simple contracts for which they could create dedicated hardware, and quickly surpass the main chain. So… back to the drawing board.
The next algorithm was called "random circuits," described in this Google document, proposed by myself and Vlad Zamfir, and analyzed by Matthew Wampler-Doty. The idea here was also to simulate general computation in the mining algorithm, this time by executing randomly generated circuits. There was nothing difficult to prove, and based on these principles, it was unworkable, but the computer hardware experts we contacted in 2014 were often very pessimistic about it. Matthew Wampler-Doty himself suggested a proof of work based on SAT solving, but that was ultimately rejected as well.
Finally, we refined the circuit with an algorithm called "Dagger Hashimoto." "Dashimoto" is sometimes referred to as "dashimoto," borrowing many ideas from Hashimoto, Hashimoto is Thaddeus Dryja's proof of work algorithm, which first proposed the concept of "I/O proof of work," where the main limiting factor for mining speed is not hashes per second, but rather the large range of RAM accesses per second. However, it combined this with Dagger's light client-friendly DAG-generated dataset. After I made many adjustments, Matthew, Tim, and others, these ideas finally merged into what we now call ethash.
By the summer of 2014, the protocol had stabilized significantly, and aside from the proof of work algorithm, it would not reach the Ethash stage until early 2015, existing in semi-formal specifications in the form of Gavin's yellow paper.
In August 2014, I developed and introduced the uncle block mechanism, which allowed Ethereum's blockchain to have shorter block times and higher capacity while mitigating centralization risks. This was introduced as part of POC6.
Discussions with the Bitshares team led us to consider adding stacks as a first-class data structure, although we ultimately did not do so due to lack of time, and later security audits and DOS attacks would indicate that it was actually much more difficult than we had thought at the time to securely execute this.
In September, Gavin and I planned the next two major changes to the protocol design. First, alongside the state tree and transaction tree, each block would also contain a "receipt tree." The receipt tree would include hashes of logs created by transactions and the intermediate state root. The logs would allow transaction creators to store "outputs" in the blockchain, which could be accessed by light clients, but future state calculations could not access. This could be used to allow decentralized applications to easily query events such as token transfers, purchases, created and filled exchange orders, starting auctions, and so on.
There were other ideas considered, such as creating a Merkle tree from the entire execution trace of a transaction to prove anything. The choice of logs was made because they were a compromise between simplicity and completeness.
The second was the idea of "precompiles," addressing the issue of allowing complex cryptographic computations to be available in the EVM without dealing with EVM overhead. We also went through more ambitious ideas about "native contracts" - "if miners could optimize the implementation of certain contracts, they could 'vote' for the packing fees of those contracts," so that contracts that most miners could execute faster would naturally lower gas prices; however, all these ideas were rejected because we could not propose a cryptoeconomically secure way to implement such things. Attackers could always create a contract that executed certain trapdoor cryptographic operations, assigning the trapdoors to themselves and friends to execute this contract faster, and then vote on gas and use it for the network. Instead, we chose the ambitious approach of simply specifying a small number of precompiles in the protocol for common operations such as hashing and signing schemes.
Gavin was also a key initial voice in developing the concept of "protocol abstraction" - moving many parts of the protocol, such as Ether balances, transaction signature algorithms, nonces, etc., as contracts themselves, with the theoretical ultimate goal being to describe the entire Ethereum protocol as a case of calling functions on a virtual machine with certain primitive states. These ideas did not have enough time to enter the initial frontier version, but the principles were expected to slowly integrate through some Constantinople changes, Casper contracts, and sharding specifications.
All of this was implemented in POC7; after POC7, the protocol did not change much, except for minor details that would emerge from security audits…
In early 2015, Jutta Steiner and others organized a pre-release security audit, which included software code reviews and academic reviews. The software review was primarily led by Gavin Wood and Jeffrey Wilcke for the C++ and GO implementations, although there was also a smaller audit for my Pyethereum implementation. In these two academic reviews, one was conducted by Ittay Eyal (famous for "selfish mining"), and the other by Andrew Miller and other minimal authorities. The EYAL audit led to minor protocol changes: the total difficulty of the chain did not include uncles. The minimal authority audit focused more on smart contracts and gas economics and the Patricia tree. This audit led to several protocol changes. One minor change was to use sha3(addr) and sha3(key) as trie keys instead of direct addresses and keys; this would make the worst-case attacks on the Trie more difficult.
The warning may be a bit ahead of its time…
Another important thing we discussed was the gas limit voting mechanism. At that time, we were already concerned about the lack of progress in the Bitcoin block size debate and wanted a more flexible design in Ethereum that could be adjusted over time as needed. But the challenge was: what is the optimal limit? My initial idea was to establish a dynamic limit based on a long-term exponential moving average of actual gas usage, so that over the long term, average blocks would be full. However, Andrew pointed out that this could be exploited in certain ways - specifically, miners wanting to raise the limit would only include transactions in their own blocks that consumed a lot of gas but took little time to process, thus always creating entire blocks without spending money. Therefore, at least in the upward direction, the security model was equivalent to simply letting miners vote on the gas limit.
We could not come up with a gas limit strategy that was unlikely to break, so Andrew's recommended solution was simply to let miners explicitly vote on the gas limit and set the default voting strategy to EMA rules. The reason was that we had a long way to go to find the right way to set a maximum gas limit, and the risk of any particular method failing seemed greater than the risk of miners abusing their voting rights. Therefore, we might as well let miners vote on the gas limit and accept the risk of it being set too high or too low in exchange for the benefits of flexibility and the ability for miners to work together to quickly adjust the limit up or down as needed.
After a small black marathon between Gavin, Jeff, and myself, POC9 was launched in March, intended to be the final proof of concept release. Using the scheme intended for use in Livenet, a long-term plan for Ethereum was established, running a testnet for four months. Vinay Gupta wrote a blog post "Ethereum Launch Process" describing the four anticipated phases of Ethereum Livenet development and giving them their current names: Frontier, Homestead, Metropolis, and Serenity.
The hackathon ran for four months. In the first two months, many bugs were found across various implementations, consensus failures occurred, among other issues, but by around June, the network stabilized significantly. In July, it was decided to freeze the code and then release it, with the launch occurring on July 30.