HashKey: An Analysis of the Economic Models and Application Exploration of Decentralized Oracles such as Chainlink and Nest
This article was published on July 9, 2020, on the Chain News HashKey Capital Research account, authored by Qian Bojun, with the original title: "The Design of Decentralized Oracles"
This article studies the core design concepts of decentralized oracles, as well as the economic design and application exploration of various decentralized oracles. The conclusion is that with technological advancements, decentralized oracles will become mainstream, and the economic models and incentive mechanisms will be the focus of competition among various decentralized oracles. As the user base of DeFi and other public chain applications expands, decentralized oracles will become essential infrastructure, and cross-oracle applications will be a major trend in enhancing the security of data sources.
There are significant differences in operational logic between the crypto world and the real world. The crypto world operates on-chain through consensus mechanisms, cryptography, distributed nodes, and smart contracts. In a smart contract, a given input X always produces the same result Y; execution is deterministic, and the result is irreversible and trustworthy.
To achieve an accurate Y result, the source of variable X is crucial. There are two sources for variable X data: on-chain data and off-chain data. On-chain trusted data can be directly obtained through the blockchain, while off-chain trusted data needs to be provided by oracles. This article mainly introduces various ways in which oracles provide trusted data, along with their economic incentive designs.
The article is divided into three parts: the first part introduces the mechanisms of centralized and decentralized oracles, the second part compares the economic incentive designs of existing market oracles, and the third part discusses the applications and prospects of oracles.
1. Types and Mechanisms of Oracles
The function of an oracle is to convert external information into data written on the blockchain, completing the data interchange between the blockchain and the real world, serving as a means for smart contracts to interact with external data. Oracles need to filter data from a highly uncertain and unverified database and input it into a reliable and secure closed system; thus, the quality of the data significantly affects the operation of the entire system.
Currently, the sources of oracle databases mainly include four types: first, internet connections and search engines; second, on-chain data from other blockchains; third, data stored on IPFS; fourth, data from IoT sensors. There are three types of oracles in the crypto market: first, centralized oracles; second, decentralized oracles; third, consortium oracles. The following sections introduce the mechanisms and differences of these three types of oracles.
Centralized Oracles
Centralized oracles provide data to smart contracts from a trusted centralized institution. There are two mechanisms for centralized oracles: the first is that the centralized institution allows the oracle to operate in a trusted execution environment, and data requesters do not need to trust the centralized institution. This mechanism can prove to data requesters that the data source has not been modified throughout the process through trusted cryptographic proof technology.
Provable is a typical example of this mechanism, using TLSnotary proof technology, allowing the entire process of data source integration into the blockchain to be auditable by third parties. As long as data requesters trust the data source, the entire process of data integration into the blockchain is trustworthy.
The second mechanism involves oracles developed by the data source itself, where data requesters need to trust the centralized institution. In this mechanism, the data source is usually a trusted off-chain institution that extends off-chain credibility to on-chain and is fully responsible for data quality.
The mechanism of centralized oracles is relatively straightforward and aligns with traditional societal data sources, mainly having three advantages: first, in centralized oracles, the integrity and security of data directly affect the oracle's credibility, and since providing data is a commercial activity, the motivation for wrongdoing is relatively low. Second, since all data is provided by centralized oracles, there is no competitive behavior among participants, resulting in higher data retrieval efficiency. Third, the trustworthiness of centralized oracle data is independent of the user scale; even with a small ecosystem, the oracle can still operate normally.
However, centralized oracles have limitations in two aspects. First, scalability: they cannot accommodate data provided by other oracles. Second, security: a centralized oracle has a finite intrinsic value and can therefore be bought out at a price, so that value is insufficient to support the security required for high-value contracts.
When centralized oracles serve as data providers for larger DeFi ecosystems, data requesters can bribe or even buy out centralized oracles to manipulate data sources for their own profit in DeFi contracts.
Decentralized Oracles
The design of decentralized oracle mechanisms is consistent with the distributed philosophy of blockchain, primarily providing data services through a trusted network of numerous nodes, enhancing the fault tolerance of the entire oracle system. Decentralized oracles do not enhance the trustworthiness of oracles through technology but achieve data trustworthiness through economic incentives and multi-signature mechanisms.
Decentralized oracles involve multiple nodes, and the design must consider the following issues: first, the collusion problem among nodes; second, the confidentiality of data content; third, the timeliness of data acquisition; fourth, the issue of malicious nodes copying data from other nodes; fifth, the problem of data corruption caused by Sybil attacks.
The execution process of decentralized oracles typically involves five steps: 1. The smart contract saves the transaction state. 2. The current transaction is paused while the contract waits for the decentralized oracle to fetch the data. 3. The oracle uses a multi-signature mechanism to have the data providers of the corresponding nodes sign simultaneously. 4. The oracle uses cryptographic algorithms to organize and summarize the data from each node and adjusts the transaction state. 5. The smart contract verifies the result, and the transaction is completed.
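The five steps above can be illustrated with a minimal sketch. It is a simplification under stated assumptions, not the implementation of any particular oracle: the names `PendingRequest`, `aggregate_reports`, and `fulfil` are hypothetical, a node's signature is modeled simply as membership in a set of known nodes, and the median stands in for the aggregation algorithm.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class PendingRequest:
    """State the contract saves while it waits for the oracle (steps 1-2)."""
    request_id: int
    min_signatures: int  # hypothetical quorum parameter


def aggregate_reports(reports: dict, known_nodes: set) -> float:
    """Steps 3-4: keep reports from known nodes only and summarize them.

    Assumption: a valid signature is modeled as the node name being in known_nodes.
    """
    accepted = [value for node, value in reports.items() if node in known_nodes]
    if not accepted:
        raise ValueError("no reports from known nodes")
    return median(accepted)


def fulfil(request: PendingRequest, reports: dict, known_nodes: set) -> float:
    """Step 5: the contract checks the quorum before accepting the result."""
    signers = [node for node in reports if node in known_nodes]
    if len(signers) < request.min_signatures:
        raise ValueError("not enough signatures")
    return aggregate_reports(reports, known_nodes)
```

In this sketch a report from an unknown node is ignored, and a request that does not reach the quorum is rejected instead of being settled on partial data.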
Regardless of the type of decentralized oracle, their core method of calling data has commonalities, though the implementation methods differ. Decentralized oracles have two limitations compared to centralized oracles: first, the fees are relatively high, requiring multiple nodes to participate. Second, the ecosystem must have a certain scale, as the trustworthiness of data is highly correlated with the scale of the ecosystem.
Consortium Oracles
Consortium oracles provide data to smart contracts from a trusted consortium, representing a type of decentralized oracle. Similar to consortium chains, the nodes in a consortium oracle network are composed of designated trusted individuals or institutions. The trust composition of this type of oracle has multiple layers, including trust in each node, trust in the oracle's mechanism itself, and trust in the oracle governance institution.
The MakerDAO oracle belongs to this category. It is composed of 14 trusted nodes that provide real-time ETH/USD prices to internal users. Its nodes include anonymous individual data sources as well as designated data source institutions such as 0x, dYdX, Set Protocol, and Gnosis.
There are two issues to be aware of with consortium oracles: first, the confidentiality of trusted node identities relates to the possibility of nodes being manipulated or extorted. Second, it is necessary to consider whether trusted nodes and governance institutions have conflicts of interest.
In the case of the MakerDAO consortium oracle, MKR holders decide two key points of oracle operation: first, the list of nodes participating in the consortium oracle; second, the delay applied to price responses, which prevents malicious nodes from manipulating the oracle. However, when MKR holders (the governance institution) collude or become corrupt, the MakerDAO ecosystem will struggle to maintain checks and balances.
Dishonest MKR holders can collude to manipulate ETH prices by holding large amounts of CDP or DAI, triggering global settlement for profit. Consortium oracles can therefore be efficient and reasonably decentralized in environments with relatively high trust, serving as a solution for the early stages of the industry.
2. Economic Models of Decentralized Oracles
Chainlink
Two-Layer Structure
Chainlink is an oracle system built on Ethereum, consisting of a two-layer structure, where the lower layer provides data to oracle nodes from multiple data sources, and the upper layer provides data to the blockchain from multiple oracle nodes.
Figure 1: Chainlink Two-Layer Structure
Chainlink's two-layer structure has three characteristics: first, data requesters can customize the composition of data sources, including the reputation and number of nodes. Second, the lower layer ensures the decentralization of data, while multiple oracles in the upper layer ensure that the system can continue to operate even if any single oracle experiences a failure. Third, Chainlink aggregates data on-chain before delivering it to data requesters.
All oracles send data to the on-chain smart contract, which then filters out anomalies and provides reasonable data to the data requester. The advantage of on-chain data aggregation is that the data can undergo multiple audits, and the data provided by the sources is recorded on the blockchain, increasing reliability. However, the downside of this method is that when the data volume is large, transaction fees can become very expensive and lead to network congestion.
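The anomaly filtering described above can be sketched as follows. This is an illustrative assumption about how such a filter might work, not Chainlink's actual contract logic: the function name `filter_and_aggregate` and the 5% deviation threshold are hypothetical.

```python
from statistics import median


def filter_and_aggregate(values: list, max_deviation: float = 0.05) -> float:
    """Drop values that deviate too far from the median, then average the rest.

    max_deviation is a hypothetical tolerance, expressed as a fraction of the median.
    """
    if not values:
        raise ValueError("no values to aggregate")
    mid = median(values)
    kept = [v for v in values if abs(v - mid) <= max_deviation * abs(mid)]
    if not kept:
        # With an even number of widely spread values nothing may survive the filter;
        # fall back to the median itself.
        return mid
    return sum(kept) / len(kept)


# Three nodes agree on roughly 100; the report of 150 is discarded as an anomaly.
print(filter_and_aggregate([100.0, 101.0, 99.0, 150.0]))  # 100.0
```

Because every report is processed on-chain, the cost of this approach grows with the number of reporting nodes, which is the fee problem the article turns to next.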
Therefore, Chainlink utilizes threshold signatures to solve this problem. Threshold signatures allow oracles to communicate with each other and reach consensus off-chain. Off-chain oracles aggregate data using threshold signature technology, only needing to transmit data to the blockchain once, thus only paying transaction fees once.
Economic Model
The main participants in the Chainlink oracle ecosystem are data requesters and data providers, primarily operating by collaborating with trusted nodes and incentivizing them with LINK tokens. The design of Chainlink's economic mechanism has two main aspects: first, staking LINK tokens. Nodes must stake Chainlink's LINK tokens before providing services.
If a node engages in malicious behavior, including providing false data, copying data, or failing to act, the LINK tokens staked by the node will be forfeited by the system to protect data requesters. Second, the Chainlink oracle network has a reputation system. Off-chain nodes can earn a certain amount of LINK as a reward for providing data services, while other nodes will evaluate them based on the quality of the data they provide, thus affecting the node's reputation.
Factors affecting a node's reputation include the frequency of data provision, completion rate, response time, and quality. The more LINK tokens a node stakes, the more it implicitly affirms its own service and data, leading to a higher reputation and greater opportunities for assignments from data requesters, thus increasing potential earnings.
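To make the interplay of these factors concrete, the sketch below combines completion rate, response time, data quality, and stake into a single score. The weights, the caps, and the names `NodeStats` and `reputation_score` are all hypothetical; the article does not specify how Chainlink actually weighs these factors, and the frequency of data provision is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class NodeStats:
    assigned: int                 # requests assigned to the node
    completed: int                # requests the node answered
    avg_response_seconds: float
    accuracy: float               # share of answers judged correct, between 0 and 1
    staked_link: float


def reputation_score(stats: NodeStats,
                     max_response_seconds: float = 60.0,
                     stake_cap: float = 10_000.0) -> float:
    """Return a score between 0 and 1; all weights are illustrative assumptions."""
    completion = stats.completed / stats.assigned if stats.assigned else 0.0
    speed = max(0.0, 1.0 - stats.avg_response_seconds / max_response_seconds)
    stake = min(stats.staked_link, stake_cap) / stake_cap
    return 0.3 * completion + 0.2 * speed + 0.3 * stats.accuracy + 0.2 * stake


reliable = NodeStats(assigned=100, completed=98, avg_response_seconds=6.0,
                     accuracy=0.99, staked_link=8_000.0)
flaky = NodeStats(assigned=100, completed=70, avg_response_seconds=30.0,
                  accuracy=0.80, staked_link=1_000.0)
print(reputation_score(reliable) > reputation_score(flaky))  # True
```

Capping the stake term keeps a node from buying a high score through stake alone, which is one possible design choice among many.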
In addition to providing real data on-chain, Chainlink also offers services for verifying randomness for projects based on smart contracts. Users can access DApps (blockchain games, gambling) and verify their randomness.
The process has four steps: 1. The smart contract sends a request to Chainlink for random number verification. 2. Chainlink generates a random number. 3. Chainlink sends the random number to the VRF smart contract for random verification. 4. The result is sent back to the smart contract. By verifying the randomness of DApps, Chainlink VRF can help users judge the authenticity and fairness of projects. Through its payment mechanism, it also connects LINK tokens with various Ethereum projects, with the potential to develop a more complete token economic model within the ecosystem in the future.
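The request, generate, verify, and return flow can be sketched with a hash commitment. This is explicitly a stand-in and not a real VRF: a real VRF uses a public-key proof so that the secret key is never disclosed, whereas the one-shot commit-reveal below reveals the secret as its proof. The function names are hypothetical.

```python
import hashlib


def commit(secret: bytes) -> str:
    """Published in advance so the provider cannot change its secret later."""
    return hashlib.sha256(secret).hexdigest()


def generate(secret: bytes, request_seed: bytes):
    """Step 2: derive the random value from the secret and the requester's seed."""
    digest = hashlib.sha256(secret + request_seed).digest()
    random_value = int.from_bytes(digest, "big")
    proof = secret  # stand-in proof: revealing the secret (a real VRF would not)
    return random_value, proof


def verify(commitment: str, request_seed: bytes, random_value: int, proof: bytes) -> bool:
    """Step 3: anyone can recompute the value and check it against the commitment."""
    if hashlib.sha256(proof).hexdigest() != commitment:
        return False
    digest = hashlib.sha256(proof + request_seed).digest()
    return int.from_bytes(digest, "big") == random_value


secret = b"provider-secret-known-only-to-the-oracle"
commitment = commit(secret)
value, proof = generate(secret, b"request-1")          # steps 1-2
print(verify(commitment, b"request-1", value, proof))  # steps 3-4: True
```

The property the sketch shares with a VRF is that the provider cannot choose the output after seeing the request, and that any observer can check the result.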
Issues and Solutions
(1) Data Privacy Issues
Maintaining data privacy during open input and queries is quite challenging, especially in the financial sector. For example, when analyzing user credit records and personal information in DeFi lending to determine a user's credit rating, the user's information may be completely recorded on-chain due to aggregation, affecting their willingness to make requests.
Currently, Chainlink employs trusted execution environments to protect data. The characteristic of Chainlink's trusted execution environment is that it encrypts and isolates part of the code and data from the external environment, allowing the results of the trusted execution environment to be read only through specific means. Even if a node's computer is compromised, the data within the trusted execution environment can remain secure, achieving a dual-layer protection effect.
(2) Collusion Issues
Any oracle may face collusion among multiple nodes, including bribery or Sybil attacks. The greatest risk of node collusion to the oracle ecosystem is that colluding nodes can deliberately report false data for their own gain, thereby harming the interests of data requesters. Since data requesters can determine the identity and number of specific nodes, the anti-collusion capability of Chainlink is further questioned.
Currently, the proposed solution in the community is to allow smart contract developers to use secure random beacons to randomly select nodes from all nodes to provide data. Ethereum 2.0 can achieve secure random beacons, and the thousands of nodes on Ethereum minimize the possibility of collusion.
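Beacon-based node selection can be sketched as below, assuming the beacon value is available as bytes. The function `select_nodes` and the ranking-by-hash approach are illustrative assumptions, not the community proposal's actual design.

```python
import hashlib


def select_nodes(beacon: bytes, nodes: list, k: int) -> list:
    """Pick k nodes by ranking each node on the hash of beacon + node name.

    The selection is deterministic for a given beacon, so anyone can recompute
    and audit it, but it cannot be predicted before the beacon is published.
    """
    if k > len(nodes):
        raise ValueError("cannot select more nodes than exist")
    ranked = sorted(nodes, key=lambda name: hashlib.sha256(beacon + name.encode()).digest())
    return ranked[:k]


all_nodes = ["node-%d" % i for i in range(20)]  # hypothetical node names
committee = select_nodes(b"beacon-output-for-this-round", all_nodes, 5)
print(committee)
```

Because neither the requester nor the nodes choose the committee, colluding parties would need to control a large share of all nodes to be confident of controlling the selected subset.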
Nest
Nest is a decentralized oracle primarily serving DeFi protocols, attempting to address the issue of malicious node collusion. There are three roles in the Nest ecosystem: miners, validators, and data requesters. First, miners provide data (quotes) to the oracle and earn mining rewards in NEST.
Anyone can become a miner, but they must pay a certain commission when quoting; the higher the commission paid, the more mining rewards in NEST they can earn. Second, validators can choose to execute trades at deviated market prices when quotes deviate from market prices to earn profits. After a transaction, validators must enforce the quote but do not need to pay a commission and will not earn mining rewards. Third, data requesters call Nest quotes and pay fees.
Three points to note: first, under the Nest mechanism, miners quote first, and data requesters can only call afterwards, so the system is driven by supply rather than demand, which is different from Chainlink. Second, Nest essentially uses the arbitrage mechanism of validators to drive miners to provide real quotes, all conducted on-chain, meaning that whether miners engage in malicious behavior does not affect data requesters' ability to obtain real quotes.
Third, validators must initiate their own quotes at the same time as executing a transaction, and the scale of the quoted funds must be twice that of the validation funds. This mechanism prevents malicious validators from validating their own quotes for arbitrage. If a validator attempts to validate their own quote for arbitrage, the subsequent arbitrage costs will grow exponentially, and the arbitrage space will diminish.
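The exponential growth follows directly from the doubling rule: if each validation must be accompanied by a new quote twice the size of the funds just validated, a validator who keeps validating their own quotes has to double the committed funds every round. A minimal sketch of that arithmetic, with the hypothetical function name `required_quote_sizes` and an arbitrary starting size:

```python
def required_quote_sizes(initial: float, rounds: int, multiplier: float = 2.0) -> list:
    """Funds needed for each successive self-validation under the doubling rule."""
    sizes = []
    size = initial
    for _ in range(rounds):
        size *= multiplier  # the new quote must be twice the validated amount
        sizes.append(size)
    return sizes


# Starting from a quote of 10 units, four rounds of self-validation require:
print(required_quote_sizes(10.0, 4))  # [20.0, 40.0, 80.0, 160.0]
```

After n rounds the required quote is the initial size multiplied by 2 to the power n, so the capital needed to keep a manipulated price alive quickly outgrows the available arbitrage profit.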
The biggest problem with Nest is that the ecosystem is supply-driven. The premise for this game system to achieve accurate quotes is a large number of participants and frequent participation. When the scale of miners is insufficient, it cannot generate real data on-chain through the arbitrage mechanism. Moreover, all quoting actions are completed on-chain, and gas fees will be one of the cost considerations for miners.
If the ETH network experiences congestion and gas fees soar, miners may not be able to maintain a balance between profit and loss, affecting their willingness to quote. Currently, the main source of miner income is the commission from miner quotes rather than the fees for oracle calls, meaning that the value of the NEST token will only become prominent when there is significant arbitrage space in ETH prices.
At other times, the motivation for mining is clearly insufficient, and short-term sustainability is low. Only when the scale of data requesters increases can the internal value of the Nest ecosystem truly grow.
3. Development Directions of Decentralized Oracles
Compared to centralized oracles, decentralized oracles have higher costs and lower efficiency at limited scale. Therefore, decentralized oracles need to elevate their solutions from addressing blockchain data issues to solving trust issues in order to truly expand their applications. I believe the future practical application scenarios for decentralized oracles fall into three categories: first, scenarios with high demand for randomness; second, scenarios involving multiple institutions; third, scenarios for synthetic asset trading.
High Randomness
Applications involving high randomness on the blockchain include gambling and prediction platforms. The core of these platforms is randomness, unpredictability, and verifiability, creating a strong demand for decentralized oracles. Currently, many gambling DApps generate random numbers on-chain without the involvement of oracles.
However, in 2018, a gambling DApp on EOS was hacked due to random number issues, resulting in asset losses for the project. Because random number algorithms on the blockchain are publicly visible, their output can be predicted.
DApps can obtain more secure random numbers through two methods: first, by using oracle APIs to call for random numbers from external sources. Second, by utilizing Verifiable Random Functions (VRF) to generate a secure, unpredictable random number off-chain and directly return this random number to users.
Multi-Party Participation
Scenarios involving multiple parties are suitable for obtaining data through decentralized oracles, such as decentralized insurance. First, the data sources for decentralized insurance cover a wide range, such as flight delay insurance and health insurance, where a single case may require data from multiple sources like IoT, GPS systems, legal precedents, or hospital data.
For instance, in auto insurance claims, insurance companies often have disputes with clients over whether to pay claims. Since insurance companies have the final decision-making power, it inevitably leads to some clients withholding information. Auto insurance claims involve multiple data sources, and the investigation process can be lengthy, increasing operational costs and extending processing times.
Decentralized oracles can quickly obtain insurance-related data from different sources and upload the results and related data regarding whether to pay claims to the blockchain through off-chain aggregation. Additionally, the largest cost for insurance institutions is the cost of trust; when the guaranteed value exceeds the intrinsic value of centralized oracles, centralized oracles become difficult to trust.
Synthetic Assets
Synthetic assets have various design mechanisms, and as long as there are trading counterparts in the market, synthetic asset contracts can be established. Synthetic assets offer flexibility, allowing market participants to hedge risks that would otherwise be untradeable. The outcomes of synthetic asset transactions on the blockchain entirely depend on decentralized oracles, and smart contracts on the blockchain cannot discern whether the data sources are correct. Therefore, decentralized oracles are a necessary role in synthetic asset trading.
Decentralized oracles can leverage the advantages of multiple nodes to flexibly provide data sources for various synthetic asset contracts. Decentralized oracles have four methods to enhance the security of high-value contract data sources: first, economic incentives and penalty mechanisms; second, multi-node audits; third, the intrinsic value of decentralized oracles will cyclically increase as the ecosystem grows, enhancing security; fourth, high interoperability, enabling cross-oracle services.
Although decentralized oracles can support higher contract value limits compared to centralized oracles, they still need to address compliance issues with data sources.
Traditional centralized oracles providing data for legal synthetic asset trading, including government agencies, stock exchanges, and banks, are all subject to strict government regulation. The anti-censorship nature of decentralized oracles means that data requesters must fully trust the technology and mechanisms of the oracles. If the government intervenes to regulate oracles, the original intention of blockchain decentralization will be lost.
Therefore, whether future DeFi contracts add limiting clauses or decentralized oracles develop self-regulation, how decentralized oracles meet regulatory requirements will be a key factor in the development of DeFi. From a macro perspective on data source security, simultaneously employing multiple decentralized oracles can further achieve decentralization. In the future, multiple decentralized oracles will operate in parallel in the market, dispersing risks and creating a safer data supply environment.
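Employing several oracles at once can be sketched as taking the median across independent feeds. The feed labels, the function `cross_oracle_price`, and the `min_sources` parameter are hypothetical; the article does not describe a specific cross-oracle design.

```python
from statistics import median
from typing import Optional


def cross_oracle_price(feeds: dict, min_sources: int = 2) -> float:
    """Combine prices from several oracles; None marks a feed that is unavailable."""
    live = [price for price in feeds.values() if price is not None]
    if len(live) < min_sources:
        raise ValueError("too few live oracle feeds")
    return median(live)


# Hypothetical feed labels; one feed reports a manipulated price.
feeds = {"oracle_a": 100.0, "oracle_b": 102.0, "oracle_c": 500.0}
print(cross_oracle_price(feeds))  # 102.0
```

With a median over three feeds, a single compromised oracle cannot move the result beyond the range reported by the other two, which is the risk-dispersing effect described above.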