From Filecoin, Arweave to Walrus, Shelby: How far is the road to the popularization of decentralized storage?

Movemaker
2025-06-27 18:21:58
From coin logic to usage logic, Shelby's breakthrough may mark the end of an era—more importantly, the beginning of another era.

Author: @BlazingKevin_, Researcher at Movemaker

Storage has long been one of the top narratives in the industry. Filecoin, the leading project of the last bull market, once had a market cap exceeding $10 billion. Arweave, a comparable storage protocol, promoted permanent storage as its selling point and reached a market cap of $3.5 billion at its peak. However, as the practical value of cold-data storage has been called into question, so has the necessity of permanent storage, leaving a big question mark over whether the decentralized storage narrative can hold up. The emergence of Walrus has stirred the long-silent storage narrative, and now Aptos, in collaboration with Jump Crypto, has launched Shelby, aiming to push decentralized storage into the hot-data sector. So, can decentralized storage make a comeback and provide widespread use cases, or is it just another round of hype? This article analyzes the developmental trajectories of Filecoin, Arweave, Walrus, and Shelby, attempting to answer one question: how far is the path to the popularization of decentralized storage?

Filecoin: Storage is the Surface, Mining is the Essence

Filecoin is one of the earliest altcoins to emerge, and its development direction naturally revolved around decentralization, a common characteristic of early altcoins: seeking the meaning of decentralization in various traditional sectors. Filecoin is no exception; it links storage with decentralization, targeting the drawback of centralized storage: the trust assumption placed on centralized data storage service providers. What Filecoin does, therefore, is shift centralized storage to decentralized storage. However, what was sacrificed along the way to achieve decentralization became exactly the pain points that later projects like Arweave and Walrus aimed to address. To understand why Filecoin is merely a mining coin, one needs to grasp the objective limitations of its underlying technology, IPFS, which is not suited to hot data.

IPFS: Decentralized Architecture, Yet Stopped by Transmission Bottlenecks

IPFS (InterPlanetary File System) was launched around 2015, aiming to disrupt the traditional HTTP protocol through content addressing. Its biggest drawback is extremely slow retrieval: in an era when traditional data service providers achieve millisecond-level responses, retrieving a file from IPFS can still take several seconds. This makes it hard to promote in practical applications and explains why, apart from a handful of blockchain projects, it has rarely been adopted by traditional industries.
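To make "content addressing" concrete, here is a minimal Python sketch of the underlying idea: the address of a file is derived from a hash of its bytes, so anyone can verify retrieved content locally. This is only an illustration; real IPFS CIDs additionally use multihash/multibase encodings and chunk large files into a Merkle DAG, and the function names below are ours.

```python
import hashlib

def content_address(data: bytes) -> str:
    """Return a hash-derived address for a piece of content.

    This captures only the core idea of content addressing: the address is
    computed from the bytes themselves, so identical content always maps to
    the same address, and any retrieved bytes can be verified locally.
    """
    return "sha256-" + hashlib.sha256(data).hexdigest()

original = b"hello decentralized storage"
addr = content_address(original)

# Any node can serve the bytes; the requester re-hashes them to verify,
# so trust shifts from the server (location) to the content itself.
received = original  # imagine this arrived from an untrusted peer
assert content_address(received) == addr
print(addr)
```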

The underlying P2P protocol of IPFS is mainly suitable for "cold data", which refers to static content that does not change frequently, such as videos, images, and documents. However, when it comes to handling hot data, such as dynamic web pages, online games, or AI applications, the P2P protocol does not have a significant advantage over traditional CDNs.

That said, although IPFS itself is not a blockchain, its Merkle DAG (directed acyclic graph) design aligns closely with many public chains and Web3 protocols, making it a natural fit as a foundational framework for blockchains. So even without much practical value, it was sufficient as a framework to carry the blockchain narrative. Early altcoin projects only needed a workable framework to set sail, but as Filecoin developed, the hard limits imposed by IPFS began to hinder its progress.

The Mining Coin Logic Beneath the Storage Facade

The original design intention of IPFS was to allow users to store data while also being part of the storage network. However, without economic incentives, it is difficult for users to voluntarily use this system, let alone become active storage nodes. This means that most users will only store files on IPFS but will not contribute their storage space or store others' files. It is in this context that Filecoin was born.

Filecoin's token economic model mainly involves three roles: users who are responsible for paying fees to store data; storage miners who earn token incentives for storing user data; and retrieval miners who provide data when users need it and receive incentives.

This model leaves room for abuse. Storage miners can fill the space they offer with junk data simply to earn rewards. Since this junk data will never be retrieved, losing it never triggers the penalty mechanism, so miners can delete it and repeat the process. Filecoin's proof-of-replication consensus can only ensure that user data has not been privately deleted; it cannot prevent miners from padding their storage with junk data.

The operation of Filecoin largely relies on miners' continuous investment in the token economy, rather than on end-users' genuine demand for distributed storage. Although the project continues to iterate, at this stage, the ecological construction of Filecoin aligns more with the "mining coin logic" rather than the "application-driven" definition of storage projects.

Arweave: Success in Long-Termism, Failure in Long-Termism

If Filecoin's design goal is to build an incentivized, verifiable decentralized "data cloud" shell, then Arweave takes a different extreme direction in storage: providing the capability for permanent storage of data. Arweave does not attempt to build a distributed computing platform; its entire system revolves around a core assumption—important data should be stored once and forever remain on the network. This extreme long-termism makes Arweave fundamentally different from Filecoin in terms of mechanisms, incentive models, hardware requirements, and narrative perspectives.

Arweave looks to Bitcoin as a learning model, attempting to continuously optimize its permanent storage network over long cycles measured in years. Arweave does not care about marketing, nor does it concern itself with competitors and market trends. It simply continues to iterate on network architecture, indifferent to whether anyone pays attention, because this is the essence of the Arweave development team: long-termism. Thanks to its long-termism, Arweave was fervently embraced during the last bull market; and because of its long-termism, even after hitting rock bottom, Arweave may still survive several rounds of bull and bear markets. The question remains whether decentralized storage will have a place for Arweave in the future. The existence value of permanent storage can only be proven over time.

From mainnet version 1.5 to the recent version 2.9, Arweave has been committed to letting a broader range of miners participate in the network at minimal cost and incentivizing them to store as much data as possible, thereby continuously enhancing the robustness of the whole network. Fully aware that this conservative approach does not match market preferences, Arweave does not court the miner community, and its ecosystem has largely stagnated; it simply keeps upgrading the mainnet at minimal cost, continuously lowering hardware thresholds without compromising network security.

A Review of the Upgrade Path from 1.5 to 2.9

Version 1.5 of Arweave exposed a vulnerability where miners could rely on GPU stacking rather than actual storage to optimize block generation probabilities. To curb this trend, version 1.7 introduced the RandomX algorithm, limiting the use of specialized computing power and requiring general CPUs to participate in mining, thereby weakening computational centralization.

In version 2.0, Arweave adopted SPoA (Succinct Proofs of Access), turning data proofs into a compact Merkle-tree path and introducing format 2 transactions to reduce synchronization burdens. This architecture alleviated network bandwidth pressure and significantly enhanced node collaboration. However, some miners could still evade the responsibility of actually holding data through centralized high-speed storage-pool strategies.
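As a rough illustration of what a compact Merkle-path proof buys, the sketch below (plain Python, names ours) builds a Merkle tree over data chunks and verifies possession of one chunk with a logarithmic-size sibling path. It is a generic Merkle-proof demo under our own assumptions, not Arweave's actual SPoA implementation.

```python
import hashlib

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def merkle_root(leaves):
    """Build a Merkle root over leaf hashes (duplicating the last node on odd levels)."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_proof(leaves, index):
    """Collect the sibling hashes from leaf `index` up to the root."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        proof.append((level[index ^ 1], index % 2 == 0))  # (sibling, leaf_is_left)
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof

def verify(root, leaf, proof):
    """Recompute the root from one leaf plus its sibling path; O(log n) hashes."""
    node = h(leaf)
    for sibling, leaf_is_left in proof:
        node = h(node + sibling) if leaf_is_left else h(sibling + node)
    return node == root

chunks = [f"chunk-{i}".encode() for i in range(8)]
root = merkle_root(chunks)
proof = merkle_proof(chunks, 5)
assert verify(root, chunks[5], proof)  # prove possession of chunk 5 without sending all 8
```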

To correct this bias, version 2.4 introduced the SPoRA (Succinct Proofs of Random Access) mechanism, incorporating a global index and slow-hash random access, requiring miners to genuinely hold data blocks in order to produce valid blocks, and thereby mechanically weakening the effect of stacking compute. As a result, miners began to focus on storage access speed, driving adoption of SSDs and high-speed read/write devices. Version 2.6 introduced a hash chain to pace block generation, balancing the marginal benefit of high-performance hardware and giving small and medium miners fair room to participate.
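The hash-chain idea behind this pacing can be sketched in a few lines: because each step depends on the previous output, the chain must be computed sequentially, so stacking parallel hardware brings little advantage. The step count and names below are illustrative assumptions, not Arweave's real parameters.

```python
import hashlib, time

def hash_chain(seed: bytes, steps: int) -> bytes:
    """Each output depends on the previous one, so the work is inherently
    sequential: more parallel hardware does not let a miner skip ahead."""
    x = seed
    for _ in range(steps):
        x = hashlib.sha256(x).digest()
    return x

# One checkpoint of the chain "unlocks" one batch of mining attempts,
# which paces how often miners may sample their stored data for a solution.
seed = b"previous-block-hash"
t0 = time.time()
checkpoint = hash_chain(seed, 1_000_000)  # illustrative step count, not Arweave's
print(f"checkpoint {checkpoint.hex()[:16]} after {time.time() - t0:.2f}s")
```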

Subsequent versions further strengthened network collaboration and storage diversity: version 2.7 added collaborative mining and pool mechanisms to boost the competitiveness of small miners; version 2.8 introduced a composite packing mechanism that lets large-capacity, low-speed devices participate flexibly; and version 2.9 introduced a new packing process based on the replica_2_9 format, significantly improving efficiency, reducing computational dependence, and completing the closed loop of the data-oriented mining model.

Overall, Arweave's upgrade path clearly presents its long-term strategy oriented towards storage: continuously resisting the trend of computational centralization while lowering participation thresholds to ensure the long-term operational viability of the protocol.

Walrus: Is Embracing Hot Data Hype or a Hidden Gem?

From a design perspective, Walrus is completely different from Filecoin and Arweave. Filecoin's starting point is to build a decentralized, verifiable storage system, at the cost of being practical only for cold data; Arweave's starting point is to build an on-chain Library of Alexandria capable of storing data permanently, at the cost of supporting too few scenarios; Walrus's starting point is to optimize the storage cost of a hot-data storage protocol.

Magically Modified Erasure Codes: Cost Innovation or Old Wine in New Bottles?

In terms of storage cost design, Walrus believes that the storage costs of Filecoin and Arweave are unreasonable, as both adopt a fully replicated architecture, whose main advantage lies in each node holding a complete copy, possessing strong fault tolerance and independence among nodes. This type of architecture ensures that even if some nodes go offline, the network still maintains data availability. However, this also means that the system requires multiple copies of redundancy to maintain robustness, thereby driving up storage costs. Especially in Arweave's design, the consensus mechanism itself encourages node redundant storage to enhance data security. In contrast, Filecoin is more flexible in cost control, but at the expense of potentially higher data loss risks in some low-cost storage. Walrus attempts to find a balance between the two, enhancing availability through structured redundancy while controlling replication costs, thus establishing a new compromise path between data availability and cost efficiency.

RedStuff, the key technology created by Walrus to reduce node redundancy, originates from Reed-Solomon (RS) coding. RS coding is a classic erasure-code algorithm: by adding redundant fragments (parity), it expands a data set so that the original data can be reconstructed from a subset of the fragments. From CD-ROMs to satellite communications to QR codes, it is used constantly in daily life.

Erasure codes allow a user to take a block, say 1 MB in size, and "expand" it to 2 MB, where the extra 1 MB is special parity data. If any byte in the block is lost, it can easily be recovered through this parity; even if up to 1 MB of the block is lost, the whole block can still be recovered. The same technology lets a computer read all the data from a damaged CD-ROM.

The most commonly used erasure code today is RS coding. The implementation starts with k information blocks, constructs a polynomial from them, and evaluates it at different x-coordinates to obtain the coded blocks. With RS erasure codes, the probability that a random loss of fragments makes the data unrecoverable is very low.

For example: A file is divided into 6 data blocks and 4 parity blocks, totaling 10 pieces. As long as any 6 of them are retained, the original data can be fully restored.

Advantages: Strong fault tolerance, widely used in CD/DVD, fault-tolerant disk arrays (RAID), and cloud storage systems (such as Azure Storage, Facebook F4).

Disadvantages: Decoding calculations are complex and costly; not suitable for frequently changing data scenarios. Therefore, it is usually used for data recovery and scheduling in off-chain centralized environments.
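To make the k-of-n property concrete, below is a self-contained Python sketch of a Reed-Solomon-style code over a prime field, mirroring the 6-data/4-parity example above: any 6 of the 10 shares reconstruct the data. It illustrates the principle only; production systems (including the Storj/Sia/Walrus variants discussed next) use optimized finite-field arithmetic over byte blocks, and all names here are ours.

```python
# Minimal Reed-Solomon sketch over the prime field GF(p): the k data values
# are treated as polynomial coefficients, shares are evaluations of that
# polynomial at n distinct points, and any k shares recover the data via
# Lagrange interpolation.
P = 2**31 - 1  # a prime modulus; illustrative, not a production choice

def encode(data, n):
    """data: list of k field elements -> n shares (x, poly(x))."""
    def poly(x):
        acc = 0
        for c in reversed(data):          # Horner evaluation of the data polynomial
            acc = (acc * x + c) % P
        return acc
    return [(x, poly(x)) for x in range(1, n + 1)]

def decode(shares, k):
    """Recover the k data coefficients from any k shares by interpolation."""
    xs, ys = zip(*shares[:k])
    coeffs = [0] * k
    for j in range(k):
        # Build the Lagrange basis numerator prod_{m != j} (X - x_m) incrementally.
        num = [1]
        denom = 1
        for m in range(k):
            if m == j:
                continue
            num = [(a - xs[m] * b) % P for a, b in zip([0] + num, num + [0])]
            denom = (denom * (xs[j] - xs[m])) % P
        inv = pow(denom, P - 2, P)        # modular inverse of the denominator
        for d in range(k):
            coeffs[d] = (coeffs[d] + ys[j] * num[d] * inv) % P
    return coeffs

data = [11, 22, 33, 44, 55, 66]           # k = 6 data blocks
shares = encode(data, 10)                  # 6 data + 4 parity, n = 10
surviving = shares[2:8]                    # lose any 4 shares, keep any 6
assert decode(surviving, 6) == data
```

Losing any four of the ten shares, as in the `surviving` subset above, still leaves enough information to rebuild the original data; this is exactly the fault tolerance that the disadvantages above pay for with heavier decoding computation.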

In decentralized architectures, Storj and Sia have adjusted traditional RS coding to meet the actual needs of distributed networks. Walrus has also proposed its own variant based on this—the RedStuff coding algorithm, to achieve lower costs and more flexible redundancy storage mechanisms.

What is the biggest feature of RedStuff? By improving the erasure-coding algorithm, Walrus can quickly and robustly encode unstructured data blobs into smaller fragments and distribute them across a network of storage nodes. Even if up to two-thirds of the fragments are lost, the original data blob can be quickly reconstructed from the remaining fragments, while keeping the replication factor at only 4x to 5x.

Thus, it is reasonable to define Walrus as a lightweight redundancy and recovery protocol redesigned around decentralized scenarios. Compared to traditional erasure codes (like Reed-Solomon), RedStuff no longer pursues strict mathematical consistency but instead makes realistic trade-offs regarding data distribution, storage verification, and computational costs. This model abandons the immediate decoding mechanism required for centralized scheduling, instead verifying whether nodes hold specific data copies through on-chain proofs, thus adapting to a more dynamic and marginalized network structure.

The core design of RedStuff splits data into two categories of slices. Primary slices are used to restore the original data; their generation and distribution are strictly constrained, with a recovery threshold of f+1 and 2f+1 signatures required as an availability endorsement. Secondary slices are generated through simple operations such as XOR combinations, and serve to provide elastic fault tolerance and strengthen the overall robustness of the system. This structure essentially lowers the requirements on data consistency: it allows different nodes to temporarily store different versions of data and follows a practical path of "eventual consistency." Much like the lenient requirements for backfilling historical blocks in systems such as Arweave, this does reduce network burden, but it also weakens the guarantees of immediate data availability and integrity.
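As a highly simplified picture of this two-tier split, the sketch below shows how a cheap XOR "secondary" parity slice can repair a single lost "primary" slice without full decoding. It is a conceptual illustration under our own assumptions, not RedStuff's actual construction, thresholds, or slice layout.

```python
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def make_secondary(primaries):
    """Secondary slice = XOR of all primary slices (like single-parity RAID)."""
    return reduce(xor_bytes, primaries)

def repair(known_primaries, secondary):
    """Rebuild the one missing primary slice from the rest plus the parity."""
    return reduce(xor_bytes, known_primaries, secondary)

primaries = [b"slice-A!", b"slice-B!", b"slice-C!", b"slice-D!"]
secondary = make_secondary(primaries)

# Suppose slice C is lost on some node; it can be regenerated locally from
# the surviving slices instead of re-downloading the whole blob.
survivors = [primaries[0], primaries[1], primaries[3]]
assert repair(survivors, secondary) == primaries[2]
```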

It is important to note that while RedStuff achieves effective storage in low-compute, low-bandwidth environments, it remains in essence a variant of an erasure-code system. It sacrifices some determinism in data reads in exchange for cost control and scalability in decentralized environments, and whether this architecture can support large-scale, high-frequency interactive data scenarios remains to be seen. Moreover, RedStuff has not truly broken through the long-standing computational bottlenecks of erasure codes; rather, it sidesteps the tight coupling points of traditional architectures through structural strategies, so its innovation lies more in engineering-level combinatorial optimization than in a fundamental algorithmic breakthrough.

Thus, RedStuff is more like a "reasonable modification" made for the current decentralized storage reality. It indeed brings improvements in redundancy costs and operational loads, allowing edge devices and non-high-performance nodes to participate in data storage tasks. However, in large-scale applications, general computational adaptation, and scenarios with higher consistency requirements, its capability boundaries remain quite evident. This makes Walrus's innovation more of an adaptive transformation of the existing technical system rather than a decisive breakthrough in advancing the paradigm shift of decentralized storage.

Sui and Walrus: Can High-Performance Public Chains Drive Storage Practicality?

From Walrus's official research article, we can see its target scenario: "The original design intention of Walrus is to provide solutions for storing large binary files (Blobs), which are the lifeblood of many decentralized applications."

Large blob data typically refers to large, unstructured binary objects, such as videos, audio, images, model files, or software packages.

In the crypto context, it more often refers to NFTs, images, and videos in social media content. This also constitutes Walrus's main application direction.

  • Although the article also mentions potential uses for storing AI model datasets and data availability layers (DA), the recent downturn in Web3 AI has left very few related projects, and the number of projects that will truly adopt Walrus's protocol in the future may be very limited.
  • As for the DA layer direction, whether Walrus can serve as an effective alternative still needs to wait for mainstream projects like Celestia to reignite market attention to verify its feasibility.

Therefore, Walrus's core positioning can be understood as a hot storage system serving content assets like NFTs, emphasizing dynamic invocation, real-time updates, and version management capabilities.

This also explains why Walrus needs to rely on Sui: leveraging Sui's high-performance chain capabilities, Walrus can build a high-speed data retrieval network, significantly reducing operational costs without developing a high-performance public chain itself, thus avoiding direct competition with traditional cloud storage services in terms of unit costs.

According to official data, Walrus's storage costs are about one-fifth of traditional cloud services. Although it appears dozens of times more expensive than Filecoin and Arweave, its goal is not to pursue extremely low costs but to build a decentralized hot storage system usable in real business scenarios. Walrus itself operates as a PoS network, with the core responsibility of verifying the honesty of storage nodes, providing the most basic security guarantees for the entire system.

As for whether Sui truly needs Walrus, for now this remains largely at the level of ecosystem narrative. If financial settlement is the primary use case, Sui does not urgently need off-chain storage support. However, if it hopes to support more complex on-chain scenarios in the future, such as AI applications, content assetization, and composable agents, the storage layer becomes indispensable for providing content, context, and indexing capabilities. High-performance chains can handle complex state models, but those states need to be bound to verifiable data to build a trustworthy content network.

Shelby: Dedicated Fiber Network Completely Unleashes Web3 Application Scenarios

Among the biggest technical bottlenecks facing current Web3 applications, "read performance" has always been a difficult shortcoming to overcome.

Whether it is video streaming, RAG systems, real-time collaboration tools, or AI model inference engines, they all rely on low-latency, high-throughput hot data access capabilities. Decentralized storage protocols (from Arweave, Filecoin to Walrus) have made progress in data persistence and trustlessness, but because they operate over the public internet, they can never escape the limitations of high latency, unstable bandwidth, and uncontrollable data scheduling.

Shelby attempts to address this issue at its root.

First, the Paid Reads mechanism directly reshapes the "read operation" dilemma in decentralized storage. In traditional systems, reading data is almost free, and the lack of effective incentives leads service nodes to respond lazily and cut corners, so actual user experience lags far behind Web2.

Shelby links user experience directly to service node income by introducing a pay-per-read model: the faster and more reliably nodes return data, the more rewards they can earn.

This model is not an "ancillary economic design," but rather the core logic of Shelby's performance design—without incentives, there is no reliable performance; with incentives, there can be sustainable improvements in service quality.
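As a toy illustration of that logic, the sketch below pays a node per read in proportion to bytes delivered and discounts slow or unverifiable responses. The formula, prices, and field names are assumptions made up for illustration; Shelby's real pricing and verification are defined by its own protocol, not by this snippet.

```python
from dataclasses import dataclass

@dataclass
class ReadReceipt:
    bytes_served: int      # payload size actually delivered
    latency_ms: float      # measured time to serve the read
    verified: bool         # e.g., response matched the content commitment

PRICE_PER_GB = 0.50        # illustrative unit price paid by the reader
TARGET_LATENCY_MS = 200.0  # latency at which the full reward is paid

def read_reward(r: ReadReceipt) -> float:
    if not r.verified:
        return 0.0                                 # unverifiable reads earn nothing
    base = PRICE_PER_GB * r.bytes_served / 1e9     # pay for what was delivered
    speed_factor = min(1.0, TARGET_LATENCY_MS / max(r.latency_ms, 1.0))
    return base * speed_factor                     # slow responses earn less

fast = ReadReceipt(bytes_served=500_000_000, latency_ms=150, verified=True)
slow = ReadReceipt(bytes_served=500_000_000, latency_ms=900, verified=True)
print(read_reward(fast), read_reward(slow))        # the faster node earns more
```

The point of the sketch is simply that once reward is tied to measured service quality, fast and honest nodes strictly out-earn slow or dishonest ones.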

Secondly, one of the biggest technological breakthroughs proposed by Shelby is the introduction of a Dedicated Fiber Network, which is equivalent to building a high-speed rail network for the instant reading of hot data in Web3.

This architecture completely bypasses the public transport layer that Web3 systems generally rely on, directly deploying storage nodes and RPC nodes on a high-performance, low-congestion, physically isolated transmission backbone. This not only significantly reduces the latency of cross-node communication but also ensures the predictability and stability of transmission bandwidth. The underlying network structure of Shelby is closer to the dedicated line deployment model between AWS's internal data centers, rather than the "upload to a miner node" logic of other Web3 protocols.

Source: Shelby White Paper

This architectural reversal at the network level makes Shelby the first decentralized hot storage protocol capable of genuinely delivering a Web2-level user experience. Users reading a 4K video, invoking embedding data from a large language model, or tracing a transaction log on Shelby no longer have to endure the multi-second delays common in cold-data systems; they get sub-second responses. For service nodes, the dedicated network not only improves service efficiency but also significantly reduces bandwidth costs, making the pay-per-read mechanism genuinely economically viable and incentivizing the system to evolve towards higher performance rather than simply higher storage capacity.

It can be said that the introduction of the dedicated fiber network is the key support that allows Shelby to "look like AWS, but at its core is Web3." It not only breaks the natural opposition between decentralization and performance but also opens up real possibilities for Web3 applications in high-frequency reading, high-bandwidth scheduling, and low-cost edge access.

In addition, to balance data durability and cost, Shelby employs an Efficient Coding Scheme built with Clay Codes, achieving storage redundancy below 2x while maintaining eleven nines of durability and 99.9% availability. While most Web3 storage protocols still hover around 5x to 15x redundancy, Shelby is not only technically more efficient but also more cost-competitive. This means that for dApp developers who genuinely care about cost optimization and resource scheduling, Shelby offers a practical option that is both cheap and fast.
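A back-of-the-envelope comparison shows why the redundancy factor dominates cost: the snippet below computes the raw capacity needed per usable terabyte at sub-2x versus 5x to 15x redundancy. The raw-capacity price is an assumed placeholder, not a quoted figure from any protocol.

```python
# Raw capacity (and thus cost) needed per usable terabyte at the redundancy
# factors quoted above. COST_PER_RAW_TB is an assumption for illustration.
USABLE_TB = 1.0
COST_PER_RAW_TB = 20.0   # assumed $/TB-month of raw node capacity

for name, redundancy in [("sub-2x redundancy (Shelby's Clay-code target)", 1.9),
                         ("typical Web3 protocol, low end", 5.0),
                         ("typical Web3 protocol, high end", 15.0)]:
    raw = USABLE_TB * redundancy
    print(f"{name:45s} {raw:5.1f} TB raw -> ${raw * COST_PER_RAW_TB:.2f}/month")
```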

Conclusion

Looking at the evolutionary paths from Filecoin, Arweave, Walrus to Shelby, we can clearly see that: the narrative of decentralized storage has gradually shifted from a technological utopia of "existence is reasonable" to a realistic route of "usability is justice." Early Filecoin drove hardware participation through economic incentives, but genuine user demand has long been marginalized; Arweave chose extreme permanent storage but has become increasingly isolated amid an application ecosystem that has gone silent; Walrus attempts to find a new balance between cost and performance, but still leaves questions in the construction of landing scenarios and incentive mechanisms. It wasn't until Shelby emerged that decentralized storage first proposed a systematic response to "Web2-level usability"—from the dedicated fiber network at the transmission layer to the efficient erasure code design at the computation layer, and to the pay-per-read incentive mechanism, these capabilities, originally exclusive to centralized cloud platforms, are beginning to be reconstructed in the Web3 world.

The emergence of Shelby does not mean the problems are over. It has not solved every challenge: the developer ecosystem, permission management, and end-device access still lie ahead. But its significance lies in opening a possible path of "performance without compromise" for the decentralized storage industry, breaking the binary paradox of "either censorship-resistant or usable."

The path to the popularization of decentralized storage will ultimately not rely solely on conceptual hype or token speculation but must move towards an application-driven phase of "usable, integrable, and sustainable." In this phase, whoever can first address the genuine pain points of users will reshape the narrative landscape of the next round of infrastructure. From mining coin logic to usage logic, Shelby's breakthrough may mark the end of an era—yet it is also the beginning of another era.

About Movemaker

Movemaker is the first official community organization authorized by the Aptos Foundation, jointly initiated by Ankaa and BlockBooster, focusing on promoting the construction and development of the Aptos ecosystem in the Chinese-speaking region. As the official representative of Aptos in the Chinese-speaking area, Movemaker is committed to building a diverse, open, and prosperous Aptos ecosystem by connecting developers, users, capital, and numerous ecological partners.

Disclaimer:

This article/blog is for reference only, representing the author's personal views and does not represent the position of Movemaker. This article does not intend to provide: (i) investment advice or investment recommendations; (ii) offers or solicitations to buy, sell, or hold digital assets; or (iii) financial, accounting, legal, or tax advice. Holding digital assets, including stablecoins and NFTs, carries high risks, significant price volatility, and may even become worthless. You should carefully consider whether trading or holding digital assets is suitable for you based on your financial situation. For specific issues, please consult your legal, tax, or investment advisor. The information provided in this article (including market data and statistics, if any) is for general reference only. Reasonable care has been taken in compiling this data and charts, but no responsibility is accepted for any factual errors or omissions expressed therein.
