Gradients: Decentralized AI Training Infrastructure in the Bittensor Ecosystem
Summary
Gradients is a decentralized AI training subnet (SN56) built on Bittensor. It aims to turn model training from a complex technical process into a market-driven, collaborative network process through three mechanisms: task publishing, miner competition, and validation screening. Architecturally, it combines AutoML with distributed computing power to form a training market organized around incentives, which lowers the threshold for using AI and improves the utilization of computing power. In terms of ecosystem and data, Gradients has completed its foundational network setup, but its incentive weight and capital inflow are currently limited. Gradients fills the training infrastructure gap in the TAO ecosystem and explores a new paradigm of "market-driven AI optimization," with long-term potential to become an important entry layer for decentralized AI training.
1. Starting from Web2 AutoML: The Current State and Limitations of AI Training
1.1 What is AutoML
Traditionally, training an AI model is a high-threshold task: engineers must prepare data, select models, tune parameters repeatedly, and evaluate results, which makes the whole process complex and time-consuming. AutoML (Automated Machine Learning) packages these cumbersome steps into a tool that creates models automatically. Users only need to provide data and specify a goal, such as classification, prediction, or recognition; the system handles the rest, including model selection, parameter tuning, and training optimization. This turns AI from a tool for a few specialized engineers into a capability that ordinary developers and even enterprises can use, an important step towards the popularization of AI.
1.2 Core Limitations of Traditional AutoML
Mainstream AutoML implementations are currently concentrated on cloud vendor platforms, such as Google Vertex AI and AWS SageMaker, which provide "AI training as a service." Although Web2 AutoML significantly lowers the threshold for using AI, its underlying model has clear limitations. The first is centralization: computing power, pricing, and rules are controlled by the platform, so users depend heavily on a single service provider and have little bargaining power. The second is cost, which is high and opaque: the GPU resources that AI training relies on are mainly held by cloud vendors, and pricing lacks market competition. The third, and most critical, is a ceiling on optimization efficiency. Traditional AutoML remains "a system helping you find the optimal solution," and however complex that system is, it still optimizes along a single technological path. Its exploration space is limited, which makes it difficult to try several completely different approaches at once. Web2 AI training is therefore a "closed system," in which model training, optimization, and resource scheduling all take place in an environment controlled by a single platform. This model is efficient, but its boundaries are becoming apparent as demand grows.
2. Gradients: Reconstructing AI Training with "Networks"
2.1 What is Gradients: A Decentralized AutoML Platform
In the previous section, we noted that the core problem of traditional Web2 AutoML is its "closed system": model training relies on the platform, optimization paths are limited, and resource flow is restricted. Gradients reconstructs this model. Originating from a decentralized engineer community initiated by WanderingWeights, Gradients is built on the Bittensor network and operates as an AI training subnet, Subnet 56. Unlike traditional platforms, it does not provide a centralized service; it breaks the training process down and hands it to an open network. Users only need to define the task, such as the model type and data, and the network handles the rest, including training execution, parameter optimization, and result screening. AI training is thereby abstracted from a complex engineering process into a simple one of "submitting requirements and obtaining results," which makes it closer to a general capability than a highly specialized technical task.
2.2 From Closed Systems to Open Collaboration: What Problems Does Gradients Solve
The core change Gradients makes is to turn a training process that was closed inside a single platform into an open, collaborative network process. Training tasks are no longer completed by a single system; they are distributed to multiple participants who attempt them in parallel, and the best results are selected through a unified evaluation mechanism. This structure reduces dependence on centralized service providers by basing training on distributed computing power. It also integrates dispersed GPU resources into one network, where competition produces a more market-like way of allocating resources. Most importantly, model optimization is no longer limited to a single path: parallel exploration of different methods keeps moving toward better solutions, which raises the overall optimization ceiling.
2.3 Essential Change: From Tool to "Training Market"
In traditional AutoML, the platform acts more like a tool, helping users find optimal solutions through internal algorithms. In Gradients, this process resembles a continuously operating "market": users publish demands, different participants compete around the same task, and results are filtered through an evaluation mechanism. Consequently, model performance no longer relies on a single system's capability but comes from ongoing competition and iteration among multiple participants. AutoML also shifts from a relatively closed technical optimization problem to a dynamic process driven by incentives, allowing optimization capabilities to expand as the number of participants increases. This change enables AI training to begin exhibiting self-evolution characteristics similar to a market.
2.4 Role in the TAO Ecosystem: AI Training Infrastructure Layer
In the subnet system of Bittensor, different Subnets undertake various functions such as inference, data processing, and training, with Gradients positioned at the training layer. It is responsible for converting dispersed computing power into actual model outputs and enables continuous scheduling and optimization of these resources through task distribution and evaluation mechanisms. At the same time, it connects computing power supply with model demand, transforming training from a mere resource consumption process into a network collaborative process that can be organized and optimized. Within this system, Gradients acts more like a central link, converting distributed resources into usable AI capabilities and supporting the development of upper-layer applications.
3. Core Architecture: How AI Training is Completed in the Network
In the previous section, we mentioned that Gradients transforms AI training from being "completed within a platform" to being "completed through network collaboration." So, how does this network operate specifically? The core of this section is to break down this process in a more intuitive way.
3.1 Distributed Training: How a Task is "Completed by Multiple People"
One can imagine Gradients as a continuously running "training collaboration network." When a user submits a training task, this task is not assigned to a single system but is simultaneously distributed to multiple participants in the network. These participants will attempt different training methods based on the same data and objectives and submit results within a specified time. Subsequently, the system will conduct a unified evaluation of these results and select the best-performing solutions. Ultimately, the better-performing results receive rewards, while other solutions are eliminated. From the user's perspective, this process only requires initiating a task, which is equivalent to simultaneously "calling" various optimization ideas and automatically selecting the optimal solution. The key to this approach lies not in the strength of individual nodes but in the parallel attempts by multiple people combined with automatic screening, allowing results to continuously approach the optimal.
In this network, there are primarily three types of participants: users, miners, and validators. Users are responsible for proposing training demands; miners provide computing power and attempt different training methods; validators are responsible for evaluating results and selecting the optimal models. This division of labor allows the training process to run continuously and constantly filter out better solutions. Overall, it constitutes a collaborative network driven by "demand, supply, and evaluation."
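The division of labor among users, miners, and validators can be sketched in a few lines of Python. This is a minimal illustration, not Gradients' actual code: the names Submission, run_task, and select_best are hypothetical, and the single "evaluation loss" score (lower is better) is an assumed stand-in for whatever metrics real validators use.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Submission:
    miner_id: str
    eval_loss: float  # assumed metric: lower is better


def run_task(miner_ids: List[str], train: Callable[[str], float]) -> List[Submission]:
    """User side: one task is sent to every miner, each returns one result."""
    return [Submission(miner_id, train(miner_id)) for miner_id in miner_ids]


def select_best(submissions: List[Submission]) -> Submission:
    """Validator side: score all submissions the same way and keep the best."""
    if not submissions:
        raise ValueError("no submissions to evaluate")
    return min(submissions, key=lambda s: s.eval_loss)


# Toy stand-in for real training: each miner's approach yields a fixed loss.
toy_losses = {"miner-a": 0.42, "miner-b": 0.31, "miner-c": 0.57}
submissions = run_task(list(toy_losses), lambda miner_id: toy_losses[miner_id])
winner = select_best(submissions)
```

In this toy run, winner is the submission from miner-b, because it has the lowest loss of the three made-up values. The point of the sketch is that the user calls run_task once, while the variety of approaches and the selection both happen on the network side.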
3.2 Market-Driven AutoML
From the mechanism breakdown above, it can be seen that Gradients does not simply move AutoML onto the blockchain but changes the underlying logic of model optimization by introducing multi-party participation and incentive mechanisms. Traditional AutoML relies on a single system to find optimal solutions within limited paths, while in Gradients, this process is expanded to the entire network: different participants continuously attempt various methods around the same task and iteratively filter and refine through unified evaluation. This makes model optimization no longer a one-time computational process but a dynamic process that can evolve repeatedly. Under this mechanism, better-performing results will receive higher rewards, thereby continuously attracting participants to optimize strategies and driving overall performance improvement.
4. Incentive and Competition Mechanisms: How AI Training Forms a "Positive Cycle"
4.1 Incentive Mechanism (TAO-Driven): From Training Behavior to Revenue Return
The key to Gradients' long-term operation is its incentive mechanism, which relies on the native incentive system provided by Bittensor. TAO is the native token of the Bittensor network and serves as the "value carrier" of the entire network. On one hand, it rewards participants who contribute computing power and models; on the other hand, it takes part in subnet weight distribution through staking and other means, which influences how resources flow between subnets.
The Bittensor mainnet continuously emits new TAO as incentives (currently about 3,600 TAO per day) and distributes it to the subnets according to certain rules. The amount each subnet receives depends on its "performance" within the whole network, such as activity level, contribution quality, and funding support. The TAO allocated to the Gradients subnet is then redistributed internally to its participants, on the core principle that those who contribute better models receive more rewards.
Specifically, miners submit training results, and validators test and score them. The system calculates each participant's "contribution weight" from these scores and distributes rewards according to that weight. Better-performing models (for example, those with stronger generalization ability and more stable performance) receive higher rewards, and validators whose scores are more accurate and better reflect true quality also receive more incentives. This design directly ties "doing better" to "earning more," which drives participants to keep optimizing their models.
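The idea of turning scores into a "contribution weight" and then into rewards can be illustrated with a simple proportional split. This is a sketch under a strong simplifying assumption: rewards are divided strictly in proportion to a single score per miner. Bittensor's real allocation runs through its own consensus and subnet-specific scoring, which this example does not attempt to reproduce; the function name distribute_rewards and the sample scores are hypothetical.

```python
from typing import Dict


def distribute_rewards(scores: Dict[str, float], pool: float) -> Dict[str, float]:
    """Split `pool` among participants in proportion to their scores.

    Assumption: one non-negative score per participant, and a purely
    proportional rule. Real subnet scoring is more involved.
    """
    total = sum(scores.values())
    if total <= 0:
        # Nobody earned a positive score, so nothing is paid out.
        return {name: 0.0 for name in scores}
    return {name: pool * score / total for name, score in scores.items()}


# Hypothetical validator scores for three miners and a pool of 100 units.
rewards = distribute_rewards({"miner-a": 3.0, "miner-b": 1.0, "miner-c": 0.0}, 100.0)
```

With these made-up scores, miner-a receives 75 units, miner-b receives 25, and miner-c receives nothing, which is the "doing better means earning more" relationship in its simplest form.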
4.2 Competition Between Subnets: Not Only Internal Competition but Also External Ranking
In addition to competition within the subnet, Gradients also faces "horizontal competition" across the entire Bittensor network. Because the distribution of TAO is dynamic, subnets compete with one another for higher weights. Only subnets that continuously produce high-quality results and attract more participants can obtain a larger share of rewards. The incentives Gradients receives therefore depend not only on internal model performance but also on its relative competitiveness within the whole ecosystem. The system forms a multi-layered cycle: models compete within each subnet, and subnets compete with each other on overall performance. Ultimately, computing power input, model effectiveness, and economic returns are bound together in a continuously operating positive feedback mechanism.
4.3 Gradients 5.0: From Competition to "Tournament Mechanism"
Building on its earlier continuous competition, Gradients has evolved into a more structured mechanism known as "tournament-style training." This can be understood as a periodic competition: each round of training sets a time window, multiple participants compete on the same task, and entries are eliminated step by step over several rounds of screening until the best solution remains. The format emphasizes periodic comparison and centralized evaluation. An important change is that miners no longer submit training results directly; they submit "training methods" (code), which validation nodes then execute uniformly. This improves fairness by removing the influence of differing computing environments, and it better protects the privacy of data and the training process. In addition, winning solutions are often retained as reusable methods, similar to an accumulating "best practices" library. In the long run, the mechanism not only selects the best models but also builds a continuously evolving library of training methods.
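A multi-round elimination of this kind can be sketched as follows. The example is illustrative only: the function run_tournament, the rule of keeping the top half each round, and the toy "training methods" are all assumptions, since the text does not specify the actual bracket format or scoring used by Gradients 5.0. What it does mirror is the key structural point above: miners hand over a method (here, a callable), and the validator side runs every method itself under the same conditions.

```python
from typing import Callable, Dict, List, Tuple

Method = Callable[[], float]  # a submitted "training method"; returns a score


def run_tournament(
    entries: Dict[str, Method],
    evaluate: Callable[[Method, int], float],
    keep_fraction: float = 0.5,
) -> Tuple[str, List[List[str]]]:
    """Eliminate entries round by round until a single winner remains.

    `evaluate` is executed by the validator for every surviving entry,
    so all methods are run in the same environment (higher is better).
    """
    if not entries:
        raise ValueError("tournament needs at least one entry")
    if not 0 < keep_fraction < 1:
        raise ValueError("keep_fraction must be strictly between 0 and 1")

    survivors = dict(entries)
    history: List[List[str]] = []
    round_no = 0
    while len(survivors) > 1:
        round_no += 1
        scores = {name: evaluate(fn, round_no) for name, fn in survivors.items()}
        ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
        keep = max(1, int(len(ranked) * keep_fraction))
        survivors = {name: survivors[name] for name in ranked[:keep]}
        history.append(ranked[:keep])
    return next(iter(survivors)), history


# Toy methods that just return a fixed score when the validator runs them.
entries = {
    "method-a": lambda: 0.10,
    "method-b": lambda: 0.90,
    "method-c": lambda: 0.50,
    "method-d": lambda: 0.30,
}
winner, history = run_tournament(entries, lambda fn, round_no: fn())
```

With four toy entries and half kept per round, the first round keeps method-b and method-c and the second leaves method-b as the winner. The history list plays the role of a very small "best practices" record: it shows which methods survived each round and could be retained for reuse.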
5. Ecosystem Status
5.1 Participant Structure: A Collaborative Network Composed of Demand, Supply, and Evaluation
The Gradients ecosystem consists of three core roles: users (demand side), miners (supply side), and validators (evaluation side). Users mainly include AI developers, small and medium-sized enterprises, and Web3 builders, who typically have a certain technical foundation but lack computing power or complete model training capabilities, thus preferring to use Gradients to complete model construction at a lower cost. Miners provide GPU computing power and participate in the competition for training tasks, with their core motivation being to obtain TAO rewards; validators are responsible for evaluating and ranking training results, playing a key role in ensuring model quality and the effective operation of the mechanism.
Looking more closely at the user profile, Gradients' actual user base has a clear "semi-developer" character: its users are neither top AI laboratories nor completely non-technical ordinary users, but mainly developers and Web3 technology users with some engineering capability. This is also reflected in its community, which is currently predominantly English-speaking, with core users concentrated in developer communities in North America and Europe, along with some miners in Southeast Asia and GPU resource providers worldwide. Overall, it resembles a technology-driven developer community.
5.2 Current Status of Ecosystem Operation
As of May 12, the price of Gradients' alpha token is approximately 0.0255 TAO, and the subnet has about 4,890 holding addresses, 243 miners, 12 validators, and an emission share of 1.61%. In its liquidity pool, TAO accounts for 2.19% and Alpha for 97.81%. Judging by price and the number of holding addresses, Gradients has built a certain user base and level of attention, but it is still in an early diffusion stage. For comparison, Chutes, the leading project in the TAO ecosystem, had an alpha token price of 0.0877 TAO and 13,409 holding addresses on the same day.

Figure 1. Gradients data. Source: https://bittensormarketcap.com/subnets/56
Next, consider emissions. In the Bittensor system, a subnet's emission share is its real-time allocation weight in the network's newly issued rewards. The Bittensor network continuously generates new TAO and distributes it to subnets according to these weights, so Gradients' current 1.61% means it receives only a small portion of the network's new incentives. This metric essentially reflects how the market "votes" on different subnets through capital flows (such as staking). A level of 1.61% therefore indicates that market recognition and capital inflow are currently limited, while also suggesting room for its weight to increase in the future. In terms of funding structure (the liquidity pool), TAO accounts for only 2.19% while Alpha is as high as 97.81%, which indicates that external capital inflow is still limited and that the current supply is predominantly internal to the subnet. Under these conditions, prices are sensitive to new capital: if more TAO flows in, the effect on price may be amplified.
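To make the 1.61% figure concrete, the short calculation below combines it with the network-wide emission of about 3,600 TAO per day quoted earlier. It is a back-of-the-envelope sketch that assumes both numbers stay fixed for a whole day and that the share applies directly to the daily total; in practice the weight is updated in real time, so the result is only an order-of-magnitude estimate. The function name subnet_daily_tao is hypothetical.

```python
DAILY_TAO_EMISSION = 3600.0  # network-wide figure quoted in the text (approximate)
EMISSION_SHARE = 0.0161      # Gradients' 1.61% emission share quoted in the text


def subnet_daily_tao(daily_emission: float, share: float) -> float:
    """Estimated TAO per day directed to a subnet with a fixed emission share."""
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be a fraction between 0 and 1")
    return daily_emission * share


gradients_daily = subnet_daily_tao(DAILY_TAO_EMISSION, EMISSION_SHARE)
```

Under these assumptions the subnet would be allocated roughly 58 TAO per day, which gives a sense of scale for the "small portion" described above.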
6. Competitive Landscape and Advantages and Disadvantages
6.1 Industry Positioning: Decentralized AutoML Training Infrastructure
Gradients is positioned in the niche track of "AI training infrastructure + decentralized AutoML." It aims to liberate model training from centralized platforms and achieve more efficient resource utilization and model optimization through networked mechanisms. In the Web2 system, this track is relatively mature, with typical representatives including Google Vertex AI and AWS SageMaker. These platforms provide one-stop model training and deployment services for developers through cloud computing, but their essence remains centralized architecture. In contrast, Gradients' differentiation lies not in "more features," but in its different underlying logic: it transforms training from "platform service" to "network collaboration" and selects optimal results through competitive mechanisms, making it closer to a market-operated training system.
6.2 Horizontal Comparison: Differences Between Web2 and Web3 AutoML
From a broader perspective, the difference between Web2 and Web3 AutoML comes down to two paradigms. The Web2 model emphasizes efficiency and stability, providing a controllable and mature service experience through centralized resources and engineering optimization. The Web3 model emphasizes openness and incentives, allowing model optimization to keep evolving through multi-party participation and competition. Concretely, Web2 AutoML resembles "a powerful tool": users hand tasks to the platform, and the system searches for the optimal solution internally. Web3 AutoML, represented by Gradients, resembles "an open market": users publish demands, different participants provide solutions, and results are filtered through evaluation mechanisms. The direct consequence is that the former is more stable and controllable but has limited optimization paths, while the latter has a larger exploration space and a higher potential ceiling but still has room to improve in stability and maturity.
6.3 Gradients' Differentiation in Web3
In the current Web3 AI track, most projects still focus on the inference layer or on AI agents, and relatively few focus on "training infrastructure." Some projects try to provide training capabilities by combining computing power networks or data networks, but most remain at the level of resource scheduling or computing power markets. Gradients differs in that it not only matches computing power but also reaches up into the "model optimization mechanism" itself, introducing evaluation and competition systems that let the training process keep evolving. It is thus solving not only "where the computing power comes from" but also "how to use this computing power more efficiently." In terms of positioning, Gradients is closer to a "training result-oriented" network than to a simple computing power market or tool platform, which is its core distinction from most Web3 AI projects.
6.4 Core Advantages: Mechanism-Driven Efficiency Improvement
Overall, Gradients' advantages mainly lie in its mechanism design. First, it lowers the usage threshold through task abstraction, allowing users to obtain model results without deeply engaging in complex training processes, thereby expanding the potential user base. Secondly, in terms of resources, the introduction of distributed computing power means that training no longer relies on a single cloud vendor, theoretically allowing for a more flexible cost structure through competition. More importantly, its method of optimization has changed. By enabling multiple participants to explore in parallel and combining screening mechanisms, Gradients provides a solution different from traditional single-path optimization, allowing models to achieve better performance in a shorter time. This "competition-driven optimization" model is its core advantage.
6.5 Potential Challenges
The first challenge is the stability of model quality. Decentralized training relies on multi-party participation, which can raise the ceiling but may also cause results to fluctuate, so it is less controllable than a centralized system. The second is enterprise-level trust. For enterprise users, data security and the verifiability of the training process are crucial, and ensuring in a decentralized environment that data is not misused and that results can be audited remains a key challenge. The third is reliance on token economics. Gradients' operation depends heavily on its incentive mechanism; if TAO rewards become less attractive, miner participation and overall network activity may decline. Its long-term sustainability therefore depends to some extent on whether the economic model can form a stable positive cycle.
7. Future Outlook: Can Decentralized AutoML Be Established?
Gradients is still at an early stage, and its future success depends on several key points. The most critical is whether it can keep attracting genuine training demand, rather than participation driven only by incentives. The next is model quality: whether the decentralized approach can reliably produce usable or even superior results. The last is whether the economic mechanism can form a positive cycle that maintains a long-term balance between computing power supply and revenue.
In the larger industry context, AI training is diverging into two paths. One is the Web2 model, dominated by leading tech companies, which continuously enhances model performance through centralized resources and engineering capabilities, with advantages in stability and maturity; the other is the Web3 path represented by Gradients, which allows more participants to jointly engage in model optimization through open networks and incentive mechanisms, continuously raising the ceiling through competition. The former is about "building a stronger system," while the latter resembles "constructing a self-evolving network."
From this perspective, Gradients' exploration represents a new possibility: AI training is no longer just a technical issue but a combination of "computing power + data + market mechanisms." If this model can be established, it has the potential to become the entry point for decentralized AI training and play a key infrastructure role in the Bittensor ecosystem. Of course, this direction still requires time to validate, but it has already provided an evolutionary approach to AutoML that differs from traditional paths.
References
Bittensor Documentation: https://docs.learnbittensor.org
Gradients website: https://www.gradients.io/
Gradients X: https://x.com/gradients_ai
Taostats: https://taostats.io/subnets/56/chart











