YZi Labs Invests in Gata: An AI "Data Mining" Project Explained in One Article | CryptoSeed
Author: Patti, ChainCatcher
Editor: TB, ChainCatcher
At the end of April 2025, Gata announced that it had closed a $4 million seed round, with investors including YZi Labs, IDG Blockchain, and Maelstrom Fund, among others. The raise has renewed community interest in the project's airdrop plan and put it on the hot-project lists of several "yield farming" communities.
According to official information, Gata, formerly known as Aggregata, is a decentralized AI data infrastructure platform dedicated to generating, distributing, and utilizing high-quality training data in a fairer and more efficient manner. The project has been favored by Binance Labs since its early stages, having been selected for its MVB (Most Valuable Builder) program and winning the "Innovation Excellence Award" at the BNB Chain ecosystem Catalyst Awards.
Decentralized AI Data Value Chain
Unlike traditional data platforms, Gata does not treat data merely as training material. It broadens the concept into an "AI asset" that covers datasets, models, intermediate weights, processes, and operating environments. Its core goal is to rebuild how AI training data is produced, used, and rewarded through decentralized means, so that more people can participate fairly in the AI economy.
To this end, Gata has built a relatively complete product lineup, from the user-facing "GPT-to-Earn" mechanism and the automated data agent tool "DataAgent" to a decentralized data marketplace and model training pipeline. Together these are gradually forming a closed loop: users generate data, the platform evaluates and selects it, models are trained on it, and users receive rewards.
Core Modules and Functions of the Platform
1. GPT-to-Earn
Gata's first product is GPT-to-Earn, a Chrome browser extension. When users interact with language models such as ChatGPT, the extension automatically uploads anonymous conversation data for later use in training, and users earn points for the data they upload.
2. DataAgent
DataAgent is the core tool of the Gata platform and is intended to replace traditional data annotation workflows. Users run specific DataAgent scripts that let AI automatically generate structured training data and assess its quality.
For example, the currently highlighted DVA (Data Validation Agent) automatically scores image-text paired datasets, separating data that is useful for training cutting-edge models such as Stable Diffusion and GPT-4o from data that is not.
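The score-and-filter idea behind a validation agent can be illustrated with a minimal Python sketch. This is not Gata's implementation: the `ImageTextPair` type, the `toy_score` function, and the 0.6 threshold are all hypothetical stand-ins, and a real agent would presumably use a model to judge how well a caption matches its image.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class ImageTextPair:
    """Hypothetical record: an image reference plus its caption."""
    image_id: str
    caption: str


def toy_score(pair: ImageTextPair) -> float:
    """Placeholder scorer (an assumption, not DVA's logic).

    Rates a caption by its word count, capped at 1.0 once it reaches
    five words. A real validator would score image-text alignment.
    """
    return min(len(pair.caption.split()) / 5.0, 1.0)


def partition_by_score(
    pairs: Iterable[ImageTextPair],
    score_fn: Callable[[ImageTextPair], float],
    threshold: float = 0.6,  # illustrative cutoff, not from the source
) -> Tuple[List[ImageTextPair], List[ImageTextPair]]:
    """Split pairs into (kept, dropped) according to their score."""
    kept: List[ImageTextPair] = []
    dropped: List[ImageTextPair] = []
    for pair in pairs:
        (kept if score_fn(pair) >= threshold else dropped).append(pair)
    return kept, dropped


sample = [
    ImageTextPair("img-001", "a red bicycle leaning against a brick wall"),
    ImageTextPair("img-002", "img"),
]
kept, dropped = partition_by_score(sample, toy_score)
```

Keeping the scorer as a pluggable function means the filtering step stays the same if the placeholder is swapped for a stronger model.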
3. Decentralized Data Storage and On-Chain Marketplace
Gata is built on BNB Chain's Greenfield network and uses its decentralized storage to keep data ownership clear and immutable. The platform has also developed an on-chain data marketplace where users can list and trade the data they generate. It even embeds fine-tuning tools and training clients, so that non-technical users can easily take part in the AI data economy.
Airdrop Participation Methods
Gata emphasizes "data as assets, participation as value." As a key component of community incentives, Gata has designed an airdrop plan around GPT-to-Earn and DataAgent. Users can participate in the following ways:
- Install the Chrome extension, authorize the upload of ChatGPT conversations, and link your EVM wallet.
- Run DVA, complete tasks through interactions with ChatGPT, and earn points.
- Connect Discord, X, and other social accounts to complete tasks, invite friends, and earn points.
Uploading data requires a small gas fee paid in BNB. Users can transfer BNB from the mainnet to the Greenfield network through the official cross-chain bridge.
Data Mining? Time Will Tell
In Web3, "data mining" has long since moved beyond its traditional meaning of data analysis and has become a new mechanism for capturing the value of user data. Whether on social protocols like Lens and CyberConnect, which put users' social behavior on-chain as assets, or on Ocean Protocol, where data is tokenized as NFTs that others can be authorized to use, "data as assets" is becoming a new paradigm.
Gata's GPT-to-Earn and DataAgent models are products of this trend. Although Gata hopes to build a complete mechanism for "everyone to mine data," challenges remain before a sustainable closed-loop data economy can truly form.
Judging from Gata's product lineup, the lightweight user entry point and the underlying infrastructure are taking shape. However, key areas such as data quality governance, incentive loop design, and real-world data utilization still need more technical and ecosystem support.
In the future AI economy, data will shift from platform monopoly to universal participation.
As a new concept, "data mining" is still at the stage of theoretical validation and mechanism refinement. Whether Gata can become a working example of this approach remains an open long-term question.
This article does not constitute investment advice. Because available information is limited, please do your own research (DYOR).

