SlowMist × Bitget AI Security Report: Is it really safe to entrust money to AI Agents like "Lobster"?
I. Background
With the rapid development of large model technology, AI Agents are gradually evolving from simple intelligent assistants into automated systems capable of executing tasks autonomously. This change is particularly evident in the Web3 ecosystem: more and more users are involving AI Agents in market analysis, strategy generation, and automated trading, making the concept of a "24/7 automated trading assistant" a reality. With Binance and OKX launching multiple AI Skills, and Bitget introducing the Skills resource site Agent Hub and the installation-free Lobster client GetClaw, Agents can now connect directly to trading platform APIs, on-chain data, and market analysis tools, taking on trading decision-making and execution tasks that previously required human intervention.
Compared to traditional automated scripts, AI Agents possess stronger autonomous decision-making capabilities and more complex system interaction abilities. They can access market data, call trading APIs, manage account assets, and even extend functionality through plugins or Skills. This enhancement significantly lowers the barrier to using automated trading, allowing more ordinary users to start engaging with and using automated trading tools.
However, the expansion of capabilities also means an increase in attack surfaces.
In traditional trading scenarios, security risks typically focus on issues such as account credentials, API Key leaks, or phishing attacks. In the AI Agent architecture, new risks are emerging. For example, prompt injection could affect the decision-making logic of the Agent, malicious plugins or Skills could become new entry points for supply chain attacks, and improper configuration of the operating environment could lead to the misuse of sensitive data or API permissions. Once these issues are combined with automated trading systems, the potential impact may not be limited to information leakage but could directly result in real asset losses.
At the same time, as more users begin to connect AI Agents to trading accounts, attackers are quickly adapting to this change. New types of scams targeting Agent users, malicious plugin poisoning, and API Key abuse are gradually becoming new security threats. In the Web3 scenario, asset operations often involve high value and irreversibility; once an automated system is abused or misled, the risk impact may be further amplified.
Based on this background, SlowMist and Bitget jointly authored this report, systematically addressing the security issues of AI Agents in various scenarios from the perspectives of security research and trading platform practices. We hope this report can provide some security references for users, developers, and platforms, helping to promote a more robust development of the AI Agent ecosystem between security and innovation.
II. Real Security Threats of AI Agents | SlowMist
The emergence of AI Agents has shifted software systems from "human-led operations" to "model-involved decision-making and execution." This architectural change significantly enhances automation capabilities but also expands the attack surface. From the current technical structure, a typical AI Agent system usually consists of multiple components, including user interaction layers, application logic layers, model layers, tool invocation layers (Tools / Skills), memory systems, and underlying execution environments. Attackers often do not target a single module but attempt to influence the Agent's behavioral control through multiple layered paths.
1. Input Manipulation and Prompt Injection Attacks
In the AI Agent architecture, user inputs and external data are often directly incorporated into the model context, making prompt injection an important attack method. Attackers can construct specific instructions to induce the Agent to perform operations that should not be triggered. For example, in some cases, simply using chat commands can lead the Agent to generate and execute high-risk system commands.
A more complex attack method is indirect injection, where attackers hide malicious instructions within web content, document descriptions, or code comments. When the Agent reads these contents during task execution, it may mistakenly regard them as legitimate instructions. For instance, embedding malicious commands in plugin documentation, README files, or Markdown files could cause the Agent to execute attack code during environment initialization or dependency installation.
This attack mode is characterized by its reliance not on traditional vulnerabilities but on the model's trust mechanism regarding contextual information to influence its behavioral logic.
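One defense-in-depth measure against such injection is to screen external content for instruction-like patterns before it enters the model context. The sketch below is a minimal, illustrative filter; the pattern list is an assumption for demonstration purposes, and a real deployment would need far broader, continuously updated coverage.

```python
import re

# Illustrative patterns seen in indirect prompt-injection payloads.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    re.compile(r"curl[^\n]*\|\s*(ba)?sh", re.I),   # pipe-to-shell one-liners
    re.compile(r"[A-Za-z0-9+/]{120,}={0,2}"),      # long Base64-like blobs
]

def flag_external_content(text: str) -> list[str]:
    """Return lines of untrusted content matching a known injection pattern.

    Flagged content should be treated strictly as data, never as instructions.
    """
    return [
        line
        for line in text.splitlines()
        if any(p.search(line) for p in SUSPICIOUS_PATTERNS)
    ]
```

Pattern matching alone cannot catch paraphrased instructions, so such a filter is only one layer; the structural fix is to keep untrusted content out of the instruction channel entirely.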
2. Supply Chain Poisoning in Skills / Plugin Ecosystem
In the current AI Agent ecosystem, the plugin and Skills system (Skills / MCP / Tools) is an important way to extend Agent capabilities. However, this plugin ecosystem is also becoming a new entry point for supply chain attacks.
SlowMist's monitoring of ClawHub, the official plugin center for OpenClaw, found that as the number of developers increased, malicious Skills began to infiltrate the ecosystem. After merging and analyzing the IOCs of more than 400 malicious Skills, SlowMist discovered that many samples pointed to a small set of fixed domains, or to multiple random paths under the same IP, showing clear resource-reuse characteristics consistent with organized, bulk attack activity.
In OpenClaw's Skill system, the core file is usually SKILL.md. Unlike traditional code, these Markdown files often serve as "installation instructions" and "initialization entry points," but in the Agent ecosystem, they are often directly copied and executed by users, forming a complete execution chain. Attackers only need to disguise malicious commands as dependency installation steps, such as using curl | bash or Base64 encoding to hide the real instructions, to induce users to execute malicious scripts.
In actual samples, some Skills adopt a typical "two-stage loading" strategy: the first stage script is only responsible for downloading and executing the second stage Payload, thus reducing the success rate of static detection. For example, in a highly downloaded "X (Twitter) Trends" Skill, a segment of Base64 encoded commands is hidden in its SKILL.md.
Decoding the commands reveals that their essence is to download and execute a remote script.
The second stage program disguises itself as a system pop-up to obtain the user's password and collects local information, desktop documents, and files in the download directory, ultimately packaging and uploading them to a server controlled by the attacker.
The core advantage of this attack method is that the Skill shell itself can remain relatively stable, while the attacker only needs to change the remote Payload to continuously update the attack logic.
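The two-stage loading pattern described above can be screened for mechanically. The following sketch (illustrative only; the regexes and thresholds are assumptions) decodes Base64-looking blobs found in a SKILL.md and flags any that conceal a shell downloader:

```python
import base64
import re

B64_BLOB = re.compile(r"[A-Za-z0-9+/]{40,}={0,2}")              # Base64-looking runs
DANGEROUS = re.compile(r"(curl|wget)[^\n]*(\||;|&&)|/bin/(ba)?sh", re.I)

def scan_skill_md(text: str) -> list[str]:
    """Decode Base64-looking blobs in a SKILL.md and return any decoded
    content that hides a shell downloader (e.g. `curl ... | bash`)."""
    findings = []
    for blob in B64_BLOB.findall(text):
        try:
            decoded = base64.b64decode(blob, validate=True).decode("utf-8", "ignore")
        except Exception:
            continue  # not valid Base64 after all
        if DANGEROUS.search(decoded):
            findings.append(decoded)
    return findings
```

Static scanning like this only catches the first stage; because the second-stage Payload lives on a remote server, runtime egress monitoring is still needed.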
3. Risks in Agent Decision-Making and Task Orchestration Layers
In the application logic layer of AI Agents, tasks are usually broken down into multiple execution steps by the model. If an attacker can influence this breakdown process, it may lead to the Agent exhibiting abnormal behavior while executing legitimate tasks.
For example, in business processes involving multi-step operations (such as automated deployment or on-chain transactions), attackers can manipulate key parameters or interfere with logical judgments, causing the Agent to replace target addresses or execute additional operations during the execution process.
In previous security audit cases conducted by SlowMist, malicious prompt text returned through the MCP polluted the context, inducing the Agent to call the wallet plugin and execute on-chain transfers.
The characteristic of this type of attack is that the error does not stem from the model generating code but from the tampering of the task orchestration logic.
4. Privacy and Sensitive Information Leakage in IDE / CLI Environments
As AI Agents are widely used for development assistance and automated operations, many Agents begin to run in IDE, CLI, or local development environments. These environments typically contain a large amount of sensitive information, such as .env configuration files, API Tokens, cloud service credentials, private key files, and various access keys. If the Agent can read these directories or index project files during task execution, it may inadvertently incorporate sensitive information into the model context.
In some automated development processes, the Agent may read configuration files in the project directory during debugging, log analysis, or dependency installation. If there is no clear ignore policy or access control, this information may be recorded in logs, sent to remote model APIs, or even exfiltrated by malicious plugins.
Additionally, some development tools allow the Agent to automatically scan code repositories to establish contextual memory, which may also expand the exposure of sensitive data. For example, private key files, mnemonic backups, database connection strings, or third-party API Tokens may be read during the indexing process.
In Web3 development environments, this issue is particularly prominent, as developers often store test private keys, RPC Tokens, or deployment scripts in local environments. Once this information is obtained by malicious Skills, plugins, or remote scripts, attackers may further control developer accounts or deployment environments.
Therefore, in scenarios where AI Agents are integrated with IDE / CLI, establishing clear sensitive directory ignore policies (such as .agentignore, .gitignore mechanisms) and permission isolation measures is an important prerequisite for reducing the risk of data leakage.
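A minimal form of such an ignore policy can be sketched as a pattern filter applied before the Agent reads or indexes any file. The pattern list below is hypothetical, and the `.agentignore` semantics here are an assumption modeled on `.gitignore`-style globbing:

```python
import fnmatch

# Hypothetical .agentignore policy: paths the Agent must never read or index.
DEFAULT_IGNORE = [".env", "*.env", "*.pem", "*.key", "id_rsa*", "secrets/*", "*mnemonic*"]

def readable_by_agent(path: str, ignore=DEFAULT_IGNORE) -> bool:
    """Return False for any file matching a sensitive-path pattern."""
    return not any(fnmatch.fnmatch(path, pattern) for pattern in ignore)
```

In practice this check belongs in the tool invocation layer (the file-read tool itself), so that no prompt-level instruction can bypass it.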
5. Uncertainty in Model Layer and Automation Risks
AI models themselves are not completely deterministic systems, and their outputs exhibit a certain degree of probabilistic instability. The so-called "model hallucination" refers to the model generating seemingly reasonable but actually incorrect results in the absence of information. In traditional application scenarios, such errors usually only affect information quality, but in the AI Agent architecture, model outputs may directly trigger system operations.
For example, in some cases, the model did not query real parameters when deploying a project but generated an incorrect ID and continued executing the deployment process. If similar situations occur in on-chain transactions or asset operation scenarios, erroneous decisions may lead to irreversible financial losses.
6. High-Value Operation Risks in Web3 Scenarios
Unlike traditional software systems, many operations in the Web3 environment are irreversible. For example, on-chain transfers, Token Swaps, liquidity additions, and smart contract calls are typically difficult to revoke or roll back once transactions are signed and broadcasted to the network. Therefore, when AI Agents are used to execute on-chain operations, their security risks are further amplified.
In some experimental projects, developers have begun to try to involve Agents directly in on-chain trading strategy execution, such as automated arbitrage, fund management, or DeFi operations. However, if the Agent is influenced by prompt injection, context pollution, or plugin attacks during task breakdown or parameter generation, it may replace target addresses, modify transaction amounts, or call malicious contracts during the trading process. Additionally, some Agent frameworks allow plugins to directly access wallet APIs or signing interfaces. Without signature isolation or manual confirmation mechanisms, attackers may even trigger automatic trading through malicious Skills.
Therefore, in Web3 scenarios, fully binding AI Agents to asset control systems is a high-risk design. A safer model is usually to have the Agent only responsible for generating trading suggestions or unsigned transaction data, while the actual signing process is completed by an independent wallet or manual confirmation. At the same time, incorporating address reputation checks, AML risk control, and transaction simulations can also reduce the risks associated with automated trading to some extent.
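The "Agent proposes, human signs" split described above can be sketched as follows. All names and the per-trade cap are hypothetical; the point is that the Agent only constructs unsigned transaction data, while a separate policy decides when a human must confirm:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class UnsignedTx:
    """Transaction data the Agent may construct but can never sign itself."""
    to: str
    amount: float
    asset: str

MAX_AUTO_AMOUNT = 100.0  # hypothetical per-trade cap for unattended execution

def requires_manual_confirmation(tx: UnsignedTx, allowlist: set[str]) -> bool:
    """Escalate to a human signer when the amount exceeds the cap or the
    destination address is not on the pre-approved allowlist."""
    return tx.amount > MAX_AUTO_AMOUNT or tx.to not in allowlist
```

Because the private key never enters the Agent's process, even a fully compromised Agent can at worst propose transactions that the independent signer then rejects.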
7. System-Level Risks from High-Permission Execution
Many AI Agents have high system permissions in actual deployments, such as accessing the local file system, executing Shell commands, or even running with Root privileges. Once the behavior of the Agent is manipulated, its impact may far exceed a single application.
SlowMist has tested binding OpenClaw with instant messaging software like Telegram for remote control. If the control channel is taken over by an attacker, the Agent could be used to execute arbitrary system commands, read browser data, access local files, or even control other applications. Combined with the plugin ecosystem and tool invocation capabilities, such Agents have, to some extent, already exhibited characteristics of "intelligent remote control."
In summary, the security threats posed by AI Agents are no longer limited to traditional software vulnerabilities but span multiple dimensions, including model interaction layers, plugin supply chains, execution environments, and asset operation layers. Attackers can manipulate the behavior of the Agent through prompt manipulation, implant backdoors at the supply chain level through malicious Skills or dependencies, and further expand the attack impact in high-permission operating environments. In Web3 scenarios, due to the irreversibility of on-chain operations and the involvement of real asset values, these risks are often further amplified. Therefore, in the design and use of AI Agents, relying solely on traditional application security strategies is no longer sufficient to fully cover new attack surfaces; a more systematic security protection system needs to be established in areas such as permission control, supply chain governance, and transaction security mechanisms.
III. AI Agent Trading Security Practices | Bitget
As the capabilities of AI Agents continue to enhance, they are no longer just providing information or assisting in decision-making but are beginning to directly participate in system operations, even executing on-chain transactions. This change is particularly evident in the cryptocurrency trading scenario. More and more users are trying to involve AI Agents in market analysis, strategy execution, and automated trading. When Agents can directly call trading interfaces, access account assets, and automatically place orders, their security issues shift from "system security risks" to "real asset risks." How should users protect their accounts and funds when AI Agents are used for actual trading?
Based on this, this section introduces key security strategies to focus on when using AI Agents for automated trading, combining the practical experience of the Bitget security team from multiple perspectives, including account security, API permission management, fund isolation, and trading monitoring.
1. Major Security Risks in AI Agent Trading Scenarios
2. Account Security
With the emergence of AI Agents, the attack paths have changed:
No need to log into your account—just need to obtain your API Key.
No need for you to notice—Agents operate 24/7, and abnormal operations can continue for days.
No need for withdrawals—an attacker can deplete assets through trades executed directly on the platform, which is itself an attack target.
The creation, modification, and deletion of API Keys must be done through a logged-in account—if the account is compromised, the Key management rights are also compromised. The security level of the account directly determines the upper limit of API Key security.
What you should do:
Enable Google Authenticator as the primary 2FA instead of SMS (SIM cards can be hijacked).
Enable Passkey passwordless login: based on the FIDO2/WebAuthn standard, public-private key encryption replaces traditional passwords, making phishing attacks ineffective at the architectural level.
Set up anti-phishing codes.
Regularly check the device management center, immediately kick out unfamiliar devices and change passwords.
3. API Security
In the AI Agent automated trading architecture, the API Key serves as the "execution permission credential" for the Agent. The Agent itself does not directly hold account control; all operations it can execute depend on the permissions granted to the API Key. Therefore, the API permission boundaries determine what the Agent can do and the extent of potential losses in the event of a security incident.
Permission configuration matrix—minimum permissions, not convenient permissions:
In most trading platforms, API Keys typically support various security control mechanisms, which, if used properly, can significantly reduce the risk of API Key abuse. Common security configuration recommendations include:
User common mistakes:
Directly pasting the main account API Key into the Agent configuration—exposing full permissions of the main account.
Selecting "all" for business types for convenience, actually opening all operational scopes.
Not setting a Passphrase, or the Passphrase is the same as the account password.
Hardcoding the API Key in code, where it can be scraped within minutes of being pushed to GitHub.
Authorizing one Key to multiple Agents and tools, where any one being compromised fully exposes the Key.
Not revoking the Key immediately after a leak, allowing attackers to continue exploiting the window period.
Key lifecycle management:
Rotate API Keys every 90 days, immediately delete old Keys.
Immediately delete corresponding Keys when disabling Agents, leaving no residual attack surface.
Regularly check API call records, and immediately revoke if unfamiliar IPs or abnormal time periods are found.
4. Fund Security
The extent of damage an attacker can cause after obtaining an API Key depends on how much money that Key can manipulate. Therefore, when designing the trading architecture for AI Agents, in addition to account security and API permission control, a fund isolation mechanism should be implemented to set clear loss limits for potential risks.
Sub-account isolation mechanism:
Create dedicated sub-accounts for Agents, completely separate from the main account.
The main account only allocates the funds that the Agent actually needs, not all assets.
Even if the sub-account Key is stolen, the maximum amount the attacker can manipulate = funds within the sub-account, with no impact on the main account.
Manage multiple Agent strategies with multiple sub-accounts, isolating them from each other.
Fund password as a second lock:
The fund password is completely separate from the login password; even if the account is logged in, withdrawals cannot be initiated without the fund password.
Set the fund password and login password to different passwords.
Enable withdrawal whitelists: only pre-added addresses can withdraw, and new addresses require a 24-hour review period.
After modifying the fund password, the system automatically freezes withdrawals for 24 hours—this is a protective mechanism for you.
5. Trading Security
In AI Agent automated trading scenarios, security issues often do not manifest as one-time abnormal behaviors but may gradually occur during the continuous operation of the system. Therefore, in addition to account security and API permission control, it is also necessary to establish continuous trading monitoring and anomaly detection mechanisms to timely discover and intervene in issues at early stages.
Monitoring systems that must be established:
Anomaly signal identification—immediately stop and check in the following situations:
The Agent has been inactive for a long time, but new orders or positions appear in the account.
API call logs show requests from non-Agent server IPs.
Notifications of transactions from trading pairs that were never set up are received.
The account balance shows unexplained changes.
The Agent repeatedly prompts "more permissions are needed to execute"—first clarify why, then decide whether to authorize.
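Several of these signals can be checked mechanically against API call logs. The sketch below assumes a simple log-entry shape (a dict with `ip` and `hour` fields) and is illustrative only:

```python
def find_anomalous_calls(log_entries, known_ips, active_hours=range(0, 24)):
    """Flag API calls from unknown IPs, or calls outside the Agent's
    expected activity window. Each entry is a dict with 'ip' and 'hour'."""
    alerts = []
    for entry in log_entries:
        if entry["ip"] not in known_ips:
            alerts.append(("unknown_ip", entry))
        elif entry["hour"] not in active_hours:
            alerts.append(("off_hours", entry))
    return alerts
```

Run against the call records exported from the platform, any `unknown_ip` alert should trigger the troubleshooting path described later: revoke the Key first, investigate second.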
Management of Skill and tool sources:
Only install Skills released by official channels and reviewed.
Avoid installing third-party extensions from unknown or unverified sources.
Regularly review the list of installed Skills and delete those no longer in use.
Be wary of community "enhanced" or "localized" Skills—any unofficial version is a risk.
6. Data Security
AI Agents' decisions rely on a large amount of data (account information, positions, transaction history, market data, strategy parameters). If this data is leaked or tampered with, attackers may infer your strategies or even manipulate trading behaviors.
What you should do:
Minimum data principle: only provide the data necessary for the Agent to execute trades.
Sensitive data desensitization: logs and debugging information should not allow the Agent to output complete account information, API Keys, and other sensitive data.
Prohibit uploading complete account data to public AI models (such as public LLM APIs).
If possible, separate strategy data from account data.
Disable or restrict the Agent from exporting historical transaction data.
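Log desensitization can be approximated with a redaction pass before anything is written to logs or sent to a model API. The regex below is a simplified assumption that only catches obvious `api_key=` / `api-secret:` shapes; real secrets scanning needs far broader rules:

```python
import re

# Simplified pattern: matches "api_key=...", "API-Secret: ...", etc.
SECRET_RE = re.compile(r"(?i)(api[_-]?(key|secret)\s*[:=]\s*)(\S+)")

def redact(line: str) -> str:
    """Mask anything key-like before the line reaches logs or a model API."""
    return SECRET_RE.sub(lambda m: m.group(1) + "***REDACTED***", line)
```

Routing every outbound log line and every prompt through such a filter addresses the "printing API Key / Secret in Agent logs" mistake listed below.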
Common user mistakes:
Uploading complete trading history to AI for "help me optimize my strategy."
Printing API Key / Secret in Agent logs.
Posting screenshots of trading records in public forums (including order IDs, account information).
Uploading database backups to AI tools for analysis.
7. Security Design at the AI Agent Platform Level
In addition to user-side security configurations, the security of the AI Agent trading ecosystem largely depends on the security design at the platform level. A mature Agent platform typically needs to establish systematic protective mechanisms in areas such as account isolation, API permission control, plugin review, and basic security capabilities to reduce the overall risks faced by users when accessing automated trading systems.
Common security designs in actual platform architectures usually include the following aspects.
- Sub-account isolation system
In automated trading environments, platforms typically provide sub-account or strategy account systems to isolate funds and permissions of different automated systems. In this way, users can allocate independent accounts and fund pools for each Agent or trading strategy, avoiding the risks associated with multiple automated systems sharing the same account.
- Fine-grained API permission configuration
The core operations of AI Agents rely on API interfaces, so platforms typically need to support fine-grained control in API permission design, such as trading permission divisions, IP source restrictions, and additional security verification mechanisms. Through this permission model, users can grant the Agent only the minimum permission range necessary to complete tasks.
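Such a minimum-permission profile can be represented as a simple policy object checked on every call. The field names and the example IP are illustrative assumptions, not any specific platform's real API:

```python
# Hypothetical minimum-permission profile for a trading Agent's API Key.
AGENT_KEY_POLICY = {
    "read": True,         # market/account data the Agent needs
    "trade": True,        # place and cancel orders
    "withdraw": False,    # never grant withdrawal to an automated system
    "transfer": False,
    "ip_allowlist": ["203.0.113.10"],  # the Agent server only
}

def is_call_allowed(action: str, source_ip: str, policy=AGENT_KEY_POLICY) -> bool:
    """Deny by default: the action must be enabled AND the caller IP allowlisted."""
    return bool(policy.get(action)) and source_ip in policy["ip_allowlist"]
```

The deny-by-default shape matters: an action absent from the policy is refused, which mirrors the least-privilege principle the platform-side permission model is meant to enforce.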
- Agent plugin and Skill review mechanism
Some platforms set up review mechanisms for the release and listing processes of plugins or Skills, such as code reviews, permission assessments, and security testing, to reduce the likelihood of malicious components entering the ecosystem. From a security perspective, such review mechanisms add a layer of platform-level filtering to the plugin supply chain, but users still need to maintain basic security awareness regarding the installed extension components.
- Platform basic security capabilities
In addition to the security mechanisms related to Agents, the account security system of the trading platform itself also significantly impacts Agent users, for example through the two-factor authentication, anti-phishing codes, and device management capabilities described earlier.
8. New Types of Scams Targeting Agent Users
Impersonating customer service
"Your API Key has a security risk; please reconfigure it immediately." Then they provide you with a phishing link.
→ Officials will not proactively message you to request your API Key.
Poisoning Skill packages
Community shares "enhanced trading Skills," which silently send your Key during runtime.
→ Only install Skills from official reviewed channels.
Impersonating upgrade notifications
"You need to reauthorize," clicking leads to a counterfeit page.
→ Verify the anti-phishing code in official emails before clicking any link.
Prompt injection attacks
Embedding instructions in market data, news, or candlestick annotations to manipulate the Agent into executing unexpected operations.
→ Set sub-account fund limits; even if injected, losses have hard boundaries.
Malicious scripts disguised as "security detection tools"
Claiming to check if your Key has leaked, but actually stealing the Key.
→ Check API call situations through logs or access records provided by the official platform.
9. Troubleshooting Path
Discover any anomalies
↓
Immediately revoke or disable suspicious API Keys
↓
Check for abnormal orders/positions in the account; withdraw immediately if possible
↓
Check withdrawal records to confirm whether funds have been transferred out
↓
Change login password + fund password, kick out all logged-in devices
↓
Contact platform security support, providing the time of the anomaly and operation records
↓
Investigate the Key leak path (code repository / configuration files / Skill logs)
Core principle: In case of any suspicion, first revoke the Key, then investigate the cause; the order cannot be reversed.
IV. Recommendations and Summary
In this report, SlowMist and Bitget analyze typical security issues of current AI Agents in the Web3 scenario, including the risks of Prompt Injection manipulating Agent behavior, supply chain risks in the plugin and Skill ecosystem, API Key and account permission abuse issues, and potential threats from automated execution leading to misoperations and permission escalation. These issues are often not caused by a single vulnerability but are the result of the interaction of Agent architecture design, permission control strategies, and operational environment security.
Therefore, when building or using AI Agent systems, security design should be approached from an overall architectural perspective. Follow the principle of least privilege when assigning API Keys and account permissions to Agents, and avoid enabling unnecessary high-risk features. Implement permission isolation for plugins and Skills at the tool invocation layer, so that no single component simultaneously holds data acquisition, decision generation, and fund operation capabilities. Set clear behavioral boundaries and parameter limits for critical Agent operations, and add manual confirmation mechanisms where necessary to reduce the irreversible risks brought by automated execution. For the external inputs an Agent relies on, apply prompt design and input isolation mechanisms to prevent Prompt Injection attacks, avoiding the direct use of external content as system commands in the model inference process. During actual deployment and operation, strengthen API Key and account security management: enable only necessary permissions, set IP whitelists, rotate Keys regularly, and avoid storing sensitive information in plaintext in code repositories, configuration files, or logging systems. In development processes and operational environments, implement plugin security reviews, control sensitive information in logs, and establish behavior monitoring and auditing mechanisms to reduce the risks of configuration leaks, supply chain attacks, and abnormal operations.
On a more macro level of security architecture, SlowMist has proposed a multi-layer security governance approach for AI and Web3 intelligent agent scenarios, aiming to systematically reduce the risks of intelligent agents in high-permission environments by constructing a layered protection system. In this framework, L1 security governance first establishes a unified development and usage security baseline, providing a unified source of strategies and audit standards for teams when introducing AI toolchains by establishing security specifications covering development tools, Agent frameworks, plugin ecosystems, and operational environments. Based on this, L2 can effectively constrain the execution scope of high-risk operations through convergence of Agent permission boundaries, minimum permission control for tool invocation, and human confirmation mechanisms for critical behaviors. Meanwhile, L3 introduces real-time threat perception capabilities at the external interaction entry level, pre-checking external resources such as URLs, dependency repositories, and plugin sources to reduce the probability of malicious content or supply chain poisoning entering the execution chain; in scenarios involving on-chain transactions or asset operations, L4 on-chain risk analysis and independent signing mechanisms provide additional security isolation, allowing Agents to construct transactions without directly accessing private keys, thus reducing systemic risks associated with high-value asset operations. Finally, L5 establishes operational mechanisms such as continuous inspections, log audits, and periodic security reviews, forming a closed-loop security capability of "pre-execution pre-checks, in-execution constraints, and post-execution reviews." 
This layered security approach is not a single product or tool but a security governance framework aimed at AI toolchains and intelligent agent ecosystems, with the core goal of helping teams establish a sustainable, auditable, and evolvable Agent security operation system while effectively addressing the evolving security challenges in the context of the deep integration of AI and Web3.
Overall, AI Agents bring a higher degree of automation and intelligence to the Web3 ecosystem, but their security challenges cannot be ignored. Only by establishing comprehensive security mechanisms at multiple levels, such as system design, permission management, and operational monitoring, can potential risks be effectively reduced while promoting technological innovation in AI Agents. We hope this report can provide references for developers, platforms, and users when building and using AI Agent systems, jointly promoting the formation of a safer and more reliable Web3 ecosystem while advancing technological development.
Appendix
Extended Resources
An end-to-end Agent security deployment manual from the cognitive layer to the infrastructure layer, systematically outlining the security practices and deployment recommendations for high-permission AI Agents in real production environments.
https://github.com/slowmist/openclaw-security-practice-guide
A systematic security checklist for quickly auditing and strengthening Agent services, helping teams avoid missing key defense points when deploying MCPs/Skills and related AI toolchains.
https://github.com/slowmist/MCP-Security-Checklist
An open-source malicious MCP server example for reproducing real attack scenarios and testing the robustness of defense systems, useful for security research and defense validation.
https://github.com/slowmist/MasterMCP
A plug-and-play Agent skill package that provides professional cryptocurrency AML compliance and address risk analysis capabilities for AI Agents, useful for on-chain address risk assessment and pre-trade risk judgment.
https://github.com/slowmist/misttrack-skills
A comprehensive security solution for AI and Web3 intelligent agents, aiming to achieve a security closed loop of pre-execution pre-checks, in-execution constraints, and post-execution reviews through a "five-layer progressive digital fortress" architecture and ADSS governance baseline, along with capabilities such as MistEye, MistTrack, and MistAgent.
https://mp.weixin.qq.com/s/mWBwBANlD7UchU9SqDp_cQ
Trading Security Self-Check List
Before Access · OpenClaw Hardening
Before Access · Account Security
Before Access · Skill Security
Before Access · API Key Configuration
Before Access · Fund Isolation
During Operation · Monitoring
Deactivation/Replacement · Cleanup
✅ When all the above checks are completed, the overall security risk of the AI Agent automated trading system will be significantly reduced.