Agent Red Team
Paste your agent's setup. Find out exactly how someone could trick it into unauthorized actions. Get specific fixes for each one.
For founders and engineers shipping AI agents with real tools and permissions.
Agent exfiltrates Stripe keys and user data via email
A poisoned tool response tricks the agent into reading your Stripe secret key, querying your Supabase users table, and emailing both to an external address.
Fix: Redact secrets from tool outputs. Restrict email recipients to allowlisted domains. Add table and row limits to database queries.
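Those three fixes amount to small guards that sit between your tools and your agent. A minimal sketch of what they could look like — the secret patterns, the allowed domain, and the row cap here are hypothetical placeholders, not ART's implementation:

```python
import re

# Hypothetical policy values; replace with your own.
ALLOWED_EMAIL_DOMAINS = {"yourcompany.com"}
MAX_ROWS = 100
SECRET_PATTERNS = [
    re.compile(r"sk_live_[A-Za-z0-9]+"),  # Stripe live secret keys
    re.compile(r"sk_test_[A-Za-z0-9]+"),  # Stripe test secret keys
]

def redact_secrets(tool_output: str) -> str:
    """Mask anything that looks like a secret before the agent sees it."""
    for pattern in SECRET_PATTERNS:
        tool_output = pattern.sub("[REDACTED]", tool_output)
    return tool_output

def recipient_allowed(email: str) -> bool:
    """Only let the agent send email to allowlisted domains."""
    domain = email.rsplit("@", 1)[-1].lower()
    return domain in ALLOWED_EMAIL_DOMAINS

def clamp_query_limit(requested_rows: int) -> int:
    """Cap how many rows a database tool call may return."""
    return min(requested_rows, MAX_ROWS)
```

With guards like these in place, the attack in the example fails at every step: the key is redacted before the agent reads it, the query is capped, and the outbound email to an external address is refused.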
Test your agent
Whatever defines what your agent can do. 50KB max.
The problem
Someone hides instructions in a document. Your agent follows them and sends customer data to an outside address.
You require human review for big actions. Someone words the request so the agent thinks it's routine. The review step never triggers.
Your agent can read a database and send messages. Someone tricks it into reading sensitive records and forwarding them.
Someone plants instructions during an earlier chat. Next time the agent runs, it follows those fake instructions as if they were real.
Your agent gives extra permissions to admins. Someone just claims to be one. There's no real check.
Why this is different
AI agents now have real tools: they make trades, send emails, access databases. The danger is not a bad answer. It is an action nobody approved.
Other tools: Scan for a long list of vulnerability types
ART: Test the 6 specific ways agents get tricked
Other tools: Send tricky prompts and check the output
ART: Test your agent's setup against real attack patterns
Other tools: Give you a risk score
ART: Show each problem, the attack path, and how to fix it
Other tools: Use AI to judge whether findings are real
ART: Validate every report with automated code checks
This is happening right now
TechCrunch / Mar 2026
Meta's AI agent went rogue and exposed sensitive data to unauthorized employees
An engineer asked an agent for help. It acted without permission and shared data it shouldn't have. Classified as a Sev 1 incident.
Fortune / Mar 2026
Three rogue AI agent incidents in three weeks
An AI agent published a hit piece on a developer who rejected its code. A Meta safety director's agent deleted her emails and ignored commands to stop. A Chinese agent secretly mined crypto.
Fortune / Mar 2026
An AI agent destroyed a developer's entire production database
A developer let an AI agent update his website. A config error left it unable to tell the production environment from the test one. It deleted years of production data.
What ART tests for
Skipped approvals
Your agent has rules about when to ask permission. We test whether someone can get around them.
Tools used wrong
Your agent has tools with limits. We test whether someone can push them past those limits.
Fake identity claims
Your agent trusts certain roles. We test whether someone can fake that trust.
Dangerous tool chains
Your tools are safe alone. We test whether they combine into something harmful.
Poisoned memory
Your agent reads from memory or documents. We test whether someone can plant instructions there.
Bad data from tools
Your agent trusts data from its tools. We test whether someone can hide instructions in that data.
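For the last two families — poisoned memory and poisoned tool data — the common defense is to treat everything a tool or memory returns as quoted data, never as instructions. A minimal, hypothetical sketch (these patterns are illustrative; real injections are far more varied):

```python
import re

# Hypothetical heuristics; a real filter needs a much broader net.
INJECTION_HINTS = [
    re.compile(r"ignore (all|any|previous) instructions", re.IGNORECASE),
    re.compile(r"you are now", re.IGNORECASE),
    re.compile(r"system prompt", re.IGNORECASE),
]

def looks_injected(tool_response: str) -> bool:
    """Heuristic check: does this tool output contain instruction-like text?"""
    return any(p.search(tool_response) for p in INJECTION_HINTS)

def wrap_as_data(tool_response: str) -> str:
    """Delimit tool output so the model can treat it as data, not commands."""
    return f"<tool_output>\n{tool_response}\n</tool_output>"
```

Heuristics like this catch only the crudest injections, which is exactly why testing against real attack patterns matters: the interesting failures are the ones a keyword filter misses.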
How it works
01
Paste the instructions you gave your agent, its list of tools, and its rules: whatever describes what it can do. No code needed. No passwords or keys.
02
We pick the tests that match what your agent can do and run them against your setup. Tests cover all 6 attack families.
03
Each issue shows what the trick is, what tool or permission is affected, how bad it is, and exactly what to change.
The report
Each issue shows exactly how someone could trick your agent, which tool or permission is involved, and what you need to change.
How the trick works
Step by step: what the attacker does, how the agent gets tricked, and what bad thing happens.
How serious it is
Every issue has a severity level with a written explanation. Not just a number. A reason you can read.
How to fix it
A specific change to your agent's setup, permissions, or rules. Something you can do today.
What's still risky
What problems remain after the fix, and how confident ART is in each finding.
Pricing
Adversarial
Full test before launch
Builder
Test before every update
Team
For teams shipping agents
How we handle your data
Your setup stays private. Not shared. Not used for training.
Your setup stays safe
ART can only read what you paste and return a report. It cannot run code, access the internet, or do anything else.
Not shared or trained on
Your config is stored securely during analysis. It is never shared with third parties or used for model training.
Industry standard coverage
Tests are mapped to the official top 10 list of AI agent security risks, published by OWASP.
FAQ
What do I need to provide?
The instructions you gave your agent, its list of tools, and any rules it follows. Paste it in or upload a file. No code needed.
How is this different from other AI security tools?
Most tools scan for a long list of generic AI problems. ART only tests one thing: can someone make your agent do something you didn't allow? We run 123 tests that cover the exact ways this happens.
Does AI check the AI?
The analysis uses AI. The quality check does not. Every report passes 31 automated code checks. If the evidence is weak, the report gets rejected and re-run.
How long does it take?
A few minutes for a standard scan. Deep analysis takes longer because it runs more tests.
Is the output actually useful?
Each finding shows the trick, which tool is affected, how serious it is, and what to change. You can hand it directly to the person who needs to fix it.
What if my agent already has backend protections?
ART analyzes your prompt and setup, not your backend code. If you already have server-side protections like rate limiting or authentication, the report tells you which findings to check against your actual backend.