This repository contains a red team-oriented catalog of attack vectors targeting AI systems including autonomous agents (MCP, LangGraph, AutoGPT), RAG pipelines, vector databases, and embedding-based retrieval systems, along with individual proof-of-concepts (PoCs) for each.
catalog/AgentNull_Catalog.mdβ Human-readable threat catalogcatalog/AgentNull_Catalog.jsonβ Structured version for SOC/SIEM ingestionpocs/β One directory per attack vector, each with its own README, code, and sample input/output
This repository is for educational and internal security research purposes only. Do not deploy any techniques or code herein in production or against systems you do not own or have explicit authorization to test.
Navigate into each pocs/<attack_name>/ folder and follow the README to replicate the attack scenario.
For enhanced PoC demonstrations without API costs, use Ollama with local models:
# Linux/macOS
curl -fsSL https://ollama.ai/install.sh | sh
# Or download from https://ollama.ai/download# Pull a lightweight model (recommended for testing)
ollama pull gemma3
# Or use a more capable model
ollama pull deepseek-r1
ollama pull qwen3# Advanced Tool Poisoning with real LLM
cd pocs/AdvancedToolPoisoning
python3 advanced_tool_poisoning_agent.py local
# Other PoCs work with simulation mode
cd pocs/ContextPackingAttacks
python3 context_packing_agent.py- Default endpoint:
http://localhost:11434 - Model selection: Edit the model name in PoC files if needed
- Performance: Llama2 (~4GB RAM), Mistral (~4GB RAM), CodeLlama (~4GB RAM)
- β Full-Schema Poisoning (FSP) - Exploit any field in tool schema beyond descriptions
- β Advanced Tool Poisoning Attack (ATPA) - Manipulate tool outputs to trigger secondary actions
- β MCP Rug Pull Attack - Swap benign descriptions for malicious ones after approval
- β Schema Validation Bypass - Exploit client validation implementation differences
- Tool Confusion Attack - Trick agents into using wrong tools via naming similarity
- Nested Function Call Hijack - Use JSON-like data to trigger dangerous function calls
- Subprompt Extraction - Induce agents to reveal system instructions or tools
- Backdoor Planning - Inject future intent into multi-step planning for exfiltration
- Recursive Leakage - Secrets leak through context summarization
- Token Gaslighting - Push safety instructions out of context via token spam
- Heuristic Drift Injection - Poison agent logic with repeated insecure patterns
- β Context Packing Attacks - Overflow context windows to truncate safety instructions
- β Cross-Embedding Poisoning - Manipulate embeddings to increase malicious content retrieval
- β Index Skew Attacks - Bias vector indices to favor malicious content (theoretical)
- β Zero-Shot Vector Beaconing - Embed latent activation patterns for covert signaling (theoretical)
- β Embedding Feedback Loops - Poison continual learning systems (theoretical)
- Hidden File Exploitation - Get agents to modify
.env,.git, or internal config files
- Function Flooding - Generate recursive tool calls to overwhelm budgets/APIs
- Semantic DoS - Trigger infinite generation or open-ended tasks
- EchoLeak (LLM Scope Violation) - Zero-click indirect prompt injection exfiltrating data via M365 Copilot (Aim Labs / Varonis, CVE-2025-32711, June 2025)
- Gemini Trifecta - Three-vector indirect prompt injection across Google Gemini surfaces (Tenable Research, October 2025)
- CometJacking - URL-based prompt injection hijacking agentic browser connected services (LayerX / Brave Security, August 2025)
- Tainted Memories - CSRF-based persistent memory injection via AI browser auth (LayerX, October 2025)
- DECEPTICON - Dark patterns manipulate web agents more effectively than humans; larger models MORE susceptible (arXiv:2512.22894, December 2025)
- Parallel Poisoned Web - Agent-specific web cloaking serves different content to AI agents vs. humans (JFrog, arXiv:2509.00124, September 2025)
- Multi-Agent Control-Flow Hijacking (MAS-CFH) - Fake error messages hijack orchestrator re-planning (COLM 2025, arXiv:2503.12188)
- Inter-Agent Trust Exploitation - Malicious payloads laundered through peer agents bypass direct injection defenses (arXiv:2507.06850, July 2025)
- A2A Protocol Exploitation - Agent Card spoofing, token abuse, cascading delegation in Google A2A (CSA, Semgrep, arXiv:2505.12490)
- Prompt Infection - Self-replicating LLM-to-LLM worm propagation across multi-agent systems (arXiv:2410.07283, conferences 2025)
- MemoryGraft - Poisoned experience retrieval in agent memory systems (arXiv:2512.16962, December 2025)
- Delayed Tool Invocation - Conditional deferred injection triggered by natural user responses (Johann Rehberger / Embrace The Red, February 2025)
- Zombie Agents - Self-reinforcing persistent injection survives across sessions via memory self-replication (arXiv:2602.15654, February 2026)
- MINJA - Memory injection via query-only interaction, 98.2% success rate (arXiv:2503.03704, NeurIPS 2025)
- Slopsquatting - Register hallucinated package names as malicious supply chain packages (USENIX Security 2025)
- Marketplace Skill Poisoning (OpenClaw / ClawHavoc) - Unvetted skill registries exploited for credential theft and RCE (SecurityScorecard, Sangfor, JanuaryβFebruary 2026)
- MCP Supply Chain Backdoor - Compromised npm package silently BCCs emails to attacker (Authzed, 2025)
- s1ngularity - AI CLI weaponization via supply chain for automated credential theft (Snyk, GitGuardian, August 2025)
- LangGrinch - Serialization injection in LangChain enables prompt injection β RCE chain (CVE-2025-68664, Cyata, December 2025)
- MCP Sampling Exploitation - Covert tool invocation, conversation hijacking via MCP sampling feature (Unit 42, December 2025)
- Reasoning-Assisted Sandbox Escape - Agent autonomously reasons past sandbox controls (Ona Security, 2025)
- Semantic Privilege Escalation - Agent takes unauthorized actions while passing every access control check (Acuvity, late 2025)
- Phantom - Structural template injection creates fabricated conversation history via chat delimiters (arXiv:2602.16958, February 2026)
- STAC - Sequential tool-chain attack composes benign tool calls into dangerous sequences, 90%+ ASR (arXiv:2509.25624, September 2025)
- Policy Puppetry - Universal jailbreak via config/policy file formatting, all frontier models vulnerable (HiddenLayer, April 2025)
- Promptware Kill Chain - Seven-stage kill chain for prompt-injection malware, 21 real-world incidents documented (arXiv:2601.09625, January 2026)
- Chain-of-Thought Hijacking - Primes reasoning models with harmless puzzles before harmful requests, 99% ASR (arXiv:2510.26418, October 2025)
- Visual Prompt Injection - Visually embedded instructions in UIs hijack computer-use agents via screenshots (VPI-Bench, arXiv:2506.02456, June 2025)
- CrossInject - Cross-modal adversarial perturbations across vision + language simultaneously (arXiv:2504.14348, ACM MM 2025)
- Flashboom - Blinds LLM code auditors via high-attention distraction snippets, 96.3% success (IEEE S&P 2025)
- ToolHijacker - Inject malicious tool documents to compel agent tool selection, 96.43% ASR (NDSS 2026, arXiv:2504.19793)
- UDora - Reasoning trace hijacking via automated injection point discovery (arXiv:2503.01908, February 2025)
- Rule File Injection ("Your AI, My Shell") - Prompt injection via .cursorrules and copilot-instructions.md (arXiv:2509.22040, September 2025)
- ZombAI - Self-propagating worm turns coding agents into malware/C2 endpoints (CVE-2025-53773, Embrace The Red, 2025)
- AgentFlayer - Zero-click enterprise agent exploit via hidden document instructions (Zenity, Black Hat USA 2025)
- Email Agent Hijacking - Remote control of email agents via malicious email content, 1,404/1,404 hijacked (arXiv:2507.02699, July 2025)
- Gemini Calendar Worm - Calendar invite prompt injection with worm-like self-propagation (SafeBreach, Black Hat USA 2025)
- CorruptRAG - Single-document RAG poisoning with no triggers needed (arXiv:2504.03957, April 2025)
The attack vectors marked with β represent novel concepts primarily developed within the AgentNull project, extending beyond existing documented attack patterns.
- Recursive Leakage: Lost in the Middle: How Language Models Use Long Contexts
- Heuristic Drift Injection: Poisoning Web-Scale Training Data is Practical
- Tool Confusion Attack: LLM-as-a-judge
- Token Gaslighting: RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture
- Function Flooding: Denial-of-Service Attack on Test-Time-Tuning Models
- Hidden File Exploitation: OWASP Top 10 for Large Language Model Applications
- Backdoor Planning: Backdoor Attacks on Language Models
- Nested Function Call Hijack: OWASP Top 10 for Large Language Model Applications
- EchoLeak: Varonis EchoLeak Analysis β CVE-2025-32711
- MemoryGraft: Persistent Memory Poisoning in AI Agents
- MCP Sampling Exploitation: Unit 42 MCP Attack Vectors
- Gemini Trifecta: Tenable β Three New Gemini Vulnerabilities
- CometJacking: LayerX β CometJacking
- Tainted Memories: LayerX β ChatGPT Atlas Browser Vulnerability
- MAS Control-Flow Hijacking: Control-Flow Hijacking in Multi-Agent Systems
- Inter-Agent Trust Exploitation: The Dark Side of LLMs
- Delayed Tool Invocation: Embrace The Red β Gemini Memory Persistence
- Slopsquatting: Trend Micro β When AI Agents Hallucinate Malicious Packages
- OpenClaw/ClawHavoc: Sangfor β OpenClaw AI Agent Security Risks
- Semantic Privilege Escalation: Acuvity β The Agent Security Threat Hiding in Plain Sight
- MCP Supply Chain Backdoor: Authzed β Timeline of MCP Breaches
- Reasoning-Assisted Sandbox Escape: Ona β How Claude Code Escapes Its Own Sandbox
- A2A Protocol Exploitation: CSA β Threat Modeling Google's A2A Protocol
- OWASP LLM01 Prompt Injection: OWASP GenAI β LLM01: Prompt Injection
- Phantom: Automating Agent Hijacking via Structural Template Injection
- STAC: When Innocent Tools Form Dangerous Chains to Jailbreak LLM Agents
- VPI-Bench: Visual Prompt Injection Attacks for Computer-Use Agents
- CrossInject: Manipulating Multimodal Agents via Cross-Modal Prompt Injection
- Zombie Agents: Persistent Control via Self-Reinforcing Injections
- MINJA: Memory Injection Attacks on LLM Agents via Query-Only Interaction
- ToolHijacker: Prompt Injection Attack to Tool Selection in LLM Agents (NDSS 2026)
- UDora: Unified Red Teaming by Dynamically Hijacking Their Own Reasoning
- Chain-of-Thought Hijacking: arXiv:2510.26418
- Email Agent Hijacking: Control at Stake: Security of LLM-Driven Email Agents
- DECEPTICON: How Dark Patterns Manipulate Web Agents
- Promptware Kill Chain: How Prompt Injections Evolved Into a Multistep Malware Delivery Mechanism
- Rule File Injection: Demystifying Prompt Injection on Agentic AI Coding Editors
- CorruptRAG: Practical Poisoning Attacks against RAG
- Prompt Infection: LLM-to-LLM Prompt Injection within Multi-Agent Systems
- Parallel Poisoned Web: JFrog β Agent-Specific Web Cloaking
- Flashboom: Blinding LLM-based Code Auditors (IEEE S&P 2025)
- When MCP Servers Attack: Taxonomy, Feasibility, and Mitigation
- AgentFlayer: Zenity β Zero-Click Prompt Injection in AI Agents (Black Hat USA 2025)
- Policy Puppetry: HiddenLayer β Universal Bypass for All Major LLMs
- ZombAI: Embrace The Red β Self-Propagating Agent Worms (CVE-2025-53773)
- s1ngularity: Snyk β Weaponizing AI Coding Agents via Nx Packages
- LangGrinch: Cyata β LangChain Serialization Injection (CVE-2025-68664)
- Gemini Calendar Worm: SafeBreach β Hacking Gemini via Calendar Invites (Black Hat USA 2025)
- MCP CVE Ecosystem: The Vulnerable MCP Project
- AI CVE Trends: Ken Huang β 2025 Wave of Agentic AI CVEs