[2510.23883] Agentic AI Security: Threats, Defenses, Evaluation, and Open Challenges
Paper credibility rating: 3/5 (Good)
Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.
Rating inherited from publication venue: arXiv
Metadata
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| Heavy Scaffolding / Agentic Systems | Concept | 57.0 |
Cached Content Preview
HTTP 200 | Fetched Apr 30, 2026 | 98 KB
# Agentic AI Security: Threats, Defenses, Evaluation, and Open Challenges
Shrestha Datta
Shahriar Kabir Nahin
Anshuman Chhabra (Corresponding Author)
Prasant Mohapatra
###### Abstract
Agentic AI systems, powered by large language models (LLMs) and endowed with planning, tool use, memory, and autonomy, are emerging as powerful, flexible platforms for automation. Their ability to autonomously execute tasks across web, software, and physical environments creates new and amplified security risks, distinct from both traditional AI safety and conventional software security. This survey outlines a taxonomy of threats specific to agentic AI, reviews recent benchmarks and evaluation methodologies, and discusses defense strategies from both technical and governance perspectives. We synthesize current research and highlight open challenges, aiming to support the development of secure-by-design agent systems.
## 1 Introduction
Artificial Intelligence (AI) has become one of the most transformative technologies of the twenty-first century \[ [1](https://ar5iv.labs.arxiv.org/html/2510.23883#bib.bib1 "")\]. From early rule-based expert systems \[ [2](https://ar5iv.labs.arxiv.org/html/2510.23883#bib.bib2 "")\] to modern deep learning architectures \[ [3](https://ar5iv.labs.arxiv.org/html/2510.23883#bib.bib3 "")\], AI has steadily expanded in both capability and scope. Traditionally, and especially over the past decade, AI has excelled at narrow, task-specific applications such as image classification, speech recognition, recommendation systems, and predictive analytics \[ [4](https://ar5iv.labs.arxiv.org/html/2510.23883#bib.bib4 ""), [3](https://ar5iv.labs.arxiv.org/html/2510.23883#bib.bib3 "")\]. These systems typically operate within well-defined boundaries and are optimized for performance on constrained datasets, but they cannot adapt flexibly beyond their original input/output designs.
Recently, the advent of large language models (LLMs), such as OpenAI’s GPT \[ [5](https://ar5iv.labs.arxiv.org/html/2510.23883#bib.bib5 ""), [6](https://ar5iv.labs.arxiv.org/html/2510.23883#bib.bib6 "")\] and Meta’s LLaMA \[ [7](https://ar5iv.labs.arxiv.org/html/2510.23883#bib.bib7 "")\], has marked a paradigm shift for AI models. Trained on vast corpora of text (and now, even multimodal data), these models exhibit impressive generalization abilities and can generate coherent, contextually relevant responses across a wide range of domains \[ [8](https://ar5iv.labs.arxiv.org/html/2510.23883#bib.bib8 ""), [9](https://ar5iv.labs.arxiv.org/html/2510.23883#bib.bib9 "")\]. LLMs have enabled breakthroughs in conversational agents, code generation, content summarization, and multimodal reasoning \[ [10](https://ar5iv.labs.arxiv.org/html/2510.23883#bib.bib10 ""), [11](https://ar5iv.labs.arxiv.org/html/2510.23883#bib.bib11 ""), [12](https://ar5iv.labs.arxiv.org/html/2510.23883#bib.bib12 "")\]. However, by design, most LLM deployments remain passive: they respond to input prompts con
... (truncated, 98 KB total)
Resource ID: 618c4cd9417d12f3 | Stable ID: sid_5OKLjX2esw