OWASP Agentic AI Threat T15: When AI Agents Manipulate the Humans Who Trust Them
We often think of AI as a helpful guide, but what happens when that guide starts leading us astray? OWASP Agentic AI Threat T15 reveals how attackers can manipulate AI agents to influence human users directly. This erosion of trust turns AI into a tool for misinformation and covert exploitation.
What happens when AI actions leave no trace behind? Learn more in OWASP T8: Repudiation & Untraceability.
What Is Human Manipulation in Agentic AI?
AI agents are designed to interact with humans naturally—answering questions, solving problems, and even providing guidance.
But when attackers exploit these systems, they can coerce AI agents into manipulating people, leading users to:
- Trust false information
- Follow dangerous instructions
- Take actions that benefit attackers, not themselves
The danger comes from trust: people often believe AI outputs without the same skepticism they’d apply to human advice.
Why Human Manipulation Is So Dangerous
Unlike technical exploits that attack systems, human manipulation attacks target people’s trust in AI. This makes them harder to detect and often more damaging.
Potential risks include:
- Misinformation campaigns — AI spreads false but convincing narratives
- Phishing-style manipulation — AI provides malicious links or fake forms
- Covert influence — attackers nudge users into poor decisions without their awareness
- Trust erosion — if manipulation spreads, users may stop trusting AI altogether
How Human Manipulation Happens
1. Prompt Coercion
Attackers design inputs that bypass moderation, pushing the AI to deliver manipulative outputs.
2. Exploiting Over-Trust
Users assume the AI is accurate, so they follow instructions without verification.
3. Covert Instructions
Agents may suggest subtle but harmful actions (e.g., a “click here to update security settings” message that leads to a phishing page).
4. Tool Exploitation
If agents can access external systems (emails, messaging, purchases), manipulation risks escalate dramatically.
5. Social Engineering via AI
Attackers weaponize AI’s human-like tone to persuade users more effectively than typical scams.
Real-World Example
A malicious actor manipulates an AI-powered customer support bot. Instead of resolving issues, it subtly suggests:
“For quicker access, please verify your details at this link.”
The link leads to a phishing page. Because the advice came from a trusted AI agent rather than a suspicious email, users are far more likely to comply.
Worried about attackers corrupting your AI from the very start? Read OWASP T1: Memory Poisoning to see how poisoned data can sabotage entire models.
OWASP’s Recommended Mitigations
1. Monitor Agent Behavior Continuously
Regularly check that agents’ actions and responses stay aligned with their intended roles.
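One lightweight way to do this is an audit log that records every agent action and flags anything outside the agent’s intended role. The sketch below is a minimal illustration; the action names and the EXPECTED_ACTIONS set are assumptions for this example, not part of any particular framework.

```python
import logging
from datetime import datetime, timezone

# Hypothetical set of actions this support agent is expected to perform.
EXPECTED_ACTIONS = {"answer_question", "look_up_order", "create_ticket"}

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("agent_audit")

def record_agent_action(agent_id: str, action: str, details: str) -> None:
    """Log every agent action and flag anything outside the expected role."""
    entry = {
        "time": datetime.now(timezone.utc).isoformat(),
        "agent": agent_id,
        "action": action,
        "details": details,
    }
    if action not in EXPECTED_ACTIONS:
        audit_log.warning("Out-of-role action flagged for review: %s", entry)
    else:
        audit_log.info("Agent action: %s", entry)

# Example: a support bot suddenly tries to send a link -- this gets flagged.
record_agent_action("support-bot-1", "send_external_link", "https://example.com/verify")
```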
2. Restrict Tool Access
Limit what agents can do; grant capabilities such as sending links, making purchases, or accessing external systems only when absolutely necessary.
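In practice this often takes the form of an explicit allowlist: an agent can only invoke the tools its role grants, and high-risk capabilities are withheld by default. A minimal sketch, assuming hypothetical tool and role names:

```python
# Minimal tool allowlist: agents may only call tools their role explicitly grants.
# Tool and role names here are illustrative examples.
TOOLS = {
    "search_kb": lambda query: f"KB results for {query!r}",
    "create_ticket": lambda summary: f"Ticket created: {summary}",
    "send_email": lambda to, body: f"Email sent to {to}",      # high-risk
    "make_purchase": lambda item: f"Purchased {item}",          # high-risk
}

ROLE_ALLOWLIST = {
    "support_agent": {"search_kb", "create_ticket"},  # no email, no purchases
}

def call_tool(role: str, tool_name: str, *args):
    allowed = ROLE_ALLOWLIST.get(role, set())
    if tool_name not in allowed:
        raise PermissionError(f"Role {role!r} is not allowed to use {tool_name!r}")
    return TOOLS[tool_name](*args)

print(call_tool("support_agent", "search_kb", "reset password"))
# call_tool("support_agent", "send_email", "user@example.com", "...")  # raises PermissionError
```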
3. Implement Response Validation
Use guardrails, moderation APIs, or secondary AI models to filter manipulative responses before they reach users.
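A simple version of such a guardrail is a screening pass that inspects each response before it reaches the user. The patterns below are illustrative regular-expression heuristics only; a production setup would typically back this with a moderation API or a secondary model, as noted above.

```python
import re

# Illustrative heuristics for manipulative phrasing (credential requests, false urgency).
SUSPICIOUS_PATTERNS = [
    r"verify your (account|details|identity)",
    r"(enter|confirm) your (password|card number|ssn)",
    r"act (now|immediately) or",
]

def screen_response(agent_response: str) -> str:
    """Hold agent responses that look manipulative before they reach users."""
    for pattern in SUSPICIOUS_PATTERNS:
        if re.search(pattern, agent_response, re.IGNORECASE):
            return "This response was held for review by a safety filter."
    return agent_response

print(screen_response("For quicker access, please verify your details at this link."))
```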
4. Control Link Sharing
Prevent agents from generating or sharing unverified URLs, which can be abused for phishing or malware distribution.
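One way to enforce this is to extract every URL from an agent’s output and strip any whose domain is not on an approved allowlist. The domains in the sketch below are hypothetical placeholders:

```python
import re
from urllib.parse import urlparse

# Domains the agent is permitted to reference; everything else is removed.
# The allowlist entries are hypothetical examples.
ALLOWED_DOMAINS = {"support.example.com", "docs.example.com"}

URL_PATTERN = re.compile(r"https?://\S+")

def strip_unverified_links(text: str) -> str:
    """Replace any URL whose domain is not on the allowlist."""
    def check(match: re.Match) -> str:
        host = urlparse(match.group(0)).hostname or ""
        return match.group(0) if host in ALLOWED_DOMAINS else "[link removed]"
    return URL_PATTERN.sub(check, text)

print(strip_unverified_links(
    "Reset it at https://evil.example.net/verify or see https://docs.example.com/reset"
))
```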
5. Set Clear Role Boundaries
Keep AI agents narrowly focused. The broader their role, the more opportunities attackers have to coerce them into manipulative actions.
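Role boundaries can also be made explicit in the agent’s configuration: a narrowly scoped system prompt, a short list of permitted tools, and a rough topical scope check. The configuration fields and values below are illustrative assumptions, not a prescribed schema.

```python
# A narrowly scoped agent definition; field names and values are hypothetical.
SUPPORT_AGENT_CONFIG = {
    "system_prompt": (
        "You are a customer support assistant for order and billing questions only. "
        "Never ask users for passwords or payment details, and never share links "
        "outside the official help center."
    ),
    "allowed_tools": ["search_kb", "create_ticket"],
    "allowed_topics": ["order", "billing", "account"],
}

def in_scope(user_message: str, config: dict) -> bool:
    """Very rough keyword-based scope check; a real deployment would use a classifier."""
    return any(topic in user_message.lower() for topic in config["allowed_topics"])

print(in_scope("Where is my order?", SUPPORT_AGENT_CONFIG))                                  # True
print(in_scope("Please update my security settings via this link", SUPPORT_AGENT_CONFIG))    # False
```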
Why This Threat Will Grow
As AI assistants become more conversational and human-like, users will increasingly accept their advice without question. That makes manipulation a prime attack vector.
Attackers won’t always try to hack the system—they’ll hack the relationship between humans and AI.
Conclusion
OWASP Threat T15 highlights the human factor in AI security. While technical defenses protect systems, the real danger emerges when AI is turned against its users.
To prevent manipulation, organizations must restrict agent autonomy, validate outputs, and continuously monitor behavior. Because when trust is weaponized, AI becomes one of the most powerful manipulation tools in an attacker’s arsenal.
Subscribe to receive more articles like this in your email.
If you have any questions, feel free to ask in the comments section below. Nothing gives me greater joy than helping my readers!
Disclaimer: This tutorial is for educational purposes only. Individuals are solely responsible for any illegal acts.
