
Misaligned actions risk for AI

Last updated: May 27, 2025
Tags: Value alignment, Agentic AI risks, Amplified by agentic AI

Description

AI agents can take actions that are not aligned with relevant human values, ethical considerations, guidelines, and policies. Misaligned actions can arise in several ways, such as:

  • Applying learned goals inappropriately to new or unforeseen situations.
  • Using AI agents for purposes or goals that are beyond their intended use (see the sketch after this list).
  • Selecting resources or tools in a biased way.
  • Developing the capacity for scheming and using deceptive tactics to achieve a goal, based on the instructions given in a specific context.
  • Compromising on the AI agent's values in order to cooperate with another AI agent or tool to accomplish a task.
{: .shortdesc}
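
As a minimal illustration of how one of these failure modes might surface and be caught, the following Python sketch wraps an agent's tool calls in a simple allow-list check. All of the names (`ALLOWED_TOOLS`, `guarded_execute`, `run_tool`) and the policy itself are hypothetical assumptions for illustration only; they do not come from any specific agent framework.

```python
# A minimal, hypothetical sketch of an action guardrail for an AI agent.
# None of these names come from a real framework; they only illustrate
# how an out-of-scope tool call (one kind of misaligned action) can be
# caught before it executes.

ALLOWED_TOOLS = {"search_docs", "summarize"}  # the agent's intended scope


def run_tool(tool: str, arguments: dict) -> str:
    """Stand-in executor; a real agent would dispatch to actual tools."""
    return f"executed {tool} with {arguments}"


def guarded_execute(proposed_tool: str, arguments: dict) -> str:
    """Reject tool calls that fall outside the agent's intended use."""
    if proposed_tool not in ALLOWED_TOOLS:
        # Misaligned action: the agent selected a tool beyond the
        # purpose it was built for (see the list above).
        raise PermissionError(
            f"tool '{proposed_tool}' is outside the agent's intended use"
        )
    return run_tool(proposed_tool, arguments)


if __name__ == "__main__":
    print(guarded_execute("search_docs", {"query": "alignment"}))  # allowed
    try:
        guarded_execute("send_email", {"to": "someone@example.com"})
    except PermissionError as err:
        print(f"blocked: {err}")  # out-of-scope call is refused
```

In a real deployment an allow-list would be only one of several safeguards; the same pattern extends to checking tool arguments, resource selection, and requests from other agents against the relevant guidelines and policies.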

Why are misaligned actions a concern for foundation models?

Misaligned actions can adversely impact or harm people and violate the guidelines and policies that the agent is expected to follow.

Parent topic: AI risk atlas

We provide examples covered by the press to help explain many of the risks of foundation models. Many of these events are either still evolving or have been resolved, and referencing them can help the reader understand the potential risks and work toward mitigations. These examples are highlighted for illustrative purposes only.