Incomplete AI agent evaluation risk for AI
Last updated: May 29, 2025
Description
Evaluating the performance or accuracy of an agent is difficult because of system complexity and open-endedness.
Why is incomplete AI agent evaluation a concern for foundation models?
Insufficient evaluation of an agent’s performance or accuracy can lead to the use of agents that do not perform to expectations. Incorrect agent behavior can result in harms to an agent’s users or others.
We provide examples covered by the press to help explain many of the foundation models' risks. Many of these events covered by the press are either still evolving or have been resolved, and referencing them can help the reader understand the potential risks and work toward mitigations. These examples are highlighted for illustrative purposes only.