
Extraction attack risk for AI

Last updated: May 27, 2025
Robustness: model behavior manipulation
Inference risks
Amplified by generative AI

Description

An extraction attack attempts to copy or steal an AI model by systematically sampling the input space, observing the model's outputs, and using those input-output pairs to train a surrogate model that behaves similarly.
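The sampling-and-surrogate process described above can be sketched in a few lines. This is a minimal illustration, not a real attack: the `target_model` here is a hypothetical black-box with hidden linear parameters, and the attacker fits a surrogate by ordinary least squares using only query access.

```python
import random

# Hypothetical black-box target: the attacker can query it but cannot
# inspect its parameters (a simple linear scorer for illustration).
def target_model(x):
    return 2.0 * x + 1.0  # hidden parameters the attacker wants to recover

def extract_surrogate(query_fn, n_queries=200, lo=-5.0, hi=5.0, seed=0):
    """Sample the input space, observe outputs, and fit a surrogate
    model (closed-form least squares for a single feature)."""
    rng = random.Random(seed)
    xs = [rng.uniform(lo, hi) for _ in range(n_queries)]
    ys = [query_fn(x) for x in xs]  # the only access: query and observe
    mx = sum(xs) / n_queries
    my = sum(ys) / n_queries
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    slope = cov / var
    intercept = my - slope * mx
    return lambda x: slope * x + intercept  # surrogate mimicking the target

surrogate = extract_surrogate(target_model)
```

Against a real model the same loop applies at a larger scale: the attacker chooses queries to cover the input space efficiently, then trains a surrogate (often a neural network) on the collected pairs, which is why rate limiting and query monitoring are common mitigations.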

Why are extraction attacks a concern for foundation models?

With a successful extraction attack, the attacker gains an offline copy of the model and can mount further adversarial attacks against it to obtain valuable information, such as sensitive personal information or intellectual property.

Parent topic: AI risk atlas

We provide examples covered by the press to help explain many of the risks of foundation models. Many of these events are either still evolving or have been resolved, and referencing them can help the reader understand the potential risks and work toward mitigations. These examples are highlighted for illustrative purposes only.