Angles of Attack: The AI Security Intelligence Brief

Why Adversarial AI Testing & AI Threat Modeling Need Each Other

AI red teaming is dead - the new paradigm is purple | Edition 43

Disesdi Shoshana Cox
Feb 09, 2026

Image: Still on the Hill. Been heads down working in the lab. Announcements coming soon.

One way AI arguably differs from other software at the system level is that red and blue teams–and data teams as well–are forced to interact with one another simply to deploy AI successfully.

Much less securely.

Because AI systems have what amounts to an effectively infinite attack surface, testing them for security holes that can actually be remediated is a new type of challenge for security teams.

And this is perhaps why we’ve seen so much security theater in the space.

Much money and time have been wasted spraying thousands of potentially unrelated “attacks” at prompt-based systems and calling it “red teaming” AI.

Of course, this was never actual red teaming, nor was it effective.

And, in my opinion worst of all, it left untold numbers of predictive AI systems unguarded.

The solution to this problem is a complete overhaul of what it means to “red team” AI systems.

Because the AI attack surface is so large, “test everything” is not going to work as a strategy. This is for two reasons.

First of all, we can’t. There isn’t time to test everything that a particular model might be susceptible to. There’s just not. Even if we could find all of the attacks, running them against a system would be both time- and cost-prohibitive.
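To put rough numbers on “we can’t”–here’s a back-of-envelope sketch. The vocabulary size and prompt length are illustrative placeholders, not measurements of any particular model:

```python
import math

# Back-of-envelope only: both numbers are illustrative assumptions.
vocab_size = 50_000        # assumed tokenizer vocabulary
max_prompt_tokens = 100    # assumed prompt-length cap

# Distinct prompts of exactly max_prompt_tokens tokens: vocab_size ** length.
log10_space = max_prompt_tokens * math.log10(vocab_size)
print(f"prompt space: ~10^{log10_space:.0f} prompts")  # ~10^470

# Even testing a billion prompts per second for the age of the universe
# (~4.4e17 seconds) covers a vanishing fraction of that space.
log10_tested = math.log10(1e9 * 4.4e17)
print(f"fraction covered: ~10^{log10_tested - log10_space:.0f}")  # ~10^-443
```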

Imagine testing an LLM system for prompt-based attacks. If you’re responsible for costs, are you really willing to pay the tokens for a potentially infinite number of attacks to be launched against your system?

I’d argue that if you are, you really should think twice about that.

Even if we had the time, there is no universe in which this is a cost-effective strategy.

Particularly when the worst outcomes of a prompt-based chatbot breach usually fall in the lawsuit category–potentially making the breach itself cheaper than the “red team engagement” meant to prevent it.

The ROI just isn’t there.
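Here’s a sketch of that cost math. Every input–attack count, tokens per attempt, price per token–is an assumption for illustration; substitute your own numbers:

```python
# Rough cost model for "spray attacks at the endpoint" testing.
# All inputs are illustrative assumptions.
attacks = 100_000              # prompts in a naive spray-style "red team" run
tokens_per_attack = 2_000      # prompt + completion tokens per attempt
usd_per_1k_tokens = 0.01       # assumed blended API price

naive_cost = attacks * tokens_per_attack / 1_000 * usd_per_1k_tokens
print(f"naive spray run: ${naive_cost:,.2f}")    # $2,000.00 -- per pass

# A threat-model-driven suite runs only the scenarios mapped to real
# impact in your system -- say, a few hundred targeted cases.
focused_attacks = 300
focused_cost = focused_attacks * tokens_per_attack / 1_000 * usd_per_1k_tokens
print(f"focused suite:   ${focused_cost:,.2f}")  # $6.00 -- per pass
```

And remember that a spray run isn’t paid for once: it gets re-run for every model version, system prompt, and configuration change.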

Second of all–and this is really the first problem–we can’t find all the attacks in the space to begin with. So the point is moot.

But I introduce it here for anyone who thinks these systems are somehow “improving” their security, or that the solutions to these problems are right around the corner: they’re not.

Infinite Attack Surface, Hyperfocused Testing

When the attack surface is infinite (or near-infinite, for our purposes), testing can’t expand to meet it.

This might be the point at which some are tempted to throw up their hands and conclude that testing itself is pointless.

I want to be very clear that this is, unequivocally, not the case.

Security testing for any software system should be considered mandatory before and during deployment. AI is no exception.

But just like anywhere else, testing needs to matter. Otherwise, it’s useless theater at best–money thrown down the drain.

For AI, the situation is even worse–improper testing hands attackers even more of an advantage, beyond just a false sense of safety.

The OPSEC disadvantages of what’s come to be known as “AI red teaming” in many cases far outweigh any perceived advantage to such testing–and that’s even if the tests themselves mattered.

But they don’t.

So this raises the question: If testing is still important, but the way it’s being done is useless, and the mathematics of the systems make comprehensive testing impossible, what exactly are we supposed to do?

The answer: Start with threat modeling.

What To Test? Threat Models To The Rescue

Threat modeling allows us to understand what can go wrong–and also to decide what to do about it. And this is critical to actually testing AI.

Let’s break it down.

In an infinite attack space, focused testing is key. We can’t test everything, so we need to test what matters.

But how do we know what that is, when search is impossible?

Proper AI threat modeling tells us exactly that.

Rather than sift through a nearly infinite number of attacks by category, threat modeling allows teams to narrow the massive space by focusing on the attacks that would directly impact the system.

Here’s a concrete example: We can’t block every potential data poisoning scenario for a given system. We can’t even know what they all are in the first place–much less defend against all of them.

So knowing that data poisoning is a threat isn’t enough. Monitoring for it, or applying other pertinent controls, requires knowing exactly what a data poisoning attack might look like in your system.

This narrows the potential attack space–and testing requirements–considerably.
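As a sketch of what that narrowing can buy you: suppose your threat model says the realistic poisoning vector in your system is label flipping in a crowd-sourced labeling pipeline. That single assumption turns an unbounded problem into one narrow tripwire. The function, data, and threshold below are all illustrative:

```python
import numpy as np

def label_drift_alarm(baseline_counts: np.ndarray,
                      incoming_counts: np.ndarray,
                      threshold: float = 0.05) -> bool:
    """Flag an incoming labeled batch whose class distribution drifts
    from a trusted baseline -- a cheap tripwire for label-flipping
    poisoning. The threshold is illustrative; tune it to your pipeline."""
    baseline = baseline_counts / baseline_counts.sum()
    incoming = incoming_counts / incoming_counts.sum()
    # Total variation distance between the two label distributions.
    tvd = 0.5 * np.abs(baseline - incoming).sum()
    return tvd > threshold

# Hypothetical usage: trusted historical label counts vs. a new batch.
baseline = np.array([900, 850, 880])      # classes A, B, C
suspicious = np.array([600, 850, 1180])   # class C suddenly over-represented
print(label_drift_alarm(baseline, suspicious))  # True -> investigate
```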

Of course, proper AI threat modeling doesn’t happen by accident. Nor is it the same as modeling threats to traditional software systems.

In fact, AI threat modeling can be thought of as an add-on to robust cybersecurity considerations that every team should already have in place.

When you add AI systems, you add in their security vectors. Your threat models need to adjust accordingly.
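One way to make that adjustment concrete is to treat AI-specific vectors as entries in the same threat model you already keep, each mapped to the focused tests it justifies. A minimal sketch–the fields and example entries are illustrative, not a standard taxonomy:

```python
from dataclasses import dataclass, field

# Illustrative structure only -- not a standard and not a complete taxonomy.
@dataclass
class Threat:
    vector: str       # e.g. "data poisoning", "prompt injection"
    entry_point: str  # where it touches *your* system
    impact: str       # what actually breaks if it lands
    tests: list[str] = field(default_factory=list)  # focused tests it justifies

# The threat model you (hopefully) already keep...
threat_model = [
    Threat("credential theft", "CI/CD secrets", "pipeline takeover",
           ["secret-scanning audit"]),
]

# ...grows by the AI system's vectors when you deploy one.
threat_model += [
    Threat("data poisoning", "crowd-sourced labeling queue",
           "targeted misclassification",
           ["label-distribution drift check"]),
    Threat("prompt injection", "user-facing chat endpoint",
           "tool misuse / data exfiltration",
           ["injection suite scoped to the tools actually exposed"]),
]
```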

And threat modeling for Agentic systems requires its own special considerations.

Still, it’s easy to see why threat modeling early and throughout the AI lifecycle can literally save money, time, and tokens–and spare developers frustration–on security testing.

I’ll say it plainly: In my opinion, “red teaming” your AI systems without informed AI threat modeling is a waste of everyone’s time, effort, and money.

To test AI effectively, threat model it first.

Stay frosty.

The Threat Model

  • An effectively infinite attack surface like AI can’t be fully probed using traditional methods–focused, directed testing is required.

  • The larger the probabilistic attack surface, the more focused testing needs to become.

  • To test AI systems effectively, start with threat modeling–otherwise you’re throwing money, and time, away.

Resources To Go Deeper

  • Mauri, Lara, and Ernesto Damiani. “Modeling Threats to AI-ML Systems Using STRIDE.” Sensors 22 (2022).

  • Wilhjelm, Carl, and Awad A. Younis. “A Threat Analysis Methodology for Security Requirements Elicitation in Machine Learning Based Systems.” 2020 IEEE 20th International Conference on Software Quality, Reliability and Security Companion (QRS-C) (2020): 426-433.

  • Boisvert, Léo, Mihir Bansal, Chandra Kiran Reddy Evuru, Gabriel Huang, Abhay Puri, Avinandan Bose, Maryam Fazel, Quentin Cappart, Jason Stanley, Alexandre Lacoste, Alexandre Drouin, and Krishnamurthy Dj Dvijotham. “DoomArena: A Framework for Testing AI Agents Against Evolving Security Threats.” arXiv:2504.14064 (2025).

Executive Analysis, Research, & Talking Points

Why AI Threat Modeling Is Different: Backwards First

AI threat modeling is different. Applying STRIDE or some other methodology developed for traditional cyber threats isn’t going to cut it. This is not a surprise. What is a surprise: that threat modeling AI systems can sometimes work better when it’s backward.
