We’re Never Going To Red Team Our Way Out Of This

Our Agentic state of urgency, a (nearly) infinite attack space, and why AI red teaming won’t save us | Edition 7

Jun 19, 2025

∙ Paid

On June 17th, the Noma Security team published research showing that malicious proxy settings could be added to a prompt, uploaded to LangChain Hub, and downloaded by users, allowing the exfiltration of highly sensitive data and other potentially catastrophic failures.

According to Noma, the malicious proxy “discreetly intercepted all user communications – including sensitive data such as API keys (including OpenAI API Keys), user prompts, documents, images, and voice inputs – without the victim’s knowledge.”

That is a wealth of incredibly sensitive data, exposed.

The vulnerability was so severe that it earned a CVSS v3.1 base score of 8.8.

Consider that in light of the fact that LangChain is an AI agent platform, serving major organizations including Microsoft, Moody’s, and others.

This comes just barely a month after researchers at HiddenLayer found what they termed a “Novel Universal Bypass for All Major LLMs”. Calling it the first “universal, and transferable prompt injection technique that successfully bypasses instruction hierarchy and safety guardrails across all major frontier AI models”, the researchers claim they can bypass guardrails in ways that may endanger human safety.

As shocking as that is, there is something very important in that disclosure that shouldn’t be ignored, and that is transferability. Specifically, the attacks are so transferable that they are universally effective, against all models.

The reason an attack can be crafted to universally apply to all major models is the same reason we can’t red team away all attacks: the large size of adversarial subspaces.

The Size of the Adversarial (Sub)space

In adversarial ML, adversarial subspaces represent the space of all potential attacks against a model.

When two or more models’ adversarial subspaces overlap, it means that attacks against one model will transfer successfully against the others.

Did you know that known adversarial attacks make up only a portion–a subpopulation–of all attacks?

Meaning: A given model’s adversarial subspace may be so large that it’s effectively impossible to count all potential attacks, much less define them all.

When it comes to adversarial attacks against AI/ML systems, we don’t know what we don’t know.

This is because the adversarial subspace is likely a very big ~25 dimensional space, which can be difficult for humans to visualize.

Of course, something doesn’t need to be 25-dimensional to present visualization & mapping difficulties to human beings. Three dimensions is enough to confuse people, especially when it’s necessary to project three dimensional spaces down to two (like in a map).

The US state of Texas, which, despite existing in only three dimensions, is notoriously difficult to scale and thus project proportionately onto a 2 dimensional map.

In the US, the size of Texas is a national meme. Understanding that Texas is somehow bigger than it looks on any map is a given.

Adversarial subspaces are more mysterious–and much, much bigger.

This is not limited to Predictive AI. Large language models are also statistical inference machines, and also have vulnerabilities. The HiddenLayer disclosure demonstrates that.

These vulnerabilities are complicated in compound deployments, such as Agentic systems.

Executive Analysis, Threat Model & Talking Points

Agentic State Of Urgency

Agentic AI redefines vulnerabilities found in the LLMs systems that underlie their functionality.

In the process of allowing the ability to interact with real-world systems and/or one another, Agentic behavior often necessarily goes beyond predefined actions.

This magnifies the already non-deterministic nature of LLM output, because there is a potentially wider range of actions being performed by potentially greater numbers of agents.

The problems are also amplified as errors between agents, or from agents to other system components, cause increasingly more severe failures as their effects progress through the system.

But these are not the only compounding factors…

Keep reading with a 7-day free trial

Subscribe to Angles of Attack: The AI Security Intelligence Brief to keep reading this post and get 7 days of free access to the full post archives.