While artificial intelligence (AI) became part of the everyday vernacular in late 2022, with the launch of ChatGPT, AI Security is quite new, not well understood, and often ignored or dismissed, even by cybersecurity professionals. It’s time to clear up any ambiguity or misunderstanding about what we mean when we talk about AI Security, the threats it encounters, and the approaches to take to create a secure AI environment across your enterprise.
A New Attack Surface
AI Security refers to the strategic implementation of robust measures and policies to protect your organization’s AI systems, data, and operations from unauthorized access, tampering, malicious attacks, and other digital threats. It is critical for any organization deploying AI tools to include security in its AI use strategy because every new component brought into a digital system, including AI-powered applications, increases the available ‘attack surface’ for threat actors to infiltrate or cause harm to the system.
Securing AI-driven additions to your digital infrastructure, whether those are statistical or large language models (LLMs), such as ChatGPT and others, is vital for maintaining the integrity, privacy, and reliability of your system, as well as for fostering customer trust and safeguarding your organization’s reputation. As a growing number of organizations use AI tools and solutions for decision-making, operations, and sensitive customer interactions, a strong AI security strategy is crucial to ensure these tools remain trustworthy and are securely protected against external threats or internal vulnerabilities.
AI Model Risks
LLMs span the spectrum from being considered clever toys to powerful tools. The reality, however, is that they are the latest addition to an organization’s critical information infrastructure and must be treated as a vulnerable threat surface. Much of the early discussion of LLM risks centered on identifying and solving for threats based on human behavior, such as data loss and inappropriate use. The more insidious threats, however, are stealth attacks to the models and/or their datasets. These attacks are not only growing in terms of the damage they can inflict, but in scope, scale, and nuance, and are becoming increasingly difficult to detect. Here, we outline three types of attacks that pose the most significant threats to enterprises deploying LLMs and other generative AI (GenAI) models.
Jailbreaks/Prompt Injection
Jailbreak or prompt injection techniques attempt to ‘trick’ an LLM into providing information identified as dangerous, illegal, immoral, or unethical, or otherwise antithetical to standard social norms. Approximately two hours after ChatGPT-4 was released in March 2023, successful jailbreak attempts directed the system to provide instructions for an alarming collection of antisocial activities.
Since then, numerous threat actors have developed carefully worded, highly detailed prompts – including role play, predictive text, reverse psychology, and other techniques – to get LLMs to bypass internal content filters and controls that regulate their responses. The danger of a successful jailbreak attack on a GenAI system, such as an LLM, is that it breaches the safeguards that prevent the model from executing bad commands, such as instructions to ignore protective measures or take destructive actions. Once that boundary between acceptable and unacceptable use disappears, the model has no further safeguards in place to stop it from following the new instructions.
Data Poisoning
A poisoning attack can have as its target the model, the model’s training data, or your organization’s unsecured data source. The goal of a poisoning attack is to skew the results or predictions your model produces; the outcome is that your organization relies on the flawed results and makes bad, potentially damaging decisions, disseminates faulty or incorrect information, or takes other ill-advised actions based on the model output.
- Data Poisoning attacks target the model’s training dataset. The threat actor alters or manipulates the data or adds malicious, incorrect, biased, or otherwise inappropriate data that skews the model output.
- Model Poisoning attacks target the model itself. The threat actor alters the model or its parameters to ensure faulty output.
- Backdoor attacks require a two-step approach. The threat actor first manipulates the model’s dataset by adding malicious data to create a hidden vulnerability that does not affect the model in any way until it is triggered. Activating the vulnerability is the second step in this attack; it allows the hacker to cause damage to your organization on their own schedule.
Adversarial AI
Adversarial attacks occur after models have been deployed and are in use. These attacks vary in approach, but are all difficult to detect, and very dangerous.
- Model Inversion attacks review a model’s output to uncover sensitive information about the model itself or the dataset it was trained on, which can lead to privacy breaches.
- Membership Inference attacks involve the threat actor trying to deduce whether specific data points, like information about a particular individual, were part of the training dataset. If successful, these attacks cause a significant invasion of privacy.
- Model-Stealing attacks involve scrutinizing the output of a trained model to steal or copy its intellectual property with the goal of cloning the original model, typically for commercial gain.
- Watermarking attacks change the parameters of a trained model to embed a hidden pattern that can be used to ‘prove’ who owns the model. This can lead to significant financial loss, as well as loss of competitive advantage.
- Model Inference attacks review a model’s output to find sensitive info about the training data or the model’s parameters, which can lead to privacy breaches.
Solutions
Innovative protective strategies and deployment tactics must be developed continually to keep up with new AI technologies, such as LLMs. Proactively deploying safeguards that address known and emerging threats is the only reasonable approach to creating a secure environment to implement LLMs at scale and across the enterprise.
Our platform protects your organization from users seeking to ignore or override system controls by reviewing language patterns and categories, such as role-playing, hypothetical conversations, world-building, and reverse psychology, to identify prompts seeking to violate acceptable usage rules.
It also scans outgoing and incoming content for toxic, biased, or otherwise unacceptable inclusions. Users are alerted that the prompt will not be sent unless changes are made to its content, and a detailed prompt history allows for auditing such prompts and the users creating them. Our platform also enables administrators to require human verification of the information returned by the model to ensure that any content used in organizational documentation is accurate and factual.
Conclusion
Understanding and implementing AI Security protocols is fast becoming a core business imperative in an increasingly AI-driven business environment. By integrating AI security tools into your framework, your organization will enable dynamic, real-time adaptation to new threats; ensure robust protection for its AI systems, data, and operations; provide decision-makers with peace of mind. In short, AI security allows your organization to stay a step ahead of cybercriminals—and your competitors.
About the Author
Related Blog Posts

The hidden cost of unmanaged AI infrastructure
AI platforms don’t lose value because of models. They lose value because of instability. See how intelligent traffic management improves token throughput while protecting expensive GPU infrastructure.

AI security through the analyst lens: insights from Gartner®, Forrester, and KuppingerCole
Enterprises are discovering that securing AI requires purpose-built solutions.

F5 secures today’s modern and AI applications
The F5 Application Delivery and Security Platform (ADSP) combines security with flexibility to deliver and protect any app and API and now any AI model or agent anywhere. F5 ADSP provides robust WAAP protection to defend against application-level threats, while F5 AI Guardrails secures AI interactions by enforcing controls against model and agent specific risks.

Govern your AI present and anticipate your AI future
Learn from our field CISO, Chuck Herrin, how to prepare for the new challenge of securing AI models and agents.

New 7.0 release of F5 Distributed Cloud Services accelerates F5 ADSP adoption
Our recent 7.0 release is both a major step and strategic milestone in our journey to deliver the connectivity, security, and observability fabric that our customers need.

F5 provides enhanced protections against React vulnerabilities
Developers and organizations using React in their applications should immediately evaluate their systems as exploitation of this vulnerability could lead to compromise of affected systems.