Generative AI: Application Security and Optimization

Explore the security risks and optimization challenges to prepare a balanced approach to generative AI-based applications.

Generative AI (or GenAI) can autonomously produce new content, including text, images, or audio, by learning from patterns and examples in existing data. It leverages deep learning models to generate diverse and contextually relevant outputs, emulating the creativity and problem-solving capabilities of humans.

Opportunities and Concerns Around GenAI

Individuals and organizations utilize generative AI for a wide variety of uses and applications, including content creation, natural language processing, and data synthesis. In content creation, it aids in generating everything from poetry, academic essays, and marketing materials, to images, video, music, and computer code. In the field of natural language processing, generative AI enhances chatbots and language translation, and enables the synthesis of vast amounts of data to fuel creativity in product design, development, and prototyping. Deployment of GenAI applications within an organization can support human workers by contributing to better, more informed decision making and improved operational efficiencies, leading to enhanced profitability and business growth. 

However, generative AI also brings substantial security and ethical concerns, including the potential for bias, enhanced cyberattacks, and privacy risks. For instance, generative AI can use large language models (LLMs) that are trained from content systematically scraped from the Internet, including online books, posts, websites, and articles. Generative models learn from training data, and if the data used is biased, the model may perpetuate and even amplify existing biases in its outputs. In addition, generative AI can inadvertently generate misleading or false information (called a hallucination), leading to the spread of misinformation. Bad actors can also employ GenAI to spread and tune propaganda that can lead to social unrest. Generative AI is now commonly leveraged by malicious actors to create deepfakes, realistic but manipulated content that can be misleading or malicious. Deepfakes can be used for identity theft, social engineering or phishing attacks, spreading false information, or creating deceptive content that poses threats to individuals and society. Dark web markets now offer a FraudGPT AI Tool that can be used to craft spear-phishing emails, create undetectable malware, generate phishing pages, identify vulnerable websites, and even offer tutorials on hacking techniques.

The content used to train LLMs, potentially accessed and employed for model training without consent, may also contain personal and sensitive information, and content that is copyrighted or proprietary. Since this private information is all part of the content that AI draws from as it generates content, there is the very real risk that outputs may inadvertently reveal sensitive data or private information. 

Generative AI vendors may not offer a way for individuals or organizations to confirm whether or not their personal or proprietary information has been stored or used for training purposes, or request that this information be deleted, under “right to be forgotten” or “right of erasure” directives from government regulations such as the EU’s General Data Protection Regulation (GPDR). LLM training also often involves aggregating and utilizing data from different regions or countries. This may lead to scenarios that potentially compromise data sovereignty regulations. 

Types of Generative AI Applications

Generative AI has multiple applications for organizations and industries, and incorporating GenAI judiciously into appropriate workflows can help businesses gain a competitive advantage. These applications include:

  • Written content creation. Generative AI can autonomously generate human-like text for articles, blogs, personalized messages, ad copy for campaigns, and many other uses. While AI-generated content usually requires review by humans, it can facilitate content creation by producing first drafts in a desired style and length, summarizing or simplifying existing written content, and providing content outlines to streamline the writing process for human writers. 
  • Image and video creation. Generative models can synthesize or alter images and videos derived from textual or visual inputs, creating unique and realistic visual content based on a requested setting, subject, style, or location. These visual materials have abundant commercial applications in media, design, advertising, marketing, education, and entertainment. GenAI can also create characters and realistic scenes for gaming and virtual environments. 
  • Enhanced customer support automation. Generative AI can help power the development of advanced chatbots and conversational agents that can engage in more natural and contextually relevant conversations, particularly in situations where pre-defined responses may be insufficient. Adaptive conversational agents can dynamically generate content, such as on-the-fly product recommendations or responses tailored to specific user preferences. These context-aware chatbots and agents can help businesses save time and resources while improving customer experiences and reducing support costs.
  • Code generation. Using GenAI for code generation can make the software development process more efficient and productive. It can assist code development in a host of ways, including auto-code completion (suggesting code completions as developers type); reviewing code for quality, errors, and bugs; and automating the modernization of legacy code. Automated code generation facilitates rapid prototyping, allowing developers to quickly experiment with ideas and test different coding options for more efficient software development workflows. GenAI also opens up coding opportunities to non-technical people, as users can enter a natural language description of what the code should do, and the generative code tool automatically creates the code.

What Is Generative AI Security?

Generative AI security is a set of practices and measures implemented to address potential security risks and challenges associated with the development, deployment, and use of GenAI-based applications. As these technologies become more pervasive and sophisticated, concerns related to security become increasingly important, particularly as AI workloads have become a prime attack surface for cybercriminals. For an in-depth examination of the security risks involved with deploying and managing GenAI applications, review the OWASP Top 10 for Large Language Model Applications, which aims to raise awareness of their vulnerabilities, suggests remediation strategies, and seeks to improve the security posture of LLM applications.  

Security Risks to GenAI Infrastructure

While GenAI may seem extremely powerful and almost magical, it leverages some of the same infrastructure, interfaces, and software components as traditional workloads, and thus, shares the same risks, such as injection attacks and attacks that bypass weak authentication and authorization controls. A reliable, high-performing, and secure infrastructure is required for the effective operation of sophisticated generative AI models. 

Infrastructure attacks also include denial of service (DoS), in which attackers overwhelm hardware resources, such as CPUs, memory, or storage, to disrupt the execution of generative AI workloads. These resource exhaustion attacks can lead to degraded performance or system instability, affecting the availability and reliability of the AI system, and compromising the ability of the model to learn and respond to user prompts.

Unauthorized access to AI system infrastructure is also a significant threat to GenAI workflows, potentially impacting the confidentiality and integrity of the system. Intrusions into the system infrastructure may lead to malicious activities such as data theft, service disruption, or malicious code insertion. This not only risks the security of the AI models and data but can also result in the generation and spread of inaccurate or harmful outputs.

Security Risks to GenAI Training Data

The starting point for any GenAI application is the training data that machine learning models use to recognize desired patterns, make predictions, and perform tasks. For a LLM to be highly capable, the data it is trained on needs to span a broad and diverse range of domains, genres, and sources. However, the model training process—whether it employs off-the-shelf, pretrained models or bespoke models trained on custom datasets—is vulnerable to manipulation and attack. 

Adversarial attacks involve bad actors intentionally manipulating input data to mislead or compromise the performance of generative AI models, a process that OWASP identifies as training data poisoning. It also includes manipulation of data to introduce vulnerabilities, backdoors, or biases that could compromise the model’s security, effectiveness, or ethical behavior. These vulnerabilities also introduce attack vectors that bad actors can exploit to gain unauthorized access to sensitive information. Compromised model supply chains can result in biased or unreliable outputs, privacy breaches, and the execution of unauthorized code. This is of particular concern for GenAI applications since they employ vast plugin ecosystems. 

Security Threats to GenAI Models

GenAI applications employ LLMs that generate outputs based on training datasets, neural networks, and deep learning architecture to generate responses to prompts from users. AI models serve as the foundation for identifying the patterns, structures, and relationships within existing data that serve to generate new outputs based on that understanding. 

AI models are susceptible to a variety of attacks, including prompt injections and other input threats that manipulate LLMs by inputting carefully crafted prompts that make the model ignore previous instructions or perform unintended actions. Prompt injections are among the most prevalent causes of misinformation and fake content generated by AI models. GenAI applications are also susceptible to vulnerabilities such as server-side request forgery (SSRF), which allows attackers to perform unintended requests or access restricted resources, and remote code execution (RCE), which can get the application to execute malicious code or other actions on the underlying system.

Best Practices for Generative AI Security

Protecting GenAI systems requires a multi-layered approach to security. This should involve robust authentication and authorization protocols, including strict access controls to ensure that only authorized personnel have access to critical components of the system. Implement proactive vulnerability management including regular software updates and continuous monitoring for early detection and prevention of intrusion attempts. To counteract DoS attacks, build redundancy into the system, including the use of backup servers and fail-safe protocols to ensure persistent processing availability. LLMs can be subject to denial-of-service as well, since user prompts generate tokens and LLMs have fixed context windows, which can be targeted in efforts to exhaust system resources. 

Organizations should implement stringent vetting processes to verify the supply chain of training data and only select pretrained models from trusted sources. Because poor quality data and biases in the training data can hinder the model's ability to learn accurate representations and produce reliable outcomes, preprocessing data before it is fed into a generative model is critical for effective GenAI. Fine-tuning models is also vital in many regulated industries. Techniques such as data cleaning, normalization and augmentation, and bias detection and mitigation can help prevent errors and data poisoning.

Implement robust access controls, encryption methods, and secure deployment practices—including network isolation and proper firewall configurations—to safeguard generative AI models from potential security threats. To prevent prompt injections, employ techniques such as prompt sanitization, input validation, and prompt filtering to ensure that the model is not manipulated by maliciously crafted inputs. Risks of unauthorized code execution can be reduced by employing secure coding practices, conducting thorough code reviews, and utilizing runtime defenses like code sandboxing. Prompt injection represents one of the most serious and complicated risks of GenAI applications. 

Optimizing Generative AI Models

Because GenAI processing can be resource intensive, optimizing generative AI models for improved performance and efficiency is an important step toward making models faster, more scalable, and energy efficient. 

Multi-cloud environments have become the foundation for AI-powered applications because of their ability to connect AI workloads and ecosystem plugins across distributed environments. Multi-cloud networking (MCN) provides the flexibility to dynamically scale resources up or down based on the computational demands of generative AI workloads, including hardware accelerators like Graphical Processing Units (GPUs), with resources from different cloud providers integrated into data processing to optimize performance and minimize delays. Deploying GenAI models across multiple cloud regions allows for geographic distribution of processing, reduced latency, and improved response times, which is particularly important for distributed real-time or interactive AI applications. Edge AI is emerging as an invaluable method of improving the user experience. Regional distribution of GenAI models can also allow organizations to store and process data in compliance with data sovereignty requirements.

The container orchestration platform Kubernetes is the de facto standard for running GenAI workloads, providing the infrastructure to run and scale AI models in containers to ensure high availability and efficient resource utilization. Kubernetes acts as an orchestrator, managing the deployment and monitoring of various components within the AI application, and ensuring that AI models, data processing pipelines, and other services can be efficiently managed and scaled. MCN and ingress controllers are critical due to the various implementations of Kubernetes and the need to uniformly provision workloads and securely direct traffic and distribute inference.  

APIs provide the connective tissue for various parts of the AI application to exchange data and instructions, enabling different components and services to communicate with each other. GenAI plugin ecosystems, for example, are connected via API calls. Kubernetes Ingress solutions provide built-in load balancing, rate limiting, and access control capabilities, securely distributing traffic across multiple pods to improve overall processing performance of AI workloads.

Challenges in GenAI Optimization

Balancing output speed and quality often involves trade-offs for GenAI optimization. Achieving high-quality outputs typically requires more complex and resource-intensive models and computations, while optimizing for performance may involve model simplifications that can impact the quality of generated content. More complex models may also require longer training times and lead to slower inference, impacting the speed of both the training process and the performance of real-time applications. This is particularly an issue for GenAI models that must adapt to dynamic environments, which can require continuous optimization and introduce challenges in maintaining a balance between quality and performance. In addition to GPUs, general Central Processing Units (CPUs) and Data Processing Units (DPUs) can be used for processing tasks—underscoring the importance of intelligence traffic management and resource pooling.

GenAI Optimization Techniques

Optimizing generative AI models requires balanced consideration—and combinations—of multiple factors. 

Model pruning involves identifying and removing redundant or less crucial parameters from the model to reduce its size and computational requirements, with the goal of creating a more compact model while preserving performance. Quantization reduces the memory requirements and computational complexity of GenAI models by representing numerical values with lower bit precision, such as converting floating-point numbers to lower-precision fixed-point or integer representations. This can lead to lower memory requirements and increased efficiency in deploying and storing models. 

Transfer learning is a machine learning technique in which a model trained on one task is adapted to perform another related task, significantly reducing the time and computational resources required for training, especially for deep and complex models. Transfer learning facilitates the efficient reuse of knowledge, enabling the optimization of generative AI models for specific applications without the need for extensive computational resources. 

Distributing model training and inference across multiple processors, devices, or clouds optimizes model training and the user experience by exploiting parallel processing capabilities. In addition, tailoring the model architecture and training process to take advantage of individual capabilities of the hardware (for instance, the specific CPU or GPU on which it will run) can optimize the training and inference process for improved performance—especially if inference can be performed close to the user.

Leverage F5 for Generative AI

Generative AI has the potential to deliver major competitive advantages, but for organizations to fully leverage its benefits without risk, they must take the steps necessary to optimize and secure AI workloads across diverse, distributed environments. This requires not only enhancing the efficiency of AI workloads, but also the management of complex Kubernetes ecosystems, seamless and secure integration of APIs, and effective management of multi-cloud networks. 

F5 optimizes the performance and security of modern AI workloads, ensuring consistent distribution and protection of generative AI models and data across the entire distributed application environment, including data centers, public clouds, private clouds, multi-cloud, native Kubernetes, and the edge. F5 delivers an underlying, unified data fabric that supports training, refining, deploying, and management of generative AI models at scale, ensuring a seamless user experience and supporting real-time decision-making in AI-driven applications. 

F5 offers a suite of integrated security, delivery, and performance optimization solutions that reduce generative AI complexity while delivering predictable scale and performance, with centralized visibility and management via a single pane of glass.

  • F5 Secure Multi-Cloud Networking (MCN) reduces the complexity of managing and deploying AI workloads across distributed environments—cloud, multi-cloud, edge—without the complexity and management overhead of point-to-point connectivity solutions.
  • F5 Distributed Cloud Network Connect provides Layer 3 connectivity across any environment or cloud provider, including on-premises data centers and edge sites, in a SaaS-based tool that provides end-to-end visibility, automates provisioning of links and network services, and enables the creation of consistent, intent-based security policies across all sites and providers. 
  • F5 Distributed Cloud App Connect is a service that provides app-to-app connectivity and orchestration for AI workloads distributed across multiple cloud regions, providers, and edge sites.
  • F5 Distributed Cloud App Stack easily deploys, manages, and secures AI workloads with uniform production-grade Kubernetes across environments,  simplifying life-cycle management of AI workloads, and providing a method to distribute inference to the right processor (CPU/GPU/DPU) across resource pools—even at the edge to maximize performance. 
  • F5 NGINX Connectivity Stack for Kubernetes is a single tool encompassing ingress controller, load balancer, and API gateway capabilities to provide fast, reliable, and secure communications for AI/ML workloads running in Kubernetes, enhancing uptime, protection, and visibility at scale, while reducing complexity and operational cost. 
  • F5 Distributed Cloud Web App and API Protection (WAAP) protects the APIs that enable AI-specific interactions and mitigates the risks associated with unauthorized access, data breaches, business logic abuse, and critical vulnerabilities such as SSRF and RCE while delivering a comprehensive approach to runtime analysis and protection of APIs with a combination of management and enforcement functionality.
  • F5 Distributed Cloud Bot Defense delivers highly effective bot protection based on real-time analysis of devices and behavioral signals to unmask and mitigate automated malicious bot attacks, quickly adapting to attackers’ retooling attempts across thousands of the world’s most highly trafficked applications and AI workloads—neutralizing bad actors that use bots and malicious automation in efforts to poison LLM models, inflict denial-of-service, and spread propaganda. 

By optimizing efficiencies, reducing latency, and improving response times, F5 technologies help organizations safely and securely gain the benefits of generative AI while ensuring a seamless user experience and supporting the flexibility to deploy AI workloads anywhere.