Generative AI’s Impact on Application Performance

F5 Ecosystem | May 21, 2024

Lori Mac Vittie, Distinguished Engineer and Chief Evangelist | F5

Studies abound on the impatience of consumers. They’ll dump an app, delete an app, and complain loudly on social media if an app performs poorly. And to many, poorly means “responds in more than a couple of seconds.”

Enter generative AI—which, based on experience and benchmarks, generally takes far longer than a couple of seconds to respond. But, like our text conversations with friends and family, while chatbots are ‘thinking,’ we are thoughtfully presented with an animated “…” to indicate a response is forthcoming.

For some reason, the animation activates a nearly digital Pavlovian reaction that makes us willing to wait. Perhaps the reason lies in the psychology of anthropomorphism, which tends to make us look kindlier upon non-humans imbued with human-like personality. Thus, because we perceive the AI as at least human-like, we grant it the same grace we would grant, well, a human being.

Whatever the reason behind our willingness to wait for today’s AI user experience, it raises the question of how far that grace will go, and for how long. As more and more applications become integrated, augmented, and imbued with AI capabilities, the questions about acceptable performance become more and more important to answer.

Just how much latency is acceptable for an AI user experience? Does it matter where that latency is introduced, or is it only acceptable when we know there’s generative AI involved?

This is an important area to examine because we know that one of the taboos of application security is the introduction of latency into the process. Even though inspecting and evaluating content against known threats (SQLi, malicious code, prompt injection) inevitably takes time, users of application security services are quick to shut down any solution that causes performance to degrade.

I give you exhibit A: the responses to a question on this topic from our 2022 State of Application Strategy survey, in which just about 60% of both IT and business leaders said they would turn off security controls for a performance gain of between 1% and 50%.

Clearly, performance matters and latency is viewed as a Very Bad Thing™. So, the question becomes just how much latency is acceptable for the AI user experience? Are the old measures of “response must be less than X seconds” still applicable? Or is AI pushing that limit further out for all apps, or merely for those that are obviously AI?

And if our patience is only an initial reaction, partially due to the novelty of generative AI, what do we do when the novelty wears off?

If, as is currently the trend, inferencing becomes faster, perhaps the question will be moot. But if it does not, will the components and services that deliver, secure, and support AI need to be even faster to make up for how slow inferencing is?
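To make that trade-off concrete, here is a rough back-of-the-envelope sketch. Every number in it is hypothetical, chosen only to illustrate the arithmetic: an assumed end-to-end response target, an assumed inference time, and an assumed count of inline services that deliver and secure the request. The function names are illustrative as well and do not come from any product.

```python
def remaining_budget_ms(target_ms: float, inference_ms: float) -> float:
    """Time left for everything that is not inference (delivery, security, network)."""
    return target_ms - inference_ms


def max_overhead_per_service_ms(target_ms: float, inference_ms: float, services: int) -> float:
    """Split whatever budget remains evenly across the supporting services."""
    if services <= 0:
        raise ValueError("services must be positive")
    # If inference alone exceeds the target, nothing is left to share out.
    return max(remaining_budget_ms(target_ms, inference_ms), 0.0) / services


# Hypothetical figures, not measurements.
TARGET_MS = 3000.0     # assumed end-to-end response target
INFERENCE_MS = 2400.0  # assumed time spent in model inference
SERVICES = 4           # assumed number of inline delivery and security services

per_service = max_overhead_per_service_ms(TARGET_MS, INFERENCE_MS, SERVICES)
print(f"Each supporting service may add at most {per_service:.0f} ms")
```

Under these assumed numbers, inference consumes most of the target, so each supporting service gets only a small slice. The slower inferencing is, the smaller that slice becomes, and once inference alone exceeds the target there is no budget left at all.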

This is how fast the industry is moving. We have questions that generate more questions, and before we have answers, new questions arise. The backlog of unanswered questions looks like trouble tickets in an enterprise where someone unplugged a core switch and then all of IT left for the day.

We know that app delivery and security are going to change because of AI, both in the needs of those who want to use AI to augment customer and corporate operations and in the needs of those who build the solutions for them. The obvious solutions (AI gateways, data security, and defenses against traditional attacks like DDoS) are easy to identify, and we are already working on them. But understanding the long-term impact is a much more difficult task, especially when it comes to performance.

Because the other reality is that hardware is only going to get us so far before we run into physical constraints, and then it will be up to the rest of the industry to figure out how to improve the performance of what will certainly be a critical component of every business.

