F5 recently published a research paper in Elsevier's Network Security journal presenting an empirical analysis of several Machine Learning (ML)/Artificial Intelligence (AI) frameworks, measuring how efficiently and effectively they run ML/AI algorithms in a distributed manner. The research explores how advanced ML/AI frameworks such as NVIDIA’s Morpheus, ONNX, and TF/SKL can improve cybersecurity at large scale. Optimized model formatting and GPU-enabled parallel workload processing accelerated detection: the combination of software optimization and hardware acceleration both reduced latency and increased overall throughput.
We selected the problem of detecting Algorithmically Generated Domains (AGDs) as the basis for comparing the different ML/AI frameworks (NVIDIA’s Morpheus, ONNX, TF/SKL). Attackers typically use Domain Generation Algorithms (DGAs) to generate AGDs for data exfiltration and Command and Control (C&C) communication over DNS. We used DGA threat detection models to test each ML/AI framework.
The ML/AI models for detecting AGDs were deployed and executed on each platform with batch sizes of 32 and 1024. For the TensorFlow implementation, the GPU configuration outperformed the CPU configuration by 11-fold at batch size 32 and 43-fold at batch size 1024. With the ONNX model format, the CPU execution provider performed 5-fold better than TensorFlow CPU at batch size 32 and 1.5-fold better at batch size 1024. ONNX with the CUDA GPU execution provider outperformed ONNX with the CPU execution provider by 6-fold and 13-fold for batch sizes 32 and 1024, respectively. Morpheus-GPU outperformed the other architectures, achieving throughputs of 22382 req/sec and 208077 req/sec for batch sizes 32 and 1024, respectively. This is a 200-fold increase in throughput when compared to the TensorFlow-GPU configuration.
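To make the req/sec figures above concrete, here is a minimal Python sketch of how throughput at a given batch size could be measured, plus the per-batch time that a throughput figure implies. It is not the benchmark code from the paper: `measure_throughput`, `implied_batch_latency_ms`, and `placeholder_model` are hypothetical names, the placeholder stands in for a real DGA model, and the latency helper assumes batches are processed strictly one after another.

```python
import time
from typing import Callable, List, Sequence


def measure_throughput(
    predict_batch: Callable[[Sequence[str]], List[float]],
    requests: Sequence[str],
    batch_size: int,
) -> float:
    """Run `predict_batch` over `requests` in fixed-size batches; return req/sec."""
    start = time.perf_counter()
    processed = 0
    for i in range(0, len(requests), batch_size):
        batch = requests[i:i + batch_size]
        predict_batch(batch)
        processed += len(batch)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse timers.
    return processed / elapsed if elapsed > 0 else float("inf")


def implied_batch_latency_ms(throughput_rps: float, batch_size: int) -> float:
    """Average time per batch (ms) implied by a sustained throughput figure.

    Assumption: batches run sequentially, with no pipelining or overlap.
    """
    return batch_size / throughput_rps * 1000.0


def placeholder_model(batch: Sequence[str]) -> List[float]:
    # Hypothetical stand-in for a DGA detection model: returns a dummy
    # score per domain. Replace with a real inference call to benchmark.
    return [0.0 for _ in batch]


if __name__ == "__main__":
    domains = [f"host{i}.example.com" for i in range(10_000)]
    for bs in (32, 1024):
        rps = measure_throughput(placeholder_model, domains, bs)
        print(f"batch={bs}: {rps:,.0f} req/sec (placeholder model)")

    # Per-batch time implied by the Morpheus-GPU figures quoted above.
    for bs, rps in ((32, 22382), (1024, 208077)):
        print(f"batch={bs}: {implied_batch_latency_ms(rps, bs):.2f} ms per batch")
```

Under the sequential-batch assumption, the larger batch takes longer per batch but carries 32 times as many requests, which is why throughput rises with batch size.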
In our analysis, we found that Morpheus-GPU offers superior latency and throughput, enabling the use of larger batch sizes for serving large networks. Real-time DGA detection on data center-level DNS traffic is possible with Morpheus-GPU and caching.
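The role of caching can be illustrated with a small Python sketch: verdicts for domains seen before are served from a least-recently-used cache, and only unseen domains are sent to the model as one batch. This is an assumption about how such a cache might be layered in front of a detector, not the design described in the paper; `CachedDetector` and its `predict_batch` callback are hypothetical names.

```python
from collections import OrderedDict
from typing import Callable, Dict, List, Sequence


class CachedDetector:
    """LRU cache in front of a batch classifier (illustrative sketch only)."""

    def __init__(
        self,
        predict_batch: Callable[[Sequence[str]], List[bool]],
        max_entries: int = 100_000,
    ) -> None:
        self._predict_batch = predict_batch  # hypothetical model callback
        self._cache: "OrderedDict[str, bool]" = OrderedDict()
        self._max_entries = max_entries
        self.hits = 0
        self.misses = 0

    def classify(self, domains: Sequence[str]) -> List[bool]:
        # Collect uncached domains once each, preserving first-seen order.
        missing: List[str] = []
        seen = set()
        for d in domains:
            if d in self._cache:
                self.hits += 1
                self._cache.move_to_end(d)  # mark as recently used
            else:
                self.misses += 1
                if d not in seen:
                    seen.add(d)
                    missing.append(d)

        # Only cache misses reach the model, as a single batch.
        fresh: Dict[str, bool] = {}
        if missing:
            fresh = dict(zip(missing, self._predict_batch(missing)))

        results = [fresh[d] if d in fresh else self._cache[d] for d in domains]

        # Store new verdicts, evicting the least recently used entries.
        for d, verdict in fresh.items():
            self._cache[d] = verdict
            if len(self._cache) > self._max_entries:
                self._cache.popitem(last=False)
        return results
```

Because DNS traffic tends to repeat the same domains many times, a cache like this reduces how many requests need GPU inference at all. The hit and miss counters make that reduction easy to observe.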
The key learnings from the research are:
Organizations can use this research to implement AI-accelerated cybersecurity by selecting a robust infrastructure (a combination of processing units and ML/AI frameworks) that processes data at large scale and returns inference results quickly and scalably. The validated results for each ML/AI framework also give organizations a basis for applying these methods in their existing product offerings, particularly those that rely on ML/AI.
Get your copy of the paper at the Network Security Journal online portal.