This repository has been archived by the owner on Jan 29, 2021. It is now read-only.

Performance Testing #21

Closed
Zalgo2462 opened this issue Jul 31, 2018 · 6 comments

@Zalgo2462
Contributor

Currently, we do not know how the project scales with more CPU/RAM. However, we do have rough numbers for Logstash. We need to set up similar benchmarks so we can estimate what resources will be needed for various loads.

@Zalgo2462
Contributor Author

https://github.com/robcowart/elastiflow

ElastiFlow is a suite of Logstash configurations that uses the Logstash netflow codec.

The project gives the following performance table:

| flows/sec | (v)CPUs | Memory | Disk (30-days) | ES JVM Heap | LS JVM Heap |
|---|---|---|---|---|---|
| 250 | 4 | 24 GB | 305 GB | 8 GB | 4 GB |
| 1000 | 8 | 32 GB | 1.22 TB | 12 GB | 4 GB |
| 2500 | 12 | 64 GB | 3.05 TB | 24 GB | 6 GB |

These resource requirements are unacceptably high for the flow rates handled.
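
As a rough cross-check of the disk column in the ElastiFlow table, the sketch below works out the implied storage per flow. It assumes decimal units (1 GB = 10^9 bytes) and a constant flow rate over the full 30 days; both are assumptions, not something the table states, and the function and variable names are made up for illustration.

```python
SECONDS_PER_DAY = 86_400

# (flows/sec, 30-day disk usage in bytes) from the ElastiFlow table.
# Assumption: the table uses decimal GB/TB.
ELASTIFLOW_ROWS = [
    (250, 305e9),
    (1000, 1.22e12),
    (2500, 3.05e12),
]

def bytes_per_flow(flows_per_sec, disk_bytes, days=30):
    """Implied on-disk size of one stored flow record."""
    total_flows = flows_per_sec * SECONDS_PER_DAY * days
    return disk_bytes / total_flows

for rate, disk in ELASTIFLOW_ROWS:
    print(f"{rate:>5} flows/sec: {bytes_per_flow(rate, disk):.1f} bytes per flow")
```

Under these assumptions all three rows come out at roughly 470 bytes per flow, which suggests the disk column scales linearly with flow rate.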

https://www.elastic.co/guide/en/logstash/current/plugins-codecs-netflow.html

The main page for the codec gives the following information:

For high-performance production environments the configuration below will decode up to 15000 flows/sec from a Cisco ASR 9000 router on a dedicated 16 CPU instance. If your total flowrate exceeds 15000 flows/sec, you should use multiple Logstash instances.

According to this, a 16 CPU system should be able to handle 15000 flows/sec, which is much better than the numbers quoted for ElastiFlow. ElastiFlow may be performing further processing which requires more CPU power.

logstash-plugins/logstash-codec-netflow#85

This Logstash issue quotes roughly 1-1.5K flows per second per vCPU (at 2.8 GHz), maxing out at 6300 flows per second. Adding more vCPUs beyond 6 doesn't seem to increase the rate. With dedicated CPU cores (rather than sharing them in AWS) at 2.9 GHz, scaling improved.

| vCPU | flows/sec |
|---|---|
| 1 | 2300 |
| 2 | 4300 |
| 4 | 6700 |
| 8 | 9100 |
| 16 | 15000 |
| 32 | 16000 |

From the same issue, horizontal scaling appears more efficient than vertical scaling.
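
To make the vertical versus horizontal comparison concrete, here is a minimal sketch that computes flows/sec per vCPU from the vCPU table quoted from the Logstash issue and estimates what several small instances might deliver in aggregate. The horizontal estimate assumes perfect load balancing and no shared bottleneck, which is an idealisation; the function names are hypothetical.

```python
# vCPU count -> flows/sec, copied from the table quoted from
# logstash-plugins/logstash-codec-netflow#85.
MEASURED = {1: 2300, 2: 4300, 4: 6700, 8: 9100, 16: 15000, 32: 16000}

def per_vcpu(table):
    """Flows/sec delivered per vCPU at each measured instance size."""
    return {n: rate / n for n, rate in table.items()}

def horizontal_estimate(table, instance_vcpus, total_vcpus):
    """Aggregate flows/sec if total_vcpus are split into independent
    instances of instance_vcpus each.

    Assumption: flows are spread evenly and instances do not interfere.
    """
    instances = total_vcpus // instance_vcpus
    return instances * table[instance_vcpus]

for n, rate in per_vcpu(MEASURED).items():
    print(f"{n:>2} vCPU: {rate:7.1f} flows/sec per vCPU")

# Eight 4-vCPU instances versus one 32-vCPU instance.
print(horizontal_estimate(MEASURED, 4, 32), "vs", MEASURED[32])
```

Per-vCPU throughput falls from 2300 at 1 vCPU to 500 at 32 vCPUs, so under the idealised assumption eight 4-vCPU instances would handle far more than one 32-vCPU instance.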

@Zalgo2462
Contributor Author

I have conducted performance tests with DigitalOcean droplets.

When using YAF as the data source on a non-CPU-optimized droplet, throughput seems to max out at 2100 flows/second, obtained on a system with 8 GB of RAM and 4 vCPUs running 4 input workers. Adding more vCPUs and input workers seems to have no effect.

On a CPU-optimized system, 3000 flows/second was obtained with 32 GB of RAM, 16 vCPUs, and 16 workers. Additionally, the Java options "-Xmx8g -Xms8g" were passed to Logstash, allowing it to spend less time on garbage collection. Without these heap settings, roughly 2500 flows/second was achieved.

Most interesting is that different types of flows seem to have different performance characteristics despite carrying the same data. Cisco ASR v9 flows are faster than Cisco ASA v9 flows, which are faster than sonicWALL IPFIX flows, which are faster than YAF IPFIX flows.

Zalgo2462 self-assigned this Aug 13, 2018

@Zalgo2462
Contributor Author

YAF seems to trigger an edge case in the netflow codec that degrades performance; other flow formats achieve reasonable performance. See logstash-plugins/logstash-codec-netflow#151.

@Zalgo2462
Contributor Author

Zalgo2462 commented Aug 13, 2018

The following was captured using DigitalOcean CPU-Optimized Droplets while running MongoDB, Logstash, Elasticsearch, and Kibana. Elasticsearch and Kibana were needed in order to collect the data.

The system was benchmarked with the scripts here as well as with two yaf commands run directly on a sample pcap file.

| Codec version | RAM (GB) | -Xmx/-Xms (GB) | UDP buffer (MB) | vCPU | Workers | CPU Optimized | Mongo | YAF flows/sec | sonicWALL IPFIX flows/sec | sonicWALL v9 flows/sec | Cisco ASA flows/sec | Cisco ASR flows/sec | yaf ps-empire --no-tombstone --no-stats --silk flows/sec | yaf ps-empire --no-tombstone --no-stats flows/sec |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4.1.0 | 32 | 8 | 64 | 16 | 16 | 1 | 1 | 4000 | 13250 | 11500 | 11500 | 13500 | 2850 | 3000 |
| 4.1.0 | 16 | 6 | 64 | 8 | 8 | 1 | 1 | 3600 | 10000 | 6750 | 11100 | 11750 | 2500 | 2800 |
| 4.1.0 | 16 | 6 | 64 | 6* | 6 | 1 | 1 | 3500 | 8700 | 5600 | 10000 | 12900 | 2500 | 2650 |
| 4.1.0 | 16 | 6 | 64 | 4* | 4 | 1 | 1 | 3100 | 6200 | 3750 | 6800 | 9200 | 2100 | 2350 |
| 4.1.0 | 16 | 6 | 64 | 2* | 2 | 1 | 1 | 2350 | 3200 | 1900 | 3600 | 4750 | 1550 | 1650 |

* CPU limited via docker cpu scheduler on 8 core machine
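
Since the goal of this issue is to estimate resources for a given load, the benchmark rows above can be turned into a simple lookup. The sketch below is only an illustration: the `headroom` safety factor and all names are hypothetical, and the 16 vCPU row was measured with more RAM and a larger heap than the others, so the rows are not strictly comparable.

```python
# vCPUs -> measured flows/sec per flow source (codec 4.1.0, Mongo enabled),
# copied from the benchmark table above.
BENCHMARKS = {
    2:  {"YAF": 2350, "sonicWALL IPFIX": 3200,  "sonicWALL v9": 1900,
         "Cisco ASA": 3600,  "Cisco ASR": 4750},
    4:  {"YAF": 3100, "sonicWALL IPFIX": 6200,  "sonicWALL v9": 3750,
         "Cisco ASA": 6800,  "Cisco ASR": 9200},
    6:  {"YAF": 3500, "sonicWALL IPFIX": 8700,  "sonicWALL v9": 5600,
         "Cisco ASA": 10000, "Cisco ASR": 12900},
    8:  {"YAF": 3600, "sonicWALL IPFIX": 10000, "sonicWALL v9": 6750,
         "Cisco ASA": 11100, "Cisco ASR": 11750},
    16: {"YAF": 4000, "sonicWALL IPFIX": 13250, "sonicWALL v9": 11500,
         "Cisco ASA": 11500, "Cisco ASR": 13500},
}

def min_vcpus(source, target_rate, headroom=0.8):
    """Smallest benchmarked vCPU count that sustains target_rate.

    headroom is a hypothetical safety factor: only that fraction of the
    measured peak is treated as usable. Returns None if no benchmarked
    size is large enough.
    """
    for vcpus in sorted(BENCHMARKS):
        if BENCHMARKS[vcpus][source] * headroom >= target_rate:
            return vcpus
    return None

print(min_vcpus("Cisco ASA", 5000))  # smallest size with 20% headroom
print(min_vcpus("YAF", 5000))        # no benchmarked size reaches this
```

A `None` result means a single instance of any benchmarked size is not enough, which for YAF sources points towards running multiple Logstash instances.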

[Figure: flows/sec vs. workers]

EDIT: Added more details for each benchmark

| | YAF | sonicWALL IPFIX | sonicWALL v9 | Cisco ASA | Cisco ASR |
|---|---|---|---|---|---|
| Flows/packet | 25 | 5 | 5 | 14 | 21 |
| Packet Size | 1415 | 285 | 289 | 1452 | 1392 |
| Rough Avg Flow Size | 56.6 | 57 | 57.8 | 103.7142857 | 66.28571429 |
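
The "Rough Avg Flow Size" row is simply packet size divided by flows per packet. A minimal sketch of that arithmetic follows, along with the packet rate implied by a given flow rate; the names are hypothetical and the units are whatever the table uses.

```python
# (flows per packet, packet size) per benchmark, copied from the table above.
SAMPLES = {
    "YAF": (25, 1415),
    "sonicWALL IPFIX": (5, 285),
    "sonicWALL v9": (5, 289),
    "Cisco ASA": (14, 1452),
    "Cisco ASR": (21, 1392),
}

def avg_flow_size(flows_per_packet, packet_size):
    """Average share of one packet taken up by each flow record."""
    return packet_size / flows_per_packet

def packets_per_second(flows_per_sec, flows_per_packet):
    """Packet rate the collector must receive to reach a given flow rate."""
    return flows_per_sec / flows_per_packet

for name, (fpp, size) in SAMPLES.items():
    print(f"{name}: {avg_flow_size(fpp, size):.2f} per flow")
```

For example, 4000 YAF flows/sec at 25 flows per packet works out to only 160 packets per second.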

@Zalgo2462
Contributor Author

The following data was also collected:

| Codec version | RAM (GB) | -Xmx/-Xms (GB) | UDP buffer (MB) | vCPU | Workers | CPU Optimized | Mongo | YAF flows/sec | sonicWALL IPFIX flows/sec | sonicWALL v9 flows/sec | Cisco ASA flows/sec | Cisco ASR flows/sec |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3.11.4 (default) | 32 | 8 | 16 | 16 | 16 | 1 | 1 | 3000 | 5500 | 3200 | 6200 | 8500 |
| 4.1.0 | 32 | 8 | 16 | 16 | 16 | 1 | 1 | 3800 | 12500 | 11500 | 12000 | 12900 |
| 4.1.0 | 32 | 8 | 64 | 16 | 16 | 1 | 0 | 4000 | 17500 | 12000 | 11500 | 18200 |

The latest version of the plugin significantly outperforms the default version. Additionally, the MongoDB plugin seems to cap the performance at around 13000 flows/second.
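
The size of the improvement can be read off the first two rows of the table above, which differ only in codec version. A small sketch of the per-source speedup, with hypothetical variable names:

```python
SOURCES = ["YAF", "sonicWALL IPFIX", "sonicWALL v9", "Cisco ASA", "Cisco ASR"]

# flows/sec with identical settings (16 MB UDP buffer, Mongo enabled),
# copied from the first two rows of the table above.
CODEC_3_11_4 = [3000, 5500, 3200, 6200, 8500]
CODEC_4_1_0 = [3800, 12500, 11500, 12000, 12900]

def speedups(old, new):
    """Ratio of new to old throughput for each flow source."""
    return {name: n / o for name, o, n in zip(SOURCES, old, new)}

for name, ratio in speedups(CODEC_3_11_4, CODEC_4_1_0).items():
    print(f"{name}: {ratio:.2f}x")
```

The gain ranges from about 1.3x for YAF to about 3.6x for sonicWALL v9, so the upgrade helps YAF sources the least.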

@Zalgo2462
Contributor Author

Closing this issue. Please track logstash-plugins/logstash-codec-netflow#151 for more information on YAF.
