
ossec-authd performance issues when generating large number of keys (20k+ ) #873

Closed
avisri opened this issue Jun 24, 2016 · 12 comments

@avisri

avisri commented Jun 24, 2016

ossec-authd takes 60+ seconds to generate a client key once agent IDs exceed ~20k, and uses 80%-100% of a CPU core per request on an 8-core system (VM).

Was able to reproduce:

Use a Windows or Mac client (one per server core) to keep requesting keys.
You can queue up to 512.

count=0
while true; do
   testhost=$(tr -dc 'a-zA-Z0-9' < /dev/urandom | fold -w 8 | head -n 1)
   # replace <authserver> with your test auth server name/IP
   time bin/agent-auth -m <authserver> -p 1515 -A "$count-$testhost"
   ((count++))
done

server-side mods:

# check out src, reconfigure max agents, and reinstall
cd src/
# make sure you have this patch if you are on an older branch:
# https://github.com/ossec/ossec-hids/commit/5e41e53c31b6553873d26e06c9be299598c99cc5
echo "50000" | make setmaxagents; ../install.sh
# alternatively, run "make all" and just copy the new ossec-authd binary to your installation location (/var/ossec/bin)

# run it
bin/ossec-authd -d
@avisri changed the title from "ossec-authd is taking 60 sec+ to generate a client key when generating id >25k" to "ossec-authd performance issues when generating large number of keys (20k+)" on Jun 25, 2016
@ddpbsd
Member

ddpbsd commented Jun 29, 2016

Most of OSSEC isn't threaded or anything, so the CPU usage makes sense. If you stop requesting keys, then request 1 key with a high agent id, does it still take 60+ seconds?

@avisri
Author

avisri commented Jun 29, 2016

Dan: @vikman90 pointed out this patch.

The response time went down from minutes to milliseconds.

Our use case:

  • we don't request specific ID numbers
  • we don't reuse IDs (a reinstall creates new IDs)

We did some load tests; here are the results:

With a 0.5 s gap/sleep between requests:

all requests (107 of them) succeeded, with an average response time of 0.28 s per key

With a 0.1 s gap/sleep between requests:

all requests succeeded, with an average response time of 0.32 s

With a 0.05 s gap/sleep between requests:

for 250 requests: 100% succeeded, with an average response time of 5.8 s
for 350 requests: 99% succeeded, with an average response time of 8.4 s
for 500 requests: 95% succeeded, with an average response time of 10.0 s
for 874 requests: 92% succeeded, with an average response time of 13.0 s

For concurrent requests (using a single client to create parallel threads):

~150 concurrent requests were supported, with an average creation response time of 7.14 s
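A driver along these lines could produce the numbers above. The sketch below is my reconstruction, not the script actually used; the registration command is passed in as an argument so it can be stubbed, and in a real run it would be the agent-auth call from the reproduction steps (e.g. `bin/agent-auth -m <authserver> -p 1515 -A "$name"`):

```shell
#!/bin/bash
# Sketch of a load-test driver (reconstruction, not the original script):
# run N registrations with a fixed gap between them and report how many
# succeeded. The registration command is injectable so it can be stubbed.
run_load_test() {
  local count=$1 gap=$2
  shift 2                        # remaining args: the registration command
  local ok=0 i
  for i in $(seq 1 "$count"); do
    if "$@" "agent-$i" >/dev/null 2>&1; then
      ok=$(( ok + 1 ))
    fi
    sleep "$gap"                 # the gap between requests (0.5 / 0.1 / 0.05 s)
  done
  echo "$ok/$count succeeded"
}

# Stubbed example run: "true" always succeeds; replace it with the
# real agent-auth invocation to test against a live auth server.
run_load_test 5 0 true           # prints "5/5 succeeded"
```

Measuring per-request latency would additionally wrap the command in `time` (or `date +%s.%N` deltas), as in the reproduction loop earlier in the thread.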

@ddpbsd
Member

ddpbsd commented Jun 29, 2016

Neat. Can you put together a pull request?

@avisri
Author

avisri commented Jul 10, 2016

Sorry about the delays @ddpbsd. We have deployed the patch in prod and it works great :D.

I need a suggestion on the patch.

I presume folks who are not operating at our scale might still like to turn it on, since their search space is small.

We use a macro (#define REUSE_RIDS 0) to replace the old code.
We could also use runtime logic instead: if MAX_AGENTS < 4000, use the old search; otherwise use the dichotomic search.

Kindly let me know which option is preferred :) and I will quickly open the pull request.

@awiddersheim
Member

First of all, great patch @vikman90. @avisri, it doesn't make much sense to me to keep both code paths. I'd rather just see @vikman90's patch applied as is.

@ddpbsd
Member

ddpbsd commented Jul 11, 2016

@avisri I haven't been able to look at this much, but the closer your patch is to the original the better IMO. That should make it easier to keep both versions up to date.

@vikman90
Contributor

vikman90 commented Jul 11, 2016

Hi all.

I just sent the PR with the original patch that I made. I don't think it is necessary to keep the old way for small agent counts, because the number of attempts is still low: with 4000 agents, ossec-authd will make at most 12 attempts (base-2 log of 4000).
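The at-most-log2(n)-attempts behavior can be sketched as a dichotomic (binary) search for the first free ID. This is an illustration of the idea only, not the code from the actual patch; it assumes the assigned IDs are given as a sorted list starting at 1:

```shell
#!/bin/bash
# Illustration of the dichotomic idea (not the actual patch code):
# find the smallest free agent ID given a sorted list of assigned IDs.
# With n assigned IDs this takes at most ~log2(n) probes, e.g. ~12 for 4000.
first_free_id() {
  local ids=("$@") lo=0 hi=$# mid
  while [ "$lo" -lt "$hi" ]; do
    mid=$(( (lo + hi) / 2 ))
    if [ "${ids[mid]}" -eq $(( mid + 1 )) ]; then
      lo=$(( mid + 1 ))   # positions 0..mid are fully occupied
    else
      hi=$mid             # a gap exists at or before position mid
    fi
  done
  echo $(( lo + 1 ))
}

first_free_id 1 2 3 4 5   # prints 6 (no gaps, so the next ID follows the last)
```

Each probe halves the remaining range, which is where the ~12 attempts for 4000 agents comes from; a linear scan that compares every candidate ID against every entry degrades quadratically as the key file grows.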

@avisri I'm very glad that it's working great in your system! Thank you very much for testing it and for your feedback.

Thank you all for your interest, I hope it's useful to the OSSEC community.
Best regards.

@awiddersheim
Member

Merged. Thanks.

@avisri
Author

avisri commented Jul 12, 2016

+1

Thanks


@avisri
Author

avisri commented Jul 12, 2016

Thanks, that helps. I will make it a compile-time switch instead of a runtime logic switch. Since MAX_AGENTS itself is a compile-time option, it will be easy to follow suit and add a new macro/flag, REUSE_ID.


@vikman90
Contributor

@avisri: Do you mean the REUSE_ID option from the @wazuh project? That flag serves a different purpose: it removes an agent from client.keys instead of commenting it out, to keep the file from growing too much.

This feature allows authd and manage_agents to grant an ID that was previously assigned to another agent, but it leaves "holes" in the agent list, which makes it incompatible with the boosted algorithm (the dichotomic search assumes there is no fragmentation). In other words: no number below any assigned ID may be free.

Because of this, we disable the boosted search when the REUSE_ID flag is set.

@avisri
Author

avisri commented Jul 14, 2016

Thanks again for the details. Yes, I understood it similarly: in your patch it is one or the other. The macro did, however, kindle a few questions in my mind.

This is a good segue to my next question/feature request: is it possible for clients to renew their keys using the same ID/agent name (hostname)? This would come in handy for renew-keys-every-x-days scenarios without needing to clean all client rids and reinstall, and it would fit well with our current direction of not needing to reuse IDs :D.

