Upgrade to any version > 0.18.2 more than doubles CPU usage #209

Lidbetter · 2017-06-22T23:16:22Z

Hi,

Using Rollbar 0.18.2:
We normally see 2-4k requests per minute come through our load balancer, which are then served by 2 AWS m1.large instances at ~40-60% CPU utilization.

Using Rollbar 1.0.1 or 1.1.1 (tried to upgrade twice):
Our 2 AWS m1.large instances immediately jumped to 90-100% CPU utilization. We then scaled to 5 AWS m1.large instances to support the same amount of traffic at ~40-60% CPU utilization.

After experiencing the issue with 1.0.1, I had hoped that PR #158 would fix the issue we ran into. That does not appear to be the case.

When upgrading the library, the entirety of our diff is as follows:

// setup rollbar location:
 -    Rollbar::init([
 +    \Rollbar\Rollbar::init([

// composer.json
 -    "rollbar/rollbar": "^0.18.2",
 +    "rollbar/rollbar": "^1.1.1",

// composer.lock
// snipped, no other dependencies were changed

// manual reporting helpers
function reportErrorMessage($message, $extra_data = []) {
 // ... snip
 -    \Rollbar::report_message($message, Level::ERROR, $extra_data);
 +    \Rollbar\Rollbar::log('error', $message, $extra_data);
}

function reportException(Exception $exception, $extra_data = []) {
 // ... snip
 -    \Rollbar::report_exception($exception, $extra_data);
 +    \Rollbar\Rollbar::log('error', $exception, $extra_data);
}

function notifyException(Exception $exception, $extra_data = []) {
 // ... snip
 -    \Rollbar::report_exception($exception, $extra_data, [
 -        'level' => 'info',
 -    ]);
 +    \Rollbar\Rollbar::log('info', $exception, $extra_data);
}

There were no additional factors/changes which were introduced at the same time as the Rollbar upgrade.

Please let me know if there is any other information I can provide which would help with tracking down the cause of this issue.

Thanks.

rokob · 2017-07-06T02:41:35Z

Hey, sorry about this. I have been working on profiling the library and fixing the hot spots; PR #217 is that work. The main changes from 0.18.2 to 1.0 are that we are doing more scrubbing/truncation work and the logs are no longer batchable, so I am addressing those two areas. Will update you with what I find.

Lidbetter · 2017-07-06T07:59:25Z

Thanks for the update, looks good so far

cordoval · 2017-07-20T21:46:30Z

@rokob are you using blackfire or what? just curious 👍 great job

rokob · 2017-07-20T21:49:54Z

Xdebug and Webgrind

elazar · 2017-08-10T17:54:56Z

@rokob Was this resolved by #217? This comment from that PR indicates that the performance fixes are behind a configuration flag that's disabled by default and doesn't appear to be documented anywhere. Can you provide further details here?

ArturMoczulski · 2017-08-11T01:23:07Z

@elazar I believe the configuration flag is batched

rokob · 2017-08-14T16:30:27Z

So there are a few potential issues that could have been causing performance problems. I fixed all of the ones that I could find. I also reintroduced the ability to send errors in batches instead of as they arise. However, the way we send batched errors is different, as we no longer support an API endpoint that accepts a batch of errors. Instead we are using libcurl's multiplexing feature. So you can choose to batch errors with the boolean config parameter batched, and you can configure the size of the batch with batch_size, which is 50 by default.
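
For readers unfamiliar with libcurl's multi interface, here is a rough sketch of what sending several requests concurrently looks like in plain PHP. This is not the library's implementation; sendConcurrently() and its parameters are hypothetical names used only for illustration, and the endpoint URL is whatever the caller passes in.

```php
<?php
// Illustrative sketch only, NOT how rollbar-php is implemented.
// sendConcurrently() and its arguments are hypothetical.
function sendConcurrently(string $url, array $payloads): array
{
    $multi = curl_multi_init();
    $handles = [];

    // One easy handle per payload, all attached to the same multi handle.
    foreach ($payloads as $key => $payload) {
        $ch = curl_init($url);
        curl_setopt_array($ch, [
            CURLOPT_POST           => true,
            CURLOPT_POSTFIELDS     => json_encode($payload),
            CURLOPT_HTTPHEADER     => ['Content-Type: application/json'],
            CURLOPT_RETURNTRANSFER => true,
        ]);
        curl_multi_add_handle($multi, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none are still running.
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0) {
            // Wait up to one second for socket activity.
            curl_multi_select($multi, 1.0);
        }
    } while ($running > 0 && $status === CURLM_OK);

    // Collect the HTTP status code of each transfer, keyed like the input.
    $codes = [];
    foreach ($handles as $key => $ch) {
        $codes[$key] = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        curl_multi_remove_handle($multi, $ch);
    }
    curl_multi_close($multi);

    return $codes;
}
```

The requests still go out as separate HTTP calls; the multi handle only lets them overlap, which is why the effect on CPU can differ from one workload to another.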

So by default a decent chunk of the performance improvements are already part of the current release as they do not change the way requests are sent. However, if you want to try turning on batching you can choose to do so, but note that it may have a positive or negative impact on your performance depending on the actual workload of your app.
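
A minimal sketch of turning batching on, assuming rollbar/rollbar 1.3.x installed via Composer. The batched and batch_size keys are the ones named above; the ROLLBAR_ACCESS_TOKEN environment variable name and the environment value are placeholders, not something this thread specifies.

```php
<?php
// Sketch only: assumes Composer's autoloader lives at vendor/autoload.php.
require __DIR__ . '/vendor/autoload.php';

\Rollbar\Rollbar::init([
    // Placeholder: read the project token from a hypothetical env var.
    'access_token' => getenv('ROLLBAR_ACCESS_TOKEN'),
    // Placeholder environment name.
    'environment'  => 'production',
    // Send errors in batches instead of as they arise.
    'batched'      => true,
    // Number of items per batch; 50 is the default mentioned above.
    'batch_size'   => 50,
]);
```

Since the impact may be positive or negative depending on the workload, it is probably worth comparing CPU usage with batched set to true and to false before rolling it out.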

Lidbetter · 2017-08-14T17:47:17Z

Thanks for your effort on this @rokob we will be testing out 1.3.1 this week (probably today).

rokob · 2017-08-19T01:13:07Z

Going to close this, please re-open or open a new issue if there is still a problem.

Lidbetter · 2017-08-19T03:45:34Z

We still experienced a massive performance hit (pretty much unchanged from before - with both batching (50) on and off). I realize that is not super helpful in narrowing down where the problem is. Just got approval for blackfire, so will benchmark the differences and hopefully narrow down where the issue lies.

nandgate7400 · 2017-08-29T18:26:51Z

I'd like to chime in that I'm also experiencing massive performance issues. I updated to the latest version (1.3.1) and my server load goes up by a factor of 20 or more. It is unusable in a production environment.

rokob added this to the v1.3.0 milestone Jul 6, 2017

rokob self-assigned this Jul 7, 2017

rokob closed this as completed Aug 19, 2017

Lidbetter mentioned this issue Aug 29, 2017: Version 1.3.1 continues to have massive performance issues. #256 (Closed)
