Performance issues after upgrading from 1.0.4 -> 1.0.8 #2857

Closed
neerajrj opened this issue Apr 6, 2015 · 16 comments

@neerajrj (Contributor) commented Apr 6, 2015

Hello,

We are seeing high levels of lock contention after our upgrade to RxJava 1.0.8.

Apologies for not having a unit test to reproduce this; we have a fairly complex system and are having trouble figuring out where to dig deeper to find a reproducible case.

This is a paste from a JMC view. As far as we know, nothing should be getting unsubscribed in our application. We would appreciate it if anyone could shed some light on what kind of behavior would trigger the stack below.

Stack Trace Sample Count    Percentage(%)
java.util.concurrent.locks.LockSupport.unpark(Thread)   1,113   76.233
 java.util.concurrent.locks.AbstractQueuedSynchronizer.unparkSuccessor(AbstractQueuedSynchronizer$Node) 1,113   76.233
      java.util.concurrent.locks.AbstractQueuedSynchronizer.release(int)    1,113   76.233
         java.util.concurrent.locks.ReentrantLock.unlock()  1,113   76.233
            rx.internal.util.SubscriptionList.remove(Subscription)  1,113   76.233
               rx.internal.schedulers.ScheduledAction$Remover2.unsubscribe()    1,113   76.233
                  rx.internal.util.SubscriptionList.unsubscribeFromAll(Collection)  1,113   76.233
                     rx.internal.util.SubscriptionList.unsubscribe()    1,113   76.233
                        rx.internal.schedulers.ScheduledAction.unsubscribe()    1,113   76.233
                           rx.internal.schedulers.ScheduledAction.run() 1,113   76.233
                              java.util.concurrent.Executors$RunnableAdapter.call() 1,113   76.233
                                 java.util.concurrent.FutureTask.run()  1,113   76.233
                                    java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor$ScheduledFutureTask)    1,113   76.233
                                       java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run()   1,113   76.233
                                          java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker)  1,113   76.233
                                             java.util.concurrent.ThreadPoolExecutor$Worker.run()   1,113   76.233
                                                java.lang.Thread.run()  1,113   76.233

@akarnokd (Member) commented Apr 6, 2015

I guess you are scheduling a lot of work on the computation scheduler. We changed the tracking of tasks from synchronized to j.u.c.Lock because it gives better throughput according to our JMH benchmarks. It appears task addition takes longer while holding the lock, so the unsubscribe side spins and then parks; most of the time, spinning should be enough. Did you measure an actual performance degradation?
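
For illustration only, here is a hypothetical sketch (not the actual rx.internal.util.SubscriptionList source) of the kind of lock-guarded tracking being discussed; the LockSupport.unpark frames dominating the JMC sample above are paid inside unlock() whenever another thread has queued up on the lock:

import java.util.LinkedList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a subscription container guarded by a j.u.c. lock;
// NOT the real RxJava implementation, just the shape of the change discussed.
class LockedSubscriptionList<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final List<T> items = new LinkedList<>();

    void add(T item) {
        lock.lock();
        try {
            items.add(item);
        } finally {
            // If another thread blocked on the lock while we held it,
            // unlock() ends up calling LockSupport.unpark() on it.
            lock.unlock();
        }
    }

    void remove(T item) {
        lock.lock();
        try {
            items.remove(item);
        } finally {
            lock.unlock();
        }
    }
}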

@neerajrj (Contributor, Author) commented Apr 6, 2015

We do not have formal performance benchmarks for our jobs. We autoscale our cluster based on resource utilization, and we saw our cluster size go up by about 30% - 100% depending on the workload.
To give you a bit more context, we have built a reactive stream processing system that reads data items as they flow in and processes them on a computation thread pool using observeOn. In terms of scale, the jobs seeing the most impact process <= 10k messages/sec of a few kB each.
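
A rough sketch of the kind of pipeline described above (the source, sizes, and names are hypothetical placeholders, not the reporter's actual code):

import rx.Observable;
import rx.schedulers.Schedulers;

public class PipelineSketch {
    public static void main(String[] args) throws InterruptedException {
        // Hypothetical source standing in for the incoming message stream.
        Observable<byte[]> messages = Observable.range(1, 10_000)
                .map(i -> new byte[2_048]); // a few kB per message

        messages
                // The observeOn hop hands work off to the computation
                // scheduler; this is where the scheduler-task bookkeeping
                // seen in the stack trace above happens.
                .observeOn(Schedulers.computation())
                .map(PipelineSketch::process)
                .subscribe();

        Thread.sleep(1_000); // let the async work drain in this toy example
    }

    private static int process(byte[] payload) {
        return payload.length; // stand-in for the real per-message work
    }
}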

@akarnokd (Member) commented Apr 6, 2015

Sounds like your data rate reaches a critical frequency where the submission of new values in observeOn overlaps its drain and thus causes extra contention. The change from 1.0.4 to 1.0.8 consists of two parts: the lock in SubscriptionList, and the use of SubscriptionList instead of CompositeSubscription for non-timed tasks inside the computation scheduler. What Java version are you running, and can you name the virtualization environment?
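
The non-timed path mentioned here is the plain worker scheduling that observeOn relies on under the hood; a minimal, purely hypothetical way to exercise it directly looks roughly like this:

import rx.Scheduler;
import rx.schedulers.Schedulers;

public class WorkerSketch {
    public static void main(String[] args) throws InterruptedException {
        // Each schedule() call wraps the action in a ScheduledAction whose
        // add-on-submit / remove-on-completion bookkeeping is what appears
        // in the stack trace at the top of this issue.
        Scheduler.Worker worker = Schedulers.computation().createWorker();
        try {
            for (int i = 0; i < 10_000; i++) {
                worker.schedule(() -> { /* per-task work would go here */ });
            }
            Thread.sleep(500); // give the toy tasks time to run
        } finally {
            worker.unsubscribe(); // release the worker
        }
    }
}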

@neerajrj (Contributor, Author) commented Apr 6, 2015

Hey David,

We are on Java 8 and we are running in a Mesos container inside an AWS instance (m3-2xl series).

Thanks
Neeraj

@akarnokd (Member) commented Apr 6, 2015

Thanks. I'm not sure what the cause is, but you could try shifting the contention by batching before the observeOn and unbatching after it:

source.batch(4).observeOn(Schedulers.computation()).concatMap(v -> Observable.from(v))

@davidmoten (Collaborator) commented Apr 6, 2015

That would be buffer rather than batch, right? I was going to suggest the same thing; it sounds like a fast round-robin on a scheduler.

@akarnokd (Member) commented Apr 6, 2015

@davidmoten Sure.

@neerajrj (Contributor, Author) commented Apr 6, 2015

Hey David,
Batching is a good suggestion; however, that would be a breaking change.

@akarnokd (Member) commented Apr 6, 2015

This is why I suggested using concatMap to flatten the batches again after the observeOn so your subsequent computation chain doesn't need to change.
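
Put together with the buffer correction above (a sketch only; the buffer size of 4, the source, and the class name are placeholders), the batching stays internal to the chain, and everything downstream of the concatMap still sees individual items:

import rx.Observable;
import rx.schedulers.Schedulers;

public class BufferedObserveOnSketch {
    public static void main(String[] args) throws InterruptedException {
        Observable<Integer> source = Observable.range(1, 1_000); // placeholder source

        Observable<Integer> processed = source
                .buffer(4)                                 // cross the thread hop in lists of up to 4
                .observeOn(Schedulers.computation())       // fewer hand-offs across the scheduler boundary
                .concatMap(list -> Observable.from(list))  // flatten back to individual items, in order
                .map(v -> v * 2);                          // downstream stays per-item, unchanged

        processed.subscribe(v -> { });
        Thread.sleep(500); // give the toy pipeline time to drain
    }
}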

@neerajrj (Contributor, Author) commented Apr 6, 2015

Oops, hit the send button too soon.

As I was saying, adding a batch or buffer would change our public API from T to a List.

You mentioned earlier that you saw a significant throughput improvement in the JMH tests; can you provide more details? I am curious why the JMH tests do not reflect the performance we see in production. Were there any other benefits (did it resolve the issue of the NPE in merge as well)?

Thanks
Neeraj

@neerajrj (Contributor, Author) commented Apr 6, 2015

Ah, OK, I see now; let me try out your suggestion. Thanks!

@akarnokd (Member) commented Apr 6, 2015

There are two PRs that made performance enhancements:

#2603, #2773

Benchmark                                   (size)     1.0.4      PR 2603    PR 2773
r.s.ComputationSchedulerPerf.observeOn           1  104110.926  115707.286  113905.358
r.s.ComputationSchedulerPerf.observeOn        1000    3212.434   13020.027   28618.423
r.s.ComputationSchedulerPerf.observeOn     1000000       9.508      16.559      32.166
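
The numbers above are JMH throughput scores (higher is better). A benchmark of this kind looks roughly like the sketch below; this is a hypothetical reconstruction, not the actual ComputationSchedulerPerf source:

import java.util.concurrent.CountDownLatch;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

import rx.Observable;
import rx.schedulers.Schedulers;

// Hypothetical reconstruction of an observeOn throughput benchmark in the
// spirit of rx.schedulers.ComputationSchedulerPerf; not the real source.
@State(Scope.Thread)
public class ObserveOnPerfSketch {

    @Param({"1", "1000", "1000000"})
    public int size;

    private Observable<Integer> source;

    @Setup
    public void setup() {
        source = Observable.range(1, size);
    }

    @Benchmark
    public void observeOn() throws InterruptedException {
        // One "op" = pushing `size` items across the computation scheduler.
        CountDownLatch latch = new CountDownLatch(1);
        source.observeOn(Schedulers.computation())
              .subscribe(v -> { }, e -> latch.countDown(), latch::countDown);
        latch.await();
    }
}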

@akarnokd (Member) commented Apr 23, 2015

I've run into this performance degradation myself, and indeed, at some concurrency levels (4+ in my case) the degradation was enormous. Could you check whether PR #2912 fixes your case?

@neerajrj (Contributor, Author)

Hey David,
Our current setup doesn't lend itself to trying out a jar at scale (I can only do local tests). I will try it out, but the results may not be the same at production scale.
Thanks for the fix.
-Neeraj

@neerajrj (Contributor, Author)

Hey David,

The performance is back to what it used to be with release 1.0.4 after we upgraded to RxJava 1.0.10! Looks like your fixes worked.
Thanks!

@akarnokd (Member)

@neerajrj Hi and thanks for confirming.
