
Unable to obtain > 40 RPS after migrating to our own Parse server #2030

Closed
sohagfan opened this issue Jun 10, 2016 · 22 comments

@sohagfan

Our setup is as follows:

  • Elastic Beanstalk
  • 64bit Amazon Linux 2016.03 v2.1.1 running Node.js (v4.4.3)
  • nginx proxy server
  • m4.large single instance hosted in Virginia
  • Parse Server (v2.2.11)
  • Mongo DB driver (v2.1.18)
  • mLab dedicated cluster (v3.0.10)

We are averaging only around 25 RPS, with a peak of 40 RPS. When we exceed that peak, we see high latency and dropped-connection errors in the logs.

An example error from the nginx log is as follows:

"2016/06/09 06:46:34 [error] 2684#0: *254 upstream prematurely closed connection while reading response header from upstream, client: , server: , request: "GET /1/classes/. . . host: “www.example.com""

With the same application, on the hosted Parse.com, we were able to scale to as high as required. We were able to request 70+ RPS successfully without any requests dropped.

Are there any configuration changes to any part of the setup mentioned above (EB, Node.js, nginx, Parse Server, MongoDB driver, mLab), or to something else we have not mentioned or have missed, that would get us better performance?

If you are getting better performance, what is your setup?

Any pointers / comments will be much appreciated.

@bohemima
Contributor

You might want to have a look at the parse-server logs (run with the VERBOSE=1 environment variable), check your indexes, and enable profiling in your mongod to pinpoint slow-running queries.
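For reference, a minimal mongo-shell sketch of enabling the profiler; the 100 ms threshold is just a starting point, not a recommendation:

// Run in the mongo shell against the application database.
// Level 1 captures only operations slower than the threshold (in ms).
db.setProfilingLevel(1, 100)

// After exercising the app, list the slowest captured operations:
db.system.profile.find().sort({ millis: -1 }).limit(10).pretty()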

@flovilmart
Contributor

You may want to run on multiple smaller instances.

@Knana

Knana commented Jun 23, 2016

At NodeChef we have customers doing 150+ req/sec of complex ad-hoc queries with just two 512 MB RAM app containers before they run into the issue you describe. Our databases run on 8 physical cores, the equivalent of either a c4.2xlarge or an r3.2xlarge on AWS. We use bare-metal infrastructure, which provides the best performance. We also give you real-time RPS stats so you can gauge this for yourself. We can help you get started if you are interested.

@flovilmart
Contributor

RAM is definitely not an issue with parse-server, as it is much more I/O- and CPU-bound. That's what I see on my side on GAE: we can process ~100 RPS on 2 instances while keeping CPU < 50% with n1-highcpu-2 machines (2 vCPUs, 1.7 GB of RAM each).

@sohagfan
Author

Thanks for all your comments and suggestions.

@bohemima: We have run the server with VERBOSE=1 but gained no further insight. Thanks for your suggestion of building indexes in MongoDB to address slow queries. We have done this previously and may need to continue doing so. However, we don't feel this is the issue, as the queries we run during the regular course of our application have optimized indexes.

@flovilmart and @Knana: Thanks for your suggestion to run on multiple smaller instances and for the performance information you have provided. In order to test the limits of a single server, we disabled autoscaling. It is good to know that you were able to achieve 150+ RPS with two containers.
Question: Do you see a temporary performance hit when a new EC2 instance fires up?

Your suggestions are all good ones; however, we suspect our problem lies in the interaction between the Parse server and our application.
One more question: do you use Node.js profiling tools? We need to gain insight into where the request-response loop is being delayed within the stack.

Thanks in advance.

@kranzky

kranzky commented Jul 6, 2016

We are using NodeChef and currently run 10 256 MB app containers. No queries to mongo take more than 20ms, and most take a tiny fraction of that, yet we still have performance issues. The problem seems to be a rather complex cloud code function that performs a query returning 10 objects and then performs an additional 4 queries for each of the returned objects, resolving all 40 queries in a single big Parse.Promise.when(...).

This wouldn't be an issue, I don't think, if each of those additional queries hit mongo directly, which would make sense to me; they're cloud code functions running within the server. But they don't: each additional query hits the app containers, which, apart from introducing inefficiencies, means that even running this single cloud code function causes requests to our Parse API to queue up. The end result is that this single cloud code function can take up to ten seconds to complete. I just don't get it.
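For illustration, a hypothetical sketch of the shape of such a function; the class and field names here are made up:

// One query returns 10 objects, then 4 follow-up queries run per object,
// all 40 resolved together in a single Parse.Promise.when.
Parse.Cloud.define("fanOutFunction", function(request, response) {
  var query = new Parse.Query("Post");
  query.limit(10);
  query.find().then(function(posts) {
    var promises = [];
    posts.forEach(function(post) {
      promises.push(new Parse.Query("Comment").equalTo("post", post).find());
      promises.push(new Parse.Query("Like").equalTo("post", post).find());
      promises.push(new Parse.Query("Tag").equalTo("post", post).find());
      promises.push(new Parse.Query("Author").equalTo("post", post).find());
    });
    return Parse.Promise.when(promises);
  }).then(function() {
    response.success("done");
  }, function(error) {
    response.error(error);
  });
});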

@lpremraj

lpremraj commented Jul 6, 2016

Thanks Jason for taking the time to respond.
Much appreciated. We'll look into this and respond.
Thanks again,
Prem


@sohagfan
Author

@flovilmart and @Knana: We are revisiting this issue, which remains a problem for us. Thanks once again for your previous helpful suggestions and insights. I have a couple of new questions for both of you.

As we know, RPS is not available from the Elastic Beanstalk monitoring graphs; they show only latency, CPU utilization, network bytes in and out, and network packets in and out.

My questions are:

Q1. How are you measuring RPS? Are you calculating it using benchmark tests, inferring it from what it was when you had pointed your app at hosted parse.com, or something else altogether?

Q2. Are you primarily using Cloud Code functions (the /functions endpoint), the /batch endpoint, the /classes endpoint, or a mixture of some or all of these?

@jasonhutchens, this may be of interest to you: in one of our benchmark tests, using:

  • Autoscaling 1->4 on
  • Autoscale trigger latency >= 2 seconds
  • c4.xlarge instances
  • 4 instances active
  • the rest of the setup being similar to that mentioned in the first post,

we saw a latency of 20-25 seconds with the VERBOSE environment variable set. When we removed the VERBOSE environment variable, latency dropped an order of magnitude to between 1 and 2 seconds, although 4 instances remained active.

@kranzky

kranzky commented Jul 28, 2016

Thanks @sohagfan; I'll benchmark with verbose logging disabled. Although I still don't understand why Parse queries made from cloud code functions need to be routed back through the API layer when they could bypass a lot of that.

@sohagfan
Author

@jasonhutchens: Sure, no problem. Hope it helps. You may already be aware of this, but in case you aren't: you have to actually delete the environment variable for it to stop taking effect. It doesn't matter whether its value is 1 or 0; setting the value to 0 makes no difference, and it behaves as though the variable were still set.
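A plausible explanation, assuming parse-server only checks whether the variable is set at all: in Node.js every environment variable is a string, and any non-empty string, including "0", is truthy.

// Assumed gating logic, for illustration only; any non-empty value
// would keep verbose logging enabled.
if (process.env.VERBOSE) {
  // still reached when VERBOSE=0, since "0" is a non-empty string
}

console.log(Boolean("0")); // true
console.log(Boolean(""));  // false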

@kranzky

kranzky commented Jul 28, 2016

@sohagfan no, we're running on NodeChef; I'll let them know about this.

@Knana

Knana commented Jul 28, 2016

@sohagfan see below for responses to your questions:

  1. NodeChef metrics measure requests per second as well as the number of connections in real time, so we are actually inferring the RPS from a live app the customer is using in production. This information is available from our dashboard. The RPS is calculated by summing the number of requests dispatched to all app containers within a second, and the connections simply measure the number of open sockets for the app (see the sketch below).
  2. What we measured is typically a mixed-workload scenario: direct class queries as well as cloud code making queries back to a class, and so on.

Hope this helps.
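As a rough illustration of that counting approach, a hypothetical per-container sketch (the actual NodeChef implementation is not public, so this is just the idea):

// Count requests dispatched to this container and report once a second;
// summing these figures across all containers gives the app-wide RPS.
var express = require("express");
var app = express();

var count = 0;
app.use(function(req, res, next) {
  count++;
  next();
});

setInterval(function() {
  console.log("RPS (this container):", count);
  count = 0;
}, 1000);

app.listen(3000);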

@bkprabhak

@flovilmart please let me know what tests you ran to determine that you got ~100 RPS with 2 n1-highcpu-2 machines. We are running tests on similar machines on AWS Elastic Beanstalk through the REST API and are seeing much worse performance numbers. Should we also expect a difference in performance between Cloud Code methods triggered via the REST API vs. clients (through the Parse iOS and Android SDKs)?

Also, what trigger did you set up for autoscaling: latency, CPU, or network out?

@drew-gross @hramos Happy to hear what others are doing as well. We are stuck with our load testing at the moment and it's preventing us from moving to production.

Let me know what tests you recommend for load testing our dev environment so that we can feel comfortable moving to production. Most of our load comes from clients, and we are not sure how to simulate this, given the latency issues we have noticed when running benchmark tests through the REST API.

@flovilmart
Contributor

For now there should be no difference between cloud code and the client SDKs, as all cloud code requests go through the HTTP interface. There is a pull request that attempts to run cloud code with direct access to the JS interface instead of the HTTP one.

@reasonman

@flovilmart I'd be interested in how you got to your 100+ RPS number as well. In my tests using a single n1-highcpu-2 on GCE and testing with Locust, I can get 20-30 RPS before the CPUs peg. I didn't try 2 instances, but I suspect I would only get around double what I get with a single instance.

For reference, here are my load tests using Locust via REST against a clustered Parse Server instance (PM2: https://nodejs.org/api/cluster.html#cluster_cluster and http://pm2.keymetrics.io/docs/usage/cluster-mode/), followed by a sketch of the cluster setup:

f1-micro-1: ~5 RPS @ ~40ms response time
n1-highcpu-2: ~30 RPS @ ~400ms response time
n1-highcpu-4: ~70 RPS @ ~600ms response time
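A minimal sketch of a clustered parse-server entry point using Node's built-in cluster module; the database URI, app ID, master key, and mount path are placeholders:

// Fork one worker per CPU core; each worker runs its own parse-server.
var cluster = require("cluster");
var os = require("os");

if (cluster.isMaster) {
  os.cpus().forEach(function() {
    cluster.fork();
  });
  cluster.on("exit", function(worker) {
    console.log("worker " + worker.process.pid + " died; respawning");
    cluster.fork();
  });
} else {
  var express = require("express");
  var ParseServer = require("parse-server").ParseServer;

  var app = express();
  app.use("/parse", new ParseServer({
    databaseURI: "mongodb://localhost:27017/dev", // placeholder
    appId: "myAppId",                             // placeholder
    masterKey: "myMasterKey"                      // placeholder
  }));
  app.listen(1337);
}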

@flovilmart
Contributor

@reasonman we use neither PM2 nor cluster. I spawned 10 AWS instances and queried random objects from our DB.

In general we see request times below 30ms with Stackdriver.

Since then, we changed our setup to cap at 50 RPS, and the CPU is still below 35%.

Note that the DB is in the same zone as the servers.

@kranzky

kranzky commented Aug 16, 2016

@flovilmart I'm using Parse Server on NodeChef and definitely have problems with the SDK routing everything through the HTTP interface when running cloud code functions. Right now this is blocking us from moving our production apps from Parse to Parse Server.

I spun up a simple NodeChef Parse instance with one app server to demonstrate the issue. I deployed the following main.js to cloud code:

// Baseline: a trivial function that performs no queries.
Parse.Cloud.define("niceFunction", function(request, response) {
  response.success("Hello world!");
});

// Runs a single _User query and resolves with the number it was given.
function allUsers(num) {
  var query = new Parse.Query(Parse.User);
  return query.find().then(function() { return num; });
}

// Fires `num` independent _User queries and resolves them all together.
Parse.Cloud.define("insaneFunction", function(request, response) {
  var num = request.params.num || 1;
  var promises = [];

  for (var i = 0; i < num; ++i) {
    promises.push(allUsers(i));
  }

  Parse.Promise.when(promises).then(function(results) {
    response.success(results);
  }, function(error) {
    response.error(error);
  });
});

I called niceFunction from Postman and got a response latency of 2ms according to the NodeChef stats. I then called insaneFunction without any parameters and got a response latency of 29ms for the function call, noting that an additional API query to _User was made before the request completed (with a response latency of 8ms).

I then called insaneFunction with the num parameter set to 10 and got a response latency of 150ms, with the response latencies of the 10 separate API queries to _User ranging between 14ms and 34ms.

So routing queries through the HTTP interface is causing requests to queue up, basically serialising what should be parallel operations. What concerns me is that many of our production cloud code functions perform many more than 10 queries when called.

My expectation is that calling insaneFunction with num set to 10 should send only a single request to the API and should run 10 mongo queries in parallel, completing in about 15ms, not 150ms. We shouldn't need to scale out our app servers to handle this kind of workload :)

Now, having said all that, I suspect Parse also suffered from the same issue. I'm just curious to know what it would take to have the SDK talk directly to Mongo from cloud code, to avoid having functions spawn requests that queue up at the HTTP interface. I'm hopeful this would be a performance win, and that it would avoid the problem of a single function call blocking access to our entire cluster (which is what happens at the moment with our staging apps, which use 10 app servers and have cloud code functions that may fire off 100 queries, causing all 10 app servers to do work to serve a single request).

@flovilmart
Contributor

There is a PR for that: #2316

@kranzky

kranzky commented Aug 16, 2016

@flovilmart cheers, I'll test that on our NodeChef instance (with help from the team there) in the hope that it improves our numbers. Thanks!

@flovilmart
Contributor

Actually, there are some things to fix in that PR before you can roll it out confidently.

@kranzky

kranzky commented Aug 17, 2016

@flovilmart yes, understood... I just meant we'd do some testing. I can confirm that the insaneFunction call that previously took 150ms now takes 40ms to complete, which is a very nice optimisation. Looking forward to being able to use this in production :)

@flovilmart
Contributor

flovilmart commented Sep 3, 2016

I'm gonna close this issue, as it turned into a discussion and we've explored different ways to improve the server's performance.

Also, #2316 will probably land in the next version, protected by the EXPERIMENTAL flag.

Default cluster support from the CLI (#2596) will also appear in the next version.
