
Possible memory leak? Memory increasing with every collection().add() #620

Closed
CyrisXD opened this issue Aug 15, 2019 · 10 comments
CyrisXD commented Aug 15, 2019

Describe your environment

  • Operating System version: Heroku
  • Firebase SDK version: 8.3.0
  • Firebase Product: Database

Describe the problem

I'm parsing a CSV with 130k lines, and as I iterate through each row I add it to the collection, rate limited to 450/s. However, with each CSV parse Firestore seems to use more and more memory, and the memory is never released. Removing the collection().add() code fixes the problem: parsing the CSV alone never goes above 80 MB.

[screenshot: memory usage climbing with each CSV upload]

Steps to reproduce:

I'd imagine you could run the relevant code through a loop; in my case it was 130k iterations, matching the CSV size. If there is a more efficient way of achieving this, please let me know.

Relevant Code:

db.collection("games")
  .doc(req.body.docid)
  .collection("players")
  .add({
    Number: row[0],
    FirstName: row[1],
    Lastname: row[2],
    Email: row[3],
    Prize: row[4],
    gameID: req.body.docid,
    guid: row[5]
  })
  .then(() => {
    progressCount++;
    updateProgress(req.body.docid);
  })
  .catch(err => {
    console.log(err.message);
  });
CyrisXD (Author) commented Aug 15, 2019

I can confirm that, 24 hours later, the memory is still allocated. Full code can be found here - PasteBin

Upon my 4th CSV upload, Firestore pushed the memory above Heroku limits and crashed the Dyno; once restarted, the Dyno's memory went back to ~50 MB.

schmidt-sebastian added a commit to googleapis/nodejs-firestore that referenced this issue Aug 16, 2019
This PR logs when we create or delete instances in the client pool, helping us debug firebase/firebase-admin-node#620
schmidt-sebastian (Contributor) commented
@CyrisXD Thanks for reporting this! There are a couple of things that could be going on here, but I can only think of one possible leak in the Firestore library itself rather than in its dependencies. To narrow that down, I am preparing a PR that adds logging. It would be great if you could try again once we have released a client that includes this PR (I'll follow up).

If this is an issue with our dependencies, then we can forward it to one of the client teams that work on our network stack.

CyrisXD (Author) commented Aug 16, 2019

Thanks @schmidt-sebastian, I'm happy to help out where I can. I'm working on a project that's supposed to hit production Monday, so I'm really hoping we can figure this one out. Let me know what to do once you're ready. Cheers

schmidt-sebastian (Contributor) commented

The logging has been added and released. Would you mind updating to 2.2.7 of @google-cloud/firestore and enabling logging (via admin.firestore.setLogFunction(console.log))? Note that this is just a first step in diagnosing your issue, and that we will likely not be able to resolve it by Monday.
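For clarity, the requested setup might look like the sketch below. This assumes a standard firebase-admin initialization, which is not shown in this thread; the only call taken from the comment above is setLogFunction.

```javascript
// Sketch only: enable Firestore's internal logging as requested above.
// Assumes firebase-admin (with @google-cloud/firestore@2.2.7) is installed
// and credentials are picked up from the environment.
const admin = require("firebase-admin");

admin.initializeApp();

// Route the client's internal log lines (including the instance-pool
// create/delete messages added by the PR) to stdout, so they show up
// in the Heroku logs.
admin.firestore.setLogFunction(console.log);
```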

CyrisXD (Author) commented Aug 17, 2019

Thanks @schmidt-sebastian, here are the logs: Pastebin.

I just did a small run, ~250 records. Let me know if you need anything else.
Cheers

schmidt-sebastian (Contributor) commented

From looking at the logs, everything seems to be operating more or less normally. It does seem, though, that some part of the system can't keep up with your workload, so I would recommend throttling your update rate. By the time you had issued your 250 write requests, none of them had been sent to the backend, which means that we cannot release any memory.

Furthermore, once the backend receives the writes, you might also be hitting write rate limits (depending on your data model; see https://firebase.google.com/docs/firestore/quotas#writes_and_transactions).

If you are able to throttle the rate at which you process documents, I suspect that your memory usage will remain near constant (once you reach a steady state).

Please do let us know if this issue persists.

schmidt-sebastian (Contributor) commented

I'm going to close this issue for now, as I believe everything is operating as expected. I will re-open it if throttling does not alleviate the rise in memory consumption.

CyrisXD (Author) commented Aug 18, 2019

Hi @schmidt-sebastian, the function is rate limited. I've now got it at 450/s; from what I've read, Firestore supports up to 500/s.

My code runs as follows.

Send 450 requests.
Wait 1 second.
Send next 450 requests.
Wait 1 second.
And so forth...
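The loop above can be sketched as follows. This is an illustration, not the original code: `writeRow` is a stand-in for the Firestore `.add()` call from the earlier snippet, and the names are mine. Awaiting each group before sleeping gives the client backpressure, so unsent writes cannot pile up in memory.

```javascript
// Split the CSV rows into groups of at most `size` items.
function chunk(rows, size) {
  const groups = [];
  for (let i = 0; i < rows.length; i += size) {
    groups.push(rows.slice(i, i + size));
  }
  return groups;
}

const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// Send `perSecond` writes, wait for them all to complete, sleep one
// second, then send the next group.
async function uploadThrottled(rows, writeRow, perSecond = 450) {
  for (const group of chunk(rows, perSecond)) {
    // Waiting for the whole group to reach the backend before continuing
    // is what keeps memory usage near constant.
    await Promise.all(group.map(writeRow));
    await sleep(1000);
  }
}
```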

Testing locally, I can see RAM usage increasing with each upload as well; some of it does get garbage collected, but usage still grows on each upload.

[screenshot: local RAM usage growing across uploads]

hiranya911 (Contributor) commented

Does using batch writes help make things a little better?
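A hedged sketch of what the batch-write suggestion could look like here, assuming the same `db` handle, document IDs, and row layout as the original snippet (`uploadInBatches` and `BATCH_SIZE` are illustrative names; the 500-operation cap per WriteBatch is a documented Firestore limit):

```javascript
const BATCH_SIZE = 500; // Firestore allows at most 500 operations per batch.

async function uploadInBatches(db, rows, docId) {
  for (let i = 0; i < rows.length; i += BATCH_SIZE) {
    const batch = db.batch();
    for (const row of rows.slice(i, i + BATCH_SIZE)) {
      // Auto-generated document ID, as .add() would produce.
      const ref = db
        .collection("games").doc(docId)
        .collection("players").doc();
      batch.set(ref, {
        Number: row[0],
        FirstName: row[1],
        Lastname: row[2],
        Email: row[3],
        Prize: row[4],
        gameID: docId,
        guid: row[5],
      });
    }
    // Awaiting the commit means at most one batch's worth of unsent
    // writes is held in memory at a time.
    await batch.commit();
  }
}
```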

schmidt-sebastian (Contributor) commented

It looks like preparing the uploads alone takes up almost all of the CPU, so none of the continuations get a chance to run. If WriteBatches and further throttling don't help, you might have to spawn multiple Node processes.
