
Possible memory leak? Memory increasing with every collection().add() #620

Closed
CyrisXD opened this issue Aug 15, 2019 · 10 comments
CyrisXD commented Aug 15, 2019

Describe your environment

  • Operating System version: Heroku
  • Firebase SDK version: 8.3.0
  • Firebase Product: Database

Describe the problem

I'm parsing a CSV with 130k lines, and as I iterate through each row I add it to the collection, rate limited to 450/s. However, with each CSV parse Firestore seems to use more and more memory, and the memory is never released. Removing the collection().add() code fixes the problem: parsing the CSV alone never goes above 80 MB.

[screenshot: memory usage climbing with each CSV upload]

Steps to reproduce:

I'd imagine you could run the relevant code through a loop; in my case it was 130k iterations, matching the CSV size. If there is a more efficient way of achieving this, please let me know.

Relevant Code:

db.collection("games")
  .doc(req.body.docid)
  .collection("players")
  .add({
    Number: row[0],
    FirstName: row[1],
    Lastname: row[2],
    Email: row[3],
    Prize: row[4],
    gameID: req.body.docid,
    guid: row[5]
  })
  .then(() => {
    progressCount++;
    updateProgress(req.body.docid);
  })
  .catch(err => {
    console.log(err.message);
  });
CyrisXD (Author) commented Aug 15, 2019

I can confirm that, 24 hours later, the memory is still allocated. Full code can be found here - PasteBin

Upon my 4th CSV upload, Firestore pushed the memory above Heroku limits and crashed the Dyno; once restarted, the Dyno's memory went back to ~50 MB.

schmidt-sebastian added a commit to googleapis/nodejs-firestore that referenced this issue Aug 16, 2019
This PR logs when we create or delete instances in the client pool, helping us debug firebase/firebase-admin-node#620
schmidt-sebastian (Contributor) commented
@CyrisXD Thanks for reporting this! There are a couple of things that could be going on here, but I can only think of one possible leak in the Firestore library itself rather than in its dependencies. To narrow that down, I am preparing a PR that adds logging. It would be great if you could try again once we have released a client that includes this PR (I'll follow up).

If this is an issue with our dependencies, then we can forward it to one of the client teams that work on our network stack.

CyrisXD (Author) commented Aug 16, 2019

Thanks @schmidt-sebastian, I'm happy to help out where I can. I'm working on a project that's supposed to hit production Monday, so I'm really hoping we can figure this one out. Let me know what to do once you're ready. Cheers

schmidt-sebastian (Contributor) commented

The logging has been added and released. Would you mind updating to 2.2.7 of @google-cloud/firestore and enabling logging (via admin.firestore.setLogFunction(console.log))? Note that this is just a first step in diagnosing your issue, and that we will likely not be able to resolve it by Monday.
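For clarity, the requested setup might look like the sketch below. This assumes a standard firebase-admin initialization, which is not shown in this thread; the only call taken from the comment above is setLogFunction.

```javascript
// Sketch only: enable Firestore's internal logging as requested above.
// Assumes firebase-admin (with @google-cloud/firestore@2.2.7) is installed
// and credentials are picked up from the environment.
const admin = require("firebase-admin");

admin.initializeApp();

// Route the client's internal log lines (including the instance-pool
// create/delete messages added by the PR) to stdout, so they show up
// in the Heroku logs.
admin.firestore.setLogFunction(console.log);
```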

CyrisXD (Author) commented Aug 17, 2019

Thanks @schmidt-sebastian, here are the logs: Pastebin.

I just did a small run, ~250 records. Let me know if you need anything else.
Cheers

schmidt-sebastian (Contributor) commented

From looking at the logs, everything seems to be operating more or less normally. It does seem, though, that some part of the system can't keep up with your workload, so I would recommend throttling your update rate. By the time you had issued your 250 write requests, none of them had been sent to the backend, which means that we cannot release any memory.

Furthermore, once the backend receives the writes, you might also be hitting write rate limits (depending on your data model; see https://firebase.google.com/docs/firestore/quotas#writes_and_transactions).

If you are able to throttle the rate at which you process documents, I suspect that your memory usage will remain near constant (once you reach a steady state).

Please do let us know if this issue persists.

schmidt-sebastian (Contributor) commented

I'm going to close this issue for now, as I believe everything is operating as expected. I will re-open it if throttling does not alleviate the rise in memory consumption.

CyrisXD (Author) commented Aug 18, 2019

Hi @schmidt-sebastian, the function is rate limited. I've now got it at 450/s; from what I've read, Firestore supports up to 500/s.

My code runs as follows.

Send 450 requests.
Wait 1 second.
Send next 450 requests.
Wait 1 second.
And so forth...
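The loop above can be sketched as follows. This is an illustration, not the original code: `writeRow` is a stand-in for the Firestore `.add()` call from the earlier snippet, and the names are mine. Awaiting each group before sleeping gives the client backpressure, so unsent writes cannot pile up in memory.

```javascript
// Split the CSV rows into groups of at most `size` items.
function chunk(rows, size) {
  const groups = [];
  for (let i = 0; i < rows.length; i += size) {
    groups.push(rows.slice(i, i + size));
  }
  return groups;
}

const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// Send `perSecond` writes, wait for them all to complete, sleep one
// second, then send the next group.
async function uploadThrottled(rows, writeRow, perSecond = 450) {
  for (const group of chunk(rows, perSecond)) {
    // Waiting for the whole group to reach the backend before continuing
    // is what keeps memory usage near constant.
    await Promise.all(group.map(writeRow));
    await sleep(1000);
  }
}
```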

Testing locally, I can see RAM usage increasing with each upload as well; some of it does get garbage collected, but usage still grows on each upload.

[screenshot: local RAM usage growing across uploads]

hiranya911 (Contributor) commented

Does using batch writes help make things a little better?
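A hedged sketch of what the batch-write suggestion could look like here, assuming the same `db` handle, document IDs, and row layout as the original snippet (`uploadInBatches` and `BATCH_SIZE` are illustrative names; the 500-operation cap per WriteBatch is a documented Firestore limit):

```javascript
const BATCH_SIZE = 500; // Firestore allows at most 500 operations per batch.

async function uploadInBatches(db, rows, docId) {
  for (let i = 0; i < rows.length; i += BATCH_SIZE) {
    const batch = db.batch();
    for (const row of rows.slice(i, i + BATCH_SIZE)) {
      // Auto-generated document ID, as .add() would produce.
      const ref = db
        .collection("games").doc(docId)
        .collection("players").doc();
      batch.set(ref, {
        Number: row[0],
        FirstName: row[1],
        Lastname: row[2],
        Email: row[3],
        Prize: row[4],
        gameID: docId,
        guid: row[5],
      });
    }
    // Awaiting the commit means at most one batch's worth of unsent
    // writes is held in memory at a time.
    await batch.commit();
  }
}
```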

schmidt-sebastian (Contributor) commented

It looks like preparing the uploads alone takes up almost all of the CPU, so none of the continuations get a chance to run. If WriteBatches and further throttling don't help, you might have to spawn multiple Node processes.
