[1.7.01] Cursor errors on large queries #9944
I also just got this when running a migration on staging which does a find({}).forEach on ~2500 docs, and updates them inside it.
A little more searching makes me think this is just the cursor timeout setting on the mongo server. Our timeout happened after exactly 5 minutes. We have switched to Atlas for this project and also to 3.6, so maybe the default cursor timeout has changed to 5 minutes. Will try to find out.
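For anyone who wants to inspect the server-side setting, one way is the cursorTimeoutMillis server parameter from the mongo shell (this only covers the classic idle-cursor timeout, requires admin privileges, and managed services such as Atlas may not let you change it):

// Read the current idle-cursor timeout (in milliseconds).
db.adminCommand({ getParameter: 1, cursorTimeoutMillis: 1 });

// Raise it if needed, e.g. to 20 minutes.
db.adminCommand({ setParameter: 1, cursorTimeoutMillis: 1200000 });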
Yeah, it is definitely a cursor timeout, as mine also happens every 5 minutes. Is there some Meteor value that changed to 5 minutes to make these start to repro? Or was it just a mongo change that makes it fail in an unexpected way? My query is trying to retrieve about 6MB of data and doesn't time out every time, but once it starts timing out it seems to time out a lot more often. I would be fine with this behavior if it just logged an error and retried, since these are not essential queries for my app; the main problem is that it doesn't seem to handle the failure gracefully and leaks what appears to be the entire set of data it was querying.
I don't think anything in Meteor changed; it's more the mongo server and driver. If you want to hang on to cursors for longer than 5 minutes, you now need to pass the cursor flag noCursorTimeout: true. I'm not sure exactly how we do this from Meteor; I'm guessing the rawCollection is needed, and then you can set the flag on the raw cursor.
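Presumably something along these lines (a sketch only; MyCollection and the iteration are placeholders, and as later comments in this thread show, setting the flag did not fix the problem for everyone):

// Server-side: get the raw driver cursor and disable the idle timeout.
const rawCursor = MyCollection.rawCollection().find({});
rawCursor.addCursorFlag('noCursorTimeout', true);

// Iterate with the driver's async API instead of the Meteor cursor.
while (Promise.await(rawCursor.hasNext())) {
  const doc = Promise.await(rawCursor.next());
  // ... process doc ...
}
Promise.await(rawCursor.close());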
Well, I normally have cursors open for a lot longer than 5 minutes, and those are not really the ones that are the problem. The ones that are causing problems are the ones that are pulling in too much information.
Hi guys! I ran into a similar problem.

2018-06-29T00:22:41.084+0000 I COMMAND [conn58] command zzp.users command: find { find: "users", filter: {}, sort: { createdAt: -1 }, limit: 1, returnKey: false, showRecordId: false, lsid: { id: UUID("d4e2768d-e5d1-489e-a516-fe241e9bae68") }, $db: "zzp" } planSummary: COLLSCAN keysExamined:0 docsExamined:50511 hasSortStage:1 cursorExhausted:1 numYields:395 nreturned:1 reslen:905 locks:{ Global: { acquireCount: { r: 792 } }, Database: { acquireCount: { r: 396 } }, Collection: { acquireCount: { r: 396 } } } protocol:op_query 146ms

There are more calls to that query in the background, and a lot of similar queries hitting the database from idle applications:

Exception while polling query {"collectionName":"users","selector":{},"options":{"transform":null}} { MongoError: cursor id 342334298148 not found
Running Meteor 1.7 and Mongo 3.6.2. All of the problems we're seeing are happening in our Meteor job collection jobs, but there's no reason to believe it is limited to that; that just happens to be where a lot of data processing happens. We're experiencing a lot of cursor-not-found errors; I've included the actual Mongo log output below. It looks like the process that cleans up the cursor runs every 5 minutes. Setting the feature compatibility of Mongo down to 3.4 from 3.6 (…) I'm testing this and reproducing it by running a find on a collection of more than 20,000 records with a constant sleep in the forEach loop. For example:
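A minimal sketch of that kind of reproduction (collection and field names are placeholders, not taken from the original report; Meteor._sleepForMs is Meteor's internal fiber-friendly sleep helper):

import { Meteor } from 'meteor/meteor';
import { Mongo } from 'meteor/mongo';

const Records = new Mongo.Collection('records');

Meteor.startup(function () {
  // With 20,000+ documents and a short pause per document, the server-side
  // cursor stays open well past the ~5 minute cleanup window and the next
  // getMore fails with "cursor id ... not found".
  Records.find({}).forEach(function (doc) {
    Meteor._sleepForMs(100); // simulate slow per-document processing
  });
});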
It might be related to https://jira.mongodb.org/browse/SERVER-34810
I have created a small Meteor application which demonstrates the issue: https://github.com/harleyholt/meteor-issue-9944-reproduction
+1, same problem! We tried to use:

const cursor = Hosts.rawCollection().find({}, { timeout: false });
cursor.addCursorFlag('noCursorTimeout', true);
while (Promise.await(cursor.hasNext())) {
  const host = Promise.await(cursor.next());
  processHost(host);
}
cursor.close();

...but no luck :( We get the same error. This is urgent and affects our production apps. Please share any ideas to fix or work around this.
I can confirm that occasionally this happens not only in big jobs but also on a regular find().fetch().
@artpolikarpov When you do this simple find().fetch() of a small number of docs, does the operation take more than 5 minutes? That seems to be the hard limit. We've redesigned our long-running uses of cursors to avoid it and haven't seen it appear again since.
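One way to redesign a long-running job along those lines is to fetch only the _ids up front and then re-fetch each document just before processing it, so no single cursor stays open for more than a moment (a rough sketch; Tasks and processTask are placeholders):

import { Mongo } from 'meteor/mongo';

const Tasks = new Mongo.Collection('tasks');

function processAllTasks(processTask) {
  // Short-lived query: pull only the _ids and let the cursor close right away.
  const ids = Tasks.find({}, { fields: { _id: 1 } }).map(function (doc) {
    return doc._id;
  });

  ids.forEach(function (_id) {
    // Each lookup is a fresh, fast query, so slow per-document work never
    // keeps a server-side cursor open long enough to be reaped.
    const task = Tasks.findOne(_id);
    if (task) {
      processTask(task);
    }
  });
}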
@lynchem, no.
First off, thanks for the excellent reproduction @harleyholt! As @koszta mentioned in
Our production Mongo is 3.6.6, and this error still happens with it.
@artpolikarpov Good to know. The reproduction from #9944 (comment) works every time with Mongo 4. Any chance you can update to Mongo 4? We'll run more tests with 3.6.6.
@hwillson While this issue is definitely caused by a change in MongoDB, I think my PR on batching oplog messages (when they are received in rapid succession) could also help alleviate these kinds of issues. It's quite simply not a good idea to be (live) subscribed to a collection when thousands of updates occur within a second. Leaving this here in the hope you guys will see the merit in those changes, as there have been no comments/reviews as of yet... (I understand these are difficult issues with a lot of architectural considerations, but some feedback would be greatly appreciated.)
I've opened #10107, and when we switched to Meteor 1.7.1-beta.22 the errors immediately stopped. No problems since then; our Atlas runs 3.6.6, but the only change we made in production was moving to 1.7.0.3.
Thanks, everybody, for the work tracking this down and solving it. @hwillson, is there any reason this can't be released as a patch? The issue seems pretty rough, and we're still working through some side effects from upgrading Meteor.
I'm facing exactly the same error as described here. Every fifth minute, some of my cursors that happen to still be open crash with the described message, and it is also reproducible for me using the package provided by @harleyholt. @hwillson, I'm using a MongoDB decoupled from Meteor, and I've tested this with both MongoDB versions. But sadly I can't prove it easily, because updating the package
@artpolikarpov our production version is also 3.6.7 and we're seeing it. Did you ever figure out how to resolve? |
I now dug even deeper: to me it does not look like a problem with the version of your MongoDB server, but with the MongoDB driver that is bundled with Meteor. I created a local copy of the package npm-mongo and started playing around, and I can confirm that this bug is solved after switching from v3.0.11 to v3.1.0. I didn't go deeper from there, because the mongodb package 3.1.0 also works against all other versions of MongoDB: https://docs.mongodb.com/ecosystem/drivers/driver-compatibility-reference/#node-js-driver-compatibility
If you are like me and can't wait for version 1.7.1, or don't want to work with a beta or release candidate, I advise you to do the same as I did:
Create a folder packages in your project.
Copy the package npm-mongo in the version used by your Meteor version (e.g. from https://github.com/meteor/meteor/tree/release-1.7.0.3/packages/npm-mongo) into this folder.
Change the version in package.js inside the call Package.describe to 3.0.99 and the one inside the call Npm.depends to 3.1.1.
I usually use the 99 to show that it's a customized version.
@hwillson If you need proof, try this with the repository provided earlier. I could see it working.
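A sketch of the relevant edits to the copied packages/npm-mongo/package.js (only the two version strings change; every other part of the file stays exactly as copied from the Meteor release):

// packages/npm-mongo/package.js (excerpt)
Package.describe({
  name: 'npm-mongo',
  version: '3.0.99' // bumped locally to mark this as a customized copy
  // ...remaining fields unchanged from the copied file
});

Npm.depends({
  mongodb: '3.1.1' // the driver version that no longer triggers the cursor errors
});

// The Package.onUse(...) section is left untouched.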
Excellent effort, Simon!
@etyp we rolled back to 1.6.1.3
I rolled back to 1.6.1.3 too.
I rolled back from Meteor 1.7.0.5 to 1.6.1.4, and npm-mongo was downgraded from 3.0.11 to 2.2.34.
On 1.8 this error has stopped happening. I am still getting a memory leak from somewhere, but I'm not sure if it is related to this, as I am not getting any logs any more, and the memory leak is a lot less severe.
Oh yeah, this is definitely resolved in 1.8 with the new mongo driver.
@alienw8 Late reply, but you showed a query that filters and sorts without an index; because of this, your query is slow too. I saw this in your result: planSummary: COLLSCAN. You can check https://docs.mongodb.com/manual/indexes/
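For the query shown earlier in the thread (users sorted by createdAt with limit 1), a hedged example of creating the missing index from Meteor server code (the index spec matches that sort; adjust it to your own query):

import { Meteor } from 'meteor/meteor';

Meteor.startup(function () {
  // createIndex is idempotent, so it is safe to run on every startup.
  Promise.await(
    Meteor.users.rawCollection().createIndex({ createdAt: -1 })
  );
});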
The answers above show this was solved, but I got the same error when manipulating large amounts of data. I have over 10,000 documents and .find().forEach() takes about 2 seconds per document, so the process takes a long time. Could this still happen in a new version?
I just migrated my website to 1.7.0.1 and I am getting a lot of crashes and failures, but only on production. It never reproed on my test endpoint or locally.
I have a bunch of different variations of these popping up. I seem to be able to fix most of them if I set a limit on the queries or lower their scope.
None of these happened on any earlier version, so I think it has to do with upgrading to the new mongo.
Before I fixed some of the easier-to-fix instances of this bug, I could see a pretty big memory leak that was going to crash my server every few hours:
I am now at a point where it is crashing only on queries I cannot really remove, some of which are built into the CollectionFS package, and one where I am displaying an "infinite" scroll list of items on my homepage; as soon as that gets too big it seems to hit this crash occasionally.
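For the "set a limit / lower the scope" mitigation mentioned above, a rough sketch of what a bounded publication can look like (Items and the field names are placeholders):

import { Meteor } from 'meteor/meteor';
import { Mongo } from 'meteor/mongo';

const Items = new Mongo.Collection('items');

Meteor.publish('homepageItems', function (requestedLimit) {
  // Cap the limit so an "infinite" scroll can never recreate the huge,
  // long-lived cursor that triggers the timeout.
  const limit = Math.min(requestedLimit || 50, 500);
  return Items.find({}, { sort: { createdAt: -1 }, limit: limit });
});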