This repository has been archived by the owner on Feb 14, 2023. It is now read-only.

The datastore operation timed out, or the data was temporarily unavailable #35

Open
teebu opened this issue Jul 12, 2020 · 14 comments

@teebu

teebu commented Jul 12, 2020

I'm seeing this error and I'm not sure why, but the process should retry at least a few times instead of exiting:

Error: Error 4: The datastore operation timed out, or the data was temporarily unavailable.
    at QueryWatch.onData (elasticstore\node_modules\@google-cloud\firestore\build\src\watch.js:350:34)
    at PassThrough.<anonymous> (elasticstore\node_modules\@google-cloud\firestore\build\src\watch.js:297:26)
    at PassThrough.emit (events.js:315:20)
    at addChunk (_stream_readable.js:302:12)
    at readableAddChunk (_stream_readable.js:278:9)
    at PassThrough.Readable.push (_stream_readable.js:217:10)
    at PassThrough.Transform.push (_stream_transform.js:152:32)
    at PassThrough.afterTransform (_stream_transform.js:96:10)
    at PassThrough._transform (_stream_passthrough.js:46:3)
    at PassThrough.Transform._read (_stream_transform.js:191:10)

I'm not too familiar with Firestore, but it seems to me that you are downloading the collection instead of streaming it. And if it's too big, it times out?

Is this the solution? https://cloud.google.com/nodejs/docs/reference/firestore/1.3.x/Query#stream

Here is the issue that they claim fixed this:
googleapis/nodejs-firestore#1040
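
For illustration, here's roughly what the stream API looks like (untested sketch; the users collection name is just a placeholder):

    // Untested sketch of Query#stream: documents arrive incrementally
    // instead of being buffered into a single response.
    const { Firestore } = require('@google-cloud/firestore');
    const db = new Firestore();

    db.collection('users')
      .stream()
      .on('data', (doc) => {
        // Each chunk is a QueryDocumentSnapshot.
        console.log(doc.id, doc.data());
      })
      .on('end', () => console.log('Initial read complete.'))
      .on('error', (err) => console.error('Stream failed:', err));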

@teebu teebu changed the title from "Should retry on errors." to "The datastore operation timed out, or the data was temporarily unavailable" on Jul 12, 2020
@acupofjose
Owner

@teebu, I haven't seen the stream API for firestore queries before, so that's a cool find. Unfortunately, it looks like it is just for one-off queries to firestore, not for listeners. Elasticstore requires maintaining listeners for added/removed/modified.

It looks like that has been updated upstream in nodejs-firestore, but I'm not sure it's made it into firebase-admin yet.

@teebu
Author

teebu commented Jul 12, 2020

So you're saying nodejs-firestore can combine a streamed query with an onSnapshot listener, handling both the initial streaming read and listening for subsequent events?

What about the solution provided in the ticket, googleapis/nodejs-firestore#1040 (comment)?

@acupofjose
Owner

No, it doesn't have that functionality - or at least, it doesn't right now. Nor, unfortunately, will the solution he proposes work for our purposes.

"I need to not only get the first snapshot, but also subsequent updates, which makes the implementation more complex than the example you have given."
googleapis/nodejs-firestore#1040 (comment)

Now, an alternative might be to stream the initial data fill and then use an onSnapshot for the data deltas only - but onSnapshot, as it is currently implemented, doesn't allow you to listen for changes alone.
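
For concreteness, the shape of that alternative (untested sketch; indexDocument and handleChange are hypothetical helpers, and the problem above still applies - the onSnapshot half replays every existing document as an 'added' change):

    const { Firestore } = require('@google-cloud/firestore');
    const db = new Firestore();
    const ref = db.collection('users'); // placeholder collection

    // 1. Backfill via the streaming API.
    ref.stream()
      .on('data', (doc) => indexDocument(doc)) // hypothetical helper
      .on('end', () => {
        // 2. Then attach a listener for ongoing changes. As noted above,
        // this replays every existing document as an 'added' change, so
        // the backfill is effectively read twice.
        ref.onSnapshot((snapshot) => {
          snapshot.docChanges().forEach((change) => {
            // change.type is 'added' | 'modified' | 'removed'
            handleChange(change); // hypothetical helper
          });
        });
      });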

That said, doing an onSnapshot on a large collection is supposed to be stable.

@teebu
Author

teebu commented Jul 13, 2020

Some ideas from https://stackoverflow.com/questions/33885059/how-to-only-get-new-data-without-existing-data-from-a-firebase

One proposal is to add a created_at field to the documents and use a where selector anchored to the current time, or something like child_added, as used here: https://github.com/FirebaseExtended/firechat/blob/master/src/js/firechat.js#L347
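
Roughly (untested sketch; assumes writers stamp every document with a created_at timestamp):

    const { Firestore, Timestamp, FieldValue } = require('@google-cloud/firestore');
    const db = new Firestore();

    // Writers would need something like:
    //   ref.add({ ...data, created_at: FieldValue.serverTimestamp() });

    // Only documents created after startup match the filter, so the
    // listener skips the existing backlog entirely.
    const startedAt = Timestamp.now();
    db.collection('users') // placeholder collection
      .where('created_at', '>', startedAt)
      .onSnapshot((snapshot) => {
        snapshot.docChanges().forEach((change) => {
          console.log(change.type, change.doc.id);
        });
      });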

@thelaughingman

Same problem here 😥
Any news on solutions/workarounds?

@teebu
Author

teebu commented Sep 10, 2020

I contacted Firebase support; their response:

The engineering team has suggested using limits and pagination to avoid this kind of error. In these documents you will find more details about it.
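
For reference, this is roughly what they seem to mean (untested sketch; page size and collection name are placeholders):

    const { Firestore } = require('@google-cloud/firestore');
    const db = new Firestore();

    // Page through the collection with limit() + startAfter() instead of
    // issuing one large read that can time out.
    async function readAll(pageSize = 1000) {
      const base = db.collection('users') // placeholder collection
        .orderBy('__name__')              // stable ordering by document ID
        .limit(pageSize);
      let last = null;

      for (;;) {
        const snap = await (last ? base.startAfter(last).get() : base.get());
        if (snap.empty) break;
        snap.docs.forEach((doc) => console.log(doc.id));
        last = snap.docs[snap.docs.length - 1];
      }
    }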

@acupofjose
Owner

acupofjose commented Sep 10, 2020

Well, given that from support - any suggestions on how to make this work?

We could restart the listener in the event that it fails - but that would pull all the initial documents again, which, in my mind, would mean a lot of unnecessary reads. The created_at/updated_at field query is what I've implemented in the past to limit the number of reads and achieve long-running listeners.
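
Something like this (untested sketch; loadCursor and saveCursor are hypothetical persistence helpers, handleChange is too, and it assumes every document carries an updated_at timestamp):

    const { Firestore, Timestamp } = require('@google-cloud/firestore');
    const db = new Firestore();

    async function listen() {
      // Resume from the last persisted cursor instead of re-reading the
      // whole collection on every restart.
      const since = (await loadCursor()) || Timestamp.fromMillis(0);
      db.collection('users') // placeholder collection
        .where('updated_at', '>', since)
        .onSnapshot(
          (snapshot) => {
            snapshot.docChanges().forEach((change) => handleChange(change));
            if (!snapshot.empty) {
              // With an inequality filter, results are ordered by
              // updated_at, so the last doc holds the newest cursor.
              saveCursor(snapshot.docs[snapshot.docs.length - 1].get('updated_at'));
            }
          },
          (err) => {
            console.error('Listener failed, restarting:', err);
            listen();
          }
        );
    }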

Frankly, I don't have a project where I can reproduce this error. My example project https://elasticstore.netlify.app has long-running listeners and has not had this problem.

Is there any way y'all can provide a project I can use to debug with?

(Also, thanks for keeping us in the loop @teebu!)

@teebu
Author

teebu commented Sep 10, 2020

Restarting the listener is a bad idea. I ran into an issue where I had this running on a Kubernetes pod, and it ran out of memory relatively quickly because of how much data it was pulling. It ended up restarting about 200 times in a couple of days and cost around $100 in Firebase fees. I don't have any good solutions. The idea would be to sync using the stream API with pagination, and in a separate process do the snapshot from now onwards.

@acupofjose
Owner

Ooof. Sorry to hear it. I'll keep mulling on it then. Thanks @teebu

@thelaughingman

Thanks for the feedback, guys.
My collection in Firestore is relatively small - around 36k records - but the documents have 500-600 fields each, with seven sub-collections.

Is it too big/heavy?

@acupofjose, if you wish, I can give you access to my Firestore database.

@timtim001hk

Any update on this problem?
I also encountered this error... In my case, it only occurs in collections with more than 100,000 records.

@timtim001hk

timtim001hk commented Aug 10, 2021

Can this be solved in the following way?

Firestore reactive pagination
Simple pagination for loading more content while still listening for updates
https://medium.com/@650egor/firestore-reactive-pagination-db3afb0bf42e
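
If I follow the article, the idea is a chain of bounded listeners, each watching one block of documents while still receiving updates for it. Roughly (untested sketch; block size, collection name, and handleChange are placeholders):

    const { Firestore } = require('@google-cloud/firestore');
    const db = new Firestore();
    const BLOCK_SIZE = 10000; // would probably need to be configurable

    function listenToBlock(startAfterDoc) {
      let query = db.collection('users') // placeholder collection
        .orderBy('__name__')
        .limit(BLOCK_SIZE);
      if (startAfterDoc) query = query.startAfter(startAfterDoc);

      let chained = false;
      query.onSnapshot((snapshot) => {
        snapshot.docChanges().forEach((change) => handleChange(change)); // hypothetical
        // Once this block fills up, chain a listener for the next block.
        if (!chained && snapshot.size === BLOCK_SIZE) {
          chained = true;
          listenToBlock(snapshot.docs[snapshot.docs.length - 1]);
        }
      });
    }

    listenToBlock(null);

One open question is how the blocks behave when documents are inserted or deleted at block boundaries, since each bounded listener's window can shift.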

@acupofjose
Owner

@timtim001hk - sorry for the delay, I've been out of the country.

That's an interesting solution! I think it would require counting the number of records in the listener so that you can maintain the cursors. There is also a question of how big the listener blocks should be - 10,000 records? 50,000?

Curious as to your thoughts on those!

@timtim001hk

timtim001hk commented Aug 10, 2021

I think 10,000 records is reasonable. It would be better if it were configurable.

Waiting for your good news!
