
bigquery-firestore-export - export large amount of data #427

Open
sjerzak opened this issue Mar 25, 2024 · 1 comment

Comments

@sjerzak

sjerzak commented Mar 25, 2024

Hi,

I have a few questions about the bigquery-firestore-export extension.

I tried to use the bigquery-firestore-export extension with a large amount of data (200M+ rows).
IIRC, Firestore has a write limit for batch operations and doesn't allow more than 1,000 writes/second.
The Cloud Function that exports data from the temp table created by the extension can run for at most 9 minutes = 540 seconds.

That gives us around 540,000 writes. Is there any way to insert more data?
I also tried configuring the extension to use multiple function instances, but it didn't change anything.
Even when I forced "Minimum function instances" higher, the amount of data written to Firestore didn't change compared to runs where "Minimum function instances" wasn't set.
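As a back-of-envelope sketch of the ceiling those numbers imply: 1,000 writes/sec over a 540-second execution caps out at 540,000 writes, and each Firestore `WriteBatch` is further limited to 500 writes. The snippet below only does the arithmetic and batching; the figures are the ones quoted above (the 1,000 writes/sec rate is an assumption, not a documented per-database guarantee).

```python
# Back-of-envelope ceiling for a single function execution, assuming the
# figures quoted above: ~1,000 writes/sec sustained over a 540 s timeout.
MAX_RUNTIME_S = 540      # Cloud Functions max timeout: 9 minutes
WRITES_PER_SEC = 1_000   # assumed sustained Firestore write throughput
BATCH_LIMIT = 500        # Firestore's maximum writes per WriteBatch

def max_writes_per_run(runtime_s=MAX_RUNTIME_S, rate=WRITES_PER_SEC):
    """Upper bound on documents one execution can write."""
    return runtime_s * rate

def chunk_rows(rows, size=BATCH_LIMIT):
    """Split rows into chunks no larger than Firestore's batch limit."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]

print(max_writes_per_run())                  # 540000
print(len(chunk_rows(list(range(1_200)))))   # 3 batches: 500 + 500 + 200
```

At 540k writes per run, 200M rows would need roughly 370 executions even at the theoretical maximum, which is why a single-execution design hits a wall here.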

Also, when "Minimum function instances" wasn't set and "Maximum function instances" was set to 10, the extension run didn't scale automatically and used only two instances.

Can I ask how setting "Min/Max function instances" affects function execution?

@huangjeff5
Collaborator

Hey there,
Can you share more about your use case? (What exactly is the data you're exporting to Firestore, and why?) That would help me understand whether there's a better way to solve your problem!

To answer some of your questions though:

First, the Min/Max function instances settings don't impact execution in this case because, as you noticed, the export ultimately runs in a single function execution.

Second, I think there might be ways we can better support a big data import into Firestore with Dataflow (something we're exploring with another extension we're working on). That may be worth implementing eventually, if it makes sense for this use case.
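In the meantime, one possible workaround sketch (not something the extension does today): partition the temp table into row ranges and fan each range out to its own function invocation, e.g. via Pub/Sub messages, so "Maximum function instances" can actually scale. The helper below only computes the shards; the Pub/Sub publish and the per-shard BigQuery query are omitted, and all names and the shard size are illustrative assumptions.

```python
def shard_ranges(total_rows, shard_size=500_000):
    """Split [0, total_rows) into (offset, limit) ranges, one per invocation.

    A shard of 500k rows stays under the ~540k writes a single 9-minute
    execution can sustain at ~1,000 writes/sec, so one invocation can
    plausibly finish one shard.
    """
    return [
        (offset, min(shard_size, total_rows - offset))
        for offset in range(0, total_rows, shard_size)
    ]

# 200M rows -> 400 shards; each (offset, limit) pair could be published as
# a Pub/Sub message and handled by a separate function invocation.
shards = shard_ranges(200_000_000)
print(len(shards))   # 400
print(shards[0])     # (0, 500000)
print(shards[-1])    # (199500000, 500000)
```

The key design point is that parallelism has to be created *before* the function runs (one message per shard), since instance settings alone can't split a single execution's work.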
