When importing "bundle" formats like pg_dump or mysqldump, which include schema definitions and row data in a single file, we currently download and parse the file during import planning to extract the schema definitions, because we want to resolve or create all the tables we will import into before we create the IMPORT job.
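For illustration, here is a minimal sketch of that ordering (all names are hypothetical stand-ins, not the actual import code): the whole dump is fetched and parsed for DDL before any job exists, so all of that work happens with nothing for the user to monitor.

```go
package main

import "fmt"

// Hypothetical stand-ins for the real planning/job machinery; names are
// illustrative only.
type tableDef struct{ name string }

// downloadAndParseSchemas stands in for the planning-time work: fetch the
// whole dump and extract the CREATE TABLE statements from it.
func downloadAndParseSchemas(dumpURL string) []tableDef {
	// For a 300GB dump this is where the user-visible stall happens:
	// no job exists yet, so there is nothing to report progress on.
	return []tableDef{{name: "t"}}
}

func createImportJob(tables []tableDef) {
	fmt.Printf("job created for %d tables\n", len(tables))
}

func planImport(dumpURL string) {
	tables := downloadAndParseSchemas(dumpURL) // slow, invisible to the user
	createImportJob(tables)                    // only now does a job appear
}

func main() {
	planImport("nodelocal://1/dump.sql")
}
```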
However, when presented with something like a 300GB pg_dump file, this means the IMPORT statement spends a long time in planning before creating a job, and during those minutes (or hours?) the user has no indication of what is going on -- there is no job to inspect or report progress on, even though we're clearly doing bulk-y work.
At the very least, it'd be more user-friendly to move the fetching and parsing of schemas to a prepare step of the actual import job execution, rather than doing it in the planning phase. Ideally, though, we'd avoid that step and the double download-and-parse of the file entirely, and instead simply parse schema definitions as we go, in the same pass that processes the row data. Unfortunately, pg_dump emits the index definitions after the row data, while one of the big advantages of IMPORT is that we can generate all the KVs for a row, including index KVs, in one pass, so it isn't clear what the "right" way to do this is. We could optimistically assume there are no indexes and import in one pass, and then, if/when we see indexes, either queue normal index creation and/or a second pass that just generates index KVs for those indexes?
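One possible shape for that single-pass idea, sketched below under simplifying assumptions (line-oriented statement detection, hypothetical names, no real COPY-data handling): create tables as their DDL is seen, ingest rows as they arrive, and defer any CREATE INDEX statements encountered after the row data to a second phase, whether that is ordinary index creation or a dedicated index-KV pass.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Hypothetical single-pass importer: statements are handled in the order the
// dump emits them, and index DDL seen after row data is queued rather than
// blocking the data pass.
func importDump(r *bufio.Scanner) {
	var deferredIndexes []string

	for r.Scan() {
		stmt := strings.TrimSpace(r.Text())
		switch {
		case strings.HasPrefix(stmt, "CREATE TABLE"):
			// Create (or resolve) the table as soon as its definition appears.
			fmt.Println("creating table:", stmt)
		case strings.HasPrefix(stmt, "COPY"):
			// Ingest row data optimistically, generating only primary-index KVs,
			// since we don't yet know whether secondary indexes exist.
			fmt.Println("ingesting rows:", stmt)
		case strings.HasPrefix(stmt, "CREATE INDEX"):
			// pg_dump emits these after the data, so queue them for a second
			// phase: normal index creation or a pass that only generates index KVs.
			deferredIndexes = append(deferredIndexes, stmt)
		}
	}

	for _, idx := range deferredIndexes {
		fmt.Println("deferred index build:", idx)
	}
}

func main() {
	dump := `CREATE TABLE t (a INT PRIMARY KEY, b INT);
COPY t (a, b) FROM stdin;
CREATE INDEX b_idx ON t (b);`
	importDump(bufio.NewScanner(strings.NewReader(dump)))
}
```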