Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-add source_ prefix to archived source files #140

Merged
merged 1 commit into from
Jul 21, 2023
Merged

Conversation

mikeknep
Copy link
Contributor

In the PR moving source data from memory to disk, I introduced a bug in the creation of the source data archive (source_tables.tar.gz) that broke the restore-from-backup functionality.

Original backup-and-restore behavior

  • Write each source table from Pandas DataFrame to a local CSV named source_{table}.csv
  • Archive those files, upload the tar to the project as a project artifact
  • When restoring from backup file, pull down the archive, unpack, and read those files back into DataFrames

Bug from previous PR

Since the tables now exist on disk and we only store pointers to them in memory, we don't need to call dataframe.to_csv; instead we just add the existing local files to the archive. The problem is that the filenames of the archived files changed—they no longer had the source_ prefix but were instead just {table}.csv:
Screen Shot 2023-07-21 at 10 43 46 AM
Consequently, when restore pulls down and unpacks the archive, it looks for files that do not exist (e.g. it tries to find source_users.csv but the file in the archive is just users.csv).

The fix (this PR)

Add the source_ prefix to each table filename when adding it to the archive.

Copy link
Contributor

@gracecvking gracecvking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Easiest code change to review 👍

@mikeknep mikeknep merged commit 795ee98 into main Jul 21, 2023
@mikeknep mikeknep deleted the restore-fix branch July 21, 2023 19:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants