Releases: bluelabsio/records-mover

BigQuery loading via GCS

15 Nov 15:27

Breaking changes:

  • None

New features:

  • BigQuery import from GCS buckets (#113)
  • Allow Parquet records format to be specified on mvrec command line (#129)
  • Allow environment-based configuration of GCS creds (#120)
  • Fall back to a slower Redshift unload via SELECT when bucket unload is not available (#117)
  • Load via INSERT on Redshift when scratch bucket not available (#114)
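
The two Redshift items above describe a fallback: the bulk path through a bucket is used when a scratch bucket is available, and slower row-based SQL is used otherwise. A minimal sketch of that decision follows. The function name and the strategy labels are hypothetical and are not part of the records-mover API.

```python
from typing import Optional


def choose_redshift_strategy(direction: str, scratch_s3_url: Optional[str]) -> str:
    """Return an illustrative strategy label for a Redshift load or unload.

    Hypothetical helper: it only mirrors the fallback described in the
    release notes, not the library's actual implementation.
    """
    if direction not in ("load", "unload"):
        raise ValueError(f"unknown direction: {direction!r}")
    if scratch_s3_url:
        # A scratch bucket is configured, so the bulk path is possible.
        return "bulk COPY from S3" if direction == "load" else "bulk UNLOAD to S3"
    # No scratch bucket: fall back to the slower row-based statements.
    return "INSERT" if direction == "load" else "SELECT"
```

With no scratch bucket configured, a load maps to INSERT and an unload maps to SELECT, matching #114 and #117.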

Bug fixes / reliability improvements:

  • Prefer file extension to dialect compression defaults in targets (#111)
  • Handle multiple fileobjs in DoMoveFromFileobjsSource (#107)
  • Fix README.md code sample errors (#106)
  • Also downcast constraints and statistics when downcasting field types (#103)
  • Add dependency fix for sudden BigQuery test failure (#109)
  • Allow Parquet to be used in import to BigQuery from records directories (#130)
  • Handle 'operation timed out error' during long Redshift unloads (#128)
  • Retry on more Google rate limit exceptions (#126)
  • Better error message when S3 _format_* file doesn't exist. (#121)
  • Improve logging for large moves (#122)
  • Update PyYAML dependency to match awscli (#102)
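
As an illustration of the first fix in this list (#111), the sketch below lets a recognized file extension take precedence over a dialect's default compression. The helper name, the extension table, and the rule that an unrecognized extension falls back to the default are all assumptions made for this example; the library's real behavior may differ.

```python
import os
from typing import Optional

# Illustrative mapping only; these extensions and values are assumptions.
EXTENSION_TO_COMPRESSION = {".gz": "GZIP", ".bz2": "BZIP", ".lzo": "LZO"}


def pick_compression(filename: str, dialect_default: Optional[str]) -> Optional[str]:
    """Prefer the compression implied by the file extension, if recognized."""
    _, ext = os.path.splitext(filename.lower())
    if ext in EXTENSION_TO_COMPRESSION:
        # The extension wins over whatever the dialect would default to.
        return EXTENSION_TO_COMPRESSION[ext]
    # Unrecognized extension: keep the dialect default (assumed rule).
    return dialect_default
```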

Other updates:

  • Fix dependencies for Homebrew processing (#135)
  • Rename some internally used methods (#124) (#116)
  • Drop dead code (#123) (#115)
  • Rename module for better Mypy support (#125)
  • Also test Redshift without S3 scratch bucket (#118)
  • Introduce component test suite (#108)
  • Bump Python version for internal development (#100)

1.0 release 🎉🎉🎉

27 Aug 20:43

Breaking changes:

  • None

New features:

  • None

Bug fixes / reliability improvements:

  • Adjust to breaking change in Pandas 1.1 (#101)

Other updates:

  • Add usage example GIF (#99)

Add logo, API documentation

13 Jul 14:47

Breaking changes:

  • None

New features:

  • None

Bug fixes / reliability improvements:

  • Fix broken "cli" extra - you can now run pip3 install records-mover[cli] and get a working mvrec (#58)

Other updates:

Improved enterprise configuration

03 Jun 21:25

Breaking changes:

  • The default session_type no longer uses LastPass for GCP/Google Sheets credentials. Set session_type to "lpass" for that mode - see the new configuration file setting in CONFIG.md to make that configuration permanent.
  • Stricter hint validation is done internally, which may affect use of undocumented date/time formats--please file an issue if this affects a date/time hint value you rely on. (#66)

New features:

  • Allow session type to be configured via config file, enabling a long-term default for which secret management system to use for credentials (#70)
  • Allow IAM username suffix to be added to S3 scratch URL, enabling enterprise configurations using a common scratch bucket (#69)
  • All hints can now be configured via CLI arguments on both input and output--notably, this allows a header row to be configured on output! (#66)
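
To picture the IAM username suffix feature (#69), the sketch below derives a per-user scratch location from a shared base URL. The function name and the exact URL layout are assumptions for illustration; CONFIG.md describes the real setting.

```python
def scratch_url_for_user(base_url: str, iam_username: str) -> str:
    """Append a username to a shared S3 scratch URL (hypothetical helper)."""
    if not base_url.startswith("s3://"):
        raise ValueError(f"expected an s3:// URL, got {base_url!r}")
    if not iam_username:
        raise ValueError("iam_username must not be empty")
    # Normalize the trailing slash so the result has exactly one separator.
    return base_url.rstrip("/") + "/" + iam_username + "/"
```

Under these assumptions, users sharing one scratch bucket each write under their own prefix, which lets bucket policies be scoped per user.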

Bug fixes / reliability improvements:

  • Minor error message tweaks (#71)

Better hint sniffing, initial GCS bucket support

01 Jun 16:14

Breaking changes:

  • If you rely on CSV hint inference in production, see the details in #57 and test carefully to ensure the new engine works correctly for your files.

New features:

  • Default db-facts and credential objects for databases, AWS and GCP can be passed directly into Session() (#68)
  • Initial support for GCS buckets, allowing for copying files to and from them. This does not yet include bulk loading and unloading directly to/from these buckets. (#61)
  • Records mover understands many more types of CSV files without being given explicit records format information. Hint inference now covers many additional types of hints. (#57)
  • Support for a new user- or system-level configuration mechanism, initially used for S3 scratch buckets. (#47)
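
For readers unfamiliar with hint inference (#57), the general idea can be approximated with Python's standard-library csv.Sniffer, which guesses a delimiter and whether a header row is present from a sample of the file. This is an analogy only, not the records-mover inference engine. The dictionary keys are borrowed from records hint naming as an assumption.

```python
import csv
from typing import Dict, Union


def sniff_basic_hints(sample: str) -> Dict[str, Union[str, bool]]:
    """Guess two CSV properties from a text sample using the stdlib sniffer."""
    sniffer = csv.Sniffer()
    # Restrict the candidate delimiters to a few common ones.
    dialect = sniffer.sniff(sample, delimiters=",|\t;")
    return {
        "field-delimiter": dialect.delimiter,
        "header-row": sniffer.has_header(sample),
    }


hints = sniff_basic_hints("id|score\n1|3.5\n2|4.0\n")
```

Heuristics like these can guess wrong on unusual files, which is why the breaking-change note above recommends testing inference against your own data.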

Bug fixes / reliability improvements:

  • Better error message when processing zero byte CSV files. (#65)
  • Address inconsistencies in records specs around CSV date/time hints (#63)

Other updates:

  • Internal improvements to validation of CSV hints (#55)

Breaking changes: retire context manager support, change db-facts config

12 May 16:48

Breaking changes:

  • Import latest db-facts, which renames an entry from db_connect_method to exports_from in db-facts config: bluelabsio/db-facts#23 (#60)
    • Please update your records-mover with pip3 install --upgrade records-mover in your virtualenvs to make sure your code matches the new config format once your config is updated.
  • Remove context manager support in move() (Breaking change) (#52)
    • Please update your move() statements to remove any with statements around the sources and targets - see this example for a valid use.

New features:

  • Provide simplified way to get directly to sources/targets/move() without Session (#54)

Bug fixes / reliability improvements:

  • Improve tests and fix bugs related to unhandled_hints (#51)

Other updates:

  • Move files around to clean up root directory in GitHub (#62)
  • Import records and records schema specs to docs/ directory (#59)

Bulk MySQL import, PostgreSQL export support

05 May 20:38

New features:

  • MySQL bulk load (import) support
  • PostgreSQL bulk unload (export) support

Bug fixes / reliability improvements:

  • None

Other updates:

  • Refactor db_driver load/unload functionality and document how to add bulk loading to a database driver

Breaking changes: Rearrange "extras", MySQL support, remove deprecated args, sqlalchemy-redshift workaround

22 Apr 00:21

Breaking changes:

  • Separate out "extras" in setup.py to avoid installing unnecessary dependencies - #38
    • You will need to change how you run pip install to get all of the dependencies you need. Please see INSTALL.md for the new things to add and alter any pip install records-mover calls accordingly.
  • Remove deprecated and unused parameters in DataFrame source factory method - #44
    • Please remove schema_name, table_name and db_engine parameters to sources.dataframe() calls.

New features:

  • Initial MySQL support - including importing and exporting of small amounts of data using INSERT and SELECT for now - #31

Bug fixes / reliability improvements:

Other updates:

  • Improvements to database-driver internal integration testing and documentation - #31

Postgres COPY FROM support, handle empty strings in Redshift, library logging setup

20 Mar 15:35

Breaking changes:

  • None

New features:

  • Faster loading into Postgres via the COPY FROM statement

Bug fixes / reliability improvements:

  • Handle empty strings in Redshift loading correctly from Pandas dataframes

Other updates:

  • Convenience method (session.set_stream_logging()) to set some reasonable logging defaults and get debugging information when using Records Mover as a library. Note that this was already set by default when using the mvrec CLI tool.
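
For comparison, a hand-rolled setup with the standard logging module might look like the sketch below. It is not the implementation of session.set_stream_logging(). The function name, logger name, level, and format string are assumptions chosen for illustration.

```python
import logging
import sys


def set_stream_logging_sketch(name: str = "records_mover",
                              level: int = logging.INFO) -> logging.Logger:
    """Attach a stdout handler to a named logger (illustrative only)."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    # Avoid stacking duplicate handlers if this is called more than once.
    if not any(isinstance(h, logging.StreamHandler) for h in logger.handlers):
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(logging.Formatter("%(asctime)s - %(message)s"))
        logger.addHandler(handler)
    return logger


demo_logger = set_stream_logging_sketch("demo_records_logger")
```

When using the library, calling session.set_stream_logging() as described above avoids maintaining this kind of boilerplate yourself.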

v0.3.1

08 Mar 20:10
baa5ab5

Breaking changes:

  • None

New features:

  • None

Bug fixes / reliability improvements:

  • Fix issues with pip install including setuputils.py

Other updates:

  • None