Releases: bluelabsio/records-mover
BigQuery loading via GCS
Breaking changes:
- None
New features:
- BigQuery import from GCS buckets (#113)
- Allow Parquet records format to be specified on mvrec command line (#129)
- Allow environment-based configuration of GCS creds (#120)
- Do slow redshift unload via SELECT when bucket unload not available (#117)
- Load via INSERT on Redshift when scratch bucket not available (#114)
Bug fixes / reliability improvements:
- Prefer file extension to dialect compression defaults in targets (#111)
- Handle multiple fileobjs in DoMoveFromFileobjsSource (#107)
- Fix README.md code sample errors (#106)
- Also downcast constraints and statistics when downcasting field types (#103)
- Add dependency fix for sudden BigQuery test failure (#109)
- Allow Parquet to be used in import to BigQuery from records directories (#130)
- Handle 'operation timed out error' during long Redshift unloads (#128)
- Retry on more Google rate limit exceptions (#126)
- Better error message when S3 `_format_*` file doesn't exist (#121)
- Improve logging for large moves (#122)
- Update PyYAML dependency to match awscli (#102)
Other updates:
- Fix dependencies for Homebrew processing (#135)
- Rename some internally used methods (#124) (#116)
- Drop dead code (#123) (#115)
- Rename module for better Mypy support (#125)
- Also test Redshift without S3 scratch bucket (#118)
- Introduce component test suite (#108)
- Bump Python version for internal development (#100)
1.0 release 🎉🎉🎉
Add logo, API documentation
Breaking changes:
- None
New features:
- None
Bug fixes / reliability improvements:
- Fix broken "cli" extra - you can now run `pip3 install records-mover[cli]` and have a working `mvrec` (#58)
Other updates:
- None
Improved enterprise configuration
Breaking changes:
- The default `session_type` no longer uses LastPass for GCP/Google Sheets credentials. Set `session_type` to `"lpass"` for that mode - see the new configuration file setting in CONFIG.md to make that configuration permanent.
- Stricter hint validation is done internally, which may affect use of undocumented date/time formats; please file an issue if this affects a date/time hint value you rely on. (#66)
New features:
- Allow session type to be configured via config file, enabling a long-term default for which secret management system to use for credentials (#70)
- Allow IAM username suffix to be added to S3 scratch URL, enabling enterprise configurations using a common scratch bucket (#69)
- All hints can now be configured via CLI arguments on both input and output; notably, this allows a header row to be configured on output! (#66)
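The `session_type` and scratch-bucket settings above live in Records Mover's configuration file; CONFIG.md is the authoritative reference for the file location and schema. As a rough sketch, with key names assumed from the feature descriptions above rather than confirmed against CONFIG.md, such a file might look like:

```yaml
# Hypothetical records-mover config sketch - key names are assumptions;
# consult CONFIG.md for the real schema and file location.
session:
  session_type: lpass          # use LastPass for GCP/Google Sheets credentials
aws:
  s3_scratch_url: "s3://my-company-scratch/home/"  # shared enterprise scratch bucket
```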
Bug fixes / reliability improvements:
- Minor error message tweaks (#71)
Better hint sniffing, initial GCS bucket support
Breaking changes:
- If you are relying on CSV hint inference in production, see detail on #57 and test carefully to ensure the new engine works correctly for your files.
New features:
- Default db-facts and credential objects for databases, AWS and GCP can be passed directly into `Session()` (#68)
- Initial support for GCS buckets, allowing for copying files to and from them. This does not yet include bulk loading and unloading directly to/from these buckets. (#61)
- Records mover understands many more types of CSV files without being given explicit records format information. Hint inference now covers many additional types of hints. (#57)
- Support for a new user- or system-level configuration mechanism, initially used for S3 scratch buckets. (#47)
Bug fixes / reliability improvements:
- Better error message when processing zero byte CSV files. (#65)
- Address inconsistencies in records specs around CSV date/time hints (#63)
Other updates:
- Internal improvements doing validation on CSV hints (#55)
Breaking changes: retire context mover support, change db-facts config (breaking)
Breaking changes:
- Import latest db-facts, which renames an entry from `db_connect_method` to `exports_from` in db-facts config: bluelabsio/db-facts#23 (#60)
  - Please update your records-mover with `pip3 install --upgrade records-mover` in your virtualenvs to make sure your code matches the new config format once your config is updated.
- Remove context manager support in move() (Breaking change) (#52)
  - Please update your `move()` statements to remove any `with` statements around the sources and targets - see this example for a valid use.
New features:
- Provide simplified way to get directly to sources/targets/move() without Session (#54)
Bug fixes / reliability improvements:
- Improve tests and fix bugs related to unhandled_hints (#51)
Other updates:
- None
Bulk MySQL import, PostgreSQL export support
New features:
- MySQL bulk load (import) support
- PostgreSQL bulk unload (export) support
Bug fixes / reliability improvements:
- None
Other updates:
- Refactor db_driver load/unload functionality and document how to add bulk loading to a database driver
Breaking changes: Rearrange "extras", MySQL support, remove deprecated args, sqlalchemy-redshift workaround
Breaking changes:
- Separate out "extras" in `setup.py` to avoid installing unnecessary dependencies - #38
  - You will need to change how you run `pip install` to get all of the dependencies you need. Please see INSTALL.md for the new things to add and alter any `pip install records-mover` calls accordingly.
- Remove deprecated and unused parameters in DataFrame source factory method - #44
  - Please remove `schema_name`, `table_name` and `db_engine` parameters to `sources.dataframe()` calls.
New features:
- Initial MySQL support - including importing and exporting of small amounts of data using INSERT and SELECT for now - #31
Bug fixes / reliability improvements:
- Work around sqlalchemy-redshift/sqlalchemy-redshift#195
- Redact known secrets from logging - #36
- Error message improvement to help find error details in `stl_load_errors` - #45
- Fix deprecation warnings from sqlalchemy-redshift and google-cloud-bigquery #44
Other updates:
- Improvements to database-driver internal integration testing and documentation - #31
Postgres COPY FROM support, handle empty strings in Redshift, library logging setup
Breaking changes:
- None
New features:
- Faster loading into Postgres via the COPY FROM statement
Bug fixes / reliability improvements:
- Handle empty strings in Redshift loading correctly from Pandas dataframes
Other updates:
- Convenience method (`session.set_stream_logging()`) to set some reasonable logging defaults and get debugging information when using Records Mover as a library. Note that this was already set by default when using the `mvrec` CLI tool.
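The exact defaults that `session.set_stream_logging()` applies are internal to Records Mover, but conceptually it spares library users from wiring up stdlib logging by hand. A rough stdlib-only sketch of the equivalent setup (the format string and level here are illustrative assumptions, not Records Mover's actual defaults):

```python
import logging
import sys

# Approximate, hand-rolled equivalent of what a convenience like
# session.set_stream_logging() saves you: attaching a stream handler
# with a readable format to the library's logger namespace, so log
# output becomes visible when using the library outside the mvrec CLI.
logger = logging.getLogger('records_mover')
handler = logging.StreamHandler(sys.stderr)
handler.setFormatter(
    logging.Formatter('%(asctime)s - %(levelname)s - %(message)s'))
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info('records-mover debug output now visible')
```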
v0.3.1
Breaking changes:
- None
New features:
- None
Bug fixes / reliability improvements:
- Fix issues with pip install including setuputils.py
Other updates:
- None