Initial public interface changes (#28)
* Move Records objects exports to records package
* Define public interface for db package
* Add SqlAlchemyDbHook for more Airflow support
* Remove utility methods not used by records-mover itself
* Bump feature version
* Add MAINT.md
* Ratchet mypy coverage
* Ratchet flake8
* Export constructors for records formats
* Move integration tests to new interface
* Ratchet mypy coverage
* Explicitly offer Airflow hooks as public interface
* Explicitly offer Airflow hooks as public interface
* Add a test of public interface from internal uses
1 parent 6bb94ac, commit 972bc10
Showing 32 changed files with 230 additions and 342 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,96 @@ | ||
# Maintenance | ||
|
||
Packages inside include: | ||
|
||
* [records](./records_mover/records/), which is the core API you | ||
can use to move relational data from one place to another. | ||
* [url](./records_mover/url/), which offers some abstractions | ||
across different filesystem-like things (e.g., S3/HTTP/local | ||
filesystems, maybe SFTP in the future) | ||
* [db](./records_mover/db/), which adds some functionality on top of | ||
SQLAlchemy for various different database types. | ||
* [creds](./records_mover/creds/), which manages credentials and | ||
other connection details. | ||
* [pandas](./records_mover/pandas/), which adds functionality on top | ||
of the Pandas data science framework. | ||
* [airflow](./records_mover/airflow/), which helps interface parts | ||
of this library to DAGS running under Airflow. | ||
* [utils](./records_mover/utils/), which is the usual junk drawer of | ||
things that haven't grown enough mass to be exported into their own | ||
package. | ||
|
||
Things labeled private with a prefix of `_` aren't stable | ||
interfaces - they can change rapidly. | ||
|
||
If you need access to another function/class, please submit an issue | ||
or a PR to make it public. That PR is a good opportunity to talk about | ||
what changes we want to make to the public interface before we make | ||
one--it's a lot harder to change later! | ||
|
||
## Development | ||
|
||
### Installing development tools | ||
|
||
```bash | ||
./deps.sh # uses pyenv and pyenv-virtualenv | ||
``` | ||
|
||
### Unit testing | ||
|
||
To run the tests in your local pyenv: | ||
|
||
```bash | ||
make test | ||
``` | ||
|
||
### Automated integration testing | ||
|
||
All of our integration tests use the `itest` script, which can be provided | ||
with the `--docker` flag to run inside docker. | ||
|
||
To see details on the tests available, run: | ||
|
||
```sh | ||
./itest --help | ||
``` | ||
|
||
To run all of the test suite locally (takes about 30 minutes): | ||
|
||
```sh | ||
./itest all | ||
``` | ||
|
||
To run the same suite with mover itself in a Docker image: | ||
|
||
```sh | ||
./itest --docker all | ||
``` | ||
|
||
### Common issues with integration tests | ||
|
||
```vertica | ||
(vertica_python.errors.InsufficientResources) Severity: b'ERROR', Message: b'Insufficient resources to execute plan on pool general [Request Too Large:Memory(KB) Exceeded: Requested = 5254281, Free = 1369370 (Limit = 1377562, Used = 8192)]', Sqlstate: b'53000', Routine: b'Exec_compilePlan', File: b'/scratch_a/release/svrtar2409/vbuild/vertica/Dist/Dist.cpp', Line: b'1540', Error Code: b'3587', SQL: " SELECT S3EXPORT( * USING PARAMETERS url='s3://vince-scratch/PA6ViIBMMWk/records.csv', chunksize=5368709120, to_charset='UTF8', delimiter='\x01', record_terminator='\x02') OVER(PARTITION BEST) FROM public.test_table1 " | ||
``` | ||
|
||
Try expanding your Docker for Mac memory size to 8G. Vertica is | ||
memory intensive, even under Docker. | ||
|
||
### Manual integration testing | ||
|
||
There's also a manual records schema JSON functionality | ||
[torture test](tests/integration/table2table/TORTURE.md) available to run - | ||
this may be handy after making large-scale refactors of the records | ||
schema JSON code or when adding load/unload support to a new database | ||
type. | ||
|
||
### Semantic versioning | ||
|
||
In this house, we use [semantic versioning](http://semver.org) to indicate | ||
when we make breaking changes to interfaces. If you don't want to live | ||
dangerously, and you are currently using version a.x.y (see setup.py to see | ||
what version we're at) specify your requirement like this in requirements.txt: | ||
|
||
records_mover>=a.x.y,<b | ||
|
||
This will make sure you don't get automatically updated into the next | ||
breaking change. |
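
The range `records_mover>=a.x.y,<b` above reads as "at least the version you tested with, but below the next major version". A minimal sketch of that check follows, assuming plain numeric `major.minor.patch` version strings. The helper names `parse_version` and `satisfies_pin` are hypothetical and are not part of records-mover or pip.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '1.4.2' into (1, 4, 2). Assumes purely numeric components."""
    return tuple(int(part) for part in version.split("."))


def satisfies_pin(candidate: str, tested: str) -> bool:
    """True if candidate >= tested and candidate is below the next major version."""
    candidate_v = parse_version(candidate)
    tested_v = parse_version(tested)
    # The exclusive upper bound is the next major version, e.g. (2,) for 1.x.y.
    next_major = (tested_v[0] + 1,)
    return tested_v <= candidate_v < next_major
```

For example, with a tested version of `1.4.2`, a later `1.5.0` release would be accepted while `2.0.0` would be rejected, which is the behaviour the requirement line is meant to give you.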
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1 @@ | ||
93.7300 | ||
93.6400 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1 @@ | ||
214 | ||
208 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1 @@ | ||
89.0100 | ||
89.1300 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
__all__ = ["hooks"] | ||
|
||
from . import hooks |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
__all__ = [ | ||
"RecordsHook", | ||
"SqlAlchemyDbHook", | ||
] | ||
|
||
from .sqlalchemy_db_hook import SqlAlchemyDbHook | ||
from .records_hook import RecordsHook |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
import sqlalchemy as sa | ||
from records_mover.db import create_sqlalchemy_url | ||
from airflow.hooks import BaseHook | ||
|
||
|
||
class SqlAlchemyDbHook(BaseHook): | ||
def __init__(self, db_conn_id): | ||
self.db_conn_id = db_conn_id | ||
|
||
def get_conn(self): | ||
conn = BaseHook.get_connection(self.db_conn_id) | ||
db_url = create_sqlalchemy_url( | ||
{ | ||
'host': conn.host, | ||
'port': str(conn.port), | ||
'database': conn.schema, | ||
'user': conn.login, | ||
'password': conn.password, | ||
'type': conn.extra_dejson.get('type', conn.conn_type.lower()), | ||
} | ||
) | ||
return sa.create_engine(db_url) |
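
To see how `get_conn` maps Airflow connection fields onto the dictionary passed to `create_sqlalchemy_url`, here is a minimal sketch that reproduces only the dictionary-building step. It does not need Airflow or records-mover installed: `fake_connection` is a hypothetical stand-in built with `types.SimpleNamespace`, and `connection_to_db_facts` is an illustrative helper name that is not part of this commit.

```python
from types import SimpleNamespace
from typing import Any, Dict


def connection_to_db_facts(conn: Any) -> Dict[str, Any]:
    """Mirror the dict built in SqlAlchemyDbHook.get_conn (illustrative only)."""
    return {
        'host': conn.host,
        'port': str(conn.port),
        'database': conn.schema,
        'user': conn.login,
        'password': conn.password,
        # An explicit 'type' in the connection's extra JSON wins;
        # otherwise fall back to the lowercased Airflow conn_type.
        'type': conn.extra_dejson.get('type', conn.conn_type.lower()),
    }


# Hypothetical stand-in for an Airflow Connection object.
fake_connection = SimpleNamespace(
    host='db.example.com',
    port=5433,
    schema='analytics',
    login='loader',
    password='not-a-real-password',
    conn_type='Vertica',
    extra_dejson={},
)

db_facts = connection_to_db_facts(fake_connection)
```

With empty extras, the `type` entry falls back to the lowercased `conn_type`; supplying `{'type': ...}` in the extras overrides it. Note that `str(conn.port)` produces the string `'None'` when a connection has no port set. Whether `create_sqlalchemy_url` tolerates that value is not shown in this diff.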
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,9 @@ | ||
__all__ = [ | ||
'DBDriver', | ||
'LoadError', | ||
'create_sqlalchemy_url', | ||
] | ||
|
||
from .driver import DBDriver # noqa | ||
from .factory import db_driver # noqa | ||
from .errors import LoadError # noqa | ||
from .connect import create_sqlalchemy_url |
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,19 @@ | ||
__all__ = [ | ||
'RecordsHints', | ||
'BootstrappingRecordsHints', | ||
'RecordsFormatType', | ||
'RecordsSchema', | ||
'RecordsFormat', | ||
'DelimitedRecordsFormat', | ||
'ParquetRecordsFormat', | ||
'ProcessingInstructions', | ||
'ExistingTableHandling', | ||
'Records', | ||
] | ||
|
||
from .types import RecordsHints, BootstrappingRecordsHints, RecordsFormatType # noqa | ||
from .schema import RecordsSchema # noqa | ||
from .records_format import RecordsFormat # noqa | ||
from .records_format import RecordsFormat, DelimitedRecordsFormat, ParquetRecordsFormat # noqa | ||
from .processing_instructions import ProcessingInstructions # noqa | ||
from .existing_table_handling import ExistingTableHandling # noqa | ||
from .records import Records # noqa |
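
The commit message mentions a test of the public interface. The actual test is not shown in this diff, but one way such a check could look is to verify that every name listed in a module's `__all__` is really defined on it. This is a sketch only: `undefined_exports` and the in-memory `demo` module are hypothetical, built here so the example runs without records-mover installed.

```python
import types
from typing import List


def undefined_exports(module: types.ModuleType) -> List[str]:
    """Return names listed in module.__all__ that the module does not define."""
    exported = getattr(module, '__all__', [])
    return [name for name in exported if not hasattr(module, name)]


# Hypothetical in-memory module standing in for a package __init__.
demo = types.ModuleType('demo')
setattr(demo, 'Records', object)
setattr(demo, 'RecordsSchema', object)
# 'RecordsFormat' is listed but deliberately left undefined.
setattr(demo, '__all__', ['Records', 'RecordsSchema', 'RecordsFormat'])

missing = undefined_exports(demo)
```

Against a real package, the same helper could be called on the imported module (for instance the `records` package above) and would be expected to return an empty list when `__all__` and the imports agree.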