Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new config mechanism and use for S3 scratch bucket #47

Merged
merged 29 commits into from
May 14, 2020

Conversation

vinceatbluelabs
Copy link
Contributor

@vinceatbluelabs vinceatbluelabs commented Apr 22, 2020

This is more generic configuration mechanism for records-mover that can be set system-, user-, or session-wide as needed. It'd replace the current /usr/local/bin/scratch-s3-url mechanism.

@@ -41,6 +42,33 @@ def _infer_scratch_s3_url(session_type: str) -> Optional[str]:
if "SCRATCH_S3_URL" in os.environ:
return os.environ["SCRATCH_S3_URL"]

# TODO: do I want this in 'app.ini'?
# TODO: where else to document this?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uncompleted punchlist item detected--consider resolving or moving this to your issue tracker

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm kind of thinking we create a CONFIG.md describing (for now) how to configure the S3 bucket, including the content below, and point to it on github.com in the error message when we raise a NoTemporaryBucketConfiguration or CredsDoNotSupportS3Export. Seem reasonable?

@vinceatbluelabs
Copy link
Contributor Author

Hey @cwegrzyn, could you take a look and provide some feedback on this so far?

I haven't yet looked into the db-facts aspect of this, but it seems to capture the essence of what we talked about for configuration.

@vinceatbluelabs vinceatbluelabs requested a review from cwegrzyn May 5, 2020 21:09
@vinceatbluelabs vinceatbluelabs changed the title Add new config mechanism and use for S3 scratch bucket WIP: Add new config mechanism and use for S3 scratch bucket May 5, 2020
@vinceatbluelabs vinceatbluelabs marked this pull request as ready for review May 5, 2020 21:09
Copy link
Contributor

@cwegrzyn cwegrzyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At a first glance, this seems pretty reasonable. It did get me to think about WHEN we'd expect to need to override the scratch URL, and, in a multi-environment setup like we have, it seems like something that is roughly associated with a specific database. To be slightly more specific, imagine our staff who might be working against an internal redshift or a client-hosted Redshift-- they probably need (or at least ought) to use different intermediate S3 buckets for each. What's the way that we'd expect to handle that?

Just spitballing, but two ideas for extensions that might solve this and other general:

  1. Let db_facts supply additional config for records mover in maybe a records_mover key in the JSON blob. Could use similar paths to what you have right now (e.g., records_mover.aws.s3_scratch_url)

  2. Alternatively, add a mechanism to the app.ini config so that settings can be overridden on a per-dbname basis, e.g., so you can create a [db.<dbname>.aws] section to set the scratch url for that particular db.

Haven't really thought much about the pros and cons of those (or other possible solutions yet)

@vinceatbluelabs
Copy link
Contributor Author

Totally! Given AWS takes credentials alongside the S3 bucket, (except maybe in GovCloud?), I could certainly see reasons for folks to want to have that level of configurability. One flag is that AWS credentials would often need to change along with that S3 bucket configuration - having a consistent way to address this would be great, too, instead of relying on folks handing records-mover AWS creds that match the database they pointed it to.

Of course, sometimes this problem is thorny on its own - e.g., with third party tools to provide temporary AWS credentials. This sounds a little more in the realm of what something like db-facts would provide with the ability to call out to scripts to fetch back creds and all.

docs/CONFIG.md Outdated Show resolved Hide resolved
docs/CONFIG.md Outdated Show resolved Hide resolved
@@ -1 +1 @@
968
969
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One extra line in setup.py for the new dependency

return s3_scratch_url
else:
logger.debug('No config ini file found')

if session_type == 'cli':
try:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There will be a second PR to retire this feature once we have internal uses switched over to the config file.

#
# https://github.com/exhuma/config_resolver/blob/master/doc/intro.rst#logging
#
# https://github.com/exhuma/config_resolver/issues/69
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a PR open - hopefully it gets accepted: exhuma/config_resolver#70

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be fine now. Let me know if there are any remaining issues.

@vinceatbluelabs vinceatbluelabs changed the title WIP: Add new config mechanism and use for S3 scratch bucket Add new config mechanism and use for S3 scratch bucket May 12, 2020
@vinceatbluelabs
Copy link
Contributor Author

OK! Updated and ready to merge.

@vinceatbluelabs vinceatbluelabs merged commit 6ec62d5 into master May 14, 2020
@vinceatbluelabs vinceatbluelabs deleted the config_files branch June 24, 2020 15:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants