Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for new DuckDB Secrets manager #403

Merged
merged 31 commits into from
Jul 22, 2024
Merged

Add support for new DuckDB Secrets manager #403

merged 31 commits into from
Jul 22, 2024

Conversation

guenp
Copy link
Collaborator

@guenp guenp commented Jun 14, 2024

This adds support for the new DuckDB secrets manager.
Example usage:

default:
  outputs:
    dev:
      type: duckdb
      path: /tmp/dbt.duckdb
      secrets:
        - type: s3
          key_id: abc
          secret: xyz
          region: us-west-2
  target: dev

@guenp guenp requested a review from jwills June 14, 2024 22:35
@guenp
Copy link
Collaborator Author

guenp commented Jun 17, 2024

Note that glue still implements the s3_... settings variables. I'm not really sure how to tackle that - Plugin currently only takes settings in the BasePlugin.create factory method. We could add secrets there, or just pass the creds. Thoughts?

@jwills
Copy link
Collaborator

jwills commented Jun 19, 2024

@guenp wanted to clarify the issue with the glue plugin; one could still specify the necessary S3 info in the settings, right? But if you wanted to e.g. use the credential provider chain in conjunction with the Glue plugin, that would fail?

@guenp
Copy link
Collaborator Author

guenp commented Jun 20, 2024

@guenp wanted to clarify the issue with the glue plugin; one could still specify the necessary S3 info in the settings, right? But if you wanted to e.g. use the credential provider chain in conjunction with the Glue plugin, that would fail?

Yes, that's right, because I've deprecated the _load_aws_credentials method, use_credential_provider no longer uses boto3 to fetch the credentials from aws SSO. If you use the credential provider, the key_id and secret are no longer explicitly stored in memory in Python, instead it's handled by DuckDB internally. So the glue plugin can therefore also no longer access them. If we want to use the credential provider with glue plugin we'd need to migrate the old code from _load_aws_credentials into there, I believe.

Additionally, if people do specify the key_id and secret explicitly, they will now do so via the secrets key, not the settings. So we need the glue plugin to be able to access the secrets somehow, right now it can only access settings. I think it's possible to do that, but am curious wdyt?

@jwills
Copy link
Collaborator

jwills commented Jun 20, 2024

Ah so this wouldn't be backwards compatible, then-- folks would need to cut over to the new way. I'm assuming there's got to be a similar deprecation schedule for upstream DuckDB, right?

@guenp
Copy link
Collaborator Author

guenp commented Jun 20, 2024

Ah so this wouldn't be backwards compatible, then-- folks would need to cut over to the new way. I'm assuming there's got to be a similar deprecation schedule for upstream DuckDB, right?

Upstream DuckDB has named it "legacy authentication scheme for S3 API", but not sure what the deprecation schedule looks like: https://duckdb.org/docs/extensions/httpfs/s3api_legacy_authentication.html

glue will work if you use the legacy S3 auth, but in this PR it currently doesn't work with the new secrets manager.

Couple questions just to make sure:

  1. Do we want to keep supporting legacy authentication S3 at least until DuckDB deprecates it, or do we want to switch everyone over to the new secrets manager? If so, I can put the _load_aws_credentials method & usage thereof back in, and throw an exception if S3 secrets are redundantly defined in the settings and in secrets.
  2. Do we want glue to work with the new secrets manager, or shall we keep it on the legacy auth and tell people to keep using that?

@guenp
Copy link
Collaborator Author

guenp commented Jun 21, 2024

use_credential_provider no longer uses boto3 to fetch the credentials from aws SSO.

This is actually incorrect, I checked and boto3 can fetch the creds from AWS SSO itself. So this is no longer an issue.

I'm going to move forward with making glue compatible with the Secrets Manager, and will add in a test to make sure S3 legacy auth is still supported.

Copy link
Collaborator

@jwills jwills left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good @guenp, thank you so much for doing it-- couple of substantive questions on deps and supporting additional types of secrets and their settings going forward to noodle on

@@ -39,7 +39,7 @@ def _dbt_duckdb_version():
install_requires=[
"dbt-common>=1,<2",
"dbt-adapters>=1,<2",
"duckdb>=0.7.0,!=0.10.3",
"duckdb>=0.10.0,!=0.10.3",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering when we want to simply require >= 1.0.0? Is this materially different than that?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also a little mentally fuzzy on how I should do minor/major version updates for adapters like dbt-duckdb since in theory we are now de-coupled from dbt-core versions via the dbt-adapters interface /cc @dbeatty10

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chiming in here from adapters, you are totally free to version as you see fit. We generally recommend that major version bumps reflect major changes to the adapter and should therefore be reserved. Minor versions should reflect meaningful changes to behavior.
Beyond versioning we do want to encourage adapters to keep it easy for folks to upgrade and to not surprise people with changes. To do that we want to introduce breaking behavioral changes as opt-in with a warning that the default behavior will change in a future version. We are going to be picking up a project in Q3 to add behavior flags to formalize this process. This should provide a framework for adapters to manage the life cycle of these changes.

Copy link
Collaborator

@jwills jwills Jul 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks Colin, super-helpful context here 🙇

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to bump to 1.0.0 in a separate PR, I want to make sure there's a commit that still has 0.10.2 included in case someone turns up with that requirement

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR open here!

dbt/adapters/duckdb/secrets.py Show resolved Hide resolved
dbt/adapters/duckdb/secrets.py Outdated Show resolved Hide resolved
@guenp guenp requested a review from jwills July 16, 2024 22:42
Copy link
Collaborator

@jwills jwills left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fantastic @guenp -- thank you so much!

@guenp guenp merged commit 79fa49b into master Jul 22, 2024
33 checks passed
@guenp guenp deleted the guenp/secrets branch July 22, 2024 16:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants