-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(ingest): snowflake using oauth #4647
feat(ingest): snowflake using oauth #4647
Conversation
Hi @anshbansal please pick up review on this one -- let's see if we can replicate on our Snowflake instance |
1 similar comment
Hi @anshbansal please pick up review on this one -- let's see if we can replicate on our Snowflake instance |
@saxo-lalrishav Few comments
I'll try and see see if I can validate this PR on my local after that. |
… feature/snowflake-oauth-oss
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs changes. Please also add docs.
import logging | ||
from typing import Any, Dict, List, Union | ||
|
||
import msal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this dependency in setup.py
? If we directly depending on it can we please add it as a direct dependency?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is already mentioned here -
datahub/metadata-ingestion/setup.py
Line 116 in 23f657e
microsoft_common = {"msal==1.16.0"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not added in snowflake
only powerbi. This might cause snowflake connector to fail for people not using powerbi
"powerbi": {"orderedset"} | microsoft_common,
return self._get_token(CLIENT_CREDENTIAL, scope, check_cache) | ||
|
||
def get_token_with_secret( | ||
self, secret: str, scope: Union[str, List[str]], check_cache: bool = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we please change scope
to be List[str]
only? And make the corresponding changes? I don't see why we need to have it a Union of both. Wherever we are passing we can make it List[str]
.
scopes
might represent that better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
metadata-ingestion/src/datahub/ingestion/source/sql/oauth_generator.py
Outdated
Show resolved
Hide resolved
return token | ||
|
||
def get_public_certificate_thumbprint(self, public_cert_str: str) -> str: | ||
cert_str = public_cert_str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this get_public_certificate_thumbprint
function is used to fetch thumbprinbt of the certificate, which is later used by msal library to generate the toke is use_certificate is true
@@ -69,6 +73,13 @@ class BaseSnowflakeConfig(BaseTimeWindowConfig): | |||
|
|||
scheme: str = "snowflake" | |||
username: Optional[str] = None | |||
use_certificate: Optional[bool] = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we extract all these oauth
settings in their own object? Currently it is hard to know what is required. 3 of them are not optional but this makes it hard to see whether they are optional or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
if values.get("base64_encoded_oauth_private_key") is None: | ||
raise ValueError( | ||
f"'base64_encoded_oauth_private_key' was none " | ||
f"but should be set when using {v} authentication" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This message needs to be changed and add it is needed when use_certificate
is true
if values.get("base64_encoded_oauth_public_key") is None: | ||
raise ValueError( | ||
f"'base64_encoded_oauth_public_key' was none" | ||
f"but should be set when using {v} authentication" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This message needs to be changed and add it is needed when use_certificate
is true
if values.get("oauth_client_secret") is None: | ||
raise ValueError( | ||
f"'oauth_client_secret' was none " | ||
f"but should be set when using {v} authentication" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This message needs to be changed and add it is needed when use_certificate
is true
@saxo-lalrishav Is this something that you are still working on? |
… feature/snowflake-oauth-oss
… feature/snowflake-oauth-oss
… feature/snowflake-oauth-oss
… feature/snowflake-oauth-oss
@anshbansal pls have a look into this and provide your feedback |
… feature/snowflake-oauth-oss
#4647 (comment) This comment is remaining |
Hi @saxo-lalrishav Are you still working on this PR? We can merge it after the changes requested in last comment. |
Yeah, looking into this |
… feature/snowflake-oauth-oss
For the lint failures suggest to re-create your local venv from scratch so that you have the latest libraries. |
… feature/snowflake-oauth-oss
…-datagov/datahub into feature/snowflake-oauth-oss
SnowflakeConfig.parse_obj( | ||
{ | ||
"authentication_type": "OAUTH_AUTHENTICATOR", | ||
"oauth_config": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be a python dictionary. This is not a yaml file. Same for all the tests added. So like
"oauth_config": {
"provider": "microsoft",
"scopes": "[https://microsoft.com/f4b353d5-ef8d/.default]",
"client_secret": "6Hb9apkbc6HD7",
"authority_url": "https://login.microsoftonline.com/yourorganisation.com",
}
This is the reason linting is failing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohh sorry , i got it, fixing it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a bug caught in the unit tests. Are you running these unit tests on local while testing? Please see https://datahubproject.io/docs/metadata-ingestion/developing/#sanity-check-code-before-committing.
> if values.get("oauth_config").provider is None:
E AttributeError: 'NoneType' object has no attribute 'provider'
src/datahub/ingestion/source_config/sql/snowflake.py:192: AttributeError
_ test_snwoflake_throws_error_on_encoded_oauth_private_key_missing_if_use_certificate_is_true _
def test_snwoflake_throws_error_on_encoded_oauth_private_key_missing_if_use_certificate_is_true():
with pytest.raises(ConfigurationError):
SnowflakeConfig.parse_obj(
{
"authentication_type": "OAUTH_AUTHENTICATOR",
"oauth_config": {
"client_id": "882e9831-7ea51cb2b954",
"provider": "microsoft",
"scopes": "[https://microsoft.com/f4b353d5-ef8d/.default]",
"use_certificate": True,
"authority_url": "https://login.microsoftonline.com/yourorganisation.com",
> "encoded_oauth_public_key": "fkdsfhkshfkjsdfiuwrwfkjhsfskfhksjf==",
},
}
)
… feature/snowflake-oauth-oss
… feature/snowflake-oauth-oss
…-datagov/datahub into feature/snowflake-oauth-oss
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Checklist