This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Manage Compute engine credentials in addition of oauth2 #92

Closed
lucienfregosibodyguard opened this issue Nov 10, 2022 · 13 comments · Fixed by #98
Labels
bug Something isn't working

Comments

@lucienfregosibodyguard
Contributor

Hi Prefect-dbt team,

Following this thread (#56),
I tried to authenticate from within a Kubernetes pod associated with a valid service account.
google.auth.default() returns Compute Engine credentials: https://google-auth.readthedocs.io/en/master/reference/google.auth.compute_engine.credentials.html

The code then fails with: 'Credentials' object has no attribute 'refresh_token'
From what I understand, the code expects google.oauth2 credentials: https://google-auth.readthedocs.io/en/stable/reference/google.oauth2.credentials.html

It would be great to be able to use the compute_engine credentials.

@ahuang11
Contributor

Hi @lucienfregosibodyguard, thanks for reporting this. Do you have a traceback?

Also, would you be interested in contributing a fix? The relevant code might be here:
https://github.com/PrefectHQ/prefect-gcp/blob/main/prefect_gcp/credentials.py#L127-L144

@lucienfregosibodyguard
Contributor Author

lucienfregosibodyguard commented Nov 14, 2022

Yes, here is the traceback:

Encountered exception during execution:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/prefect/engine.py", line 1215, in orchestrate_task_run
    result = await task.fn(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/prefect_dbt/cli/commands.py", line 136, in trigger_dbt_cli_command
    profile = dbt_cli_profile.get_profile()
  File "/usr/local/lib/python3.9/site-packages/prefect_dbt/cli/credentials.py", line 111, in get_profile
    "outputs": {self.target: self.target_configs.get_configs()},
  File "/usr/local/lib/python3.9/site-packages/prefect_dbt/cli/configs/bigquery.py", line 114, in get_configs
    configs_json[key] = getattr(google_credentials, key)
AttributeError: 'Credentials' object has no attribute 'refresh_token'
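
The last frame of the traceback shows an unconditional getattr over a list of keys, which raises as soon as one key is missing from the credentials object. A minimal sketch of a more tolerant lookup is below. The two credential classes and the collect_credential_fields helper are hypothetical stand-ins written for illustration; they are not prefect-dbt or google-auth code.

```python
class ComputeEngineLikeCredentials:
    """Hypothetical stand-in: exposes a token but no refresh_token."""

    token = "example-access-token"


class OAuth2LikeCredentials:
    """Hypothetical stand-in for user OAuth2 credentials."""

    token = "example-access-token"
    refresh_token = "example-refresh-token"
    client_id = "example-client-id"


def collect_credential_fields(credentials, keys):
    """Copy only the attributes that exist and are set on the credentials."""
    configs = {}
    for key in keys:
        # A default of None avoids the AttributeError seen in the traceback.
        value = getattr(credentials, key, None)
        if value is not None:
            configs[key] = value
    return configs


KEYS = ("token", "refresh_token", "client_id")
compute_fields = collect_credential_fields(ComputeEngineLikeCredentials(), KEYS)
oauth_fields = collect_credential_fields(OAuth2LikeCredentials(), KEYS)
print(compute_fields)
print(oauth_fields)
```

With this approach the caller still has to decide which dbt authentication method matches the fields that were actually found.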

We could use something like this
https://programtalk.com/python-more-examples/google.auth.compute_engine.credentials.Credentials/

Will try to make a PR

@lucienfregosibodyguard
Contributor Author

@ahuang11 I tried to use the code snippet above but it doesn't work

@lucienfregosibodyguard
Contributor Author

@ahuang11 do you have time to help with this? I'm still stuck on my side :/

@ahuang11
Contributor

ahuang11 commented Dec 2, 2022

@lucienfregosibodyguard can you try pulling this branch to test? #98

@ahuang11 ahuang11 added the bug Something isn't working label Dec 2, 2022
@ahuang11 ahuang11 mentioned this issue Dec 2, 2022
@lucienfregosibodyguard
Contributor Author

Hi @ahuang11, I looked at the PR, but it seems that token is not an attribute of the class:
https://google-auth.readthedocs.io/en/master/reference/google.auth.compute_engine.credentials.html

Maybe I'm missing something, but I can't see how it can work 😕

@ahuang11
Contributor

ahuang11 commented Dec 5, 2022

There might be multiple credential types; I'm looking at google.auth.default(): https://google-auth.readthedocs.io/en/master/reference/google.auth.html

def default(scopes=None, request=None, quota_project_id=None, default_scopes=None):
    """Gets the default credentials for the current environment.

    `Application Default Credentials`_ provides an easy way to obtain
    credentials to call Google APIs for server-to-server or local applications.
    This function acquires credentials from the environment in the following
    order:

    1. If the environment variable ``GOOGLE_APPLICATION_CREDENTIALS`` is set
       to the path of a valid service account JSON private key file, then it is
       loaded and returned. The project ID returned is the project ID defined
       in the service account file if available (some older files do not
       contain project ID information).

       If the environment variable is set to the path of a valid external
       account JSON configuration file (workload identity federation), then the
       configuration file is used to determine and retrieve the external
       credentials from the current environment (AWS, Azure, etc).
       These will then be exchanged for Google access tokens via the Google STS
       endpoint.
       The project ID returned in this case is the one corresponding to the
       underlying workload identity pool resource if determinable.
    2. If the `Google Cloud SDK`_ is installed and has application default
       credentials set they are loaded and returned.

       To enable application default credentials with the Cloud SDK run::

            gcloud auth application-default login

       If the Cloud SDK has an active project, the project ID is returned. The
       active project can be set using::

            gcloud config set project

    3. If the application is running in the `App Engine standard environment`_
       (first generation) then the credentials and project ID from the
       `App Identity Service`_ are used.
    4. If the application is running in `Compute Engine`_ or `Cloud Run`_ or
       the `App Engine flexible environment`_ or the `App Engine standard
       environment`_ (second generation) then the credentials and project ID
       are obtained from the `Metadata Service`_.
    5. If no credentials are found,
       :class:`~google.auth.exceptions.DefaultCredentialsError` will be raised.

    .. _Application Default Credentials: https://developers.google.com\
            /identity/protocols/application-default-credentials
    .. _Google Cloud SDK: https://cloud.google.com/sdk
    .. _App Engine standard environment: https://cloud.google.com/appengine
    .. _App Identity Service: https://cloud.google.com/appengine/docs/python\
            /appidentity/
    .. _Compute Engine: https://cloud.google.com/compute
    .. _App Engine flexible environment: https://cloud.google.com\
            /appengine/flexible
    .. _Metadata Service: https://cloud.google.com/compute/docs\
            /storing-retrieving-metadata
    .. _Cloud Run: https://cloud.google.com/run

@ahuang11
Contributor

ahuang11 commented Dec 5, 2022

"google.oauth2.service_account.Credentials" seems to have the token attr.

@lucienfregosibodyguard
Contributor Author

Yes, you're right, it does have a token attribute!

A few remarks:

  • To get the token, we need to refresh the credentials like this:
        request = google.auth.transport.requests.Request()
        google_credentials.refresh(request)
        configs_json["token"] = google_credentials.token
  • We need to add a method key:
        configs_json["method"] = "service-account"
  • I then got this profile file:

config: {}
dbt_models:
  outputs:
    <target>:
      method: service-account
      project: <project>
      schema: <dataset>
      threads: 4
      token: <token>
      type: bigquery
  target: <target>

With that profile I got the error: Database Error expected str, bytes or os.PathLike object, not NoneType

Then I added dbname: <project>, but I got a new error: Got duplicate keys: (project) all map to "database"

Hope that helps

@lucienfregosibodyguard
Contributor Author

lucienfregosibodyguard commented Dec 6, 2022

Oh following this

We need method: oauth-secrets, and then it works!
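
Putting the remarks in this thread together, a minimal sketch of building the target block might look like the following. The build_bigquery_target helper and the FakeCredentials class are hypothetical and exist only to make the example runnable without network access. In real use the credentials would come from google.auth.default() and the request from google.auth.transport.requests.Request(), as in the snippet quoted earlier.

```python
def build_bigquery_target(credentials, request, project, schema, threads=4):
    """Refresh the credentials, then build a dbt BigQuery target from the token."""
    # The thread reports that the token is only populated after a refresh.
    credentials.refresh(request)
    return {
        "type": "bigquery",
        "method": "oauth-secrets",  # the method reported to work in this thread
        "project": project,
        "schema": schema,
        "threads": threads,
        "token": credentials.token,
    }


class FakeCredentials:
    """Hypothetical stand-in that mimics refresh() filling in a token."""

    def __init__(self):
        self.token = None

    def refresh(self, request):
        self.token = "fake-access-token"


target = build_bigquery_target(
    FakeCredentials(), request=None, project="my-project", schema="my_dataset"
)
print(target)
```

Note that an access token obtained this way is short-lived, so a profile written with it will stop working once the token expires.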

@lucienfregosibodyguard
Contributor Author

I opened a PR: #100

@ahuang11
Contributor

ahuang11 commented Dec 8, 2022

This should now be fixed in v0.2.6. Please reopen if that's not the case!

@lucienfregosibodyguard
Contributor Author

Hi @ahuang11, sadly it still doesn't work; I got a new error:

Encountered exception during execution:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 1339, in orchestrate_task_run
    result = await task.fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/prefect_dbt/cli/commands.py", line 139, in trigger_dbt_cli_command
    yaml.dump(profile, f, default_flow_style=False)
  File "/usr/local/lib/python3.10/site-packages/yaml/__init__.py", line 253, in dump
    return dump_all([data], stream, Dumper=Dumper, **kwds)
  File "/usr/local/lib/python3.10/site-packages/yaml/__init__.py", line 241, in dump_all
    dumper.represent(data)
  File "/usr/local/lib/python3.10/site-packages/yaml/representer.py", line 27, in represent
    node = self.represent_data(data)
  File "/usr/local/lib/python3.10/site-packages/yaml/representer.py", line 48, in represent_data
    node = self.yaml_representers[data_types[0]](self, data)
  File "/usr/local/lib/python3.10/site-packages/yaml/representer.py", line 207, in represent_dict
    return self.represent_mapping('tag:yaml.org,2002:map', data)
  File "/usr/local/lib/python3.10/site-packages/yaml/representer.py", line 118, in represent_mapping
    node_value = self.represent_data(item_value)
  File "/usr/local/lib/python3.10/site-packages/yaml/representer.py", line 48, in represent_data
    node = self.yaml_representers[data_types[0]](self, data)
  File "/usr/local/lib/python3.10/site-packages/yaml/representer.py", line 207, in represent_dict
    return self.represent_mapping('tag:yaml.org,2002:map', data)
  File "/usr/local/lib/python3.10/site-packages/yaml/representer.py", line 118, in represent_mapping
    node_value = self.represent_data(item_value)
  File "/usr/local/lib/python3.10/site-packages/yaml/representer.py", line 48, in represent_data
    node = self.yaml_representers[data_types[0]](self, data)
  File "/usr/local/lib/python3.10/site-packages/yaml/representer.py", line 207, in represent_dict
    return self.represent_mapping('tag:yaml.org,2002:map', data)
  File "/usr/local/lib/python3.10/site-packages/yaml/representer.py", line 118, in represent_mapping
    node_value = self.represent_data(item_value)
  File "/usr/local/lib/python3.10/site-packages/yaml/representer.py", line 52, in represent_data
    node = self.yaml_multi_representers[data_type](self, data)
  File "/usr/local/lib/python3.10/site-packages/yaml/representer.py", line 317, in represent_object
    reduce = data.__reduce_ex__(2)
TypeError: cannot pickle 'coroutine' object

I guess it's related to the sync definition, but I don't know how to fix it.
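
One plausible reading of this traceback is that an async function was called without await, so a coroutine object ended up as a value inside the profile dictionary, three levels deep, before yaml.dump tried to serialize it. The sketch below only shows how such a value could be located before dumping. The find_coroutines helper, the async get_configs function, and the example profile are all hypothetical; this is not the prefect-dbt implementation.

```python
import inspect


def find_coroutines(value, path=""):
    """Return the dotted paths of any un-awaited coroutine objects in a nested structure."""
    found = []
    if inspect.iscoroutine(value):
        found.append(path or "<root>")
    elif isinstance(value, dict):
        for key, item in value.items():
            child = f"{path}.{key}" if path else str(key)
            found.extend(find_coroutines(item, child))
    elif isinstance(value, (list, tuple)):
        for index, item in enumerate(value):
            found.extend(find_coroutines(item, f"{path}[{index}]"))
    return found


async def get_configs():
    """Hypothetical async method standing in for a target config getter."""
    return {"type": "bigquery"}


# Calling an async function without await yields a coroutine, not a dict.
pending = get_configs()
profile = {"config": {}, "dbt_models": {"outputs": {"dev": pending}, "target": "dev"}}

offending_paths = find_coroutines(profile)
print(offending_paths)

# Close the coroutine so Python does not warn that it was never awaited.
pending.close()
```

If a check like this reports a path, the fix would be to await the call (or run it through a synchronous wrapper) at the place where that value is produced.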
