Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DB migrate throws na error on encrypt_trigger_kwargs #38836

Closed
1 of 2 tasks
romanzdk opened this issue Apr 8, 2024 · 11 comments · Fixed by #39246
Closed
1 of 2 tasks

DB migrate throws na error on encrypt_trigger_kwargs #38836

romanzdk opened this issue Apr 8, 2024 · 11 comments · Fixed by #39246
Assignees
Labels
affected_version:2.9 area:core area:db-migrations PRs with DB migration kind:bug This is a clearly a bug
Milestone

Comments

@romanzdk
Copy link

romanzdk commented Apr 8, 2024

Apache Airflow version

Other Airflow 2 version (please specify below)

If "Other Airflow 2 version" selected, which one?

2.9.0

What happened?

airflow db migrate throws an error:

[2024-04-08T13:41:38.179+0000] {migration.py:216} INFO - Context impl PostgresqlImpl.                                                                                   │
[2024-04-08T13:41:38.180+0000] {migration.py:219} INFO - Will assume transactional DDL.                                                                                 │
[2024-04-08T13:41:38.185+0000] {db.py:1650} INFO - Creating tables                                                                                                      │
INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.                                                                                                          │
INFO  [alembic.runtime.migration] Will assume transactional DDL.                                                                                                        │
INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.                                                                                                          │
INFO  [alembic.runtime.migration] Will assume transactional DDL.                                                                                                        │
Traceback (most recent call last):                                                                                                                                      │
  File "/usr/local/bin/airflow", line 8, in <module>                                                                                                                    │
    sys.exit(main())                                                                                                                                                    │
  File "/usr/local/lib/python3.10/site-packages/airflow/__main__.py", line 58, in main                                                                                  │
    args.func(args)                                                                                                                                                     │
  File "/usr/local/lib/python3.10/site-packages/airflow/cli/cli_config.py", line 49, in command                                                                         │
    return func(*args, **kwargs)                                                                                                                                        │
  File "/usr/local/lib/python3.10/site-packages/airflow/utils/cli.py", line 114, in wrapper                                                                             │
    return f(*args, **kwargs)                                                                                                                                           │
  File "/usr/local/lib/python3.10/site-packages/airflow/utils/providers_configuration_loader.py", line 55, in wrapped_function                                          │
    return func(*args, **kwargs)                                                                                                                                        │
  File "/usr/local/lib/python3.10/site-packages/airflow/cli/commands/db_command.py", line 130, in migratedb                                                             │
    db.upgradedb(                                                                                                                                                       │
  File "/usr/local/lib/python3.10/site-packages/airflow/utils/session.py", line 79, in wrapper                                                                          │
    return func(*args, session=session, **kwargs)                                                                                                                       │
  File "/usr/local/lib/python3.10/site-packages/airflow/utils/db.py", line 1674, in upgradedb                                                                           │
    encrypt_trigger_kwargs(session=session)                                                                                                                             │
  File "/usr/local/lib/python3.10/site-packages/airflow/utils/db.py", line 982, in encrypt_trigger_kwargs                                                               │
    trigger.kwargs = BaseSerialization.deserialize(json.loads(trigger.encrypted_kwargs))                                                                                │
  File "/usr/local/lib/python3.10/json/__init__.py", line 346, in loads                                                                                                 │
    return _default_decoder.decode(s)                                                                                                                                   │
  File "/usr/local/lib/python3.10/json/decoder.py", line 337, in decode                                                                                                 │
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())                                                                                                                   │
  File "/usr/local/lib/python3.10/json/decoder.py", line 355, in raw_decode                                                                                             │
    raise JSONDecodeError("Expecting value", s, err.value) from None                                                                                                    │
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

What you think should happen instead?

no error

How to reproduce

use python:3.10.12-slim-bookworm image with airflow 2.9.0. deploy into kubernetes cluster 1.26.5. Celery Executor. run airflow db migrate

Operating System

Debian GNU/Linux 12 (bookworm)

Versions of Apache Airflow Providers

apache-airflow-providers-amazon==8.19.0
apache-airflow-providers-celery==3.6.1
apache-airflow-providers-cncf-kubernetes==8.0.1
apache-airflow-providers-common-io==1.3.0
apache-airflow-providers-common-sql==1.11.1
apache-airflow-providers-fab==1.0.2
apache-airflow-providers-ftp==3.7.0
apache-airflow-providers-http==4.10.0
apache-airflow-providers-imap==3.5.0
apache-airflow-providers-postgres==5.10.2
apache-airflow-providers-smtp==1.6.1
apache-airflow-providers-sqlite==3.7.1

Deployment

Other Docker-based deployment

Deployment details

on-premise Kubernetes cluster 1.26.5, celery executor, PostgreSQL 13.13

Anything else?

Related: 6f0e5bc #38358

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@romanzdk romanzdk added area:core kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet labels Apr 8, 2024
@romanzdk romanzdk changed the title DB migrate throws na error on encrypt_trigger_kwargs [2.9.0] DB migrate throws na error on encrypt_trigger_kwargs Apr 8, 2024
@potiuk
Copy link
Member

potiuk commented Apr 8, 2024

cc: @hussein-awala

@potiuk potiuk added this to the Airflow 2.9.1 milestone Apr 8, 2024
@potiuk potiuk removed the needs-triage label for new issues that we didn't triage yet label Apr 8, 2024
@eladkal eladkal changed the title [2.9.0] DB migrate throws na error on encrypt_trigger_kwargs DB migrate throws na error on encrypt_trigger_kwargs Apr 8, 2024
@AronsonDan
Copy link

Hey,
We are experiencing the same issue.
environment details:
Python version: 3.11
docker image: apache/airflow:2.9.0-python3.11
executor: Kubernetes executor

traceback:

Traceback (most recent call last):
  File "/home/airflow/.local/bin/airflow", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/__main__.py", line 58, in main
    args.func(args)
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/cli/cli_config.py", line 49, in command
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/utils/cli.py", line 114, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/utils/providers_configuration_loader.py", line 55, in wrapped_function
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/cli/commands/db_command.py", line 130, in migratedb
    db.upgradedb(
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/utils/session.py", line 79, in wrapper
    return func(*args, session=session, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/utils/db.py", line 1674, in upgradedb
    encrypt_trigger_kwargs(session=session)
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/utils/db.py", line 982, in encrypt_trigger_kwargs
    trigger.kwargs = BaseSerialization.deserialize(json.loads(trigger.encrypted_kwargs))
                                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

@hussein-awala hussein-awala self-assigned this Apr 9, 2024
@hussein-awala
Copy link
Member

I will take a look.

Did you have this error the first time you ran airflow db migrate or did you run it multiple times?

@AronsonDan
Copy link

I will take a look.

Did you have this error the first time you ran airflow db migrate or did you run it multiple times?

Both on my end

@hussein-awala
Copy link
Member

I found the root of the issue, I'm working on the fix

@hussein-awala
Copy link
Member

There are some tricky conditions for kwargs encryption/decryption; the first time you run the command it will work without any issue, but when you run it the next time, it will try to encrypt the kwargs again which fails because they are encrypted.
Here is the fix: #38876

@eladkal
Copy link
Contributor

eladkal commented Apr 9, 2024

@hussein-awala @ephraimbuddy do you think this one calls for ad hoc release with the fix ?

@romanzdk
Copy link
Author

Hi, is the release planned, please?

@ephraimbuddy
Copy link
Contributor

ephraimbuddy commented Apr 16, 2024

Hi, is the release planned, please?

Yes. This week/Next week

@dstandish
Copy link
Contributor

While not trying to reopen a settled debate, I'm feeling a bit like maybe we should have just not encrypted trigger kwargs. I am curious if there are any examples of operators that are (or were) passing around sensitive info to triggers. I'm curious if that was really necessary or we could have just not passed sensitive info and relied on the connection object etc.

@potiuk
Copy link
Member

potiuk commented Apr 25, 2024

I'm curious if that was really necessary or we could have just not passed sensitive info and relied on the connection object etc.

See https://nvd.nist.gov/vuln/detail/CVE-2023-51702 has all details of what happened in the past (including PR fixing it).

And yes - it's been fixed by passing a reference to configuration rather than dictionary of configuration.

During the discussion in the security team we agreed that this is not at all obvious for those who create Triggers - including 3rd-party triggers, because they might not be aware that database might keep sensitive information.

Indeed. It's not necessary, but it prevents a number of security issues - including those that users might add in their code accidentally.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment