'airflow db clean' with --skip-archive flag fails #33606

Closed
1 of 2 tasks
gil-tober opened this issue Aug 22, 2023 · 7 comments · Fixed by #33622
Labels
affected_version:2.7 Issues Reported for 2.7 area:core kind:bug This is a clearly a bug

Comments

@gil-tober

gil-tober commented Aug 22, 2023

Apache Airflow version

2.7.0

What happened

Running airflow db clean -y -v --skip-archive --clean-before-timestamp '2023-05-24 00:00:00' fails.
Running the same command without the --skip-archive flag passes successfully.

Traceback (most recent call last):
  File "/home/airflow/.local/bin/airflow", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/__main__.py", line 60, in main
    args.func(args)
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/cli/cli_config.py", line 49, in command
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/utils/cli.py", line 113, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/utils/providers_configuration_loader.py", line 56, in wrapped_function
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/cli/commands/db_command.py", line 241, in cleanup_tables
    run_cleanup(
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/utils/session.py", line 77, in wrapper
    return func(*args, session=session, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/utils/db_cleanup.py", line 437, in run_cleanup
    _cleanup_table(
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/utils/db_cleanup.py", line 302, in _cleanup_table
    _do_delete(query=query, orm_model=orm_model, skip_archive=skip_archive, session=session)
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/utils/db_cleanup.py", line 197, in _do_delete
    target_table.drop()
  File "/home/airflow/.local/lib/python3.11/site-packages/sqlalchemy/sql/schema.py", line 978, in drop
    bind = _bind_or_error(self)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/sqlalchemy/sql/base.py", line 1659, in _bind_or_error
    raise exc.UnboundExecutionError(msg)
sqlalchemy.exc.UnboundExecutionError: Table object '_airflow_deleted__dag_run__20230822091212' is not bound to an Engine or Connection.  Execution can not proceed without a database to execute against.
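
One reading of the traceback above: `target_table.drop()` is called without a bind, and the `Table` is not attached to any engine or connection, so SQLAlchemy has nothing to run the `DROP TABLE` against. The sketch below shows one general way to avoid that, by passing the session's bind explicitly. The helper names (`make_archive_table`, `drop_with_session_bind`, `demo`), the table name, and the in-memory SQLite engine are assumptions made for illustration; this is not Airflow's code and not necessarily what the fix in #33622 does.

```python
from sqlalchemy import Column, Integer, MetaData, Table, create_engine, inspect
from sqlalchemy.orm import Session


def make_archive_table(name: str, metadata: MetaData) -> Table:
    """Hypothetical helper: define a table on a MetaData that has no bind."""
    return Table(name, metadata, Column("id", Integer, primary_key=True))


def drop_with_session_bind(table: Table, session: Session) -> None:
    """Drop the table using the engine the session is bound to.

    Passing ``bind`` explicitly means the Table itself does not need to be
    bound to an engine or connection.
    """
    table.drop(bind=session.get_bind())


def demo() -> bool:
    """Create and drop a stand-in archive table; report whether it still exists."""
    engine = create_engine("sqlite://")  # in-memory database, assumption for the demo
    metadata = MetaData()  # deliberately not bound to the engine
    table = make_archive_table("_archive_demo", metadata)
    table.create(bind=engine)
    with Session(bind=engine) as session:
        drop_with_session_bind(table, session)
    return inspect(engine).has_table("_archive_demo")
```

Calling `demo()` returns `False`, since the stand-in table no longer exists once it has been dropped through the session's bind.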

What you think should happen instead

db clean command with --skip-archive should pass

How to reproduce

airflow db clean -y -v --skip-archive --clean-before-timestamp '2023-05-24 00:00:00'

Operating System

Debian GNU/Linux 11 (bullseye)

Versions of Apache Airflow Providers

apache-airflow-providers-amazon==8.5.1
apache-airflow-providers-celery==3.3.2
apache-airflow-providers-cncf-kubernetes==7.4.2
apache-airflow-providers-common-sql==1.7.0
apache-airflow-providers-daskexecutor==1.0.0
apache-airflow-providers-docker==3.7.3
apache-airflow-providers-elasticsearch==5.0.0
apache-airflow-providers-ftp==3.5.0
apache-airflow-providers-google==8.3.0
apache-airflow-providers-grpc==3.2.1
apache-airflow-providers-hashicorp==3.4.2
apache-airflow-providers-http==4.5.0
apache-airflow-providers-imap==3.3.0
apache-airflow-providers-jenkins==3.3.1
apache-airflow-providers-microsoft-azure==6.2.4
apache-airflow-providers-mysql==5.2.1
apache-airflow-providers-odbc==4.0.0
apache-airflow-providers-openlineage==1.0.1
apache-airflow-providers-postgres==5.6.0
apache-airflow-providers-redis==3.3.1
apache-airflow-providers-salesforce==5.4.1
apache-airflow-providers-sendgrid==3.2.1
apache-airflow-providers-sftp==4.5.0
apache-airflow-providers-slack==7.3.2
apache-airflow-providers-snowflake==4.4.2
apache-airflow-providers-sqlite==3.4.3
apache-airflow-providers-ssh==3.7.1
apache-airflow-providers-tableau==4.2.1

Deployment

Official Apache Airflow Helm Chart

Deployment details

Helm version 1.9.0
Deployed on AWS EKS cluster
MetaDB is an AWS Postgres RDS instance connected through PgBouncer

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@gil-tober gil-tober added area:core kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet labels Aug 22, 2023
@SamWheating
Contributor

I was able to replicate this issue. I can dive a bit deeper later today - feel free to assign it to me.

@potiuk
Member

potiuk commented Aug 22, 2023

Did :).

Looks like one assignment a day for you recently :)

@SamWheating
Contributor

> Looks like one assignment a day for you recently :)

I've got some time at work set aside for hacking / learning, so I figured I'd hop over here and clean up some issues. Let me know if there's anything in particular you could use help on :)

@eladkal eladkal added affected_version:2.7 Issues Reported for 2.7 and removed needs-triage label for new issues that we didn't triage yet labels Aug 22, 2023
@potiuk
Member

potiuk commented Aug 22, 2023

> I've got some time at work set aside for hacking / learning, so I figured I'd hop over here and clean up some issues. Let me know if there's anything in particular you could use help on :)

I have one particular issue that I have had no time to look at, as I do not really understand how it is "supposed to work". I think it has been a regression since 2.6.2/3, and it is related to the cleaning of DAG access level permissions by DagFileProcessor. It was added recently, we have had no time to look at it, and it kind of bugs me. I can share more details if you want :D

@SamWheating
Contributor

Sounds interesting, is there an open issue? Please share more details

@potiuk
Member

potiuk commented Aug 22, 2023

Here it is. @SamWheating

Issue here: #32839
It seems that this one: #30340 (added in 2.6.2), which fixed another issue (#25149), has introduced the problem that DAG-level permissions are cleared after (re)parsing the DAG. We have a PR reverting it: #32999, but I am not sure it's the best fix, as I generally have very little knowledge about this part of Airflow :(

@gil-tober
Author

Saw that there is already a PR
Thanks @SamWheating @potiuk
