Postgres is raising lock errors when using more than 1 Airflow scheduler replica #39781
Closed
2 tasks done
Labels
area:core
duplicate
kind:bug
needs-triage
Apache Airflow version
2.9.1
If "Other Airflow 2 version" selected, which one?
No response
What happened?
Hi,
We are using Airflow 2.9.1 with PostgreSQL 15.7.0 on Azure Kubernetes Service.
This behaviour does not seem to affect the normal operation of the system, but we are receiving hundreds of error messages like this:
2024-05-23 09:44:06.574 GMT [221572] ERROR: could not obtain lock on row in relation "dag_run"
2024-05-23 09:44:06.574 GMT [221572] STATEMENT: SELECT dag_run.state AS dag_run_state, dag_run.id AS dag_run_id, dag_run.dag_id AS dag_run_dag_id, dag_run.queued_at AS dag_run_queued_at, dag_run.execution_date AS dag_run_execution_date, dag_run.start_date AS dag_run_start_date, dag_run.end_date AS dag_run_end_date, dag_run.run_id AS dag_run_run_id, dag_run.creating_job_id AS dag_run_creating_job_id, dag_run.external_trigger AS dag_run_external_trigger, dag_run.run_type AS dag_run_run_type, dag_run.conf AS dag_run_conf, dag_run.data_interval_start AS dag_run_data_interval_start, dag_run.data_interval_end AS dag_run_data_interval_end, dag_run.last_scheduling_decision AS dag_run_last_scheduling_decision, dag_run.dag_hash AS dag_run_dag_hash, dag_run.log_template_id AS dag_run_log_template_id, dag_run.updated_at AS dag_run_updated_at, dag_run.clear_number AS dag_run_clear_number
FROM dag_run
WHERE dag_run.dag_id = '' AND dag_run.run_id = '*' FOR UPDATE NOWAIT
This seems to happen when there is more than one scheduler replica.
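For context, `FOR UPDATE NOWAIT` makes PostgreSQL report an error immediately, instead of waiting, when a selected row is already locked by another transaction. The sketch below is only a toy model of that behaviour in plain Python, not Airflow or PostgreSQL code: a `threading.Lock` stands in for the row lock, and the names `RowLockedError`, `select_for_update_nowait`, `example_dag` and `run_1` are hypothetical.

```python
import threading


class RowLockedError(Exception):
    """Hypothetical stand-in for the 'could not obtain lock on row' error."""


# One lock per (dag_id, run_id) "row"; a toy replacement for a row-level lock.
_row_locks = {}
_registry_guard = threading.Lock()


def select_for_update_nowait(dag_id, run_id):
    """Try to lock a 'row' without waiting, mimicking FOR UPDATE NOWAIT."""
    key = (dag_id, run_id)
    with _registry_guard:
        lock = _row_locks.setdefault(key, threading.Lock())
    # blocking=False returns immediately instead of waiting for the holder.
    if not lock.acquire(blocking=False):
        raise RowLockedError(
            f'could not obtain lock on row in relation "dag_run": {key}'
        )
    return lock


# "Scheduler A" locks the row first and has not committed yet.
held = select_for_update_nowait("example_dag", "run_1")

# "Scheduler B" asks for the same row and fails at once rather than blocking.
try:
    select_for_update_nowait("example_dag", "run_1")
    second_attempt = "locked"
except RowLockedError:
    second_attempt = "error"

held.release()  # "Scheduler A" commits, which releases the row lock.

# After the release, the same row can be locked again.
retry = select_for_update_nowait("example_dag", "run_1")
retry.release()
```

In this model the second attempt fails only because the first holder is still active, which would be consistent with the errors appearing only when more than one scheduler replica is running.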
What you think should happen instead?
We should not receive this type of error.
How to reproduce
Use PostgreSQL 15.7.0.
Increase the number of scheduler replicas to more than 1.
Run several DAG runs of the same DAG in parallel.
Watch the Postgres server logs.
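To quantify the errors while watching the logs, a small script along these lines could help. It is a minimal sketch that assumes the log line format shown in the excerpt above (timestamp, time zone, PID in brackets, then `ERROR:`); a different `log_line_prefix` setting would need a different pattern. The names `LOCK_ERROR` and `count_lock_errors` are hypothetical, and the `STATEMENT` line in the sample is abbreviated.

```python
import re
from collections import Counter

# Pattern modelled on the log excerpt in this report (assumption: other
# log_line_prefix settings would produce a different line layout).
LOCK_ERROR = re.compile(
    r'^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \w+ \[(?P<pid>\d+)\] '
    r'ERROR:\s+could not obtain lock on row in relation "(?P<relation>[^"]+)"'
)


def count_lock_errors(lines):
    """Count 'could not obtain lock on row' errors per relation."""
    counts = Counter()
    for line in lines:
        match = LOCK_ERROR.match(line)
        if match:
            counts[match.group("relation")] += 1
    return counts


# Sample lines taken from the excerpt above; the STATEMENT line is shortened.
sample = [
    '2024-05-23 09:44:06.574 GMT [221572] ERROR: could not obtain lock on row in relation "dag_run"',
    '2024-05-23 09:44:06.574 GMT [221572] STATEMENT: SELECT dag_run.state AS dag_run_state FROM dag_run',
]
counts = count_lock_errors(sample)
print(dict(counts))
```

Passing an open log file instead of `sample` would give the per-relation totals for a whole log.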
Operating System
Ubuntu 20.04.5
Versions of Apache Airflow Providers
apache-airflow-providers-microsoft-mssql==3.6.1
apache-airflow-providers-snowflake==5.4.0
apache-airflow-providers-microsoft-azure==10.0.0
apache-airflow-providers-http==4.10.1
apache-airflow-providers-cncf-kubernetes==8.1.1
apache-airflow-providers-common-sql==1.12.0
Deployment
Official Apache Airflow Helm Chart
Deployment details
No response
Anything else?
No response
Are you willing to submit PR?
Code of Conduct