Status of testing Providers that were prepared on January 26, 2024 #36948
Tested #36861, connection ID and impersonation chain are now properly passed. ✅

Tested #36752 and it works as expected.

All my changes work! 🎉

Hi!
#36276 works as expected.
#36828 works as expected.
#36817 works as expected.

Tested my changes, they look good.
#36922 tested. It is working for configmaps mounted as a volume, but not for configmaps mounted as an environment variable. I was troubleshooting the issue and it appears that I will need to make `env_from` templated, not the `configmaps` attribute. Any input from K8s operator experts? Also, please let me know how to proceed with the fix: a new issue, or just open a new PR with the fix?
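For context on why `env_from` would need to be templated: Airflow only Jinja-renders attributes listed in an operator's `template_fields`. The toy model below illustrates that mechanism; the class names and the `str.format`-based rendering are stand-ins for illustration, not the real Airflow API or the actual fix in #37001.

```python
# Toy model of Airflow's template_fields mechanism (illustrative only).

class BaseOperatorSketch:
    template_fields = ()

    def render_template_fields(self, context):
        # Airflow walks template_fields and Jinja-renders each attribute;
        # a naive str.format stands in for Jinja here.
        for field in self.template_fields:
            value = getattr(self, field)
            if isinstance(value, str):
                setattr(self, field, value.format(**context))


class PodOperatorSketch(BaseOperatorSketch):
    # Listing "env_from" here is what makes it templated; listing only
    # "configmaps" would leave env_from untouched at render time.
    template_fields = ("env_from",)

    def __init__(self, env_from):
        self.env_from = env_from


op = PodOperatorSketch(env_from="configmap-{ds}")
op.render_template_fields({"ds": "2024-01-26"})
print(op.env_from)  # -> configmap-2024-01-26
```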
Opened #37001 with the fix and unit tested it.
Cool. Yep, that's ok. In Airflow
Tested #36171. Env: Amazon MWAA v2.7.2 instance. Steps:

(2) Added `amazon: 8.17.0rc1` to requirements, and also bumped constraints:

```
apache-airflow-providers-amazon==8.17.0rc1
boto3==1.33.0
botocore==1.33.0
s3transfer==0.8.0
redshift_connector==2.0.918
```

(3) Created connection:

```json
{
  "work_group": "primary",
  "region_name": "us-east-1"
}
```

(4) Created connection:

```json
{
  "work_group": "primary",
  "region_name": "us-east-1",
  "role_arn": "arn:aws:iam::xxxxxxxxx:role/athena-access"
}
```

(5) Test DAG:

```python
create_sql_table1 = SQLExecuteQueryOperator(
    task_id="create_sql_table1",
    conn_id="athena_default",
    sql="SELECT 1;SELECT 2;SELECT 3;SELECT 4",
    split_statements=True,
    dag=dag,
)
create_sql_table2 = SQLExecuteQueryOperator(
    task_id="create_sql_table2",
    conn_id="athena_assumed_role",
    sql="SELECT 1;SELECT 2;SELECT 3;SELECT 4",
    split_statements=True,
    dag=dag,
)
create_sql_table1 >> create_sql_table2
```
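As an aside, the `split_statements=True` behavior exercised by the DAG above can be modeled with a naive splitter. This is illustrative only; the function name is hypothetical, and the real common.sql provider uses a proper SQL parser (e.g. to handle semicolons inside quoted strings):

```python
def split_sql_statements(sql):
    # Naive model of split_statements: split on ';' and drop empty pieces.
    return [s.strip() for s in sql.split(";") if s.strip()]


print(split_sql_statements("SELECT 1;SELECT 2;SELECT 3;SELECT 4"))
# -> ['SELECT 1', 'SELECT 2', 'SELECT 3', 'SELECT 4']
```

With `split_statements=True` each fragment is sent to the database as a separate statement, which is why the test runs four `SELECT`s from one `sql` string.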
Yes, this is a new feature.

So I will not issue RC2, as this release has many new features that we shouldn't hold. The fix will be released in the next wave.
Error raised on Airflow 2.6.3:

```
[2024-01-25, 08:29:34 UTC] Task failed with exception
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/providers/slack/operators/slack.py", line 258, in execute
    self._method_resolver(
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/providers/slack/operators/slack.py", line 254, in _method_resolver
    return self.hook.send_file
  File "/usr/local/lib/python3.10/functools.py", line 981, in __get__
    val = self.func(instance)
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/providers/slack/operators/slack.py", line 82, in hook
    return SlackHook(
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/providers/slack/hooks/slack.py", line 117, in __init__
    super().__init__(logger_name=extra_client_args.pop("logger_name", None))
TypeError: LoggingMixin.__init__() got an unexpected keyword argument 'logger_name'
```

I guess this affects only providers which explicitly propagate it, something like
Did a simple test of migrating AWS operators/sensors/triggerers to base classes. Works fine; however, for most of the classes I've checked only the instantiation of objects.
mongo: 3.6.0rc1

Yes. Looks like #34964, which introduced `logger_name`, had a hidden backwards-compatibility issue, and we simply can't pass `logger_name` for Airflow < 2.8.0. I guess we should fix these providers to only pass it conditionally. Something like:

```python
if packaging.version.parse(
    packaging.version.parse(airflow_version).base_version
) < packaging.version.parse("2.8.0"):
    super().__init__()
else:
    super().__init__(logger_name=kwargs.pop("logger_name", None))
```
Or it might be something like this, to avoid parsing the version:

```python
extra_kwargs = {}
if logger_name := kwargs.pop("logger_name", None):
    extra_kwargs["logger_name"] = logger_name
super().__init__(**extra_kwargs)
```
The first approach - the difference in the first case is that you can use the same DAG / Operator across different Airflow versions. The second one depends on how you use it - so, for example, you would not be able to have an operator with `logger_name` that will work both pre- and post-2.8.0 (unless you do a version check in the operator).
My proposed fix is just to avoid providing it. In general usage, users can't use/define attributes which are not available in the BaseOperator / BaseHook for a particular version.
I understand - but the thing is that our Hooks are mostly used internally by Operators, and both Hooks and Operators are released in providers, so there is a risk that someone will add an Operator that initializes a Hook with `logger_name` (which might then be exposed by, or even hard-coded in, the operator). By implementing a protection against it in the Hook, we allow such a change to happen without waiting for min-airflow >= 2.8.0. In other words, with the second approach our code in Airflow that uses hooks will not be able to use it. The first proposal will work in all cases. It will only propagate `logger_name` to the Base Hook on Airflow 2.8.0, and will set no requirement on the users of the Hook to know they are run in an Airflow 2.8.0 environment. You might just write a custom operator (or we can implement it in our operators) to specify `logger_name` and it will work, regardless of which Airflow version the operator is installed at. So in the above example - if someone would like to create their own provider and use `DBApiHook`, they will be able to create an operator that will use it and pass `logger_name="mylogger"` to it (and they will not have to add …)
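Putting the version-guard idea together, here is a self-contained sketch of such a guarded hook `__init__`. The `LoggingMixinSketch` class and the hard-coded version string are stand-ins for illustration, not the real Airflow classes:

```python
from packaging import version

AIRFLOW_VERSION = "2.7.2"  # stand-in; real code would read airflow.__version__


class LoggingMixinSketch:
    """Stand-in base class; pre-2.8 Airflow's LoggingMixin does not
    accept logger_name at all (hence the TypeError in the traceback)."""

    def __init__(self, logger_name=None):
        self.logger_name = logger_name


class GuardedHook(LoggingMixinSketch):
    def __init__(self, **kwargs):
        # Forward logger_name only on Airflow >= 2.8.0; on older versions
        # swallow it, so operators can pass it unconditionally.
        if version.parse(AIRFLOW_VERSION) < version.parse("2.8.0"):
            kwargs.pop("logger_name", None)
            super().__init__()
        else:
            super().__init__(logger_name=kwargs.pop("logger_name", None))


hook = GuardedHook(logger_name="mylogger")
print(hook.logger_name)  # -> None on the simulated 2.7.2
```

On a simulated 2.8.0+ the same call would set `logger_name` to `"mylogger"`, which is what makes a single operator work on both sides of the version boundary.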
Then we need to add this fix to all providers affected by #36675 (all?) or revert it for now, depending on what is faster. In any case we need to prepare rc2 for all providers, I guess.
@eladkal asked me to take over from here as he is flying tonight. I thought a bit about it and agree with @Taragolis that reverting #36675 is the best way to approach this - including re-releasing RC2 with accelerated voting (and implementing better support for `logger_name`) for the next wave.
@potiuk would re-releasing RC2 include the cncf.kubernetes provider? @eladkal said in his last comment on my PR #37001 that he will not release rc2 but will include it in the next wave. With this recent decision to re-release rc2, would it be possible to reconsider and include #37001 in this re-release? I have been trying to get it approved by the listed reviewers but no luck so far. Please advise.
Yep. Included it.

Thanks a lot!

Hey everyone - the issue is updated with new RC candidates and I am calling for another vote in a moment. Let's continue testing here!

BTW, I kept the status from the previous tests.

#36813 tested and working ✅

Thank you everyone. Providers are released! I invite everyone to help improve providers for the next release; a list of open issues can be found here.
Body
I have a kind request for all the contributors to the latest provider packages release.
Could you please help us to test the RC versions of the providers?
The guidelines on how to test providers can be found in
Verify providers by contributors
Let us know in a comment whether the issue is addressed.
These are the providers that require testing, as some substantial changes were introduced:
Provider airbyte: 3.6.0rc2
Provider alibaba: 2.7.2rc2
- `__init__` in `analyticdb_spark.py` (#36911): @romsharon98
Provider amazon: 8.17.0rc2
- `is_authorized_dag` in AWS auth manager (#36619): @vincbeck
- [aws] `cloudwatch_task_handler_json_serializer` not set (#36851): @Taragolis
- `AwsEcsExecutor` (#36179): @syedahsn
Provider apache.beam: 5.6.0rc2
Provider apache.druid: 3.8.0rc2
Provider apache.hive: 6.4.2rc1
- `__init__` in `hive-stats` (#36905): @romsharon98
Provider apache.spark: 4.7.1rc1
Provider atlassian.jira: 2.5.1rc1
- `atlassian-python-api` limitation (#36841): @Taragolis
- `atlassian-python-api` to <3.41.6 (#36815): @Taragolis
Provider celery: 3.5.2rc2
Provider cncf.kubernetes: 7.14.0rc2
- `KubernetesPodOperator` (#37001): @vizeit
Provider cohere: 1.1.2rc1
Provider common.sql: 1.10.1rc1
Provider databricks: 6.1.0rc2
Provider dbt.cloud: 3.6.0rc2
Provider discord: 3.6.0rc2
Provider elasticsearch: 5.3.2rc1
Provider exasol: 4.4.2rc2
Provider google: 10.14.0rc2
- `BigQueryToSqlBaseOperator` from `BigQueryToPostgresOperator` (#36663): @romsharon98
- `__init__` in `auto_ml.py` (#36934): @romsharon98
- `__init__` in `SFTPToGCSOperator` (#36603): @romsharon98
- `__init__` in dataproc (#36489): @romsharon98
- `parquet_row_group_size` in `BaseSQLToGCSOperator` (#36817): @renzepost
- `__init__` in `BigQueryToPostgresOperator` operator (#36491): @romsharon98
Provider hashicorp: 3.6.2rc1
- `raise_on_deleted_version=True` to `read_secret_version` in Hashicorp operator (#36532): @romsharon98
Provider http: 4.9.0rc2
Provider mongo: 3.6.0rc2
- `mongo_conn_id` argument in the MongoHook constructor (#36896): @Taragolis
Provider mysql: 5.5.2rc2
Provider odbc: 4.4.1rc2
Provider openlineage: 1.5.0rc2
Provider pagerduty: 3.6.1rc1
Provider papermill: 3.6.1rc2
- `__init__` in `papermill.py` (#36530): @romsharon98
Provider pinecone: 1.1.2rc1
- `pinecone-client` to <3.0 (#36818): @Taragolis
Provider presto: 5.4.1rc2
Provider salesforce: 5.6.2rc1
Provider slack: 8.6.0rc2
- `client.files_upload_v2` in Slack Provider (#36757): @Taragolis
Provider snowflake: 5.3.0rc2
Provider tableau: 4.4.1rc1
Provider telegram: 4.3.1rc1
Provider trino: 5.6.1rc2
Provider weaviate: 1.3.1rc2
- `__init__` in weaviate (#36908): @romsharon98
Provider yandex: 3.8.0rc2
All users involved in the PRs:
@romsharon98 @kacpermuda @dabla @renzepost @vizeit @varuntwr @AchimGaedkeLynker @hamedhsn @Taragolis @Lee-W @ferruzzi @sasidharan-rathinam @hussein-awala @chrishronek @syedahsn @vatsrahul1001 @arjunanan