Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Status of testing Providers that were prepared on January 26, 2024 #36948

Closed
eladkal opened this issue Jan 22, 2024 · 35 comments
Closed

Status of testing Providers that were prepared on January 26, 2024 #36948

eladkal opened this issue Jan 22, 2024 · 35 comments
Labels
kind:meta High-level information important to the community testing status Status of testing releases

Comments

@eladkal
Copy link
Contributor

eladkal commented Jan 22, 2024

Body

I have a kind request for all the contributors to the latest provider packages release.
Could you please help us to test the RC versions of the providers?

The guidelines on how to test providers can be found in

Verify providers by contributors

Let us know in the comment, whether the issue is addressed.

Those are providers that require testing as there were some substantial changes introduced:

Provider airbyte: 3.6.0rc2

Provider alibaba: 2.7.2rc2

Provider amazon: 8.17.0rc2

Provider apache.beam: 5.6.0rc2

Provider apache.druid: 3.8.0rc2

Provider apache.hive: 6.4.2rc1

Provider apache.spark: 4.7.1rc1

Provider atlassian.jira: 2.5.1rc1

Provider celery: 3.5.2rc2

Provider cncf.kubernetes: 7.14.0rc2

Provider cohere: 1.1.2rc1

Provider common.sql: 1.10.1rc1

Provider databricks: 6.1.0rc2

Provider dbt.cloud: 3.6.0rc2

Provider discord: 3.6.0rc2

Provider elasticsearch: 5.3.2rc1

Provider exasol: 4.4.2rc2

Provider google: 10.14.0rc2

Provider hashicorp: 3.6.2rc1

Provider http: 4.9.0rc2

Provider mongo: 3.6.0rc2

Provider mysql: 5.5.2rc2

Provider odbc: 4.4.1rc2

Provider openlineage: 1.5.0rc2

Provider pagerduty: 3.6.1rc1

Provider papermill: 3.6.1rc2

Provider pinecone: 1.1.2rc1

Provider presto: 5.4.1rc2

Provider salesforce: 5.6.2rc1

Provider slack: 8.6.0rc2

Provider snowflake: 5.3.0rc2

Provider tableau: 4.4.1rc1

Provider telegram: 4.3.1rc1

Provider trino: 5.6.1rc2

Provider weaviate: 1.3.1rc2

Provider yandex: 3.8.0rc2

All users involved in the PRs:
@romsharon98 @kacpermuda @dabla @renzepost @vizeit @varuntwr @AchimGaedkeLynker @hamedhsn @Taragolis @Lee-W @ferruzzi @sasidharan-rathinam @hussein-awala @chrishronek @syedahsn @vatsrahul1001 @arjunanan

Committer

  • I acknowledge that I am a maintainer/committer of the Apache Airflow project.
@eladkal eladkal added kind:meta High-level information important to the community testing status Status of testing releases labels Jan 22, 2024
@m1racoli
Copy link
Contributor

m1racoli commented Jan 22, 2024

Tested #36861, connection ID and impersonation chain are now properly passed. ✅

airflow-oss-bot added a commit to astronomer/astronomer-providers that referenced this issue Jan 23, 2024
@arjunanan6
Copy link
Contributor

Test #36752 and it works as expected.

@Lee-W
Copy link
Member

Lee-W commented Jan 23, 2024

Verified #36892, #36946 and #36894. will continue work on the rest

@potiuk
Copy link
Member

potiuk commented Jan 23, 2024

All my changes work ! 🎉

@romsharon98
Copy link
Contributor

Validate changes in:
#36911
#36905
#36663
#36934
#36603
#36489
#36533
#36491
#36532
#36530
#36908

@VladaZakharova
Copy link
Contributor

Hi!
#36473 works as well, thank you :)

@moiseenkov
Copy link
Contributor

#36276 works as expected

@AchimGaedkeLynker
Copy link
Contributor

#36828 works as expected

@Lee-W
Copy link
Member

Lee-W commented Jan 24, 2024

Tested #36586, #36578, #36550, #36680, #36658, #36749

@renzepost
Copy link
Contributor

#36817 works as expected

@hussein-awala
Copy link
Member

Tested my changes, they look good.

@vizeit
Copy link
Contributor

vizeit commented Jan 24, 2024

#36922 tested. It is working for configmaps mounted as volume but not for configmaps mounted as environment variable. I was troubleshooting the issue and it appears that I will need to make ‘env_from’ templated not ‘configmaps’ attribute. Any inputs from k8 operator experts? Also please let me know how do I proceed with the fix, a new issue or just open new PR with the fix?

@vizeit
Copy link
Contributor

vizeit commented Jan 24, 2024

#36922 tested. It is working for configmaps mounted as volume but not for configmaps mounted as environment variable. I was troubleshooting the issue and it appears that I will need to make ‘env_from’ templated not ‘configmaps’ attribute. Any inputs from k8 operator experts? Also please let me know how do I proceed with the fix, a new issue or just open new PR with the fix?

Opened #37001 with the fix and unit tested

@potiuk
Copy link
Member

potiuk commented Jan 24, 2024

Opened #37001 with the fix and unit tested

Cool. Yep. that's ok. In Airlfow PR is the unit of work - not issue, issue is mostly advisory and can be missing or there can be several PRs referring to the same issue, some of the PRs being incomplete. I understand that this is is a new feature, so it is not a regression?

@flolas
Copy link
Contributor

flolas commented Jan 24, 2024

Tested #36171.
Work as expected.

Env: Amazon MWAA v2.7.2 Instance.

Steps:
(1) Started a fresh MWAA instance

(2) Added to requirements amazon: 8.17.0rc1, also bump constrains:

apache-airflow-providers-amazon==8.17.0rc1
boto3==1.33.0
botocore==1.33.0
s3transfer==0.8.0
redshift_connector==2.0.918

(3) Created connection athena_default with extras

{
  "work_group": "primary",
  "region_name": "us-east-1"
}

(4) Created connection athena_assumed_role with extras

{
  "work_group": "primary",
  "region_name": "us-east-1",
  "role_arn": "arn:aws:iam::xxxxxxxxx:role/athena-access"
}

(5) Test DAG

create_sql_table1 = SQLExecuteQueryOperator(
            task_id="create_sql_table1",
            conn_id='athena_default',
            sql='SELECT 1;SELECT 2;SELECT 3;SELECT 4',
            split_statements=True,
            dag=dag,
        )
create_sql_table2 = SQLExecuteQueryOperator(
            task_id="create_sql_table2",
            conn_id='athena_assumed_role',
            sql='SELECT 1;SELECT 2;SELECT 3;SELECT 4',
            split_statements=True,
            dag=dag,
        )
create_sql_table1 >> create_sql_table2

@vizeit
Copy link
Contributor

vizeit commented Jan 24, 2024

Opened #37001 with the fix and unit tested

Cool. Yep. that's ok. In Airlfow PR is the unit of work - not issue, issue is mostly advisory and can be missing or there can be several PRs referring to the same issue, some of the PRs being incomplete. I understand that this is is a new feature, so it is not a regression?

Yes, this is a new feature

@eladkal
Copy link
Contributor Author

eladkal commented Jan 25, 2024

Yes, this is a new feature

So I will not issue RC2 as this release has many new features that we shouldnt hold. The fix will be released in the next wave

pankajastro pushed a commit to astronomer/astro-sdk that referenced this issue Jan 25, 2024
@Taragolis
Copy link
Contributor

Error which raised on Airflow 2.6.3

[2024-01-25, 08:29:34 UTC] Task failed with exception
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/providers/slack/operators/slack.py", line 258, in execute
    self._method_resolver(
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/providers/slack/operators/slack.py", line 254, in _method_resolver
    return self.hook.send_file
  File "/usr/local/lib/python3.10/functools.py", line 981, in __get__
    val = self.func(instance)
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/providers/slack/operators/slack.py", line 82, in hook
    return SlackHook(
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/providers/slack/hooks/slack.py", line 117, in __init__
    super().__init__(logger_name=extra_client_args.pop("logger_name", None))
TypeError: LoggingMixin.__init__() got an unexpected keyword argument 'logger_name'

I guess this affects only providers which explicitly propagate, something like super().__init__(logger_name=kwargs.pop("logger_name", None)):

cc: @eladkal @hussein-awala

@Taragolis
Copy link
Contributor

Do simple test for migration AWS operators/sensors/triggerers to base classes. Works fine, however for most the classes I've check only the instantiating objects

@potiuk
Copy link
Member

potiuk commented Jan 25, 2024

I guess this affects only providers which explicitly propagate, something like super().init(logger_name=kwargs.pop("logger_name", None)):

mongo: 3.6.0rc1
segment: 3.5.0rc1
slack: 8.6.0rc1
common.sql: 1.11.0rc1

Yes. Looks like the #34964 that introduced the logger_name had a hidden back-compatibility issue and we simply can't pass loggger_name for airflow < 2.8.0 . I guess we should remove these providers and only pass logger_name from our providers conditionally (and add pre-commit for that).

Smth like:

if packaging.version.parse(packaging.version.parse(airflow_version).base_version) < packaging.version.parse(
    "2.8.0"
):
    super().__init__()
else:
     super().__init__(logger_name=kwargs.pop("logger_name", None))

@Taragolis
Copy link
Contributor

Or it might be something like that for avoid parsing version

extra_kwargs = {}
if logger_name := kwargs.pop("logger_name", None):
    extra_kwargs["logger_name"] = logger_name
super().__init__(**extra_kwargs)

@potiuk
Copy link
Member

potiuk commented Jan 25, 2024

The first approach - the difference in the first case is that you can use same DAG / Operator for different Airlfow versions. The second one depends on how you use it - so for example you would not be able to have operator with logger_name that will work both pre- and post- 2.8.0 (unless you do version check in the operator)

@Taragolis
Copy link
Contributor

My proposed fix it is just for avoid provide logger_name implicitly even if it not defined. If we would like to check version than we need to make fixes in all providers.

In general usage users can't use/define attributes which not available into the BaseOperator / BaseHook for particular version.
For example users can't define schedule="@once" and expect that it should work into the Airflow 2.3. The same here if user define logger_name in Airflow < 2.8 then well it just won't work because feature only available in Airflow 2.8.

@potiuk
Copy link
Member

potiuk commented Jan 25, 2024

My proposed fix it is just for avoid provide logger_name implicitly even if it not defined. If we would like to check version than we need to make fixes in all providers.

I understand - but the thing is that our Hooks are mostly used internally by Operators and both Hooks and opertors are relased in providers, so there is a risk that someone will add an Operator that will initialize Hook with logger_name (which might be then exposed by or even hard-coded in the operator) . By implementing a protection against it in Hook, we allow such change to happen without waiting for min-airflow >= 2.8.0.

In other words - our code in Airflow that uses hooks - will not be able to use the logger_name in any way if we do it the 2nd way. We will not be able to release an Operator which will adds the option of specifying it - you will only be able to specify the `logger_name" directly in the DAG code were Hooks are used directly or when users would like to develop custom operators and they are 100% sure the operator will be used in airflow 2.8.0+. Maybe that's what we want to do, but I think it is limiting. So for example if someone develops custom provider, that might use (say) common.sql DBApiHook. They will have to add "apache-airflow>=2.8.0" if their Operator will pass "logger_name" to the hook.

The first proposal will work in all cases. It will only propagate logger_name to Base Hook in Airflow 2.8.0 - and will set no limitations on the users of the Hook to know they are run in Airflow 2.8.0 environment. You might just write custom operator (or we can implement it in our operators) to specify logger_name and it will work, regardless of what "airflow version" the operator is installed at. So in the above example - if somoene would like to create their own provider and use "DBApiHook" they will be able to create an operator that will use it it and pass "logger_name" = "mylogger" to it (and they will not have to add apache-airlfow>= 2.8.0 in their provider.

@Taragolis
Copy link
Contributor

Then we need to add this fix into the all providers affected by #36675 (all?) or revert it for now. Depend on what is faster. In any cases we need to prepare rc2 for all providers I guess

@potiuk
Copy link
Member

potiuk commented Jan 25, 2024

@eladkal asked me to take over from here as he is flying tonight. I thought a bit about it and agree with @Taragolis that reverting #36675 is the best way to approach this - including re-releasing RC2 with accelerated voting (and implementing better support for logger_name) for the next wave.

@vizeit
Copy link
Contributor

vizeit commented Jan 26, 2024

@potiuk would re-releasing RC2 include cncf.Kubernetes provider? @eladkal said in his last comment for my PR #37001 that he will not release rc2 but include it in the next wave. With this recent change of rc2 re-releasing, would that be possible to reconsider and include #37001 in this re-releasing or not possible? I have been trying to get it approved by the listed reviewers but no luck so far. Please advise

@potiuk
Copy link
Member

potiuk commented Jan 26, 2024

@potiuk would re-releasing RC2 include cncf.Kubernetes provider? @eladkal said in his last comment for my PR #37001 that he will not release rc2 but include it in the next wave. With this recent change of rc2 re-releasing, would that be possible to reconsider and include #37001 in this re-releasing or not possible? I have been trying to get it approved by the listed reviewers but no luck so far. Please advise

Yep. Included it.

@vizeit
Copy link
Contributor

vizeit commented Jan 26, 2024

@potiuk would re-releasing RC2 include cncf.Kubernetes provider? @eladkal said in his last comment for my PR #37001 that he will not release rc2 but include it in the next wave. With this recent change of rc2 re-releasing, would that be possible to reconsider and include #37001 in this re-releasing or not possible? I have been trying to get it approved by the listed reviewers but no luck so far. Please advise

Yep. Included it.

Thanks a lot!

@potiuk potiuk changed the title Status of testing Providers that were prepared on January 22, 2024 Status of testing Providers that were prepared on January 26, 2024 Jan 26, 2024
@potiuk
Copy link
Member

potiuk commented Jan 26, 2024

Hey Everyone - the issue is updated with new RC candidates and I am calling for another vote in a moment. Let's continue testing here!

@potiuk
Copy link
Member

potiuk commented Jan 26, 2024

I kept the status from previous tests BTW

@listik
Copy link
Contributor

listik commented Jan 26, 2024

#36813 tested and working ✅

@vizeit
Copy link
Contributor

vizeit commented Jan 26, 2024

Tested #36922 and #37001, working as expected

cc: @potiuk

@potiuk
Copy link
Member

potiuk commented Jan 27, 2024

Thank you everyone. Providers are released!

I invite everyone to help improve providers for the next release, a list of open issues can be found here.

@potiuk potiuk closed this as completed Jan 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:meta High-level information important to the community testing status Status of testing releases
Projects
None yet
Development

No branches or pull requests