Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove redshift async test job #30127

Merged
merged 1 commit into from
Mar 16, 2023
Merged

Conversation

pankajastro
Copy link
Member

@pankajastro pankajastro commented Mar 15, 2023

Address #28850 (comment)


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@@ -399,6 +399,8 @@ def write_version(filename: str = str(AIRFLOW_SOURCES_ROOT / "airflow" / "git_ve
"wheel",
"yamllint",
"aioresponses",
# TODO: why??
"aiobotocore>=2.1.1",
Copy link
Contributor

@eladkal eladkal Mar 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to sync with AWS before doing further work.
We are not yet sure if we should add this lib
#30032 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

@pankajastro pankajastro Mar 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @eladkal, I agree with your concern. I was testing the workaround suggested by @potiuk here #28850 (comment) to run the test

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that this is already merged, but can we please wait until 3/22 before making more deferrable contributions? Recent changes from @potiuk in #30144 do alleviate many concerns, but there is always a chance that users won't be able to access new Boto functionality. We are in the process of sharing the raised concerns internally at AWS. However, I do understand that there is no better alternative than using aiobotocore and tradeoff is not bad after the CI job separation. cc: @syedahsn @o-nikolas

Copy link
Member

@potiuk potiuk Mar 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more thing here. Technically with the job implemented as it is right now wiht my #30144 we have another possibility. We could reverse the approach.

Currently we remove aiobotocore in the separate job and upgrade to latest boto and run all the relevant provider tests.

But we could do it the opposite: we could use latest boto/botocore in "main", remove aibotocore from the "devel" set of dependencies and install aiobotocore (together with downgrading boto/botocore) in the separate job.

This has slightly different characteristics:

  • by default the constraints we publish will not have aiobotocore and willhave botocore/boto not compatible with the latest aiobotocore - this means that if someone would like to use deferrable operators from AWS would have to deliberately install aiobotocore and downgrade boto/botocore with it.
  • the separate job would test automatically if the "aiobotocore" compatible boto still works with all the unit-tested provider operators - this way any PR that would be based on newer features would fail at PR time.
  • then we could make a deliberate decision that this is ok (and add conditional skips in the unit tests) if that happens

So basically this is this trade-off:

  1. (current) deferrable operators work out-of-the-box with the "official constraints" (but without latest boto/botocore)
  2. (possible) - deferrable operators will not work with official constraints and they require deliberately installing aiobotocore (and downgrading boto/botocore) - but latest botocore is used in those constraints

Those are the two trade-offs, and we can still change the decision (easily) which way go. With #30144 this is as easy as changing few lines in the CI scripts.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think shipping aiobotocore by default is the better option. It will be a smoother user experience if users don't have to change their workflows to use deferrables if their workflows use features not covered by aiobotocore. It's not an ideal situation in either case, but this option is less likely to cause issues for users wanting to use deferrable operators.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am with @o-nikolas on this one -- a case where everything works out of the box is better than if a user has to make choices depending.

Also, we want to increase and drive the adoption of Deferrable operators, and decreasing road-blocks for user's adopting it would be my preference

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with the thoughts expressed by @o-nikolas @syedahsn and @kaxil. Priority should be given to providing a better out-of-the-box user experience. If we don't make aiobotocore the default option, it will increase the workload for users, and as Niko mentioned, it may require refactoring of DAGs once the user enables async. On the other hand, we don't have any evidence yet that restricting the botocore version will cause significant customer problems.

Thank you @potiuk for spending time on this and providing us with options. We're certainly converging to a better approach than what we started with.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I did not want to bias it, but that was also my personal preference (but I wanted to objectively show you the options). It seems much easier to explain as well, and I have a feeling that deferring new features by few months is not a huge issue as the adoption is usually anyhow delayed and deferrable operators not being available by default would be a big bummer.

With the #30161 now merged, the aiobotocore will also be included in the PROD image by default, so we are on track to increase the adoption of deferrable operators :).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome, look like we have solution now. Thank you so much @potiuk for stepping-up bringing it in better shape it was long since pending 🚀

@pankajastro pankajastro changed the title [WIP] Fix redshift Remove redshift async test job Mar 16, 2023
Copy link
Member

@potiuk potiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I treat @o-nikolas approval on #29923 as general OK from the AWS team to follow this approach.

@potiuk potiuk merged commit b54285d into apache:main Mar 16, 2023
potiuk added a commit to potiuk/airflow that referenced this pull request Mar 17, 2023
This is follow-up after apache#30127 and apache#30144 about handling async
(deferrable) operators for amazon provider.

Rather than making aiobotocre directly a devel extra dependency,
we create a separate "aiobotocore" extra that allows for greater
flexibility on how we handle the aiobotocore support. It allows
for two approach:

1) (current) if we decide that by default we keep boto/botocore
   compatible with aiobotocore in our constraints/image then
   aiobotocore should be added to devel and it should be included
   as preselected extra in Dockerfile. This will lead to
   having aibotocore and compatible boto/botocore in both constraints
   and the PROD image.

2) (possible) if we decide that we prefer to keep to the latest
   version of boto/botocore in constraints/image, then we could
   remove aiobotocore from both constraints and PROD image. We should
   also in this case swap the "LatestBoto" CI job introduced in
   apache#30144 to be "WithAiobotocore" job - by installing aiobotocore
   and downgrading boto/botocore in the job.
potiuk added a commit that referenced this pull request Mar 17, 2023
This is follow-up after #30127 and #30144 about handling async
(deferrable) operators for amazon provider.

Rather than making aiobotocre directly a devel extra dependency,
we create a separate "aiobotocore" extra that allows for greater
flexibility on how we handle the aiobotocore support. It allows
for two approach:

1) (current) if we decide that by default we keep boto/botocore
   compatible with aiobotocore in our constraints/image then
   aiobotocore should be added to devel and it should be included
   as preselected extra in Dockerfile. This will lead to
   having aibotocore and compatible boto/botocore in both constraints
   and the PROD image.

2) (possible) if we decide that we prefer to keep to the latest
   version of boto/botocore in constraints/image, then we could
   remove aiobotocore from both constraints and PROD image. We should
   also in this case swap the "LatestBoto" CI job introduced in
   #30144 to be "WithAiobotocore" job - by installing aiobotocore
   and downgrading boto/botocore in the job.
@pankajastro pankajastro deleted the fix_async_redshift_ci branch March 17, 2023 19:03
@ephraimbuddy ephraimbuddy added the changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) label Apr 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:dev-tools area:providers changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) provider:amazon-aws AWS/Amazon - related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants