Remove redshift async test job #30127

Merged
merged 1 commit into from
Mar 16, 2023
25 changes: 0 additions & 25 deletions .github/workflows/ci.yml
@@ -799,31 +799,6 @@ jobs:
         run: breeze ci fix-ownership
         if: always()

-  tests-aws-async-provider:
-    timeout-minutes: 50
-    name: "Pytest for AWS Async Provider"
-    runs-on: "${{needs.build-info.outputs.runs-on}}"
-    needs: [build-info, wait-for-ci-images]
-    if: needs.build-info.outputs.run-tests == 'true'
-    steps:
-      - name: Cleanup repo
-        shell: bash
-        run: docker run -v "${GITHUB_WORKSPACE}:/workspace" -u 0:0 bash -c "rm -rf /workspace/*"
-      - name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
-        uses: actions/checkout@v3
-        with:
-          persist-credentials: false
-      - name: "Prepare breeze & CI image"
-        uses: ./.github/actions/prepare_breeze_and_image
-      - name: "Run AWS Async Test"
-        run: "breeze shell \
-          'pip install aiobotocore>=2.1.1 && pytest /opt/airflow/tests/providers/amazon/aws/deferrable'"
-      - name: "Post Tests"
-        uses: ./.github/actions/post_tests
-      - name: "Fix ownership"
-        run: breeze ci fix-ownership
-        if: always()
-
   tests-helm:
     timeout-minutes: 80
     name: "Python unit tests for Helm chart"
3 changes: 3 additions & 0 deletions airflow/providers/amazon/provider.yaml
@@ -600,6 +600,9 @@ additional-extras:
   - name: pandas
     dependencies:
       - pandas>=0.17.1
+  # There is conflict between boto3 and aiobotocore dependency botocore.
+  # TODO: We can remove it once boto3 and aiobotocore both have compatible botocore version or
+  # boto3 have native aync support and we move away from aio aiobotocore
   - name: aiobotocore
     dependencies:
       - aiobotocore>=2.1.1
5 changes: 5 additions & 0 deletions setup.py
@@ -399,6 +399,11 @@ def write_version(filename: str = str(AIRFLOW_SOURCES_ROOT / "airflow" / "git_ve
     "wheel",
     "yamllint",
     "aioresponses",
+    # This required for AWS deferrable operators.
+    # There is conflict between boto3 and aiobotocore dependency botocore.
+    # TODO: We can remove it once boto3 and aiobotocore both have compatible botocore version or
+    # boto3 have native aync support and we move away from aio aiobotocore
+    "aiobotocore>=2.1.1",
@eladkal (Contributor), Mar 15, 2023:

I suggest syncing with AWS before doing further work. We are not yet sure whether we should add this lib; see #30032 (comment).

@pankajastro (Member, Author), Mar 15, 2023:

Hi @eladkal, I agree with your concern. I was testing the workaround suggested by @potiuk in #28850 (comment) to run the tests.

Commenter (name not shown):

I see that this is already merged, but can we please wait until 3/22 before making more deferrable contributions? Recent changes from @potiuk in #30144 do alleviate many concerns, but there is always a chance that users won't be able to access new Boto functionality. We are in the process of sharing the raised concerns internally at AWS. However, I do understand that there is no better alternative than using aiobotocore, and the tradeoff is not bad after the CI job separation. cc: @syedahsn @o-nikolas

@potiuk (Member), Mar 17, 2023:

One more thing here. Technically, with the job implemented as it is right now with my #30144, we have another possibility: we could reverse the approach.

Currently, in the separate job, we remove aiobotocore, upgrade to the latest boto, and run all the relevant provider tests.

But we could do the opposite: use the latest boto/botocore in "main", remove aiobotocore from the "devel" set of dependencies, and install aiobotocore (together with downgrading boto/botocore) in the separate job.

This has slightly different characteristics:

  • By default, the constraints we publish will not have aiobotocore and will have a botocore/boto that is not compatible with the latest aiobotocore. This means that anyone who wants to use deferrable operators from AWS would have to deliberately install aiobotocore and downgrade boto/botocore with it.
  • The separate job would automatically test whether the "aiobotocore"-compatible boto still works with all the unit-tested provider operators. This way, any PR that relies on newer features would fail at PR time.
  • If that happens, we could then make a deliberate decision that this is OK (and add conditional skips in the unit tests).
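
The conditional skips mentioned in the last bullet could look roughly like the sketch below. It is only an illustration: `parse_version`, `botocore_at_least`, and `FEATURE_MIN_BOTOCORE` are hypothetical names, the version number is a placeholder, and the sketch uses the standard library's `unittest.skipUnless` where the Airflow test suite would more likely use `pytest.mark.skipif`.

```python
import unittest
from importlib.metadata import PackageNotFoundError, version
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.29.76' into (1, 29, 76), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def botocore_at_least(minimum: str) -> bool:
    """Return True only if botocore is installed and at least `minimum`."""
    try:
        installed = version("botocore")
    except PackageNotFoundError:
        return False
    return parse_version(installed) >= parse_version(minimum)


# Placeholder (assumption): the botocore release that introduced the feature under test.
FEATURE_MIN_BOTOCORE = "9999.0.0"


class TestOperatorUsingNewerBotoFeature(unittest.TestCase):
    @unittest.skipUnless(
        botocore_at_least(FEATURE_MIN_BOTOCORE),
        f"needs botocore>={FEATURE_MIN_BOTOCORE}, newer than the aiobotocore-compatible pin",
    )
    def test_new_feature(self):
        # A real test would exercise the operator here.
        self.assertTrue(botocore_at_least(FEATURE_MIN_BOTOCORE))
```

With a skip like this, the job that runs against the aiobotocore-compatible boto reports the test as skipped instead of failing, which makes the exception visible and deliberate.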

So basically the trade-off is this:

  1. (current) Deferrable operators work out-of-the-box with the "official constraints" (but without the latest boto/botocore).
  2. (possible) Deferrable operators will not work with the official constraints and require deliberately installing aiobotocore (and downgrading boto/botocore), but the latest botocore is used in those constraints.

Those are the two options, and we can still (easily) change the decision on which way to go. With #30144 this is as easy as changing a few lines in the CI scripts.
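
Under option 2, the deferrable code paths would probably need to fail with a clear message when aiobotocore is absent. A minimal sketch of such a guard follows, assuming hypothetical names: `require_optional` and `MissingOptionalDependency` are not Airflow APIs, just an illustration of the idea.

```python
import importlib.util


class MissingOptionalDependency(RuntimeError):
    """Raised when an optional package needed by a feature is not installed."""


def require_optional(module_name: str, feature: str) -> None:
    """Raise a descriptive error if `module_name` cannot be imported."""
    if importlib.util.find_spec(module_name) is None:
        raise MissingOptionalDependency(
            f"{feature} needs the optional package '{module_name}'; "
            "install it explicitly (this may downgrade boto3/botocore)."
        )
```

A deferrable hook could call `require_optional("aiobotocore", "AWS deferrable operators")` before importing aiobotocore, so users on the official constraints get an actionable error instead of a bare `ImportError`.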

Contributor (name not shown):

I think shipping aiobotocore by default is the better option. It will be a smoother user experience if users don't have to change their workflows to use deferrables if their workflows use features not covered by aiobotocore. It's not an ideal situation in either case, but this option is less likely to cause issues for users wanting to use deferrable operators.

Member (name not shown):

I am with @o-nikolas on this one: a case where everything works out of the box is better than one where the user has to make choices.

Also, we want to increase and drive the adoption of deferrable operators, and my preference would be to decrease roadblocks for users adopting them.

Commenter (name not shown):

Agree with the thoughts expressed by @o-nikolas @syedahsn and @kaxil. Priority should be given to providing a better out-of-the-box user experience. If we don't make aiobotocore the default option, it will increase the workload for users, and as Niko mentioned, it may require refactoring of DAGs once the user enables async. On the other hand, we don't have any evidence yet that restricting the botocore version will cause significant customer problems.

Thank you @potiuk for spending time on this and providing us with options. We're certainly converging to a better approach than what we started with.

Member (name not shown):

Yeah. I did not want to bias it, but that was also my personal preference (I wanted to show you the options objectively). It also seems much easier to explain, and I have a feeling that deferring new features by a few months is not a huge issue, as adoption is usually delayed anyhow, while deferrable operators not being available by default would be a big bummer.

With #30161 now merged, aiobotocore will also be included in the PROD image by default, so we are on track to increase the adoption of deferrable operators :).

Member, Author (name not shown):

This is awesome, looks like we have a solution now. Thank you so much @potiuk for stepping up and bringing it into better shape; it was long pending 🚀

]


2 changes: 0 additions & 2 deletions tests/providers/amazon/aws/deferrable/hooks/test_base_aws.py
@@ -29,8 +29,6 @@
 except ImportError:
     pass

-pytest.importorskip("aiobotocore")
-

 class TestAwsBaseAsyncHook:
     @staticmethod
@@ -24,8 +24,6 @@
 from airflow.providers.amazon.aws.hooks.redshift_cluster import RedshiftAsyncHook
 from tests.providers.amazon.aws.utils.compat import async_mock

-pytest.importorskip("aiobotocore")
-

 class TestRedshiftAsyncHook:
     @pytest.mark.asyncio
@@ -16,8 +16,6 @@
 # under the License.
 from __future__ import annotations

-import importlib.util
-
 import pytest

 from airflow.providers.amazon.aws.triggers.redshift_cluster import (
@@ -30,7 +28,6 @@
 POLLING_PERIOD_SECONDS = 1.0


-@pytest.mark.skipif(not bool(importlib.util.find_spec("aiobotocore")), reason="aiobotocore require")
 class TestRedshiftClusterTrigger:
     def test_pause_serialization(self):
         """