-
Notifications
You must be signed in to change notification settings - Fork 14.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove tasks annotation in AppflowHook
#33881
Conversation
AppflowHook
from mypy_boto3_appflow.type_defs import TaskOutputTypeDef, TaskTypeDef | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just FYI for someone who want to keep type-checking for boto3/botocore expected values.
For correct work we should have exactly same version of mypy-boto3-{service-name}
and boto3
root@5e78bbc9c51e:/opt/airflow# pip list | grep boto
aiobotocore 2.6.0
boto 2.49.0
boto3 1.28.17
botocore 1.31.17
mypy-boto3-appflow 1.28.16
mypy-boto3-rds 1.28.34
mypy-boto3-redshift-data 1.28.16
mypy-boto3-s3 1.28.27
types-boto 2.49.18.9
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fun fact, deprecated boto
and types-boto
we got from apache-airflow-providers-qubole and unmaintained qds-sdk
🤣
https://github.com/qubole/qds-sdk-py#where-are-the-maintainers-
Where are the maintainers ?
Qubole was acquired. All the maintainers of this repo have moved on. Some of the employees founded ClearFeed. Others are at big data teams in Microsoft, Amazon et al.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We actually (sort of) do. The packages get upgraded automatically as the boto3 - which means that --eager-upgrade
of ours will update them automatically when they are released. We certainly do not (and cannot) pin and update those packages manually (unless someone volunteers for updating it every few days). And to add the complexity we are also limited by aiobotocore releases - which automatically pin boto3 to specific versions.
And it ain't so easy as having the same version. Currently (due to aiobotocore) our "base CI image" has
- boto3==1.28.17 => we have not upgraded to latest 1.28.36 because aiobotocore holds us at 1.28.17 - we will upgrade automatically once aiobotocore releases newer version. Latest aiobotocore has:
Requires-Dist: boto3 <1.28.18,>=1.28.17 ; extra == 'boto3'
-
mypy-boto3-appflow==1.28.16 -> simply because there is no 1.28.17 and 1.28.36 (which is next version) apparently has been released today
-
the mypy-boto3-rds 1.28.36 has been released today as well and we will update to 1.28.36 likely (unless it has some requirements preventing from doing it).
same redshift-data
There is no consistency (for example some all the mypy libs had version betwen and some did not). And "eventually consistent" approach is I think the best we can do.
Of course we could try to somehow manyally manage it and put manual limits and bump them when we see we need to. But with 650 dependencies we have, that's far too much work - even if we would have a dependabot attempting to do it for us we often have 5-10 packages updated a day - so this automation we have to attempt to upgrade everything, let pypi figure out the latest "eager upgraded" set of dependencies that are not conflicting and passing the tests (only after they pass the tests they make it into latest constraints") is probably the best we can do.
At the expense of course that sometimes our main and PRs that change dependencies (which are few and far between) will get broken by unrelated dependency changes.
I have not figured a better way of doing it, but if somone has any ideas, I am all ears :).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point just keep annotations only for boto3.client
in this case it is pretty safe to survive after changes, AFAIK mypy-boto3-{service-name}
build in top of botocore
JSON data definition, changes in methods are rare, rather than in methods types.
I have an idea in the past to make all AWS Hooks based on Generic, maybe it is even better than I decide do not implemented it. Also mypy-boto3-{service-name}
pretty heavy, I don't remember exact numbers but I guess full size would be 100-200 MB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you think it's just safe to remove mypy
ones? can we do it without loosing anyting ? Can You do it :D ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Requires-Dist: boto3 <1.28.18,>=1.28.17 ; extra == 'boto3'
Author of aiobotocore
explain about about why it required, and it wouldn't be resolved until botocore
have native asyncio support (9 years since request created an P3 priority)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you think it's just safe to remove mypy ones? can we do it without loosing anyting ? Can You do it :D ?
I would be better keep what we have now, and maybe move mypy-boto3-{service-name}
packages into provider extra block, I could have a look what we should do for it.
@o-nikolas Do you think is it even possible to implements #11297 or maybe we should decide that in won't fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@o-nikolas Do you think is it even possible to implements #11297 or maybe we should decide that in won't fixed
So far I've seen mostly grief working with these types packages, there was some breakage a week or so back as well. I'd agree with resolving as wont fix, for now, we can always re-open later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would be better keep what we have now, and maybe move mypy-boto3-{service-name} packages into provider extra block, I could have a look what we should do for it.
We could simply move them to devel-only
extra. This is where we keep all things we want to have in CI image, but we do not want to make them direct dependencies of the packages. It's a bit disorganised and As I noticed yesterday we had forgottten "qubole" in there but let me just do it really quickly to split it into several dicts "per-provider" so that it will at least be organized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. Time to suspend Qubole then. |
Type hints + Static Checkers + boto3 = Pain
#33822 (comment)
In latest mypy-boto3-appflow (1.28.29) version tasks have type
list[TaskTypeDef]
andTaskOutputTypeDef
not exists anymoreEven if we change it and it pass CI static checks, it still won't pass in 1.28.16 (
breeze ci-image build --python 3.8
), for avoid fighting with mypy, just remove annotations relatedmypy_boto3_appflow.type_defs
P.S.: I'm not sure that we should have annotate something more than specific boto3 clients
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rst
or{issue_number}.significant.rst
, in newsfragments.