Use base classes for AWS Lambda Operators/Sensors #34890
@@ -19,20 +19,19 @@
 
 import json
 from datetime import timedelta
-from functools import cached_property
 from typing import TYPE_CHECKING, Any, Sequence
 
 from airflow.configuration import conf
 from airflow.exceptions import AirflowException
-from airflow.models import BaseOperator
 from airflow.providers.amazon.aws.hooks.lambda_function import LambdaHook
+from airflow.providers.amazon.aws.operators.base_aws import AwsBaseOperator
 from airflow.providers.amazon.aws.triggers.lambda_function import LambdaCreateFunctionCompleteTrigger
 
 if TYPE_CHECKING:
     from airflow.utils.context import Context
 
 
-class LambdaCreateFunctionOperator(BaseOperator):
+class LambdaCreateFunctionOperator(AwsBaseOperator[LambdaHook]):
     """
     Creates an AWS Lambda function.
 
@@ -62,13 +61,15 @@ class LambdaCreateFunctionOperator(BaseOperator):
     :param aws_conn_id: The AWS connection ID to use
     """
 
+    aws_hook_class = LambdaHook
     template_fields: Sequence[str] = (
         "function_name",
         "runtime",
         "role",
         "handler",
         "code",
         "config",
+        *AwsBaseOperator.template_fields,
     )
     ui_color = "#ff7300"
 
@@ -82,12 +83,11 @@ def __init__(
         code: dict,
         description: str | None = None,
         timeout: int | None = None,
-        config: dict = {},
+        config: dict | None = None,
         wait_for_completion: bool = False,
         waiter_max_attempts: int = 60,
         waiter_delay: int = 15,
         deferrable: bool = conf.getboolean("operators", "default_deferrable", fallback=False),
-        aws_conn_id: str = "aws_default",
         **kwargs,
     ):
         super().__init__(**kwargs)
@@ -98,16 +98,11 @@ def __init__(
         self.code = code
         self.description = description
         self.timeout = timeout
-        self.config = config
+        self.config = config or {}
         self.wait_for_completion = wait_for_completion
         self.waiter_delay = waiter_delay
         self.waiter_max_attempts = waiter_max_attempts
         self.deferrable = deferrable
-        self.aws_conn_id = aws_conn_id
 
-    @cached_property
-    def hook(self) -> LambdaHook:
-        return LambdaHook(aws_conn_id=self.aws_conn_id)
-
     def execute(self, context: Context):
         self.log.info("Creating AWS Lambda function: %s", self.function_name)
@@ -131,6 +126,9 @@ def execute(self, context: Context):
                     waiter_delay=self.waiter_delay,
                     waiter_max_attempts=self.waiter_max_attempts,
                     aws_conn_id=self.aws_conn_id,
+                    region_name=self.region_name,
+                    verify=self.verify,
+                    botocore_config=self.botocore_config,
                 ),
                 method_name="execute_complete",
                 timeout=timedelta(seconds=self.waiter_max_attempts * self.waiter_delay),
@@ -152,7 +150,7 @@ def execute_complete(self, context: Context, event: dict[str, Any] | None = None
         return event["function_arn"]
 
 
-class LambdaInvokeFunctionOperator(BaseOperator):
+class LambdaInvokeFunctionOperator(AwsBaseOperator[LambdaHook]):

I'll admit this format for the inheritance is new to me, and I am trying to read up on how it does what it does, but it appears that the value inside the square brackets here is always the same as the value being assigned to

The square brackets are just typing.Generic, which helps to annotate the type of the
airflow/airflow/providers/amazon/aws/utils/mixins.py Lines 100 to 102 in e9987d5
airflow/airflow/providers/amazon/aws/utils/mixins.py Lines 111 to 112 in e9987d5
airflow/airflow/providers/amazon/aws/utils/mixins.py Lines 148 to 151 in e9987d5
In general
However, information about generics isn't available at runtime and we can't extract it from the class definition, so we should assign the actual class to
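
To make the point of this exchange concrete, here is a minimal, self-contained sketch of the pattern. It is not the code in mixins.py: `BaseHook`, `LambdaLikeHook`, `GenericBaseOperator` and `InvokeOperator` are hypothetical stand-ins, and only the attribute name `aws_hook_class` is borrowed from the diff. The subscript gives type checkers the hook type, while the class attribute is what the sketch uses at runtime to build the hook.

```python
from __future__ import annotations

from functools import cached_property
from typing import Generic, TypeVar


class BaseHook:
    """Hypothetical stand-in for a hook base class."""

    def __init__(self, conn_id: str | None = "default") -> None:
        self.conn_id = conn_id


class LambdaLikeHook(BaseHook):
    """Hypothetical concrete hook."""

    def invoke(self) -> str:
        return f"invoked via {self.conn_id}"


HookT = TypeVar("HookT", bound=BaseHook)


class GenericBaseOperator(Generic[HookT]):
    # Runtime value: each subclass assigns the concrete hook class here.
    aws_hook_class: type[HookT]

    def __init__(self, conn_id: str | None = "default") -> None:
        self.conn_id = conn_id

    @cached_property
    def hook(self) -> HookT:
        # Type checkers read HookT from the subscript; this sketch only
        # consults the aws_hook_class attribute when it runs.
        return self.aws_hook_class(self.conn_id)


class InvokeOperator(GenericBaseOperator[LambdaLikeHook]):
    aws_hook_class = LambdaLikeHook
```

With this layout, `InvokeOperator(conn_id="aws_default").hook` is typed as `LambdaLikeHook` for static analysis, and the same class has to be repeated in `aws_hook_class` so that the property knows what to instantiate.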
     """
     Invokes an AWS Lambda function.
 
@@ -176,7 +174,14 @@ class LambdaInvokeFunctionOperator(BaseOperator):
     :param aws_conn_id: The AWS connection ID to use
     """
 
-    template_fields: Sequence[str] = ("function_name", "payload", "qualifier", "invocation_type")
+    aws_hook_class = LambdaHook
+    template_fields: Sequence[str] = (
+        "function_name",
+        "payload",
+        "qualifier",
+        "invocation_type",
+        *AwsBaseOperator.template_fields,
+    )
     ui_color = "#ff7300"
 
     def __init__(
@@ -189,7 +194,6 @@ def __init__(
         invocation_type: str | None = None,
         client_context: str | None = None,
         payload: bytes | str | None = None,
-        aws_conn_id: str = "aws_default",
         **kwargs,
     ):
         super().__init__(**kwargs)
@@ -200,11 +204,6 @@ def __init__(
         self.qualifier = qualifier
         self.invocation_type = invocation_type
         self.client_context = client_context
-        self.aws_conn_id = aws_conn_id
 
-    @cached_property
-    def hook(self) -> LambdaHook:
-        return LambdaHook(aws_conn_id=self.aws_conn_id)
-
     def execute(self, context: Context):
         """
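
The diff above changes `config: dict = {}` to `config: dict | None = None` and stores `config or {}`. The PR does not state the motivation, but a likely one is Python's shared mutable default argument. The sketch below illustrates that behaviour with two hypothetical functions that are not part of the PR.

```python
from __future__ import annotations


def make_config_shared(config: dict = {}) -> dict:
    # The default dict is created once, when the function is defined,
    # so every call that omits `config` mutates the same object.
    config["calls"] = config.get("calls", 0) + 1
    return config


def make_config_fresh(config: dict | None = None) -> dict:
    # A new dict is created on each call that omits `config`.
    config = config or {}
    config["calls"] = config.get("calls", 0) + 1
    return config
```

Calling `make_config_shared()` repeatedly keeps incrementing the same counter, whereas `make_config_fresh()` starts from an empty dict each time.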
@@ -0,0 +1,69 @@
+.. Licensed to the Apache Software Foundation (ASF) under one

Nice addition

+   or more contributor license agreements. See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership. The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License. You may obtain a copy of the License at
+
+.. http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied. See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+
+aws_conn_id
+    Reference to :ref:`Amazon Web Services Connection <howto/connection:aws>` ID.
+    If this parameter is set to ``None`` then the default boto3 behaviour is used without a connection lookup.
+    Otherwise use the credentials stored in the Connection. Default: ``aws_default``

Taragolis marked this conversation as resolved.

+
+region_name
+    AWS Region Name. If this parameter is set to ``None`` or omitted then **region_name** from
+    :ref:`AWS Connection Extra Parameter <howto/connection:aws:configuring-the-connection>` will be used.
+    Otherwise use the specified value instead of the connection value. Default: ``None``

Taragolis marked this conversation as resolved.

+
+verify
+    Whether or not to verify SSL certificates.
+
+    * ``False`` - do not validate SSL certificates.
+    * **path/to/cert/bundle.pem** - A filename of the CA cert bundle to use. You can specify this argument
+      if you want to use a different CA cert bundle than the one used by botocore.
+
+    If this parameter is set to ``None`` or omitted then **verify** from
+    :ref:`AWS Connection Extra Parameter <howto/connection:aws:configuring-the-connection>` will be used.
+    Otherwise use the specified value instead of the connection value. Default: ``None``

Taragolis marked this conversation as resolved.

+
+botocore_config
+    Use the provided dictionary to construct a `botocore.config.Config`_.
+    This configuration can be used to :ref:`howto/connection:aws:avoid-throttling-exceptions`, configure timeouts, etc.

Taragolis marked this conversation as resolved.

+
+    .. code-block:: python
+        :caption: Example; for more detail about the parameters, see `botocore.config.Config`_
+
+        {
+            "signature_version": "unsigned",
+            "s3": {
+                "us_east_1_regional_endpoint": True,
+            },
+            "retries": {
+                "mode": "standard",
+                "max_attempts": 10,
+            },
+            "connect_timeout": 300,
+            "read_timeout": 300,
+            "tcp_keepalive": True,
+        }
+
+    If this parameter is set to ``None`` or omitted then **config_kwargs** from
+    :ref:`AWS Connection Extra Parameter <howto/connection:aws:configuring-the-connection>` will be used.
+    Otherwise use the specified value instead of the connection value. Default: ``None``

Taragolis marked this conversation as resolved.

+
+    .. note::
+        Specifying an empty dictionary, ``{}``, constructs a default botocore Config
+        and overwrites the connection configuration for `botocore.config.Config`_

Taragolis marked this conversation as resolved.

+
+.. _botocore.config.Config: https://botocore.amazonaws.com/v1/documentation/api/latest/reference/config.html
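
The precedence rule documented for `botocore_config` above (``None`` or omitted falls back to the connection's **config_kwargs**, any other value, including ``{}``, replaces it) can be sketched as follows. `resolve_botocore_config` is a hypothetical helper written only to illustrate that rule; it is not a function from the provider package, and the real implementation may differ.

```python
from __future__ import annotations

from typing import Any


def resolve_botocore_config(
    operator_value: dict[str, Any] | None,
    connection_config_kwargs: dict[str, Any] | None,
) -> dict[str, Any]:
    """Hypothetical helper: pick the kwargs used to build the botocore Config."""
    if operator_value is None:
        # None or omitted: fall back to config_kwargs from the connection extra.
        return dict(connection_config_kwargs or {})
    # Any dict, including an empty one, replaces the connection value entirely.
    return dict(operator_value)
```

Under this reading, passing ``{}`` on the operator yields an empty set of kwargs, that is a default Config, even when the connection defines its own `config_kwargs`.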

Rather than adding this to every operator class, can we do it the other way around, where the base class already has this field with these defaults and we update it with anything new that the concrete class wants to add (if anything)? It would need to be a mutable type, of course; I'm not sure whether it being immutable is a requirement.

I think we could update it only in the init method; at least I don't know another way to change a class attribute during creation, and I'm not sure whether it works in every case (especially for mapped tasks).
Another way is to make the parameters a set, but this has a side effect: a set doesn't preserve order, so it might generate a new serialised DAG each time it is parsed (maybe that is not so bad).

I just thought of another option: create a helper function to apply the default parameters.
WDYT?

A variation on that might be to have each operator define "op_template_fields" and have BaseOperator's template_fields return base+op? Basically:

It would also be nice to test this within a mapped task; otherwise we could bump into the same problem as in "BatchOperator operator_extra_links property serialization in mapped tasks" (#31904).

This is certainly nicer! Less duplication, and the complexity of the defaults is hidden from users. It would be nice to have a fully backwards-compatible approach, but I'm happy to settle on this one.
This would make the existing operators incompatible since the field name would need to change, but we need to make other changes to them to convert them to the base class approach, so maybe that's perfectly fine?

I've added this helper in 28aef3b.
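
For readers following the thread, a helper of the kind discussed here might look roughly like the sketch below. It is an illustration only and not the code from the commit mentioned above: the name `merge_template_fields` and the contents of `DEFAULT_TEMPLATE_FIELDS` are assumptions. It addresses the ordering concern raised earlier by sorting the merged set, so repeated DAG parsing yields the same tuple.

```python
from __future__ import annotations

# Assumed defaults for illustration; the real base-class fields may differ.
DEFAULT_TEMPLATE_FIELDS = ("aws_conn_id", "region_name", "verify")


def merge_template_fields(*template_fields: str) -> tuple[str, ...]:
    """Hypothetical helper: combine operator fields with the shared defaults."""
    if not all(isinstance(field, str) for field in template_fields):
        raise TypeError("template fields must be strings")
    # A set removes duplicates; sorting gives a stable order so that
    # the resulting tuple is identical on every parse.
    return tuple(sorted(set(DEFAULT_TEMPLATE_FIELDS) | set(template_fields)))
```

An operator could then declare `template_fields = merge_template_fields("function_name", "payload")` instead of unpacking the base class tuple by hand.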