[Providers/HTTP] Add adapter parameter to HttpHook to allow custom requests adapters #44302

Merged

Conversation

jieyao-MilestoneHub
Contributor

This PR adds an adapter parameter to the HttpHook, allowing users to mount custom adapters for HTTP request handling. This enhances flexibility and supports use cases like custom retries, timeouts, and SSL handling.

Summary of Changes:

  • Added adapter parameter to HttpHook to allow custom HTTP adapters.
  • Modified get_conn to support mounting custom adapters or using TCPKeepAliveAdapter by default.
  • Added comprehensive tests to validate the functionality of the adapter parameter and its integration with get_conn.
  • Ensured all new tests pass and maintain compatibility with existing functionality.
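
The idea behind the new parameter can be sketched with plain `requests`: the caller builds an `HTTPAdapter` (here with a retry policy) and the session mounts it for both URL schemes. This is a minimal sketch and not the code in this PR; `build_session` is a hypothetical helper standing in for the session setup that `get_conn` performs, and the retry values are purely illustrative.

```python
from __future__ import annotations

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def build_session(adapter: HTTPAdapter | None = None) -> requests.Session:
    """Hypothetical stand-in for the session setup done in get_conn."""
    session = requests.Session()
    if adapter is not None:
        # Route both plain and TLS traffic through the caller's adapter.
        session.mount("http://", adapter)
        session.mount("https://", adapter)
    return session


# Illustrative policy: up to 3 retries with backoff on common gateway errors.
retrying_adapter = HTTPAdapter(
    max_retries=Retry(total=3, backoff_factor=0.5, status_forcelist=[502, 503, 504])
)

session = build_session(retrying_adapter)  # custom adapter mounted
default_session = build_session()          # adapter omitted: requests defaults kept
```

Going by the description above, the hook-level equivalent would be to pass the same object as the `adapter` argument of `HttpHook`; the merged signature is the authority on the exact form.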

Issue Reference:

Closes #44285

Testing and Validation:

  • Added unit tests in ~/providers/tests/http/hooks/test_http.py to validate:
    • Mounting custom adapters.
    • Default behavior when no adapter is provided.
    • Handling invalid adapter types.
  • Ran pytest providers/tests/http/hooks/test_http.py using breeze.
  • Passed pre-commit run --files providers/src/airflow/providers/http/hooks/http.py providers/tests/http/hooks/test_http.py using breeze.
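
The three cases listed above could be exercised along the following lines. This is a hedged sketch rather than the tests added in `test_http.py`: `mount_adapter` is a hypothetical helper, and the `TypeError` on a wrong type is an assumption about how invalid adapters were rejected at this stage (a later commit in this PR removes the type check).

```python
import requests
from requests.adapters import HTTPAdapter


def mount_adapter(session, adapter=None):
    """Hypothetical helper: mount adapter on both schemes, rejecting wrong types."""
    if adapter is None:
        return session  # keep whatever requests mounted by default
    if not isinstance(adapter, HTTPAdapter):
        raise TypeError(f"adapter must be an HTTPAdapter, got {type(adapter).__name__}")
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session


def test_custom_adapter_is_mounted():
    adapter = HTTPAdapter()
    session = mount_adapter(requests.Session(), adapter)
    assert session.get_adapter("http://example.com") is adapter
    assert session.get_adapter("https://example.com") is adapter


def test_default_when_no_adapter():
    session = mount_adapter(requests.Session())
    # requests installs its own HTTPAdapter for both schemes by default.
    assert isinstance(session.get_adapter("https://example.com"), HTTPAdapter)


def test_invalid_adapter_type_is_rejected():
    try:
        mount_adapter(requests.Session(), adapter="not-an-adapter")
    except TypeError:
        pass
    else:
        raise AssertionError("expected TypeError for a non-adapter value")
```

The functions follow pytest naming, so the same file can be collected by `pytest` or called directly.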

Backward Compatibility:

This change is backward-compatible. The new adapter parameter is optional and defaults to None, preserving existing behavior.




boring-cyborg bot commented Nov 23, 2024

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst).
Here are some useful points:

  • Pay attention to the quality of your code (ruff, mypy and type annotations). Our pre-commits will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide. Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
  • Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: [email protected]
    Slack: https://s.apache.org/airflow-slack

@jieyao-MilestoneHub jieyao-MilestoneHub force-pushed the feature/http-hook-custom-adapter branch 2 times, most recently from 1697dc2 to 7938e2b Compare November 23, 2024 11:04
Contributor

@shahar1 shahar1 left a comment

Welcome to Apache Airflow and thanks for your contribution!
Looks OK overall, I have some comments :)

providers/src/airflow/providers/http/hooks/http.py (two review threads, since outdated and resolved)
@jieyao-MilestoneHub
Contributor Author

Thank you, @shahar1, for your detailed review and valuable feedback! I’ve addressed your comments by:

  1. Adding a detailed docstring for the __init__ method, including the missing adapter parameter.
  2. Removing the redundant instantiation of TCPKeepAliveAdapter in the run method.

Please let me know if there’s anything else I can improve. I really appreciate your time and support!

@jieyao-MilestoneHub jieyao-MilestoneHub force-pushed the feature/http-hook-custom-adapter branch from f94eca0 to ff2adff Compare November 23, 2024 13:08
@shahar1
Contributor

shahar1 commented Nov 23, 2024

Thanks :) Regarding the 2nd point - could you please explain why it makes sense to relocate the instantiation of TCPKeepAliveAdapter to the __init__?

@jieyao-MilestoneHub
Contributor Author

Thanks for the great suggestion! Moving the TCPKeepAliveAdapter to init indeed streamlines the logic—keeps get_conn focused on session setup while ensuring TCP settings are ready upfront. A cleaner and more organized approach, much appreciated! 🙌

In response to the above, I've implemented two changes in this PR:

  1. Moved the instantiation of TCPKeepAliveAdapter to init for better organization and to keep get_conn focused on session setup.
  2. Streamlined the handling of connection extras to make the session configuration more robust and easier to maintain.

jieyao-MilestoneHub pushed a commit to jieyao-MilestoneHub/airflow_contribute that referenced this pull request Nov 24, 2024
Aligned the `get_conn` method with the adjustments specified in apache#44302,
including refined handling of headers. Optimized and updated test cases
to ensure compatibility and maintain robust test coverage.
jiao added 4 commits November 24, 2024 16:29
- Added `adapter` parameter to `HttpHook` to allow custom HTTP adapters.
- Modified `get_conn` to support mounting custom adapters or using TCPKeepAliveAdapter by default.
- Added comprehensive tests to validate the functionality of the `adapter` parameter and its integration with `get_conn`.
- Ensured all new tests pass and maintain compatibility with existing functionality.
…pter

- Added missing `adapter` parameter description to the HttpHook class docstring.
- Removed redundant instantiation of `TCPKeepAliveAdapter` in the `run` method since it's already instantiated in `get_conn`.
- Ensured proper mounting of TCP Keep-Alive adapter when enabled.
- Improved handling of connection extras for cleaner session configuration.
Aligned the `get_conn` method with the adjustments specified in apache#44302,
including refined handling of headers. Optimized and updated test cases
to ensure compatibility and maintain robust test coverage.
@jieyao-MilestoneHub jieyao-MilestoneHub force-pushed the feature/http-hook-custom-adapter branch from 3b5ebf1 to 93c23dc Compare November 24, 2024 08:30
Contributor

@shahar1 shahar1 left a comment

Looks good to me :)
I'll be happy for an additional review though

@jieyao-MilestoneHub
Contributor Author

Sure, let's have someone else take a look. Thank you for your suggestion!

@Lee-W Lee-W self-requested a review November 25, 2024 02:41
jiao added 3 commits November 25, 2024 21:11
…TPAdapter

- Changed the `adapter` parameter to accept only `HTTPAdapter` instead of `BaseAdapter`.
- Strengthened `_set_base_url` validation to ensure base_url is constructed with stricter conditions.
- Adjusted `_mount_adapters` to improve maintainability.
…TPAdapter

- Changed the `adapter` parameter to accept only `HTTPAdapter` instead of `BaseAdapter`.
- Strengthened `_set_base_url` validation to ensure base_url is constructed with stricter conditions.
- Adjusted `_mount_adapters` to improve maintainability.
@jieyao-MilestoneHub jieyao-MilestoneHub force-pushed the feature/http-hook-custom-adapter branch from 853ad14 to 151b402 Compare November 25, 2024 13:28
jiao added 12 commits November 30, 2024 11:46
The `adapter` parameter in `HttpHook` was previously required to be explicitly
set to an instance of `HTTPAdapter`. This commit modifies the `__init__`
method to assign a default `HTTPAdapter` when no adapter is provided.

Changes:
- Removed type checks for `adapter`, as default initialization guarantees correctness.
- Improved code readability and reduced potential runtime errors.

No functional changes beyond defaulting `adapter` to `HTTPAdapter`.
Refactored `HttpHook` to support a custom `HTTPAdapter` through the `adapter` parameter. If no adapter is provided, it defaults to `TCPKeepAliveAdapter` when `tcp_keep_alive=True`.

Test: Added `test_custom_adapter` to verify correct adapter mounting.
- Adjust the length of each line of code.
- modify `assert instance` by PEP8
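
The fallback described in the commit messages above (use the caller's adapter if one is given, otherwise a keep-alive adapter when `tcp_keep_alive=True`) might look roughly like this. It is a sketch under stated assumptions, not the merged implementation: `HookSketch` and `KeepAliveAdapter` are hypothetical names, `KeepAliveAdapter` is a simplified stand-in for `TCPKeepAliveAdapter` that only sets `SO_KEEPALIVE`, and the precedence order is inferred from the commit text.

```python
from __future__ import annotations

import socket

import requests
from requests.adapters import HTTPAdapter
from urllib3.connection import HTTPConnection


class KeepAliveAdapter(HTTPAdapter):
    """Hypothetical, simplified stand-in for TCPKeepAliveAdapter."""

    def __init__(self, **kwargs):
        # Set before super().__init__(), which itself calls init_poolmanager.
        self.socket_options = HTTPConnection.default_socket_options + [
            (socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
        ]
        super().__init__(**kwargs)

    def init_poolmanager(self, *args, **kwargs):
        # Hand the socket options down to urllib3's connection pools.
        kwargs["socket_options"] = self.socket_options
        super().init_poolmanager(*args, **kwargs)


class HookSketch:
    """Hypothetical hook: resolves the adapter once, in __init__."""

    def __init__(self, adapter: HTTPAdapter | None = None, tcp_keep_alive: bool = True):
        if adapter is not None:
            self.adapter = adapter              # caller's choice wins (assumed)
        elif tcp_keep_alive:
            self.adapter = KeepAliveAdapter()   # default when keep-alive is on
        else:
            self.adapter = None                 # leave requests' defaults alone

    def get_conn(self) -> requests.Session:
        # get_conn stays focused on session setup; it only mounts what
        # __init__ already decided on.
        session = requests.Session()
        if self.adapter is not None:
            session.mount("http://", self.adapter)
            session.mount("https://", self.adapter)
        return session
```

Resolving the adapter in `__init__` matches the reasoning given earlier in the thread: session construction no longer needs to know how the adapter was chosen.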
@Lee-W
Member

Lee-W commented Dec 2, 2024

running pre-commit run --all-files locally can help you resolve issues earlier 🙂

@potiuk
Member

potiuk commented Dec 2, 2024

Or breeze static-checks --only-my-changes which runs only on files you changed so it is way faster.

@jieyao-MilestoneHub
Contributor Author

Thanks for your suggestion!

@Lee-W
Member

Lee-W commented Dec 3, 2024

Thanks, @jieyao-MilestoneHub! The PR looks good to me. I'll keep it open for a day or two so others can check. Please ping me if I forget to get back to you and merge it.

@jieyao-MilestoneHub
Contributor Author

Thank you all for taking the time to work on this together! 😊

@potiuk potiuk merged commit 71fec4e into apache:main Dec 3, 2024
65 checks passed

boring-cyborg bot commented Dec 3, 2024

Awesome work, congrats on your first merged pull request! You are invited to check our Issue Tracker for additional contributions.

@jieyao-MilestoneHub jieyao-MilestoneHub deleted the feature/http-hook-custom-adapter branch December 3, 2024 13:12
@shahar1
Contributor

shahar1 commented Dec 3, 2024

Thank you for your great first contribution :)
Looking forward!

LefterisXefteris pushed a commit to LefterisXefteris/airflow that referenced this pull request Jan 5, 2025
…quests adapters (apache#44302)

* feat(http-hook): add adapter parameter to HttpHook and enhance get_conn

- Added `adapter` parameter to `HttpHook` to allow custom HTTP adapters.
- Modified `get_conn` to support mounting custom adapters or using TCPKeepAliveAdapter by default.
- Added comprehensive tests to validate the functionality of the `adapter` parameter and its integration with `get_conn`.
- Ensured all new tests pass and maintain compatibility with existing functionality.

* fix(http_hook): Update docstring and remove redundant TCPKeepAliveAdapter

- Added missing `adapter` parameter description to the HttpHook class docstring.
- Removed redundant instantiation of `TCPKeepAliveAdapter` in the `run` method since it's already instantiated in `get_conn`.

* fix(http_hook): improve get_conn session setup and TCP adapter logic

- Ensured proper mounting of TCP Keep-Alive adapter when enabled.
- Improved handling of connection extras for cleaner session configuration.

* feat(http): update get_conn logic and corresponding tests (apache#44302)

Aligned the `get_conn` method with the adjustments specified in apache#44302,
including refined handling of headers. Optimized and updated test cases
to ensure compatibility and maintain robust test coverage.

* refactor(http_hook): simplify HttpHook by reverting BaseAdapter to HTTPAdapter

- Changed the `adapter` parameter to accept only `HTTPAdapter` instead of `BaseAdapter`.
- Strengthened `_set_base_url` validation to ensure base_url is constructed with stricter conditions.
- Adjusted `_mount_adapters` to improve maintainability.

* refactor(http_hook): simplify HttpHook by reverting BaseAdapter to HTTPAdapter

- Changed the `adapter` parameter to accept only `HTTPAdapter` instead of `BaseAdapter`.
- Strengthened `_set_base_url` validation to ensure base_url is constructed with stricter conditions.
- Adjusted `_mount_adapters` to improve maintainability.

* Merge: new main

* refactor: improve function naming and add type annotations

- Changed the function prefix from `_set` to `_configure_session_from` to enhance readability and better reflect its purpose.
- Added static type annotations for input parameters and return values.
- Included comments to document the design rationale following coding standards.
- Improved error message: replaced generic text with detailed and actionable messages.

* fix: simplify the change of session

- Added a variable `session` after the change of session member

* fix: Adjust response format.

* fix: simplify the logic

* fix(hook): ensure default HTTPAdapter in HttpHook init

The `adapter` parameter in `HttpHook` was previously required to be explicitly
set to an instance of `HTTPAdapter`. This commit modifies the `__init__`
method to assign a default `HTTPAdapter` when no adapter is provided.

Changes:
- Removed type checks for `adapter`, as default initialization guarantees correctness.
- Improved code readability and reduced potential runtime errors.

No functional changes beyond defaulting `adapter` to `HTTPAdapter`.

* feat(http_hook): add support for custom adapter in initialization

Refactored `HttpHook` to support a custom `HTTPAdapter` through the `adapter` parameter. If no adapter is provided, it defaults to `TCPKeepAliveAdapter` when `tcp_keep_alive=True`.

Test: Added `test_custom_adapter` to verify correct adapter mounting.

* fix: CI image checks / Static checks

- Adjust the length of each line of code.

* fix: Adjust indent style

- modify `assert instance` by PEP8

* fix: ruff error about `from requests.adapters import HTTPAdapter`

---------

Co-authored-by: jiao <[email protected]>
Development

Successfully merging this pull request may close these issues.

allow extending HttpHook with requests adapters
6 participants