Introduce configuration to disable md5 used for security #31171

longslvr
Contributor

@longslvr longslvr commented May 9, 2023

Add a configuration flag to disable the use of md5 for security in Python >= 3.9. On Python < 3.9 the flag has no effect on the existing behaviour. This is mainly for backward compatibility, as suggested in #25625.

Combined with the newly introduced flag to change the caching hash method (introduced in #30675), this configuration lets Airflow run in environments restricted by FIPS 140-2 requirements.

Background: I have been running Airflow in a FIPS-enabled environment for my company. To do that, we have had to fork Airflow and apply all the changes needed to make it fully functional, which adds extra complexity when upgrading to the latest version. A few other people have also shown interest in running Airflow in a FIPS environment, so I want to introduce the change upstream so we can all take advantage of it.

Testing done:

  • Ran pre-commit
  • Ran the unit tests with Breeze on Python 3.9 and Python 3.7
  • Started Airflow with Breeze and ran the preloaded example DAGs on both Python 3.9 and Python 3.7

Thanks @vchiapaikeo for the work on the hashlib wrapper and the configurable caching

@boring-cyborg boring-cyborg bot added area:CLI area:core-operators Operators, Sensors and hooks within Core Airflow provider:cncf-kubernetes Kubernetes provider related issues area:providers area:serialization area:webserver Webserver related Issues provider:google Google (including GCP) related issues labels May 9, 2023
@boring-cyborg

boring-cyborg bot commented May 9, 2023

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst).
Here are some useful points:

  • Pay attention to the quality of your code (ruff, mypy and type annotations). Our pre-commits will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide. Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: [email protected]
    Slack: https://s.apache.org/airflow-slack

security:
  description: ~
  options:
    disable_md5_for_security:
Member

for_security in the config seems redundant

@uranusjr
Member

What are the downsides if we always enable usedforsecurity?

@longslvr
Contributor Author

The configuration is a more conservative approach: it lets users opt in rather than opt out, since this functionality is not commonly used.
The flag does not exist in Python before 3.9, so doing it this way safeguards its usage.

@longslvr longslvr force-pushed the introduce_configuration_for_md5_to_run_in_fips_environment branch from 440b724 to 9d2d023 Compare May 15, 2023 18:57
@longslvr longslvr requested a review from kaxil May 15, 2023 20:48
@longslvr longslvr force-pushed the introduce_configuration_for_md5_to_run_in_fips_environment branch from 9d2d023 to d3ffc8a Compare May 17, 2023 03:02
Member

@potiuk potiuk left a comment

Nice - in this form it is backwards compatible and great to have a separate configuration for it.

@longslvr longslvr force-pushed the introduce_configuration_for_md5_to_run_in_fips_environment branch 2 times, most recently from 3c4024b to 6128774 Compare May 19, 2023 16:43
@longslvr longslvr force-pushed the introduce_configuration_for_md5_to_run_in_fips_environment branch from 6128774 to 1073dc8 Compare May 25, 2023 13:54
Member

@ashb ashb left a comment

Do we need a config option for this? Is it possible to detect whether the option will be accepted and just set it automatically? How is a user of Airflow meant to know how we use md5 in the code to reasonably set this config option?

If it only affects 3.9 let's just always set it there

Member

@ashb ashb left a comment

-1 to config option.

How is a user of Airflow meant to know how we use md5 in the code in order to reasonably set this config option?

This is a property of the code, not of the installation/deployment.

@longslvr
Contributor Author

> Do we need a config option for this? Is it possible to detect whether the option will be accepted and just set it automatically?
>
> If it only affects 3.9 let's just always set it there

I was going for an opt-in approach with these changes. Since the change is only required for FIPS compliance, the configuration lets people opt in when running in FIPS environments instead of opting out.

@ashb
Member

ashb commented May 25, 2023

Does it only work in FIPS environments, or is it just ignored outside of one?

@longslvr
Contributor Author

longslvr commented May 25, 2023

> Does it only work in FIPS environments, or is it just ignored outside of one?

It works outside of FIPS environments. I will make this the default and remove the configuration.

@ashb it is done now

@longslvr longslvr requested a review from ashb May 25, 2023 20:13
Member

@ashb ashb left a comment

LGTM. @potiuk You happy with the config-less approach here?

@longslvr longslvr force-pushed the introduce_configuration_for_md5_to_run_in_fips_environment branch from c91dae6 to 0bb0f35 Compare May 26, 2023 00:42
@eladkal eladkal requested a review from potiuk May 26, 2023 08:58
@ashb ashb merged commit 22e44ab into apache:main May 26, 2023
@boring-cyborg

boring-cyborg bot commented May 26, 2023

Awesome work, congrats on your first merged pull request! You are invited to check our Issue Tracker for additional contributions.

@longslvr longslvr deleted the introduce_configuration_for_md5_to_run_in_fips_environment branch May 26, 2023 14:51
@eladkal eladkal added this to the Airflow 2.6.2 milestone May 26, 2023
@eladkal eladkal added the type:improvement Changelog: Improvements label May 26, 2023
eladkal pushed a commit that referenced this pull request Jun 8, 2023
@eladkal eladkal modified the milestones: Airflow 2.6.2, Airflow 2.7.0 Jun 9, 2023
@umamaheswar52

@longslvr - one quick confirmation needed, please. If I successfully install Apache Airflow 2.6.1 (or any version <= 2.7.3) on a FIPS-enabled Ubuntu OS, can I be assured that Airflow (and its supporting libraries) are FIPS enabled?

@longslvr
Contributor Author

> @longslvr - one quick confirmation needed, please. If I successfully install Apache Airflow 2.6.1 (or any version <= 2.7.3) on a FIPS-enabled Ubuntu OS, can I be assured that Airflow (and its supporting libraries) are FIPS enabled?

For Airflow to function correctly in a FIPS-enabled environment, MD5 usage for security must be disabled. You can install Airflow, but it might not be able to run. Prior to 2.7.3 you would have to fork Airflow and make the change to disable it yourself.
I would suggest updating Airflow to >= 2.7.3 and using the configuration to turn off MD5 for security.
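
A quick way to see whether a given interpreter would be affected, as a hedged sketch. It assumes that a FIPS-restricted OpenSSL build rejects a plain `hashlib.md5()` call with `ValueError`; the function name is illustrative, and the result says nothing about whether Airflow or its dependencies are FIPS compliant as a whole.

```python
import hashlib


def plain_md5_available() -> bool:
    """Return False if this interpreter refuses MD5 without a non-security marker.

    Assumption: restricted builds raise ValueError from hashlib.md5().
    """
    try:
        hashlib.md5(b"probe")
    except ValueError:
        return False
    return True


if not plain_md5_available():
    print("Plain MD5 is blocked here; code must pass usedforsecurity=False.")
```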

Labels
area:CLI area:core-operators Operators, Sensors and hooks within Core Airflow area:providers area:serialization area:webserver Webserver related Issues provider:cncf-kubernetes Kubernetes provider related issues provider:google Google (including GCP) related issues type:improvement Changelog: Improvements
7 participants