fix: replaced md4 with blake2b #34442

Merged 1 commit on Apr 22, 2024

@Anas12091101 Anas12091101 commented Mar 28, 2024

Description

This PR replaces md4 with blake2b in memcache. At MIT, we recently encountered an issue after upgrading our base image from debian-buster to debian-bookworm. The upgrade uncovered an incompatibility between the newer version of OpenSSL and one of the hash digests that edx-platform is using (1.1.1d-0+deb10u7). Basically, OpenSSL and hashlib report that md4 is supported and available, but if you actually try to invoke openssl md4 you get an error saying it is not available. This PR resolves the issue by switching the hash function to something newer and more widely supported.

Useful information to include:

  • Which edX user roles will this change impact? Learner, Course Author

Supporting information

https://discuss.openedx.org/t/openssl-upgrade-md4-hashing/12654

Testing instructions

  • In the lms container run python manage.py lms shell
  • In the python shell run:
>>> import hashlib
>>> blake2b = hashlib.new("blake2b", digest_size=16)
>>> string = "abc"
>>> blake2b.update(string.encode('utf-8'))
>>> blake2b.hexdigest()
  • You should get the hash without any errors in the above statements.
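
The same check can be scripted instead of typed into the shell. This is a minimal sketch that assumes only the standard-library `hashlib`; the helper name `digest_usable` is made up for illustration and is not part of edx-platform.

```python
import hashlib


def digest_usable(name, **kwargs):
    """Return True if hashlib can actually construct and use the digest."""
    try:
        hash_obj = hashlib.new(name, **kwargs)
        hash_obj.update("abc".encode("utf-8"))
        hash_obj.hexdigest()
    except ValueError:
        # hashlib raises ValueError for digests the local build cannot provide.
        return False
    return True


# blake2b is one of hashlib's guaranteed algorithms.
print("blake2b:", digest_usable("blake2b", digest_size=16))
# md4 depends on the local OpenSSL build, so this may print False.
print("md4:", digest_usable("md4"))
```

Running it on both the old and the new base image should show whether md4 is the only digest affected.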

Deadline

"None"

@openedx-webhooks

Thanks for the pull request, @Anas12091101! Please note that it may take us up to several weeks or months to complete a review and merge your PR.

Feel free to add as much of the following information to the ticket as you can:

  • supporting documentation
  • Open edX discussion forum threads
  • timeline information ("this must be merged by XX date", and why that is)
  • partner information ("this is a course on edx.org")
  • any other information that can help Product understand the context for the PR

All technical communication about the code itself will be done via the GitHub pull request interface. As a reminder, our process documentation is here.

Please let us know once your PR is ready for our review and all tests are green.

@openedx-webhooks openedx-webhooks added the open-source-contribution PR author is not from Axim or 2U label Mar 28, 2024
@pdpinch

pdpinch commented Mar 28, 2024

@Anas12091101 I know this caused problems with launching the gradebook MFE. Do you know what else in edx-platform might use this caching scheme?

@timmc-edx

My only concern is that this will have the effect of clearing the entire cache, potentially causing a disruption. (On edx.org we only clear memcache as a last resort, as it logs out all sessions and causes a performance hit.)

So, a few thoughts:

  • Is there some option for a rotation mechanism? Fetching two keys (one blake2b, one md4) would be a much higher traffic burden, but maybe there's something clever we can do.
  • Would it be possible to just use a different implementation of MD4? MD4 is a weak hash function for cryptographic purposes, but should be fine for sharding.
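
One shape the rotation idea could take is a read that tries the new key first and only falls back to the old key on a miss. The sketch below is not from the PR. It uses a plain dict as a stand-in for memcache and a truncated SHA-256 as a stand-in for the old md4 hash, so that it runs where md4 is unavailable. All function names are hypothetical.

```python
import hashlib


def new_hash(raw_key):
    """New scheme: 16-byte blake2b digest."""
    return hashlib.blake2b(raw_key.encode("utf-8"), digest_size=16).hexdigest()


def old_hash(raw_key):
    """Stand-in for the old md4 scheme (assumption, for runnability only)."""
    return hashlib.sha256(raw_key.encode("utf-8")).hexdigest()[:32]


def get_with_rotation(cache, raw_key):
    """Look up the new key; on a miss, try the old key and migrate the value."""
    new_key = new_hash(raw_key)
    if new_key in cache:
        return cache[new_key]
    old_key = old_hash(raw_key)
    if old_key in cache:
        # Copy the value forward so later reads hit on the first lookup.
        cache[new_key] = cache[old_key]
        return cache[new_key]
    return None


cache = {old_hash("course-v1:demo"): "cached value"}
print(get_with_rotation(cache, "course-v1:demo"))
```

The second lookup happens only on a miss, but during the transition most reads of long keys would miss once, which is the extra traffic mentioned above.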

@Anas12091101

Anas12091101 commented Mar 29, 2024

> @Anas12091101 I know this caused problems with launching the gradebook MFE. Do you know what else in edx-platform might use this caching scheme?

@pdpinch I think memcache is the default caching scheme in both LMS and CMS. So anywhere in edx-platform where we use cache.get, we end up calling the safe_key function in memcache.py.
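
For context, Django lets a cache backend name a key function through the `KEY_FUNCTION` option, and that function receives `key`, `key_prefix`, and `version` for every cache operation. A sketch of how such wiring might look follows. The dotted path, backend choice, and location are illustrative assumptions, not copied from edx-platform's settings.

```python
# Illustrative Django settings fragment (all values are assumptions).
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
        "LOCATION": "127.0.0.1:11211",
        # Hypothetical dotted path to a function with the signature
        # safe_key(key, key_prefix, version).
        "KEY_FUNCTION": "myproject.cache_keys.safe_key",
    }
}
```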

@blarghmatey

One option would be to put this in a try/except block that catches the ValueError raised when md4 is not a supported algorithm and falls back to Blake2B. That doesn't solve the case of deploying a new set of instances with the newer OpenSSL that no longer supports md4.
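
The try/except option could be sketched roughly as follows. This is not the code that was merged, and the function name `hash_with_fallback` is hypothetical.

```python
import hashlib


def hash_with_fallback(string):
    """Hash with md4 when the local build provides it, else with blake2b."""
    try:
        hash_obj = hashlib.new("md4")
    except ValueError:
        # Raised when the local OpenSSL build does not provide md4.
        hash_obj = hashlib.new("blake2b", digest_size=16)
    hash_obj.update(string.encode("utf-8"))
    return hash_obj.hexdigest()


print(hash_with_fallback("abc"))
```

Both branches yield a 32-character hex digest, but the digests differ, so instances on different OpenSSL versions would compute different keys for the same input.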

On the note of using a different implementation of md4, that will give a stable target (for now) but introduces the potential for stagnation, as that algorithm is (rightfully) not supported any more. That leaves us in the situation of either accepting a one-time hit of dumping cached items or developing a migration path.

All of that being said, the hashed key is only used if the combined length of key, key_prefix, and version exceeds 250 characters. Given that the cache is designed to be a performance enhancement, not a functional requirement, and that only a subset of cached values will be impacted, I think this should be a safe change to merge without developing a complex stack of migration logic.
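
To make the 250-character threshold concrete, here is a minimal sketch of a key function that hashes only oversized keys. It assumes the parts are joined as `key_prefix:version:key` (Django's default layout). The name `safe_key_sketch` is hypothetical and this is not the actual edx-platform implementation.

```python
import hashlib

MAX_KEY_LENGTH = 250  # threshold mentioned in the comment above


def safe_key_sketch(key, key_prefix, version):
    """Return the combined key, hashed only if it exceeds the length limit."""
    combined = ":".join([key_prefix, str(version), key])
    if len(combined) > MAX_KEY_LENGTH:
        hash_obj = hashlib.new("blake2b", digest_size=16)
        hash_obj.update(combined.encode("utf-8"))
        return hash_obj.hexdigest()
    return combined


print(safe_key_sketch("user.42.profile", "lms", 1))  # short: returned as-is
print(safe_key_sketch("x" * 300, "lms", 1))          # long: replaced by a digest
```

Only keys on the second path would change when the hash function changes, which is why the impact is limited to a subset of cached values.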

@timmc-edx

Good to know this only affects larger keys. I might put in a quick PR to add telemetry for checking what key sizes we're seeing on edX.

I think it might make sense to put this behind a feature switch and make a DEPR out of removing the md4 option entirely. Switching to blake2b as a default seems like a very reasonable thing to do, but having a well-defined timeframe would be helpful here.

From personal experience, while caches are nominally a performance enhancement, a full cache can very quickly become a critical requirement for system stability. I'd like to understand which situation we're in here before saying that I'm comfortable with a hard cutover.

# .. toggle_description: Enables the memcache to use the blake2b hash algorithm instead of deprecated md4 for keys
# exceeding 250 characters
# .. toggle_use_cases: open_edx
# .. toggle_creation_date: 2024-04-02

I think we should mention somewhere that this should be a short-lived toggle. Maybe only until the sycamore release.

And do we need to define this twice?

@Anas12091101 Anas12091101 Apr 2, 2024

Are you talking about the annotations? Yes, we don't need to add the annotations again in the CMS. The flag, however, is required in both LMS and CMS.

@pdpinch

pdpinch commented Apr 3, 2024

@timmc-edx what do you think?

@timmc-edx

Looks good! The feature flag would allow us to schedule a time to do a cutover in a controlled way (and would give us time to research which keys would be affected).

@Anas12091101 Anas12091101 force-pushed the anas/replace-md4-with-blake2b branch 2 times, most recently from 63c10e3 to 527f1d8 Compare April 8, 2024 13:04

@asadali145 asadali145 left a comment

LGTM! Just a few code improvements.

@@ -579,6 +579,9 @@
# .. toggle_creation_date: 2024-03-22
# .. toggle_tickets: https://github.com/openedx/edx-platform/pull/33911
'ENABLE_GRADING_METHOD_IN_PROBLEMS': False,

# See annotations in lms/envs/common.py for details.

This comment was marked as resolved.

I think we don't need to redefine the annotations. There are also some examples in which a flag's annotation has not been redefined in the CMS settings. Examples: https://github.com/openedx/edx-platform/blob/34b1b90c499ad2b78233a0dcc70fbb789a78afc9/cms/envs/common.py#L187-L188

https://github.com/openedx/edx-platform/blob/34b1b90c499ad2b78233a0dcc70fbb789a78afc9/cms/envs/common.py#L189-L190

@@ -579,6 +579,9 @@
# .. toggle_creation_date: 2024-03-22
# .. toggle_tickets: https://github.com/openedx/edx-platform/pull/33911
'ENABLE_GRADING_METHOD_IN_PROBLEMS': False,

# See annotations in lms/envs/common.py for details.
'ENABLE_BLAKE2B_HASH_FN': False,

Can we avoid the abbreviation FN? ENABLE_BLAKE2B_HASH_FUNCTION should be fine.

from django.utils.encoding import smart_str
from django.conf import settings

We need to sort the imports.

Comment on lines 18 to 23
if settings.FEATURES.get("ENABLE_BLAKE2B_HASH_FN", False):
    hash_obj = hashlib.new("blake2b", digest_size=16)
else:
    hash_obj = hashlib.new("md4")
hash_obj.update(string.encode('utf-8'))
return hash_obj.hexdigest()

This comment was marked as resolved.
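
For readers who want to try the flag-gated logic outside Django, here is a self-contained sketch. A plain dict stands in for `settings.FEATURES`, and the function name `hash_long_key` is hypothetical. The flag name is the one used in the snippet under review at this point in the thread.

```python
import hashlib

# Stand-in for Django's settings.FEATURES (assumption for this sketch).
FEATURES = {"ENABLE_BLAKE2B_HASH_FN": True}


def hash_long_key(string, features=FEATURES):
    """Pick the hash algorithm based on the feature flag, then hash the string."""
    if features.get("ENABLE_BLAKE2B_HASH_FN", False):
        hash_obj = hashlib.new("blake2b", digest_size=16)
    else:
        # May raise ValueError where the local OpenSSL build lacks md4.
        hash_obj = hashlib.new("md4")
    hash_obj.update(string.encode("utf-8"))
    return hash_obj.hexdigest()


print(hash_long_key("abc"))
```

With the flag off, the function keeps the old behavior, including the failure on systems without md4. That is why operators on newer base images would need to turn the flag on.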

Comment on lines 1063 to 1072
# .. toggle_name: FEATURES['ENABLE_BLAKE2B_HASH_FN']
# .. toggle_implementation: DjangoSetting
# .. toggle_default: False
# .. toggle_description: Enables the memcache to use the blake2b hash algorithm instead of deprecated md4 for keys
# exceeding 250 characters
# .. toggle_use_cases: open_edx
# .. toggle_creation_date: 2024-04-02
# .. toggle_target_removal_date: 2024-12-09
# .. toggle_tickets: https://github.com/openedx/edx-platform/pull/34442

Can we add toggle_warning for this flag like other LMS & CMS common flags?
# .. toggle_warning: For consistency, keep the value in sync with the setting of the same name in the LMS and CMS.

Comment on lines 7 to 8
from django.test import TestCase, override_settings
from django.conf import settings

Sorting? You can use isort.

@Anas12091101 Anas12091101 force-pushed the anas/replace-md4-with-blake2b branch from bfeabba to 4658109 Compare April 15, 2024 12:37
@pdpinch

pdpinch commented Apr 18, 2024

@Anas12091101 would you mind squashing your commits?

@pdpinch

pdpinch commented Apr 18, 2024

@timmc-edx we took the route of adding a feature flag to ease the changeover. Is this OK to merge from your perspective?

After this merges, we'll open a DEPR for removing the md4 code and the feature flag, but it wouldn't be removed until after the Redwood or Sumac releases, depending on timing.

@Anas12091101 Anas12091101 force-pushed the anas/replace-md4-with-blake2b branch from 4658109 to 76385ed Compare April 19, 2024 07:23
@timmc-edx

Yes, I like this approach, and have no reservations about it merging.

@pdpinch pdpinch merged commit aea7fce into openedx:master Apr 22, 2024
66 checks passed
@openedx-webhooks

@Anas12091101 🎉 Your pull request was merged! Please take a moment to answer a two question survey so we can improve your experience in the future.

@edx-pipeline-bot

2U Release Notice: This PR has been deployed to the edX staging environment in preparation for a release to production.

@edx-pipeline-bot

2U Release Notice: This PR has been deployed to the edX production environment.

@@ -579,6 +579,9 @@
# .. toggle_creation_date: 2024-03-22
# .. toggle_tickets: https://github.com/openedx/edx-platform/pull/33911
'ENABLE_GRADING_METHOD_IN_PROBLEMS': False,

# See annotations in lms/envs/common.py for details.
'ENABLE_BLAKE2B_HASHiNG': False,

Typo: Dictionary matching is going to be case-sensitive. This might cause errors.

Suggested change
- 'ENABLE_BLAKE2B_HASHiNG': False,
+ 'ENABLE_BLAKE2B_HASHING': False,
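
A short illustration of why the casing matters, assuming the flag is read with `.get` and a default as in the snippet reviewed earlier. The dict below is a stand-in for the real settings, and the `True` value is chosen only to make the miss visible.

```python
# Misspelled key with a lowercase "i", as in the line flagged above.
FEATURES = {"ENABLE_BLAKE2B_HASHiNG": True}

# The lookup uses the correctly spelled name, so it misses and
# silently returns the default instead of raising an error.
enabled = FEATURES.get("ENABLE_BLAKE2B_HASHING", False)
print(enabled)  # False
```

Because `.get` returns the default on a miss, the mismatch would not raise. A value set under the misspelled name would simply be ignored.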

Created the PR to fix this
