-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: replaced md4 with blake2b #34442
fix: replaced md4 with blake2b #34442
Conversation
Thanks for the pull request, @Anas12091101! Please note that it may take us up to several weeks or months to complete a review and merge your PR. Feel free to add as much of the following information to the ticket as you can:
All technical communication about the code itself will be done via the GitHub pull request interface. As a reminder, our process documentation is here. Please let us know once your PR is ready for our review and all tests are green. |
@Anas12091101 I know this caused problems with launching the gradebook MFE. Do you know what else in edx-platform might use this caching scheme? |
My only concern is that this will have the effective of clearing the entire cache, potentially causing a disruption. (On edx.org we only clear memcache as a last resort, as it logs out all sessions and causes a performance hit.) So, a few thoughts:
|
@pdpinch I think by default in LMS and CMS memcache is being used as the caching scheme in edx-platform. So anywhere in edx-platform where we are using cache.get, we are calling the |
One option would be to put this in a On the note of using a different implementation of All of that being said, the hashed key is only used if the combined values of key, key_prefix, and version exceeds 250 characters. Given that the cache is designed to be a performance enhancement, not a functional requirement and that only a subset of cached values will be impacted I think that this should be a safe change to merge without developing a complex stack of migration logic. |
Good to know this only affects larger keys. I might put in a quick PR to add telemetry for checking what key sizes we're seeing on edX. I think it might make sense to put this behind a feature switch and make a DEPR out of removing the md4 option entirely. Switching to blake2b as a default seems like a very reasonable thing to do, but having a well-defined timeframe would be helpful here. From personal experience, while caches are nominally a performance enhancement, a full cache can very quickly become a critical requirement for system stability. I'd like to understand which situation we're in here before saying that I'm comfortable with a hard cutover. |
cms/envs/common.py
Outdated
# .. toggle_description: Enables the memcache to use the blake2b hash algorithm instead of depreciated md4 for keys | ||
# exceeding 250 characters | ||
# .. toggle_use_cases: open_edx | ||
# .. toggle_creation_date: 2024-04-02 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should mention somewhere that this should be a short-lived toggle. Maybe only until the sycamore release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And do we need to define this twice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you talking about annonations? Yes, we don't need to add the annonations again in the CMS. The flag is however required in both LMS and CMS
@timmc-edx what do you think? |
Looks good! The feature flag would allow us to schedule a time to do a cutover in a controlled way (and would give us time to research which keys would be affected). |
63c10e3
to
527f1d8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Just a few code improvements.
@@ -579,6 +579,9 @@ | |||
# .. toggle_creation_date: 2024-03-22 | |||
# .. toggle_tickets: https://github.com/openedx/edx-platform/pull/33911 | |||
'ENABLE_GRADING_METHOD_IN_PROBLEMS': False, | |||
|
|||
# See annotations in lms/envs/common.py for details. |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we don't need to redefine the annotations again. There are some examples as well in which the flag's annonation has not been redefined in cms settings. Examples: https://github.com/openedx/edx-platform/blob/34b1b90c499ad2b78233a0dcc70fbb789a78afc9/cms/envs/common.py#L187-L188
cms/envs/common.py
Outdated
@@ -579,6 +579,9 @@ | |||
# .. toggle_creation_date: 2024-03-22 | |||
# .. toggle_tickets: https://github.com/openedx/edx-platform/pull/33911 | |||
'ENABLE_GRADING_METHOD_IN_PROBLEMS': False, | |||
|
|||
# See annotations in lms/envs/common.py for details. | |||
'ENABLE_BLAKE2B_HASH_FN': False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we avoid the abbreviation FN
? ENABLE_BLAKE2B_HASH_FUNCTION
should be fine.
common/djangoapps/util/memcache.py
Outdated
from django.utils.encoding import smart_str | ||
from django.conf import settings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to sort the imports.
common/djangoapps/util/memcache.py
Outdated
if settings.FEATURES.get("ENABLE_BLAKE2B_HASH_FN", False): | ||
hash_obj = hashlib.new("blake2b", digest_size=16) | ||
else: | ||
hash_obj = hashlib.new("md4") | ||
hash_obj.update(string.encode('utf-8')) | ||
return hash_obj.hexdigest() |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
lms/envs/common.py
Outdated
# .. toggle_name: FEATURES['ENABLE_BLAKE2B_HASH_FN'] | ||
# .. toggle_implementation: DjangoSetting | ||
# .. toggle_default: False | ||
# .. toggle_description: Enables the memcache to use the blake2b hash algorithm instead of depreciated md4 for keys | ||
# exceeding 250 characters | ||
# .. toggle_use_cases: open_edx | ||
# .. toggle_creation_date: 2024-04-02 | ||
# .. toggle_target_removal_date: 2024-12-09 | ||
# .. toggle_tickets: https://github.com/openedx/edx-platform/pull/34442 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add toggle_warning for this flag like other LMS & CMS common flags?
# .. toggle_warning: For consistency, keep the value in sync with the setting of the same name in the LMS and CMS.
from django.test import TestCase, override_settings | ||
from django.conf import settings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorting? You can use isort.
bfeabba
to
4658109
Compare
@Anas12091101 would you mind squashing your commits? |
@timmc-edx we took the route of adding a feature flag to ease the changeover. Is this OK to merge from your perspective? After this merges, we'll open a DEPR for removing the md4 code and the feature flag, but it wouldn't be removed until after the Redwood or Sumac releases, depending on timing. |
4658109
to
76385ed
Compare
Yes, I like this approach, and have no reservations about it merging. |
@Anas12091101 🎉 Your pull request was merged! Please take a moment to answer a two question survey so we can improve your experience in the future. |
2U Release Notice: This PR has been deployed to the edX staging environment in preparation for a release to production. |
2U Release Notice: This PR has been deployed to the edX production environment. |
@@ -579,6 +579,9 @@ | |||
# .. toggle_creation_date: 2024-03-22 | |||
# .. toggle_tickets: https://github.com/openedx/edx-platform/pull/33911 | |||
'ENABLE_GRADING_METHOD_IN_PROBLEMS': False, | |||
|
|||
# See annotations in lms/envs/common.py for details. | |||
'ENABLE_BLAKE2B_HASHiNG': False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: Dictionary matching is going to be case-sensitive. This might cause errors.
'ENABLE_BLAKE2B_HASHiNG': False, | |
'ENABLE_BLAKE2B_HASHING': False, | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created the PR to fix this
Description
This PR replaces md4 with blake2b in memcache. At MIT, we recently encountered an issue after upgrading our base image from a debian-buster image to a debian-bookworm image. This uncovered an incompatibility between the newer version of openssl and one of the hash digests that edx-platform is using (1.1.1d-0+deb10u7). Basically, openssl + hashlib are saying that md4 is supported and available but if you actually try to invoke openssl md4 you’ll get an error saying it is not available. This PR resolves this issue by updating the hash function to something a bit newer and more widely supported.
Useful information to include:
Supporting information
https://discuss.openedx.org/t/openssl-upgrade-md4-hashing/12654
Testing instructions
python manage.py lms shell
Deadline
"None"
Other information
Include anything else that will help reviewers and consumers understand the change.
md4
support is broken openssl/openssl#21247