-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: update search index when course content is updated [FC-0040] #34391
feat: update search index when course content is updated [FC-0040] #34391
Conversation
Thanks for the pull request, @rpenido! Please note that it may take us up to several weeks or months to complete a review and merge your PR. Feel free to add as much of the following information to the ticket as you can:
All technical communication about the code itself will be done via the GitHub pull request interface. As a reminder, our process documentation is here. Please let us know once your PR is ready for our review and all tests are green. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work on this @rpenido ! I realized after reviewing that a lot of the changes on this PR diff are from @bradenmacdonald 's PR since this PR wasn't updated with upstream yet. None the less I think the code overall is really well structured and easy to follow!
Just had a small question/nit.
However in terms of testing, I'm not sure if its ready or not, but after briefly tried I was faced with some errors (maybe because the PR needs to be updated with upstream), so I just reviewed the code for now. @rpenido Let me know when its ready for another review/testing it out.
upsert_xblock_index_doc.delay( | ||
str(xblock_info.usage_key), | ||
recursive=True, # Update all children because the breadcrumb may have changed | ||
update_metadata=True, | ||
update_tags=False, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is update_tags
set to False
here? Is it because we are going to implement updating tags in the index separately in openedx/modular-learning#197 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! This handler is called when we receive an XBLOCK_UPDATED
event. If this block already has tags, I think the index will still be up to date with its information.
The upsert_xblock_index_doc
did not replace the document in the index; it only updates it with the new fields (https://www.meilisearch.com/docs/reference/api/documents#add-or-update-documents).
For the tagging data, I was planning to mirror the implementation we did here:
- Create an
OBJECT_TAGGED
event that is dispatched when we call thetag_object
api. - Create a new handler here that only updates the tags for the object and leaves the metadata as it is (
{recursive=False, update_metadata: False, updat_tags=True}
);
But I'm open to new ideas if you think otherwise! 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, as long as we are careful not to overwrite the tags
field with an empty list, this is a good approach. Generally the tags haven't changed when this event is called, so there's no reason to update them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it! Thanks for the clarification, and I agree I think it's a good approach. I'll follow it when working on openedx/modular-learning#197
34f5722
to
cab3af9
Compare
We have this temp PR (open-craft#645). It could be easier to review before we get #34310 merged.
Sorry for that @yusuf-musleh! I probably messed up something when I merged the changes from @bradenmacdonald. I will let you know when this is ready for testing again (probably for a final review). Also, I started writing some tests for the new API, but I'm not happy with them. As meilisearch is an external service, I need to mock many functions and only really test if the external API is called with the expected parameters. Maybe we should fire up a Meileisearch instance during the tests to actually test the feature (as we do with external services, like the database), but I'm not sure if this is the right time to do this (and I don't know how to do this in our CI suit either). I plan to continue with the current approach (mocking and checking calls) for the API and the handlers to get some coverage, but I think we should revisit this if the prototype becomes the default search engine. Let me know if you have a better idea for this! |
@rpenido I think this approach is fine, since we can make the assumption that Meilisearch will behave as expected after we manually tested it. However I can see the value of having integration tests to actually test that it behaves as expected in case there are breaking/unexpected changes on the Meilisearch side. Although that could be avoided by using the same working version vs updating to the latest version every time.
I agree, I think this can come at a later stage. As you suggested, we can potentially follow the footsteps of the mysqldb when the time comes. |
7c82a0b
to
bd28dec
Compare
@rpenido I agree that we should eventually run Meilisearch during tests. However, it's probably not necessary to implement that for this first studio search project which is considered experimental. Maybe you could add that to the ADR that I wrote? That for the experiment, we won't use Meilisearch during tests, but we expect to add that in the future if we move forward with replacing Elasticsearch completely. |
a2e9d2a
to
4a55c89
Compare
Hi @yusuf-musleh ! There is a test failing here, but I can't find the logs. I will look into it further tomorrow. Edit: fixed 5a6beafc86276aafa9a02ca895f49b9d691a2905 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rpenido Great work, the code is coming along quite well! The code looks good to me, however while testing the library aspects, I noticed a few issues:
- When I create a new empty library, nothing gets added to the index, is that expected?
- When I add a component/block to the library that gets added to the index, as expected, when I edit the blocks, eg: the name or the problem type (for problem blocks) the index does not get updated. However If I change the name of the library, thats when the blocks under it get updated in the index with the latest information. It's probably just a missing condition on how/when the docs should be updated in the index.
Thank you for the review @yusuf-musleh!
I think that it is expected. I commented on the first PR but I think it got lost in the review comments.
This is because the |
@yusuf-musleh I want to rebase and squash this PR, but you will need to rebase it and merge conflicts again after that too (it will happen anyway when this goes upstream). Let me know when you think it is a good time to do it! |
128e717
to
b7aef89
Compare
@rpenido @yusuf-musleh We aren't indexing the courses/libraries themselves. So it's expected that a newly created library won't appear in the results. |
b7aef89
to
45b6bf9
Compare
…course-content-is-changed
This is ready to merge but we're waiting on some unrelated issues with the NodeJS 18 upgrade to be solved first. |
Hi @bradenmacdonald ! I added this commit (612f32f) here. I deleted these libraries locally, and the code was working. In the sandbox, the error appeared again, so I think it would be better to skip the library if we found an error instead of crashing the reindex. |
```File"/openedx/edx-platform/openedx/core/djangoapps/content_libraries/api.py",line 617, in get_library_componentslearning_package.id,AttributeError: 'NoneType' object has no attribute'id'```
83d6823
to
612f32f
Compare
Thanks @rpenido - that's a great fix. |
@rpenido I was just about to merge this, but there are some conflicts with the "permissions" PR that I just merged. Could you please fix them? |
…course-content-is-changed
Working on it. The code changed a lot, so I will need to manually test it to make sure I did the merge right. |
@rpenido Take your time - it's important to get it right :) |
36c48a4
to
58f8ff4
Compare
Co-authored-by: Braden MacDonald <[email protected]>
This is ready for review and merge @bradenmacdonald. It also includes the changes from your PR: It may have impacted your PR @yusuf-musleh. |
@rpenido 🎉 Your pull request was merged! Please take a moment to answer a two question survey so we can improve your experience in the future. |
2U Release Notice: This PR has been deployed to the edX staging environment in preparation for a release to production. |
2U Release Notice: This PR has been deployed to the edX production environment. |
2U Release Notice: This PR has been deployed to the edX staging environment in preparation for a release to production. |
2U Release Notice: This PR has been deployed to the edX production environment. |
Description
This PR implements updating the search index with content metadata whenever they change
More Infomation
Testing Instructions
meilisearch
setup locally, follow the setup instructions here https://github.com/open-craft/tutor-contrib-meilisearchtutor dev run cms bash
and./manage.py cms reindex_studio
NOTE: Meilisearch seems to cache queries along with their results in the frontend, so if you simply search it might
show you stale data (network tab shows no requests), especially if you're searching for the same query. Make sure you refresh the Meilisearch dashboard (http://meilisearch.local.edly.io:7700/) and then perform the search.
Private ref: FAL-3690