Discovery: tracking upstream-downstream content links #242

navinkarkera · 2024-12-10T04:32:42Z

Designs
Courses should display a list of the libraries whose content they use. (Not yet finalized)
Courses should display a list of library content blocks that were updated in their library after being imported.
Course outlines should display a notification when any library content under it has been updated in the original library.
Additional notes:
- Update or ignore all updates for components in a given course.
- Ability to sort components by library?

bradenmacdonald · 2024-12-10T19:19:33Z

@navinkarkera We also need to use this data for the opposite case: within libraries, showing all the courses in which a given component (or Unit etc.) is used.

bradenmacdonald · 2024-12-10T19:31:36Z

@kdmccormick and/or @ormsbee will probably be interested in reviewing this plan.

kdmccormick · 2024-12-13T17:14:51Z

Something to consider: This needs to be backwards-compatible with instances that are upgraded from Sumac to Teak, and thus may have some V2 Library References that are not persisted in these new tables. There are a couple ways to handle this. One would be to mandate a backfill migration on the Sumac->Teak upgrade. Another would be to accept that the Library Sync page may be missing some of these existing downstream-upstream connections, and provide a "Refresh" button which would check for them.

navinkarkera · 2024-12-16T05:18:20Z

@kdmccormick Thanks! I am actually thinking of tracking links in the Meilisearch index documents of course blocks instead of in new database tables. @pomegranited has yet to review it, but re-indexing courses should automatically fill in upstream links for all old blocks, and new ones are handled by signals/events.

Let me know if it is a terrible idea 😅

bradenmacdonald · 2024-12-16T19:15:43Z

@navinkarkera What's the advantage of using Meilisearch in that case? One hesitation I have is that so far all the core libraries, component, and upstream/downstream APIs don't have any dependency on Meilisearch (only the UI and the search APIs do), and I think it's better to stick with that until everyone is comfortable with using Meilisearch. For example, MIT and 2U still have some unanswered questions about its reliability and scalability so they aren't using it yet.

pomegranited · 2024-12-16T23:53:23Z

@navinkarkera @bradenmacdonald Ya.. I don't have a problem with using Meilisearch as a source for this info in the frontend, but I don't think it should be the authoritative source. We should always be able to re-create the search data from the database.

bradenmacdonald · 2024-12-17T02:20:46Z

@pomegranited Well I believe in this case it's not an authoritative source in any case. We can re-create the links at any time by scanning all the OLX (in modulestore+learning core). It's just very slow to do that.

My concern is more that this upstream-downstream tracking is going to be a pretty core functionality, and so I'm not sure it should depend on Meilisearch. But I'm open to the idea. I guess the upstream-downstream links themselves will continue to work just fine, so this is more about discovering them. I just want to know if there's really any advantage to using Meilisearch for this use case; if not, we might as well use MySQL.
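
A minimal sketch of the "scan all the OLX" rebuild mentioned above, assuming the upstream reference is stored as an XML attribute on each block. The attribute name `upstream`, the sample OLX, and the helper `find_upstream_links` are hypothetical and only illustrate the idea; a real scan would walk every block in modulestore and Learning Core, which is why it is slow.

```python
import xml.etree.ElementTree as ET

# Hypothetical OLX for one unit. The "upstream" attribute name and the
# library keys are assumptions made for this sketch.
SAMPLE_OLX = """
<vertical url_name="unit1">
  <problem url_name="p1" upstream="lb:Org:lib1:problem:p1"/>
  <html url_name="h1"/>
  <video url_name="v1" upstream="lb:Org:lib1:video:v1"/>
</vertical>
"""


def find_upstream_links(olx):
    """Return (block_type, url_name, upstream_key) for every linked block."""
    root = ET.fromstring(olx.strip())
    links = []
    # iter() visits the root and all descendants in document order.
    for node in root.iter():
        upstream = node.get("upstream")
        if upstream:
            links.append((node.tag, node.get("url_name"), upstream))
    return links


links = find_upstream_links(SAMPLE_OLX)
```

Because the links can always be rebuilt this way, whatever store ends up holding them (a MySQL table or a search index) acts only as a cache of what the OLX already says.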

navinkarkera · 2024-12-17T05:19:12Z

@pomegranited Like @bradenmacdonald pointed out, Meilisearch won't be the authoritative source for this data. The upstream_ref in the course block OLX remains authoritative, as it is now, so we can recreate the index at any time.

> Well I believe in this case it's not an authoritative source in any case. We can re-create the links at any time by scanning all the OLX (in modulestore+learning core). It's just very slow to do that.

@bradenmacdonald Yes, but the index will be updated as part of the current reindex process, which already goes through each XBlock's OLX and indexes it.

> My concern is more that this upstream-downstream tracking is going to be a pretty core functionality, and so I'm not sure it should depend on Meilisearch. But I'm open to the idea.

My understanding is that you need Meilisearch set up for libraries v2 to work, which in turn means that you need it for importing/linking content from libraries into courses.

> I just want to know if there's really any advantage to using Meilisearch for this use case; if not, we might as well use MySQL.

The biggest advantages would be read speed from the index and support for searching, filtering, and sorting the links.

I am not opposed to creating a database table, but to support search, filtering, and sorting it would be better to store the links in the index as well. Also, everything except the upstream ref (library usage key) can be derived, so storing it again is not really necessary (unless there are cases where we cannot use the index and need the information at decent speed).

bradenmacdonald · 2024-12-17T18:00:53Z

> My understanding is that you need meilisearch setup for libraries v2 to work, which in turn means that you need it for importing/linking content from it to courses.

@navinkarkera No - you need Meilisearch for the UI/frontend to work, but all the libraries v2 REST/python APIs (other than search, obviously) will work just fine without Meilisearch - including the upstream reference tracking. For example, someone could build an alternate UI for content libraries that doesn't have any search/filtering features, and just uses the "regular" libraries REST APIs and it would work perfectly fine.

> support for searching filtering and sorting the links.

Do we need that? I thought our only use case was looking up by either "downstream course ID", "upstream component ID", or "downstream component ID", which MySQL will support just fine.
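
A rough sketch of those three lookups against a relational table. In-memory SQLite stands in for MySQL here so the example is self-contained, and the table name, column names, and sample keys are all hypothetical rather than an agreed schema.

```python
import sqlite3

# SQLite stands in for MySQL; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE content_link (
        id INTEGER PRIMARY KEY,
        upstream_usage_key TEXT NOT NULL,
        downstream_usage_key TEXT NOT NULL UNIQUE,
        downstream_context_key TEXT NOT NULL
    );
    CREATE INDEX idx_link_upstream ON content_link (upstream_usage_key);
    CREATE INDEX idx_link_context ON content_link (downstream_context_key);
""")

# Made-up sample links: (upstream component, downstream block, downstream course).
rows = [
    ("lb:Org:lib1:problem:p1", "block-v1:Org+C1+R1+type@problem+block@a", "course-v1:Org+C1+R1"),
    ("lb:Org:lib1:problem:p1", "block-v1:Org+C2+R1+type@problem+block@b", "course-v1:Org+C2+R1"),
    ("lb:Org:lib1:html:h1", "block-v1:Org+C1+R1+type@html+block@c", "course-v1:Org+C1+R1"),
]
conn.executemany(
    "INSERT INTO content_link"
    " (upstream_usage_key, downstream_usage_key, downstream_context_key)"
    " VALUES (?, ?, ?)",
    rows,
)


def links_in_course(course_key):
    """Lookup by downstream course ID: upstream components used in a course."""
    cur = conn.execute(
        "SELECT upstream_usage_key FROM content_link"
        " WHERE downstream_context_key = ? ORDER BY upstream_usage_key",
        (course_key,),
    )
    return [row[0] for row in cur]


def courses_using(upstream_key):
    """Lookup by upstream component ID: courses that use a library component."""
    cur = conn.execute(
        "SELECT DISTINCT downstream_context_key FROM content_link"
        " WHERE upstream_usage_key = ? ORDER BY downstream_context_key",
        (upstream_key,),
    )
    return [row[0] for row in cur]


def upstream_of(downstream_key):
    """Lookup by downstream component ID: the upstream it links to, or None."""
    row = conn.execute(
        "SELECT upstream_usage_key FROM content_link WHERE downstream_usage_key = ?",
        (downstream_key,),
    ).fetchone()
    return row[0] if row else None
```

Each lookup is an equality match on a unique or indexed column, which is the kind of query a relational database handles well without a search engine.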

kdmccormick · 2024-12-17T18:34:50Z

Hm, interesting idea @navinkarkera . Thinking out loud:

- Persisting the content links to MySQL would allow us to make foreign keys against upstream-downstream links. I always saw this as something we would need, but now that we're challenging the idea, I can't actually think of anything we'd want to hang off the link table. Can anyone else? @ormsbee ?
- As has been pointed out, the OLX will always be the authoritative source, so regardless of which route we take, we'll want to be careful to update the link table/index whenever OLX is imported or a course is edited.
- Backfilling will also be important.
  - With MySQL, it'd probably be a custom migration.
  - How would this work with Meilisearch? Would we need to tell operators to rebuild indexes upon migrating to Teak?
- I see us discussing "do operators need meili for X", which I appreciate us taking into consideration. At the same time, does anyone know where we stand on making meili a platform-wide dependency, replacing ES? If that's already on the Teak roadmap, then we do not need to be factoring it into this decision.

bradenmacdonald · 2024-12-17T18:37:14Z

> I see us discussing "do operators need meili for X", which I appreciate us taking into consideration. At the same time, does anyone know where we stand on making meili a platform-wide dependency, replace ES? If that's already on the Teak roadmap, then we do not need to be factoring it into this decision.

I had hoped that people would test it out with Redwood and we'd be able to make that decision now. That hasn't happened. But with Sumac it's now default in Tutor and we're going to get a lot more feedback about using Meilisearch in production, so we'll hopefully know soon. However, some stakeholders are definitely uncomfortable with making Meilisearch a core dependency, so for now it's best to plan as if it's going to be optional/swappable where we can.

ormsbee · 2024-12-17T18:42:08Z

I would also favor having the link relationship expressed in a Django model to start, for all the reasons @bradenmacdonald mentioned.

navinkarkera · 2024-12-18T04:49:40Z

> build an alternate UI for content libraries that doesn't have any search/filtering features, and just uses the "regular" libraries REST APIs and it would work perfectly fine.

@bradenmacdonald Ohh, makes sense.

> Do we need that? I thought our only use case was looking up by either "downstream course ID", "upstream component ID", or "downstream component ID", which MySQL will support just fine.

The designs indicate a search bar and sort options for both libraries and library blocks used in course pages.

> Persisting the content links to mysql would allow us to make foreign keys against upstream-downstream links. I always saw this as something we would need, but now that we're challenging the idea, I can't actually think of anything we'd want to hang off the link table.

@kdmccormick ~~On this note, @pomegranited suggested that if we don't add foreign key links to this new table, it can be defined in learning core independent of edx-platform, is this something that we need?~~ I'll come back on this.

kdmccormick · 2025-01-02T19:36:42Z

Persisting the links in Learning Core is an interesting idea. It would force us not to assume that upstreams are always content libraries and that downstreams are always courses, and I think that'd be positive.

I don't see why that would preclude us from creating foreign keys to the new table, though? Anyway, curious what you come up with here.

navinkarkera · 2025-01-03T05:14:00Z

@kdmccormick We are planning to link the upstream (i.e. the PublishableEntity) to the new table, but we still need to store usage keys for both upstream and downstream, since that lets us store link information for course or library pairs that have not been imported yet.
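
A small sketch of that idea, using a plain dataclass rather than a real Django model. The names `ContentLink`, `upstream_entity_id`, and `resolve_links` are hypothetical. The point is only that both usage keys are always stored, while the reference to the upstream entity is optional and can be filled in once the library content exists.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContentLink:
    # Both usage keys are always stored, even if the upstream is not imported yet.
    upstream_usage_key: str
    downstream_usage_key: str
    # Stand-in for a nullable foreign key to the upstream PublishableEntity.
    upstream_entity_id: Optional[int] = None


def resolve_links(links, entity_ids_by_key):
    """Fill in the entity reference for links whose upstream now exists.

    entity_ids_by_key maps upstream usage keys to entity IDs (hypothetical).
    Returns the number of links that were newly resolved.
    """
    resolved = 0
    for link in links:
        if link.upstream_entity_id is None:
            entity_id = entity_ids_by_key.get(link.upstream_usage_key)
            if entity_id is not None:
                link.upstream_entity_id = entity_id
                resolved += 1
    return resolved


links = [
    ContentLink("lb:Org:lib1:problem:p1", "block-v1:Org+C1+R1+type@problem+block@a"),
    ContentLink("lb:Org:lib2:html:h1", "block-v1:Org+C1+R1+type@html+block@b"),
]
# Only lib1 has been imported so far, so only the first link can be resolved.
newly_resolved = resolve_links(links, {"lb:Org:lib1:problem:p1": 101})
```

Keeping the usage keys as plain strings means a link row stays valid and queryable regardless of import order, and the entity reference can be backfilled later.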

navinkarkera mentioned this issue Dec 10, 2024

Epic 8: Library sync page openedx/frontend-app-authoring#1097

Open

bradenmacdonald assigned navinkarkera Dec 10, 2024

bradenmacdonald transferred this issue from openedx/frontend-app-authoring Dec 10, 2024

bradenmacdonald mentioned this issue Dec 10, 2024

Implement backend for tracking upstream-downstream content links #243

Open

This was referenced Jan 8, 2025

feat: Legacy Libraries Migration (Prototype) openedx/edx-platform#35758

Closed

feat: Legacy Libraries Migration (WIP) openedx/edx-platform#36083

Draft
