Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: upstream_block ADR #34322

Closed
wants to merge 1 commit into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,89 @@
4. Persist the relationship between blocks and their upstream source
####################################################################
Comment on lines +1 to +2
Copy link
Member Author

@kdmccormick kdmccormick May 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ormsbee @bradenmacdonald Generally, would you guys like to see this system kept within edx-platform, or built into the XBlock framework?

My initial instinct was to add the upstream_block[_version] fields to AuthoringMixin since only CMS needs to worry about synchronizing updates. But, both CMS and LMS will need to understand that upstream_block.{} values are to be interpreted as defaults, so the concept of "upstream" will bleed out of CMS in at least that sense.

Still, just like we do with Inheritance, we could keep this within edx-platform by making a special UpstreamDefaultsFieldData and UpstreamDefaultsKeyValueStore classes. Or, we could bake upstream-default logic into XBlock. I'm leaning the latter, personally (although I would still like to keep the upstream-sync logic isolated into a corner of CMS).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with upstream fields as an XBlock concept, but the syncing process as something that stays in Studio.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I think field data is an area where we are Too Clever For Our Own Good™️ . 😛

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with upstream fields as an XBlock concept, but the syncing process as something that stays in Studio.

Sounds good to me too.


Status
******

Accepted

Pending implementation in (TODO - link)

Context
*******

* TODO - context

Decision
********

* The following two fields will be added to every XBlock in CMS:

.. code-block:: python

upstream_block = String(
scope=Scope.settings,
help=(
"The usage key of a the block that served as this block's template, if one exists. "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: of a the block

But also, is "template" the right word? To me that generally implies a one-time copy-and-then-modify approach, but using library content is generally more clone-and-keep-in-sync.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. How does this look @bradenmacdonald ? I also added a note about nonexistent upstream blocks.

The usage key of a block (generally within a Content Library) which serves as a source of upstream
updates for this block, or None if there is no such upstream. Please note: It is valid for upstream_block
to hold a usage key for a block that does not exist (or does not *yet* exist) on this instance,
particularly if this block was imported from a different instance.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that, @kdmccormick 👍🏻

"Generally, the upstream_block is within a Content Library."
),
hidden=True,
default=None,
...
)
upstream_block_version = Integer(
scope=Scope.settings,
help=(
"The upstream_block's version number, at the time this block was created from it. "
"If this version is older than the upstream_block's latest version, then CMS will "
"allow this block to fetch updated content from upstream_block."
),
hidden=True,
default=None,
...
)

* The upstream-defined values of XBlock settings will be exposed in the XBlock API (TODO - details)

* The upstream-defined values will be set when the XBlock is initialized by (TODO - figure this out)

* The existing ``get_block_original_usage`` Modulestore methods will (TODO - figure this out)

* New attributes will be serialized into the OLX of blocks with upstreams:

* The two new XBlock fields will serialize into OLX as attributes, like any settings-scoped field.
* The upstream defaults will use a new special syntax: ``upstream.$FIELD_NAME``.
* The period (``.``) was explicitly chosen to ensure that these special attributes cannot possibly conflict with any Python-defined XBlock fields.

.. code-block:: xml

<problem
display_name="A title that has been customized in the course"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kdmccormick, @bradenmacdonald: Something that came up in a recent discussion is how a display_name setting would look for an item that was copied into a course generally, items that are pulled in as a child of a LibraryContentBlock would have to have their display_name set to an empty string because it's the LibraryContentBlock that will be providing the display_name in those cases.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ormsbee breaking this down by static vs randomized reference:

  • static reference: we're not using LibraryContentBlock any more, so it seems fine to use the upstream display_name, and allow the course author to override it to "", right?
    • Note to self: Need to make sure we handle "" as a legitimate override, rather than accidentally falling back to the upstream default just because empty string is falsy.
  • randomized reference: This is now just an ItemPool wrapping a bunch of static references. I could imagine that a course author would want to use the ItemPool's display_name rather than each of its children's display names. Some ideas for how to handle this:
    • A show_item_display_names boolean setting on ItemPool; when False, it renders its children without their titles.
    • A button on the ItemPool editor which, in one swoop, overrides all of its children's titles to "". This is nice because it's totally at the UI level; no special data changes needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • static reference: we're not using LibraryContentBlock any more, so it seems fine to use the upstream display_name, and allow the course author to override it to "", right?

By this, you mean that there's going to be an upstream.display_name that has the original library-given name, and then a display_name="" set on the block in the course, right? If so, 👍

randomized reference: This is now just an ItemPool wrapping a bunch of static references. I could imagine that a course author would want to use the ItemPool's display_name rather than each of its children's display names. Some ideas for how to handle this:

  • A show_item_display_names boolean setting on ItemPool; when False, it renders its children without their titles.
  • A button on the ItemPool editor which, in one swoop, overrides all of its children's titles to "". This is nice because it's totally at the UI level; no special data changes needed.

I think either of these is fine from the technical side, and it's mostly a matter of UX preferences. That being said, I suspect we'll end up with something more like the show_item_display_names entry, because those display names might be useful in other contexts, e.g. "rescore problem X for all students because there was a bug we fixed" or "how did people perform on problem Y"?

max_attempts=2
upstream_block="lb:myorg:mylib:problem:p1"
upstream_block_version="12"
upstream_block.display_name="The title that was defined in the library block"
upstream_block.max_attempts=3
>
<!-- problem content would go here -->
</problem>

* Existing LibraryContentBlock children will be missing these attributes, and we will need to handle that (TODO - more details)

Consequences
************
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's an extra set of consequences here for defaults and inheritance:

  • We'll need to modify the XBlock runtime so that it grabs value in the order: explicitly set on block, inherited via parent, upstream.
  • We should be able to (eventually) remove the special case handling that InheritingFieldData makes for library_content. Honestly, there's so much weirdness around how LibraryContent currently work that I'd rather make a new block type entirely rather than bridging the old stuff and have something external to the two do the conversion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also going to go out on a limb and say that upstream.{} tags are not inheritable. 😛

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thing I thought of: If we do this, I think it would be really valuable to ensure that [v2] library exports encode the full library ID in the export .tar.gz file. Currently, the convention is that Open edX exports don't include the ID of the thing you're exporting, and in fact you usually have to create an empty course/library with a new ID before you import. But in the case of libraries, we're going to want to encourage situations where IDs match across instances, and I'm thinking that a good way to do that is to include the full ID of the library and change the import workflow so that by default when you import a library to another instance, it has the same ID, and thus the upstream_block references will be valid. Of course, it won't break much if they're not valid, but the ability to preserve the upstream links is nice.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, course and library exports both have their context key data encoded into the to attributes of the root document (so course.xml or library.xml).

So for instance, the root course.xml for course-v1:edX+DemoX+Demo_Course would look like:

<course url_name="Demo_Course" org="edX" course="DemoX"/>

A long time ago (back in the XML-only days), this mattered more because we'd use it to encode multiple course runs into the same course git repo.

So we have it, but we just ignore it on import because the import mechanism is only available within the context of a specific library or course. I agree that a workflow for copying things directly without first creating a shell library could be useful. I still think that having cross-instance library references is worth looking into though. It's harder to do well, but I think it follows more closely to the workflow that people actually want to utilize.

All of which can wait until after Redwood. But yeah, let's definitely encode the full context key info into the export.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI the management commands to import courses and libraries still look at course.xml and library.xml to determine the target key (example invocation in Tutor--notice how no target key is provided).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL :)

Copy link
Member Author

@kdmccormick kdmccormick May 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ormsbee @bradenmacdonald

Coming back to this consideration about upstreams and inheritance interacting... oh, this is interesting.

Even ignoring the legacy LibraryContentBlock weirdness that we want to get rid of... if we wanted fields to be inheritable and upstream-defaultable, then I think we'd need to handle it something like this (psuedo-python):

def get_value(block, field) -> Value|UNSET:
  block_value = block[field]                           # Priority 1: value on the block
  if block_value is not UNSET:
    return block_value
  if field.is_inheritable and block.has_parent:        # Priority 2: inherited value
     parent_value = get_value(block.parent, field)
     if parent_value is not UNSET:
        return parent_value
  return get_default(block, field)                     # Priority 3: computed default value (below)

def get_default(block, field) -> Value|UNSET:
  if field.is_inheritable and block.has_parent:
    parent_default = get_default(block.parent, field)  # Priority 3a: inherited default
    if parent_default is not UNSET:
        return parent_default 
  if block.has_upstream:                               # Priority 3b: default from upstream
    upstream_value = block.upstream_values[field]
    if upstream_value is not UNSET:
      return upstream_value
  return field.default                                 # Priority 3c: platform-defined field default

I'm also going to go out on a limb and say that upstream.{} tags are not inheritable. 😛

I like this in theory, but what are the implications? Is it that we forbid any field on InheritanceMixin from being set in a content library, and we filter out InheritanceMixin fields when copying library-defined settings into a block's upstream.{} defaults? Will that work out OK when we have units and subsections defined in content libraries? And is that backwards-compatible with today's LibraryContentBlock defaults?

Don't get me wrong-- I hate field inheritance :) I can't wait for the day where this automagical inheritance is replaced with explicit business logic in Learning Core. I just want to make sure don't introduce new edge cases in the meantime.


* LibraryContentBlock consequences:

* Previous ADR (TODO - link) on LibraryContentBlock schema is null and void.
* Statically-referenced library content will be direct children of the Unit with no LibraryContentBlock wrapper.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do wonder how far we want to eventually go with this–e.g. allow references from courses, or from content on different servers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering the same 😄 We've taken this thing that's been all tangled up with the RandomizedContentModule, and defined a block-type agnostic interface for it (upstream_block* fields), so I'm cautiously optimistic that we could implement some pretty crazy synchronization strategies in the future without ever having to muck up modulestore or any block type.

* LibraryContentBlock will only be used for V1 libraries and V2 randomized problem banks.
* Eventually, we will deprecate V1 libraries and/or port them to V2 randomized problem banks.
* Long-term, we will remove LibraryContentBlock in favor of a Unit compositor.
* However, the ``<library_content>`` OLX tag will still be used for randomized content. Would be nice to rename this to ``<randomized>`` ?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Current terminology is now "Problem Bank", but I think we still need to talk about that one, esp. since some of the things inside of it won't be problems. I'm still rooting for Item Pool.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ormsbee As long as we're consistent within tech-land and product-land, I'm always happy to have a separate tech-term and product-term for the same piece data, especially since the product terms are subject to change as users try things and give feedback, whereas the technical meaning isn't likely to change. So, I'm favor of going with <item_pool>, which IMO is the best description of its semantics. If product decides to brand it as Problem Pool or Problem Bank, that's totally fine. Same deal with <sequential> vs Subsection.


* Course-library interaction consequences:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this approach also leaves copy-paste in an odd place when it comes to copying something from a Library and Pasting it into a course–i.e. should the pasting be smart enough to make the connection between the two and transform all the fields into upstream.{} fields? Not blocking, and it depends on product input, but it might be worth mentioning here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could prompt the user when they paste from a library. "Do you want to paste this as linked content, that stays in sync with the original library version?"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I'll note that possibility here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related: for "regular" (non-randomized) blocks linked from a library into a course that have the upstream_block attribute, it would now be very easy to provide a method for users to "unlink" them from the library and convert them to normal, editable course blocks. We simply remove the upstream_block attribute. (For content which is a child of a LibraryContentBlock/ItemPool/ProblemBank, it's not as simple though.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻 I'll mention all these as potential ways that the product could use the new fields


* Library-defined settings values will now load correctly, whether or not backing library exists. This is good news for courses with library content which need to be imported into different instances.
* CMS never needs to look up content from older versions of libraries.
* Library content in courses can now be copy-pasted and duplicated without refreshing from the library. That means that the copy-paste/duplicate operation will copy the blocks as they exist in the course, and later pulling updates down from a library will preserve any student state on those blocks.
* The slugs of course blocks from libraries can now be set to anything. Previously, they had be meticiulously set so that pulling updates down from the library didn't clobber them.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ This will be a huge improvement

Loading