docs: upstream block ADR, take 2 (#35421)
This commit is contained in:
402
docs/decisions/0020-upstream-downstream.rst
Normal file
402
docs/decisions/0020-upstream-downstream.rst
Normal file
@@ -0,0 +1,402 @@
|
||||
4. Upstream and downstream content
|
||||
##################################
|
||||
|
||||
Status
|
||||
******
|
||||
|
||||
Accepted.
|
||||
|
||||
Implementation in progress as of 2024-09-03.
|
||||
|
||||
Context
|
||||
*******
|
||||
|
||||
We are replacing the existing Legacy ("V1") Content Libraries system, based on
|
||||
ModuleStore, with a Relaunched ("V2") Content Libraries system, based on
|
||||
Learning Core. V1 and V2 libraries will coexist for at least one release to
|
||||
allow for migration; eventually, V1 libraries will be removed entirely.
|
||||
|
||||
Content from V1 libraries can only be included into courses using the
|
||||
LibraryContentBlock (called "Randomized Content Module" in Studio), which works
|
||||
like this:
|
||||
|
||||
* Course authors add a LibraryContentBlock to a Unit and configure it with a
|
||||
library key and a count of N library blocks to select (or `-1` for "all
|
||||
blocks").
|
||||
|
||||
* For each block in the chosen library, its *content definition* is copied into
|
||||
the course as a child of the LibraryContentBlock, whereas its *settings* are
|
||||
copied into a special "default" settings dictionary in the course's structure
|
||||
document--this distinction will matter later. The usage key of each copied
|
||||
block is derived from a hash of the original library block's usage key plus
|
||||
the LibraryContentBlock's own usage key--this will also matter
|
||||
later.
|
||||
|
||||
* The course author is free to override the content and settings of the
|
||||
course-local copies of each library block.
|
||||
|
||||
* When any update is made to the library, the course author is prompted to
|
||||
update the LibraryContentBlock. This involves re-copying the library blocks'
|
||||
content definitions and default settings, which clobbers any overrides they
|
||||
have made to content, but preserves any overrides they have made to settings.
|
||||
Furthermore, any blocks that were added to the library are newly copied into
|
||||
the course, and any blocks that were removed from the library are deleted
|
||||
from the course. For all blocks, usage keys are recalculated using the same
|
||||
hash derivation described above; for existing blocks, it is important that
|
||||
this recalculation yields the same usage key so that student state is not
|
||||
lost.
|
||||
|
||||
* Over in the LMS, when a learner loads LibraryContentBlock, they are shown a
|
||||
list of N randomly-picked blocks from the library. Subsequent visits show
|
||||
them the same list, *unless* children were added, children were removed, or N
|
||||
changed. In those cases, the LibraryContentBlock tries to make the smallest
|
||||
possible adjustment to their personal list of blocks while respecting N and
|
||||
the updated list of children.
|
||||
|
||||
This system has several issues:
|
||||
|
||||
#. **Missing defaults after import:** When a course with a LibraryContentBlock
|
||||
is imported into an Open edX instance *without* the referenced library, the
|
||||
blocks' *content* will remain intact as will course-local *settings
|
||||
overrides*. However, any *default settings* defined in the library will be
|
||||
missing. This can result in content that is completely broken, especially
|
||||
since critical fields like video URLs and LTI URLs are considered
|
||||
"settings". For a detailed scenario, see `LibraryContentBlock Curveball 1`_.
|
||||
|
||||
#. **Strange behavior when duplicating content:** Typically, when a
|
||||
block is duplicated or copy-pasted, the new block's usage key and its
|
||||
children's usage keys are randomly generated. However, recall that when a
|
||||
LibraryContentBlock is updated, its children's usage keys are rederived
|
||||
using a hash function. That would cause the children's usage keys to change,
|
||||
thus destroying any student state. So, we must work around this with a hack:
|
||||
upon duplicating or pasting a LibraryContentBlock, we immediately update the
|
||||
LibraryContentBlock, thus discarding the problematic randomly-generated keys
|
||||
in favor of hash-derived keys. This works, but:
|
||||
|
||||
* it involves weird code hacks,
|
||||
* it unexpectedly discards any content overrides the course author made to
|
||||
the copied LibraryContentBlock's children,
|
||||
* it unexpectedly uses the latest version of library content, regardless of
|
||||
which version the copied LibraryContentBlock was using, and
|
||||
* it fails if the library does not exist on the Open edX instance, which
|
||||
can happen if the course was imported from another instance.
|
||||
|
||||
#. **Conflation of reference and randomization:** The LibraryContentBlock does
|
||||
two things: it connects courses to library content, and it shows users a
|
||||
random subset of content. There is no reason that those two features need to
|
||||
be coupled together. A course author may want to randomize course-defined
|
||||
content, or they may want to randomize content from multiple different
|
||||
libraries. Or, they may want to use content from libraries without
|
||||
randomizing it at all. While it is feasible to support all these things in a
|
||||
single XBlock, trying to do so led to a `very complicated XBlock concept`_
|
||||
which difficult to explain to product managers and other engineers.
|
||||
|
||||
#. **Unpredictable preservation of overrides:** Recall that *content
|
||||
definitions* and *settings* are handled differently. This distinction is
|
||||
defined in the code: every authorable XBlock field is either defined with
|
||||
`Scope.content` or `Scope.settings`. In theory, XBlock developers would use
|
||||
the content scope for fields that are core to the meaning of piece of
|
||||
content, and they would only use the settings scope for fields that would be
|
||||
reasonable to configure in a local copy of the piece of content. In
|
||||
practice, though, XBlock developers almost always use `Scope.settings`. The
|
||||
result of this is that customizations to blocks *almost always* survive
|
||||
through library updates, except when they don't. Course authors have no way
|
||||
to know (or even guess) when their customizations they will and won't
|
||||
survive updates.
|
||||
|
||||
#. **General pain and suffering:** The relationship between courses and V1
|
||||
libraries is confusing to content authors, site admins, and developers
|
||||
alike. The behaviors above toe the line between "quirks" and "known bugs",
|
||||
and they are not all documented. Past attempts to improve the system have
|
||||
`triggered series of bugs`_, some of which led to permanent loss of learner
|
||||
state. In other cases, past Content Libraries improvement efforts have
|
||||
slowed or completely stalled out in code review due to the overwhelming
|
||||
amount of context and edge cases that must be understood to safely make any
|
||||
changes.
|
||||
|
||||
.. _LibraryContentBlock Curveball 1: https://openedx.atlassian.net/wiki/spaces/COMM/pages/3966795804/Fun+with+LibraryContentBlock+export+import+and+duplication#Curveball-1%3A-Import%2FExport
|
||||
.. _LibraryContentBlock Curveball 2: https://openedx.atlassian.net/wiki/spaces/COMM/pages/3966795804/Fun+with+LibraryContentBlock+export+import+and+duplication#Curveball-2:-Duplication
|
||||
.. _very complicated XBlock concept: https://github.com/openedx/edx-platform/blob/master/xmodule/docs/decisions/0003-library-content-block-schema.rst
|
||||
.. _triggered series of bugs: https://openedx.atlassian.net/wiki/spaces/COMM/pages/3858661405/Bugs+from+Content+Libraries+V1
|
||||
|
||||
We are keen to use the Library Relaunch project to address all of these
|
||||
problems. So, V2 libraries will interop with courses using a completely
|
||||
different data model.
|
||||
|
||||
|
||||
Decision
|
||||
********
|
||||
|
||||
We will create a framework where a *downstream* piece of content (e.g. a course
|
||||
block) can be *linked* to an *upstream* piece of content (e.g., a library
|
||||
block) with the following properties:
|
||||
|
||||
* **Portable:** Links can refer to certain content on the current Open edX
|
||||
instance, and in the future they may be able to refer to content on other
|
||||
Open edX instances or sites. Links will never include information that is
|
||||
internal to a particular Open edX instance, such as foreign keys.
|
||||
|
||||
* **Flat:** The *link* is a not a wrapper (like the LibraryContentBlock),
|
||||
but simply a piece of metadata directly on the downstream content which
|
||||
points to the upstream content. We will no longer rely on precarious
|
||||
hash-derived usage keys to establish connection to upstream blocks;
|
||||
like any other block, an upstream-linked blocks can be granted whatever block
|
||||
ID that the authoring environment assigns it, whether random or
|
||||
human-readable.
|
||||
|
||||
* **Forwards-compatible:** If downstream content is created in a course on
|
||||
an Open edX site that supports upstream and downstreams (e.g., a Teak
|
||||
instance), and then it is exported and imported into a site that doesn't
|
||||
(e.g., a Quince instance), the downstream content will simply act like
|
||||
regular course content.
|
||||
|
||||
* **Independent:** Upstream content and downstream content exist separately
|
||||
from one another:
|
||||
|
||||
* Modifying upstream content does not affect any downstream content (unless a
|
||||
sync happens, more on that later).
|
||||
* Deleting upstream content does not impact its downstream content. By
|
||||
corollary, pieces of downstream content can completely and correctly render
|
||||
on Open edX instances that are missing their linked upstream content.
|
||||
* (Preserving a positive feature of the V1 LibraryContentBlock) The link
|
||||
persists through export-import and copy-paste, regardless of whether the
|
||||
upstream content actually exists. A "broken" link to upstream content is
|
||||
seamlessly "repaired" if the upstream content becomes available again.
|
||||
|
||||
* **Customizable:** On an OLX level, authors can still override the value
|
||||
of any field for a piece of downstream content. However, we will empower
|
||||
Studio to be more prescriptive about what authors *can* override versus what
|
||||
they *should* override:
|
||||
|
||||
* We define a set of *customizable* fields, with platform-level defaults
|
||||
like display_name and a max_attempts, plus the ability for external
|
||||
XBlocks to opt their own fields into customizability.
|
||||
* Studio may use this list to provide an interface for customizing
|
||||
downstream blocks, separate from the usual "Edit" interface that would
|
||||
permit them to make unsafe overrides.
|
||||
* Furthermore, downstream content will record which fields the user has
|
||||
customized...
|
||||
|
||||
* even if the customization is to simply clear the value of the fields...
|
||||
* and even if the customization is made redundant in a future version of
|
||||
the upstream content. For example, if max_attempts is customized from 3
|
||||
to 5 in the downstream content, but the next version of the upstream
|
||||
content also changes max_attempts to 5, the downstream would still
|
||||
consider max_attempts to be customized. If the following version of the
|
||||
upstream content again changed max_attempts to 6, the downstream would
|
||||
retain max_attempts to be 5.
|
||||
|
||||
* Finally, the downstream content will locally save the upstream value of
|
||||
customizable fields, allowing the author to *revert* back to them
|
||||
regardless of whether the upstream content is actually available.
|
||||
|
||||
* **Synchronizable, without surprises:** Downstream content can be *synced*
|
||||
with updates that have been made to its linked upstream. This means that the
|
||||
latest available upstream content field values will entirely replace all of
|
||||
the downstream field values, *except* those which were customized, as
|
||||
described in the previous item.
|
||||
|
||||
* **Concrete, but flexible:** The internal implementation of upstream-downstream
|
||||
syncing will assume that:
|
||||
|
||||
* upstream content belongs to a V2 content library,
|
||||
* downstream content belongs to a course on the same instance, and
|
||||
* the link is the stringified usage key of the upstream library content.
|
||||
|
||||
This will allow us to keep the implementation straightforward. However, we
|
||||
will *not* expose these assumptions in the Python APIs, the HTTP APIs, or in
|
||||
the persisted fields, allowing us in the future to generalize to other
|
||||
upstreams (such as externally-hosted libraries) and other downstreams (such
|
||||
as a standalone enrollable sequence without a course).
|
||||
|
||||
If any of these assumptions are violated, we will raise an exception or log a
|
||||
warning, as appropriate. Particularly, if these assumptions are violated at
|
||||
the OLX level via a course import, then we will probably show a warning at
|
||||
import time and refuse to sync from the unsupported upstream; however, we
|
||||
will *not* fail the entire import or mangle the value of upstream link, since
|
||||
we want to remain forwards-compatible with potential future forms of syncing.
|
||||
As a concrete example: if a course block has *another course block's usage
|
||||
key* as an upstream, then we will faithfully keep that value through the
|
||||
import and export process, but we will not prompt the user to sync updates
|
||||
for that block.
|
||||
|
||||
* **Decoupled:** Upstream-downstream linking is not tied up with any other
|
||||
courseware feature; in particular, it is unrelated to content randomization.
|
||||
Randomized library content will be supported, but it will be a *synthesis* of
|
||||
two features: (1) a RandomizationBlock that randomly selects a subset of its
|
||||
children, where (2) some or all of those children are linked to upstream
|
||||
blocks.
|
||||
|
||||
Consequences
|
||||
************
|
||||
|
||||
To support the Libraries Relaunch in Sumac:
|
||||
|
||||
* For every XBlock in CMS, we will use XBlock fields to persist the upstream
|
||||
link, its versions, its customizable fields, and its set of downstream
|
||||
overrides.
|
||||
|
||||
* We will avoid exposing these fields to LMS code.
|
||||
|
||||
* We will define an initial set of customizable fields for Problem, Text, and
|
||||
Video blocks.
|
||||
|
||||
* We will define method(s) for syncing update on the XBlock runtime so that
|
||||
they are available in the SplitModuleStore's XBlock Runtime
|
||||
(CachingDescriptorSystem).
|
||||
|
||||
* Either in the initial implementation or in a later implementation, it may
|
||||
make sense to declare abstract versions of the syncing method(s) higher up
|
||||
in XBlock Runtime inheritance hierarchy.
|
||||
|
||||
* We will expose a CMS HTTP API for syncing updates to blocks from their
|
||||
upstreams.
|
||||
|
||||
* We will avoid exposing this API from the LMS.
|
||||
|
||||
For reference, here are some excerpts of a potential implementation. This may
|
||||
change through development and code review.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
###########################################################################
|
||||
# cms/lib/xblock/upstream_sync.py
|
||||
###########################################################################
|
||||
|
||||
class UpstreamSyncMixin(XBlockMixin):
|
||||
"""
|
||||
Allows an XBlock in the CMS to be associated & synced with an upstream.
|
||||
Mixed into CMS's XBLOCK_MIXINS, but not LMS's.
|
||||
"""
|
||||
|
||||
# Metadata related to upstream synchronization
|
||||
upstream = String(
|
||||
help=("""
|
||||
The usage key of a block (generally within a content library)
|
||||
which serves as a source of upstream updates for this block,
|
||||
or None if there is no such upstream. Please note: It is valid
|
||||
for this field to hold a usage key for an upstream block
|
||||
that does not exist (or does not *yet* exist) on this instance,
|
||||
particularly if this downstream block was imported from a
|
||||
different instance.
|
||||
"""),
|
||||
default=None, scope=Scope.settings, hidden=True, enforce_type=True
|
||||
)
|
||||
upstream_version = Integer(
|
||||
help=("""
|
||||
Record of the upstream block's version number at the time this
|
||||
block was created from it. If upstream_version is smaller
|
||||
than the upstream block's latest version, then the user will be
|
||||
able to sync updates into this downstream block.
|
||||
"""),
|
||||
default=None, scope=Scope.settings, hidden=True, enforce_type=True,
|
||||
)
|
||||
downstream_customized = Set(
|
||||
help=("""
|
||||
Names of the fields which have values set on the upstream
|
||||
block yet have been explicitly overridden on this downstream
|
||||
block. Unless explicitly cleared by the user, these
|
||||
customizations will persist even when updates are synced from
|
||||
the upstream.
|
||||
"""),
|
||||
default=[], scope=Scope.settings, hidden=True, enforce_type=True,
|
||||
)
|
||||
|
||||
# Store upstream defaults for customizable fields.
|
||||
upstream_display_name = String(...)
|
||||
upstream_max_attempts = List(...)
|
||||
... # We will probably want to pre-define several more of these.
|
||||
|
||||
def get_upstream_field_names(cls) -> dict[str, str]:
|
||||
"""
|
||||
Mapping from each customizable field to field which stores its upstream default.
|
||||
XBlocks outside of edx-platform can override this in order to set
|
||||
up their own customizable fields.
|
||||
"""
|
||||
return {
|
||||
"display_name": "upstream_display_name",
|
||||
"max_attempts": "upstream_max_attempts",
|
||||
}
|
||||
|
||||
def save(self, *args, **kwargs):
|
||||
"""
|
||||
Update `downstream_customized` when a customizable field is modified.
|
||||
Uses `get_upstream_field_names` keys as the list of fields that are
|
||||
customizable.
|
||||
"""
|
||||
...
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class UpstreamInfo:
|
||||
"""
|
||||
Metadata about a block's relationship with an upstream.
|
||||
"""
|
||||
usage_key: UsageKey
|
||||
current_version: int
|
||||
latest_version: int | None
|
||||
sync_url: str
|
||||
error: str | None
|
||||
|
||||
@property
|
||||
def sync_available(self) -> bool:
|
||||
"""
|
||||
Should the user be prompted to sync this block with upstream?
|
||||
"""
|
||||
return (
|
||||
self.latest_version
|
||||
and self.current_version < self.latest_version
|
||||
and not self.error
|
||||
)
|
||||
|
||||
|
||||
###########################################################################
|
||||
# xmodule/modulestore/split_mongo/caching_descriptor_system.py
|
||||
###########################################################################
|
||||
|
||||
class CachingDescriptorSystem(...):
|
||||
|
||||
def validate_upstream_key(self, usage_key: UsageKey | str) -> UsageKey:
|
||||
"""
|
||||
Raise an error if the provided key is not a valid upstream reference.
|
||||
Instead of explicitly checking whether a key is a LibraryLocatorV2,
|
||||
callers should validate using this function, and use an `except` clause
|
||||
to handle the case where the key is not a valid upstream.
|
||||
Raises: InvalidKeyError, UnsupportedUpstreamKeyType
|
||||
"""
|
||||
...
|
||||
|
||||
def sync_from_upstream(self, *, downstream_key: UsageKey, apply_updates: bool) -> None:
|
||||
"""
|
||||
Python API for loading updates from upstream block.
|
||||
Can choose whether or not to actually apply those updates...
|
||||
apply_updates=False: Think "get fetch".
|
||||
Use case: course import.
|
||||
apply_updates=True: Think "git pull".
|
||||
Use case: sync_updates handler.
|
||||
Raises: InvalidKeyError, UnsupportedUpstreamKeyType, XBlockNotFoundError
|
||||
"""
|
||||
...
|
||||
|
||||
def get_upstream_info(self, downstream_key: UsageKey) -> UpstreamInfo | None:
|
||||
"""
|
||||
Python API for upstream metadata, or None.
|
||||
Raises: InvalidKeyError, XBlockNotFoundError
|
||||
"""
|
||||
...
|
||||
|
||||
Finally, here is what the OLX for a library-sourced Problem XBlock in a course
|
||||
might look like:
|
||||
|
||||
.. code-block:: xml
|
||||
|
||||
<problem
|
||||
display_name="A title that has been customized in the course"
|
||||
max_attempts="2"
|
||||
upstream="lb:myorg:mylib:problem:p1"
|
||||
upstream_version="12"
|
||||
downstream_customized="["display_name","max_attempts"]"
|
||||
upstream_display_name="The title that was defined in the library block"
|
||||
upstream_max_attempts="3"
|
||||
>
|
||||
<!-- problem content would go here -->
|
||||
</problem>
|
||||
@@ -5,9 +5,9 @@ Evolving the library_content block schema
|
||||
Status
|
||||
******
|
||||
|
||||
**Provisional**
|
||||
**Replaced** by the `Upstream-Downstream ADR`_.
|
||||
|
||||
Subject to change due to implementation learnings and stakeholder feedback.
|
||||
.. _Upstream-Downstream ADR: https://docs/decisions/0020-upstream-block.rst
|
||||
|
||||
Context
|
||||
*******
|
||||
|
||||
Reference in New Issue
Block a user