From cb588738790c432c1e837e69f1547be80437e707 Mon Sep 17 00:00:00 2001
From: Braden MacDonald
Date: Sat, 21 Mar 2020 11:04:44 -0700
Subject: [PATCH] Cache Blockstore XBlock field data per thread, not per-process

The cache in the previous version of this code was unwittingly being
shared among all threads, and an occasional race condition would result
in the .children field of some XBlocks containing duplicate entries.

I tried to find other ways to keep the existing cache design and let it
be shared among all the threads (which would be more efficient), but I
couldn't find any clean way of doing it (and this code was not written
with the intention of being used in a multi-threaded way anyhow). So to
keep the fix simple, I made the block data cache thread-local instead of
process-local. That eliminated the bug.

Technical details: the big challenge with this code in the first place
is the parse_xml API of XBlocks, which instantiates the XBlock and
_then_ sets field data on it and returns it; there is no mechanism
available to distinguish that from instantiating an XBlock (using
previously loaded/cached field data) and then deliberately changing its
field values. In particular, parse_xml sets the 'children' field just by
calling self.children.append(...) but never explicitly initializes
self.children to [] first, so the field data store needs a mechanism for
setting self.children = [] when parsing begins, but it's hard to know
"when parsing begins" in general. This is made more challenging by the
fact that the BlockstoreFieldData design ties field data changes to the
XBlock instance itself (using a weak key), yet it's impossible to get a
reference to the XBlock instance before calling parse_xml. In addition,
BlockstoreFieldData uses a pass-through approach where fields that
aren't being actively changed are read from the cache; since it doesn't
know when children is being initialized vs. being modified, it would
sometimes pass through and start one thread's changes with the final
result from another thread.

Anyhow, the bottom line is that avoiding unintentional multithreading
here solves the problem. If we want the field data cache to be shared
among threads, it might as well be rewritten to use memcached and be
shared among all processes too. That would be a very good performance
boost but would take up a lot more memory in memcached. The rewrite may
also be challenging due to the aforementioned nuances of the XBlock
parse_xml / construct_xblock / add_node_as_child APIs. Perhaps modifying
the runtime to use a completely separate field data implementation for
parsing vs. loading a parsed+cached XBlock would do it.
---
 openedx/core/djangoapps/xblock/api.py | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/openedx/core/djangoapps/xblock/api.py b/openedx/core/djangoapps/xblock/api.py
index a3449a198f..6072278cfe 100644
--- a/openedx/core/djangoapps/xblock/api.py
+++ b/openedx/core/djangoapps/xblock/api.py
@@ -9,6 +9,7 @@ Studio APIs cover use cases like adding/deleting/editing blocks.
 """
 import logging
+import threading
 
 from django.urls import reverse
 from django.utils.translation import ugettext as _
@@ -38,15 +39,24 @@ def get_runtime_system():
     keep application startup faster, it's only initialized when first accessed
     via this method.
     """
-    # pylint: disable=protected-access
-    if not hasattr(get_runtime_system, '_system'):
+    # The runtime system should not be shared among threads, as there is currently a race condition when parsing XML
+    # that can lead to duplicate children.
+    # (In BlockstoreXBlockRuntime.get_block(), has_cached_definition(def_id) returns false so parse_xml is called, but
+    # meanwhile another thread parses the XML and caches the definition; then when parse_xml gets to XML nodes for
+    # child blocks, it appends them to the children already cached by the other thread and saves the doubled list of
+    # children; this happens only occasionally but is very difficult to avoid in a clean way due to the API of parse_xml
+    # and XBlock field data in general [does not distinguish between setting initial values during parsing and changing
+    # values at runtime due to user interaction], and how it interacts with BlockstoreFieldData. Keeping the caches
+    # local to each thread completely avoids this problem.)
+    cache_name = '_system_{}'.format(threading.get_ident())
+    if not hasattr(get_runtime_system, cache_name):
         params = dict(
             handler_url=get_handler_url,
             runtime_class=BlockstoreXBlockRuntime,
         )
         params.update(get_xblock_app_config().get_runtime_system_params())
-        get_runtime_system._system = XBlockRuntimeSystem(**params)
-    return get_runtime_system._system
+        setattr(get_runtime_system, cache_name, XBlockRuntimeSystem(**params))
+    return getattr(get_runtime_system, cache_name)
 
 
 def load_block(usage_key, user):
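
For reference, the thread-keyed memoization pattern the patch applies can
be reduced to a self-contained sketch. RuntimeSystem below is a
hypothetical stand-in for the real XBlockRuntimeSystem (which takes
handler/runtime parameters); only the caching trick is the same as in
the diff:

```python
import threading


class RuntimeSystem:
    """Hypothetical stand-in for XBlockRuntimeSystem; holds per-instance caches."""
    def __init__(self):
        self.cache = {}


def get_runtime_system():
    """
    Return this thread's RuntimeSystem, creating it on first access.

    Memoizing the instance under an attribute name that includes the
    current thread ID means each thread gets its own instance (and thus
    its own caches), so no cached field data is ever shared across
    threads and the XML-parsing race cannot occur.
    """
    cache_name = '_system_{}'.format(threading.get_ident())
    if not hasattr(get_runtime_system, cache_name):
        setattr(get_runtime_system, cache_name, RuntimeSystem())
    return getattr(get_runtime_system, cache_name)
```

One trade-off of keying on threading.get_ident(): instances belonging to
threads that have exited are never discarded, and Python may recycle a
thread identifier after its thread exits, so a new thread can inherit an
old thread's instance. That keeps the caches thread-isolated at any
given moment (which is all the fix needs) but leaks memory in
long-running servers; threading.local() would give the same isolation
with automatic cleanup, at the cost of restructuring the memoization.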