The Course Info Blocks API endpoint has been known to be rather slow
to return the response. Previous investigation showed that the major
time sink was the get_course_blocks function, which is called three
times in a single request. This commit aims to improve the response
times by reducing the number of times that this function is called.
Solution Summary
The first time the function get_course_blocks is called, the result
(transformed course blocks) is stored in the current WSGI request
object. Later in the same request, before the second get_course_blocks
call is triggered, the already transformed course blocks are taken
from the request object, and if they are available, get_course_blocks
is not called (if not, it is called as a fallback). Later in the
request, the function is called again as before (see Optimization
Strategy and Difficulties).
Optimization Strategy and Difficulties
The original idea was to fetch and transform the course blocks once
and reuse them in all three cases, which would reduce get_course_blocks
call count to 1. However, this did not turn out to be a viable solution
because of the arguments passed to get_course_blocks. Notably, the
allow_start_dates_in_future boolean flag affects the behavior of
StartDateTransformer, which is a filtering transformer modifying the
block structure returned.
The first two times allow_start_dates_in_future is False, the third
time it is True. Setting it to True in all three cases would mean that
some blocks would be incorrectly included in the response.
This left us with one option - optimize the first two calls. The
difference between the first two calls is the non-filtering
transformers, however the second call applies a subset of transformers
from the first call, so it was safe to apply the superset of
transformers in both cases. This allowed to reduce the number of
function calls to 2. However, the cached structure may be further
mutated by filters downstream, which means we need to cache a copy of
the course structure (not the structure itself). The copy method itself
is quite heavy (it calls deepcopy three times), making the benefits of
this solution much less tangible. In fact, another potential
optimization that was considered was to reuse the collected block
structure (pre-transformation), but since calling copy on a collected
structure proved to be more time-consuming than calling get_collected,
this change was discarded, considering that the goal is to improve
performance.
Revised Solution
To achieve a more tangible performance improvement, it was decided to
modify the previous strategy as follows:
* Pass a for_blocks_view parameter to the get_blocks function to make
sure the new caching logic only affects the blocks view.
* Collect and cache course blocks with future dates included.
* Include start key in requested fields.
* Reuse the cached blocks in the third call, which is in
get_course_assignments
* Before returning the response, filter out any blocks with a future
start date, and also remove the start key if it was not in requested
fields
Adds new api to return block metadata which includes index_dictionary.
Reason for new api instead of adding it to course blocks API: data like
index_dictionary are too large for the cache used by course/blocks
transformers API.
The bug is explained in https://openedx.atlassian.net/browse/CRI-233. Only is missing add the `VisibilityTransformer` in `get_blocks()` when the user is not enrolled to the course.
On the test, `html_block` is visible only for staff and `vertical_block` is a normal block. The new behaviour hides the `html_block` and show the `vertical_block` to anonymous users
We discovered a subsection that contained a unit without any content
inside, but because of our logic requiring children, it would never be
marked complete (meaning the subsection, section, and course could thus
never be marked complete). This fixes that by removing the children
check from setting completion, but first gating that code path on the
xblock being an aggregator (to prevent leaves from marking as true
simply because there are no children).
Test fixes include adding a test for the empty aggregator case as
well as some changes to not have an entire course marked complete
because they are all empty aggregators.
All code in this PR should be removed after REVE-52 is merged and mobile
traffic from older app versions falls to < 5% of the mobile traffic
to the course_blocks api