edx-platform/lms/djangoapps/course_api/blocks/api.py
Serhiii Nanai 7cd4170ca7 feat: [FC-0092] Optimize Course Info Blocks API (#37122)
The Course Info Blocks API endpoint has been known to be rather slow
to return the response. Previous investigation showed that the major
time sink was the get_course_blocks function, which is called three
times in a single request. This commit aims to improve the response
times by reducing the number of times that this function is called.

Solution Summary

The first time the function get_course_blocks is called, the result
(transformed course blocks) is stored in the current WSGI request
object. Later in the same request, before the second get_course_blocks
call is triggered, the already transformed course blocks are taken
from the request object, and if they are available, get_course_blocks
is not called (if not, it is called as a fallback). Later in the
request, the function is called again as before (see Optimization
Strategy and Difficulties).
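
The store-then-reuse pattern described above can be sketched roughly as
follows. This is a minimal illustration, not the code from this commit:
a plain dict stands in for the request-scoped storage, and first_call,
second_call, expensive_get_course_blocks and CALLS are hypothetical
names introduced only for this example.

```python
# Hypothetical sketch: a plain dict plays the role of request-scoped storage.
CALLS = {"count": 0}


def expensive_get_course_blocks(course_id):
    """Stand-in for the slow fetch-and-transform step (hypothetical)."""
    CALLS["count"] += 1
    return {"course": course_id, "blocks": ["chapter", "sequential", "vertical"]}


def first_call(request_cache, course_id):
    """Compute the blocks and remember them for the rest of the request."""
    blocks = expensive_get_course_blocks(course_id)
    request_cache[course_id] = blocks
    return blocks


def second_call(request_cache, course_id):
    """Reuse the stored blocks if present; otherwise fall back to computing."""
    blocks = request_cache.get(course_id)
    if blocks is None:
        blocks = expensive_get_course_blocks(course_id)
    return blocks


request_cache = {}
first = first_call(request_cache, "course-v1:demo")
second = second_call(request_cache, "course-v1:demo")
```

In this sketch the second call never reaches the expensive function as
long as the first call ran in the same request; with an empty cache it
behaves exactly as before.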

Optimization Strategy and Difficulties

The original idea was to fetch and transform the course blocks once
and reuse them in all three cases, which would reduce get_course_blocks
call count to 1. However, this did not turn out to be a viable solution
because of the arguments passed to get_course_blocks. Notably, the
allow_start_dates_in_future boolean flag affects the behavior of
StartDateTransformer, which is a filtering transformer modifying the
block structure returned.

The first two times allow_start_dates_in_future is False, the third
time it is True. Setting it to True in all three cases would mean that
some blocks would be incorrectly included in the response.

This left us with one option: optimize the first two calls. The
difference between the first two calls is the set of non-filtering
transformers. Because the second call applies a subset of the
transformers from the first call, it was safe to apply the superset of
transformers in both cases. This allowed us to reduce the number of
function calls to 2. However, the cached structure may be further
mutated by filters downstream, which means we need to cache a copy of
the course structure (not the structure itself). The copy method itself
is quite heavy (it calls deepcopy three times), making the benefits of
this solution much less tangible. In fact, another potential
optimization that was considered was to reuse the collected block
structure (pre-transformation), but since calling copy on a collected
structure proved to be more time-consuming than calling get_collected,
this change was discarded, considering that the goal is to improve
performance.
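
To illustrate why a copy has to be cached rather than the structure
itself, here is a small sketch. It assumes a dict of blocks in place of
a real block structure and uses copy.deepcopy from the standard library;
drop_blocks_of_type, cache_then_filter and sample_structure are
hypothetical helpers, not functions from edx-platform.

```python
import copy


def drop_blocks_of_type(structure, block_type):
    """A mutating filter: removes matching blocks from the structure in place."""
    for key in [k for k, v in structure.items() if v["type"] == block_type]:
        del structure[key]


def cache_then_filter(structure, make_copy):
    """Cache the structure (optionally as a copy), then filter the original."""
    cache = {"blocks": copy.deepcopy(structure) if make_copy else structure}
    drop_blocks_of_type(structure, "problem")
    return cache["blocks"]


def sample_structure():
    return {"b1": {"type": "chapter"}, "b2": {"type": "problem"}}


# Without a copy, the downstream filter also changes what the cache holds.
shared = cache_then_filter(sample_structure(), make_copy=False)
# With a copy, the cached value keeps every block.
copied = cache_then_filter(sample_structure(), make_copy=True)
```

The copy protects the cached value, but it is paid for on every request,
which is the cost the paragraph above refers to.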

Revised Solution

To achieve a more tangible performance improvement, it was decided to
modify the previous strategy as follows:

* Pass a for_blocks_view parameter to the get_blocks function to make
  sure the new caching logic only affects the blocks view.
* Collect and cache course blocks with future dates included.
* Include start key in requested fields.
* Reuse the cached blocks in the third call, which is in
  get_course_assignments.
* Before returning the response, filter out any blocks with a future
  start date, and also remove the start key if it was not in requested
  fields.
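
The last step in the list above could look roughly like the sketch
below. It is an assumption-laden illustration, not the filtering code
from this commit (which is not part of this file): blocks are modelled
as a flat dict of field dicts, the block hierarchy is ignored, and
finalize_blocks is a hypothetical name.

```python
from datetime import datetime, timezone


def finalize_blocks(blocks, requested_fields, now):
    """
    Drop blocks whose start date is in the future and strip the 'start'
    key unless the caller asked for it. The input is left untouched.
    """
    keep_start = "start" in requested_fields
    result = {}
    for block_id, fields in blocks.items():
        start = fields.get("start")
        if start is not None and start > now:
            continue  # not yet released: hide it from the response
        visible = dict(fields)  # shallow copy so the cached data is not mutated
        if not keep_start:
            visible.pop("start", None)
        result[block_id] = visible
    return result


now = datetime(2025, 10, 30, tzinfo=timezone.utc)
blocks = {
    "past": {"display_name": "A", "start": datetime(2025, 1, 1, tzinfo=timezone.utc)},
    "future": {"display_name": "B", "start": datetime(2026, 1, 1, tzinfo=timezone.utc)},
    "undated": {"display_name": "C"},
}
without_start = finalize_blocks(blocks, ["display_name"], now)
with_start = finalize_blocks(blocks, ["display_name", "start"], now)
```

Working on shallow copies keeps the cached blocks intact, so a later
consumer that needs future-dated blocks can still read them.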
2025-10-30 17:23:49 -04:00

194 lines
8.4 KiB
Python

"""
API function for retrieving course blocks data
"""

from edx_django_utils.cache import RequestCache

import lms.djangoapps.course_blocks.api as course_blocks_api
from lms.djangoapps.course_blocks.transformers.access_denied_filter import AccessDeniedMessageFilterTransformer
from lms.djangoapps.course_blocks.transformers.hidden_content import HiddenContentTransformer
from openedx.core.djangoapps.content.block_structure.transformers import BlockStructureTransformers
from openedx.core.djangoapps.discussions.transformers import DiscussionsTopicLinkTransformer
from openedx.features.effort_estimation.api import EffortEstimationTransformer

from .serializers import BlockDictSerializer, BlockSerializer
from .toggles import HIDE_ACCESS_DENIALS_FLAG
from .transformers.blocks_api import BlocksAPITransformer
from .transformers.milestones import MilestonesAndSpecialExamsTransformer
from .utils import COURSE_API_REQUEST_CACHE_NAMESPACE, REUSABLE_BLOCKS_CACHE_KEY


def get_blocks(
        request,
        usage_key,
        user=None,
        depth=None,
        nav_depth=None,
        requested_fields=None,
        block_counts=None,
        student_view_data=None,
        return_type='dict',
        block_types_filter=None,
        hide_access_denials=False,
        allow_start_dates_in_future=False,
        cache_with_future_dates=False,
):
    """
    Return a serialized representation of the course blocks.

    Arguments:
        request (HTTPRequest): Used for calling django reverse.
        usage_key (UsageKey): Identifies the starting block of interest.
        user (User): Optional user object for whom the blocks are being
            retrieved. If None, blocks are returned regardless of access checks.
        depth (integer or None): Identifies the depth of the tree to return
            starting at the root block. If None, the entire tree starting at
            the root is returned.
        nav_depth (integer): Optional parameter that indicates how far deep to
            traverse into the block hierarchy before bundling all the
            descendants for navigation.
        requested_fields (list): Optional list of names of additional fields
            to return for each block. Supported fields are listed in
            transformers.SUPPORTED_FIELDS.
        block_counts (list): Optional list of names of block types for which to
            return an aggregate count of blocks.
        student_view_data (list): Optional list of names of block types for
            which blocks to return their student_view_data.
        return_type (string): Possible values are 'dict' or 'list'. Indicates
            the format for returning the blocks.
        block_types_filter (list): Optional list of block type names used to filter
            the final result of returned blocks.
        hide_access_denials (bool): When True, filter out any blocks that were
            denied access to the user, even if they have access denial messages
            attached.
        allow_start_dates_in_future (bool): When True, will allow blocks to be
            returned that can bypass the StartDateTransformer's filter to show
            blocks with start dates in the future.
        cache_with_future_dates (bool): When True, will use the block caching logic using RequestCache
    """

    if HIDE_ACCESS_DENIALS_FLAG.is_enabled():
        hide_access_denials = True

    # create ordered list of transformers, adding BlocksAPITransformer at end.
    transformers = BlockStructureTransformers()
    if requested_fields is None:
        requested_fields = []
    include_completion = 'completion' in requested_fields
    include_effort_estimation = (EffortEstimationTransformer.EFFORT_TIME in requested_fields or
                                 EffortEstimationTransformer.EFFORT_ACTIVITIES in requested_fields)
    include_gated_sections = 'show_gated_sections' in requested_fields
    include_has_scheduled_content = 'has_scheduled_content' in requested_fields
    include_special_exams = 'special_exam_info' in requested_fields
    include_discussions_context = (
        DiscussionsTopicLinkTransformer.EMBED_URL in requested_fields or
        DiscussionsTopicLinkTransformer.EXTERNAL_ID in requested_fields
    )

    if user is not None:
        transformers += course_blocks_api.get_course_block_access_transformers(user)
        transformers += [
            MilestonesAndSpecialExamsTransformer(
                include_special_exams=include_special_exams,
                include_gated_sections=include_gated_sections
            ),
            HiddenContentTransformer()
        ]
    else:
        transformers += [course_blocks_api.visibility.VisibilityTransformer()]

    # Note: A change to the BlockCompletionTransformer (https://github.com/openedx/edx-platform/pull/27622/)
    # will be introducing a bug if hide_access_denials is True. I'm accepting this risk because in
    # the AccessDeniedMessageFilterTransformer, there is note about deleting it and I believe it is
    # technically deprecated functionality. The only use case where hide_access_denials is True
    # (outside of explicitly setting the temporary waffle flag) is in lms/djangoapps/course_api/blocks/urls.py
    # for a v1 api that I also believe should have been deprecated and removed. When this code is removed,
    # please also remove this comment. Thanks!
    if hide_access_denials:
        transformers += [AccessDeniedMessageFilterTransformer()]

    if include_effort_estimation:
        transformers += [EffortEstimationTransformer()]

    if include_discussions_context:
        transformers += [DiscussionsTopicLinkTransformer()]

    transformers += [
        BlocksAPITransformer(
            block_counts,
            student_view_data,
            depth,
            nav_depth
        ),
    ]

    if cache_with_future_dates:
        # Include future dates such that get_course_assignments can reuse the block structure from RequestCache
        allow_start_dates_in_future = True

    # transform
    blocks = course_blocks_api.get_course_blocks(
        user,
        usage_key,
        transformers,
        allow_start_dates_in_future=allow_start_dates_in_future,
        include_completion=include_completion,
        include_has_scheduled_content=include_has_scheduled_content
    )

    if cache_with_future_dates:
        # Store a copy of the transformed, but still unfiltered, course blocks in RequestCache to be reused
        # wherever possible for optimization. Copying is required to make sure the cached structure is not mutated
        # by the filtering below.
        request_cache = RequestCache(COURSE_API_REQUEST_CACHE_NAMESPACE)
        request_cache.set(REUSABLE_BLOCKS_CACHE_KEY, blocks.copy())

        # Since we included blocks with future start dates in our block structure,
        # we need to include the 'start' field to filter out such blocks before returning the response.
        # If 'start' field is not requested, it will be removed from the response.
        requested_fields = set(requested_fields)
        requested_fields.add('start')

    # filter blocks by types
    if block_types_filter:
        block_keys_to_remove = []
        for block_key in blocks:
            block_type = blocks.get_xblock_field(block_key, 'category')
            if block_type not in block_types_filter:
                block_keys_to_remove.append(block_key)
        for block_key in block_keys_to_remove:
            blocks.remove_block(block_key, keep_descendants=True)

    # serialize
    serializer_context = {
        'request': request,
        'block_structure': blocks,
        'requested_fields': requested_fields,
    }

    if return_type == 'dict':
        serializer = BlockDictSerializer(blocks, context=serializer_context, many=False)
    else:
        serializer = BlockSerializer(blocks, context=serializer_context, many=True)

    # return serialized data
    return serializer.data


def get_block_metadata(block, includes=()):
    """
    Get metadata about the specified XBlock.

    Args:
        block (XBlock): block object
        includes (list|set): list or set of metadata keys to include. Valid keys are:
            index_dictionary: a dictionary of data used to add this XBlock's content
                to a search index.
    """
    data = {
        "id": str(block.scope_ids.usage_id),
        "type": block.scope_ids.block_type,
    }

    if "index_dictionary" in includes:
        data["index_dictionary"] = block.index_dictionary()

    return data