The edx-platform codebase already includes quite a few type annotations, but
they were not regularly checked. This may cause problems, when the annotations
themselves include errors (as we found out in some of the learning_sequences
annotations). So here, we add mypy as a dev requirement and introduce a make
command to run mypy regularly. Mypy runs on a very small portion of the total
edx-platform, as configured in mypy.ini. Our hope is that developers will add
more and more modules to this configuration file, until we can eventually run
mypy on the full code base.
See discussion: https://discuss.openedx.org/t/dev-notes-running-mypy-on-edx-platform/4860
In DE-1822, we believed we needed to switch to start_date and end_date.
It was determined this was not the case, so this updates the comment
to ensure future users use the correct fields (start and end) and
updates any pieces of code that may have used start_date or end_date.
We already had function traces to see how long each processor took when
generating a UserCourseOutlineData. This adds a similar tracking around
making a UserCourseOutlineDetailsData, to help track down performance
issues observed in what I think is the special exams processor.
[MICROBA-1178]
- remove modulestore usage in `generation_handler.py`
- add duplicate functions that utilize a CourseKey or CourseOverview to remove dependence on modulestore (this will be cleaned up (if possible) at a later part of this refactor)
- add python API function to `content`/`course_overview` app that will retrieve a single CourseOverview (rather than a serialized list of dicts of CourseOverview data)
NOTE: This will require a forced backfill of course outlines to update
the course content data in learning_sequences:
python manage.py cms backfill_course_outlines --force
Without this backfill, the learning_sequences API will continue to serve
stale content data that has no user partition group data. It won't cause
errors, but it won't do the exclusions properly.
Commit summary:
* Created EnrollmentTrackPartitionGroupsOutlineProcessor to process the
enrollment_track User Partition Group, allowing Sequences and Sections
to be removed based on their group_access settings.
* Added user_partition_groups attribute to CourseLearningSequenceData
and CourseSectionData in learning_sequences/data.py, along with
backing model data.
* get_outline_from_modulestore now extracts group_access settings from
Sections and Sequences. It also bubbles up group_access settings from
Units, meaning that if a Sequence with no group_access setting has
Units that are all set to show only to the Verified enrollment track,
then the Sequence will only show to the Verified enrollment track.
This commit adds model-level support for all user partition groups by
capturing all the content group associations (group_access), but it only
implements the code checks for the enrollment track partition. It's not
clear that we want to generalize, since there's only one other partition
type (A/B testing) that is applicable at the outline level.
It's important to note that there is no way to set the group_access for
a Section or Sequence in Studio today. It's only possible by direct
editing of the OLX for import. That being said, the block structures
framework supports applying course groups at this level, and this commit
moves learning_sequences closer to feature parity.
The bubbling up from Units to the parent Sequence was done to mitigate
confusion when a Sequence is entirely composed of Units that are not
visible to the user because of content group restrictions. It's not
clear whether this is something we want to do in the long term, since it
would simplify the code to always specify group_access at the Sequence
level. This first pass is done partially to collect better data about
places in our courses where this kind of usage is already happening.
Most of the EnrollmentTrackPartitionGroupsOutlineProcessor code and its
tests were written by @schenedx.
* MST-734 Fix production issue on Learner Onboarding Status Panel
Fix the prod issue where learning sequence service object missing the needed get_user_course_outline service API
The user web API call currently returns 500
I had the wrong attribute before this commit, and Django Admin let
me get away with it because it doesn't explode when you try to grab
relations that aren't there–it just quietly returns None in some of
those cases.
* Introduces the idea of content errors into the learning_sequences
public API, accessible using get_content_errors().
* Makes course outline generation much more resilient to unusual
structures (e.g. Section -> Unit with no Sequence in between),
with the understanding that anything that doesn't conform to the
standard structure will simply be skipped.
* Improves the Django Admin for learning_sequences to display
content errors and improve sequence data browsing within a course.
* Switches the main table viewed in the Django admin from
LearningContext to CourseContext, which is appropriate since only
course runs generate outlines.
This was done as part of TNL-8057, with the end goal of making
course outline generation resilient enough to switch over apps
to using the learning_sequences outline API. The types of course
structure errors that this PR addresses cause display issues even
in the current Outline Page experience, but would break the outline
generation for learning_sequences altogether.
The approach for error messages here is very generic, to keep
modulestore concepts from seeping into learning_sequences (which is
not aware of the modulestore/contentstore). We may need to address
this later, with a more normalized content error data model.
While the Django admin page is backwards compatible with the old
versions of the models, we should run the backfill_course_outlines
management command after deploying this change, to get the full
benefits.
Because the available date update to the CourseOverview happens inside a
view's signal and we have atomic requests on, the read that was
happening inside the task happened *before* the write was commited to
the database. To avoid the unknown bugs that would come from disabling
atomic transactions for that view (since it's large), this passes the
date we want down through the signals and tasks so we can skip the DB
read at the end.
This will make it possible to make a New Relic dashboard for the
learning_sequences API calls that tracks call performance across
different transactions (by querying Spans). Our goal would be to
offer SLAs around 99th percentile performance.
Course IDs and User ID metrics are also added, so that we can see
outliers in Span reporting, for later investigation.
* Adds the backfill_course_outlines management command to contentstore
* Adds a read-only Django admin interface to learning_sequences for the
support team and debugging.
* Adds two new functions to the learning_sequences public API:
key_supports_outlines and get_course_keys_with_outlines
The learning_sequences app isn't supposed to know about contentstore or
modulestore, as it's intended to be extracted out of edx-platform in the
long term. Therefore, the backfill_course_outlines command is in
contentstore, and not learning_sequences.
This work was tracked in TNL-7983, but it also fixes a bug where we were
trying to generate course outlines for libraries (TNL-7981).
All Open edX instances upgrading to Lilac should run the
backfill_course_outlines command as part of their upgrade process.
COURSE_CERT_DATE_CHANGE was being called before saving the new data in
the course overview. The listeners were expecting to pull the data out
of the course overview, and thus were only right about half the time.
This moves the signal to trigger after the course publish signals are
handled.
This commit removes several waffle toggles that have been enabled
on edx.org for years. It's time to remove the rollout gating for
these features and enable them by default.
This doesn't directly change any behavior. But it does create new
database objects by default now and allows for enabling other
schedule based features more easily.
Specifically, the following toggles were affected.
schedules.create_schedules_for_course
- Waffle flag removed as always-enabled
- We now always create a schedule when an enrollment is created
schedules.send_updates_for_course
- Waffle flag removed as always-enabled
- Course update emails are sent as long as the ScheduleConfig
allows it.
- This is not a change in default behavior, because ScheduleConfig
is off by default.
dynamic_pacing.studio_course_update
- Waffle switch removed as always-enabled
- Course teams can now always edit course updates directly in Studio
ScheduleConfig.create_schedules
ScheduleConfig.hold_back_ratio
- Model fields for rolling out the schedules feature
- Schedules are now always created
- This commit only removes references to these fields, they still
exist in the database. A future commit will remove them entirely
This commit also adds a new has_highlights field to CourseOverview.
This is used to cache whether a course has highlights, used to
decide which course update email behavior they get. Previously every
enrollment had to dig into the modulestore to determine that.
The waffle classes previously in future are now available directly from
edx_toggles.toggles. Since this future module is going to be removed soon,
we become future-proof by getting rid of it.
This includes the following settings:
- BLOCK_STRUCTURES_SETTINGS : setting dictionary which stores the other different block structures related settings
- BLOCK_STRUCTURES_SETTINGS['PRUNING_ACTIVE'] : keeps only a specific number of versions of each block structure, and deletes the rest
- BLOCK_STRUCTURES_SETTINGS['COURSE_PUBLISH_TASK_DELAY'] : specifies the delay, in seconds, after a new edit of a course is published before updating the block structures cache
- BLOCK_STRUCTURES_SETTINGS['TASK_DEFAULT_RETRY_DELAY'] : Specifies the delay, in seconds, between retry attempts for block structure tasks
- BLOCK_STRUCTURES_SETTINGS['TASK_MAX_RETRIES'] : specifies the max number of retries per block structure task
- BLOCK_STRUCTURES_SETTINGS['STORAGE_CLASS'] : specifies the storage which block structures would be saved to when storage backed block structures are enabled
Example: 'storages.backends.s3boto.S3BotoStorage'
- BLOCK_STRUCTURES_SETTINGS['STORAGE_KWARGS'] : specifies additional arguments needed when utilizing storage for storing storage backed block structures
Example: { bucket: 'test-edxapp' }
- BLOCK_STRUCTURES_SETTINGS['DIRECTORY_PREFIX'] : specifies the path to which storage backed block structures are saved
Example: '/courses/'
This also includes the following waffle switches:
- block_structure.storage_backing_for_cache : enables storage backed block structures, used to retrieve the block structures from storage instead of regenerating the structure, when not available in cache
it is important to note that this is important for production because it reduces response times on block structure related apis
- block_structure.raise_error_when_not_found : raises an error if the block structure requested doesnt exist in store or is outdated
- block_structure.invalidate_cache_on_publish : invalidates the block structure cache when courses are published
For an example with additional context, see the following forum post:
https://discuss.openedx.org/t/help-please-very-slow-load-time-10-seconds-for-courseware-on-sections-with-several-subsections-and-xblocks/2998/16
This also includes information about the toggles that will likely be deprecated and removed:
https://github.com/edx/edx-platform/pull/26175#issuecomment-769080286
The learning_sequences app has its own model for Course Outlines.
Prior to this commit, these course outlines were only populated by
a management command in the learning_sequences app that queried
modulestore. This commit does a few things:
1. Move the update_course_outline command to live in contentstore
(i.e. Studio). This makes learning_sequences unaware of
modulestore, and makes it easier for us to extract it from
edx-platform (or to plug in different kinds of course outlines).
2. Add tests.
3. Add performance and debug logging to course outline creation.
4. Make course outline creation happen every time a course publish
happens.
This will allow us to start collecting data about how long building
course outlines takes, and get error reporting around any content
edge cases that break the course outline code.
This method from the toggle legacy classes should not actually be
exposed to all. So we get rid of it by manually setting the cached
value. While we are at it, we convert the STORAGE_BACKING_FOR_CACHE
legacy waffle switch to its modern version. As the flag is not being
used elsewhere, this should not break anything.
We take the opportunity to modernize waffle switches from
block_structure.config: to do so we convert the INVALIDATE_CACHE_ON_PUBLISH and
RAISE_ERROR_WHEN_NOT_FOUND waffle switches from legacy classes to their modern
equivalents. These switches are not used outside of edx-platform, so this
change should not trigger any error.
By explicitly importing the legacy namespace classes, we make it clear
that we are using soon-to-be-deprecated classes. We will then be able to
start removing the legacy classes, one module at a time.