BREAKING CHANGE: this forces course IDs in modulestore to be unique (case-insensitive). This was always supposed to be the case, but it wasn't working properly on MySQL. Upgrading past this commit may cause a migration failure if you have conflicting course IDs (for example, IDs that differ only by case). See the migration 0004 docstring for details.
This commit introduces several improvements to database migration
scripts to enhance compatibility between MySQL and PostgreSQL, ensure
case-sensitive behavior where needed, and improve migration safety and
correctness. The changes include dynamic SQL generation based on the
database engine, improved transaction handling, and adjustments to
field types and adapters for better cross-database support.
Database compatibility and case sensitivity improvements:
- Migration scripts in split_modulestore_django and learning_sequences
  now dynamically generate SQL statements for altering column case
  sensitivity and uniqueness based on whether the database is MySQL or
  PostgreSQL, ensuring correct behavior across both backends.
  - common/djangoapps/split_modulestore_django/migrations/0001_initial.py
  - openedx/core/djangoapps/content/learning_sequences/migrations/0001_initial.py
- The courseware.fields module now checks for "postgresql" in the
  database engine string instead of a specific backend name, improving
  compatibility with different PostgreSQL drivers.
  - lms/djangoapps/courseware/fields.py
- The 0011_csm_id_bigint migration in courseware now supports both MySQL
  and PostgreSQL for altering column types, with specific SQL for each
  backend.
  - lms/djangoapps/courseware/migrations/0011_csm_id_bigint.py
- The 0009_readd_facebook_url migration in course_overviews now
  introspects the table structure using backend-specific SQL for MySQL
  and PostgreSQL, ensuring correct detection of existing fields.
  - openedx/core/djangoapps/content/course_overviews/migrations/0009_readd_facebook_url.py
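
As an illustration of the engine-dependent SQL selection described above, here is a minimal sketch. It is not the SQL used in the actual migrations: the function names, the index name, and the specific statements are assumptions chosen for the example. It assumes the MySQL column already uses a case-insensitive collation, so a plain UNIQUE constraint is enough there, while PostgreSQL gets a unique index on `LOWER(column)`.

```python
def is_postgres(engine: str) -> bool:
    """Substring check, so any engine path containing "postgresql" matches."""
    return "postgresql" in engine


def case_insensitive_unique_sql(engine: str, table: str, column: str) -> str:
    """Return backend-specific SQL enforcing case-insensitive uniqueness.

    Hypothetical helper for illustration; names and SQL are assumptions.
    """
    index_name = f"{table}_{column}_ci_uniq"
    if is_postgres(engine):
        # PostgreSQL string comparison is case-sensitive, so index LOWER(column).
        return f"CREATE UNIQUE INDEX {index_name} ON {table} (LOWER({column}));"
    if "mysql" in engine:
        # Assumes the column has a case-insensitive (*_ci) collation.
        return f"ALTER TABLE {table} ADD CONSTRAINT {index_name} UNIQUE ({column});"
    raise ValueError(f"Unsupported database engine: {engine}")
```

In a Django migration, the engine string would typically come from `settings.DATABASES["default"]["ENGINE"]`, and the resulting statement could be passed to `migrations.RunSQL`.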
Migration safety and correctness:
- Service user creation and deletion in the commerce app is now wrapped
  in atomic transactions to ensure database consistency.
  - lms/djangoapps/commerce/migrations/0001_data__add_ecommerce_service_user.py
- The move_overrides_to_edx_when migration in courseware now specifies
  a no-op reverse migration, preventing accidental data loss on migration
  rollback.
  - lms/djangoapps/courseware/migrations/0008_move_idde_to_edx_when.py
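
To show what the atomic wrapping buys, the sketch below uses the standard library's `sqlite3` as a stand-in for Django's `transaction.atomic()`. The table names and the two-step insert are hypothetical and are not taken from the commerce migration. The point is only that a failure partway through leaves no partial rows behind.

```python
import sqlite3


def create_service_user(conn, username, fail_midway=False):
    """Insert a user row and a profile row as a single unit of work."""
    # `with conn:` commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute("INSERT INTO users (username) VALUES (?)", (username,))
        if fail_midway:
            raise RuntimeError("simulated failure before the profile is written")
        conn.execute("INSERT INTO profiles (username) VALUES (?)", (username,))


def row_count(conn, table):
    """Count rows in one of the two hypothetical tables."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE profiles (username TEXT PRIMARY KEY)")
```

In a Django data migration the same shape would be a `with transaction.atomic():` block inside the function passed to `migrations.RunPython`, and a no-op reverse can be expressed by passing `migrations.RunPython.noop` as the reverse callable.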
Adapter registration and code cleanup:
- The common_initialization app now registers custom adapters for
  CourseLocator and related classes in psycopg2 when using PostgreSQL,
  ensuring proper serialization of these types.
  - openedx/core/djangoapps/common_initialization/apps.py
- Minor code cleanup and formatting improvements in migration files,
  including import order and field formatting for readability.
  - lms/djangoapps/grades/migrations/0015_historicalpersistentsubsectiongradeoverride.py
Split modulestore persists data in three MongoDB "collections": course_index (list of courses and the current version of each), structure (outline of the courses, and some XBlock fields), and definition (other XBlock fields). While "structure" and "definition" data can get very large, which is one of the reasons MongoDB was chosen for modulestore, the course index data is very small.
This commit starts writing course indexes (active_versions) to both MySQL and MongoDB, but continues to read from MongoDB only.
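
The write-to-both, read-from-one arrangement can be sketched as below. This is a simplified model under stated assumptions, not the modulestore code: the class name is hypothetical, plain dicts stand in for the MongoDB collection and the Django model table, and the course key and version value are placeholders.

```python
class DualWriteCourseIndexes:
    """Toy model: every write goes to both stores, reads use MongoDB only."""

    def __init__(self):
        self.mongo_indexes = {}  # stand-in for the MongoDB course index collection
        self.sql_indexes = {}    # stand-in for the MySQL-backed Django model

    def save(self, course_key, index):
        # Write the same data to both backends so they stay in step.
        self.mongo_indexes[course_key] = dict(index)
        self.sql_indexes[course_key] = dict(index)

    def get(self, course_key):
        # Reads still come from MongoDB; the SQL copy is not consulted yet.
        return self.mongo_indexes.get(course_key)


store = DualWriteCourseIndexes()
store.save("course-v1:Org+Course+Run", {"versions": {"published-branch": "abc123"}})
```

Because reads never touch the SQL copy in this arrangement, a problem with the new table cannot change what readers see, which makes the rollout lower risk.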
By moving course index data to MySQL (as a Django model), we get these advantages:
* The full history of changes to the course index data is now preserved.
* A Django admin view is available to inspect the list of courses and libraries.
* It's much easier to "reset" a corrupted course to a known working state, by using the simple-history revert tools from the Django admin.
* The remaining MongoDB collections (structure and definition) are essentially just used as key-value stores of large JSON data structures. This paves the way for future changes that allow migrating courses one at a time from MongoDB to S3, thus eliminating any use of MongoDB by split modulestore and simplifying the stack.