From ceae081cc7df9f82ba03503f4997c2a856c0966e Mon Sep 17 00:00:00 2001
From: Don Mitchell
Date: Tue, 19 Aug 2014 14:49:24 -0400
Subject: [PATCH] Refactored expected indices after email w/ mongo

Key order has no bearing on key order in queries other than for
Key order should be by greatest to least cardinality
Added sparse and unique declarations.
---
 mongo_indexes.md | 62 +++++++-----------------------------------------
 1 file changed, 9 insertions(+), 53 deletions(-)

diff --git a/mongo_indexes.md b/mongo_indexes.md
index 0764f82afe..df8ce78ce5 100644
--- a/mongo_indexes.md
+++ b/mongo_indexes.md
@@ -20,16 +20,14 @@ which can be `uploadDate`, `display_name`,
 
 Replace existing index which leaves out `run` with this one:
 ```
-ensureIndex({'_id.tag': 1, '_id.org': 1, '_id.course': 1, '_id.category': 1, '_id.run': 1})
-ensureIndex({'content_son.tag': 1, 'content_son.org': 1, 'content_son.course': 1, 'content_son.category': 1, 'content_son.run': 1})
+ensureIndex({'_id.org': 1, '_id.course': 1, '_id.name': 1}, {'sparse': true})
+ensureIndex({'content_son.org': 1, 'content_son.course': 1, 'content_son.name': 1}, {'sparse': true})
+ensureIndex({'_id.org': 1, '_id.course': 1, 'uploadDate': 1}, {'sparse': true})
+ensureIndex({'_id.org': 1, '_id.course': 1, 'display_name': 1}, {'sparse': true})
+ensureIndex({'content_son.org': 1, 'content_son.course': 1, 'uploadDate': 1}, {'sparse': true})
+ensureIndex({'content_son.org': 1, 'content_son.course': 1, 'display_name': 1}, {'sparse': true})
 ```
 
-Note: I'm not advocating adding one which leaves out `category` for now because that would only be
-used for `delete_all_course_assets` which in the future should not actually delete the assets except
-when doing garbage collection.
-
-Remove index on `displayname`
-
 modulestore:
 ============
 
@@ -41,7 +39,7 @@ and no field is omitted.
 
 Because we often query for some subset of the id, we define this index:
 ```
-ensureIndex({'_id.tag': 1, '_id.org': 1, '_id.course': 1, '_id.category': 1, '_id.name': 1, '_id.revision': 1})
+ensureIndex({'_id.org': 1, '_id.course': 1, '_id.category': 1, '_id.name': 1})
 ```
 
 Because we often scan for all category='course' regardless of the value of the other fields:
@@ -51,54 +49,12 @@ ensureIndex({'_id.category': 1})
 
 Because lms calls get_parent_locations frequently (for path generation):
 ```
-ensureIndex({'_id.tag': 1, '_id.org': 1, '_id.course': 1, 'definition.children': 1})
-```
-
-Remove these indices if they exist as I can find no use for them:
-```
-  { "_id.course": 1, "_id.org": 1, "_id.revision": 1, "definition.children": 1 }
-  { "definition.children": 1 }
-```
-
-NOTE, that index will only aid queries which provide the keys in exactly that form and order. The query can
-omit later fields of the query but not earlier. Thus ```modulestore.find({'_id.org': 'myu'})``` will not use
-the index as it omits the tag. As soon as mongo comes across an index field omitted from the query, it stops
-considering the index. On the other hand, ```modulestore.find({'_id.tag': 'i4x', '_id.org': 'myu', '_id.category': 'problem'})```
-will use the index to get the records matching the tag and org and then will scan all of them
-for matches to the category.
-
-To find out if any records have the wrong id structure, run
-```
-db.fs.files.find({uploadDate: {$gt: startDate, $lt: endDate},
-    $where: function() {
-        var keys = ['category', 'name', 'course', 'tag', 'org', 'revision'];
-        for (var key in this._id) {
-            if (key != keys.shift()) {
-                return true;
-            }
-        }
-        return false;
-    }},
-    {_id: 1})
+ensureIndex({'definition.children': 1}, {'sparse': true})
 ```
 
 modulestore.active_versions
 ===========================
 
 ```
-ensureIndex({'org': 1, 'offering': 1})
+ensureIndex({'org': 1, 'course': 1, 'run': 1}, {'unique': true})
 ```
-
-modulestore.structures
-======================
-
-```
-ensureIndex({'previous_version': 1})
-```
-
-modulestore.definitions
-=======================
-
-```
-ensureIndex({'category': 1})
-```
\ No newline at end of file