Get only those block items which have their path to root #11095

Qubad786 · 2015-12-30T12:11:45Z

This PR is a redo of edx/edx-platform#10994, which was reverted because of its impact on performance. We can call get_items with include_orphans set to False to get only those items that are not orphans. When orphans do not need to be filtered out, just don't pass the include_orphans kwarg when calling get_items.

Performance Comparison:

With the reverted PR, get_items took 386 milliseconds, of which 4 calls to load_tagged_classes accounted for 343 milliseconds, as shown in the profile.

With this PR, get_items takes only 44 milliseconds whether or not it is called with include_orphans=False, as shown in the profiles below.

adampalay · 2015-12-30T20:31:28Z

common/lib/xmodule/xmodule/modulestore/split_mongo/split.py

@@ -1197,7 +1205,11 @@ def _block_matches_all(block_data):
            settings['children'] = qualifiers.pop('children')
        for block_id, value in course.structure['blocks'].iteritems():
            if _block_matches_all(value):
-                items.append(block_id)
+                if not include_orphans:
+                    if self.has_path_to_root(block_id, course) or block_id.type in _DETACHED_CATEGORIES:


If we check block_id.type in _DETACHED_CATEGORIES first, we'll get a small performance boost: when that check passes, we don't have to call self.has_path_to_root(block_id, course) at all.

Nice catch!
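As a rough illustration of the reordering suggested above, the sketch below puts the cheap membership test first so that Python's short-circuiting `or` skips the expensive walk. The names `DETACHED`, `expensive_has_path_to_root`, and `keep_block`, and the example block types, are hypothetical stand-ins and not the modulestore's actual code.

```python
# Hypothetical stand-ins for illustration; not the real modulestore API.
DETACHED = {"static_tab", "about", "course_info"}

calls = {"path_checks": 0}

def expensive_has_path_to_root(block_type):
    """Pretend tree walk; it only counts how often it is invoked."""
    calls["path_checks"] += 1
    return block_type != "orphan_problem"

def keep_block(block_type):
    # Cheap set membership first: `or` short-circuits, so the costly
    # walk only runs for blocks that are not detached.
    return block_type in DETACHED or expensive_has_path_to_root(block_type)

results = [keep_block(t) for t in ["about", "static_tab", "problem", "orphan_problem"]]
```

With four blocks, two of them detached, the expensive check should only run for the remaining two.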

adampalay · 2015-12-30T20:35:36Z

@Qubad786 , can you please post some indication of the performance improvement between the two PRs on this PR?

Qubad786 · 2015-12-31T16:22:32Z

@adampalay I have updated PR with performance comparison.

mushtaqak · 2016-01-04T10:02:01Z

common/lib/xmodule/xmodule/modulestore/tests/test_mixed_modulestore.py

+        course_key = test_course.id
+
+        # get detached category list
+        detached_categories = [name for name, __ in XBlock.load_tagged_classes("detached")]


@Qubad786 You can import _DETACHED_CATEGORIES as you have done in the split.py

adampalay · 2016-01-04T14:42:55Z

common/lib/xmodule/xmodule/modulestore/split_mongo/split.py

@@ -83,6 +83,7 @@
 from .caching_descriptor_system import CachingDescriptorSystem
 from xmodule.modulestore.split_mongo.mongo_connection import MongoConnection, DuplicateKeyError
 from xmodule.modulestore.split_mongo import BlockKey, CourseEnvelope
+from xmodule.modulestore.mongo.base import _DETACHED_CATEGORIES


This feels weird, for split to import from mongo.base. Can they both just import from xmodule.modulestore? (You can put DETACHED_CATEGORIES into xmodule/modulestore/__init__.py.) I'd also remove the leading underscore, since it will no longer be private.

adampalay · 2016-01-04T14:43:29Z

@Qubad786 , one minor point, otherwise this looks good. Thanks for the performance comparison, that's awesome

adampalay · 2016-01-04T14:45:15Z

So to be clear, this PR makes get_items 0.044 - 0.034 = 0.01 seconds slower?

Qubad786 · 2016-01-04T15:24:08Z

In the reverted PR, get_items was taking 0.386 sec (i.e. 386 ms), and in this PR get_items takes only 0.044 sec (i.e. 44 ms), which makes get_items 0.342 sec (i.e. 342 ms) faster. has_path_to_root takes only 0.001 sec (i.e. 1 ms) to execute when we call get_items with include_orphans=False.

adampalay · 2016-01-04T15:25:56Z

@Qubad786 , ah, I see.
So the difference in performance on this PR vs. master is 0.001 secs? How big is the course that you're testing on?

Qubad786 · 2016-01-04T15:36:42Z

Yes. It's not that big: it has 2 chapters with 2 subsections each. Each subsection has 3 problems, and there is also 1 orphan problem. I kept the test course the same for both PRs while testing.

Qubad786 · 2016-01-05T14:12:15Z

With the course provided yesterday, master's get_items is taking 0.055 sec (i.e. 55 ms) to execute.

On this branch get_items is taking 0.108 sec (i.e. 108 ms). Single call to has_path_to_root is taking 0.015 sec (i.e. 15 ms).

adampalay · 2016-01-05T17:45:01Z

@benpatterson , @ormsbee , @tobz

This PR gives one the ability to call get_items with an optional parameter that filters out orphan modules from its results. For large courses, this introduces a performance penalty, since each module, in order to determine if its parent is in the course, needs to iterate over each module in the course. For smaller courses, this isn't a big deal (the first one @Qubad786 tested on only added 1 millisecond). But for a larger course (the one Hassan tested had around 1500 items), the penalty was steeper. Each check to see that a module wasn't an orphan took 15 ms.

My question for you three is: is this performance hit too high?
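To make the cost described above concrete, here is a minimal sketch of a naive orphan check in which every parent lookup scans the whole structure. The toy `blocks` dict and the helper names are assumptions for illustration; the real split modulestore structures and methods differ.

```python
# Toy structure (an assumption for illustration): block key -> list of child keys.
blocks = {
    "course": ["chapter1"],
    "chapter1": ["seq1"],
    "seq1": ["problem1", "problem2"],
    "problem1": [],
    "problem2": [],
    "orphan": [],
}

scanned = 0  # how many block entries were examined in total

def find_parents(block_key):
    """Naive parent lookup: scans every block in the structure."""
    global scanned
    parents = []
    for key, children in blocks.items():
        scanned += 1
        if block_key in children:
            parents.append(key)
    return parents

def has_path_to_root(block_key, root="course"):
    """Walk upward until the root is reached or no parent exists."""
    if block_key == root:
        return True
    return any(has_path_to_root(parent, root) for parent in find_parents(block_key))

reachable = [key for key in blocks if has_path_to_root(key)]
```

Because each upward step rescans all blocks, the total work grows much faster than the number of blocks, which is consistent with the steeper penalty reported for the larger course.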

ormsbee · 2016-01-05T18:00:52Z

@adampalay: I guess that really depends on how many things we're calling get_items in our worst case scenarios. Is getting rid of a lot of orphan related bugs worth a 50ms worst case? Sure. As long as you can tell me that 50ms is as bad as it gets. Are there places (e.g. grading, courses with really long sequences) where the penalty would be significantly higher?

Is there any chance we can just kill the orphans when we're doing the save (so that they never make it into the database in the first place)? Maybe post-processing before we save the structure doc? Or do orphans still need to get exported?

adampalay · 2016-01-05T18:18:32Z

@ormsbee , I don't know that 50 ms is as bad as it gets. I'd have to see which course has the most items, since the cost should simply correlate with how many items a course has.

You ask if we can kill orphans when we do a save. Killing orphans turned out to be more complicated than we anticipated (see this document), so we took the tack of just working around them.

Orphans don't get exported, so we have a ticket in to make course imports atomic: PLAT-863. Once that is implemented, you'll be able to remove orphans in a split course by exporting and then importing.

benpatterson · 2016-01-05T18:28:26Z

I know it doesn't cover the large-course use case you're talking about, but I wanted to capture the bok-choy build results here. The reverted PR added about 70 minutes (of test time...10 mins per shard) to the overall bok-choy run, but this current PR doesn't have an impact.

I do think a look at how this impacts the larger courses is worth a cycle.

Qubad786 · 2016-01-05T18:42:04Z

@benpatterson this has no considerable impact because get_items is never called with the include_orphans=False kwarg anywhere except in the test added in this PR.

ormsbee · 2016-01-05T18:44:57Z

@adampalay: Got it. I'm definitely not concerned if it's not being used yet, but if we are switching the LMS over to it, I'd like to see it run on some of the very large MITx courses to gauge impact (e.g. 8.MReVx, 14.74)

benpatterson · 2016-01-05T18:45:59Z

Yes @Qubad786 I understand that. I just wanted to post additional findings :)

adampalay · 2016-01-06T14:23:20Z

@ormsbee , for now, we wouldn't switch the whole LMS; it would only be used in a couple of views (grabbing content groups, discussion modules to display on the discussion forum). We'll take a look at the performance impact on MReV too and post results here. So that we're clear, what are the acceptance criteria here?

ormsbee · 2016-01-06T14:31:17Z

@adampalay: spitballing it, I'd say a ~5% degradation in overall server side request time on those pages is a reasonable tradeoff for getting rid of these types of bugs. If it's > 10%, let's talk about this more.

macdiesel · 2016-01-06T20:23:44Z

common/lib/xmodule/xmodule/modulestore/store_utilities.py

@@ -3,6 +3,9 @@
 from collections import namedtuple

 import uuid
+from xblock.core import XBlock
+
+DETACHED_CATEGORIES = [name for name, __ in XBlock.load_tagged_classes("detached")]


As I was looking through the documentation on opaque keys today I noticed this: https://opaque-keys.readthedocs.org/en/stable/opaque_keys.edx.html#opaque_keys.edx.locator.BlockUsageLocator.category

Since category is deprecated, we may want to rename this to DETACHED_XBLOCK_TYPES.

@macdiesel great suggestion

Qubad786 · 2016-01-11T12:11:06Z

Following are further performance tests with the courses mentioned earlier:

With 8.MReVx, there were 3405 items in total in the course.
On master, get_items takes 269 ms to execute, as can be seen in the profile below:

On this branch, get_items takes 363 ms, of which has_path_to_root takes 106 ms, as can be seen in the profile below:

With 14.74, there were 2090 items in total in the course.
On master, get_items takes 122 ms to execute, as can be seen in the profile below:

On this branch, get_items takes 159 ms, of which has_path_to_root takes 48 ms, as can be seen in the profile below:

adampalay · 2016-01-11T15:58:43Z

@ormsbee , looks like this is a bit more than 10%. As @Qubad786 points out, we're not planning to use this in that many places, but I also think we can improve the way we calculate which items are in a course tree. For example, we could cache this list and regenerate the cache on each publish event. Or we could reimplement has_path_to_root so that it temporarily stores whether a module has a path to root, so we don't have to walk up the course tree for every module.
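For reference, the relative slowdown of get_items itself can be worked out from the timings reported for the two large courses. The small script below only does that arithmetic; the dictionary layout and function name are illustrative.

```python
# Timings in milliseconds, copied from the measurements reported in this thread.
timings = {
    "8.MReVx": {"master": 269, "branch": 363},
    "14.74": {"master": 122, "branch": 159},
}

def relative_slowdown(master_ms, branch_ms):
    """Return the extra time on the branch as a percentage of master."""
    return (branch_ms - master_ms) / master_ms * 100

slowdowns = {
    course: round(relative_slowdown(t["master"], t["branch"]), 1)
    for course, t in timings.items()
}
```

This comes out to roughly 35% for 8.MReVx and roughly 30% for 14.74, measured on the get_items call alone rather than on whole-page request time.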

ormsbee · 2016-01-13T16:07:08Z

It doesn't bother me so much that this particular method call goes up in time by that much. When I was spitballing 10%, I meant the overall server execution time for, say, the course index page. Given that, and the limited places where this is going to be used, I'm fine with taking this hit. Thank you for doing this investigative work.

adampalay · 2016-01-20T13:25:45Z

@ormsbee , can you give this another review please?

@Qubad786 , just for a sense of performance, can you post how long get_items takes for a large course with and without include_orphans?

ormsbee · 2016-01-20T14:41:07Z

common/lib/xmodule/xmodule/modulestore/store_utilities.py

@@ -3,6 +3,9 @@
 from collections import namedtuple

 import uuid
+from xblock.core import XBlock
+
+DETACHED_XBLOCK_TYPES = [name for name, __ in XBlock.load_tagged_classes("detached")]


If the ordering doesn't matter, we might as well make this a set, since the primary use case will be checking for inclusion, and we might one day have many more of these.
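A minimal sketch of the suggestion, assuming ordering is irrelevant: build the constant with a set comprehension so that inclusion checks are hash lookups. The `tagged_classes` list is a hypothetical stand-in for whatever the plugin loader yields, so the snippet runs without XBlock installed.

```python
# Hypothetical (name, class) pairs standing in for the plugin loader's output.
tagged_classes = [("about", object), ("static_tab", object), ("course_info", object)]

# A set comprehension instead of a list comprehension: membership tests
# become average O(1) hash lookups instead of O(n) scans.
DETACHED_XBLOCK_TYPES = {name for name, __ in tagged_classes}

is_about_detached = "about" in DETACHED_XBLOCK_TYPES
is_problem_detached = "problem" in DETACHED_XBLOCK_TYPES
```

One consequence of the switch is that list-only operations such as `+` concatenation no longer apply to the constant, so existing call sites need a look.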

ormsbee · 2016-01-20T16:17:33Z

Done with my pass. Just minor comments.

Qubad786 · 2016-01-20T17:01:41Z

Course: 8.MReVx
Total Items: 3402

This time, I have tested get_items on getting all 3402 items of the course.

Following is the profile when get_items is called with the old has_path_to_root when include_orphans=False:

Following is the profile when get_items is called with the improved has_path_to_root when include_orphans=False:

Following is the profile when get_items is called without include_orphans=False:

As you can see, has_path_to_root has improved a lot with path_cache (in terms of fewer recursive calls), but get_items is still around 5 or 6 times slower than master's.
FYI @adampalay
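The thread does not show the path_cache implementation, so the following is only a sketch of the general idea: memoize each block's answer so that shared ancestors are walked once. The toy `parents_of` mapping and the function signature are assumptions, and the sketch assumes the structure has no cycles.

```python
# Toy child -> parents mapping (an assumption for illustration).
parents_of = {
    "chapter1": ["course"],
    "seq1": ["chapter1"],
    "problem1": ["seq1"],
    "problem2": ["seq1"],
    "orphan": [],
}

def has_path_to_root(block_key, path_cache, root="course"):
    """Memoized upward walk: each block's answer is computed at most once."""
    if block_key == root:
        return True
    if block_key in path_cache:
        return path_cache[block_key]
    result = any(
        has_path_to_root(parent, path_cache, root)
        for parent in parents_of.get(block_key, [])
    )
    path_cache[block_key] = result
    return result

path_cache = {}
answers = {key: has_path_to_root(key, path_cache) for key in parents_of}
```

Memoization cuts the number of recursive calls, but it does nothing about the cost of finding a block's parents in the first place, which matches the observation that get_items remained slow.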

adampalay · 2016-01-20T19:14:09Z

@ormsbee , what do you think of those results?

ormsbee · 2016-01-20T19:45:48Z

5 or 6x is a lot. If _get_parents_from_structure() is the expensive part, can't we make it faster? Looking at that code, it looks like it's doing a full traversal of the structure relationships to see if that item is in any child block. But we could instead do one pass to build a proper mapping of block_key -> parents, and then do hash lookups from that point on.

ormsbee · 2016-01-20T19:55:52Z

By building the reverse lookup, I mean having a method to build something like this ahead of time:

children_to_parents = defaultdict(list)
for parent_key, value in structure['blocks'].iteritems():
    for child_key in value.fields.get('children', []):
        children_to_parents[child_key].append(parent_key)

So that instead of having 3K lookups that all need to do this:

        return [
            parent_block_key
            for parent_block_key, value in structure['blocks'].iteritems()
            if block_key in value.fields.get('children', [])
        ]

You instead just iterate through it once to build the data structure, and each check is just a dict lookup.
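A self-contained Python 3 variant of the reverse lookup described above might look like the sketch below. It uses plain dicts with a 'children' key instead of the modulestore's block objects, which is a simplification for illustration, and `build_children_to_parents` is a hypothetical helper name.

```python
from collections import defaultdict

# Toy structure (an assumption): block key -> dict holding a 'children' list.
structure = {
    "blocks": {
        "course": {"children": ["chapter1"]},
        "chapter1": {"children": ["seq1"]},
        "seq1": {"children": ["problem1"]},
        "problem1": {},
        "orphan": {},
    }
}

def build_children_to_parents(structure):
    """One pass over all blocks to build a child -> parents mapping."""
    children_to_parents = defaultdict(list)
    for parent_key, value in structure["blocks"].items():
        for child_key in value.get("children", []):
            children_to_parents[child_key].append(parent_key)
    return children_to_parents

children_to_parents = build_children_to_parents(structure)
# Each later check is a dict lookup; .get avoids inserting keys into the defaultdict.
parents_of_problem = children_to_parents.get("problem1", [])
parents_of_orphan = children_to_parents.get("orphan", [])
```

Building the mapping once and passing it into the per-block check keeps the per-block cost independent of course size.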

adampalay · 2016-01-20T20:00:15Z

Yeah, that's a great idea; we hadn't thought of that.

We could also make both caches global caches, where the cache prefix is the version hash of the course structure we're using (since they're immutable). So even if it does cause a performance hit, you only take the hit once.
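One way to picture the idea of keying a cache on the structure's version id is the sketch below. It uses a plain in-process dict rather than any particular cache backend, and the helper and variable names are hypothetical; it only relies on the premise stated above that a given structure version is immutable.

```python
# In-process cache keyed by structure version id (hypothetical helper names).
_parents_cache_by_version = {}
build_count = 0

def get_children_to_parents(structure):
    """Build the child -> parents map once per immutable structure version."""
    global build_count
    version_id = structure["_id"]
    if version_id not in _parents_cache_by_version:
        build_count += 1
        mapping = {}
        for parent_key, block in structure["blocks"].items():
            for child_key in block.get("children", []):
                mapping.setdefault(child_key, []).append(parent_key)
        _parents_cache_by_version[version_id] = mapping
    return _parents_cache_by_version[version_id]

structure = {"_id": "v1", "blocks": {"course": {"children": ["chapter1"]}, "chapter1": {}}}
first = get_children_to_parents(structure)
second = get_children_to_parents(structure)
```

The second call returns the same object without rebuilding, so the traversal cost is paid once per structure version.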

adampalay · 2016-01-20T20:14:23Z

(Also, this feels like an interview question: You have a list of nodes that has one root node. Each node has pointers to its children. How do you prune out the nodes that aren't descendants of the root?)
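One possible answer to that question, offered only as a sketch and not as what this PR implements, is a breadth-first traversal from the root that collects every reachable node; anything left over is an orphan. The function name and toy data are illustrative.

```python
from collections import deque

def reachable_from_root(children, root):
    """Return the set of nodes reachable from root by following child pointers."""
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen

children = {
    "course": ["chapter1", "chapter2"],
    "chapter1": ["seq1"],
    "chapter2": [],
    "seq1": ["problem1"],
    "problem1": [],
    "orphan_seq": ["orphan_problem"],
    "orphan_problem": [],
}

kept = reachable_from_root(children, "course")
pruned = set(children) - kept
```

This visits each node and each child pointer at most once, and it also catches orphaned subtrees whose members have parents but no route to the root.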

ormsbee · 2016-01-21T13:51:04Z

common/lib/xmodule/xmodule/modulestore/split_mongo/split.py

-            if block_key in value.fields.get('children', [])
-        ]
+        cache_key = u'structure.{structure_id}'.format(structure_id=structure['_id'])
+        children_to_parents = cache.get(cache_key)


You probably don't want this cache call here, because going out to the cache is going to be a network operation, and you don't want to do that 3K times. You'd want to build the data structure outside of this method altogether and pass it in.

And also because you'd be introducing a Django dependency into common/lib. Maybe just try it first without the caching and see where we end up in terms of profiling times? It's possible that the network hop and deserialization wouldn't be worth it when compared to a single traversal of a structure we already have in memory.

yeah, even our largest structures only have a few thousand items; I can't imagine this taking more than a handful of milliseconds if we do it once

@ormsbee thanks for pointing it out, I am going to move cache and mapping to get_items instead. And yeah, I will update the PR with profiles.

Qubad786 · 2016-01-21T23:10:37Z

Course: 8.MReVx
Total Items: 3402

Continued testing get_items on getting all 3402 items for the current commit.

Following is the profile when get_items is called with include_orphans=False:

Following is the profile when get_items is called without include_orphans=False:

get_items with include_orphans=False turns out to be 0.387 sec slower than master's get_items this time.

@adampalay , @ormsbee please have a look.

ormsbee · 2016-01-22T01:59:08Z

common/lib/xmodule/xmodule/modulestore/split_mongo/split.py

+            if parents_cache is None
+            else
+            parents_cache[block_key]
+        )


This is a bit awkward to read. Please break this out into a more conventional

if parents_cache is None:
    xblock_parents = ...
else:
    ...

ormsbee · 2016-01-22T02:40:34Z

Congratulations! :-) There's probably more that can be done with micro level optimizations, but I think we're pretty close to the point of diminishing returns. Thank you for taking the time to work through this issue and for all the profiling results. If I had a Performance Team appreciation badge 🏆, I'd put one on this PR. :-P

I had a couple of minor comments I'd like to see addressed before merging, but I'm 👍 on the performance aspect of this.

adampalay · 2016-01-22T13:19:49Z

common/lib/xmodule/xmodule/modulestore/split_mongo/split.py

@@ -1195,31 +1201,84 @@ def _block_matches_all(block_data):
        # don't expect caller to know that children are in fields
        if 'children' in qualifiers:
            settings['children'] = qualifiers.pop('children')
+
+        # No need of these caches unless include_orphans is set to False
+        path_cache, parents_cache = None, None


We won't actually even use these variables if include_orphans is set to True. Maybe not even worth setting?

That's just declaration to avoid "variable might be referenced before assignment".

ok, totally fair. Can you split this out to separate lines though?

Yes, I will split them up.

adampalay · 2016-01-22T13:24:41Z

@Qubad786 , just a few nits, but otherwise this is looking good :). And thanks @ormsbee :)

adampalay · 2016-01-22T16:02:36Z

@Qubad786 , this looks great to me! Once you squash your commits, 👍

Qubad786 · 2016-01-22T16:02:56Z

@adampalay , @ormsbee could you please take one last quick look to see whether anything has been left unaddressed? :) Thanks!

adampalay · 2016-01-22T16:08:57Z

@Qubad786 , looks good to me, just needs a squash (maybe 2 commits — @mushtaqak 's original one and one for your improvements)

ormsbee · 2016-01-22T18:21:44Z

common/lib/xmodule/xmodule/modulestore/mongo/base.py

@@ -269,7 +268,7 @@ def load_item(self, location, for_parent=None):  # pylint: disable=method-hidden
                    )
                    if parent_url:
                        parent = self._convert_reference_to_key(parent_url)
-                if not parent and category not in _DETACHED_CATEGORIES + ['course']:
+                if not parent and category not in DETACHED_XBLOCK_TYPES + ['course']:


Does this work (adding a list to a set)?

whoops, it doesn't. Python tests caught that tho
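To illustrate why the expression in the diff fails once the constant is a set, the sketch below shows that `set + list` raises TypeError, followed by two alternatives that do work. The example values are placeholders, and neither alternative is claimed to be the fix that was actually committed.

```python
DETACHED_XBLOCK_TYPES = {"about", "static_tab"}  # example values only

# `set + list` is not defined in Python, so this raises TypeError.
try:
    DETACHED_XBLOCK_TYPES + ["course"]
    concat_failed = False
except TypeError:
    concat_failed = True

# Two ways to express the intended check that do work with a set:
category = "course"
via_union = category not in DETACHED_XBLOCK_TYPES | {"course"}
via_two_tests = category != "course" and category not in DETACHED_XBLOCK_TYPES
```

The union form builds a new set on every evaluation, while the two-test form avoids the allocation; either keeps the original meaning of the condition.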

ormsbee · 2016-01-22T18:22:09Z

Just one comment, otherwise 👍

Code refactor

Qubad786 · 2016-01-22T21:47:37Z

Going to merge.

Qubad786 force-pushed the mushtaq/improve_get_item branch from 53da483 to 9a86e57 Compare December 30, 2015 12:40

adampalay reviewed Dec 30, 2015

Qubad786 force-pushed the mushtaq/improve_get_item branch from 9a86e57 to 0ff0693 Compare December 31, 2015 15:54

mushtaqak reviewed Jan 4, 2016

Qubad786 force-pushed the mushtaq/improve_get_item branch 2 times, most recently from 8c84fbe to 16a724b Compare January 4, 2016 13:53

adampalay reviewed Jan 4, 2016

macdiesel reviewed Jan 6, 2016

Qubad786 force-pushed the mushtaq/improve_get_item branch from ba3e316 to 0de049a Compare January 8, 2016 19:39

ormsbee reviewed Jan 20, 2016

ormsbee reviewed Jan 21, 2016

ormsbee reviewed Jan 22, 2016

adampalay reviewed Jan 22, 2016

ormsbee reviewed Jan 22, 2016

Mushtaq Ali and others added 2 commits January 23, 2016 00:21

Append Item only if it has path to root.

70b55cf

Code refactor

improve get_items and has_path_to_root with temporary caches.

352e219

Qubad786 force-pushed the mushtaq/improve_get_item branch from fbbed2a to 352e219 Compare January 22, 2016 20:51

Qubad786 added a commit that referenced this pull request Jan 22, 2016

Merge pull request #11095 from edx/mushtaq/improve_get_item

8c26178

Get only those block items which have their path to root

Qubad786 merged commit 8c26178 into master Jan 22, 2016

Qubad786 deleted the mushtaq/improve_get_item branch January 22, 2016 21:50

ormsbee mentioned this pull request Dec 18, 2024

feat: Source specifying for relative links is allowed openedx/cc2olx#232

Open
