Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

/treeview endpoint is slow when a lot of nodes are in the tree #2250

Closed
1 task done
unclecheese opened this issue Sep 4, 2018 · 18 comments
Closed
1 task done

/treeview endpoint is slow when a lot of nodes are in the tree #2250

unclecheese opened this issue Sep 4, 2018 · 18 comments

Comments

@unclecheese
Copy link

unclecheese commented Sep 4, 2018

Per silverstripe/silverstripe-admin#571

Request admin/pages/edit/treeview

CMSMain::treeview() takes 12 to 18 seconds to generate response. This function just populates a template:

$this->renderWith($this->getTemplatesWithSuffix('_TreeView'))

Further investigation is needed to figure out which part of the template is slow. This issue is present on every CMS page which shows the site tree, but at least is triggered only once.

Summary:

Issue occurs once per every CMS page load.

Pull Requests

@caffeineinc
Copy link

caffeineinc commented Sep 5, 2018

I'm also finding issues with SiteTree performance, however i think the issues is largely deeper than just rendering. I've got a SiteTree with ~6400 records, most of which is under a single node. Looking deeper, it feels like an issue with how TreeDropdownField populates its MarkedSet.

Do you think that it is related to this, or seperate?

I actually was playing with these settings (in your PR) already, which allows at least the rendering of the tree in the modal.

@caffeineinc
Copy link

Related: silverstripe/silverstripe-admin#571

@unclecheese
Copy link
Author

The TreeDropdownField is definitely a bottleneck, and I'm looking into that, but are you saying it affects the tree view? It should only have an effect on the edit view.

@caffeineinc
Copy link

caffeineinc commented Sep 5, 2018

So the tree drop down issue seems to be the fact it's calling AllChildrenIncludingDeleted twice,
once for markPartialTree and once for getChildrenAsArray during render. SilverStripe lumberjack can take care of some of this, but the whole SiteTree has to be returned and then filtered if you're searching within it.

It makes more sense to me to do the filter on the initial query, because we're operating on less data being a subset that way.

The only other thing, is it calls the same function AllChildrenIncludingDeleted on every object in that partial tree, which for say SilverStripe Blog, is redundant. None of those items should have children due to lumberjack & flat structure.

Do you know if node_threshold_total exsits in 3.7 too ?

@caffeineinc
Copy link

caffeineinc commented Sep 6, 2018

Hmm i think i see why now, getSearchResults() has to return all of the results, even if we're using a search query, as we might not know what permissions (e.g. canView()) the current user has. Sometimes the search query is vague or attributed to a large number of results, which then have to be iterated through twice for markPartialTree and once for getChildrenAsArray.

the other option is applying the hierarchy limits on the search results first too, the likely scenario is the source data should be bound by the same limits as the SiteTree.

Either way this looks like way too much work for these situations.

@chillu chillu assigned unclecheese and lukereative and unassigned unclecheese Sep 6, 2018
@chillu chillu reopened this Sep 11, 2018
@chillu chillu assigned unclecheese and unassigned lukereative Sep 16, 2018
@chrispenny
Copy link

chrispenny commented Sep 16, 2018

We're at a point now where the TreeDropdownField is timing out.

The top level of our Site Tree contains 25 records, and then there are some pages that contain as many as 125 children.

Setting node_threshold_total: 1 on TreeDropdownField isn't working here for a couple of reasons.

  1. When node threshold is set to 1, the searching is also limited to a single level of the Site Tree.
  2. The levels that contain 125 records still time out.

The client snapshot is the best DB to use to test this.

@chillu
Copy link
Member

chillu commented Sep 17, 2018

@unclecheese Is the TreeDropdownField issue sufficiently different to split out into it's own Github issue? I think this form field shouldn't use the tree logic at all. It displays a list, and could optionally cache one level further inwards. None of these requires the weird MarkedSet incantations which relate this to tree performance in the first place. It's a list, not a tree.

@maxime-rainville
Copy link
Contributor

maxime-rainville commented Sep 17, 2018

@chillu I've created #2265 to track work on the TreeDropdownField.

@maxime-rainville
Copy link
Contributor

maxime-rainville commented Sep 17, 2018

The way the sitetree view is built right now is perhaps a bit overkill.

Basically, we start from the root node and try to figure out which nodes needs to be expanded to reach our targeted page.

If I was writing this thing from scratch, my instinct would be do it the other way around ... start from my targeted page and work my way up.

Maybe there's some deeper reason for doing it the current way that is eluding me.

I'm thinking a way around our current predicament could be to implement some sort of TreeRenderingService that could be injected into CMSMain. That could allows sites to use custom logic to render their sitetree. That would allow us to implement my bottom-up logic to render the treeview (for the record I'm not actually 100% sure it would be faster).

Currently, the worst case scenario for me is ~2.7 sec to render treeview. That's when access the treeview .

On a side note, treeview uses the session to determine what the current page is rather than an ID pass as a parameter, which I find kind of weird and probably makes it more difficult to cache.

@chillu
Copy link
Member

chillu commented Sep 17, 2018

Currently, the worst case scenario for me is ~2.7 sec to render treeview

@unclecheese @blueo @chrispenny How does this compare to your findings? Aaron indicated 12-18 secs in the initial issue description here. Note that this is rendering the tree. We've split out the TreeDropdownField discussion into a separate issue (#2265). 2.7 secs seems acceptable, 18 secs isn't. It seems unlikely that this big difference is due to environments (Vagrant vs. SSP)?

Basically, we start from the root node and try to figure out which nodes needs to be expanded to reach our targeted page.

Yeah, since we no longer present hierarchical search results in the tree or TreeDropdownField views, it's a lot more viable. Which is probably the reason it was implemented top-down rather than bottom-up. When we refactor the tree to React, I want to get rid of all the marking logic, and do more on the client. A TreeRenderingService would be a way for advanced users to fix this (and accept functional tradeoffs) but in the end there are other complex websites which have a similar issue. So I'd call that Plan B.

First of all, are we sure that there's only the max of ~250 nodes being processed here? I'm still a bit incredulous that doing anything with 250 things in SilverStripe could take 18 secs. If that's the case, then let's dig through any specific performance bottlenecks within those 250 things, for example is fluent adding overhead?

On a side note, treeview uses the session to determine what the current page is rather than an ID pass as a parameter, which I find kind of weird and probably makes it more difficult to cache.

The combination of user/context specific can*() methods, number of variations of tree views plus the complexity of doing granular cache invalidation likely means that caches won't be too effective either way (through HTTP headers or PHP).

@blueo
Copy link

blueo commented Sep 17, 2018

@chillu @unclecheese @chrispenny The speed of treeview highly depends on the structure of the tree. In the DB we're using (unclecheese has a copy) the top level pages only take 2 - 4s to pull down sitetree but page id 7772 further down the tree, takes around 12s. The difference appears to be number of child pages (ie more = slower load).
I've got an xhprof profile with fluent for a smaller call. It seems fluent might add about a 25% overhead with calls to isAvailableInLocale and we've got another check to getIsPageVisibleOnFrontend which adds another 25% and the other 50% is numChildren from hierarchy.
screen shot 2018-09-18 at 9 48 56 am
Note to image: these are the parent calls to DataList::count which makes up about 42% of calls to the DB

NB the tree call from Treedropdown is pretty similar to this when targeting sitetree.

@sminnee
Copy link
Member

sminnee commented Sep 17, 2018

The combination of user/context specific can*() methods

I think that Versioned::canView() is probably a culprit here. SiteTree::canView() was built to check permission data in bulk via InheritedPermissions.

Versioned::canView can be optimised by calling Versioned::prepopulate_versionnumber_cache() but I don't see that this is happening.

Also Versioned::canViewVersioned() "falls back to default permission check", but when it's being used as an extension on SiteTree, it should just return null in this case. Maybe this is necessary when it's being called directly, but not as part of Versioned::canView().

@sminnee
Copy link
Member

sminnee commented Sep 17, 2018

Argh, scratch that, I wasn't looking at a complete project when I concluded that prepopulate_versionnumber_cache wasn't being called :-/

@sminnee
Copy link
Member

sminnee commented Sep 17, 2018

@maxime-rainville @unclecheese I think another thing that would help here: precaching of Hierachy::numChildren() values in a similar manner to Versioned::prepopulate_versionnumber_cache.

Approach:

  • Replace the cache variable in Hierarchy::numChildren with a static cache by stage, class and ID, like Versioned
  • Introduce a prepopulate_versionnumber_cache method based on Versioned::prepopulate_versionnumber_cache
  • Use a query similar to "SELECT ParentID AS ID, COUNT(*) AS NumChildren FROM SiteTree WHERE ParentID != ID" to cache the number of children in bulk.
  • Incorporate the exclusion filters of Hierarchy::stageChildren() into this query

Hard things:

  • If Hierarchy::augmentStageChildren() is used, then this logic may break. You could disable the optimisation in that case, ignore it, or get more clever about how the query is generated (transform the stageChildren query of a single parent into a general query for all records by manipulation of the underlying SQLSelect object). It'll depend on how augmentStageChildren is used.

Should I break this out as a separate ticket?

@sminnee
Copy link
Member

sminnee commented Sep 17, 2018

I can't see any use of augmentStageChildren in the supported modules, which suggests that it is rare. As such I'd probably:

  • Detect if augmentStageChildren is used
  • If so, throw a warning that this disables the performance optimisation
  • Suppress the bulk prefetching of numChildren in this case
  • Deprecate augmentStageChildren

@sminnee
Copy link
Member

sminnee commented Sep 17, 2018

I've put this over here: silverstripe/silverstripe-framework#8379

@sminnee
Copy link
Member

sminnee commented Sep 19, 2018

Have raised this issue too which will help https://github.com/tractorcow/silverstripe-fluent/issues/467

@maxime-rainville
Copy link
Contributor

Treeview response time has been increased substantially. I think we've address most of what we can without a major refactor of the treeview rendering logic.

I'll close this for now and reopen it if our key project requires further work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants