Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid running a query for every childless page #31

Merged
merged 1 commit into from
Aug 16, 2017

Conversation

zploskey
Copy link
Contributor

@zploskey zploskey commented Aug 5, 2017

Previously, an extra query would be run for every leaf node in
the SimplePage navigation tree. Here we avoid those queries by
fetching all of the SimplePages from the outset and using the same
result to build the navigation tree.

We also remove a redundant "AND simple_pages_pages.is_published = 1"
conditional from these queries. The query already requires that pages be published.

All tests pass, and on my site this reduces database query time on the home page by approximately 50%, and reduced the number of queries from 19 to 10.

@zerocrates
Copy link
Member

I don't think the is_published change is quite right... The default getSelect will do a query based on is_published, but that's only if the user doesn't have the "view all" permission. Logged-in supers and admins wouldn't have the is_published where clause, (also users could see pages they themselves created but are not published as well).

@zerocrates
Copy link
Member

I'm slightly hesitant about requiring all pages to be loaded just because of the case where the function is called with an actual parent page given to start with... but we don't actually do that anywhere internally and the function's effectively just used to get the entire tree the only place it's used, so that's probably fine.

It would probably be a little better still to do slightly more processing on $pages when you first retrieve it, binning the pages into arrays keyed at the top level with their parent ID. You then wouldn't need to loop through the entire set of pages every time you step down recursively, but could instead just look up the appropriate key.

@zploskey
Copy link
Contributor Author

zploskey commented Aug 9, 2017

Let me know if you have any other concerns. Works fine for me.

Previously, an extra query would be run for every leaf node in
the SimplePage navigation tree. Here we avoid those queries by
fetching all of the SimplePages from the outset and using the same
result to build the navigation tree.

After fetching all SimplePages, pages are binned in a 2-D array
keyed by their parent_id.
@zploskey
Copy link
Contributor Author

Squashed into a single commit.

@zerocrates
Copy link
Member

There's just one further somewhat obvious improvement I can think of, which is that we already have to do a findAll on the pages on initial load, so we could just re-use that result for this purpose and avoid a double scan of the table.

I'm not sure this needs to be held back for that purpose, though.

@zploskey
Copy link
Contributor Author

The issue with saving the results of the findAll query in hookDefineRoutes is that plugin filters on simple pages browse_sql are not run on this query as they are not yet defined.

@zerocrates
Copy link
Member

Hmmmm that doesn't sound right... hooks and filters all get registered at once (with the exception of a few special cases for things like install). There's no time within any hook that's "too early" for the browse_sql filter to be applied.

The issue instead is that findAll just doesn't run that filter. It only gets used if you use findBy or getSelectForFindBy.

@zploskey
Copy link
Contributor Author

Ah I see, I was sure it wasn't being run but wrong about why. Would changing that query to findBy be the appropriate solution? It also may not meet the requirement that the page be published.

@zploskey
Copy link
Contributor Author

Additionally sort order is not applied to that query. We could add a sort to it, as I don't think it matters much for defining routes what order they're in. We could then simply skip over non-public pages if requireIsPublished flag is set. However, this might require a change to the API for simple_pages_get_links_for_children_pages as we could no longer allow passing in a custom sort order. I guess the other possibility is we only run a new query if the caller specified a non-default sort order. That's probably the best backwards compatible solution.

@zerocrates
Copy link
Member

Hmm, I'm now leaning toward just not fooling with that particular issue at the moment.

@zerocrates zerocrates merged commit 332e7ad into omeka:master Aug 16, 2017
@zploskey zploskey deleted the fewer_queries branch November 1, 2017 04:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants