Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize/speed up the dataverse page #7804

Closed
landreev opened this issue Apr 14, 2021 · 8 comments · Fixed by #8143
Closed

Optimize/speed up the dataverse page #7804

landreev opened this issue Apr 14, 2021 · 8 comments · Fixed by #8143

Comments

@landreev
Copy link
Contributor

Sub issue of #7788.
The dataverse page has objectively become very slow in production.
There are most definitely other things that can be slowing things down. But it is safe to say that the page is slow and resource-intensive by itself. (It is not exactly speedy even after a full restart).
Let's try to measure and quantify how much all the separate components are costing us there - db queries, solr queries, cpu cycles, waiting for other system resources, etc. - when serving our prod. data.
Some of it can be achieved by very low tech means, by inserting log statements in strategic places; but there are obviously more sophisticated ways to go about it.

@djbrooke
Copy link
Contributor

djbrooke commented Apr 14, 2021

  • As a first step, create a diagnostic build for some of this, on EC2 with prod-like data - this should give enough information to identify areas to address. This may highlight issues of solr as a resources hog, or point to something else.

@djbrooke djbrooke added the Large label Apr 14, 2021
@landreev landreev self-assigned this May 19, 2021
@landreev
Copy link
Contributor Author

landreev commented May 19, 2021

(I'll get some accurate prod. db query counts for the page first, per issue #7802, before doing anything serious for this one)

@landreev
Copy link
Contributor Author

landreev commented Sep 1, 2021

OK, this was supposed to be the "large" issue to track the overall resource use. I moved it temporarily out of the "in progress" column (back in May!!), and it was never moved back.
... But maybe there should be another umbrella issue yet - "Improve the performance of the dataverse page".

I will be adding more info on said resource use...

@landreev
Copy link
Contributor Author

landreev commented Sep 1, 2021

One important data point is that the SOLR search is still NOT a major resource draw or bottleneck there.
In fact, disabling the seach-include-fragment, and commenting out the search in the init section of the page does not reduce neither the load time, nor the number of queries (see #7802) next to nothing.
Looking at the queries there (specifically, all the extra lookups on the AuthenticatedUser table) means there is a lot of duplicated waste elsewhere in the page, outside its main search-and-display-the-results workload.

@landreev landreev changed the title Investigate/quantify overall real-life resource use by the dataverse page Optimize/speed up the dataverse page Sep 14, 2021
@landreev
Copy link
Contributor Author

Renamed the issue; will move this one back into in progress, to serve as the main/parent issue for the optimization work on the page. The associated PR will be tied to this issue.
All the separate issues related to performance investigation and optimization - this one, #7802, #7803, #7788, #7927 - were created so that they could be worked on in parallel to facilitate a team effort. But since that didn't happen they kinda ended up creating extra clutter.

@landreev landreev self-assigned this Sep 14, 2021
@landreev
Copy link
Contributor Author

(I'm not sure I can move it into the "in progress" column myself? - but this is the one that was supposed to be there, instead of #7802. That was a sub-issue, that was moved to "in progress" temporarily and got stuck there)

@landreev
Copy link
Contributor Author

landreev commented Sep 17, 2021

This is one isolated example of a pattern seen throughout our pages.
The following fragment - download metrics block on the root page:

<div id="metrics-block1" class="col-xs-4" jsf:rendered="#{DataversePage.isRootDataverse() and !settingsWrapper.rsyncOnly}">
    <div id="metrics-label" class="col-xs-4 small text-center">
        <h:outputLink value="#{systemConfig.metricsUrl}" target="_blank" title="#{bundle['metrics.title.tip']}" rendered="#{systemConfig.metricsUrl != null}">
            <span class="glyphicon glyphicon-stats"/> #{bundle['metrics.title']}
        </h:outputLink>
        <ui:fragment rendered="#{systemConfig.metricsUrl == null}">
            <span class="glyphicon glyphicon-stats"/> #{bundle['metrics.title']}
        </ui:fragment>
    </div>
    <div id="metrics-content" class="col-xs-8 small text-center">
        <h:outputFormat styleClass="metrics-downloads" value="{0} #{bundle['metrics.downloads']}">
            <f:param value="#{guestbookResponseServiceBean.getCountOfAllGuestbookResponses()}"/>
        </h:outputFormat>
    </div>
</div>

This is what it costs us in database queries alone (sorted):

SELECT ID, CONTENT, LANG, NAME FROM SETTING WHERE ((NAME = ':DownloadMethods') AND (LANG IS NULL))
SELECT ID, CONTENT, LANG, NAME FROM SETTING WHERE ((NAME = ':DownloadMethods') AND (LANG IS NULL))
SELECT ID, CONTENT, LANG, NAME FROM SETTING WHERE ((NAME = ':DownloadMethods') AND (LANG IS NULL))
SELECT ID, CONTENT, LANG, NAME FROM SETTING WHERE ((NAME = ':DownloadMethods') AND (LANG IS NULL))
SELECT ID, CONTENT, LANG, NAME FROM SETTING WHERE ((NAME = ':DownloadMethods') AND (LANG IS NULL))
SELECT ID, CONTENT, LANG, NAME FROM SETTING WHERE ((NAME = ':DownloadMethods') AND (LANG IS NULL))
SELECT ID, CONTENT, LANG, NAME FROM SETTING WHERE ((NAME = ':DownloadMethods') AND (LANG IS NULL))
SELECT ID, CONTENT, LANG, NAME FROM SETTING WHERE ((NAME = ':MetricsUrl') AND (LANG IS NULL))
SELECT ID, CONTENT, LANG, NAME FROM SETTING WHERE ((NAME = ':MetricsUrl') AND (LANG IS NULL))
SELECT ID, CONTENT, LANG, NAME FROM SETTING WHERE ((NAME = ':MetricsUrl') AND (LANG IS NULL))
SELECT ID, CONTENT, LANG, NAME FROM SETTING WHERE ((NAME = ':MetricsUrl') AND (LANG IS NULL))
SELECT ID, CONTENT, LANG, NAME FROM SETTING WHERE ((NAME = ':MetricsUrl') AND (LANG IS NULL))
SELECT ID, CONTENT, LANG, NAME FROM SETTING WHERE ((NAME = ':MetricsUrl') AND (LANG IS NULL))
SELECT ID, CONTENT, LANG, NAME FROM SETTING WHERE ((NAME = ':MetricsUrl') AND (LANG IS NULL))
SELECT ID, CONTENT, LANG, NAME FROM SETTING WHERE ((NAME = ':MetricsUrl') AND (LANG IS NULL))
SELECT ID, CONTENT, LANG, NAME FROM SETTING WHERE ((NAME = ':MetricsUrl') AND (LANG IS NULL))
select count(o.id) from GuestbookResponse  o

There are obvious problems with the fragment above: a direct call to SystemConfig (instead of the SettingsWrapper) for the MetricsUrl; and, when the SettingsWrapper is called for isRsyncOnly, the code in there makes no attempt to cache the result, and (again) keeps calling SystemConfig directly; resulting in the repeated DownloadMethods queries above.
Do note just how many times they are repeated. This is NOT an entirely unexpected behavior either. JSF never promises that a render= attribute is going to be evaluated only once! Depending on the component's own children components, and where it is positioned in the page hierarchy, it may be evaluated during multiple stages (RENDER_RESPONSE, APPLY_REQUEST_VALUES, PROCESS_VALIDATIONS... etc.). Although I personally cannot explain just why it appears to be evaluated 7 (?) times here. But regardless, we should be conscious of this and avoid putting anything into rendered= attributes that results in database calls. JSF docs say you should make any such lookups during the page init. I feel it should be ok to just cache the result the first time it's retrieved. The settings queries are obviously not super expensive, but it is a common pattern. And it adds up. And it may be worth a concerted effort to eliminate any such waste throughout all of our pages.

Then there is also the part that the final lookup for the total number of guestbook entries is a fairly expensive query, that should probably be cached between pages too. (nobody needs that number to be up-to-date down to the second). But that's a different topic...

@landreev
Copy link
Contributor Author

To reiterate, the example above is not by any means a huge amount of waste by itself. It's just that, an example illustrating the dangers of putting methods that do any real work into JSF attributes; since they may end up getting evaluated multiple times.
I'll add an example that results in many more wasted queries shortly.

landreev added a commit that referenced this issue Sep 22, 2021
…ragment. (#7804)

(may still rearrange the code before finalizing)
landreev added a commit that referenced this issue Sep 28, 2021
query that becomes unreasonably expensive for an installation with
a lot of download/access activity.
This solution is very efficient, but very PostgresQL-specific, hence
the use of a stored function. (#7804)
landreev added a commit that referenced this issue Sep 29, 2021
query that becomes unreasonably expensive for an installation with
a lot of download/access activity.
This solution is very efficient, but very PostgresQL-specific, hence
the use of a stored function. (#7804)
landreev added a commit that referenced this issue Oct 4, 2021
landreev added a commit that referenced this issue Oct 5, 2021
landreev added a commit that referenced this issue Oct 6, 2021
landreev added a commit that referenced this issue Oct 7, 2021
landreev added a commit that referenced this issue Oct 12, 2021
landreev added a commit that referenced this issue Oct 12, 2021
landreev added a commit that referenced this issue Oct 12, 2021
landreev added a commit that referenced this issue Oct 14, 2021
…tedUsers, that does not

require passing the command name as a literal string; the next step is to streamline the
code duplicated between the 2 methods, similarly to how canIssueCommand() for regular users
is organized in the wrapper - now that we can. (#7804)
landreev added a commit that referenced this issue Nov 22, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants