Configurable delay between each Auto-Cache Engine connection #294

Closed
raamdev opened this issue Aug 30, 2014 · 17 comments

@raamdev
Contributor

raamdev commented Aug 30, 2014

The Auto-Cache Engine currently goes as fast as it can, making many requests to URLs when pre-caching a site. It would be nice if there was a way for a site owner to specify a delay between each request (in milliseconds, using usleep()) so that if upon inspection of the Auto-Cache Engine log they see many request timeouts, the site owner can try increasing the amount of time between each request.

@jaswsinc writes...

Somewhere right around this line of code maybe: http://bit.ly/XZQtMU

Note also that the Auto-Cache Engine currently runs on 15-minute intervals, caching as many URLs as it can within those 15 minutes. If all of the URLs are not cached, the remaining URLs will be cached during the next run, with any already-cached URLs ignored from being re-cached, unless they need to be.

If the delay between each Auto-Cache Engine request is considerably long and the number of URLs to cache considerably high, you need to consider that it may take the Auto-Cache Engine many iterations to cache the entire site. For this reason, there should be a warning displayed to the site owner if they try to set the delay to anything more than a few seconds.
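As a rough back-of-the-envelope sketch of why long delays matter: the helper names, the 3-second warning threshold, and the optional per-request overhead parameter below are assumptions for illustration and are not part of Quick Cache. Only the 15-minute run length comes from the description above. Note that usleep() takes microseconds, so a delay configured in milliseconds has to be multiplied by 1000 before it is passed along.

```php
<?php
// Hypothetical helpers for illustration; not part of Quick Cache.

// Convert a delay configured in milliseconds into the microseconds usleep() expects.
function ace_delay_to_microseconds(int $delay_ms): int {
    return max(0, $delay_ms) * 1000;
}

// Estimate how many runs it takes to visit every URL once,
// assuming each URL costs the delay plus an optional request overhead.
function ace_estimate_runs(int $url_count, int $delay_ms, int $request_ms = 0, int $run_seconds = 900): int {
    if ($url_count <= 0) {
        return 0;
    }
    $per_url_ms   = max(1, $delay_ms + $request_ms);
    $urls_per_run = max(1, intdiv($run_seconds * 1000, $per_url_ms));
    return (int) ceil($url_count / $urls_per_run);
}

// Decide whether the settings page should warn about a long delay (assumed threshold).
function ace_delay_needs_warning(int $delay_ms, int $threshold_ms = 3000): bool {
    return $delay_ms > $threshold_ms;
}
```

Under these assumptions, 5,000 URLs with a 500ms delay fit in 3 runs, while the same site with a 5-second delay would need 28 runs, which is about seven hours at one run every 15 minutes.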

See also, related feature request for user-configurable schedule for Auto-Cache Engine: #293

@mvander

mvander commented Aug 30, 2014

Agreed. This would be a great addition.

@jaswrks

jaswrks commented Sep 2, 2014

@ronnieg303 writes from #303...

Following is some of the feedback from web host tech support after QC Pro auto-cache killed their server and brought down all websites on that server. All of this started just after I enabled auto-cache for the first time on a very large site that has lots of complex database driven content pages, which is why I need to have a caching plugin like QC in the first place.

"Something caused an out of memory error and processes started being killed (mysql was one of them)."
"When I logged in from shell, there were a high number of php processes under . "
"Apache wasn't responsive, so I couldn't see exactly what urls prompted the processes. I can't say that those processes caused the OOM issue, or were the result of it."
".. mysql errors: " ..exceeded number of connections .."

What this tells me is that QC definitely needs to be able to throttle its requests.

IMO, this should be done with a configuration setting that would allow initiation of no more than x requests at a time. Auto-cache would wait for pending requests to be completed and cache results stored before initiating any more requests. A configuration setting for a max timeout per request may also be helpful. That way, if an outstanding request times out, it can be cancelled and another request for a different page could be initiated. A log of any such timeouts would need to be generated so webmasters can research and resolve any page generation time issues, or increase the timeout value.

Limiting number of outstanding page generation requests to x as a throttling mechanism allows simple sites with faster overall page generation times to cache quickly, while allowing larger and more complex sites like mine to cache as quickly as their page generation times may permit, without overloading host resources and causing things to fail catastrophically like they do now.

Until this hosting resource overloading issue is resolved, I am unable to use QC Pro, and will have to continue to rely on W3 Total Cache, which has been running on a similar size site with exactly same configuration for several months now without this issue.

Contact me directly for additional info and specific domain names.

@raamdev raamdev added this to the Next Release milestone Sep 2, 2014
@raamdev
Contributor Author

raamdev commented Sep 2, 2014

Marking this a bug after receiving the report in #303.

@ronnieg303

#306 is in no way a duplicate of #294. #306 is a totally separate issue, having to do with a page load timeout while getting an individual page cached, not with scheduling. #306 needs to be reopened as a separate issue.

@raamdev
Contributor Author

raamdev commented Sep 3, 2014

@ronnieg303 #306 specifically refers to the Auto-Cache Engine timeout, which is what this (#294) issue seeks to address. Therefore, #306 is a duplicate. Please re-read this issue's description:

The Auto-Cache Engine currently goes as fast as it can, making many requests to URLs when pre-caching a site. It would be nice if there was a way for a site owner to specify a delay between each request (in milliseconds, using usleep()) so that if upon inspection of the Auto-Cache Engine log they see many request timeouts, the site owner can try increasing the amount of time between each request.

@ronnieg303

I actually did read that. But I believe that the earlier description is actually confusing the issue. Increasing or adding a delay between requests does not help or change the fact that any one request can still time out if it does not receive the page content to be cached within 5000ms. Therefore, the suggestion that increasing or adding a delay between requests would reduce the timeouts being logged is erroneous. There are actually two separate timings at play, not one: 1) time to execute a page request and get the results to cache, and 2) a timing delay between requests, which is a server resource / loading issue.

@raamdev
Contributor Author

raamdev commented Sep 3, 2014

@ronnieg303 Apologies for the confusion. When I created this issue (#294) the research I conducted led me to discover that WordPress itself has a way of changing the timeout length (for which the default is 5000ms), so while that information was in my head, it was not actually anywhere in this issue.

The timeout that you're seeing and referring to by "1) time to execute a page request and get the results to cache" is coming from the WordPress HTTP API, which Quick Cache (being a WordPress plugin) uses. (See the WordPress HTTP API on the Codex for more details.)

Here's the relevant section related to changing the default timeout:

The argument 'timeout' allows for setting the time in seconds, before the connection is dropped and an error is returned. The default for this value is 5 seconds and it also has a filter named, 'http_request_timeout', in case you want to write a plugin that sets it for every request.

With that, you can change the default timeout to something longer, such as 15 seconds (15000ms), by adding the following code to your theme's functions.php file (or, even better, creating an MU-Plugin):

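// 'http_request_timeout' is a filter, so register the callback with add_filter().
add_filter( 'http_request_timeout', '__custom_http_timeout_extension' );

// Receives the current timeout and returns the new one, in seconds.
function __custom_http_timeout_extension( $timeout ) {
    return 15;
}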

Changing this value is really not something that Quick Cache should touch and rather is something that should be done by a site owner on a per-site basis, as such changes would be very dependent on the server environment and configuration (maximum values set in PHP, etc.).

@ronnieg303

I think we are actually getting to the same page then. re:

"Changing this value is really not something that Quick Cache should touch and rather is something that should be done by a site owner on a per-site basis, as such changes would be very dependent on the server environment and configuration (maximum values set in PHP, etc.)."

This is in fact exactly what my issue #306 proposes that QC be able to do if a site owner so requests: override that default http timeout, and only for QC page requests, because the site owner already knows that some pages may take longer to render and are generating timeouts in the auto-cache log.

In fact, WP does not time out all http requests at 5000ms, because if it did, many pages on my site would never render in that time and would be failing left and right, and that doesn't happen. So that timeout appears to be applied only to internally initiated http requests, like those QC does.

When a site's pages take more than 5 seconds to load, that is exactly why caching plugins like QC are needed.

@raamdev
Contributor Author

raamdev commented Sep 3, 2014

This is in fact exactly what my issue #306 proposes that QC be able to do if a site owner so requests: override that default http timeout, and only for QC page requests, because the site owner already knows that some pages may take longer to render and are generating timeouts in the auto-cache log.

Right. And as I see it, this is something that a site owner should configure on a per-site basis, outside of Quick Cache itself.

But let me bring @jaswsinc into the mix here and get his thoughts on this. @jaswsinc what do you think? Add an Auto-Cache Engine option to extend the WordPress HTTP API timeout, an option that would only apply to requests being made by the Auto-Cache Engine?
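For reference, a minimal sketch of what a timeout scoped to Auto-Cache Engine requests could look like if such an option were added. It relies on the per-request 'timeout' argument described in the Codex excerpt quoted earlier, rather than on the site-wide filter. The helper names and the 15-second default are assumptions for illustration, not Quick Cache code.

```php
<?php
// Hypothetical helpers for illustration; not part of Quick Cache.

// Build request arguments that apply to a single HTTP API call only.
function ace_http_args(int $timeout_seconds = 15): array {
    return array(
        // Per-request timeout in seconds; other requests keep the default.
        'timeout' => max(1, $timeout_seconds),
    );
}

// Fetch one URL with the scoped timeout. Returns null outside WordPress.
function ace_fetch(string $url, int $timeout_seconds = 15) {
    if (!function_exists('wp_remote_get')) {
        return null; // Not running inside WordPress.
    }
    return wp_remote_get($url, ace_http_args($timeout_seconds));
}
```

Because the argument is passed with each call, nothing else on the site that uses the HTTP API would be affected.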

@ronnieg303

Please keep in mind that a typical WP site owner is not going to have a clue about how to manage or change this, and even if they did, their hosting service may or may not support even allowing them to do it. If it's anything more complex than adding a line to wp-config.php, most users are going to freak out.

@jaswrks

jaswrks commented Sep 3, 2014

@jaswsinc what do you think? Add an Auto-Cache Engine option to extend the WordPress HTTP API timeout, an option that would only apply to requests being made by the Auto-Cache Engine?

As I see it, a stream timeout is not necessary for the ACE (Auto-Cache Engine). Ordinarily, I would agree that a stream timeout should be configurable. However, in this particular case we are dealing with non-blocking (i.e. asynchronous) connections over HTTP, and PHP's ignore_user_abort() functionality so that timeouts become a moot point; i.e. so that site owners don't need to worry about this issue at all.

In the WP_Http class, there are two timeouts that we deal with.

  1. Connection timeout; i.e. how long it takes to "connect" over HTTP.
  2. Stream timeout (aka: read timeout); i.e. how long it takes WP to render the page and to respond back with the HTML that Quick Cache needs to cache.

The connection timeout is 5 seconds by default. I see no reason to make this configurable in QC, assuming that we correct the current issue here; where the ACE is creating a burden on some servers by connecting too many times in too short a period. If the delay (i.e. offset) is configurable, that should eliminate load issues, and then 5 is more than enough. If a timeout occurs in the initial connection, the ACE can just come back to that one later anyway :-)

Note: as Raam mentioned, there is a filter in WP where the default can be altered, but 5 seconds is a good value here. Anything more would just slow the ACE down and make it inefficient. If a connection timeout occurs, QC can recover from this anyway. There's really no reason for a server to take longer than 5 seconds to respond to itself. It's a connection coming from the server, to the server.


The second timeout is not an issue. Why?

The ACE uses a non-blocking HTTP connection. When the HTTP request is made, once the connection succeeds the ACE immediately disconnects from the socket. On the other end (where QC is running and waiting for the page to be rendered so it can get cached), we detect that a connection was opened by the ACE, and in this case we call upon ignore_user_abort().

So what does that mean? It means the ACE does not need to wait for the page to be rendered at all, it just takes however long it takes. The ACE disconnects long before this is done, and that's OK. The process was spawned and will do its thing, all by itself. The ACE just started the process :-)
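The receiving side of that hand-off can be sketched roughly as follows. How Quick Cache actually detects an ACE connection is not stated in this thread, so the User-Agent marker and the function name here are assumptions for illustration; only the call to ignore_user_abort() comes from the explanation above.

```php
<?php
// Hypothetical detection: assumes the ACE identifies itself in the User-Agent header.
function is_auto_cache_request(array $server, string $marker = 'Auto-Cache Engine'): bool {
    $user_agent = isset($server['HTTP_USER_AGENT']) ? (string) $server['HTTP_USER_AGENT'] : '';
    return $user_agent !== '' && stripos($user_agent, $marker) !== false;
}

if (is_auto_cache_request($_SERVER)) {
    // Keep rendering (and caching) the page even after the caller disconnects.
    ignore_user_abort(true);
}
```

With this in place the page keeps rendering after the caller drops the socket, which is why the read timeout on the ACE side stops mattering.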

@jaswrks

jaswrks commented Sep 3, 2014

@ronnieg303

Good explanation. After reading this, it is likely that all of the timeouts I was seeing in the log were caused by the original server overloading issue that caused it to crash. The server just stopped responding to all http connection requests because it was down. But that begs the question: Why didn't ACE recognize that there was a major server issue and suspend processing? It just kept hammering the server until I managed to kill it by renaming the plugin folder and renaming wp-cron.php so nothing could be scheduled.

@jaswrks

jaswrks commented Sep 3, 2014

Why didn't ACE recognize that there was a major server issue and suspend processing?
It just kept hammering

Exactly, and that's the issue here. Thanks! The ACE is not smart enough to deal with this yet. Fortunately, we are :-) We just need to teach our little QC robot this trick now. haha

As it exists now, it just keeps hammering you. On a large site, this is causing issues because there are enough pages to place a burden on the processor. My advice, if your site is quite large, please disable the ACE for now, until the next release can address this concern.
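To make the idea concrete, here is a minimal sketch of a run loop that pauses between requests and stops once the server looks unhealthy. This is not the actual Auto-Cache Engine code: the function name, the injected $fetch callable, and the "three consecutive failures" rule are all assumptions for illustration.

```php
<?php
// Hypothetical sketch; not the actual Auto-Cache Engine code.
// $fetch is any callable that takes a URL and returns true on success, false on failure.
function ace_run(array $urls, callable $fetch, int $delay_ms = 500, int $max_consecutive_failures = 3): array {
    $done     = array();
    $failures = 0;

    foreach ($urls as $url) {
        if ($fetch($url)) {
            $done[]   = $url;
            $failures = 0; // A success resets the failure streak.
        } else {
            $failures++;
            if ($failures >= $max_consecutive_failures) {
                break; // Server looks unhealthy: stop and leave the rest for the next run.
            }
        }
        if ($delay_ms > 0) {
            usleep($delay_ms * 1000); // usleep() takes microseconds.
        }
    }
    return $done; // URLs that were requested successfully during this run.
}
```

Anything not in the returned list would simply be picked up again on the next scheduled run, in line with how unfinished URLs are already carried over.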

@jaswrks

jaswrks commented Sep 4, 2014

[Screenshot: 2014-09-04_07-48-52]

@raamdev
Contributor Author

raamdev commented Sep 5, 2014

@jaswsinc Thanks so much for submitting a PR for this.

This issue has been closed by PR #311 and this new Auto-Cache Engine option will go out with the next release.

@raamdev raamdev closed this as completed Sep 5, 2014
@raamdev
Contributor Author

raamdev commented Sep 5, 2014

Next release changelog:

  • Bug Fix (Pro): The Auto-Cache Engine now has an option to configure a delay between each request when pre-caching the site. There were some reports of the Auto-Cache Engine causing load issues with large sites on servers that sometimes had trouble handling many requests. See #294.


4 participants