Increase timeout and improve DB table indexes for jobs ProcessUsageStatsLogFile and RemoveDoubleClicks #9822
Because the current solution also works for huge installations, I will close this issue for now...
Hi @bozana
Same here - one usage events log file is just not processed, with the same errors the OP reported.
Working fine now after transferring the database to a Nutanix server.
@bozana Still no luck. OJS 3.4.0-7. The MariaDB database is on a central database server (no local installation; we have to use the central primary/replica server).
@mpbraendle, hmmm... it seems the job batch fails on CompileUniqueInvestigations, so the solution I had in mind for this issue would not help. For this issue I assumed that reading the log file and inserting its entries into the DB tables, i.e. the ProcessUsageStatsLogFile job, was what was failing.
@bozana - the timeout of 60s is taken over from lib/pkp/jobs/BaseJob.php. If you set a high timeout value there, or even in the PKPProcessUsageStatsLogFile and RemoveDoubleClicks classes, the jobs run through. Especially the removeDoubleClicks task may take a while (even on a locally installed MariaDB database). Usually we have 1000-2000 views and 1000-1500 downloads per day, sometimes up to 10000. This will increase over time as we add more articles retrospectively.
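For illustration, such an override might look like this - a minimal sketch assuming Laravel-style job classes (the namespace is illustrative, and the exact property declaration has to match whatever BaseJob defines):

```php
<?php

namespace PKP\jobs\statistics;

use PKP\jobs\BaseJob;

class RemoveDoubleClicks extends BaseJob
{
    /**
     * Maximum number of seconds the job may run before the worker
     * kills it. Laravel workers default to 60s, which is too short
     * for this task on large installations.
     */
    public $timeout = 600;

    public function handle(): void
    {
        // ... remove the user double clicks from the temporary table ...
    }
}
```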
Hi @mpbraendle, thanks a lot! |
No, I run them via workers (using Supervisor), as recommended in https://docs.pkp.sfu.ca/admin-guide/en/deploy-jobs.
Hi all, the problem we have here is a bit more complex, considering our current infrastructural limitations of the queue system. First of all, for such long-running queue jobs, the best and right approach is to use a worker with Supervisor, as @mpbraendle has mentioned. However, there is a limitation: the timeout and retry_after settings interact, and setting them carelessly has side effects. A similar situation is explained in more detail at laravel/framework#35633. NOTE that this case arises when jobs run via workers and MULTIPLE workers are set up, which is currently possible using Supervisor. I can think of a few possible solutions. One option is to increase the timeout (and retry_after) for these specific jobs.
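To illustrate the timeout/retry_after constraint - a sketch assuming a standard Laravel queue connection config (PKP's actual configuration lives elsewhere; treat the path and values as placeholders):

```php
// config/queue.php (illustrative)
return [
    'connections' => [
        'database' => [
            'driver' => 'database',
            'table' => 'jobs',
            'queue' => 'default',
            // Seconds after which a job still marked "reserved" is
            // considered stuck and released back onto the queue.
            // Must be LARGER than the worker timeout: with e.g.
            // --timeout=600 and multiple workers, a retry_after at or
            // below 600 lets a second worker grab a job that the first
            // worker is still processing, so the job runs twice.
            'retry_after' => 630,
        ],
    ],
];
```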
That shouldn't be much of a problem, because the statistics are usually calculated once per night. |
It's not about when or how many times the job is supposed to run, but whether the job can run multiple times at the same time. @bozana, what would the impact be if the same stats jobs ran more than once at the same time?
Hi @touhidurabir, this should not happen -- the stats calculation could then be wrong. |
And also, some jobs, e.g. the RemoveDoubleClicks job that is causing the problem here, cannot be split further into smaller jobs -- it uses a single SQL statement on one DB table that should appropriately remove the user double clicks (an illustrative sketch follows):
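For illustration only - this is NOT the actual statement from PKPTemporaryTotalsDAO, just a sketch of the kind of self-join delete involved (column names follow this thread; COUNTER's double-click rule keeps one of two clicks from the same ip/user_agent on the same canonical_url within a short interval):

```php
use Illuminate\Support\Facades\DB;

// Delete the earlier of two hits from the same user on the same URL
// that occurred within the 30-second double-click window.
// $loadId: placeholder for the current log file's load id.
DB::statement(
    "DELETE ust FROM usage_stats_total_temporary_records ust
     INNER JOIN usage_stats_total_temporary_records ust2
        ON ust2.load_id = ust.load_id
       AND ust2.context_id = ust.context_id
       AND ust2.ip = ust.ip
       AND ust2.user_agent = ust.user_agent
       AND ust2.canonical_url = ust.canonical_url
       AND ust2.line_number > ust.line_number
       AND TIMESTAMPDIFF(SECOND, ust.date, ust2.date) <= 30
     WHERE ust.load_id = ?",
    [$loadId]
);
```

The join predicate is exactly what the proposed composite index (load_id, context_id, ip, user_agent, canonical_url) covers, which is why widening the index should speed this job up considerably.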
Maybe also a note that all stats jobs are chained (see https://github.com/pkp/pkp-lib/blob/main/classes/task/PKPUsageStatsLoader.php#L122), because the order is important and they must not run in parallel.
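A minimal sketch of such a chain, assuming Laravel's Bus facade (the real job list and constructor arguments in PKPUsageStatsLoader differ; $loadId is a placeholder):

```php
use Illuminate\Support\Facades\Bus;

// Chained jobs run strictly in order, one at a time; if one fails,
// the rest of the chain is not executed.
Bus::chain([
    new ProcessUsageStatsLogFile($loadId),
    new RemoveDoubleClicks($loadId),
    new CompileUniqueInvestigations($loadId),
    // ... further aggregation jobs ...
])->dispatch();
```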
@mpbraendle, I added the indexes and increased the timeout to 600. When you are able to test it, please let us know if that all helped...
The name and topic of this issue have changed:
It seems the ProcessUsageStatsLogFile job (which reads and validates the log file, removes the bot entries, and inserts the entries into the usage stats temporary tables) is not the problem any more, so it does not need to be separated into smaller processes (each processing a chunk of the log file at once).
The RemoveDoubleClicks job, however, seems to be the problem -- it takes too long for bigger log files. The job executes this function: https://github.com/pkp/pkp-lib/blob/main/classes/statistics/PKPTemporaryTotalsDAO.php#L90.
To solve this, we will change the index on the table usage_stats_total_temporary_records to also consider the user_agent and canonical_url columns, so that it becomes something like:
$table->index(['load_id', 'context_id', 'ip', 'user_agent', 'canonical_url'], 'ust_load_id_context_id_ip_ua_url');
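A sketch of how that change might be applied in a Laravel-style schema migration (the old index name to drop is an assumption, not the actual PKP migration):

```php
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

Schema::table('usage_stats_total_temporary_records', function (Blueprint $table) {
    // Assumed name of the previous, narrower index.
    $table->dropIndex('ust_load_id_context_id_ip');
    // Wider composite index covering the full double-click join predicate.
    $table->index(
        ['load_id', 'context_id', 'ip', 'user_agent', 'canonical_url'],
        'ust_load_id_context_id_ip_ua_url'
    );
});
```

Depending on the column definitions, MariaDB's index key-length limits may require prefix lengths on user_agent and canonical_url.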
Also, we will increase the $timeout (and $retry_after) for this job.
PRs:
stable-3_4_0:
main:
Original issue:
Big installations that have huge log files and use the AcronPlugin for job processing can experience timeouts. Thus, try to separate ProcessUsageStatsLogFile into several smaller jobs, so that each job processes only e.g. 100 lines of the log file (see the sketch below).
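A minimal sketch of that chunking idea (ProcessUsageStatsLogFileChunk is a hypothetical job name, and $logFilePath/$loadId are placeholders; not the approach ultimately taken, per the update above):

```php
use Illuminate\Support\Facades\Bus;
use Illuminate\Support\LazyCollection;

// Read the log file lazily, split it into 100-line chunks, and dispatch
// one small job per chunk so no single job exceeds the queue timeout.
$jobs = LazyCollection::make(function () use ($logFilePath) {
    $handle = fopen($logFilePath, 'r');
    while (($line = fgets($handle)) !== false) {
        yield $line;
    }
    fclose($handle);
})
    ->chunk(100)
    ->map(fn ($chunk) => new ProcessUsageStatsLogFileChunk($loadId, $chunk->all()))
    ->all();

// Chain the chunk jobs so they still run in order, one at a time.
Bus::chain($jobs)->dispatch();
```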