Not reporting data for timerange weeks but single days are working #14123
Hard to say what is happening there. Have you made any config changes to your config file?
Below is our config file:
[log]
[General]
[Tracker]
[mail]
[Plugins]
[PluginsInstalled]
[HeatmapSessionRecording]
[CustomReports]
[MediaAnalytics]
[UsersFlow]
[MultiChannelConversionAttribution]
[Funnels]
Do you see something weird? Best Regards
Can't find anything suspicious here. Do you maybe see any information in the PHP error/webserver logs, or maybe you even write the archive cron output to a log file and something is in there? E.g. it would be interesting to see whether the archiving job sometimes fails randomly with a memory error or similar. I would recommend writing the output of the archiving to a log file like this:
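For illustration, a cron entry along these lines (user, paths, and schedule are placeholders, not the exact command from the thread) appends all archiver output, including errors, to a log file:

```bash
# /etc/cron.d/matomo-archive -- placeholder paths and user.
# Run the archiver hourly and append stdout and stderr to a log file.
0 * * * * www-data /usr/bin/php /var/www/matomo/console core:archive --url=https://matomo.example.org >> /var/log/matomo-archive.log 2>&1
```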
@tsteur
The ownership and directory permissions are correct:
Neither the webserver nor the PHP error logs contain any error messages for that day. Those messages occurred in https://forum.matomo.org/t/archive-problem-after-upgrade/11738/13 too, but have never been resolved.
Sure, we'll try this. I added sleep 1.
You may need to use something like …
I changed it to:
According to the log it seems that this is working correctly:
Every job is starting 1 second delayed.
4 jobs in 1 second.
If I understand this correctly, then this is exactly what I've done. In addition, I assume you're saying that there should only be 1 archiving process at a time? We had this before: the problem is that we're running some Matomo instances where the archive process takes something like 8 hours with only one process. Then I found your comment and used this.
It's fine to have multiple at a time, I would just start them with a little delay (the sleep) to avoid any race conditions.
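For illustration (the Matomo path, the number of processes, and the delay are placeholders), a staggered start of several archivers could look like this:

```bash
#!/bin/bash
# Start a few concurrent archivers, each delayed a little so they do not
# begin at exactly the same moment.
MATOMO=/var/www/matomo

for i in 1 2 3 4; do
    sleep 5
    /usr/bin/php "$MATOMO/console" core:archive --url=https://matomo.example.org \
        >> "/var/log/matomo-archive-$i.log" 2>&1 &
done
wait
```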
This issue occurred again on 08.03.2019 and 10.03.2019. On both days the same and only siteid was affected. A very interesting thing: the archive process runs every hour between 1 am and 11 pm. The delay between the archive processes does not apply to the run at 1 am, but it does apply to every other archive run during the day. I'd really like to know why this happens. As a very first guess I assume it's related to UTC midnight. So I edited my cron file so that the first archive run of the day starts at 2 am. In addition I increased the sleep 1 to sleep 5. I'll check tomorrow if the delay is applied during the first archive run.
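For reference, that cron change amounts to something like this (the wrapper script name is hypothetical; it would contain the staggered start sketched above):

```bash
# /etc/cron.d/matomo-archive -- run hourly, but only between 2 am and 11 pm,
# so the first run of the day no longer coincides with UTC midnight.
0 2-23 * * * www-data /usr/local/bin/matomo-archive-wrapper.sh
```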
Cheers for letting us know and giving it a try 👍
A new update: the main problem occurred again yesterday. The same and only siteid as mentioned in my comment before was affected. That's why I would rather see this issue around this siteid and not around the multithreading. I just have no idea how to debug the problem regarding the siteid. I already checked the MySQL max connections: 300 are set and only 24 have been used so far.
Side note:
During this month we switched from https://matomo.org/faq/how-to/faq_73/
to https://matomo.org/faq/how-to/faq_155/
I assume there are some problems with core:invalidate-report-data and/or waiting for the auto archive run as documented. That's why I used the previous way: deleting data for the whole month and running a manual archive. We'll see if this changes anything.
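Roughly, the two approaches compare like this (site ID, dates, and URL are examples, not the exact commands used here):

```bash
# faq_155-style: invalidate the affected period and let the next archiving
# run (or a manual run) rebuild it.
./console core:invalidate-report-data --dates=2019-03-01,2019-03-31 --sites=1

# Older faq_73-style: force a fresh archive for the affected site and range.
./console core:archive --url=https://matomo.example.org \
    --force-idsites=1 --force-date-range=2019-03-01,2019-03-31
```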
Thanks for letting us know again 👍 curious to see the results
Here I'm back again, providing my results: after that, only one idea was left: switching back to single-threaded archiving. And well, the issue has not been occurring anymore since that day (14.03.2019). In addition, while using the multithreading we noticed something else: the weekly report scheduled in Matomo is sent multiple times.
I received four e-mails this night at 01:44:30 am. As you can see in the log output I provided above, there are four jobs finishing at 01:44:30 am, although I don't really get the elapsed time there. Later on, Matomo tries to prevent another report from being sent:
I really would like to get this multithreading issue fixed, as it's a huge speedup for archiving lots of websites.
Hi there, sorry to get back to you so late. What happens if you start the archivers with a delay of one minute? Of course it's not the best solution and doesn't fix the actual issue, just curious if it works.
Hi @tsteur, actually we are facing the exact same issue. We have archiving running every 15 minutes (*/15 in cron). No race conditions, archiving is done in about 2 minutes. Unfortunately I have to mask the values.
@Littlericket can you confirm it basically never happens that two archivers run at the same time? Does any data appear for the week when you don't search? Are there more than 500 different page URLs recorded (you would see this in the pagination when you don't search, like "1-10 of 500")?
Hi @tsteur, thanks for your reply. Yes, I can confirm that there is never a chance of two archivers running at the same time. We have a lock if one is running and check whether the lock is present before starting another one. There's data for the week, but we're not having more than 500 URLs recorded; we're at 208 for the given week. I haven't read in this ticket that there's a minimum number of URLs needed for archiving, though...
Sorry for replying late, I've been quite busy. What's working for us is running a script every hour that starts one archiving process which runs across every website. In the very beginning we used a lockfile too, but removed it when we heard about the multi-processing. This results in all customers but one finishing within one hour, meaning there is never a second archive process running in parallel. The archiving for the mentioned customer can take longer than one hour, so there is a chance that a second archive process runs in parallel. But we have never faced the previous issue again since we rolled back to single archive processing. So, regarding the specific customer, having a second process running in parallel which is started with a delay of one hour and occurring only 3-4 times a day is working fine.
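For illustration, the kind of lockfile guard described in the two comments above could look like this (paths are placeholders):

```bash
#!/bin/bash
# Skip the run entirely if a previous archiver still holds the lock.
LOCK=/var/run/matomo-archive.lock

exec 9>"$LOCK"
if ! flock -n 9; then
    echo "$(date): previous archive run still in progress, skipping" >> /var/log/matomo-archive.log
    exit 0
fi

/usr/bin/php /var/www/matomo/console core:archive --url=https://matomo.example.org \
    >> /var/log/matomo-archive.log 2>&1
```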
@mattab or @diosmosis do any of you have an idea how this could happen? Not quite sure how this happens. @daylicron @Littlericket @chgarling do any of you know if you have configured the deletion of old raw data or old reports?
The config.php from us (daylicron, chgarling) has been provided in the comment by bab-mkedziora.
Yes indeed, we configured deletion of raw data after 180 days. We don't delete old reports. This Matomo instance was affected very often by this issue, meaning sometimes every day. I have to add that I know of two other Matomo instances where we don't delete the raw data but which were affected by the same issue too. The issue occurred there only 1 to 2 times. Since we rolled back to single-threaded archiving, the issue hasn't occurred there anymore. I would consider this a good place to start. I'm curious what @Littlericket is going to report.
We don't have any custom settings except a proxy and another CORS domain, but we have "old raw data" deletion set to 90 days. Old aggregated report data deletion is disabled.
Maybe related to #14379 although this is probably a different issue? |
@mattab thanks for the reply. I've executed your second query over every table we have. We have no duplicates. We don't have a day with 0 visits in the graph...
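For context, a duplicate check over the archive tables could look roughly like this (table prefix, month, and credentials are placeholders; this is a sketch, not necessarily the exact query referenced above):

```bash
# Look for more than one "done" flag per site/period/date in an archive table.
mysql -u matomo -p matomo -e "
  SELECT idsite, period, date1, date2, name, COUNT(*) AS cnt
  FROM matomo_archive_numeric_2019_03
  WHERE name LIKE 'done%'
  GROUP BY idsite, period, date1, date2, name
  HAVING cnt > 1;"
```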
Currently we don't have them either:
But since we moved back to single-threaded archiving, we don't have this problem anymore. So I'll try to reproduce this in a test setup and check again.
Noticed on the demo we have the same problem, with some of the weekly reports appearing as zeros while days/months/years are all working.
@mattab @diosmosis maybe any idea how this could happen? @mattab have you seen this maybe somewhere as well? |
This would mean the archives are failing to finish, correct? Maybe we can check whether the archives even exist for those dates or if they are finished with DONE_ERROR? |
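One way to check this (table prefix, month, site ID, and week are placeholders) is to list the done flags for an affected week and compare the value column against the DONE_* constants in Piwik\DataAccess\ArchiveWriter:

```bash
# period = 2 means week; date1 is the first day of that week.
mysql -u matomo -p matomo -e "
  SELECT idarchive, idsite, period, date1, date2, name, value, ts_archived
  FROM matomo_archive_numeric_2019_03
  WHERE idsite = 1 AND period = 2 AND date1 = '2019-03-04'
    AND name LIKE 'done%'
  ORDER BY ts_archived;"
```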
Would be good to know. As there seems to be a pattern, I wonder if it is somehow timezone related. I wouldn't expect 0 visits then though, but instead only a partial result. If anyone can give us access to their DB and Matomo instance, that would be great: hello at matomo.org is our email.
Yes, I've seen this happen several times as well (although I had not noticed it only occurred on weeks overlapping two months). It happens randomly. This was a much bigger issue months ago and now it happens a lot less (in recent Matomo releases). It's still visible for example on the demo: https://demo.matomo.org/index.php?module=CoreHome&action=index&date=today&period=week&idSite=62#?idSite=62&period=week&date=today&segment=&category=General_Visitors&subcategory=General_Overview where 2 weeks are missing data and overlap 2 months. (Select …)
Fyi: a separate issue was created for this particular bug in #15363 |
Thanks for reporting this issue. |
Hi there,
we are running Matomo 3.8.1 under Debian 8.11 with PHP-FPM (php-5.6.40) and experienced the following problem more than once:
Data is collected with a Matomo tracking URL as usual, and the archive cronjob runs every hour. Reports for single days are working fine, but if we select the complete week as time range, the report is empty.
After invalidating and recalculating the broken time range, everything is fine.
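For reference, that invalidate-and-recalculate workaround for a single broken week looks roughly like this (site ID, week, and URL are examples):

```bash
# Invalidate the broken week (the date is any day inside that week) ...
./console core:invalidate-report-data --dates=2019-03-04 --sites=1 --periods=week

# ... then re-archive the affected site.
./console core:archive --url=https://matomo.example.org --force-idsites=1
```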