-
-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Large Database, Deleting Old Raw Data Seems to Not Work #17431
Comments
Hi @mddvul22, it should have worked to delete the logs, it's possible the task that deletes the logs is failing somehow... would you be able to check the following?
(Note: 2019-04-08 is 730 days before today, or thereabouts. You could do a shorter range just to test. Also note that optimizing the tables also takes a long time but might also be the reason you're not seeing a change in table size.) |
Thanks @diosmosis When looking at the log_visit table, it appears that the data did get deleted. The oldest data in that table is dated April 8th, 2019. What should I be seeing when I look at the log_link_visit_action table? The server_time column of the first row in that table is showing a timestamp of "2014-04-1 13:35:07". |
For the Regarding the
|
Thanks @diosmosis I think I'm making progress. I ran the following command (giving it a shorter time period, as you mentioned):
After a few hours, I got the error that PHP had run out of memory. I was already giving it 2GB. But I raised it to 3GB, and started over. This worked and the first step of deleting visits completed. So, I think the core problem the other day was that it ran out of memory. Then, it started optimizing table log_visit and completed, successfully. Then, it started optimizing table log_link_visit_action. Here is where I ran into a problem that I've not seen before: Optimizing table log_link_visit_action... WARNING [2021-04-10 10:50:32] 1976738 /var/www/webanalytics/matomo/libs/Zend/Db/Statement/Pdo.php(228): Warning - PDOStatement::exec [Zend_Db_Statement_Exception] [PDOException] core:delete-logs-data [--dates="..."] [--idsite[="..."]] [--limit="..."] [--optimize-tables] I'm not really sure what that means. The mysql server is still running. It did reduce the size of the piwik_log_link_visit_action table from 1010GB to 844GB. Do you have any ideas about the mysql server has gone away error? Thanks! |
@mddvul22 it means the connection was automatically closed because it hadn't been used in a certain amount of time. We have a faq about how to fix these errors: https://matomo.org/faq/troubleshooting/faq_183/. Can you see whether any of these work for you? |
Hi @diosmosis I've been following that page you linked to for the past several days, raising the values each time and retrying the command to optimize tables, and still it fails each time with: [PDOException] Here are my current settings, as specified on that page: I don't have wait_timeout specified. But according to the documentation online, 28800 is already the default. Other values, as requested in that link: And, if I'm understanding my matomo config/global.ini.php file correctly, we are already using PDO/MYSQL: |
Hi @mddvul22, is it taking longer than 28800 seconds (8 hours) to see this error or less time? Are you able to check if there are any long running DELETE queries in your mysql via |
Definitely more than 8 hours to see the error. I'm now running the command again, but this time I prefaced it with "time", so I can get a real idea of when the error occurs. Also, I'll check what SHOW PROCESSLIST can show throughout this run time. |
@mddvul22 ok, if it takes more than 8 hours, it might be that there's just too much data to remove. Upping the wait_timeout to something higher might help for this initial run. |
Ok, I've raised the wait_timeout to 86400. Now running the command again. |
Ran the script yesterday, after increasing the wait_timeout to 86400 (24 hours). I got the same error as before. It appears to have generated the error just about 10 minutes ago. At this moment, SHOW_PROCESSLIST doesn't show any long running processes, but I wasn't on the server just before it died. It ran for 16 hours and 4 minutes before it generated the error. So, it appears that the problem is not wait_timeout. I will also add that the error comes during the optimization of the log_link_visit_action table: Optimizing table log_visit... done. Time elapsed: 5718.357s |
@mddvul22 the tables are optimized at the end of the process. Could you try and manually optimize the tables? |
I'd be happy to try. The command I have been running, all along, is: ./console core:delete-logs-data --dates=2001-01-01,2019-04-08 --optimize-tables What should I run for manually optimizing the tables? Is it just: -Bryan |
@mddvul22 I meant running the sql to optimize a single table at a time. If it takes so long to optimize a single table that the mysql connection times out, it might be better to run the queries one by one yourself. I'm hoping this is just because the tables are so large right now, and won't be an issue on subsequent runs. Given the log_visit table optimize finishes you can probably skip that one. (I'd also check the size of the log_visit table to see if it was reduced from the size mentioned in the first comment you made here.) |
The piwik_log_visit.ibd is definitely smaller. Originally, it was 161GB. Now, it is 138GB. Last night, I ran "OPTIMIZE TABLE piwik_log_link_visit_action;" from inside mariadb. In the mysql output, before I went to bed, it showed that it had completed 100% of the optimization, but it was still doing something, as I had not been returned to a mysql prompt. Checking it this morning, I see that I got the same error: ERROR 2013 (HY000): Lost connection to MySQL server during query So, looking at the size of the table, it has gotten much smaller. It is now 775GB, as opposed to its original 1009GB. Here is what I see in /var/log/messages: Apr 17 04:42:11 sitka kernel: mysqld[2991042]: segfault at 0 ip 000055c21df95a47 sp 00007fa853bafa50 error 6 in mysqld[55c21d315000+1228000] Then there is a long log entry that begins with: Any thoughts? BTW, this machine has 16GB of RAM if that is important. |
It sounds like the connection timed out but the query still ran (as it should). I think the tables are just too large to optimize in time. I would suggest setting the You'll also want to run OPTIMIZE on the other tables to get their size down. |
I wanted to come back to this. I've continued to work on this. I finally got our raw logs trimmed down to our desired two years. I then completed the table optimization via the mysql command line. Then, I made a few php.ini changes, based upon the information on this page: https://haydenjames.io/mysql-server-has-gone-away-error-solutions/ Finally, I ran the following command last night: ./console core:delete-logs-data --dates=2001-01-01,2019-04-21 --optimize-tables And it completed successfully! I'm not sure whether the changes to php.ini helped, or not. Perhaps it was just getting the database down to a size that Matomo could manage. Regardless, I was happy it worked without failure. |
That's great to hear @mddvul22! I'll close this since it's working for you. |
We have, what I believe is a very large Matomo database. Up until this week, we have never scheduled any regular deletion of old raw data. Some example table sizes in our database, as of two days ago:
piwik_log_link_visit_action.ibd : 1009G
piwik_log_visit.ibd: 161G
piwik_log_action.ibd: 19G
As you can see, those three tables, combined are over 1TB.
So, yesterday, I configured the section: "Regularly delete old raw data", and told it to delete logs older than 730 days.
Well, according to Matomo, it deleted data earlier this morning:
But, this morning I logged in to our server and checked and our database has not gotten smaller, at all. Today, it shows:
piwik_log_link_visit_action.ibd : 1010G
piwik_lpiwik_log_action.ibd: 162G
piwik_log_action.ibd: 19G
Why aren't these tables getting smaller? According to this page: https://matomo.org/faq/troubleshooting/faq_42/ it appears that deleting raw data should "free significant database space". Am I doing something wrong?
Expected Behavior
I would expect that those tables would have gotten much smaller.
Current Behavior
Two of the tables actually increased in size. I think the increase is because of new data having been imported overnight. So, most likely, the purge of old data did nothing.
Context
Our database is becoming unmanageable and is causing problems for us, with backups and with Matomo upgrades. We need to reduce its size.
Your Environment
The text was updated successfully, but these errors were encountered: