-
-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Integrity constraint violation: 1062 Duplicate entry '...' for key 'PRIMARY for segment temporary table #18937
Comments
In case it helps, it seems to mostly happen with high traffic instances. It could be though that it's unrelated to high traffic and maybe they use certain features or so |
The site we have the issue on would be regarded as a high traffic site. For low traffic instances, we don't see the issue at the moment. |
I think this is going to be hard to track down without a full segment query. The exception was thrown by code inside the Since it's likely that prior to the temporary index being added in #18797 duplicate records were being inserted, maybe it would be an acceptable workaround to use |
@tsteur I guess the error happens during archiving, right? Are you able to look up which segment is archived when it fails? |
@sgiehl I'm seeing in one of them around visitorType and there are some nested calls which I can see in the full backtrace. |
@tsteur I'm not able to reproduce that in any way. Which makes it hard to find the problem.
|
@sgiehl what would be the next steps to resolve this issue and make progress? If the distinct was the problem, I would assume then it be maybe more reproducible but maybe not. |
That's a good question. Simply using a Not sure if cloud team could maybe try to debug that a bit more to see if it is the query that contains duplicates or if the table already exists before, but records are inserted again nevertheless. |
I had a quick search for "when does mysql distinct not return distinct values" and "mysql distinct vs group by" as we can pretty much rule out the table exists check not working. Finding few posts around like https://www.experts-exchange.com/articles/12282/Select-Distinct-is-returning-duplicates.html where it does mention the Can you get in touch with the cloud team so they know what to log? |
@tsteur Pushed some code that could maybe help to identify the queries that produce duplicate records: |
Be good to test it if the log
If only the cloud team applies this patch, then the cloud team can use a special logger though that won't cause any issues and only logs to file. Is the purpose of the second query to check if the insert works when using |
@tsteur I would suggest we only let the cloud team apply the patch, so we can gather the queries that maybe produce the problem. I've added the |
@JasonMortonNZ @samjf could we log the query mentioned in https://github.com/matomo-org/matomo/compare/m18797?expand=1#diff-4140e491c64a9dcbf541f11e1db846adc3c27a4000810494ba8688e5feeee776R332 using our cloud file logger whenever the error occurs? Because the temp table insert eventually works at some point when it tries to archive again, we would need to re-run the |
We've been logging the queries that produced this error and then executed the Example query:
I've executed a few different queries and never received any duplicate visit ID (when testing the select) or error (when testing the insert). This was tested from a MySQL client directly, not through application code. What else could we log or do to resolve this issue @sgiehl ? |
Verified that the table is empty when this error happens, then the |
FYI I've also applied below patch but the error seems to still happen meaning it doesn't look like the read uncommitted is causing this.
|
Is there anything else we should log or try? |
@tsteur This may be a silly question since I'm not sure what's actually being done here, but is it by any change possible that this happens when there are concurrent archivers and they happen to write same id's to the same temporary table at the same time? |
@EreMaijala I was wondering this as well but it shouldn't be unless the DB connection was shared. Temporary tables only exist for the DB connection and aren't visible to other DB connections. As soon as the DB connection is closed the temporary table is also removed. The archiving basically works like this:
Sorry it's probably not clear what I'm describing above. There are cases where sub-processes launched from one PHP command can share the DB connection from the parent command. From testing this is not the case though on our platform and we're also not launching multiple commands in parallel so this shouldn't be the case here. I believe we can rule out that parallel commands are trying to write into the same table. |
@tsteur is there anything we can do here yet to move this one along or should we move it out of the milestone? |
@justinvelluppillai that would be up to the core team but we definitely need to resolve this issue as it affects quite a few users and breaks the archiving. |
Would it be worth writing a test script to build a suitable random dataset and then repeating the select / temporary table creation until the error is encountered? If we can automate recreating the issue reliably, even if only once every x attempts, then we could narrow down the possible causes. |
I did run these queries many times after we encountered the error on production and couldn't reproduce it. It was using the very same query just minutes later and running it many times. It may be really hard to reproduce. Since we know we don't execute the query twice we could potentially also think about simply adding an |
That seems like a pragmatic solution to me 👍 We've established that it only happens occasionally, it's hard to reproduce, we're not running the query multiple times and since the example queries are returning a simple list of primary keys then ignoring a duplicate insert value won't affect the intended outcome. |
While I appreciate the pragmatic approach and hope for a quick resolution to be able to upgrade Matomo, I'm still a bit worried that the root cause is unknown. Not that I have any better idea, so maybe |
We also had the problem on our instance. Additionally, we had another issue related to MariaDB, similar to #18864. By adding It may not be the best solution for the issue, but it may be a temporary solution until a fix is found. |
When this is set, the query will be no longer executed but it also makes archiving the reports in the background slower.
I would be too. However, because we ruled out that the same query might be executed twice and is writing data twice into it I'm not as worried. It's neither executed in parallel by another process nor after another. This would have been my biggest worry. |
Let's go with the |
Getting recently below error regularly. I assume this started with #18797
Goal of this issue to avoid these problems. The segment query already seems to do a
distinct
so maybe we execute the query twice when the data actually already exists and there is no need to execute it again?The text was updated successfully, but these errors were encountered: