Unstable test testIntegrationSuite4.TestTruncatePartitionAndDropTable #27889

Closed
Tracked by #25899
sleepymole opened this issue Sep 8, 2021 · 13 comments · Fixed by #28094 or #28794
Labels: component/test, severity/major, type/bug (The issue is confirmed as a bug.)

Comments

@sleepymole
Contributor

Bug Report

Please answer these questions before submitting your issue. Thanks!


[2021-09-08T09:26:28.816Z] db_partition_test.go:2236:
[2021-09-08T09:26:28.816Z]     c.Assert(hasOldPartitionData, IsFalse)
[2021-09-08T09:26:28.816Z] ... obtained bool = true


1. Minimal reproduce step (Required)

https://ci.pingcap.net/blue/organizations/jenkins/tidb_ghpr_check_2/detail/tidb_ghpr_check_2/32078/pipeline/62

2. What did you expect to see? (Required)

3. What did you see instead (Required)

4. What is your TiDB version? (Required)

@tisonkun
Contributor

[2021-09-10T03:30:28.315Z] FAIL: db_partition_test.go:2174: testIntegrationSuite4.TestTruncatePartitionAndDropTable
[2021-09-10T03:30:28.315Z] 
[2021-09-10T03:30:28.315Z] db_partition_test.go:2236:
[2021-09-10T03:30:28.315Z]     c.Assert(hasOldPartitionData, IsFalse)
[2021-09-10T03:30:28.315Z] ... obtained bool = true

another instance https://ci.pingcap.net/blue/organizations/jenkins/tidb_ghpr_check_2/detail/tidb_ghpr_check_2/32489/pipeline

@unconsolable
Contributor

another occurrence
https://ci.pingcap.net/blue/organizations/jenkins/tidb_ghpr_check_2/detail/tidb_ghpr_check_2/32713/pipeline

[2021-09-11T00:30:28.496Z] FAIL: db_partition_test.go:2174: testIntegrationSuite4.TestTruncatePartitionAndDropTable
[2021-09-11T00:30:28.496Z] 
[2021-09-11T00:30:28.496Z] db_partition_test.go:2236:
[2021-09-11T00:30:28.496Z]     c.Assert(hasOldPartitionData, IsFalse)
[2021-09-11T00:30:28.496Z] ... obtained bool = true

@feitian124
Contributor

same here
https://ci.pingcap.net/blue/organizations/jenkins/tidb_ghpr_check_2/detail/tidb_ghpr_check_2/32849/pipeline#step-73-log-1375

[2021-09-12T14:38:06.071Z] FAIL
[2021-09-12T14:38:06.071Z] + cat test.log
[2021-09-12T14:38:06.071Z] + grep -Ev '^\[[[:digit:]]{4}(/[[:digit:]]{2}){2}'
[2021-09-12T14:38:06.071Z] + grep -A 30 '\-------'
[2021-09-12T14:38:06.071Z] + grep -A 29 FAIL
[2021-09-12T14:38:06.071Z] FAIL: db_partition_test.go:2174: testIntegrationSuite4.TestTruncatePartitionAndDropTable
[2021-09-12T14:38:06.071Z] 
[2021-09-12T14:38:06.071Z] db_partition_test.go:2236:
[2021-09-12T14:38:06.071Z]     c.Assert(hasOldPartitionData, IsFalse)
[2021-09-12T14:38:06.071Z] ... obtained bool = true

@morgo
Contributor

morgo commented Sep 13, 2021

@mjonss @tiancaiamao PTAL ;-)

@tisonkun
Contributor

@github-actions

Please check whether the issue should be labeled with 'affects-x.y' or 'backport-x.y.z',
and then remove 'needs-more-info' label.

@karuppiah7890
Contributor

I think this test is still unstable. See:

https://ci.pingcap.net/blue/organizations/jenkins/tidb_ghpr_check_2/detail/tidb_ghpr_check_2/34562/pipeline

https://ci.pingcap.net/blue/organizations/jenkins/tidb_ghpr_check_2/detail/tidb_ghpr_check_2/34525/pipeline/

[2021-09-18T10:26:59.935Z] FAIL: db_partition_test.go:2181: testIntegrationSuite5.TestTruncatePartitionAndDropTable
[2021-09-18T10:26:59.935Z] 
[2021-09-18T10:26:59.935Z] db_partition_test.go:2245:
[2021-09-18T10:26:59.935Z]     c.Assert(hasOldPartitionData, IsFalse, Commentf("take time %v", time.Since(startTime)))
[2021-09-18T10:26:59.935Z] ... obtained bool = true
[2021-09-18T10:26:59.935Z] ... take time 16.694405079s
[2021-09-18T07:18:40.274Z] FAIL: db_partition_test.go:2181: testIntegrationSuite5.TestTruncatePartitionAndDropTable
[2021-09-18T07:18:40.274Z] 
[2021-09-18T07:18:40.274Z] db_partition_test.go:2245:
[2021-09-18T07:18:40.274Z]     c.Assert(hasOldPartitionData, IsFalse, Commentf("take time %v", time.Since(startTime)))
[2021-09-18T07:18:40.274Z] ... obtained bool = true
[2021-09-18T07:18:40.274Z] ... take time 15.211611077s

@tisonkun
Contributor

Reopened. Thanks for your report, @karuppiah7890! Please take a look, @tiancaiamao.

@unconsolable
Contributor

Another occurrence https://ci.pingcap.net/blue/organizations/jenkins/tidb_ghpr_check_2/detail/tidb_ghpr_check_2/34961/pipeline

[2021-09-22T05:01:11.016Z] FAIL: db_partition_test.go:2181: testIntegrationSuite5.TestTruncatePartitionAndDropTable
[2021-09-22T05:01:11.016Z] 
[2021-09-22T05:01:11.016Z] db_partition_test.go:2282:
[2021-09-22T05:01:11.016Z]     c.Assert(hasOldPartitionData, IsFalse)
[2021-09-22T05:01:11.016Z] ... obtained bool = true

@tiancaiamao
Contributor

@karuppiah7890 the failure at line 2245 is the same error, which is odd: this time the log shows that the key range was deleted.

[2021-09-18T07:18:36.548Z] [2021/09/18 15:13:27.523 +08:00] [INFO] [db_partition_test.go:2243] ["truncate partition table"] [key=7480000000000001d6]
[2021-09-18T07:18:36.274Z] [2021/09/18 15:13:25.184 +08:00] [INFO] [delete_range.go:238] ["[ddl] delRange emulator complete task"] [jobID=479] [elementID=470] [startKey=7480000000000001d6] [endKey=7480000000000001d7]
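For anyone comparing the keys in the two log lines above, the hex values can be decoded into IDs. The sketch below assumes TiDB's table key layout (a `t` prefix followed by a big-endian int64 with the sign bit flipped); `decodeTablePrefix` is a hypothetical helper, not a TiDB API. Under that assumption the start key decodes to 470, which lines up with `elementID=470` in the delRange log.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"fmt"
)

// signMask flips the sign bit so signed integers sort correctly as big-endian
// bytes. Assumed to match TiDB's comparable int64 encoding.
const signMask uint64 = 0x8000000000000000

// decodeTablePrefix parses a hex key of the form 't' + 8-byte encoded ID and
// returns the table (or partition) ID. Hypothetical helper for illustration.
func decodeTablePrefix(hexKey string) (int64, error) {
	raw, err := hex.DecodeString(hexKey)
	if err != nil {
		return 0, err
	}
	if len(raw) < 9 || raw[0] != 't' {
		return 0, fmt.Errorf("not a table-prefixed key: %q", hexKey)
	}
	u := binary.BigEndian.Uint64(raw[1:9])
	return int64(u ^ signMask), nil
}

func main() {
	// Keys copied from the log lines in this thread.
	for _, k := range []string{"7480000000000001d6", "7480000000000001d7"} {
		id, err := decodeTablePrefix(k)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> ID %d\n", k, id)
	}
}
```

Decoded this way, the deleted range covers exactly one ID (470 up to, but not including, 471), and the key logged by the test at line 2243 is the start of that same range.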

@unconsolable the failure at line 2282 is a different assertion, although it is in the same test function. The cause might be different. I found the following log, and it needs further investigation.

[2021-09-22T05:01:08.435Z] [2021/09/22 12:56:43.933 +08:00] [WARN] [ddl_worker.go:200] ["[ddl] handle DDL job failed"] [worker="worker 134, tp add index"] [error="context canceled"] [errorVerbose="context canceled\ngithub.com/pingcap/errors.AddStack\n\t/nfs/cache/mod/github.com/pingcap/[email protected]/errors.go:174\ngithub.com/pingcap/errors.Trace\n\t/nfs/cache/mod/github.com/pingcap/[email protected]/juju_adaptor.go:15\ngithub.com/tikv/client-go/v2/internal/locate.(*RegionRequestSender).onSendFail\n\t/nfs/cache/mod/github.com/tikv/client-go/[email protected]/internal/locate/region_request.go:1235\ngithub.com/tikv/client-go/v2/internal/locate.(*RegionRequestSender).sendReqToRegion\n\t/nfs/cache/mod/github.com/tikv/client-go/[email protected]/internal/locate/region_request.go:1197\ngithub.com/tikv/client-go/v2/internal/locate.(*RegionRequestSender).SendReqCtx\n\t/nfs/cache/mod/github.com/tikv/client-go/[email protected]/internal/locate/region_request.go:977\ngithub.com/tikv/client-go/v2/txnkv/txnsnapshot.(*ClientHelper).SendReqCtx\n\t/nfs/cache/mod/github.com/tikv/client-go/[email protected]/txnkv/txnsnapshot/client_helper.go:108\ngithub.com/tikv/client-go/v2/txnkv/txnsnapshot.(*KVSnapshot).get\n\t/nfs/cache/mod/github.com/tikv/client-go/[email protected]/txnkv/txnsnapshot/snapshot.go:544\ngithub.com/tikv/client-go/v2/txnkv/txnsnapshot.(*KVSnapshot).Get\n\t/nfs/cache/mod/github.com/tikv/client-go/[email protected]/txnkv/txnsnapshot/snapshot.go:466\ngithub.com/pingcap/tidb/store/driver/txn.(*tikvSnapshot).Get\n\t/home/jenkins/agent/workspace/tidb_ghpr_check_2/go/src/github.com/pingcap/tidb/store/driver/txn/snapshot.go:56\ngithub.com/pingcap/tidb/store/driver/txn.(*tikvTxn).Get\n\t/home/jenkins/agent/workspace/tidb_ghpr_check_2/go/src/github.com/pingcap/tidb/store/driver/txn/txn_driver.go:152\ngithub.com/pingcap/tidb/structure.(*TxStructure).loadListMeta\n\t/home/jenkins/agent/workspace/tidb_ghpr_check_2/go/src/github.com/pingcap/tidb/structure/list.go:218\ngithub.com/pingcap/tidb/structure.(*TxStructure).LIndex\n\t/home/jenkins/agent/workspace/tidb_ghpr_check_2/go/src/github.com/pingcap/tidb/structure/list.go:164\ngithub.com/pingcap/tidb/meta.(*Meta).getDDLJob\n\t/home/jenkins/agent/workspace/tidb_ghpr_check_2/go/src/github.com/pingcap/tidb/meta/meta.go:762\ngithub.com/pingcap/tidb/meta.(*Meta).GetDDLJobByIdx\n\t/home/jenkins/agent/workspace/tidb_ghpr_check_2/go/src/github.com/pingcap/tidb/meta/meta.go:791\ngithub.com/pingcap/tidb/ddl.(*worker).getFirstDDLJob\n\t/home/jenkins/agent/workspace/tidb_ghpr_check_2/go/src/github.com/pingcap/tidb/ddl/ddl_worker.go:332\ngithub.com/pingcap/tidb/ddl.(*worker).handleDDLJobQueue.func1\n\t/home/jenkins/agent/workspace/tidb_ghpr_check_2/go/src/github.com/pingcap/tidb/ddl/ddl_worker.go:507\ngithub.com/pingcap/tidb/kv.RunInNewTxn\n\t/home/jenkins/agent/workspace/tidb_ghpr_check_2/go/src/github.com/pingcap/tidb/kv/txn.go:47\ngithub.com/pingcap/tidb/ddl.(*worker).handleDDLJobQueue\n\t/home/jenkins/agent/workspace/tidb_ghpr_check_2/go/src/github.com/pingcap/tidb/ddl/ddl_worker.go:498\ngithub.com/pingcap/tidb/ddl.(*worker).start\n\t/home/jenkins/agent/workspace/tidb_ghpr_check_2/go/src/github.com/pingcap/tidb/ddl/ddl_worker.go:198\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1371"]

@feitian124
Contributor

feitian124 commented Oct 7, 2021

another case

[2021-10-07T03:23:32.292Z] FAIL: db_partition_test.go:2181: testIntegrationSuite5.TestTruncatePartitionAndDropTable
[2021-10-07T03:23:32.292Z] 
[2021-10-07T03:23:32.292Z] db_partition_test.go:2245:
[2021-10-07T03:23:32.292Z]     c.Assert(hasOldPartitionData, IsFalse, Commentf("take time %v", time.Since(startTime)))
[2021-10-07T03:23:32.292Z] ... obtained bool = true
[2021-10-07T03:23:32.292Z] ... take time 15.140463056s

https://ci.pingcap.net/blue/organizations/jenkins/tidb_ghpr_check_2/detail/tidb_ghpr_check_2/36962/pipeline

@tisonkun
Contributor

tisonkun commented Oct 7, 2021

@github-actions

Please check whether the issue should be labeled with 'affects-x.y' or 'fixes-x.y.z', and then remove 'needs-more-info' label.
