-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding column in SQL Server connector sometimes fails even though column has been added successfully #12535
Comments
The test looks very flaky recently. Also, it's incorrect results now, not deadlock issue as issue description.
|
https://github.com/trinodb/trino/actions/runs/4015532208/jobs/6900588157
|
https://github.com/trinodb/trino/actions/runs/4017027158/jobs/6903973714
|
https://github.com/trinodb/trino/actions/runs/4022968719/jobs/6915702323
|
https://github.com/trinodb/trino/actions/runs/4025145616/jobs/6920371453
|
https://github.com/trinodb/trino/actions/runs/4055689915/jobs/6985043630
|
@vlad-lyutenko is working on this. Thank you Vlad! |
For me this failures looks pretty strange. |
Can we mark the test |
Disabling the test looks good to me. I guess |
I modified a little bit test - in the way, that I try to insert data into columns, which were failed to add with exception: And I see, that sometimes I can successfully add data to this columns, so they actually were added. So the question is, do we have some retry mechanism in engine ? |
Some info: This exception could occur not only during actual 'ALTER table ' connector execution, but also during getColumns() call of BaseJdbcClient. And in this second case, sometimes column is added despite an exception. |
Some flow execution from logs https://github.com/trinodb/trino/actions/runs/4091562653/jobs/7055747852:
|
The default isolation level for sqlserver is trino/core/trino-main/src/main/java/io/trino/metadata/MetadataManager.java Lines 740 to 752 in 4d83020
From the source code, both lines 744 and 748 may throw deadlock exceptions, but line 748 throws an exception and alter actually succeeds. Maybe add a more detailed judgment to |
On further thought, the flaky failure may be directly related to #15225. The actual operation and notification to SystemSecurityMetadata is not atomic. Should all exceptions for notifications to SystemSecurityMetadata be caught? |
@guiyanakuang Thanks for your comment. Yep, it was my initial thought, but then I noticed, that we actually fails not during
And this column is still added, so cannot understand how it could be. Maybe some retries on engine level. Please take a look on stack trace: |
Yes, there are three places where a deadlock exception may be thrown.
@vlad-lyutenko. So the test will actually only fail if the code deadlocks in the third place. Because in the second place we have successfully executed alter. Sqlserver does not retry. |
@guiyanakuang |
So what you think could be good fix for, this test. Or check by performing force insert into column to understand that column actually added? |
I now prefer to add exception catching to anything like the third place (#15225 added event notification code) |
Just note that this may or may not fix the issue. |
This issue was created on May 25, 2022. #15225 was merged into master a two weeks ago. |
This reverts commit 33fdaa0. The current implementation has the following problems: - does not work for Hive connector with `hive.security=system` (#15225 (comment)) - notifications are propagated even if transaction doesn't end up committing the DDL changes (#15225 (comment)) - notification propagation may disrupt transaction finish even if changes already applied in the external system (#12535 (comment)), for any connector that's using SYSTEM security (Iceberg, Delta with system security, or any JDBC connector)
This reverts commit 33fdaa0. The current implementation has the following problems: - does not work for Hive connector with `hive.security=system` (#15225 (comment)) - notifications are propagated even if transaction doesn't end up committing the DDL changes (#15225 (comment)) - notification propagation may disrupt transaction finish even if changes already applied in the external system (#12535 (comment)), for any connector that's using SYSTEM security (Iceberg, Delta with system security, or any JDBC connector)
On SQLServer:
https://github.com/trinodb/trino/runs/6572998375?check_suite_focus=true
The text was updated successfully, but these errors were encountered: