-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-46410][SQL] Assign error classes/subclasses to JdbcUtils.classifyException #44358
Conversation
val tableName = regex.findFirstMatchIn(message).get.group(2) | ||
case 42111 if errorClass == "FAILED_JDBC.CREATE_INDEX" => | ||
val indexName = messageParameters("indexName") | ||
val tableName = messageParameters("tableName") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good improvement!
@@ -1096,6 +1096,79 @@ | |||
], | |||
"sqlState" : "38000" | |||
}, | |||
"FAILED_JDBC" : { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FAILED_JDBC
-> FAILED_JDBC_OPERATION
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Too long, does _OPERATION
bring any benefits?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just because FAILED_JDBC
looks confused.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JDBC means Java Database Connectivity, the connectivity might fails at any time, ops or while establishing connections or while keeping it. For instance, this is real example:
org.postgresql.util.PSQLException: The connection attempt failed.
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:331)
at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:49)
at org.postgresql.jdbc.PgConnection.<init>(PgConnection.java:223)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the connectivity is not changed if create index failed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
val newTable = messageParameters("newName") | ||
throw QueryCompilationErrors.tableAlreadyExistsError(newTable) | ||
case "42P07" if pgAlreadyExistsRegex.findFirstMatchIn(sqlException.getMessage).nonEmpty => | ||
val tableName = pgAlreadyExistsRegex.findFirstMatchIn(sqlException.getMessage) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make pgAlreadyExistsRegex.findFirstMatchIn(sqlException.getMessage)
execute only once
] | ||
} | ||
}, | ||
"sqlState" : "40000" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be "transaction rollback"
This is all about reaching out to foreign data right?
HV000 FDW-specific condition?
Is it obvious from the context what we are connecting to? Should we add some JDBC level context (URL?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is all about reaching out to foreign data right?
Right.
HV000 FDW-specific condition?
Seems like, but what do you propose? to get sqlState
from SQLExpression
?
Is it obvious from the context what we are connecting to?
Should be if an user reads data from one JDBC datasource.
Should we add some JDBC level context (URL?)
I can extract and pass option.url
to AnalysisException
with new error class FAILED_JDBC
but how about other (classified exceptions) like:
case "42P07" =>
if (errorClass == "FAILED_JDBC.CREATE_INDEX") {
throw new IndexAlreadyExistsException(
indexName = messageParameters("indexName"),
tableName = messageParameters("tableName"),
cause = Some(e))
The IndexAlreadyExistsException
has its own error class INDEX_ALREADY_EXISTS
without any parameters related to JDBC url.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was proposing to replace 40000 with HV004.
This example is interesting. How do we get a 42P07 with FAILED_JDBC errorclass?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was proposing to replace 40000 with HV004.
@srielau I didn't get the logic why HV004
but not HV000
? In postgresql for instance, it is related to fdw_invalid_data_type
How do we get a 42P07 with FAILED_JDBC errorclass?
From the doc https://www.postgresql.org/docs/14/errcodes-appendix.html and experiments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I fat-fingered. Yes HV000
@srielau @cloud-fan I am going to merge this PR soon if you don't mind. |
@@ -1180,12 +1180,15 @@ object JdbcUtils extends Logging with SQLConfHelper { | |||
} | |||
} | |||
|
|||
def classifyException[T](message: String, dialect: JdbcDialect)(f: => T): T = { | |||
def classifyException[T]( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a breaking change, isn't it? JDBCDialect is a public developer API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when was this API added? @beliefer do you know?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we should keep two versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And how will you combine exceptions from the both versions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there are no nested exception here. We pass a default error class to classifyException
and if a dialect cannot classify an exception from JDBC driver, it throws the default one otherwise it forms and throws another one. The exception from JDBC driver is added as cause
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if a dialect can classify an exception, we will lose the error class which is actually worse?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also enable MiMa for this? I believe this is being skipped because it's under execution
package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if a dialect can classify an exception, we will lose the error class which is actually worse?
@cloud-fan A dialect can override proposed error class and make our default error class more precise using the driver specific info. And yes, the original error class will be lost.
Can we also enable MiMa for this? I believe this is being skipped because it's under execution package.
@HyukjinKwon I made the modification because MiMa complained. Did you change MiMa to expect different behaviour?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think losing the original error class is bad. We added classifyException
for getting more precise java exception type, but I don't think this is useful anymore as we have error classes. My suggestion is to deprecate classifyException
, saying that Spark never calls it anymore.
…ssifyException` ### What changes were proposed in this pull request? In the PR, I propose to restore `classifyException()` of `JdbcDialect` before the commit 14a933b, and extends `classifyException()` with the error class parameter by `description`: ```scala def classifyException( e: Throwable, errorClass: String, messageParameters: Map[String, String], description: String): AnalysisException ``` The `description` parameter has the same meaning as `message` in the old version of `classifyException()` which is deprecated. Also old implementation of `classifyException()` has been restored in JDBC dialects: MySQL, PostgreSQL and so on. ### Why are the changes needed? To restore compatibility with existing JDBC dialects. ### Does this PR introduce _any_ user-facing change? No, this PR restores the behaviour prior #44358. ### How was this patch tested? By running the affected test suite: ``` $ build/sbt "core/testOnly *SparkThrowableSuite" ``` and modified test suite: ``` $ build/sbt "test:testOnly *JDBCV2Suite" $ build/sbt "test:testOnly *JDBCTableCatalogSuite" ``` ### Was this patch authored or co-authored using generative AI tooling? No. Closes #44449 from MaxGekk/restore-jdbc-classifyException. Authored-by: Max Gekk <[email protected]> Signed-off-by: Max Gekk <[email protected]>
…or classes ### What changes were proposed in this pull request? In the PR, I propose to port the existing `classifyException()` method which accepts a description to new one w/ an error class added by #44358. The modified JDBC dialects are: DB2, H2, Oracle, MS SQL Server, MySQL and PostgreSQL. ### Why are the changes needed? The old method `classifyException()` which accepts a `description` only has been deprecated already by ... ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? By existing integration tests, and the modified test suite: ``` $ build/sbt "test:testOnly *JDBCV2Suite" ``` ### Was this patch authored or co-authored using generative AI tooling? No. Closes #44739 from MaxGekk/port-jdbc-classifyException. Authored-by: Max Gekk <[email protected]> Signed-off-by: Max Gekk <[email protected]>
What changes were proposed in this pull request?
In the PR, I propose to raise exceptions with only error classes from
JdbcUtils.classifyException
, and introduce new error classFAILED_JDBC
with sub-classes linked to a particular JDBC operation.Why are the changes needed?
To improve user experience with Spark SQL by migrating on new error framework when all Spark exceptions from the JDBC datasource have an error class.
Does this PR introduce any user-facing change?
Yes, if user's code depends on exceptions from the JDBC datasource.
How was this patch tested?
By running the affected test suites:
Was this patch authored or co-authored using generative AI tooling?
No.