-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add table function to execute stored procedure in SQLServer #15982
Add table function to execute stored procedure in SQLServer #15982
Conversation
e17613b
to
eec9f7d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM. @kokosing please review
@@ -69,6 +69,10 @@ public ConnectorSplitSource getSplits( | |||
DynamicFilter dynamicFilter, | |||
Constraint constraint) | |||
{ | |||
if (table instanceof JdbcProcedureHandle) { | |||
return delegateSplitManager.getSplits(transaction, session, table, dynamicFilter, constraint); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a code comment
@@ -63,7 +64,7 @@ | |||
private ResultSet resultSet; | |||
private boolean closed; | |||
|
|||
public JdbcRecordCursor(JdbcClient jdbcClient, ExecutorService executor, ConnectorSession session, JdbcSplit split, JdbcTableHandle table, List<JdbcColumnHandle> columnHandles) | |||
public JdbcRecordCursor(JdbcClient jdbcClient, ExecutorService executor, ConnectorSession session, JdbcSplit split, ConnectorTableHandle table, List<JdbcColumnHandle> columnHandles) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's have a BaseJdbcConnectorTableHandle
marker interface and let JdbcProcedureHandle, JdbcTableHandle both extend it; use here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be as an abstract class right ?
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/JdbcRecordSet.java
Outdated
Show resolved
Hide resolved
// If no columns are recorded, it means that applyProjection never got called (e.g., in the case all columns are being used) and all | ||
// table columns should be returned. TODO: this is something that should be addressed once the getRecordSet API is revamped | ||
jdbcTable.getColumns() | ||
.ifPresent(tableColumns -> verify(ImmutableSet.copyOf(tableColumns).containsAll(columns))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not for procedure handle?
Have same check for them too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case of Will implement it.ProcedureHandle
we won't face the no columns
kind of a situation - plus the columns are populated when we are creating the table handle. Add the ProcedureHandle
doesn't support project pushdown there is a good chance the column we pass on RecordSetProvider
might be a subset of overall columns present in ProcedureHandle
.
} | ||
} | ||
|
||
public record ProcedureQuery(@JsonProperty String query) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if we use records for between-nodes serialization (I think we are starting to)
then ProcedureFunctionHandle
above can be a record too
also, i don't think we need @JsonProperty
and @JsonCreator
for records (airlift/airlift#993), do we?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ProcedureFunctionHandle
- wanted be in-sync with other TableHandle
implementation. I don't have any strong opinions on it. Will move it to a record.
if (handle instanceof QueryFunctionHandle queryFunctionHandle) { | ||
return Optional.of(getTableFunctionApplicationResult(session, queryFunctionHandle.getTableHandle())); | ||
} | ||
else if (handle instanceof ProcedureFunctionHandle procedureFunctionHandle) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
redundant else
return jdbcClient.delete(session, (JdbcTableHandle) handle); | ||
} | ||
|
||
@Override | ||
public void truncateTable(ConnectorSession session, ConnectorTableHandle tableHandle) | ||
{ | ||
verify(!(tableHandle instanceof JdbcProcedureHandle), "Not a table reference: %s", tableHandle); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need this. The code will never get invoked with JdbcProcedureHandle
, and if it would, it would fail explicitly anyway
@@ -904,6 +985,7 @@ public void setColumnType(ConnectorSession session, ConnectorTableHandle table, | |||
@Override | |||
public void renameTable(ConnectorSession session, ConnectorTableHandle table, SchemaTableName newTableName) | |||
{ | |||
verify(!(table instanceof JdbcProcedureHandle), "Not a table reference: %s", table); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we replicate it for all alter options ?
584e83c
to
47bdf91
Compare
@findepi AC. |
47bdf91
to
db1d540
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/BaseJdbcClient.java
Outdated
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/BaseJdbcClient.java
Outdated
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/CachingJdbcClient.java
Outdated
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/JdbcTableHandle.java
Show resolved
Hide resolved
As for me it's really new topic, can you please point me to some entry point, to understand what we have now (I mean existing query table function) and why we can't use it for this connectors |
@vlad-lyutenko this all boils down to two things
|
Given that this is a very powerful user facing feature that potentially has huge security and functionality implications I request the following:
|
Also for reference following is a link to the section in the MySQL connector docs where a new subsection for the new table function is needed. https://trino.io/docs/current/connector/mysql.html#table-functions And related question... will this table function have a similar restriction where it is required that the stored procedure returns a result set? If so .. that could be limiting functionality. And if not ... the security implications are even bigger since arbitrary stored procedures can be called. Last.. I assume that the user accessing the database from the catalog configuration has to have the right to run the stored procedure .. can you detail what is actually specifically needed in MySQL and SQL Server terminology? |
db1d540
to
9bf4ed0
Compare
Yes we have a same restriction where the stored procedure has to return a
Similar to tables - if the user specified in the catalog configuration has access to run the stored procedures then we would support it. |
@kokosing AC |
SCHEMA_NAME, | ||
NAME, | ||
List.of(ScalarArgumentSpecification.builder() | ||
.name("QUERY") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be more user-friendly if the argument was renamed to procedure
and the user would only pass the procedure call, that is "procedure_name()" without the CALL
keyword.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depending on vendor you are using CALL
or EXEC
. So maybe we could rename name of function to call
to exec
depending on vendor (connector) and only pass the name of procedure. This could work only if we are not expecting to pass parameters any parameters to procedures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case of arguments we could either pass them as an Array<String>
and handle them accordingly.
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/BaseJdbcClient.java
Outdated
Show resolved
Hide resolved
Need any help with the docs here @Praveen2112 ? Should I send a separate parallel PR? |
9bf4ed0
to
10dc303
Compare
6d8b5c4
to
f464ee5
Compare
be36f5f
to
daec542
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
skimmed. I am fine with the approach
); | ||
|
||
If the stored procedure ``employee_sp`` requires any input, it can be specified via ``inputs`` | ||
argument as an ``array<varchar>``:: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what if procedure expects an integer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now we pass the arguments as string - we pass it without quotes for integer and with additional quotes for string (if requested by the underlying datasource).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we document this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But for SQLServer, it doesn't expect additional quotes. So do we need to document it ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It’d be better to pass the arguments as a ROW, so each one has the proper type. You get type safety and avoid conversions that might not be possible to do easily or reliably.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you provide pointer on where/how it is established during the analysis step - Should we use a different Argument specification for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cc @kasiafi
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type of scalar argument is established at TF declaration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if we could pass arbitrary scalars, it seems right to use an array of varchar in this case. It shifts the responsibility of formatting them the right way to the user. That's in the spirit of query pass-through. This way it is harder for the user to pass parameters, or non-literal expressions, but I don't know if a real use-case would need those capabilities.
Like @kokosing noticed, we must watch out for possible abuse (injection).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Table argument is the only argument type of table functions that is polymorphic . Technically, we could make use of it, and pass a scalar subquery containing parameters. But then we'd have to refactor the PTF to use the operator-based execution.
{ | ||
return new ProcedureQuery( | ||
"EXECUTE %s.%s %s" | ||
.formatted( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it looks like one could inject some SQL by using inputs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing it out. Have added additional checks.
daec542
to
d383be0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approach looks good, some early comments.
plugin/trino-sqlserver/src/main/java/io/trino/plugin/sqlserver/SqlServerClient.java
Show resolved
Hide resolved
public void testProcedureWithDropOperation() | ||
{ | ||
try (TestTable table = new TestTable(onRemoteDatabase(), "table_to_drop", "(id BIGINT)")) { | ||
try (TestProcedure testProcedure = createTestingProcedure("DROP TABLE " + table.getName())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://learn.microsoft.com/en-us/sql/t-sql/queries/output-clause-transact-sql?view=sql-server-ver16 exists so can we test with it as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have added test with OUTPUT
clause.
plugin/trino-sqlserver/src/test/java/io/trino/plugin/sqlserver/BaseSqlServerConnectorTest.java
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/CachingJdbcClient.java
Outdated
Show resolved
Hide resolved
d383be0
to
3792ce9
Compare
@hashhar , @skrzypo987 AC |
3301e93
to
91ee0b7
Compare
Rebased the PR to resolve conflicts. |
91ee0b7
to
5fa16b1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with the recent changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Docs look good. Did not look at the code..
plugin/trino-base-jdbc/src/test/java/io/trino/plugin/jdbc/TestProcedure.java
Outdated
Show resolved
Hide resolved
5fa16b1
to
8515486
Compare
); | ||
|
||
If the stored procedure ``employee_sp`` requires any input, it can be specified via ``inputs`` | ||
argument as an ``array<varchar>``:: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we document this?
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/BaseJdbcClient.java
Outdated
Show resolved
Hide resolved
8515486
to
47e7213
Compare
public JdbcProcedureHandle getProcedureHandle(ConnectorSession session, ProcedureInformation procedureInformation) | ||
|
||
{ | ||
if (procedureInformation.inputArguments().stream().anyMatch(argument -> argument.contains(";"))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we can process arguments via preparedCall.setParameter
we wouldn't need this.
if we cannot process arguments via preparedCall.setParameter
we shouldn't pretend we can.
If the user provides parameter values as string literals to be concatenated with the procedure call, why can't they provide the whole procedure call (including the parameters)? Just like we do with SQL pushdown?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we can process arguments via preparedCall.setParameter we wouldn't need this.
It could be possible - like if we could get the procedure column information and map them accordingly. I would like implement them in an incremental way as executing stored procedure is kind of an experimental feature. WDYT ?
why can't they provide the whole procedure call (including the parameters)? Just like we do with SQL pushdown?
The initial approach we have is like SQL Pushdown - where we don't do any processing but consume a String
as an input. For some datasources like SQLServer
it would be helpful if we could get the schema and procedure name - so we could apply some checks to ensure the SP doesn't any DDL or DML operations. So it more of an extended version of SQLPushdown - where we take some of them in parts and concat all the pieces.
These methods can be reused for Procedures PTF - Extract building columns from ResultSetMetaData as a separate method. - Extract creating connection based on session
47e7213
to
88d57a0
Compare
Rebased due to conflicts. |
Overridden by #16696 |
Description
A dedicated table function which allows us to run stored procedures in SQLServer. We can't use existing query table function - as it wraps with inside a query like
SELECT * FROM (EXEC my_procedure) o
- so we are handing them via a different function. All the pushdown are disabled forJdbcProcedureHandle
resolved by thisProcedure
.Slightly different implementation from #15813
Limitation
ResultSet
only for the first statement. (Test to be added)Example
Release notes
( ) This is not user-visible or docs only and no release notes are required.
(x) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text: