Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add warning for query passthrough #15562

Merged
merged 1 commit into from
Apr 20, 2023

Conversation

Jessie212
Copy link
Contributor

Description

Create fragment and warning callout for query passthrough workaround

Additional context and related issues

Release notes

(s) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Dec 30, 2022
@Jessie212 Jessie212 requested a review from mosabua December 30, 2022 00:32
@github-actions github-actions bot added the docs label Dec 30, 2022
Copy link
Member

@mosabua mosabua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add the following:

  • Include in all relevant connectors
  • Keep it all in the fragment
  • Change commit message and PR message to be about the content .. not the mechanics of how you do it (the fragment approach)

@Jessie212 Jessie212 force-pushed the jt/query-pass-through branch from 40d9d80 to ce8dd36 Compare December 30, 2022 16:08
@Jessie212 Jessie212 changed the title Create fragment and warning callout for query passthrough workaround Add warning for query passthrough workaround Dec 30, 2022
@mosabua
Copy link
Member

mosabua commented Dec 30, 2022

Drop the "workaround" from the PR title and commit message please

@Jessie212 Jessie212 force-pushed the jt/query-pass-through branch from ce8dd36 to a783ca5 Compare January 3, 2023 21:40
@Jessie212 Jessie212 changed the title Add warning for query passthrough workaround Add warning for query passthrough Jan 3, 2023
@Jessie212 Jessie212 force-pushed the jt/query-pass-through branch from a783ca5 to d4a2241 Compare January 4, 2023 22:58
@Jessie212 Jessie212 force-pushed the jt/query-pass-through branch from d4a2241 to f2e8be4 Compare January 5, 2023 15:30
@Jessie212 Jessie212 force-pushed the jt/query-pass-through branch 2 times, most recently from 8598982 to e797679 Compare January 9, 2023 14:48
@Jessie212 Jessie212 force-pushed the jt/query-pass-through branch from e797679 to d4463c6 Compare January 12, 2023 23:38
Copy link
Contributor

@jhlodin jhlodin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@hashhar hashhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM % fact-check

Comment on lines 6 to 11
You can use passthrough queries to read data; however, queries that write data
typically do not return a table. As a result, when a write operation is passed,
Trino declares that the query failed even if the database performs the
operation. If supported by the underlying data source, you can avoid this
outcome by appending a statement with ``RETURNING 1``. Performing this action
may grant access to restricted data or result in data loss.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this factual? IIRC the query fails during ANALYSIS itself so there's no possibility of a DELETE/UPDATE getting executed if the Trino query shows a failure. @kasiafi can you please double check my understanding?

Copy link
Member

@kasiafi kasiafi Mar 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hashhar I don't think there's a possibility that an operation (like DELETE or UPDATE) got executed though Trino query failed. I would consider that a serious issue.

All operations that do not return a result set fail during analysis, so they never reach execution.

Please note that the success or failure of a query pass-through depends directly on the presence of a result set, not on the operation type. Theoretically, if some DB returned a result set metadata for a DELETE, we would be happy to execute it. cc @mosabua @Jessie212

Copy link
Member

@mosabua mosabua Mar 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats exactly the point of the warning ... however ... do we know that the query was actually not run on the underlying system .. @electrum was mentioning that maybe a query like a DELETE query is actually passed through .. and run .. and then just fails on the trino side since no result set is returned.. but the query actually deleted records.

So the aim of this PR is to add a warning of whatever the potential behavior is. Could you test and fully verify this @kasiafi and then we can adjust the wording accordingly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the failure happens during analysis of the query. There's no possibility of running a query if it doesn't return a return set. We check whether result set would be returned BEFORE executing the query.

You can verify this behaviour through the various QueryRunners we have (PostgreSqlQueryRunner/MySqlQueryRunner were the ones I verified).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is that even possible .. Trino verifies a random pass through query string that is valid for the underlying data source without running the query? Does it pass it through for analysis only or something?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is that even possible .. Trino verifies a random pass through query string that is valid for the underlying data source without running the query? Does it pass it through for analysis only or something? Or does Trino only validate the larger query .. but not the pass through bit .. and if thats the case .. how does it know if that returns a result set or whatever else.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should discourage using these table functions for executing queries that might have side-effects in the underlying database. Doing that can be problematic in the face of query or task retries if the operation is not idempotent.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we word that so users understand? You are just talking about write access and DDL stuff right? Or any other side effects?

Should we just say that users should use this for just for read-only access?

typically do not return a table. As a result, when a write operation is passed,
Trino declares that the query failed even if the database performs the
operation. If supported by the underlying data source, you can avoid this
outcome by appending a statement with ``RETURNING 1``. Performing this action
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do all DBs support this syntax?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Postgres has RETURNING, SQL Server has OUTPUT. I don't know about others.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but yes, this statement won't work as it is for all connectors.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What connectors support what syntax? Can we get a list?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats basically the documentation for the underlying database .. its a pass through .. so we cant make a list .. we can only refer to the docs of the data source.

Trino declares that the query failed even if the database performs the
operation. If supported by the underlying data source, you can avoid this
outcome by appending a statement with ``RETURNING 1``. Performing this action
may grant access to restricted data or result in data loss.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove "may grant access to restricted data". If you mean access which would be restricted if the query was run in Trino, then it is equally true for read and write operations.

Regarding "result in data loss", I would reword it. I a user runs a DELETE query, it is their expectation that data will be deleted. "Data loss" sounds like a bug.

@Jessie212 Jessie212 force-pushed the jt/query-pass-through branch from 0a7f46e to 9776247 Compare April 11, 2023 17:06
@Jessie212
Copy link
Contributor Author

@hashhar @kasiafi, David has some suggestions about this warning. @mosabua and I have decided to keep the message simple. We're providing the user with a best practice only. No workarounds that could land them in trouble.

@Jessie212 Jessie212 force-pushed the jt/query-pass-through branch from 9776247 to a133f47 Compare April 19, 2023 16:16
Copy link
Member

@mosabua mosabua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is good to go now. Please look and merge @kasiafi or @hashhar

@hashhar hashhar merged commit 7e90112 into trinodb:master Apr 20, 2023
@github-actions github-actions bot added this to the 415 milestone Apr 20, 2023
@Jessie212 Jessie212 deleted the jt/query-pass-through branch April 20, 2023 17:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

7 participants