Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to sync tables in two RDBMS - Unsupported delete #10371

Closed
tmanolat opened this issue Dec 21, 2021 · 3 comments
Closed

How to sync tables in two RDBMS - Unsupported delete #10371

tmanolat opened this issue Dec 21, 2021 · 3 comments

Comments

@tmanolat
Copy link

tmanolat commented Dec 21, 2021

We are planning to use Trino for syncing some tables from SQL Server to PostgreSQL - in the context of an ETL process.

We can always drop the table in Postgresql and re-insert all the data from SQL Server - and this is ok for medium-sized tables.

But for larger tables - we are trying to do a partial sync between the two tables using a last_updated column in the table.
The concept here is to delete all the records in PostgresSQL where the IDs last_updated column is older

delete from postgresql.public.t 
where id in (
  select id from sqlserver.dbo.t 
  where t.last_updated > (select max(last_updated) from postgresql.public.t)
) 

Should that delete worked - we were going to insert the updated rows.

The problem is that the above sql fragment raises the error Unsupported delete - most possibly refering to https://trino.io/docs/current/connector/postgresql.html#sql-delete

If a WHERE clause is specified, the DELETE operation only works if the predicate in the clause can be fully pushed down to the data source.

So - is there a way to implement this partial sync between the two tables? Or must we always do massive truncates / re-insert the whole tables?

@tmanolat
Copy link
Author

merry christmas! any thoughts?

@hashhar
Copy link
Member

hashhar commented Dec 28, 2021

This isn't supported today but it should be possible to implement.

@kokosing For more complex DELETE with base-jdbc did you already have some ideas of how it should look like? I assume it'll be modeled similar to Hive, i.e. each record would have a rowid column added and the PageSink/RecordSetProvider will selectively copy positions and drop the ones that are not needed?

@hashhar
Copy link
Member

hashhar commented Dec 28, 2021

Closing as duplicate of #6993 (since it's a documented limitation).

@hashhar hashhar closed this as completed Dec 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants