Trino lineage fails to capture upstream columns when join and transformation is used #10272
Comments
@Praveen2112 would you like to take a look? |
Taking a look at it. This is seen for columns with a function whose arguments are from a |
@Praveen2112 thanks for the amazing turnaround time, appreciate it. The least we can do is help test it. Our setup/deploy has some turnaround time before we can test. If it is okay, let's wait for the review comments to be resolved and the PR to reach an approved state? We would be happy to deploy it and give it a spin. Does that sound reasonable? |
Yes. Thank you for your help. I think you can test it now. It looks like the comments in the PR are about implementation details. Addressing them should not change the scope of the fix. Testing now would show whether our tests cover all the needed cases. |
This is working now, thanks! I ran the following query again:
create table amalakar.new_query_log_new_patch
as
with queries as
(
select * from
hive.default.event_presto_query_logged p2
where ds='2019-01-20'
)
SELECT
p1.occurred_at as occurred_at,
substr(p2.query_id, 1, 10) as new_query_id
FROM queries p1
inner join queries p2
ON p1.query_id=p2.query_id
limit 10
Lineage I am seeing now is:
cc: Thanks @akashkatipally for helping test this! |
Fixed by #10319 |
Here is how to reproduce, the following query:
Produces the following lineage:
Notice how the upstream of new_query_id is not being captured. I did an impact analysis, and at Lyft this bug impacts 79% of our lineage; only 21% is being captured accurately as of now.
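For readers following along, here is a rough sketch of the column lineage one would expect from the query quoted in this thread, and a check that flags the missing upstream. The dict layout, the `columns_missing_upstream` helper, and the assumption that `occurred_at` was captured correctly are illustrative only; this is not Trino's actual lineage event format.

```python
# Hypothetical illustration only: the structure below is an assumption,
# not the format Trino emits for lineage.
SOURCE_TABLE = "hive.default.event_presto_query_logged"

# Expected upstream columns for each output column of the CTAS query.
# new_query_id is substr(p2.query_id, 1, 10), so its upstream is query_id.
expected_lineage = {
    "occurred_at": {f"{SOURCE_TABLE}.occurred_at"},
    "new_query_id": {f"{SOURCE_TABLE}.query_id"},
}


def columns_missing_upstream(captured):
    """Return output columns whose captured upstreams lack an expected source."""
    return sorted(
        column
        for column, sources in expected_lineage.items()
        if not sources <= captured.get(column, set())
    )


# What the bug report describes: no upstream recorded for new_query_id.
# (Assumes occurred_at was captured correctly.)
buggy_capture = {
    "occurred_at": {f"{SOURCE_TABLE}.occurred_at"},
    "new_query_id": set(),
}

print(columns_missing_upstream(buggy_capture))
```

With the fix applied, the captured lineage should match `expected_lineage` and the helper should return an empty list.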