Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[COST-5657] subs: use subquery in Trino to join from Postgres #5381

Merged
merged 11 commits into from
Nov 14, 2024
Merged

Conversation

maskarb
Copy link
Member

@maskarb maskarb commented Nov 13, 2024

Jira Ticket

COST-5657

Description

This change will filter the list of resource-ids thru a subquery in postgres.

Testing

  1. smokes

Release Notes

  • proposed release note
* [COST-5657](https://issues.redhat.com/browse/COST-5657) subs: use subquery in Trino to join from Postgres

@maskarb maskarb added the smoke-tests pr_check will build the image and run minimal required smokes label Nov 13, 2024
@maskarb maskarb marked this pull request as ready for review November 13, 2024 19:15
@maskarb maskarb requested review from a team as code owners November 13, 2024 19:15
Copy link

codecov bot commented Nov 13, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.1%. Comparing base (6eea791) to head (637c005).
Report is 1 commits behind head on main.

Additional details and impacted files
@@          Coverage Diff          @@
##            main   #5381   +/-   ##
=====================================
  Coverage   94.1%   94.1%           
=====================================
  Files        371     371           
  Lines      31557   31552    -5     
  Branches    3385    3385           
=====================================
- Hits       29686   29684    -2     
+ Misses      1210    1209    -1     
+ Partials     661     659    -2     

chambridge
chambridge previously approved these changes Nov 13, 2024
@maskarb maskarb enabled auto-merge (squash) November 13, 2024 19:48
@myersCody
Copy link
Contributor

myersCody commented Nov 13, 2024

We likely have the same problem with the determine_ids_for_provider function. There is likely a cleaner way to handle the exclusion in Trino. Right now we use the django ORM to build the exclude list and then populate the exclude filter using that list in jinja creating our huge query problem.

Instead we could just build the exclude resource id list in trino, then left join in the excluded list and a filter to make sure all of our excluded resources are not included in the return:

    SELECT distinct resource_id 
    FROM postgres.{{schema | sqlsage}}.reporting_subs_last_processed_time as lp
    WHERE lp.source_uuid != {{source_uuid}}
)
SELECT lineitem_resourceid, max(lineitem_usagestartdate)
FROM hive.{{schema | sqlsafe}}.aws_line_items
LEFT JOIN excluded_ids AS ex ON aws.lineitem_resource = ex.resource_id
WHERE source={{source_uuid}}
AND year={{year}}
AND month={{month}}
AND lineitem_productcode = 'AmazonEC2'
AND strpos(lower(resourcetags), 'com_redhat_rhel') > 0
AND ex.resource_id IS NULL
GROUP BY lineitem_resourceid

This will allow for the excludes resource ids to be however long it needs to be without changing the length of the query itself.

@maskarb maskarb disabled auto-merge November 13, 2024 20:24
@maskarb
Copy link
Member Author

maskarb commented Nov 14, 2024

/retest

@maskarb maskarb changed the title [COST-5657] move filtering into python [COST-5657] subs: use subquery in Trino to join from Postgres Nov 14, 2024
@maskarb
Copy link
Member Author

maskarb commented Nov 14, 2024

@myersCody good suggestion. I had not considered making Trino do the postgres query, so that's a good idea!

@maskarb maskarb merged commit 52a10c7 into main Nov 14, 2024
12 of 14 checks passed
@maskarb maskarb deleted the COST-5657 branch November 14, 2024 15:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
smoke-tests pr_check will build the image and run minimal required smokes smokes-required
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants