[COST-4745] OCPGCP Network data processing SQL #5058

Merged
merged 17 commits into main from COST-4745-ocpgcp-network on Jul 4, 2024

Conversation

cgoodfred
Contributor

@cgoodfred cgoodfred commented Apr 22, 2024

Jira Ticket

COST-4745

Description

This change adds OCP on GCP network processing. This change does a few things:

- Identifies network records from the GCP bill that are associated with a specific Compute Engine instance that can be tied to an OCP node
- Separates the usage and cost for these records into a distinct row per day (one for inbound traffic, one for outbound traffic) when we aggregate the gcp_openshift_daily records
- Filters out the networking records when grouping by namespace, because these values cannot be attributed to a specific namespace/project (hence the Network unattributed project!)
- Performs a new insert into the project daily summary table for the networking records, grouped by OCP node
- Back-populates these records into the OCPUsage table, adding a data transfer direction to the group by, which has three options: IN, OUT, and NULL

NOTE: GCP renamed Ingress to Data Transfer In, and renamed Egress to Data Transfer, which sometimes carries an "out" qualifier and sometimes does not. Based on my understanding of this GCP article, Ingress was simply renamed to Data Transfer In, and any other data transfer is Egress/Outbound.
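
For illustration, here is a minimal sketch of how the direction split could be expressed in Trino when rolling up the gcp_openshift_daily records. This is a hedged approximation rather than the PR's exact SQL; the table and column names come from the verification queries in the Testing section, and the LIKE patterns assume the Data Transfer naming described above:

-- Sketch only: derive the transfer direction from the SKU description.
-- Anything that is not a data transfer SKU falls through to NULL.
SELECT
    CASE
        WHEN lower(sku_description) LIKE '%data transfer in%' THEN 'IN'
        WHEN lower(sku_description) LIKE '%data transfer%' THEN 'OUT'
        ELSE NULL
    END AS data_transfer_direction,
    sum(cost) AS cost
FROM gcp_openshift_daily
WHERE month = '05'
GROUP BY 1;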

Nise has been updated and the test customer yamls now include network in and out records.

Testing

  1. Using nise > 4.5.3, create GCP compute data that has networking SKUs defined for the same resource ID as an OpenShift node, something like:
---
generators:
  - ComputeEngineGenerator:
      start_date: {{start_date}}
      end_date: {{end_date}}
      price: 2
      sku_id: CF4E-A0C7-E3BF
      usage.amount_in_pricing_units: 1
      usage.pricing_unit: hour
      currency: USD
      instance_type: m2-megamem-416
      location.region: australia-southeast1-a
      resource.name: projects/nise-populator/instances/gcp_compute1
      resource.global_name: //compute.googleapis.com/projects/nise-populator/zones/australia-southeast1-a/instances/3447398860992947181
      labels: [{"environment": "clyde", "app":"winter", "version":"green", "kubernetes-io-cluster-c32se93c-73z3-3s3d-cs23-d3245sj45349": "owned"}]
  - ComputeEngineGenerator:
      start_date: {{start_date}}
      end_date: {{end_date}}
      price: 2
      sku_id: BBF8-C07D-1DF4
      usage.amount_in_pricing_units: 50
      usage.pricing_unit: hour
      currency: USD
      instance_type: m2-megamem-416
      location.region: australia-southeast1-a
      resource.name: projects/nise-populator/instances/gcp_compute1
      resource.global_name: //compute.googleapis.com/projects/nise-populator/zones/australia-southeast1-a/instances/3447398860992947181
      labels: [{"environment": "clyde", "app":"winter", "version":"green", "kubernetes-io-cluster-c32se93c-73z3-3s3d-cs23-d3245sj45349": "owned"}]
  - ComputeEngineGenerator:
      start_date: 2024-05-01
      end_date: 2024-05-31
      price: 30
      sku_id: 9DE9-9092-B3BC
      usage.amount_in_pricing_units: 10
      usage.pricing_unit: hour
      currency: USD
      instance_type: m2-megamem-416
      location.region: australia-southeast1-a
      resource.name: projects/nise-populator/instances/gcp_compute1
      resource.global_name: //compute.googleapis.com/projects/nise-populator/zones/australia-southeast1-a/instances/3447398860992947181
      labels: [{"environment": "clyde", "app":"winter", "version":"green", "kubernetes-io-cluster-c32se93c-73z3-3s3d-cs23-d3245sj45349": "owned"}] 
  2. Create a source and load the OCP data
  3. Create a source and load the GCP data you just created
  4. Let summary run, then check the OCP and OCP on GCP database records and verify the network records are visible and distinct, with infrastructure_data_in_gigabytes or infrastructure_data_out_gigabytes filled in for each day and each Network unattributed project.
  5. Run a few SQL queries to verify the costs before and after OCPGCP summary line up.
    docker exec -it trino trino --server localhost:8080 --catalog hive --schema org1234567 --user admin --debug
trino:org1234567> SELECT sum(cost) as cost FROM gcp_openshift_daily WHERE month='05';
   cost   
----------
 306528.0 
(1 row)

trino:org1234567> select sum(unblended_cost) from reporting_ocpgcpcostlineitem_project_daily_summary WHERE month = '5';
  _col0   
----------
 306528.0 
(1 row)

trino:org1234567> SELECT sum(cost) as cost FROM gcp_openshift_daily WHERE lower(sku_description) LIKE '%data transfer%' AND month='05';
   cost   
----------
 297600.0 
(1 row)

trino:org1234567> SELECT sum(unblended_cost) as cost FROM reporting_ocpgcpcostlineitem_project_daily_summary WHERE data_transfer_direction IS NOT NULL AND month='5';
   cost   
----------
 297600.0 
(1 row)

trino:org1234567> select sum(unblended_cost) as cost, data_transfer_direction from reporting_ocpgcpcostlineitem_project_daily_summary WHERE data_transfer_direction IS NOT NULL AND month='5' GROUP BY data_transfer_direction;
   cost   | data_transfer_direction 
----------+-------------------------
 223200.0 | OUT                     
  74400.0 | IN                      
(2 rows)

trino:org1234567> select usage_start, unblended_cost, infrastructure_data_in_gigabytes, infrastructure_data_out_gigabytes, usage_amount from postgres.org1234567.reporting_ocpgcpcostlineitem_project_daily_summary_p_2024_05 WHERE namespace = 'Network unattributed' ORDER BY usage_start;
 usage_start |    unblended_cost    | infrastructure_data_in_gigabytes | infrastructure_data_out_gigabytes |     usage_amount     
-------------+----------------------+----------------------------------+-----------------------------------+----------------------
 2024-05-01  | 7200.000000000000000 |                0.000000000000000 |               257.697599999999970 |  240.000000000000000 
 2024-05-01  | 2400.000000000000000 |             1288.487999999999800 |                 0.000000000000000 | 1200.000000000000000 
 2024-05-02  | 7200.000000000000000 |                0.000000000000000 |               257.697599999999970 |  240.000000000000000 
 2024-05-02  | 2400.000000000000000 |             1288.487999999999800 |                 0.000000000000000 | 1200.000000000000000 

Inbound math:
Cost: 2400 = 50 (usage) * 2 (rate) * 24 hours
Quantity: 1288.488 = 50 (usage) * 24 hours * 1.07374 (gibibyte to gigabyte conversion)
Outbound math:
Cost: 7200 = 10 (usage) * 30 (rate) * 24 hours
Quantity: 257.6976 = 10 (usage) * 24 hours * 1.07374 (gibibyte to gigabyte conversion)
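
To sanity-check those expectations directly in Trino, the same arithmetic can be run as a plain SELECT, which should return 2400, 1288.488, 7200, and 257.6976 respectively:

trino:org1234567> SELECT 50 * 2 * 24 AS in_cost, 50 * 24 * 1.07374 AS in_gb, 10 * 30 * 24 AS out_cost, 10 * 24 * 1.07374 AS out_gb;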

Release Notes

  • proposed release note
* [COST-4745](https://issues.redhat.com/browse/COST-4745) This PR will **result in a numbers change when looking at OpenShift or GCP filtered by OpenShift endpoints when grouped by project**, as long as OpenShift costs are coming from a GCP cloud source.
* Previously the networking cost of the node was distributed amongst the projects on the node; now those networking costs are moved into a separate NEW project called `Network unattributed`.
* Example with numbers: 

- I have a node called `compute_1`, and this node has 2 projects, `projectA` and `projectB`, leaving 0 unallocated costs.
- When I look at the costs for this node grouped by project today, `projectA` costs $15 and `projectB` costs $5, for a total of $20.
- Of that $20, I know that $5 is networking cost, which was previously split evenly between the two projects ($2.50 each).
- After this change there will be 3 projects with costs for this node: `projectA`, `projectB`, and `Network unattributed`.
- The cost for `projectA` would now be $12.50, `projectB` would now be $2.50, and `Network unattributed` would be $5.
- The new `Network unattributed` project holds the networking costs that can be specifically tied to this node but not broken down at the project level.

codecov bot commented Apr 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.1%. Comparing base (07ae2b8) to head (9415f96).

Additional details and impacted files
@@           Coverage Diff           @@
##            main   #5058     +/-   ##
=======================================
- Coverage   94.1%   94.1%   -0.0%     
=======================================
  Files        376     376             
  Lines      31261   31261             
  Branches    4602    4602             
=======================================
- Hits       29429   29427      -2     
- Misses      1167    1168      +1     
- Partials     665     666      +1     

@cgoodfred cgoodfred added the gcp-smoke-tests label (pr_check will build the image and run gcp + ocp on gcp smoke tests) Jun 3, 2024
@cgoodfred cgoodfred self-assigned this Jun 3, 2024
@cgoodfred cgoodfred marked this pull request as ready for review June 3, 2024 15:56
@cgoodfred cgoodfred requested review from a team as code owners June 3, 2024 15:56
maskarb previously approved these changes Jun 4, 2024
@cgoodfred cgoodfred enabled auto-merge (squash) July 4, 2024 03:41
@lcouzens
Contributor

lcouzens commented Jul 4, 2024

/retest

@lcouzens
Contributor

lcouzens commented Jul 4, 2024

/retest

@cgoodfred cgoodfred merged commit bdd992d into main Jul 4, 2024
10 of 11 checks passed
@cgoodfred cgoodfred deleted the COST-4745-ocpgcp-network branch July 4, 2024 10:21
djnakabaale pushed a commit that referenced this pull request Jul 9, 2024
* [COST-4745] OCPGCP Network data processing SQL

---------

Co-authored-by: Sam Doran <[email protected]>
djnakabaale added a commit that referenced this pull request Jul 19, 2024
…5117)

* feat: cleaning up old changes. making first changes to create the API pieces.

* feat: cleaning up old changes. making first changes to create the API pieces.

* feat: insert new filters.

* feat: insert new filters and order by params.

* feat: customizing provider map and serializer.

* fix: changing TIME_CHOICES options.

* feat: first unit tests.

* feat: wip.

* feat: fixing provider map.

* feat: fixing provider map.

* update ec2 annotations to get all required fields

* use AWSEC2ComputeQueryParamSerializer

* feat: creating orderby and groupby serializers for ec2

* feat: unit tests for filters

* feat: removing group_by - not needed on ec2

* feat: fixing orderby serializer and starting units tests

* fix: typo

* feat: changing usage_hours to usage_amount

* feat: fixing unit test.

* wip: blocking some filters and unit test.

* feat: unit tests for group by filter

* flake8 fix

* feat: inserting more filters on validate function.

* feat: updating validate method to use similar logic and add filters.

* fix: changing unit tests for some filters.

* feat: testing filter combinations and flake8 checks.

* fix: test

* feat: serializer Unit tests and view Unit test fix

* flake8 fix

* fix: new approach to satisfy CodeCov

* fix: getting rid of validate custom method.

* fix: commenting tags.

* fix: validate functions, tests.

* handle filter params for specific report type

* transform tags to desired ui format

* default to monthly resolution on the EC2 endpoint

* add special pagination for EC2

* use default report type time period settings if exists

* fix typo

* fix: fixing parameters validations.

* [COST-5141] Fix management command to use continue instead of return. (#5173)

* [COST-5128] Process new subs tagging strategy to identify non-converted instances (#5162)

* [COST-4745] Add data_transfer_direction to OCP on GCP Trino tables (#5130)

* [COST-4741] Add data_transfer_direction for AWS network costs to Trino tables (#5129)

* [COST-5168] - Adding new penalty pipeline (#5176)

* [COST-5168] - Adding new penalty pipeline

* Improve our logging readability (#5178)

* add prometheus metrics for new queues (#5179)

* add v3.3.0 operator commits (#5143)

* [COST-5124] Improve Trino migration management command (#5163)

* Add exponential backoff and logging to retries
* Change log level to reflect severity
* Explicit SQL alias for clarity
* Catch and log exception instead of exiting
* Add return type hints
* Return if unsuccessful
  No point in verifying if the SQL did not run correctly
* Fine tune exponential backoff
* Create action class for adding/verifying columns were added
* Assign default list using default_factory
  Instead of doing it in the post_init, which gets a little weird.
* Add drop column action
* Quote items in logs for better legibility
* Consolidate action classes
  We lose some of the action-specific logging messages, but there is less
  code overall. I'm not sure how this scales to the action related to dropping.
* Change local variable name
  No need to add a prefix to differentiate it from the parameter name.
* Use a set to prevent running on the same schema multiple times

Co-authored-by: Cody Myers <[email protected]>

* Filter accounts by matching criteria during subs processing to prevent unnecessary SQL from running (#5184)

* Update tasks.py (#5185)

* clean up grafana dashboard (#5183)

* Skip OCPCloud tag SQL if key is present in cache but value is None (#5186)

* [COST-5196] - Send OCP tasks to correct queues (#5187)

* [COST-5196] - Send OCP tasks to correct queues

* [COST-5176] correctly pass context dictionary within log_json function call (#5182)

* [COST-5176] correctly pass context dictionary within log_json function call

* add unittests for exceptions in generate_report

* batch delete S3 files (#5180)

* Bump urllib3 from 1.26.18 to 1.26.19 in the pip group across 1 directory (#5172)

Bumps the pip group with 1 update in the / directory: [urllib3](https://github.com/urllib3/urllib3).


Updates `urllib3` from 1.26.18 to 1.26.19
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/1.26.19/CHANGES.rst)
- [Commits](urllib3/urllib3@1.26.18...1.26.19)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-type: indirect
  dependency-group: pip
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Add flower as a dev dependency (#5189)

* Add docs

* [COST-4844] Serializer update for ordering by storageclass (#5174)

* Switch to using podman in build_deploy (#5193)

The VM used in CI is now RHEL 8

* skip polling providers still processing (#5181)

* skip polling providers that are still processing

* [COST-5214] Move TARGETARCH declaration to the top of the Dockerfile (#5195)

There is a bug in podman where this is only used correctly for the
multi-stage build if it is defined as the first line.

Update Jenkinsfile to use RHEL 8

Unfortunately this breaks the image build for Docker. I'll fix that in a followup PR.

* [COST-5213] - fix S3 prepare (#5194)

* Switch default parquet flag to prevent iterating on all files in each worker when there is nothing to delete

* [COST-5214] pass build-arg to docker build command (#5196)

* [COST-5216] Delete filtering optimization (#5197)

* Revert "[COST-5216] Delete filtering optimization (#5197)" (#5200)

This reverts commit 97ba98e.

* [COST-5226] - Skip S3 delete (daily flow) if we have marked deletion complete. (#5198)

* dont attempt more S3 deletes if we have marked deletion complete

* [COST-5076] upgrade to python 3.11 (#4444)

* upgrade to python 3.11

* pipfile update

* add gcc-c++ compiler

Co-authored-by: Sam Doran <[email protected]>

* update test

* replace gcc with gcc-c++

---------

Co-authored-by: Sam Doran <[email protected]>

* [COST-5228] log outside for loop (#5202)

* [COST-5228] log outside for loop

* additional log clean up

* add context to logs in _remove_expired_data func

* log s3 batch deletes (#5204)

* log s3 batch deletes

* [COST-5219] Correctly report VM usage for metering when billing record is split (#5201)

* [COST-5219] Handle Azure instance record being split

* [COST-4745] OCPGCP Network data processing SQL (#5058)

* [COST-4745] OCPGCP Network data processing SQL

---------

Co-authored-by: Sam Doran <[email protected]>

* [COST-5198] - split read traffic to read replica db using nginx proxy (#5188)

* update nginx with HTTP method routing
* switch koku-api to koku-api-writes
* duplicate koku-api to koku-api-reads and add an optional mounted secret for the read replica
* update clowder configurator to read from read replica secret if mounted and enabled via ENV var

* remove unused methods (#5208)

* Bump certifi in the pip group across 1 directory (#5207)

* chore(image): update and rebuild image (#5203)

Co-authored-by: Update-a-Bot <[email protected]>

* Handle case when resource ID cannot be obtained (#5209)

* Catch exception case.

* [COST-5148] filter out empty resource ids and SavingsPlanCoveredUsage entries (#5206)

* [COST-5148] update insert sql
filter out empty resource ids
offset savings from SavingsPlanCoveredUsage

* closing CASE statement

* clean up comment

* remove case stmts in favor of filtering out SavingsPlanCoveredUsage

* clean up

* Unpause the csi volume handle sql (#5175)

* update linting

* squash commits

* clean up query params and time period settings

* do not use filter keyword

* more code clean up
update unit tests

* address feedback
- move report_specific filters to main filter map
- use report_type instead of kwargs in get_paginator
- do not use deepcopy - just overwrite query_data
- resolution and time_scope_units are always monthly and month respectively
- overide start and end date params in base ParamSerializer
- overide limit and offset in base FilterSerializer

* more unit tests

* update openapi spec

* clean up and add unit tests

* move changes to openapi spec to a separate pr

* use serializer choice field and not customer validate method

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: David N <[email protected]>
Co-authored-by: Luke Couzens <[email protected]>
Co-authored-by: Cody Myers <[email protected]>
Co-authored-by: Corey Goodfred <[email protected]>
Co-authored-by: Sam Doran <[email protected]>
Co-authored-by: Michael Skarbek <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Chris Hambridge <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Update-a-Bot <[email protected]>
Labels
gcp-smoke-tests (pr_check will build the image and run gcp + ocp on gcp smoke tests), smokes-required