[COST-4620] - Add basic provider data validation #5218

Merged
merged 18 commits into from
Aug 2, 2024

Conversation

@lcouzens lcouzens commented Jul 17, 2024

Jira Ticket

COST-4620

Description

  • Add a new post-summary task that validates selected total metrics between Postgres and Trino data.
  • Log whether the data looks valid and mark the provider true/false accordingly, giving us a quick way to build a list of providers with incorrect data.
  • Add a trigger for OCP on Cloud clusters as well.
  • Add a new Masu endpoint to manually trigger this validation for a given date range.
  • Include an Unleash flag so the task is skipped by default and can be enabled gradually per customer, letting us watch for any performance hits.

Testing

  1. Checkout Branch
  2. Restart Koku
  3. Enable the validation unleash flag is_validation_enabled
  4. Ingest data like normal.
  5. You should see logging as follows per provider ingested
{'message': 'validation started for provider: a67bb4a2-8e05-40
{'message': 'triggering VALIDATION_QUERY',
{'message': 'finished VALIDATION_QUERY',
{'message': 'executing trino sql', 'tracing_id': '', 'log_ref': 'data validation query'}
{'message': 'executed trino sql', 'tracing_id': '', 'log_ref': 'data validation query', 'running_time': 0.5209639072418213}
{'message': 'all data complete for provider: a67bb4a2-8e05-4078-b1e4-9c8525f3fb0f'
  6. You can also re-trigger the validation task via this masu endpoint:
    http://localhost:5042/api/cost-management/v1/validate_cost_data/?provider_uuid={uuid}&start_date=2024-07-01&end_date=2024-07-16
  7. You should see output similar to the above in the logs.

Testing invalid data

  1. After you have ingested some data, let's mess it up in Postgres!
  2. Here's a command for AWS:
    update org1234567.reporting_awscostentrylineitem_daily_summary set unblended_cost = 0 where product_code = 'AmazonEC2';
  3. Run validation via masu again.
  4. See the following log messages:
'message': 'validation started for provider: a67bb4a2-8e05-4078-b1e4-9c8525f3fb0f'
'message': 'triggering VALIDATION_QUERY'
'message': 'finished VALIDATION_QUERY'
'message': 'executing trino sql'
'message': 'executed trino sql'
'message': "provider has incomplete data for specified days: {datetime.date(2024, 7, 1): {'pg_value': 3789.255140293, 'trino_value': 6347.878609799311, 'delta': 2558.623469506311},
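
The per-day report in the last log line can be reproduced with a short sketch. This is not the code from the PR: the function name `compare_daily_totals` and the `tolerance` threshold are hypothetical, chosen only to illustrate comparing daily totals from Postgres and Trino and recording `pg_value`, `trino_value`, and `delta` for days that disagree.

```python
import datetime


def compare_daily_totals(pg_totals, trino_totals, tolerance=1.0):
    """Return a dict of days whose Postgres and Trino totals differ.

    `tolerance` is an assumed threshold (not taken from the PR) that
    absorbs small floating-point differences between the two sources.
    """
    mismatches = {}
    # Look at every day present in either source; a missing day counts as 0.
    for day in sorted(set(pg_totals) | set(trino_totals)):
        pg_value = float(pg_totals.get(day, 0))
        trino_value = float(trino_totals.get(day, 0))
        delta = abs(trino_value - pg_value)
        if delta > tolerance:
            mismatches[day] = {
                "pg_value": pg_value,
                "trino_value": trino_value,
                "delta": delta,
            }
    return mismatches


# Values copied from the log line above (2024-07-01 after zeroing EC2 costs).
pg = {datetime.date(2024, 7, 1): 3789.255140293}
trino = {datetime.date(2024, 7, 1): 6347.878609799311}
report = compare_daily_totals(pg, trino)
print(report)
```

With the values from the log, the single day is flagged and its delta is the absolute difference between the Trino and Postgres totals. Days whose totals agree to within the assumed tolerance are left out of the report.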

Testing OCP on Cloud clusters

  1. Ingest OCP data
  2. Ingest correlated cloud data
  3. Check logs for matching validation for the correlated clusters similar to those above.

Testing via masu

  1. Ingest data
  2. Hit the following endpoint, substituting your provider ID: http://localhost:5042/api/cost-management/v1/validate_cost_data/?provider_uuid={provider_uuid}&start_date=2024-07-01&end_date=2024-07-19
  3. See validation log messages.

Testing an OCP on CLOUD cluster via masu

  1. Ingest OCP on Cloud data
  2. Hit the following endpoint, substituting your provider ID and cloud type: http://localhost:5042/api/cost-management/v1/validate_cost_data/?provider_uuid={provider_uuid}&ocp_on_cloud_type={type}&start_date=2024-07-01&end_date=2024-07-19
  3. See validation log messages similar to the above.
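
For scripting the two masu calls above, a small helper can assemble the URL. This is a sketch only: `build_validation_url` is a hypothetical name, and the `"AWS"` value used for `ocp_on_cloud_type` in the example is an assumption, since the description leaves `{type}` as a placeholder. The base URL and query parameter names are taken from the endpoints shown in this description.

```python
from urllib.parse import urlencode

# Local masu endpoint as given in the testing steps above.
BASE_URL = "http://localhost:5042/api/cost-management/v1/validate_cost_data/"


def build_validation_url(provider_uuid, start_date, end_date, ocp_on_cloud_type=None):
    """Build the validation URL; ocp_on_cloud_type is only added when given."""
    params = {"provider_uuid": provider_uuid}
    if ocp_on_cloud_type is not None:
        params["ocp_on_cloud_type"] = ocp_on_cloud_type
    params["start_date"] = start_date
    params["end_date"] = end_date
    return f"{BASE_URL}?{urlencode(params)}"


# Plain provider validation.
plain_url = build_validation_url(
    "a67bb4a2-8e05-4078-b1e4-9c8525f3fb0f", "2024-07-01", "2024-07-19"
)
# OCP on Cloud validation; "AWS" is an illustrative value, not confirmed by the PR.
ocp_url = build_validation_url(
    "a67bb4a2-8e05-4078-b1e4-9c8525f3fb0f", "2024-07-01", "2024-07-19",
    ocp_on_cloud_type="AWS",
)
print(plain_url)
print(ocp_url)
```

The resulting string can be passed to any HTTP client (for example `curl` or a browser) against a locally running Koku instance.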

Notes:

Alongside the testing above, if you want to verify the query data, you can enable DEBUG logs; the comparison data from Trino and PG is then logged for quick comparison.
Example: [2024-07-22 12:45:54,311] INFO fc9a26e0-a60a-4264-8aac-afebb4f082e5 3268 PG: {datetime.date(2024, 7, 18): 3870.034663868, datetime.date(2024, 7, 19): 3670.98312068} Trino data: {datetime.date(2024, 7, 18): 3870.034663795944, datetime.date(2024, 7, 19): 3670.9831205979876}

Release Notes

  • proposed release note
* [COST-4620](https://issues.redhat.com/browse/COST-4620) Add internal data validation

@lcouzens lcouzens added smoke-tests pr_check will build the image and run minimal required smokes and removed smoke-tests pr_check will build the image and run minimal required smokes labels Jul 17, 2024
@lcouzens lcouzens added the smoke-tests pr_check will build the image and run minimal required smokes label Jul 22, 2024

codecov bot commented Jul 22, 2024

Codecov Report

Attention: Patch coverage is 91.97861% with 15 lines in your changes missing coverage. Please review.

Project coverage is 94.1%. Comparing base (9262a29) to head (15b358d).

Additional details and impacted files
@@           Coverage Diff           @@
##            main   #5218     +/-   ##
=======================================
- Coverage   94.1%   94.1%   -0.0%     
=======================================
  Files        373     375      +2     
  Lines      31226   31412    +186     
  Branches    4593    4623     +30     
=======================================
+ Hits       29379   29549    +170     
- Misses      1177    1185      +8     
- Partials     670     678      +8     

@lcouzens lcouzens marked this pull request as ready for review July 22, 2024 17:32
@lcouzens lcouzens requested review from a team as code owners July 22, 2024 17:32
koku/masu/processor/_tasks/data_validation.py (outdated, resolved)
koku/api/provider/models.py (outdated, resolved)
koku/masu/processor/_tasks/data_validation.py (resolved)
koku/masu/processor/_tasks/data_validation.py (resolved)

@djnakabaale djnakabaale left a comment

Here's some feedback on a couple of things I noticed:

koku/api/utils.py (outdated, resolved)
koku/masu/database/report_db_accessor_base.py (resolved)
koku/masu/database/report_db_accessor_base.py (outdated, resolved)
koku/api/utils.py (outdated, resolved)
@lcouzens lcouzens merged commit 580f47a into main Aug 2, 2024
11 checks passed
@lcouzens lcouzens deleted the COST-4620 branch August 2, 2024 19:06
Labels
smoke-tests pr_check will build the image and run minimal required smokes smokes-required
3 participants