Skip to content
This repository has been archived by the owner on Jul 11, 2023. It is now read-only.

ref(crdconvert): respond with errors from all requested resources #4601

Merged
merged 3 commits into from
Mar 23, 2022
Merged

ref(crdconvert): respond with errors from all requested resources #4601

merged 3 commits into from
Mar 23, 2022

Conversation

nojnhuh
Copy link
Contributor

@nojnhuh nojnhuh commented Mar 21, 2022

Description:
Currently, CRD conversion webhooks only respond with the error returned
by the first failed resource converted, but there may be several. This
change aggregates errors from all resources sent in a request.

In addition, for #4568 I'm planning to add logic to record metrics labeled with the requested resources' apiVersion and kind from objects parsed out of the conversion request. Those changes (WIP) are shown here: e5587d6#diff-c4fc6ee6327a461684ba57d2d0cc26326c4cb0a980119d85cf93c730c1a7ec65. Without this change, resources in the request beyond the first one that fails will not have metrics recorded showing they were also failed to be converted.

Testing done:

  • Added/updated unit tests

Affected area:

Functional Area
New Functionality [ ]
CI System [ ]
CLI Tool [ ]
Certificate Management [ ]
Control Plane [ ]
Demo [ ]
Documentation [ ]
Egress [ ]
Ingress [ ]
Install [ ]
Networking [ ]
Observability [ ]
Performance [ ]
SMI Policy [ ]
Security [ ]
Sidecar Injection [ ]
Tests [ ]
Upgrade [ ]
Other [X]

Please answer the following questions with yes/no.

  1. Does this change contain code from or inspired by another project? No

    • Did you notify the maintainers and provide attribution?
  2. Is this a breaking change? No

  3. Has documentation corresponding to this change been updated in the osm-docs repo (if applicable)? N/A

Currently, CRD conversion webhooks only respond with the error returned
by the first failed resource converted, but there may be several. This
change aggregates errors from all resources sent in a request.

Signed-off-by: Jon Huhn <[email protected]>
@shashankram
Copy link
Member

Currently, CRD conversion webhooks only respond with the error returned
by the first failed resource converted, but there may be several. This
change aggregates errors from all resources sent in a request.

Could you comment on why the aggregation of errors is necessary here? I am guessing it may be important in some circumstances to know about all the errors that might occur to be able to diagnose an issue pertaining to the conversion. I am curious to know why this is necessary in this particular case.

@codecov-commenter
Copy link

codecov-commenter commented Mar 21, 2022

Codecov Report

Merging #4601 (27bdc3c) into main (eec706d) will increase coverage by 0.39%.
The diff coverage is 42.30%.

@@            Coverage Diff             @@
##             main    #4601      +/-   ##
==========================================
+ Coverage   68.87%   69.26%   +0.39%     
==========================================
  Files         217      217              
  Lines       15019    15021       +2     
==========================================
+ Hits        10344    10405      +61     
+ Misses       4623     4563      -60     
- Partials       52       53       +1     
Flag Coverage Δ
unittests 69.26% <42.30%> (+0.39%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...onversion/config_multiclusterservice_conversion.go 0.00% <0.00%> (ø)
pkg/crdconversion/policy_egress_conversion.go 0.00% <0.00%> (ø)
...crdconversion/policy_ingressbackends_conversion.go 0.00% <0.00%> (ø)
pkg/crdconversion/policy_retry_conversion.go 0.00% <0.00%> (ø)
pkg/crdconversion/smi_httproutegroup_conversion.go 0.00% <0.00%> (ø)
pkg/crdconversion/smi_tcproutes_conversion.go 0.00% <0.00%> (ø)
pkg/crdconversion/smi_trafficaccess_conversion.go 0.00% <0.00%> (ø)
pkg/crdconversion/smi_trafficsplit_conversion.go 0.00% <0.00%> (ø)
pkg/crdconversion/config_meshconfig_conversion.go 75.60% <33.33%> (ø)
pkg/crdconversion/framework.go 67.01% <90.90%> (+62.79%) ⬆️
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update eec706d...27bdc3c. Read the comment docs.

@nojnhuh
Copy link
Contributor Author

nojnhuh commented Mar 21, 2022

Could you comment on why the aggregation of errors is necessary here? I am guessing it may be important in some circumstances to know about all the errors that might occur to be able to diagnose an issue pertaining to the conversion. I am curious to know why this is necessary in this particular case.

It's not strictly necessary for the conversion to work, but I see it as generally useful to return as much information as we have available for users to be able to fix errors faster than one at a time.

My primary motivation for this change is to accurately count metrics which I tried to describe above. Currently, if a request comes in to convert two resources and converting the first resource fails, the response would indicate converting both failed but the metrics would only count one failed.

@shashankram
Copy link
Member

Could you comment on why the aggregation of errors is necessary here? I am guessing it may be important in some circumstances to know about all the errors that might occur to be able to diagnose an issue pertaining to the conversion. I am curious to know why this is necessary in this particular case.

It's not strictly necessary for the conversion to work, but I see it as generally useful to return as much information as we have available for users to be able to fix errors faster than one at a time.

My primary motivation for this change is to accurately count metrics which I tried to describe above. Currently, if a request comes in to convert two resources and converting the first resource fails, the response would indicate converting both failed but the metrics would only count one failed.

Please also document this in the code, so someone doesn't revert this back to return the first error as a form of simplification for the existing code.

Signed-off-by: Jon Huhn <[email protected]>
Copy link
Member

@shashankram shashankram left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a minor comment that can be ignored if you think it so

// doConversion converts the requested object given the conversion function and returns a conversion response.
// failures will be reported as Reason in the conversion response.
func doConversion(convertRequest *v1beta1.ConversionRequest, convert convertFunc) *v1beta1.ConversionResponse {
var convertedObjects []runtime.RawExtension
// aggregate errors from all objects in the request vs. only returning the first
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This comment is missing why aggregating errors helps in this case.

Signed-off-by: Jon Huhn <[email protected]>
@nojnhuh
Copy link
Contributor Author

nojnhuh commented Mar 23, 2022

GitHub Actions seems to be acting up at the moment per https://githubstatus.com, so the regular CI tests aren't running for the newest commit here. Since the previous commit passed aside from the known image scan issue for alpine that has a fix upstream in progress and I've only changed a code comment since then, I'm going to merge this.

@nojnhuh nojnhuh merged commit aaa7f30 into openservicemesh:main Mar 23, 2022
@nojnhuh nojnhuh deleted the crdoncvert-errs branch March 23, 2022 16:58
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants