Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Wildcard Refresh Interval #276

Merged
merged 5 commits into from
Dec 20, 2021
Merged

Add Wildcard Refresh Interval #276

merged 5 commits into from
Dec 20, 2021

Conversation

braydonk
Copy link
Contributor

Adds new field wildcard_refresh_interval to the logging files receiver
in the ops-agent config.

Adds new field wildcard_refresh_interval to the logging files receiver
in the ops-agent config.
@braydonk braydonk force-pushed the braydonk-refresh-interval branch from 9325aa4 to b95a9dd Compare November 22, 2021 15:55
@braydonk braydonk requested review from a team and igorpeshansky and removed request for a team November 23, 2021 17:57
confgenerator/config.go Outdated Show resolved Hide resolved
@braydonk braydonk changed the title confgenerator: add wildcard_refresh_interval Add Wildcard Refresh Interval Nov 24, 2021
@@ -0,0 +1 @@
cannot unmarshal uint64 into Go struct field UnifiedConfig.Logging of type time.Duration
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we can reuse some similar validation / error message customization that collection_interval uses? In theory, the same type of validations should apply to both, except the minimum is different.

References:

Copy link
Contributor Author

@braydonk braydonk Nov 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have been trying to figure this out but haven't come up with a solution. That spot in config.go only seems to be called if it fails a validation, but this error is failing during the unmarshalling step. It looks like we also don't handle it for invalid port numbers, since those fail at the unmarshalling step as well. Reference: https://github.com/GoogleCloudPlatform/ops-agent/blob/master/confgenerator/testdata/invalid/linux/logging-receiver_tcp_type_invalid_parameter_listen_port_is_not_an_int/golden_error

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about something like

CollectionInterval string `yaml:"collection_interval" validate:"required,duration=10s"` // time.Duration format

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was only done as a band-aid fix for the fact that the go-yaml library couldn't unmarshal the time.Duration type. Now that it can (see goccy/go-yaml#246) we'd rather use that than a string with a custom validator.

The reason we didn't run into this problem with CollectionInterval is because it will unmarshal into a string, but fails on our duration validator. (FWIW this also resulted in an unhelpful error message of "did not meet minimum duration").

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the duration validator was written before the YAML parser could unmarshal directly into a time.Duration. I think it's totally fine to have the normal unmarshal error if there is garbage in the input config, so don't worry too much about what error you get.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. Do I get it right that we are basically choosing between two options:

Option 1. Use a custom validator like duration.

Pros:

Cons

  • We need to maintain a custom validator duration.

Option 2. Use the upstream time.Duration validator

Pros

  • No need to maintain a custom validator for duration parsing.

Cons

  • Degraded user experience as the error message does not point out which line of the configuration is problematic with a concise reason. Users need to guess based on the internal config struct names.

Any other tradeoffs I missed?

If this is the main pros and cons, option 2 makes sense if we do get the pros (aka we can get rid of the maintenance of a custom validator). Otherwise it feels like we get degraded user experience but not much gains. Also for consistency and long-term maintenance, we should keep things consistent for similar types (aka between collection_interval and wildcard_refresh_interval.)

BTW we do eventually want to fix all the raw unmarshal error messages to make them more actionable and self descriptive. We haven't tackled them because the cost is relatively high, but it's not the ideal state that we'd like to stick with in the long term.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't about validation; it's about whether the config struct contains a time.Duration or a string.

Actually I don't know why it does't have a better error message in option 2; the YAML parser is the thing generating the error message, so it certainly has enough info to emit the line and column number.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't get terribly far when I researched this, but what I managed to find was that if the library fails to assert a string on it, it returns the typeError which is the less helpful message we see here. If we put a different type of garbage input that does assert a string, we get the marginally more helpful time: invalid duration "asdfjk34802934809" which is still not ideal.

Copy link
Contributor Author

@braydonk braydonk Dec 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also for consistency and long-term maintenance, we should keep things consistent for similar types (aka between collection_interval and wildcard_refresh_interval.)

FWIW I have a branch off this one that changes CollectionInterval to a time.Duration. I was planning to open a bug and PR for it depending what we decided on this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. A better unmarshal error message that emits the line and column number probably requires some upstream go-yaml changes. Sounds like it's a limitation at that level.

A follow up PR to change CollectionInterval sounds perfect. I'm ready to LGTM this one.

Note that we are now in a rare state of code chill / freeze, so feature PRs like this one needs to be held from merging for the time being. More details in the internal team email alias conversation.

confgenerator/config.go Outdated Show resolved Hide resolved
@@ -0,0 +1 @@
cannot unmarshal uint64 into Go struct field UnifiedConfig.Logging of type time.Duration
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the duration validator was written before the YAML parser could unmarshal directly into a time.Duration. I think it's totally fine to have the normal unmarshal error if there is garbage in the input config, so don't worry too much about what error you get.

confgenerator/config.go Outdated Show resolved Hide resolved
confgenerator/config.go Outdated Show resolved Hide resolved
Use a different strategy for the validator that is more scalable than
the original `whole_time_denomination` implementation; take a
time.Duration and see if it is a multiple of some other duration.
@braydonk braydonk requested a review from quentinmit November 30, 2021 15:01
Copy link
Contributor

@qingling128 qingling128 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, modulo a follow up PR to change collection_interval so that the two are consistent for long-term maintenance.

@braydonk braydonk merged commit bd66309 into master Dec 20, 2021
@braydonk braydonk deleted the braydonk-refresh-interval branch June 20, 2022 14:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants