Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow mapping event types to groups using regexes #17772

Merged
merged 1 commit into from
Aug 10, 2018

Conversation

tadeboro
Copy link
Contributor

Up until now, in order to associate an event type with an event group, we needed to spell out events full name in settings file under appropriate event group category and level.

Some of the providers have hundreds of event types, which makes filling all this information in tedious and error prone. Not to mention that settings file is now one giant pile of event types.

In order to make process of registering event types scale a bit better, this commit adds support for registering a class of event types that match specified regular expressions. Simple example:

:ems:
  :ems_dummy:
    :event_handling:
      :event_groups:
        :addition:
          :critical:
          - dummy_event01_add
          - dummy_event02_add
          :detail:
          - dummy_event03_add
          - dummy_event04_add
          ...
          - dummy_event70_add
          - dm_event_create

This can now be written as

:ems:
  :ems_dummy:
    :event_handling:
      :event_groups:
        :addition:
          :critical:
          - dummy_event01_add
          - dummy_event02_add
          :detail:
          - dm_event_create
          :detail_regex:
          - ^dummy_[0-9]{2}_add$

Regular expression support has been added to all levels, so if there exists a level :lvl, then :lvl_regex is also available.

We took a great deal of care not to break existing code, so non-regex event type mappings take precedence over regular expression matches. For example, in our sample above, dummy_event_02_add gets assigned critical level despite also matching detail regex.

Note that if sets of event types that are matched by regular expressions are not disjoint, then there will be multiple valid event type mappings. In this case, event types that match more than one
group and/or level get assigned an arbitrary one.

@miq-bot assign @gtanzillo

/cc @miha-plesko @agrare @gberginc

@miha-plesko
Copy link
Contributor

Copy link
Contributor

@miha-plesko miha-plesko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @tadeboro for the PR, it's great to see what kind of power we're about to get with regular expressions - I mean we'll assign Nuage events in a single line instead of prevous 1k 👍 Please see my suggestion how to make this PR more readable, looking forward to your opinion.

@gtanzillo @agrare so we had discussion with Fryguy on this topis some time (see #17295) and we agreed that REGEX support is mandatory in order to support Timelines for Nuage because Nuage has like 1000 event types. Kindly ask for review.

@@ -36,7 +36,9 @@ class EventStream < ApplicationRecord

after_commit :emit_notifications, :on => :create

GROUP_LEVELS = %i(critical detail warning).freeze
GROUP_LEVELS = %w(critical detail warning).map do |lvl|
[lvl.to_sym, (lvl + '_regex').to_sym]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer if we didn't go with 2D array at this point as it's affects code readability and introduces a bunch of annonymous arguments _ below. So IMAO we should just leave a simple %i(critical detail warning) here and update the group-matching .detect statement below to append _regex suffix on the fly, please see comment below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are going to have a hard time convincing me that placing a string concatenation inside a loop is a good idea, no matter the readability benefits. Especially since we would be doing same three concatenations over and over again.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you encode the regex directly as mentioned in #17772 (comment), then I think you don't need to differentiate with an _regex special suffix.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I'm picturing a YAML that looks like

:ems:
  :ems_dummy:
    :event_handling:
      :event_groups:
        :addition:
          :critical:
          - dummy_event01_add
          - dummy_event02_add
          :detail:
          - dm_event_create
          - !ruby/regexp /^dummy_[0-9]{2}_add$/

Then when you loop over that you would do something like

GROUP_LEVELS.detect { |lvl| lvl.any? { |v| v === event_type } }

Note that === does a match? for regexps and == for Strings, so it's a nice operator that works for both.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will definitely use the serialized version of regex, since this kills two problems I had in one hit: regex validation and avoiding regex construction on each self.group_and_level call.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for the loop, I would still keep the two loops separated, since giving explicitly named event types precedence over regular expression matches since this gives us guarantee that regular expressions do not break existing functionality. For example, provider introducing a bad regular expression cannot "recategorise" some other provider's events.

This separation is independent of the place where regular expressions are listed, so we can still add them to the configuration file under existing levels.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, you can do strings, regexes = events.partition { |e| e.kind_of?(Regexp }, and then process those subcategories separately

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am afraid that partitioning will not work here, since iteration over strings and regexes do not happen inside the same one-level-up iteration. In pseudo-code, we determine the event type group and level like this:

for each event group:
  for each level:
    if event type is equal to one of the types in level, set level and group

if group is not set:
  for each event group:
    for each level:
      if event type matches one of the type regexes in level, set level and group


group = egroups.find do |_, value|
GROUP_LEVELS
.detect { |lvl, _| value[lvl]&.include?(event_type) }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I'd go with something like

.detect { |lvl| value[lvl]&.include?(event_type) || Regexp.new(value["#{lvl}_regex"]).match(event_type) }

and then we could omit all the egroups memoization, annonymous args _ and the .first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would break backwards compatibility, since badly formed regular expressions in one group of event types could take precedence over explicitly named events in other groups. And backwards compatibility is not something I would like to mess with.

Just to give you an example of bad situation, let us assume that one of the providers has this in its settings:

:ems:
  :ems_dummy:
    :event_handling:
      :event_groups:
        :addition:
          :critical_regex:
          - .+

If we are unlucky and this is the first category that is checked, all of the events will be mapped to :addition group with :critical level. This example is an exaggeration, but the the general principle still stands.

We have two loops here for a reason: we would like to give explicitly named types higher precedence over regular expressions.


group ||= egroups.find do |_, value|
GROUP_LEVELS
.detect { |_, re_lvl| value[re_lvl]&.find { |regex| Regexp.new(regex).match(event_type) } }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if regex is not given, we end up with Regexp.new(nil) which raises TypeError, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If my ruby serves me, then no. If value[re_lvl] returns nil, because regex was not given, then find is not executed at all.

@Fryguy
Copy link
Member

Fryguy commented Aug 6, 2018

I haven't read the details, but I wonder if it's better to put an actual Ruby regexp into the yaml, so it can be serialized and deserialized directly (with the side effect of verifying that it's a valid regex)

[1] pry(main)> require 'yaml'
[2] pry(main)> s = {:detail_regex => /^dummy_[0-9]{2}_add$/}.to_yaml
[3] pry(main)> puts s
---
:detail_regex: !ruby/regexp /^dummy_[0-9]{2}_add$/
[4] pry(main)> YAML.load(s)
=> {:detail_regex=>/^dummy_[0-9]{2}_add$/}

@Fryguy
Copy link
Member

Fryguy commented Aug 6, 2018

To add to #17772 (comment), you can have an event with simple chars like ., and we don't want those interpreted as regexes. If you encode the regex into the YAML then we can have strings with . that are not confused with regexes.

@Fryguy Fryguy assigned Fryguy and unassigned gtanzillo Aug 6, 2018
@tadeboro
Copy link
Contributor Author

tadeboro commented Aug 7, 2018

@Fryguy Thanks for valuable feedback. It is greatly appreciated. I implemented most of the enhancements you proposed. I only kept the regular expression precedence lower in order to have a tool to cherry-pick some of the event types into different groups/levels.


group = egroups.find do |_, value|
GROUP_LEVELS
.find { |lvl| value[lvl]&.select { |typ| typ.kind_of?(String) }&.include?(event_type) }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.select can't return nil, so you can remove the & before .include?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

select is safe, but the [] is not. And if I am not mistaken, &. operator does not short-circuit, which means that include? will still get called on nil.

Copy link
Member

@Fryguy Fryguy Aug 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this can be simplified to...

GROUP_LEVELS.detect { |lvl| value[lvl]&.any? { |typ| typ.kind_of?(String) && typ == event_type } }

(Technically you don't even need the .kind_of? check because Regexp can never == a String, but I can see leaving it for consistency)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nil&.include?("foo")
# => nil

&. short-cirtcuits, but my point is that .select can never return nil. The &. before the select must stay as you have it though because value[lvl] can return nil.

Copy link
Contributor Author

@tadeboro tadeboro Aug 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

> {a: [1, 2]}[:a].select { |_| false }.include?(1)
=> false
> {a: [1, 2]}[:b].select { |_| false }.include?(1)
NoMethodError: private method `select' called for nil:NilClass
> {a: [1, 2]}[:b]&.select { |_| false }.include?(1)
NoMethodError: undefined method `include?' for nil:NilClass

Now I am almost sure we need that last &.


group ||= egroups.find do |_, value|
GROUP_LEVELS
.find { |lvl| value[lvl]&.select { |typ| typ.kind_of?(Regexp) }&.find { |regex| regex.match(event_type) } }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here before the .find after the .select

Additionally, prefer regex.match?(event_type), since you don't actually need the value for the find call.

Copy link
Member

@Fryguy Fryguy Aug 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, now that I'm looking deeper, this code finds the GROUP_LEVEL if any of the deeper arrays match. In this case, I think it's preferable to do...

GROUP_LEVELS.detect { |lvl| value[lvl]&.any? { |typ| typ.kind_of?(Regexp) && regex.match?(event_type) } }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assumed that code needs to build against ruby 2.3 that has no match? listed in API docs. But maybe this is a documentation issue. Will fix this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wow you are right! I didn't realize .match? was Ruby 2.4 only, So yeah, leave it.

GROUP_LEVELS
.find { |lvl| value[lvl]&.select { |typ| typ.kind_of?(Regexp) }&.find { |regex| regex.match(event_type) } }
.tap { |level_found| level = level_found || level }
end&.first
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor, but prefer .detect over .find. I'm surprised the original code had both, but .detect is preferred.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I did try to make it consistent, but being relatively new to ruby/MIQ, I managed to pick the wrong one.

@tadeboro tadeboro force-pushed the event-type-regex branch 3 times, most recently from 5f3487d to 3b3b07e Compare August 8, 2018 08:31
Up until now, in order to associate an event type with an event group,
we needed to spell out events full name in settings file under
appropriate event group category and level.

Some of the providers have hundreds of event types, which makes
filling all this information in tedious and error prone. Not to
mention that settings file is now one giant pile of event types.

In order to make process of registering event types scale a bit
better, this commit adds support for registering a class of event
types that match specified regular expressions. Simple example:

    :ems:
      :ems_dummy:
        :event_handling:
          :event_groups:
            :addition:
              :critical:
              - dummy_event01_add
              - dummy_event02_add
              :detail:
              - dummy_event03_add
              - dummy_event04_add
              ...
              - dummy_event70_add
              - dm_event_create

This can now be written as

    :ems:
      :ems_dummy:
        :event_handling:
          :event_groups:
            :addition:
              :critical:
              - dummy_event01_add
              - dummy_event02_add
              :detail:
              - dm_event_create
              - !ruby/regexp /^dummy_event[0-9]{2}_add$/

We took a great deal of care not to break existing code, so non-regex
event type mappings take precedence over regular expression matches.
For example, in our sample above, dummy_event_02_add gets assigned
critical level despite also matching detail regex.

Note that if sets of event types that are matched by regular
expressions are not disjoint, then there will be multiple valid event
type mappings. In this case, event types that match more than one
group and/or level get assigned an arbitrary one.
@miq-bot
Copy link
Member

miq-bot commented Aug 9, 2018

Checked commit xlab-si@85e3008 with ruby 2.3.3, rubocop 0.52.1, haml-lint 0.20.0, and yamllint 1.10.0
2 files checked, 0 offenses detected
Everything looks fine. 🍰

@Fryguy Fryguy merged commit 2fefe62 into ManageIQ:master Aug 10, 2018
@Fryguy Fryguy added this to the Sprint 92 Ending Aug 13, 2018 milestone Aug 10, 2018
@tadeboro tadeboro deleted the event-type-regex branch August 10, 2018 20:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants