Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support alternate rules loader #3008

Merged
merged 6 commits into from
Jan 30, 2024
Merged

Conversation

mstemm
Copy link
Contributor

@mstemm mstemm commented Jan 11, 2024

What type of PR is this?

Uncomment one (or more) /kind <> lines:

/kind bug

/kind cleanup

/kind design

/kind documentation

/kind failing-test

/kind feature

/kind release

Any specific area of the project related to this PR?

Uncomment one (or more) /area <> lines:

/area build

/area engine

/area tests

/area proposals

/area CI

What this PR does / why we need it:
Add support for an externally provided rules loader, which will allow extensions to the falco rules syntax.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

Copy link

This PR may bring feature or behavior changes in the Falco engine and may require the engine version to be bumped.

Please double check userspace/engine/falco_engine_version.h file. See versioning for FALCO_ENGINE_VERSION.

/hold

@mstemm mstemm force-pushed the support-alternate-rules-loader branch from 411d887 to 27ee313 Compare January 11, 2024 23:19
@mstemm mstemm force-pushed the support-alternate-rules-loader branch from 27ee313 to cc9bf48 Compare January 12, 2024 20:59
@leogr leogr requested a review from jasondellaluce January 15, 2024 09:34
@Andreagit97 Andreagit97 added this to the 0.38.0 milestone Jan 17, 2024
@mstemm
Copy link
Contributor Author

mstemm commented Jan 17, 2024

@incertum here's the PR I mentioned during the community call this am.

@incertum
Copy link
Contributor

Thanks @mstemm supportive of this feature. Would you mind adding a few comments so it is easier to understand the technical details and extension of behavior?

@mstemm
Copy link
Contributor Author

mstemm commented Jan 18, 2024

Thanks @mstemm supportive of this feature. Would you mind adding a few comments so it is easier to understand the technical details and extension of behavior?

Sure, do you mean to the PR or to the commit comments?

@incertum
Copy link
Contributor

Thanks @mstemm supportive of this feature. Would you mind adding a few comments so it is easier to understand the technical details and extension of behavior?

Sure, do you mean to the PR or to the commit comments?

Either way, mostly as PR comments. Thanks!

@mstemm
Copy link
Contributor Author

mstemm commented Jan 22, 2024

In some cases, a user of the falco engine may want to extend the falco rules format to provide additional objects to the rules file. Falco already supports extensibility in several ways, with support for plugins, custom event sources e.g. add_source(), and custom ruleset objects that handle the details of evaluating filters.

This PR builds on these interfaces to add extensibility to the rules loading process. Rules loading is currently handled by 3 classes rules_loader::{reader,collector,compiler} with the output ultimately ending up in a rule_loader::compile_output object. With this PR, a developer can extend these classes to add support for new top level yaml objects in the falco rules file, and then provide these classes to the falco engine via set_rule_loader(). In turn, this PR changes how the result of rules loading is provided to the ruleset object, by passing the compile output to the ruleset via filter_ruleset::add_compile_output() instead of individual calls to filter_ruleset::add().

These changes are backwards compatible with existing interfaces and the default behavior is unchanged.

A unit test in test_rule_loader.cpp also acts as an example of how to extend the rules format.

@incertum
Copy link
Contributor

Nice @mstemm seems all very clear to me and I also like the pure code refactor to adopt smart pointers even more and other cleanups. I also agree @jasondellaluce should be the primary reviewer. In summary, I like these changes very much!

@@ -73,6 +73,11 @@ class falco_engine
// If source is non-empty, only fields for the provided source are printed.
void list_fields(std::string &source, bool verbose, bool names_only, bool markdown) const;

// Provide an alternate rule reader, collector, and compiler
// to compile any rules provided via load_rules*
void set_rule_loader(std::shared_ptr<rule_loader::reader> reader,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a regular case of dependency injection of interfaces, so why not breaking it down in the three set/get_rule_loader_reader, set/get_rule_loader_collector, set/get_rule_loader_compiler ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, split and updated the unit tests.

Not all rulesets need this information, so a default implementation
does nothing.
*/
virtual void set_falco_engine(falco_engine *engine) {};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This creates a circular dependency between filter_ruleset and falco_engine. I generally think we should avoid this, and since falco_engine is only a facade for the other engine objects/interfaces nowadays, the point gets stronger. I'm also against the raw pointer in this context. Isn't it possible to pass more fine-grained solution? Is this something that can be in each filter_ruleset impl insteat that in the base interface?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I replaced this with a struct containing functions to fetch specific information from the falco engine instead. The only function so far is to fetch the filter_ruleset for any given source, but we can add additional functions as needed.


returns true if the condition could be compiled, and sets
ast_out/filter_out with the compiled filter + ast. Returns false if
the condition could not be compiled and should be skipped.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it throw? If so, in which scenarios? Also, 11 arguments seem a lot and I'm wondering if there is any way this method could be further factorized.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It throws in the same conditions that the older implementation did, namely when compiling a filter returns an error and the rule does not have allow_unknown_fields set.

I retained all the arguments just to keep it a simpler case of moving code into a method rather than a larger cleanup. If you want me to also do the cleanup to provide a more rigid interface, I can do that, but it will add additional changes to this PR, which is already getting pretty big.

@mstemm mstemm force-pushed the support-alternate-rules-loader branch 3 times, most recently from 338f995 to 724fbe9 Compare January 24, 2024 20:04
Copy link
Contributor

@jasondellaluce jasondellaluce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@poiana poiana added the lgtm label Jan 26, 2024
@poiana
Copy link
Contributor

poiana commented Jan 26, 2024

LGTM label has been added.

Git tree hash: 71032eea3af092928a4cc69a1e5c5d942974a21c

Copy link
Member

@leogr leogr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

/hold for 0.37 to be out.

In some cases, a user of the falco engine may want to extend the falco
rules format to provide additional objects to the rules file.

To support that, add a new method set_rule_loader() that allows a user
to provide classes that derive from
rule_loader::{reader,collector,compiler} and read those additional
objects from the rules file.

Signed-off-by: Mark Stemm <[email protected]>
In order to support external rules loaders that may extend the falco
rules format with new top level objects, move away from providing
individual filter objects to the filter_ruleset via calls to add().

Instead, pass the entire compile output returned by the compiler to
the ruleset using a new method add_compile_output(). Custom users can
then cast back the compile output to the appropriate derived class for
use in the ruleset.

Move the declaration of the compile output to a standalone class so it
can be used by rulesets without including the entire rules loader
header files, and add a new factory method new_compile_output() to the
compiler so it can create a derived class if necessary.

This change is
backwards-compatible with existing rulesets, as the default
implementation of add_compile_output() simply iterates over rules and
calls add() for each rule.

This change also speeds up rule loading. Previously, each rule
condition was compiled twice:

1. First, in the compiler, to see if it was valid.
2. Second, in the falco engine before providing each rule to the
ruleset.

Add the compiled filter to the falco_rule object instead of throwing
it away in the compiler.

Signed-off-by: Mark Stemm <[email protected]>
To support subclasses that may extend the falco rules format, add
additional error/warning/item types for an extension item.

When subclasses report errors and warnings, they can use these
codes/item types in context objects and still provide an exact
line/column context.

Also make some previously static functions in rules reader protected
methods so they can be used in sub-classes.

Signed-off-by: Mark Stemm <[email protected]>
Some rulesets may need information which is held by the falco_engine
that created this ruleset. So define a set of functions in a struct
and have setters/getters for those functions in the base class.

Derived classes can use the struct's functions to obtain the falco
engine information.

The only function so far is to obtain the filter_ruleset for a given
event source.

Signed-off-by: Mark Stemm <[email protected]>
Move the part of compile_rule_infos that actually compiled a condition
string into a sinsp_filter into a standalone method
compile_condition(). That way it can be used by classes that derive
from rule_loader::compiler() and want to compile condition strings.

This implementation also saves the compiled filter as a part of the
falco_rule object so it does not need to be compiled again wihin the
falco engine after rules loading.

Signed-off-by: Mark Stemm <[email protected]>
Add a unit test for providing an alternate rules loader that also
demonstrates how users can define sub-classes that may want to extend
the falco rules syntax.

This test creates a test rules reader/collector/compiler that supports
top-level objects "test_object". The reader reads them and saves them
in the collector. The compiler iterates over all test_objects and puts
the property values into a single set<string>.

Signed-off-by: Mark Stemm <[email protected]>
@mstemm mstemm force-pushed the support-alternate-rules-loader branch from 724fbe9 to e7e68b5 Compare January 30, 2024 18:04
@poiana poiana removed the lgtm label Jan 30, 2024
@mstemm
Copy link
Contributor Author

mstemm commented Jan 30, 2024

I rebased just to get these changes as close as possible to the master branch. I didn't actually change anything.

Copy link
Contributor

@jasondellaluce jasondellaluce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@poiana poiana added the lgtm label Jan 30, 2024
@poiana
Copy link
Contributor

poiana commented Jan 30, 2024

LGTM label has been added.

Git tree hash: fa4f91e594909f78c4925e577c0f8e08ed9ddae3

Copy link
Contributor

@incertum incertum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

/unhold

@poiana
Copy link
Contributor

poiana commented Jan 30, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: incertum, jasondellaluce, leogr, mstemm

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [incertum,jasondellaluce,leogr,mstemm]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@poiana poiana merged commit e1fb55e into master Jan 30, 2024
29 of 30 checks passed
@poiana poiana deleted the support-alternate-rules-loader branch January 30, 2024 19:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

6 participants