Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Rule Tuning] EQL bugs using regex and backslashes (Windows Paths) #2260

Closed
w0rk3r opened this issue Aug 24, 2022 · 15 comments
Closed

[Rule Tuning] EQL bugs using regex and backslashes (Windows Paths) #2260

w0rk3r opened this issue Aug 24, 2022 · 15 comments
Assignees
Labels
Rule: Tuning tweaking or tuning an existing rule

Comments

@w0rk3r
Copy link
Contributor

w0rk3r commented Aug 24, 2022

Rules affected

Description

A change in Lucene seems to break the usage of \ in regex conditions and marked as a no-fix, so we need to adjust the logic of these rules accordindly. The impact is low, as these regex conditions are used as exceptions on our rules and not the main logic, but they can cause FPs.

The use of triple quotes seems to prevent the rule from throwing a error, but the regex will not working based on my testing.

SDH issue for context.

@mark-dufresne
Copy link

@w0rk3r I had a question about the expected solution. Are you expecting to get rid of regex to work around for the bug/breaking change discussed in the SDH?

@w0rk3r
Copy link
Contributor Author

w0rk3r commented Aug 30, 2022

Yeah, as it seems to be a no-fix, and our use here is to exclude paths, I'll replace the regex expressions with wildcards or just an OR condition on Program Files and Program Files (x86)

@rw-access
Copy link
Contributor

rw-access commented Sep 2, 2022

hello team! it's been a while 😉
my understanding of what I can see publicly about that Lucene change is that it's actually a good thing! I wish it was done much earlier, to be honest. It removes leniency for \ so that you can only use proper character escapes, and \\ is valid but \a won't quietly become a literal \ then a anymore.

As long as you use \ properly and put it inside a """ it looks like it will work correctly. If you don't put it inside a """ string, then you'll need to remember to double escape: once for EQL, and once for regex.

It looks like there are a lot of issues here that are linked but not public so I can't grasp the full context. Everything I can see points to process.parent.executable regex~ """C:\\Program Files( \(x86\))?\\Cisco Systems\\.*""" functioning the same. The only slashes used there are used to escape special characters like \, ( and ), so it looks like it would work as expected.

Is there any (public) documentation for how this breaks existing rules? All the existing rules should be using \ properly like the above example, so I think you should be okay.

@brokensound77
Copy link
Contributor

brokensound77 commented Sep 2, 2022

Thanks @rw-access, that sounds spot on!

I did a review of all existing rules using regex and I saw no issues. The only rule using regex which is not triple quoted was AdminSDHolder SDProp Exclusion Added, which has no escapes in the query

...
  winlog.event_data.AttributeValue regex~ "[0-9]{15}([1-9a-f]).*"
...

The original source of the issue was centered around custom rules, which were not triple quoting.

I think we can close #2284 with no further action beyond continuing to ensure proper syntax in our own rules

@w0rk3r
Copy link
Contributor Author

w0rk3r commented Sep 3, 2022

Yeah, just validated our cases and seems there are really no issues on them (sorry and thanks!)... But the initial bug was something like this, but using single quotes:

Query:

process where
(
    process.name like "control.exe"
    and process.args like~ "*.cpl"
    and not process.command_line regex~ """.*:\\Windows\\System32\\\\w+\\.cpl.*"""
)

I suggested using triple quotes:

image

Which doesn't throw any error, but doesn't match the condition neither:

image

When I use a single backslach for \w+ it throws this error:

search_phase_execution_exception: [query_shard_exception] Reason: failed to create query: Query must not be null; [query_shard_exception] Reason: failed to create query: Query must not be null; [query_shard_exception] Reason: failed to create query: Query must not be null

Any clues around this?

@rw-access
Copy link
Contributor

rw-access commented Sep 3, 2022

not sure about the exception to be honest. it does look like you have \\w when you should have just \w and \\. when it should be \.. but if it got an exception when doing just \w then I'm not sure why.

also, keep in mind that it will have different results for different versions of Lucene and Elasticsearch because \w hasn't always been interpreted as a predefined character class, so for an old version it'll be the same as \\w. if you switch \w to [a-z0-9_] you'll get maximal compatibility!

you could also do [^\\] if you want to match all non-\ characters so you get one directory

@w0rk3r
Copy link
Contributor Author

w0rk3r commented Sep 5, 2022

I'm doing some more testing, and seems that the problem is that I can't use \w+, \d+, \S+, etc in 8+, not the backslashes:

In 8.3.3:
image

In 7.17.x it works just fine:
image

Wdyt, is this a bug that we should track somewhere? I'm working with the docs team to adjust this to be more precise about the bug we got on the SDH, how should we report this? @rw-access @brokensound77

@rw-access
Copy link
Contributor

rw-access commented Sep 6, 2022

Personally, I would switch to [a-z...] style character classes for full compatibility. I believe the 7.x series treats \w literally, so it's the same as \\w. I don't know the best part forward for reporting, documentation, etc., so I'll let you all handle that.

@luigidellaquila
Copy link

I tested a similar query on a fresh 8.4.1 cloud instance and everything seems to work fine

Schermata 2022-09-13 alle 11 11 46

maybe I'm missing something, or the problem is already fixed in v 8.4.x... As far as I remember, we did not do much on these aspects recently, so if it's a fix, probably it's in Lucene.

@w0rk3r
Copy link
Contributor Author

w0rk3r commented Sep 14, 2022

@luigidellaquila I just tested the following query in a new 8.4.1 and got the same error:

process where process.command_line regex~ """\w+"""

image

Diving a bit more, I just saw that it works normal with process.args:

image

Maybe this is a issue with wildcard fields?
image

@luigidellaquila
Copy link

@w0rk3r yes, the problem seems to be specific to wildcard fields.
I'm investigating it, I'll keep you updated

@luigidellaquila
Copy link

@w0rk3r I can confirm that the problem happens specifically on wildcard fields.
Actually, the problem is not strictly related to EQL, I tried to run a simple search query:

{
    "query": {
        "regexp": {
            "name": {
                "value": "\\w+"
            }
        }
    }
}

the query works fine if name is a text or a keyword field, but fails if it's a wildcard.

I think I found the problem in the code and a possible fix, I'll keep you updated

@luigidellaquila
Copy link

Hi all,

the PR with the fix was just merged on main branch (8.5)

@w0rk3r
Copy link
Contributor Author

w0rk3r commented Sep 15, 2022

Awesome, I'll keep this one open until we have the BC for testing. @nastasha-solomon I think @luigidellaquila may give you a better picture of the bug/limitation to get elastic/security-docs#2337 right, let me know if any input from my side is needed.

@w0rk3r
Copy link
Contributor Author

w0rk3r commented Sep 30, 2022

Already worked with Nastasha and Luigi via Slack to get the bug properly documented, so I'm closing this as completed.

@w0rk3r w0rk3r closed this as completed Sep 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Rule: Tuning tweaking or tuning an existing rule
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants