Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor alert_processor proc to be based on the ECS correlation logic #63

Merged
merged 25 commits into from
Apr 11, 2023

Conversation

sfc-gh-nlele
Copy link
Collaborator

@sfc-gh-nlele sfc-gh-nlele commented Mar 14, 2023

This PR rewrites the alert processor to have similar correlation logic to the ECS-based alert processor.

Testing:
An alert in results.alerts before correlation:
Screenshot 2023-03-20 at 8 43 35 PM

Alert processor's result on the above alert:

Screenshot 2023-03-20 at 8 45 03 PM

Resulting correlated alert:

Screenshot 2023-03-20 at 8 46 24 PM


Similar alert which should receive the same correlation ID as the above alert, before correlation:

Screenshot 2023-03-20 at 9 07 29 PM

Resulting correlated alert:

Screenshot 2023-03-20 at 9 10 32 PM

--

The changes have been retested.

@sfc-gh-pkommini
Copy link
Collaborator

@sfc-gh-nlele Is there an "after correlation" alert screenshot? Like the "before correlation" alert shot?

@sfc-gh-nlele
Copy link
Collaborator Author

@sfc-gh-pkommini Was just adding it, added it to the description.

Copy link
Collaborator

@sfc-gh-afedorov sfc-gh-afedorov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks mostly good, but let me sleep on it and give it another go tomorrow

procedures_js/alert_processor.js Show resolved Hide resolved
}
if (action instanceof Array) {
o = object.join('","')
object = `'["$${o}"]'`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you mean action?

(caught by GPT-4!)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry yes.

(caught by GPT-4!)

Wow 😮

'EVENT_TIME' in alert == false
) {
return generate_uuid()
} else {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove else if the first if always returns so that it's more clearly an early bail condition on the rest of the function

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right makes sense 👍

}
}

try {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if I follow when this would throw an exception -- could you make it more clear, or remove the exception and just do match = exec(...)[0] which would be either the first row object or undefined?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was to try and catch any unexpected errors here, in order to match the ECS version in terms of exception handling. Will just get the first row 👍

Copy link
Collaborator

@sfc-gh-afedorov sfc-gh-afedorov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry for delay on this, I think this should be the last of it but give me one more chance to re-review after changes

)
}

return { alert_correlation_result: alert_correlation_result }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to repeat the key in JS instead of {x: x} you can just do {x}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you determine whether this return value is something that can be recorded? if not, let's remove this as it's misleading. if so, what value do you expect to be returned here?

Copy link
Collaborator Author

@sfc-gh-nlele sfc-gh-nlele Mar 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem like we can store the results, without an insert statement running in the proc, to record these values.

However, the return value here is something like below right now, we could maybe pair these with alert_id for more context.

But I think since the individual call statement results will still be available for 24 hours in the query history, it might not hurt to return these?

{
  "alert_correlation_result": [
     {
        "number of multi-joined rows updated": 0,
        "number of rows updated": 1
     },
     {
        "number of multi-joined rows updated": 0,
        "number of rows updated": 1
    },
    {
        "number of multi-joined rows updated": 0,
        "number of rows updated": 1
   }
  ]
}

procedures_js/alert_processor.js Show resolved Hide resolved
action = `'["$${o}"]'`
}

match = exec(GET_CORRELATED_ALERT, [actor, object, action, time])[0]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another common pattern here is to use the boolean || and && as short circuit operators, i.e.

match = exec(GET_CORRELATED_ALERT, [actor, object, action, time])[0] || {}
return match['CORRELATION_ID'] || generate_uuid()

WHEN MATCHED THEN UPDATE SET
correlation_id = src.correlation_id
GET_CORRELATED_ALERT = `
SELECT *
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why select * if all you use is the CORRELATED_ID?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had kept the SQL to be same as the original, but makes sense, will just select the correlation_id.


for (const x of UNCORRELATED_ALERTS) {
alert_body = x['ALERT']
correlation_id = get_correlation_id(alert_body)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since this queries the db, let's maybe call it find_related_correlation_id or something else that hints the database is going to be selected?

AND alert:ALERT_ID = ?
`

UNCORRELATED_ALERTS = exec(GET_ALERTS_WITHOUT_CORREALTION_ID)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this naming is a little confusing since in other places they are constant strings. since it's only used once, could you move this into the for loop directly instead of naming it?


UNCORRELATED_ALERTS = exec(GET_ALERTS_WITHOUT_CORREALTION_ID)

for (const x of UNCORRELATED_ALERTS) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use row instead of x


UPDATE_ALERT_CORRELATION_ID = `
UPDATE ${results_alerts_table}
SET correlation_id = ?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be possible to call UUID_STRING() here, instead, in something like COLLATE(?, UUID_STRING())

Copy link
Collaborator Author

@sfc-gh-nlele sfc-gh-nlele Mar 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean COALESCE instead? Would we want it to serve as a fail-safe, and not remove generate_uuid() from find_related_correlation_id?

Copy link
Collaborator

@sfc-gh-afedorov sfc-gh-afedorov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm


match = exec(GET_CORRELATED_ALERT, [actor, object, action, time])[0] || {}

return match['CORRELATION_ID'] || generate_uuid()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can drop this separate call or replace it with null now that you have the piece with COALESCE below?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, kept a null here, since if the return is undefined because of match['CORRELATION_ID'], it can't be bound with a statement.

'ACTION' in alert == false ||
'EVENT_TIME' in alert == false
) {
return generate_uuid()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can return null now that you have the piece with COALESCE below?

@sfc-gh-nlele
Copy link
Collaborator Author

Retested that the changes work.

@sfc-gh-nlele sfc-gh-nlele merged commit d0e9159 into main Apr 11, 2023
@sfc-gh-nlele sfc-gh-nlele deleted the refactor-alert-processor branch April 11, 2023 14:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants