Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extended detector use-cases to consider the workflow #394

Conversation

stevanbz
Copy link
Contributor

Description

This PR contains the changes related with workflow.

  1. Enables workflow creation - all the monitors related with the given detector will be part of the detector (220)
  2. Enables workflow update during detector update (222)
  3. Deletes the workflow during the detector deletion (225)
  4. Adds additional field to a detector (226)

This PR considers if the current detectors are supporting the workflow or not

Issues Resolved

[List any issues this PR will resolve]

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@@ -51,6 +51,8 @@ public class Detector implements Writeable, ToXContentObject {
public static final String ENABLED_TIME_FIELD = "enabled_time";
public static final String ALERTING_MONITOR_ID = "monitor_id";

public static final String ALERTING_WORKFLOW_ID = "workflow_ids";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: shouldnt we have single workflow id?
Any motivations for list?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hope we are not exposing workflow id in response serialized to user

Copy link
Contributor Author

@stevanbz stevanbz Apr 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added list because I though it's more generic solution - which will allow us (if we are going consider that scenario in one moment) to add multiple workflows without taking care about the detectors created in the period once we had only one workflow (and creating special case branches in the code with some if-else logic).
Does this makes sense? If not, I will remove it

Copy link
Member

@eirsep eirsep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens

  1. User creates detector in 2.6
  2. upgrades cluster to version where workflows are introduced
  3. updates detector rules? Will new workflow be created and the monitors be disabled? or will we not create workflows for existing detectors.
  4. let's closely check the udpate scenario

case ALERTING_WORKFLOW_ID:
XContentParserUtils.ensureExpectedToken(XContentParser.Token.START_ARRAY, xcp.currentToken(), xcp);
while (xcp.nextToken() != XContentParser.Token.END_ARRAY) {
String workflowId = xcp.text();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

null check?

errorStatusSupplier.trySet(response.getStatus());
return true;
// If detector doesn't have the workflows it means that older version of the plugin is used
if (detector.isWorkflowSupported()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

debug log "deleting workflow before deleting detector

workflowService.deleteWorkflow(detector.getWorkflowIds().get(0),
new ActionListener<>() {
@Override
public void onResponse(DeleteWorkflowResponse deleteWorkflowResponse) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

debug log "workflow deleted. deleting monitors before detector deletion"

}
return false;
}).count() > 0) {
onFailures(new OpenSearchStatusException("Monitor associated with detected could not be deleted", errorStatusSupplier.get()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not removed - it's extracted to monitor service. Check it out here

log.error("Detector not being deleted because monitor [{}] could not be deleted. Status [{}]", response.getId(), response.getStatus());
errorStatusSupplier.trySet(response.getStatus());
return true;
// If detector doesn't have the workflows it means that older version of the plugin is used
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can use step listener instead of repeating logic
first step is deleting workflow. if workflow doesnt exist listener can return success response or soemthing like that.
much more cleaner and understandable that way

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

if (numberOfUnprocessedResponses == 0) {
workflowService.upsertWorkflow(
monitorResponses.stream().map(IndexMonitorResponse::getId).collect(Collectors.toList()),
null,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which field is being set as null?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updatedMonitors


@Override
public void onFailure(Exception e) {
listener.onFailure(e);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete monitors if we fail to create workflow?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we are deleting the monitors in workflowService.upsertWorkflow if upserts fails

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the same monitor be referred to other workflow?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@getsaurabh02 Yes - one monitor can be part of delegate list of two different workflows


@Override
public void onFailure(Exception e) {
listener.onFailure(e);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add error logs everywhere log.error("message", e)


@Override
public void onFailure(Exception e) {
listener.onFailure(e);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

error logs

@@ -1141,7 +1230,30 @@ public void onResponse(IndexResponse response) {

@Override
public void onFailure(Exception e) {
onFailures(e);
// Revert the workflow and monitors created in previous steps
workflowService.deleteWorkflow(request.getDetector().getWorkflowIds().get(0),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if workflow never gets created

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how will we delete monitors then

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then it's an error in program flow - if the workflow is not created (because of the error) the detector won't be created.

}
@Override
public void onFailure(Exception e) {
listener.onFailure(e);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

error logs for e and e.suppressedExceptions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought to add something like this:
log.error("Error deleting monitors", e);
When you say e.suppressedExceptions you mean we should also do
log.error("Error deleting monitors", e.getSuppressed()); or?

import org.opensearch.commons.alerting.action.DeleteMonitorResponse;
import org.opensearch.rest.RestStatus;

public class MonitorService {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

java doc to explain this is an Alerting commons interface

import org.opensearch.rest.RestRequest.Method;
import org.opensearch.securityanalytics.model.Detector;

public class WorkflowService {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

java doc to explain this is an Alerting commons interface

@@ -88,6 +88,9 @@
}
}
},
"workflow_ids": {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keyword or text?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to be keyword

@stevanbz
Copy link
Contributor Author

what happens

1. User creates detector in 2.6

2. upgrades cluster to version where workflows are introduced

3. updates detector rules? Will new workflow be created and the monitors be disabled? or will we not create workflows for existing detectors.

4. let's closely check the udpate scenario

Check it out here
You can see that we are checkign if the detector is supporting the workflow. If it does, we're updating the workflow delegate list if not, we are just deleting the monitors.

@eirsep eirsep marked this pull request as ready for review April 18, 2023 17:50
@eirsep eirsep requested a review from a team April 18, 2023 17:50
@eirsep
Copy link
Member

eirsep commented Apr 19, 2023

@stevanbz
Can we have an INTERNAL cluster setting security_analytics.enable_workflow_usage to turn off or turn on the feature.
If flag is enabled, we will create workflow in detector
If flag is disabled, we will create only individual enabled monitors
then create following test:

  1. disable the above setting
  2. create detector (which would not have workflow)
  3. enabled the above setting
  4. update the workflow and verify that no workflow is created for that detector

@stevanbz
Copy link
Contributor Author

@stevanbz Can we have an INTERNAL cluster setting security_analytics.enable_workflow_usage to turn off or turn on the feature. If flag is enabled, we will create workflow in detector If flag is disabled, we will create only individual enabled monitors then create following test:

1. disable the above setting

2. create detector (which would not have workflow)

3. enabled the above setting

4. update the workflow and verify that no workflow is created for that detector

Yeah sure. Makes sense. Will add the flag. Just one correction re the item 4.
instead of "update the workflow and verify that no workflow is created for that detector" I guess you wanted to say "update the detector..."

… enable_workflow_usage. Added integration test that verify that the workflow is not created during update

Signed-off-by: Stevan Buzejic <[email protected]>
@@ -546,7 +707,7 @@ private IndexMonitorRequest createBucketLevelMonitorRequest(
triggers.add(bucketLevelTrigger1);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i still do not see the bucket level triggers passed to the workflow. will it be part of a future pr?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will address this in a quick follow up PR

verifyWorkflow(detectorMap, monitorIds, 2);
}

public void testCreateDetector_verifyWorkflowExecutionBucketLevelDocLevelMonitors_success() throws IOException {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this test case be create a detector with an aggregation sigma rule with a detector trigger that generates a finding but not an alert?

can we add another test case to create a detector with an aggregation sigma rule with a detector trigger that generates a finding & also an alert?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will address this in a quick follow up PR

@eirsep eirsep self-requested a review April 28, 2023 18:09
@eirsep eirsep merged commit 4f40fc7 into opensearch-project:feature/composite-monitors Apr 28, 2023
eirsep pushed a commit to eirsep/security-analytics that referenced this pull request Aug 24, 2023
Enables workflow creation - all the monitors related with the given detector will be part of the detector
Enables workflow update during detector update
Deletes the workflow during the detector deletion
Adds additional field to a detector

Signed-off-by: Stevan Buzejic <[email protected]>
eirsep pushed a commit to eirsep/security-analytics that referenced this pull request Aug 24, 2023
Enables workflow creation - all the monitors related with the given detector will be part of the detector
Enables workflow update during detector update
Deletes the workflow during the detector deletion
Adds additional field to a detector

Signed-off-by: Stevan Buzejic <[email protected]>
eirsep pushed a commit to eirsep/security-analytics that referenced this pull request Aug 28, 2023
Enables workflow creation - all the monitors related with the given detector will be part of the detector
Enables workflow update during detector update
Deletes the workflow during the detector deletion
Adds additional field to a detector

Signed-off-by: Stevan Buzejic <[email protected]>
eirsep pushed a commit to eirsep/security-analytics that referenced this pull request Sep 5, 2023
Enables workflow creation - all the monitors related with the given detector will be part of the detector
Enables workflow update during detector update
Deletes the workflow during the detector deletion
Adds additional field to a detector

Signed-off-by: Stevan Buzejic <[email protected]>
eirsep added a commit that referenced this pull request Sep 6, 2023
* Using alerting workflows in detectors  (#394)

Enables workflow creation - all the monitors related with the given detector will be part of the detector
Enables workflow update during detector update
Deletes the workflow during the detector deletion
Adds additional field to a detector

Signed-off-by: Stevan Buzejic <[email protected]>

* disable delegate monitors as workflow is synced to detector enabled/disabled flag

Signed-off-by: Surya Sashank Nistala <[email protected]>

* fix test

Signed-off-by: Surya Sashank Nistala <[email protected]>

---------

Signed-off-by: Stevan Buzejic <[email protected]>
Signed-off-by: Surya Sashank Nistala <[email protected]>
Signed-off-by: Subhobrata Dey <[email protected]>
Co-authored-by: Stevan Buzejic <[email protected]>
Co-authored-by: Subhobrata Dey <[email protected]>
eirsep pushed a commit to eirsep/security-analytics that referenced this pull request Sep 7, 2023
Enables workflow creation - all the monitors related with the given detector will be part of the detector
Enables workflow update during detector update
Deletes the workflow during the detector deletion
Adds additional field to a detector

Signed-off-by: Stevan Buzejic <[email protected]>
riysaxen-amzn pushed a commit to riysaxen-amzn/security-analytics that referenced this pull request Feb 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants