Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DATAGO-57805: Add command line functionality #153

Merged
merged 12 commits into from
Jan 19, 2024

Conversation

gregmeldrum
Copy link
Collaborator

@gregmeldrum gregmeldrum commented Jan 18, 2024

What is the purpose of this change?

The existing EMA user interface is too difficult. The ask is to add a command runner which takes arguments.

How was this change implemented?

Built on top of Julian's excellent POC. Added a new command runner that can be accessed by disabling tomcat using spring command line attributes.
Made the error handling more robust by introducing the new PluginClientException that camel can use to tell when the scan has failed and report an error.

How was this change tested?

Tested manually from the command line. Also regressed the existing standalone and cloud-connected scans to make sure nothing was broken.

Is there anything the reviewers should focus on/be aware of?

No

if (args.length > 2) {

String type = args[0];

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, only handle the scan command.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment for next ticket, let's change this var to commandType or command or something else.

log.error("Scan request [{}]: Scan did not complete successfully.", scanId);
}
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It completed successfully if there are dataCollectionFileEntitites for each scan type, and none of them are marked as "FAILED"

AtomicBoolean scanCompleted = new AtomicBoolean(false);

// The hard timeout is set to 30 minutes to prevent the scan from running forever
long hardTimeoutMillis = System.currentTimeMillis() + 30 * 60 * 1000;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the "paranoia" timeout just so the EMA doesn't run forever. It's currently set to 30 minutes since large configs can take a long time to scan.


// The initiated timout is set to 2 minutes to shortcut a scan that is stuck in the INITIATED state
// This typically occurs if there is an error in the broker URL or credentials
long initiatedTimeoutMillis = System.currentTimeMillis() + 2 * 60 * 1000;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This timeout is triggered if all of the scans are in "INITIATED" state for more than 2 minutes. It's meant to catch cases where we had trouble creating the client or running the command but the exception was not captured.

String scanId = (String) exchange.getIn().getHeader(RouteConstants.SCAN_ID);

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pulled this code into a shared abstract class called RouteStateProcessor that's used by RouteCompleteProcessor and RouteFailedProcessor.

this.idGenerator = idGenerator;
}

@Transactional
public ScanStatusEntity save(String name, String scanId) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding the ability to set any state instead of hardcoding to COMPLETE.

@@ -39,17 +40,17 @@ public class KafkaClientManagerImpl implements MessagingServiceClientManager<Adm
private final TimeUnit maxReconnectBackoffTimeUnit;

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SaveActions plugin re-wrote this...

@@ -0,0 +1,19 @@
package com.solace.maas.ep.event.management.agent.plugin.exception;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a new Generic Plugin Exception that Camel will look for to know when a scan has failed.

@@ -40,7 +41,8 @@ protected AbstractRouteBuilder(ScanTypeDescendentsProcessor scanTypeDescendentsP
public void configure() throws Exception {

// NetworkClient Selectable network i/o exception + Kafka Future exceptions
onException(IOException.class, InterruptedException.class, ExecutionException.class, TimeoutException.class)
onException(IOException.class, InterruptedException.class, ExecutionException.class, TimeoutException.class,
PluginClientException.class)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding the PluginClientException that any plugin can throw. This signals to camel that an error has occurred and Camel can set the scan type status to FAILED.

@@ -59,6 +61,7 @@ public void configure() throws Exception {
RouteConstants.TRACE_ID + "}]: The status of [${header." +
RouteConstants.SCAN_TYPE + "}]" + " is: [${header." + RouteConstants.SCAN_STATUS + "}].")
.to("direct:perRouteScanStatusPublisher?block=false&failIfNoConsumers=false")
.to("direct:processScanStatusAsFailed?block=false")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New route to set the scan status to failed for this type.

@gregmeldrum gregmeldrum marked this pull request as ready for review January 19, 2024 16:46
long initiatedTimeoutMillis = System.currentTimeMillis() + 2 * 60 * 1000;

while (System.currentTimeMillis() < hardTimeoutMillis && !scanCompleted.get()) {
Thread.sleep(2000);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this Thread.sleep increases CPU usage by a lot. Can we use Timer and TimerTask instead to run a scanComplete or hardTimeoutMillis check every 2 minutes?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm guessing wrong on the cpu consumption, let's keep the code as is.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything I've read online says that the thread is released and does not consume CPU for a .sleep and .wait() so out of simplicity I'd like to keep it as is.

Copy link
Collaborator

@moodiRealist moodiRealist left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approved with non-blocking comments.

Copy link

SonarQube Quality Gate

Quality Gate failed

Failed condition 56.1% 56.1% Coverage on New Code (is less than 80%)

See analysis details on SonarQube

@gregmeldrum gregmeldrum merged commit a59133c into main Jan 19, 2024
5 of 6 checks passed
@gregmeldrum gregmeldrum deleted the DATAGO-57805-command-line branch February 6, 2024 15:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants