Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update the code to use scontrol in place of slurm APIs to drain the #74

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

mithun-pensando
Copy link
Collaborator

node to make it slurm version independent

@mithun-pensando mithun-pensando force-pushed the redfishmh branch 2 times, most recently from 3b048a6 to 478801c Compare January 23, 2025 17:07
@@ -219,15 +220,34 @@ func (s *Server) processRequest(AppConfig Config, conn net.Conn, req *http.Reque
log.Printf("Origin Of Condition: %s", originOfCondition)
for _, triggerEvent := range AppConfig.TriggerEvents {
if severity == triggerEvent.Severity {
log.Printf("Matched Trigger Event: %s with action %s", triggerEvent.Severity, triggerEvent.Action)
if triggerEvent.Message != "" {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the action will only be triggered if the severity message matches as well?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the user doesn't want to match on the 'message' field, it can be left empty. I will add it to the comments in .env

TRIGGER_EVENTS="[\
{\"Severity\":\"Fatal\",\"Action\":\"DrainNode\"},\
{\"Severity\":\"Critical\",\"Action\":\"DrainNode\"}
{\"Severity\":\"Critical\",\"Message\":\"message 1|This is a critical test event\",\"Action\":\"DrainNode\", \"DrainReasonPrefix\":\"RebootNeeded\"},\
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is an action necessary if it's just going to be a drain anyway?

Could you also add a comment for these fields as well to make it clear?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left it as is in case we need to use it for some other action in the future. Will update the comments

@@ -160,6 +201,183 @@ func (c *Client) GetNodes() ([]string, error) {
return nodes, nil
}

func (c *Client) GetNodeReason(nodeName string) (string, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this code needed anymore?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left in the functions which use API for future use. I will update the name of this function

"log"
"strings"

"github.com/nod-ai/ADA/redfish-exporter/metrics"
)

const (
Drain = "DrainNode"
Drain = "DrainNode"
ExlcudeReasonSet = "DRAIN_EXCLUDE_REASON_SET"
Copy link
Collaborator

@yuva29 yuva29 Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the user provide this? Is this different than the exclude reason from .env?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for internal logging

@@ -160,6 +201,183 @@ func (c *Client) GetNodes() ([]string, error) {
return nodes, nil
}

func (c *Client) GetNodeReasonWithAPI(nodeName string) (string, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed now that we are moving everything to scontrol?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had tested this API path also, left it in if we move to API later

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants