Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GCU] Add PFC_WD RDMA validator #2619

Merged
merged 22 commits into from
Mar 8, 2023
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
27 changes: 21 additions & 6 deletions generic_config_updater/generic_config_updater.conf.json
Original file line number Diff line number Diff line change
Expand Up @@ -48,7 +48,22 @@
},
"NTP_SERVER": {
"services_to_validate": [ "ntp-service" ]
}
},
"PFC_WD": {
"table_modification_validators": [ "generic_config_updater.validators.is_mellanox_device_validator" ]
},
"BUFFER_POOL": {
"table_modification_validators": [ "generic_config_updater.validators.is_mellanox_device_validator" ]
},
"WRED_PROFILE": {
"table_modification_validators": [ "generic_config_updater.validators.is_mellanox_device_validator" ]
},
"QUEUE": {
"table_modification_validators": [ "generic_config_updater.validators.is_mellanox_device_validator" ]
},
"BUFFER_PROFILE": {
"table_modification_validators": [ "generic_config_updater.validators.is_mellanox_device_validator" ]
Copy link
Contributor

@qiluo-msft qiluo-msft Jan 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Offline discussed, this is not flexible to express all the rules: ASIC vendor, ASIC generation, and sonic branches. #Closed

Copy link
Contributor Author

@isabelmsft isabelmsft Feb 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, the design only allowed for "AND" of all validators.
To address this comment, I added a new section to allow for "OR" check of validators.
Extending the "OR" section to be a list of lists would allow for (X AND Y AND Z) OR (S AND T) checks as well for even greater flexibility.

}
},
"services": {
"system_health": {
Expand All @@ -58,19 +73,19 @@
"validate_commands": [ ]
},
"rsyslog": {
"validate_commands": [ "generic_config_updater.services_validator.rsyslog_validator" ]
Copy link
Contributor

@qiluo-msft qiluo-msft Jan 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validate_commands is used for post health check after ConfigDB modification. What you want to achieve is to check the modification before it is applied to ConfigDB. #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created a new conf file specific to validators that need to be run before the change is applied to ConfigDB

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you keep original name services_validator? there is nothing wrong with old code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the new validator is_mellanox_device_validator to the originally named services_validator.py file, so I changed to validator.py

If is_mellanox_device_validator would be better placed somewhere else, I will move it and leave the original file name services_validator.py

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can place new validator in generic_config_updater.move_validator, and still keep old code.

Copy link
Contributor Author

@isabelmsft isabelmsft Feb 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This new validator is not a MoveValidator, which validates every intermediary move. The preexisting MoveValidators come in during patch sorting. For MoveValidators, if validation fails, the PatchSorter algorithm will continue searching for a new valid move. For this use case, we don't need to start patch sorting at all.

Please see a comment I made on a previous PR #2545 describing why we are adding this new infra for one-shot validation at the start (as opposed to MoveValidators for every intermediary move).

Quote from linked PR: We don't want to validate each intermediate move to get to the final state, as is done by MoveValidators- we just need to validate the initial request. MoveValidators continue searching for a different move that is 'valid' if it finds one invalid - possibly by deleting the parent and then readding other children. This is not the behavior we desire. We only need a check of the initial request (reflected in Json diff), so we add this extra one-shot validation at the start.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you keep original name services_validator? there is nothing wrong with old code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I changed it back to services_validator. I initially changed it away because I added a non-service validator to this file, but this new non-service validator is now in a new file field_operation_validators.py

"validate_commands": [ "generic_config_updater.validators.rsyslog_validator" ]
},
"dhcp-relay": {
"validate_commands": [ "generic_config_updater.services_validator.dhcp_validator" ]
"validate_commands": [ "generic_config_updater.validators.dhcp_validator" ]
},
"vlan-service": {
"validate_commands": [ "generic_config_updater.services_validator.vlan_validator" ]
"validate_commands": [ "generic_config_updater.validators.vlan_validator" ]
},
"caclmgrd-service": {
"validate_commands": [ "generic_config_updater.services_validator.caclmgrd_validator" ]
"validate_commands": [ "generic_config_updater.validators.caclmgrd_validator" ]
},
"ntp-service": {
"validate_commands": [ "generic_config_updater.services_validator.ntp_validator" ]
"validate_commands": [ "generic_config_updater.validators.ntp_validator" ]
}
}
}
Expand Down
30 changes: 30 additions & 0 deletions generic_config_updater/gu_common.py
Original file line number Diff line number Diff line change
@@ -1,17 +1,21 @@
import json
import jsonpatch
import importlib
from jsonpointer import JsonPointer
import sonic_yang
import sonic_yang_ext
import subprocess
import yang as ly
import copy
import re
import os
from sonic_py_common import logger
from enum import Enum

YANG_DIR = "/usr/local/yang-models"
SYSLOG_IDENTIFIER = "GenericConfigUpdater"
SCRIPT_DIR = os.path.dirname(os.path.realpath(__file__))
UPDATER_CONF_FILE = f"{SCRIPT_DIR}/generic_config_updater.conf.json"

class GenericConfigUpdaterError(Exception):
pass
Expand Down Expand Up @@ -155,6 +159,32 @@ def validate_field_operation(self, old_config, target_config):
if any(op['op'] == operation and field == op['path'] for op in patch):
raise IllegalPatchOperationError("Given patch operation is invalid. Operation: {} is illegal on field: {}".format(operation, field))

def _invoke_validating_function(cmd):
# cmd is in the format as <package/module name>.<method name>
method_name = cmd.split(".")[-1]
module_name = ".".join(cmd.split(".")[0:-1])
module = importlib.import_module(module_name, package=None)
method_to_call = getattr(module, method_name)
return method_to_call()
Copy link
Contributor

@qiluo-msft qiluo-msft Feb 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

method_to_call

There is potential risk to run ambiguous code. We can proactively check the module, and method_names. Maybe we can have allowlist or similar?
#Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I added code to ensure the module_name is generic_config_updater.field_operation_validators and the method_name contains the string validator


if os.path.exists(UPDATER_CONF_FILE):
with open(UPDATER_CONF_FILE, "r") as s:
updater_conf = json.load(s)
else:
raise GenericConfigUpdaterError("GCU Config file not found")

for element in patch:
path = element["path"]
table = re.search(r'\/([^\/]+)(\/|$)', path).group(1)
Copy link
Contributor

@wen587 wen587 Jan 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does it match? Could you add an commented example? #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This matches the table name "PFC_WD" and removes any forward slashes at the beginning or end. I added a comment to clarify as well

validating_functions = set()
tables = updater_conf["tables"]
validating_functions.update(tables.get(table, {}).get("table_modification_validators", []))

for function in validating_functions:
if not _invoke_validating_function(function):
raise IllegalPatchOperationError("Modification of {} table is illegal- validating function {} returned False".format(table, function))


def validate_lanes(self, config_db):
if "PORT" not in config_db:
return True, None
Expand Down
Original file line number Diff line number Diff line change
@@ -1,8 +1,9 @@
import os
import time
from .gu_common import genericUpdaterLogging
from sonic_py_common import device_info

logger = genericUpdaterLogging.get_logger(title="Service Validator")
logger = genericUpdaterLogging.get_logger(title="GCU Validator")

print_to_console = False

Expand Down Expand Up @@ -101,3 +102,9 @@ def caclmgrd_validator(old_config, upd_config, keys):

def ntp_validator(old_config, upd_config, keys):
return _service_restart("ntp-config")


def is_mellanox_device_validator():
version_info = device_info.get_sonic_version_info()
asic_type = version_info.get('asic_type')
return asic_type == "mellanox"
4 changes: 2 additions & 2 deletions tests/generic_config_updater/change_applier_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@
from unittest.mock import patch, Mock, call

import generic_config_updater.change_applier
import generic_config_updater.services_validator
import generic_config_updater.validators
import generic_config_updater.gu_common

SCRIPT_DIR = os.path.dirname(os.path.realpath(__file__))
Expand Down Expand Up @@ -232,7 +232,7 @@ def test_change_apply(self, mock_set, mock_db, mock_os_sys):

generic_config_updater.change_applier.UPDATER_CONF_FILE = CONF_FILE
generic_config_updater.change_applier.set_verbose(True)
generic_config_updater.services_validator.set_verbose(True)
generic_config_updater.validators.set_verbose(True)

applier = generic_config_updater.change_applier.ChangeApplier()
debug_print("invoked applier")
Expand Down
4 changes: 2 additions & 2 deletions tests/generic_config_updater/service_validator_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@
from collections import defaultdict
from unittest.mock import patch

from generic_config_updater.services_validator import vlan_validator, rsyslog_validator, caclmgrd_validator
from generic_config_updater.validators import vlan_validator, rsyslog_validator, caclmgrd_validator
import generic_config_updater.gu_common


Expand Down Expand Up @@ -177,7 +177,7 @@ def test_change_apply_os_system(self, mock_os_sys):
rc = rsyslog_validator("", "", "")
assert not rc, "rsyslog_validator expected to fail"

@patch("generic_config_updater.services_validator.time.sleep")
@patch("generic_config_updater.validators.time.sleep")
def test_change_apply_time_sleep(self, mock_time_sleep):
global time_sleep_calls, time_sleep_call_index

Expand Down