-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
POWERMON-235: Add support for Redfish #364
POWERMON-235: Add support for Redfish #364
Conversation
2fde234
to
d1ab5ab
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this PR is worth merging for configuring the redfish. Please kindly find my detailed comments below.
@@ -93,82 +93,98 @@ func TestTolerations(t *testing.T) { | |||
} | |||
} | |||
|
|||
func TestHostPID(t *testing.T) { | |||
func Test_DaemonSet(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change to TestDaemonSet
to keep the same format to the other test function.
type InternalExporterSpec struct { | ||
// +kubebuilder:validation:Required | ||
Deployment InternalExporterDeploymentSpec `json:"deployment"` | ||
|
||
Redfish *InternalRedfishSpec `json:"redfish,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are going to extend spec for configuring the other metrics as well (such as to enable/disable cgroups metrics and so on) - check #202.
Thus, I think we should move Redfish to be under metric.energy
and resource utilization metrics such as cGroups should go under metric.resource
.
Refer to terms in Kepler:
// https://github.com/sustainable-computing-io/kepler/blob/3d0bad753d68f74a0955f6e25d2a9d8e93507d08/pkg/collector/stats/stats.go#L47
type Stats struct {
ResourceUsage map[string]*types.UInt64StatCollection
EnergyUsage map[string]*types.UInt64StatCollection
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be a bigger change to the API. Can we do this api change in a separate PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I think we can update that on a different PR. @sthaha What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am in favour of keeping redfish
right under exporter
since it isn't metrics
or deployment
related rather is is power meter and having exporter.redfish
is acceptable. I might consider exporter.powermeter.redfish
(or something else) at a later stage.
d1ab5ab
to
4fb666d
Compare
This PR made a change on Kepler resource. Shall we first limit this change to only KeplerInternal? |
thats by design. once |
REDFISH_ARGS = "-redfish-cred-file-path=/etc/redfish/redfish.csv" | ||
REDFISH_CSV = "redfish.csv" | ||
REDFISH_ANNOTATION = "kepler.system.sustainable.computing.io/redfish-secret-ref" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you please change this to follow the convention?
If these constants aren't used outside the package, lets make them private.
pkg/controllers/kepler_internal.go
Outdated
func (r KeplerInternalReconciler) getRedfishSecret(secretName string) (*corev1.Secret, error) { | ||
redfishSecret := corev1.Secret{} | ||
redFishSecretName := types.NamespacedName{ | ||
Namespace: KeplerDeploymentNS, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't this be the namespace that you get from the kepler-internal
instead of the deployment namespace of kepler
?
As a rule of thumb, internal must not use (or have any knowledge) of Kepler
related config or variables.
pkg/controllers/kepler_internal.go
Outdated
rs = append(rs, | ||
KeplerDaemonSetReconciler{ | ||
ki: *ki, | ||
updater: updateResource, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All single resource reconcilers should move to pkg/reconcilers
to provide decoupling.
Since KeplerDaemonSetReconciler
already has ki
which should be be owner, it need not be passed the updater
and can create a generic updater{Owner: ki, Resource: ds}.
Note: reconcileFn
is a private type within the controller to ease creation of reconciler
// # pkg/reconciler/kepler_daemonset.go
func (r ) Reconcile(..) {
...
return Updater{Owner: ki, }.Reconcile(...)
}
pkg/controllers/kepler_internal.go
Outdated
rs = append(rs, resourceReconcilers(updateResource, openshiftNamespacedResources(ki, cluster)...)...) | ||
return rs | ||
} | ||
|
||
type KeplerDaemonSetReconciler struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should move to pkg/reconciler/
where this can be unit tested as well.
pkg/controllers/kepler_internal.go
Outdated
|
||
requests := []reconcile.Request{} | ||
for _, ki := range ks.Items { | ||
if secret.ObjectMeta.Name == ki.Spec.Exporter.Redfish.SecretRef { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we also check the namespace matches deployment.namespace
if Config.Cluster == k8s.OpenShift { | ||
c = c.Owns(&secv1.SecurityContextConstraints{}, genChanged) | ||
} | ||
return c.Complete(r) | ||
} | ||
|
||
// mapSecretToRequests returns the reconcile requests for kepler-internal objects for which an associated redfish secret has been changed. | ||
func (r *KeplerInternalReconciler) mapSecretToRequests(ctx context.Context, object client.Object) []reconcile.Request { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you consider Own
(ing) the secret instead, which wouldn't require us to map secrets to k-i manually?
pkg/controllers/kepler_internal.go
Outdated
if errors.IsNotFound(err) { | ||
return nil, nil | ||
} | ||
return nil, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: return nil, errors.IgnoreNotFound(err)
pkg/components/exporter/exporter.go
Outdated
@@ -287,7 +303,7 @@ func NewClusterRole(c components.Detail, k *v1alpha1.KeplerInternal) *rbacv1.Clu | |||
}, | |||
Rules: []rbacv1.PolicyRule{{ | |||
APIGroups: []string{""}, | |||
Resources: []string{"nodes/metrics", "nodes/proxy", "nodes/stats", "pods"}, | |||
Resources: []string{"nodes/metrics", "nodes/proxy", "nodes/stats", "pods", "secrets"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this shouldn't be needed since kepler does not need to watch secrets.
pkg/controllers/kepler.go
Outdated
var redfishSpec *v1alpha1.InternalRedfishSpec = nil | ||
if k.Spec.Exporter.Redfish != nil { | ||
redfishSpec = &v1alpha1.InternalRedfishSpec{ | ||
RedfishSpec: *k.Spec.Exporter.Redfish, | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this an api added to Kepler
, we should first add this as v1alpha1.RedfishSpec
and resuse that in the internal. As as rule of thumb, Internal can resue Kepler but not the other way round.
type InternalRedfishSpec struct { | ||
RedfishSpec `json:",inline"` | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't need a separate internal redfish spec since the one in kepler can be reused.
pkg/api/v1alpha1/kepler_types.go
Outdated
// +optional | ||
// +kubebuilder:validation:Pattern=`\d+` | ||
// +kubebuilder:default:="60" | ||
ProbeInterval string `json:"probeIntervalSec,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we use metav1.Duration
instead and convert to seconds when the configmap is created?
also rename the field to json:probeInterval
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pkg/controllers/kepler_internal.go
Outdated
func (r KeplerDaemonSetReconciler) Reconcile(ctx context.Context, cli client.Client, s *runtime.Scheme) reconciler.Result { | ||
|
||
secretRef := r.ki.Spec.Exporter.Redfish.SecretRef | ||
secret, err := r.getRedfishSecret(ctx, cli, secretRef) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should fail reconcile, set the status to available false, if a secret is specified and is missing or invalid (missing the redfish.csv) . Otherwise there is no way for user to know if kepler is using redfish or not.
@vimalk78 nice work overall. It worked pretty well in my test. But few comments to address. |
pkg/controllers/kepler_internal.go
Outdated
@@ -74,12 +77,39 @@ func (r *KeplerInternalReconciler) SetupWithManager(mgr ctrl.Manager) error { | |||
Owns(&rbacv1.ClusterRoleBinding{}, genChanged). | |||
Owns(&rbacv1.ClusterRole{}, genChanged) | |||
|
|||
c = c.Watches(&corev1.Secret{}, | |||
handler.EnqueueRequestsFromMapFunc(r.mapSecretToRequests)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, handler.ResourceVersionChanged{}
} | ||
|
||
ks := v1alpha1.KeplerInternalList{} | ||
if err := r.List(ctx, &ks); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can improve the perf a bit more by using the fieldSelector
see: https://book.kubebuilder.io/reference/watching-resources/externally-managed
pkg/controllers/kepler_internal.go
Outdated
secret, ok := object.(*corev1.Secret) | ||
if !ok { | ||
return nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not be needed since the watch is already set for v1.Secret
and we can make use of GetName()
and GetNamespace()
- Extended Kepler and KeplerInternal API to add `Exporter.Redfish` - Redfish spec has fields for adding reference to a secret which should contain Redfish credentials. - The user supplied secret is mounted in the kepler container. - Added Unit tests and e2e test Signed-off-by: Vimal Kumar <[email protected]>
4fb666d
to
8ce215e
Compare
@@ -9,6 +9,7 @@ rules: | |||
resources: | |||
- daemonsets | |||
- deployments | |||
- secrets |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't create secrets do we ?
ds.Spec.Template.Annotations = map[string]string{ | ||
REDFISH_ANNOTATION: secret.ResourceVersion, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it may be sufficient to just annotate the ds instead of the template. Lets handle this in a different PR.
//+kubebuilder:rbac:groups=security.openshift.io,resources=securitycontextconstraints,verbs=list;watch;create;update;patch;delete;use | ||
//+kubebuilder:rbac:groups=monitoring.coreos.com,resources=servicemonitors;prometheusrules,verbs=list;watch;create;update;patch;delete | ||
|
||
// RBAC required by Kepler exporter | ||
//+kubebuilder:rbac:groups=core,resources=nodes/metrics;nodes/proxy;nodes/stats,verbs=get;list;watch | ||
//+kubebuilder:rbac:groups=core,resources=nodes/metrics;nodes/proxy;nodes/stats;secrets,verbs=get;list;watch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't be necessary since Kepler exporter does not watch or create secrets.
@@ -302,7 +302,7 @@ restart_operator() { | |||
info "scale down Operator" | |||
run kubectl scale -n "$OPERATORS_NS" --replicas=0 "$deployment" | |||
run kubectl wait -n "$OPERATORS_NS" --for=delete \ | |||
pods -l app.kubernetes.io/component=operator --timeout=60s | |||
pods -l app.kubernetes.io/component=manager --timeout=60s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good find!
This PR
Exporter.Redfish
contain Redfish credentials.