Provide example kubernetes manifest #661
Conversation
Thanks a lot for the patch. Please let me find a reviewer.
I have two questions:
and it worked in my environment (it used the Linux perf interface). Was there any other reason to use privileged? Can you check whether it works for you without privileged? Without privileged, we could put less strict requirements on the namespace (with labels). I just want to follow the least-privilege principle if it doesn't break any functionality.
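To make the least-privilege idea concrete, here is a minimal sketch of what the container securityContext could look like without privileged. The capability set is an assumption on my side (CAP_PERFMON exists only on kernel >= 5.8; older kernels would need CAP_SYS_ADMIN for perf), not something taken from this PR's manifest:

```yaml
# Sketch only, not the manifest from this PR: run unprivileged and
# grant just what the Linux perf interface is assumed to need.
securityContext:
  privileged: false
  capabilities:
    drop:
      - ALL
    add:
      - PERFMON   # perf_event_open() access on kernel >= 5.8 (assumption)
```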
FYI: I'm going on vacation for a week; I'll come back to review in the second week of February, so no rush.
I did use the … I set …
I would be ready to accept this as-is once we drop privileged and hostNetwork; we just need to be sure it works without functional issues in a bare kind-based testing environment.
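For reference, a sketch of the pod-template fields this implies, with hostNetwork gone and the exporter reached through the pod IP instead (the container name and port value below are illustrative assumptions; 9738 is pcm-sensor-server's usual default):

```yaml
# Sketch: DaemonSet pod template without hostNetwork; the PodMonitor
# scrapes the pod IP directly, so only a named containerPort is needed.
spec:
  # hostNetwork: true   # dropped
  containers:
    - name: pcm                 # illustrative name
      ports:
        - name: metrics
          containerPort: 9738   # assumed pcm-sensor-server default
```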
Signed-off-by: Pat Riehecky <[email protected]>
In theory I've made the changes you requested. Does this look better?
It looks definitely better and it works flawlessly! :) so LGTM
Here is the functional test to be used for further validation:
```sh
# Create cluster
kind create cluster
kind export kubeconfig

# Deploy NodeFeatureDiscovery
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.15.1
kubectl get node -o jsonpath='{.items[0].metadata.labels.feature\.node\.kubernetes\.io\/cpu\-model\.vendor_id}{"\n"}'

# Deploy Prometheus for PodMonitor
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/kube-prometheus-stack --set prometheus.prometheusSpec.podMonitorSelectorNilUsesHelmValues=false
kubectl get sts prometheus-prometheus-kube-prometheus-prometheus

# Deploy PCM
kubectl apply -f pcm-kubernetes.yaml

# Verify PCM works as expected
kubectl -n intel-pcm get daemonset
kubectl -n intel-pcm get pods
podname=$(kubectl -n intel-pcm get pods -o jsonpath='{.items[0].metadata.name}')
kubectl proxy &
curl -Ls http://127.0.0.1:8001/api/v1/namespaces/intel-pcm/pods/$podname/proxy/metrics | grep DRAM_Writes
promtool query instant http://127.0.0.1:8001/api/v1/namespaces/default/services/prometheus-kube-prometheus-prometheus:http-web/proxy 'avg by(__name__) ({job="pcm"})'
```
and we get:
```
CStateResidency => 0.09090909090909094 @[1707901856.957]
Clock_Unhalted_Ref => 1010026077.3913049 @[1707901856.957]
Clock_Unhalted_Thread => 1295730425.8695648 @[1707901856.957]
DRAM_Joules_Consumed => 0 @[1707901856.957]
DRAM_Reads => 3600814506.6666665 @[1707901856.957]
DRAM_Writes => 1974366592 @[1707901856.957]
Embedded_DRAM_Reads => 0 @[1707901856.957]
Embedded_DRAM_Writes => 0 @[1707901856.957]
Incoming_Data_Traffic_On_Link_0 => 689786624 @[1707901856.957]
Incoming_Data_Traffic_On_Link_1 => 689454432 @[1707901856.957]
Incoming_Data_Traffic_On_Link_2 => 0 @[1707901856.957]
Instructions_Retired_Any => 749013885.5739133 @[1707901856.957]
Invariant_TSC => 432975372048881700 @[1707901856.957]
L2_Cache_Hits => 3531524.973913045 @[1707901856.957]
L2_Cache_Misses => 2334387.130434784 @[1707901856.957]
L3_Cache_Hits => 1325323.1739130428 @[1707901856.957]
L3_Cache_Misses => 627863.4000000003 @[1707901856.957]
L3_Cache_Occupancy => 0 @[1707901856.957]
Local_Memory_Bandwidth => 0 @[1707901856.957]
Measurement_Interval_in_us => 14507400443881 @[1707901856.957]
Memory_Controller_IO_Requests => 0 @[1707901856.957]
Number_of_sockets => 2 @[1707901856.957]
OS_ID => 55.499999999999986 @[1707901856.957]
Outgoing_Data_And_Non_Data_Traffic_On_Link_0 => 1843333122.5 @[1707901856.957]
Outgoing_Data_And_Non_Data_Traffic_On_Link_1 => 1849219231.5 @[1707901856.957]
Outgoing_Data_And_Non_Data_Traffic_On_Link_2 => 0 @[1707901856.957]
Package_Joules_Consumed => 0 @[1707901856.957]
Persistent_Memory_Reads => 0 @[1707901856.957]
Persistent_Memory_Writes => 0 @[1707901856.957]
RawCStateResidency => 89486131.66409859 @[1707901856.957]
Remote_Memory_Bandwidth => 0 @[1707901856.957]
SMI_Count => 0 @[1707901856.957]
Thermal_Headroom => -2147483648 @[1707901856.957]
Utilization_Incoming_Data_Traffic_On_Link_0 => 0 @[1707901856.957]
Utilization_Incoming_Data_Traffic_On_Link_1 => 0 @[1707901856.957]
Utilization_Incoming_Data_Traffic_On_Link_2 => 0 @[1707901856.957]
Utilization_Outgoing_Data_And_Non_Data_Traffic_On_Link_0 => 0 @[1707901856.957]
Utilization_Outgoing_Data_And_Non_Data_Traffic_On_Link_1 => 0 @[1707901856.957]
Utilization_Outgoing_Data_And_Non_Data_Traffic_On_Link_2 => 0 @[1707901856.957]
```
P.S. The above test was run on an Intel(R) Xeon(R) Platinum 8180 CPU. For VM-based hosts we will have issues depending on the VM type (e.g., we may need to comment out the MCFG/sys-acpi volume as described in FAQ Q11).
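For anyone hitting that case, the volume in question presumably has a shape like the hostPath mount below (the name and path are assumptions based on this comment, not copied from the manifest):

```yaml
# Hypothetical shape of the MCFG/sys-acpi volume; on VM hosts without
# a real MCFG ACPI table it (and the matching volumeMount) may need to
# be commented out, per FAQ Q11.
volumes:
  - name: sys-acpi                    # assumed name
    hostPath:
      path: /sys/firmware/acpi/tables
```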
thanks a lot!
This provides an example of how you might deploy this in Kubernetes.
It includes node selectors defined by the Node Feature Discovery SIG and PodMonitors defined by the Prometheus Operator project.
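As a sketch of those two pieces (the label value, names, and namespace below are assumptions; the NFD label key is the same one queried in the functional test above):

```yaml
# 1. NFD-based node selector, placed in the DaemonSet pod template:
nodeSelector:
  feature.node.kubernetes.io/cpu-model.vendor_id: Intel   # value assumed
---
# 2. Prometheus Operator PodMonitor so Prometheus scrapes the pods:
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: pcm
  namespace: intel-pcm
spec:
  selector:
    matchLabels:
      app: pcm            # assumed pod label
  podMetricsEndpoints:
    - port: metrics       # assumed container port name
```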