New Node Page Layout #2640

ChengYanJin · 2020-06-22T13:24:44Z

Component:
ui, design

Why this is needed:
We would like to work on the new Node page.

What should be done:
Would require the Sketch design from @Cuervino
(We could discuss the detail after this issue and comments.)

Node Table / Component fields:
Name (mandatory)
IP
Roles (mandatory)
Age
Status (Ready, Unknown, Deployment Failed) (mandatory)
K8s version (TBC)
Nbr of Volumes exposed by this node
Nbr of PODs exposed by this node
Health (mandatory)

Node Perf Charts:
CPU Usage, Load AVG, Memory Usage, IOPS and Network bandwidth (2 charts)

The Node Grafana Dashboard contains all required information: https://10.200.6.39:8443/grafana/d/fa49a4706d07a042595b664c87fb33ea/nodes?orgId=1
For the IOPS chart, we should aggregate all devices in Write and Read Metrics
For Network we should aggregate all interfaces in Received and Transmitted Metrics
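
A minimal sketch of the aggregation described above, assuming the per-device and per-interface rates have already been fetched from the monitoring stack. The `Sample` shape and the `aggregate` helper are hypothetical names used for illustration only; they are not part of MetalK8s.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class Sample(NamedTuple):
    """One per-source rate at a given timestamp (hypothetical shape)."""
    timestamp: int   # seconds since epoch
    source: str      # device name (e.g. "sda") or interface name (e.g. "eth0")
    direction: str   # "read"/"write" for IOPS, "received"/"transmitted" for network
    value: float     # rate observed at that timestamp


def aggregate(samples: Iterable[Sample]) -> Dict[str, Dict[int, float]]:
    """Sum all sources per direction and timestamp: one series per direction."""
    series: Dict[str, Dict[int, float]] = defaultdict(lambda: defaultdict(float))
    for sample in samples:
        series[sample.direction][sample.timestamp] += sample.value
    # Convert the nested defaultdicts to plain dicts for easier consumption.
    return {direction: dict(points) for direction, points in series.items()}


iops = aggregate([
    Sample(0, "sda", "read", 120.0),
    Sample(0, "sdb", "read", 30.0),
    Sample(0, "sda", "write", 45.0),
    Sample(60, "sda", "read", 100.0),
])
print(iops)
```

The same helper would apply to network samples by using "received" and "transmitted" as directions, so each chart ends up with exactly two series regardless of the number of devices or interfaces.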

Implementation proposal (strongly recommended):
Since we don't have as many Nodes as Volumes, we may not need a table to show the node list. @Cuervino will work on two different proposals.

Test plan:
TBD

Cuervino · 2020-06-24T12:22:35Z

I have only 1 proposal, which is more like an enhanced table for the left side. Anyway, feel free to comment.

For the first tab "Health":

The thin red line is the scrolling limit (780 pixels that we are aiming for).

For this one, the main issue that I saw is the density of information for the right panel. I do think that scrolling is ok for this screen, as long as the user notices all the main 3 components Info/Alerts/Performance charts without scrolling at first.

Tab "Volume":

1)
For "jumping" to the volume page, there could be different options from the Volume table on the right side:
-simple click on the line (not very good IMO)
-double click on the line
-a "Go to" button at the last column [->]
-a hypertext lookalike link

In this proposal, the blue font for the volume name suggests that it's a link to the volume page. Combined with the cursor change on hover, I think it's a valid solution.

I keep the same icons as in the Volume page for the "Status" icon, which is a mix of Availability & Bound

Tab "Pod":

Maybe we can replace the "Running" text in the column with an icon.

In general:
-The filtering and the sorting icons are missing, I'll add them on the next iteration.
-It might be needed to work more on charts, as we noticed that they need to be precisely defined for avoiding development issues.
-Please let me know if the fake data are not realistic enough.

ChengYanJin · 2020-06-26T14:55:43Z

Hello @Cuervino,

Thanks for the early version of the New Node page!

I notice that in the Node list, instead of displaying the health in a separate column, we put a bigger status circle at the beginning. Do we have a particular reason for it?
Have we already considered what will be displayed when I click on the Deploy button?

Regarding "jumping" to the volume page,
I like the blue font solution: for us, blue is the interaction color. So I think it's clear enough to tell it's a clickable link.

gdemonet · 2020-06-29T07:38:47Z

Looks great, thanks @Cuervino!

I have some feedback on the content to be displayed (maybe I should have reviewed this earlier, sorry). These comments are mostly addressed toward @thomasdanan and @ChengYanJin, but some (who said "most"?) of them will impact the information display.

Node IPs

We show a single IP per node: by design, we at least have two IPs (one for control plane and one for workload plane) to show (if they are the same, we may omit the notion of control plane IP in the UI, TBD).

[optional] On the topic of IPs, we may also want to show the Pod IP ranges assigned to a Node, to ease troubleshooting. This would materialize as one (or more, if not contiguous) CIDR notations (111.111.111.111/111).
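
As a rough illustration of the CIDR display, here is a sketch that reads the ranges from the `spec.podCIDR` / `spec.podCIDRs` fields of a Node object and normalizes them. This is an assumption about where the data would come from: depending on the CNI in use, the actual allocation may live elsewhere. The helper name `pod_ip_ranges` is hypothetical.

```python
import ipaddress
from typing import List


def pod_ip_ranges(node: dict) -> List[str]:
    """Return the node's Pod CIDRs as normalized strings, skipping malformed entries."""
    spec = node.get("spec", {})
    # Prefer the `podCIDRs` list; fall back to the single `podCIDR` field.
    raw = spec.get("podCIDRs") or ([spec["podCIDR"]] if spec.get("podCIDR") else [])
    ranges = []
    for cidr in raw:
        try:
            ranges.append(str(ipaddress.ip_network(cidr, strict=False)))
        except ValueError:
            continue  # ignore values that are not valid CIDR notation
    return ranges


print(pod_ip_ranges({"spec": {"podCIDRs": ["10.233.1.0/24", "10.233.7.0/24"]}}))
```

Note that the placeholder `111.111.111.111/111` above is not itself a valid CIDR (the prefix length exceeds 32), so a helper like this would skip it.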

Node Roles

This topic is tricky: to sum up, I think we should have more than one role, and this is to be considered as a user-defined value (even if the creation from the UI hides them).

Let me explain why

In essence, node roles are special labels, where the key takes the form node-role.kubernetes.io/<role>. In MetalK8s, we use the master, etcd, bootstrap, infra and node roles. However, in the UI, for node creation, we decided to simplify these concepts and group them under "control plane"/"infra"/"workload plane". Maybe it makes sense to keep these user-friendly checkboxes for Node creation, but display should, IMO, reflect the actual roles as extracted from labels. What this would provide is a clear mapping between the "roles" seen in the UI and the ones from kubectl or the API. It would also allow us to see custom roles, e.g. if the use case mandates dedicating some nodes to running a special workload, like node-role.kubernetes.io/mongodb.
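
To make the label-to-role mapping concrete, here is a minimal sketch that extracts roles from a Node's labels. It works on a plain dict of labels; the function name `node_roles` is hypothetical and not taken from the MetalK8s code base.

```python
from typing import Dict, List

ROLE_PREFIX = "node-role.kubernetes.io/"


def node_roles(labels: Dict[str, str]) -> List[str]:
    """Return the sorted role names found in `node-role.kubernetes.io/<role>` label keys."""
    return sorted(
        key[len(ROLE_PREFIX):]
        for key in labels
        # Skip unrelated labels and a bare prefix with no role name after it.
        if key.startswith(ROLE_PREFIX) and len(key) > len(ROLE_PREFIX)
    )


labels = {
    "node-role.kubernetes.io/master": "",
    "node-role.kubernetes.io/etcd": "",
    "node-role.kubernetes.io/mongodb": "",
    "kubernetes.io/hostname": "node-1",
}
print(node_roles(labels))
```

Because the role is read from the label key rather than from a fixed list, a custom role such as `mongodb` shows up without any UI change.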

Side note on master role

Due to recent events, there is a strong effort in the K8s community to ensure terms such as master are removed - and it will be renamed as control-plane, probably. We should follow this lead, one more reason why the "high-level Control Plane" can be confusing.

Node Deployment

The Deploy button makes little sense to keep in the left-side panel IMO. That's a disruptive action, which may alter behavior of workloads running on this node. As such, akin to other critical operations such as draining (a form of "quarantine" if you will) or deletion (which isn't even handled today), they should appear in the detailed right-side panel.

For the status, I like that we try to show "Deployment fail" (failed, or failure, maybe?), but this information isn't readily available today. I would prefer us working on the conditions that are already available today on Node objects, and see how to add Salt-related information later. Currently, we only show Unknown/Ready/NotReady (same as kubelet, roughly), while we have all those conditions available:

Ready (what we are using today)
DiskPressure
MemoryPressure
PIDPressure
NetworkUnavailable

We also have an extra "status" information which could be useful to display: when a node is cordoned (no new pod can be scheduled there), it has a spec.unschedulable flag enabled. This can help understand why a node is NotReady while everything is installed on the host.
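
A sketch of how these pieces could be read from a Node object, given as a plain dict in the shape returned by the API. The helper name `summarize_node` and the keys of the returned dict are hypothetical. One point worth noting: the cordon flag lives in `spec`, not in `status.conditions`, and kubectl reports a cordoned node as `Ready,SchedulingDisabled`.

```python
from typing import Dict, List


def summarize_node(node: dict) -> Dict[str, object]:
    """Collect the Ready status, other active conditions and the cordon flag."""
    conditions = {
        condition["type"]: condition["status"]
        for condition in node.get("status", {}).get("conditions", [])
    }
    # Condition statuses are the strings "True", "False" or "Unknown".
    ready = {"True": "Ready", "False": "NotReady"}.get(conditions.get("Ready"), "Unknown")
    active: List[str] = sorted(
        name for name, status in conditions.items()
        if name != "Ready" and status == "True"
    )
    # Cordoning is recorded in the spec, not as a condition.
    cordoned = bool(node.get("spec", {}).get("unschedulable", False))
    return {"ready": ready, "active_conditions": active, "cordoned": cordoned}


node = {
    "spec": {"unschedulable": True},
    "status": {"conditions": [
        {"type": "Ready", "status": "True"},
        {"type": "DiskPressure", "status": "True"},
        {"type": "MemoryPressure", "status": "False"},
    ]},
}
print(summarize_node(node))
```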

Missing details: topology

Most cloud-based installs of K8s have extra information per Node, through some well-known labels. We're planning on adding support for these labels and react to them when deploying (to e.g. optimize Pod spreading).

As such, it would be meaningful, if the information is available, to display zone and region info for each Node (even if they actually mean rack and site).
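
For illustration, a sketch of reading zone and region from a Node's labels. It assumes the upstream well-known labels `topology.kubernetes.io/zone` and `topology.kubernetes.io/region`; which labels MetalK8s will actually support is not settled in this thread, and `node_topology` is a hypothetical helper name.

```python
from typing import Dict, Optional

ZONE_LABEL = "topology.kubernetes.io/zone"
REGION_LABEL = "topology.kubernetes.io/region"


def node_topology(labels: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Return zone and region from well-known labels, or None when a label is absent."""
    return {
        "region": labels.get(REGION_LABEL),
        "zone": labels.get(ZONE_LABEL),
    }


print(node_topology({ZONE_LABEL: "rack-1", REGION_LABEL: "site-a"}))
```

Returning `None` for a missing label lets the UI hide the field when the information is not available.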

Right-hand-side panel: Health tab

Agreed with you, this seems too crowded. Since we already have tabs, could we imagine splitting the Health tab into three others: Details / Alerts / Performance for instance?

On this tab, I have some extra comments:

do we really need the Creation time? what benefit does our user gain from having this information? I personally would be more interested in the last updated time (not sure if there's one readily available)
what is a tag? not sure this exists in K8s, we have labels and annotations (which are essentially the same, just the former are indexed while the latter aren't)
for labels (and annotations), we have both a key and a value: as such, we can't (shouldn't) display them as pills, IMO. Maybe a two-column table?

RHS panel: Volumes

Looks good! However, we have a type column, which shows different kinds of Pod spec.volumes (e.g. configMap or downwardAPI). That's not what we want to show in this tab IMO: instead, we only want to show actual Volume objects as defined by Scality, those that map to a single device (or sparse file) on the host. As such, instead of showing a type column, I'd rather see the device path, or maybe the StorageClass used for provisioning.

RHS panel: Pods

Not much to say here, just not sure of the interest of showing a Health circle alongside a Status text. Also, a pod can be more than just one container, and each has a different status (when using kubectl, we see a READY column, in which we have the count of ready containers, e.g. 2/3).
Otherwise, looks good to me. One day we'll get a detailed view of pods, the resources they consume, how they are defined, their logs, etc. But not right now :)
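
As a sketch of the READY count mentioned above, the following computes a `ready/total` string from a Pod object given as a plain dict. It assumes the total comes from `spec.containers` and the ready count from `status.containerStatuses`; the function name `ready_column` is hypothetical.

```python
def ready_column(pod: dict) -> str:
    """Format ready/total containers in the style of kubectl's READY column."""
    total = len(pod.get("spec", {}).get("containers", []))
    statuses = pod.get("status", {}).get("containerStatuses", [])
    ready = sum(1 for status in statuses if status.get("ready"))
    return f"{ready}/{total}"


pod = {
    "spec": {"containers": [{"name": "app"}, {"name": "sidecar"}, {"name": "exporter"}]},
    "status": {"containerStatuses": [
        {"name": "app", "ready": True},
        {"name": "sidecar", "ready": True},
        {"name": "exporter", "ready": False},
    ]},
}
print(ready_column(pod))
```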

thomasdanan · 2020-07-02T11:20:44Z

> Looks great, thanks @Cuervino!
>
> I have some feedback on the content to be displayed (maybe I should have reviewed this earlier, sorry). These comments are mostly addressed toward @thomasdanan and @ChengYanJin, but some (who said "most"?) of them will impact the information display.
>
> Node IPs
>
> We show a single IP per node: by design, we at least have two IPs (one for control plane and one for workload plane) to show (if they are the same, we may omit the notion of control plane IP in the UI, TBD).

True, and I suggest we display both, even if they are the same. This is more or less what we already do with the RING when we talk about Mngt IP and Data IP.

> [optional] On the topic of IPs, we may also want to show the Pod IP ranges assigned to a Node, to ease troubleshooting. This would materialize as one (or more, if not contiguous) CIDR notations (111.111.111.111/111).

I can't say whether this is important or not; @xaf-scality, maybe you have some feedback on this one?

> Node Roles
>
> This topic is tricky: to sum up, I think we should have more than one role, and this is to be considered as a user-defined value (even if the creation from the UI hides them).

I agree with displaying the roles extracted from labels.
I also agree to have the Deploy button in the right panel (and not in the table). Following some discussion I had with @gdemonet, I wonder if we should keep this "Deploy" naming ... but let's not bother with this for now.

> Let me explain why
> Side note on master role
>
> Node Deployment
>
> The Deploy button makes little sense to keep in the left-side panel IMO. That's a disruptive action, which may alter behavior of workloads running on this node. As such, akin to other critical operations such as draining (a form of "quarantine" if you will) or deletion (which isn't even handled today), they should appear in the detailed right-side panel.
>
> For the status, I like that we try to show "Deployment fail" (failed, or failure, maybe?), but this information isn't readily available today. I would prefer us working on the conditions that are already available today on Node objects, and see how to add Salt-related information later. Currently, we only show Unknown/Ready/NotReady (same as kubelet, roughly), while we have all those conditions available:
>
> Ready (what we are using today)
> DiskPressure
> MemoryPressure
> PIDPressure
> NetworkUnavailable
>
> We also have an extra "status" information which could be useful to display: when a node is cordoned (no new pod can be scheduled there), it has a spec.unschedulable flag enabled. This can help understand why a node is NotReady while everything is installed on the host.

For the Node Status, let's have the 3 statuses (which we get from kubelet) you are proposing: Unknown, Ready, NotReady, and have a tooltip displaying the conditions you are talking about. I think those conditions should be visible in the Information panel as well. Knowing that no pod can be scheduled on a Node is super important as well, but I'm not sure I understand how you get this information: is it a condition, or a kubelet status?

> Missing details: topology
>
> Most cloud-based installs of K8s have extra information per Node, through some well-known labels. We're planning on adding support for these labels and react to them when deploying (to e.g. optimize Pod spreading).
>
> As such, it would be meaningful, if the information is available, to display zone and region info for each Node (even if they actually mean rack and site).

I suggest we enrich the page when we support those labels (so not for the first iteration).

> Right-hand-side panel: Health tab
>
> Agreed with you, this seems too crowded. Since we already have tabs, could we imagine splitting the Health tab into three others: Details / Alerts / Performance for instance?

To me, we need to have the same organisation between Volumes and Nodes. I am okay to go that route if we can't reasonably put all those info in a single view, but then I would suggest we adapt the Volume Page to have the same approach.

> On this tab, I have some extra comments:
>
> do we really need the Creation time? what benefit does our user gain from having this information? I personally would be more interested in the last updated time (not sure if there's one readily available)

Ok, let's not bother with creation time.

> what is a tag? not sure this exists in K8s, we have labels and annotations (which are essentially the same, just the former are indexed while the latter aren't)
>
> for labels (and annotations), we have both a key and a value: as such, we can't (shouldn't) display them as pills, IMO. Maybe a two-column table?

Indeed, no tags, just labels (in the same way we display labels for Volume, no?)

> RHS panel: Volumes
>
> Looks good! However, we have a type column, which shows different kinds of Pod spec.volumes (e.g. configMap or downwardAPI). That's not what we want to show in this tab IMO: instead, we only want to show actual Volume objects as defined by Scality, those that map to a single device (or sparse file) on the host. As such, instead of showing a type column, I'd rather see the device path, or maybe the StorageClass used for provisioning.
>
> RHS panel: Pods
>
> Not much to say here, just not sure of the interest of showing a Health circle alongside a Status text. Also, a pod can be more than just one container, and each has a different status (when using kubectl, we see a READY column, in which we have the count of ready containers, e.g. 2/3).
> Otherwise, looks good to me. One day we'll get a detailed view of pods, the resources they consume, how they are defined, their logs, etc. But not right now :)

Cuervino · 2020-07-02T11:55:20Z

Thanks for all this information, @gdemonet.

Here is a new proposal with most of your comments.

What changed.

On the Left-Hand Panel,
-Button "Deploy" removed
-"Sorting" icon in the column
-1 additional IP value
-new status values - here are the GUI values that I propose:

Status	Condition	GUI display "Condition"	Color
Ready	Ready	Ready	Green
Not Ready	DiskPressure	Disk Pressure	Yellow
Not Ready	MemoryPressure	Memory Pressure	Yellow
Not Ready	PIDPressure	PID Pressure	Yellow
Not Ready	NetworkUnavailable	Network Unavailable	Yellow
Not Ready	spec.unschedulable	Unschedulable	Yellow
Unknown	NA	Unknown	Yellow

Now there are 5 tabs:
-Health - for Global Health, info, and the action button
-Alerts - for active alerts
-Performance - with the charts
-Node
-Volumes

It's allowing us to have no scrollable screens :)

From the comment of @thomasdanan, we might need to revert back to having the same behaviour as the Volume page.

On the Health tab,
-3 buttons: drain, deploy and delete
-Maybe "Details" is a better name for this tab, as it's not only related to Health
-"Creation date" changed to "Last updated"
-Addition of "Region" & "Zone" info
-1 additional IP value
-About the labels/annotations: I think that it might be possible to show labels with pills, as other software does (Kubernetes for example).
However, by splitting the tabs we have more space, so here is what it could look like with label/annotation tables.

On the Volumes tab,
-"Type" replaced by "Device path"

On the Pods tab,
-Health column removed
-Status column amended, with Running + nb of ready containers, and a color for quickly seeing not "100% running/available" values.

@ChengYanJin ,
For 1), there is no specific reason for the bigger icon; it's just an attempt at a design that is specific for the user.
For 2), I didn't work on that; I should review with you what it currently looks like.

gdemonet · 2020-07-03T07:08:38Z

Awesome!

> Status	Condition	GUI display "Condition"	Color
> Ready	Ready	Ready	Green
> Not Ready	DiskPressure	Disk Pressure	Yellow
> Not Ready	MemoryPressure	Memory Pressure	Yellow
> Not Ready	PIDPressure	PID Pressure	Yellow
> Not Ready	NetworkUnavailable	Network Unavailable	Yellow
> Not Ready	spec.unschedulable	Unschedulable	Yellow
> Unknown	NA	Unknown	Yellow

I like this approach. For the three "pressure" conditions, the node may still be ready, as in its condition named Ready is true, so yellow is a good color choice if the node status.conditions["Ready"] is true, and maybe use red otherwise.

To make things obvious, I think we should use (for @ChengYanJin) the following:

green for status.conditions['Ready'] == True and all other conditions are false
yellow for status.conditions['Ready'] == True and some other conditions are true
red for status.conditions['Ready'] == False
grey when there is no status.conditions

And it seems your design matches that rule, so 👍
For the text however, how would the design handle multiple conditions at once (e.g. both memory and disk pressure)?
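
The color rule above can be sketched as follows, working on a Node object given as a plain dict. Two choices here are assumptions not stated in the thread: a `Ready` condition that is missing or "Unknown" is mapped to grey, and another condition only counts as active when its status is exactly "True". The function name `health_color` is hypothetical.

```python
def health_color(node: dict) -> str:
    """Map a Node's conditions to green / yellow / red / grey."""
    conditions = {
        condition["type"]: condition["status"]
        for condition in node.get("status", {}).get("conditions", [])
    }
    if not conditions:
        return "grey"  # no status.conditions at all
    ready = conditions.get("Ready")
    if ready == "False":
        return "red"
    if ready != "True":
        return "grey"  # assumption: missing or "Unknown" Ready is treated as grey
    others_active = any(
        status == "True" for name, status in conditions.items() if name != "Ready"
    )
    return "yellow" if others_active else "green"


def make_node(**statuses: str) -> dict:
    """Build a minimal Node dict from condition-name/status pairs (test helper)."""
    return {"status": {"conditions": [
        {"type": name, "status": status} for name, status in statuses.items()
    ]}}


print(health_color(make_node(Ready="True", DiskPressure="True", MemoryPressure="True")))
```

With this rule, several simultaneous pressure conditions still produce a single yellow indicator; how to render their text is left open, as asked above.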

> It's allowing us to have no scrollable screens :)

👍 💯

> -About the labels/annotations: I think that it might be possible to show labels with pills, as other software does (Kubernetes for example).

You're right, indeed I've seen this in the K8s Dashboard as well. Both approaches are worth considering, if you think pills can be a good fit for potentially long (e.g. storage.metalk8s.scality.com/device-path = /dev/loop2 could be an annotation for Volume) and numerous (I'd say on average, we can get up to 15-20 per object) labels/annotations.

thomasdanan · 2020-07-03T08:37:13Z

Hello @Cuervino

Looks super nice

I wonder if we should rename Health to Info. Health is more a combination of what you see under Alerts and Performance.
And to be clear, if we use this way of presenting information (through different tabs), I think we should do the same for Volumes. Obviously the benefit is that you don't have to scroll anymore and you don't have to put too much info on the same page, which improves readability; the drawback (if it is one) is that you don't see alerts and perf at the same time.

I am not sure the unschedulable condition would/should trigger an alert. In your screenshot, central-2.compute.ext should have Green Health.

@xaf-scality @bkettler @mastachand @nloewensen would love to have your feedback on this page.

Thanks

nloewensen · 2020-07-03T17:09:56Z

@Cuervino @thomasdanan

Looks already pretty cool!
I like the approach using the additional tabs since the page looks less "crowded" now. I agree with Thomas that the "Health" tab should be renamed to "Info".
If we apply this concept here, we should make sure to apply it consistently for the other items (e.g. Volumes).

Not sure if this was already asked but on the "Alert" tab, can we somehow get to the entire Alert message, e.g. by hovering over the message with the mouse?

On the Performance tabs, the graphs need info on the x-axis (time and scale). Do we plan to enlarge the graphs if you click on them in order to see more details?

At this stage, do you already care about feedback on "cosmetic" like font-sizes, alignment etc? Perhaps for this, it would also be good at some stage to have somebody like Frank Roesner look at it.

ChengYanJin · 2020-07-12T17:57:09Z

Hello @nloewensen,

I also like the tabs approach and definitely we should apply it to Volume Page as well.

For the Performance Charts, we don't plan to enlarge the graph by clicking on it. However, an approach that could help us to see the detail is to change the timespan from 7 days to last 24h from the dropdown list. What do you think about this approach?

Regarding the "cosmetic", Use Design has helped us to design the Moonshot UI. We are trying to apply the same design in the Metalk8s UI. It's a good idea that we could ask for an early review from Frank Roesner once we are done with the Volume Page.

nloewensen · 2020-07-13T16:52:40Z

@ChengYanJin I feel that changing the time span on the "Performance Charts" will not provide better readability/usability, since the displayed information still remains rather small. I would prefer to enlarge the graphs or be redirected to the appropriate monitoring page (in a new window), so that details can be inspected.

gdemonet · 2020-07-15T08:09:06Z

@nloewensen Agreed, and as a general rule, we want all charts to have a corresponding Grafana link, which would then solve the aforementioned usability problems (size, time span, refresh rate, filtered series...).

ChengYanJin · 2020-10-23T14:32:30Z

Merged in #2881

ChengYanJin added the topic:ui UI-related issues label Jun 22, 2020

ChengYanJin changed the title ~~Node Page Layout~~ New Node Page Layout Jun 24, 2020

ChengYanJin closed this as completed Oct 23, 2020
