-
Add each Loki instance as a datasource in Grafana dashboards (PR#3681)
-
Bump Grafana image version to 8.3.4-ubuntu (PR#3684)
-
Bump ingress-nginx chart version to 4.0.9; the nginx-ingress-controller image has been bumped accordingly to v1.0.5 (PR#3691)
-
Disable the fluent-bit service monitor, as the fluent-bit HTTP server that serves metrics currently does not work (PR#3689)
-
Fix the incomplete alert name in MetalK8s UI alert page (PR#3669)
- #2199 - Prometheus label selector for `PodMonitor` has changed from `release: prometheus-operator` to `metalk8s.scality.com/monitor: ''` (PR#3692)
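As a rough illustration of the selector change above, the sketch below checks whether a monitor object's labels would be picked up by the new selector. It assumes plain equality-based matching on `metadata.labels`; the helper name `matches_selector` and the sample manifests are hypothetical and not part of MetalK8s.

```python
# Hypothetical helper: equality-based label matching, in the spirit of
# Kubernetes `matchLabels` selectors. Not MetalK8s code.
NEW_SELECTOR = {"metalk8s.scality.com/monitor": ""}
OLD_SELECTOR = {"release": "prometheus-operator"}


def matches_selector(manifest: dict, selector: dict) -> bool:
    """Return True if every selector key/value pair is present in the labels."""
    labels = manifest.get("metadata", {}).get("labels") or {}
    return all(labels.get(key) == value for key, value in selector.items())


# Illustrative PodMonitor manifests (names are made up).
legacy_monitor = {
    "kind": "PodMonitor",
    "metadata": {"name": "legacy", "labels": {"release": "prometheus-operator"}},
}
updated_monitor = {
    "kind": "PodMonitor",
    "metadata": {"name": "updated", "labels": {"metalk8s.scality.com/monitor": ""}},
}

print(matches_selector(legacy_monitor, NEW_SELECTOR))   # False
print(matches_selector(updated_monitor, NEW_SELECTOR))  # True
```

Under these assumptions, a monitor that still carries only the old label is no longer selected until its labels are updated.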
-
Deploy a hierarchy of Prometheus alerts to provide different granularities when observing the cluster state (used in the UI Dashboard page) (PR#3540)
- #3574 - Allow managing the number of replicas and the soft and hard `podAntiAffinity` for `CoreDNS` from the Bootstrap configuration file, with a default soft anti-affinity on hostname so that, when possible, each `CoreDNS` pod runs on a different infra node (PR#3579)
- Allow managing the number of replicas and the soft and hard `podAntiAffinity` for the Control Plane Ingress Controller from the Bootstrap configuration file, with a default soft anti-affinity on hostname so that, when possible, each Control Plane Ingress Controller pod runs on a different master node (PR#3617)
- Allow managing the soft and hard `podAntiAffinity` for `Dex` from Cluster and Services Configurations, with a default soft anti-affinity on hostname so that, when possible, each `Dex` pod runs on a different infra node (PR#3614)
- Allow managing, from the Bootstrap configuration, the number of terminated pods that can exist before the terminated pod garbage collector starts deleting them. It defaults to `500` (PR#3621)
- Removed the PDF support for documentation, replaced it with the HTML output in the ISO (PR#3540)
-
Bump Kubernetes version to 1.22.4 (PR#3608)
-
Bump etcd version to 3.5.0-0 (PR#3525)
-
Bump CoreDNS version to v1.8.4 (PR#3525)
- Bump `containerd` version to 1.5.8 (PR#3648)
- Bump Calico version to 3.20.0 (PR#3527)
- Bump ingress-nginx chart version to 4.0.1; the nginx-ingress-controller image has been bumped accordingly to v1.0.0 (PR#3518)
- Bump Dex chart version to v0.6.3; the Dex image has been bumped accordingly to v2.30.0 (PR#3519)
- Bump kube-prometheus-stack charts version to 23.2.0. The following images have also been bumped accordingly:
  - grafana to 8.3.1-ubuntu
  - k8s-sidecar to 1.14.2
  - kube-state-metrics to v2.2.4
  - node-exporter to v1.2.2
  - prometheus to v2.31.1
  - prometheus-config-reloader to v0.52.1
  - prometheus-operator to v0.52.1 (PR#3639)
- #3487 - Make the Salt Kubernetes execution module more flexible by relying on `DynamicClient` from `python-kubernetes` (PR#3510)
- Add a Dashboard page to monitor the health and performance of the cluster in MetalK8s UI (PR#3551, PR#3522, PR#3465, PR#3420, PR#3501)
-
Deploy Thanos querier in front of Prometheus in order to make metrics highly-available when we have multiple Prometheus instances (PR#3573)
-
Handle 401 unauthorized error in MetalK8s UI (PR#3640)
- #3618 - Detect Grafana dashboard ConfigMaps in any namespace rather than just `metalk8s-monitoring`, and enable Grafana folder generation from dashboard file structure (PR #3620)
- #3387 - Make metalk8s-sosreport package compatible with sos version 4.0+ (PR#3664)
- Explicitly set the Grafana datasource UID to `metalk8s-<datasource_name>` (PR#3668)
- Do not use the `cluster.local` suffix in Loki datasources (PR#3679)
- #3601 - Mark the `pause` image used by `containerd` as `pod infra container image` so that kubelet does not remove it (PR#3624)
- Do not fail if the Control Plane Ingress section exists in the Bootstrap configuration file but the Ingress IP is not set (PR#3675)
- Filter out some filesystems (NSFS, iso9660) from node exporter, since metrics for those filesystems do not bring any value (PR#3661)
-
Bump Kubernetes version to 1.21.8 (PR#3653)
-
Bump ingress-nginx chart version to 3.36.0; the nginx-ingress-controller image has been bumped accordingly to v0.49.3 (PR#3649)
-
Bump grafana image to 8.0.7-ubuntu (PR#3656)
- Fix display of volume usage on newly created volumes in MetalK8s Web UI (PR#3651)
- Skip "Pending" pods when draining a node (PR#3641)
-
Bump Kubernetes version to 1.21.7 (PR#3607)
-
Add the ability to change the drain timeout from the upgrade and downgrade scripts, with a default of 3600 seconds (PR#3633)
- #3341 - Try to refresh udev database automatically if a Volume persistent path does not exist (PR#3630)
- Fix wrong average value in Control Plane and Workload Plane Bandwidth chart (PR#3616)
- Fix missing data in the tooltip of UI charts when the Node name contains more than one dot (PR#3629)
- Bump Kubernetes version to 1.21.6 (PR#3583)
-
#3570 - Fix the upgrade script, so that it does not exit 1 just after the initial backup creation (PR#3571)
-
Fix a bug in MetalK8s UI that sometimes displayed the metrics of the previously selected instance when switching between instances (PR#3580)
-
Fix the backup replication Job name, which included the node name and could therefore exceed the limit of 63 characters (PR#3584)
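To make the length problem above concrete, here is a minimal sketch of how a name that embeds a node name can cross the 63-character limit. The prefix `backup-replication`, the node names, and both helper functions are hypothetical; this is not the actual MetalK8s naming scheme or fix.

```python
MAX_NAME_LENGTH = 63  # limit quoted in the changelog entry


def fits_name_limit(name: str, limit: int = MAX_NAME_LENGTH) -> bool:
    """Return True when the name does not exceed the length limit."""
    return len(name) <= limit


def job_name_with_node(prefix: str, node_name: str) -> str:
    """Build a Job name that embeds the node name (the problematic pattern)."""
    return f"{prefix}-{node_name}"


prefix = "backup-replication"           # hypothetical prefix
short_node = "node-1"                   # hypothetical short node name
long_node = "worker-" + "a" * 60        # hypothetical 67-character node name

for node in (short_node, long_node):
    name = job_name_with_node(prefix, node)
    print(len(name), fits_name_limit(name))
```

A fixed name that does not depend on the node name always has the same length, so it cannot overflow regardless of how nodes are named.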
-
Fluent-bit instances stayed stuck when a Loki instance was down, blocking the whole logging pipeline. It is now fixed as we configure fluent-bit to talk with Loki's service and use memberlist to keep the Loki instances replicated. (PR#3557)
- Properly handle `generateName` in our Salt Kubernetes module (PR#3590)
- #3564 - Fix a bug that prevents running salt states using salt-ssh if the target node has some MetalK8s volumes (PR#3566)
- A daily backup of the bootstrap node is now automatically scheduled. All the backups are also replicated onto all the master nodes. (PR #3557)
-
Bump Kubernetes version to 1.21.5 (PR#3537)
-
Bump Salt version to 3002.7 (PR #3524)
-
Improve UI metrics charts (cursor synchronisation when hovering a chart, better tooltip with coloured legend and unit, many bug fixes when data is missing, symmetrical charts to compare read/write and in/out metrics) (PR#3529)
-
Enforce a single subnet for control plane when using a MetalLB-managed VIP for Ingress (PR #3533)
-
Fix UI issues in multi-node environments when a node is unavailable (PR#3521)
- Fix the link to documentation from the UI navigation bar (PR#3486)
- Improve performance of Shell UI when switching between navigation entries (PR#3469)
-
Fix a few issues in MetalK8s UI with error handling for Nodes deployment (PR#3477)
- #3480 - Switch Grafana base image to Ubuntu (and bump to 8.0.6) to handle DNS `SERVFAIL` errors gracefully (PR#3481)
- #3474 - Lower the alert thresholds for low filesystem available space and inodes to react before kubelet starts evicting pods (PR#3479)
-
Fix "Logs" dashboard in Grafana (templating error) (PR#3484)
-
#3475 - Fix broken links in MetalK8s UI for "Advanced Metrics" in Nodes and Volumes pages (PR#3483)
-
Bump Kubernetes version to 1.21.3 (PR#3452)
-
Bump CoreDNS version to 1.8.0 (PR#3354)
- Bump prometheus-adapter chart version to 2.14.2; the k8s-prometheus-adapter-amd64 image has been bumped accordingly to v0.8.4 (PR#3429)
- #3279 - Bump fluent-bit chart version from 2.0.1 to 2.2.0; the fluent-bit-plugin-loki image has been bumped accordingly from v1.6.0-amd64 to v2.1.0-amd64 (PR#3364)
- Bump loki chart version to 2.5.2; the loki image has been bumped accordingly to 2.2.1 (PR#3428)
- Migrate from the deprecated stable Dex chart to the dexidp.io Dex chart, and bump the dex image to v2.28.1 (PR#3427)
- Bump kube-prometheus-stack charts version to 16.9.1. The following images have also been bumped accordingly:
  - grafana to 8.0.1
  - k8s-sidecar to 1.12.2
  - kube-state-metrics to v2.0.0
  - node-exporter to v1.1.2
  - prometheus to v2.27.1
  - prometheus-config-reloader to v0.48.1
  - prometheus-operator to v0.48.1 (PR#3422)
- Bump ingress-nginx chart version to 3.34.0; the nginx-ingress-controller image has been bumped accordingly to v0.47.0 (PR#34381)
-
Bump Calico version to 3.19.1 (PR #3430)
- #3366 - Use `systemd` cgroupDriver for Kubelet and containerd (PR#3377)
- Allow manually deploying a second registry container (PR#3400)
- #2381 - Allow configuring the Control Plane Ingress' external IP, to enable high availability with failover of this (virtual) IP between control plane nodes (PR#3415). If supported by the user environment, MetalK8s can manage fail-over of this virtual IP using MetalLB (PR#3418).
-
Use webpack 5 module federation to provide a framework allowing aggregation of solutions UIs (PR#3414)
- #3445 - Avoid kube-apiserver timeout during single node cluster upgrade when many pods are running on the node (PR#3447)
- #2199 - Prometheus label selector for `ServiceMonitor` and `PrometheusRule` objects has changed from `release: prometheus-operator` + `app: prometheus-operator` to `metalk8s.scality.com/monitor: ''` (PR#3356)
- Allow hostPort on 127.0.0.1 (PR#3396)
- Fixed bug in display when adding a new disk with long labels (PR#3328)
-
Check on minion ID / Kubernetes node name match constraints (PR#3258)
-
Add custom metalk8s_network.routes execution module (PR#3352)
- Add an optional order property to manage ordering of navbar entries (PR#3334)
-
Re-support MetalK8s UI on Firefox (PR#3399)
- Remove unnecessary `View logical alerts` toggle in the Alert page (PR#3399)
- #3180 - All alerts from Alertmanager are now stored in Loki database for persistence (PR#3191)
- #3294 - Allow managing `kube-apiserver` feature gates from the Bootstrap Configuration file (PR#3318)
- Complete rebranding of MetalK8s UI (PR#3295)
-
Bump Kubernetes version to 1.20.6 (PR#3311)
- Include qperf in the `metalk8s-utils` container image (PR #3174)
- Bump Node.js version to 14.16.0 (PR#3214)
- Introduce a `shell-ui` project that groups various UI components to be reused by solutions UIs (PR#3106)
- Move the navbar component to `shell-ui` to enable its reuse by solutions UIs (PR#3110)
- Add a static user/groups mapping configuration as part of `shell-ui` configuration to allow solutions UIs to display features according to user groups (PR#3154)
- Enrich `sosreport` output (PR#3222)
- #1997 - Add support for LVM LogicalVolume Volume creation using storage operator (PR #3220)
-
#3051 - Prefix OIDC claims to prevent naming clashes (PR #3054)
-
Bump Kubernetes version to 1.19.8 (PR #3137)
- Bump `coredns` version to 1.7.0 (PR #3008)
- Bump etcd version to 3.4.13-0 (PR #3008)
- #3026 - Embed a checksum of the data contained in the ISO image inside the ISO, so that its integrity can be verified after download using `checkisomd5` (from isomd5sum), in addition to or instead of checking the `SHA256SUM` (PR #3032)
- #2996 - The `bash-completion` completions for the `kubectl` command are now provided when `kubectl` is installed (PR #3039)
- Use the Alpine Linux-based version of the nginx container image, reducing disk space used by the ISO and in image caches (PR #3047)
-
#2932 - Add system partitions tab in MetalK8s UI node page (PR #3045)
-
#2925 - Compare node metrics with average from MetalK8s UI (PR #3078)
-
Improve the upgrade robustness when the platform is a bit slow (PR #3105)
-
Use HTTPS endpoints for kube-controller-manager and kube-scheduler (PR #3152)
-
#3092 - Check if all needed addresses are free, or already used by a MetalK8s process (PR #3163)
-
#3079 - Ensure configured archives are valid in the iso-manager script (PR #3081)
-
#3022 - Ensure salt-master container can start at reboot even if local salt-minion is down (PR #3041)
-
#3075 - Properly install "base" Salt dependencies from "base" RHEL 7 repository (PR #3083)
- #3128 - No longer assume ISOs mounted under `/srv/scality` are Solutions (PR #3182)
- Bump Salt version to 3002.6 (PR #3248)
- #2992 - Check for conflicting packages (`docker`, `docker-ce` and `containerd.io`) on target machines before installation (bootstrap or expansion) (PR #3153, backport of PR #3050)
- #3067 - Check for conflicting services (`firewalld`) already started or enabled on target machines before installation (bootstrap or expansion) (PR #3153, backport of PR #3069)
- Improve error handling when providing invalid CA minion in Bootstrap configuration file (PR #3153, backport of PR #3065)
- kubernetes/kubernetes#57534 - Check if a route exists for the Service IPs CIDR (PR #3153, backport of PR #3076)
- Do not install `containerd.io` instead of `containerd` and `runc` when this package is available in one of the configured repositories (PR #3153, backport of PR #3050)
- Due to vulnerabilities (CVE-2021-3197, CVE-2021-25281, CVE-2021-25282, CVE-2021-25283, CVE-2021-25284, CVE-2021-3148, CVE-2020-35662, CVE-2021-3144, CVE-2020-28972 and CVE-2020-28243) affecting all Salt versions prior to `3002.5`, this release ships with all Saltstack updated to `3002.5`.
  Upgrade Salt to version `3002.5` (PR #3158)
-
Bump Kubernetes version to 1.18.16 (PR #3132)
-
Improve Salt master and cluster upgrade stability in slow environments (PR #3125)
- Embed `pause` image version 3.2 instead of 3.1, as needed for MetalK8s to work offline (required by containerd versions later than 1.4.0) (PR #3120)
- Fix a bug where salt-minion process does not get properly restarted (PR #3059)
- #3064 - Fix upgrade from 2.6.x (PR #3048)
- Prevent unneeded log warning about kubeconfig regeneration (PR #3053)
-
Bump Kubernetes version to 1.18.15 (PR #3035)
- Bump `coredns` version to 1.6.7 (PR #2816)
- #2203 - Migrate Salt to Python3 and bump to version 3002.2 (PR #2839)
- Bump `calico` version to 3.17.0 (PR #2943)
- Bump `fluent-bit` chart to 2.0.1 and `loki` chart to 2.1.0 (PR #2946)
- Replace the prometheus-operator chart by the kube-prometheus-stack one and bump the version to 12.2.3. All the container images of this stack have also been bumped:
  - alertmanager from v0.20.0 to v0.21.0
  - grafana from 6.7.4 to 7.3.5 (PR #3006)
  - k8s-sidecar from 0.1.20 to 1.1.0
  - kube-state-metrics from v1.9.5 to v1.9.7
  - node-exporter from v0.18.1 to v1.0.1
  - prometheus from v2.16.0 to v2.22.1
  - prometheus-config-reload from v0.38.1 to v0.43.2
  - prometheus-operator from v0.38.1 to v0.43.2 (PR #2948)
- Bump `prometheus-adapter` chart to 2.10.1 (PR #3007)
- Bump `ingress-nginx` chart to 3.13.0 (PR #2961)
- #2953 - Allow customization of Prometheus retention (time and size based), see MetalK8s documentation (PR #2968)
- The `screen` and `tmux` terminal multiplexers are now installed in the `metalk8s-utils` container image (PR #2995)
- The `bash-completion` completions for the `kubectl` command are now included in the `metalk8s-utils` container image (PR #2995)
- #2931 - [UI] Improve Volumes list performance using a virtualized table (PR #2938)
-
#2908 - Make upgrade script more robust about static pod restart and improve user experience (PR #2928)
-
#2726 - Ensure sparse loop volumes are all provisioned on reboot (PR #2936)
-
Make sure container engine is ready before trying to import container images (PR #3020)
- Fix invalid return of Success when `wait_minions` runner fails (PR #3031)
- Improve the robustness of salt orchestrate execution (PR #3033)
-
[UI] Fix memory leak in chart component (PR #2988)
-
#2840 - Prevent duplicate static Pods from being created when updating their manifests (PR #3003)
- #3014 - Fix sosreport `metalk8s` plugin's `describe` option (PR #3013)
-
#1887 - All Kubernetes kubeconfig, client and server certificates are now automatically regenerated when close to the expiration date (less than 45 days) (PR #2914)
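The renewal rule above (regenerate when fewer than 45 days remain) can be sketched as a simple date comparison. This is only an illustration of the threshold logic; the function name `needs_regeneration` is hypothetical, and the way MetalK8s actually reads certificate expiry dates is not shown here.

```python
from datetime import datetime, timedelta, timezone

RENEWAL_THRESHOLD = timedelta(days=45)  # threshold quoted in the changelog entry


def needs_regeneration(not_after: datetime, now: datetime) -> bool:
    """Return True when the certificate expires in less than 45 days."""
    return not_after - now < RENEWAL_THRESHOLD


# Fixed reference date so the example is reproducible.
now = datetime(2021, 1, 1, tzinfo=timezone.utc)

print(needs_regeneration(now + timedelta(days=30), now))   # True
print(needs_regeneration(now + timedelta(days=200), now))  # False
```

An already expired certificate also satisfies the condition, since its remaining lifetime is negative.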
- #2581 - Solution UIs are no longer deployed by MetalK8s; this is now the responsibility of the Solution Operators (PR #2617)
- Extend the set of packages installed in the `metalk8s-utils` container image (Partially resolves issue #2156, PR #2374)
- Upgrade `containerd` to 1.2.14 (PR #2874)
- Enable `seccomp` support in `containerd` (Issue #2259, PR #2369)
- #1095 - Ability to use multiple CIDRs for control plane and workload plane networks and to specify the workload plane MTU to compute the MTU used by Calico (PR #2677)
- Deploy log aggregation layer, based on Loki and Fluentbit (see #2722, #2723, #2727, #2738, and #2745)
- Due to vulnerabilities (CVE-2020-16846 and CVE-2020-25592) affecting all Salt-API versions prior to `3000.5`, this release ships with all Saltstack updated to `3000.5`.
  Upgrade Salt to version `3000.5` (PR #2916)
-
Bump Calico version to 3.16.1 (PR#2824)
- #3218 - Enrich sosreport plugins:
  - Add a Prometheus snapshot
  - Add Salt configuration
  - Add salt-minion journal
  - Add kubectl top nodes & pods
  - Add bootstrap and solutions configuration files (PR #3222)
- #3247 - Fix a bug where Salt minion process may fail to restart during upgrade or downgrade process (PR #3281)
-
#2854 - Bump containerd version to 1.2.14 to fix CVE-2020-15157 (PR #2874)
-
#2653 - Bind MetalK8s OIDC static admin user to a Grafana Admin role (PR #2742)
-
#2704 - Always install the right Salt minion version during Bootstrap (PR #2734)
-
#2653 - Dex admin user has super-admin access in Grafana (PR #2743)
-
Storage Operator no longer spams Salt API because of an infinite reconciliation loop (Commit b0eca3d84, PR #2651)
- Solutions product information format has changed: there is a new `manifest.yaml` file to describe the whole Solution instead of the `product.txt` and `config.yaml` (#2422). Solution archives working on previous versions of MetalK8s will no longer be compatible and will need to be regenerated. See Solutions documentation for details about the new format.
-
#2423 - Bump nginx-ingress-controller version to 0.30.0 (PR #2446)
-
#2430 - Bump prometheus-operator version to 8.13.0 (PR #2557)
-
#2488 - Update default CSC value during upgrade/downgrade (PR #2513)
-
#2493 - Use async call for disk.dump during Volume provisioning (PR #2571)
- Add support for CustomResourceDefinition apiextensions.k8s.io/v1 in `metalk8s_kubernetes` Salt module (PR #2583)
-
#2434 - Wait for a single Salt Master container during Bootstrap (PR #2435)
-
#2526 - Add 'groups' scope when requesting an id_token from Dex in the UI (PR #2529)
-
#2443 - Improve error handling for Salt jobs in the UI (PR #2475)
-
#2495 - Fix monitoring page to display all alerts in the UI (PR #2503)
-
#2569 - Restart Dex Pod automatically upon CSC Dex configuration changes (PR #2573)
- Basic authentication has been deprecated in favour of OpenID Connect (OIDC) with Dex being deployed as a local Identity Provider, used by Kubernetes API and Grafana.
  This implies:
  - The existing users defined for Kubernetes API Basic Auth (in `/etc/kubernetes/htpasswd`) and for the Grafana admin will become unusable
  - A default admin user will be created in Dex, with the new credentials `[email protected]`:`password`, which can be used to access the MetalK8s UI and Grafana
  - Procedures to edit and add new users can now be found here
-
A new framework for persisting Cluster and Services Configurations (CSC) has been added to ensure configurations set by administrators are not lost during upgrade or downgrade and can be found here.
- User-provided configuration is now stored in ConfigMaps, and MetalK8s tooling will honor the values provided when deploying its services:
  - Dex uses `metalk8s-auth/metalk8s-dex-config`
  - Grafana uses `metalk8s-monitoring/metalk8s-grafana-config`
  - Prometheus uses `metalk8s-monitoring/metalk8s-prometheus-config`
  - Alertmanager uses `metalk8s-monitoring/metalk8s-alertmanager-config`

  Documentation for changing and applying configuration values is found here.

  Note that any configuration applied on other Kubernetes objects (e.g. a configuration Secret that Alertmanager uses, or the Deployment of Grafana) will be lost upon upgrade, and admins should make sure to prepare the relevant ConfigMaps from their existing configuration before upgrading to this version.
-
The MetalK8s UI has been re-branded with lots of changes to the Login screens and Navbar to offer a smoother experience.
-
Upgrade Calico to 3.12.0 (PR #2253)
-
#2007 - Deploy Dex in a MetalK8s cluster from stable Helm Charts (PR #2025)
-
#2015 - Configure MetalK8s UI to require authentication through Dex (OIDC) (PR #2042)
-
#2016 - Brand the Dex GUI to match MetalK8s UI specifications (PR #2062)
-
#2072 - Remove support for Kubernetes API server basic authentication (PR #2119)
-
#2078 - Store Dex authentication access_token in the browser localStorage (PR #2088)
-
#2255 - Template and store replicas count for Prometheus, Grafana & Alertmanager as service configurations (PR #2258)
-
#2261 - Template and store Dex backend settings as service configurations (PR #2274)
-
#2262 - Template and store Alertmanager Secret as a service configuration (PR #2289)
-
Enable OIDC based authentication for Grafana service (PR #2378)
-
#2351 - Update documentation with default credentials for MetalK8s UI and Grafana UI (PR #2377)
-
#2264 - Add documentation on the list of Cluster and Service configurations (PR #2291)
- Due to critical vulnerabilities (CVE-2020-11652 and CVE-2020-11651) with CVSS score of 10.0 affecting all Salt master versions prior to `3000.2`, this release ships with all Saltstack updated to `3000.3`. Users, especially those who expose the Salt master to the Internet, must therefore upgrade immediately.
- Due to an access control vulnerability (CVE-2020-13379) with CVSS score of 5.3 that was discovered affecting Grafana versions from `3.0.1` through `7.0.1`, this release ships with a Grafana version updated to `6.7.4`. For more, see here
- A potential risk for privilege escalation in SaltAPI described here was fixed in this release.
  - #2634 - Prevent impersonation in SaltAPI (PR #2642)
  - #1528 and #2084 - Tighten storage-operator permissions against Salt (PR #2635)
-
Make etcd expansions more resilient (PR #2147)
-
#2585 - Add state to cleanup PrometheusRule CRs after upgrade/downgrade (PR #2594)
-
#2444 - Fix flaky SLS rendering when missing a pillar key (PR #2445)
- #2551 - Fix downgrade pre-check regarding the saltenv version (PR #2552)
- #2592 - Fix invalid custom object listing in `metalk8s_kubernetes` Salt module (PR #2593)
- Fix apiserver-proxy to no longer proxy to non-master nodes (PR #2555)
-
#2530 - Make cluster upgrade more robust to Pod disruption constraints (PR #2531)
-
#2028 - Improve the resilience of node deployment (PR #2147)
-
Fix various issues in the bootstrap restore script (PR #2061)
- #1993 - Add Solutions management, CLI tooling to deploy Solutions (complex Kubernetes applications) (PR #2279)
- Add `label_selector` in MetalK8s custom kubernetes salt module for listing kubernetes objects (PR #2236)
- Salt grains cache is now enabled (PR #2417)
- #2334 - Add and install `yum-utils` package required for cluster expansion (PR #2343)
- #2245 - Rephrase volume status from `Available` to `Ready` (PR #2248)
- If `apiServer.host` is configured in `BootstrapConfiguration`, this is no longer used (and must no longer be defined).
- If `apiServer.keepalived` is configured in `BootstrapConfiguration`, this is no longer used, and Keepalived is no longer deployed at all.
- Generated `admin.conf` `KubeConfig` files point to the control-plane IP of the host on which they are generated. You can override this when using them with the `-s`/`--server` argument of `kubectl` to point to another address.
-
#1891 - Allow adding labels to Volumes from the UI (PRs #1979 and #2066)
- #2049 - Deploy prometheus-adapter to implement the `metrics.k8s.io` API, to support `kubectl top` and other consumers of this API (PR #2057)
- #2103 - Add a host-local `nginx` on every node to provide highly-available and load-balanced access to `kube-apiserver` (PR #2106)
- #2052 - Handle configuration of an HTTP proxy for `containerd` (PRs #2071 and #2201)
- #2149 - Provide access to the product documentation from the UI (PR #2176)