howto-k3os-baremetal

Either https://github.com/rancher/k3os is not really designed to run on a bare metal cluster, or it is just too bleeding edge right now

k3os config.yaml

  • I don't think mDNS is part of k3os, so you have to use a hardcoded IP address, or else run a DNS system outside the cluster.
  • You need to create 2 different installation ISOs to set up "Embedded HA using etcd". The other options seem equally strange, in that they need an external database that is not part of the cluster. This is weird to my brain: I thought the whole point of a cluster was to have no single point of failure, but the options boil down to "an external database that is a single point of failure" or "I need multiple clusters to make a cluster".
  • The first computer you set up needs:
k3os:
  k3s_args:
  - server  
  - "--cluster-init"
  token: token # shared cluster secret, use the same value on every server
  • Subsequent computers need a different configuration / installation ISO. The hardcoded IP seems very strange, and I don't yet know whether it breaks MetalLB when that first computer goes down. I'm also not sure whether the first master needs to be fully running before any more computers can be added to the cluster, say to fix a problem. Do I really need to make ANOTHER ISO image (with a new server_url) if the 1 of 3 computers that experiences a hardware problem happens to be the first master I spun up?
k3os:
  k3s_args:
  - server # joining nodes also run the server role for embedded etcd HA
  token: token
  server_url: https://192.168.10.43:6443
  • MetalLB seems like the simplest way to route a tiny corporate router to a potentially partial cluster (a sample address pool is sketched after this list). Every config.yaml requires:
k3os:
  k3s_args:
  - server
  - "--disable=servicelb"

Embedded High availability using etcd

The documentation says you need 3 or more servers, and only an odd number. Nothing explains why, but I think it comes from the way etcd works: etcd only keeps serving while a majority (quorum) of its members are reachable, and an even member count adds failure risk without adding any extra fault tolerance.
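
A quick sketch of the quorum arithmetic (quorum = floor(n/2) + 1, tolerated failures = n - quorum), which lines up with the trial-and-error results below:

servers   quorum   failures tolerated
   1         1             0
   2         2             0
   3         2             1
   4         3             1
   5         3             2

So 2 servers tolerate losing nothing, and a 4th server buys no more tolerance than 3.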

From trial and error testing:

2 servers react like this

  • shutting one down, kubectl stops working entirely
  • you can ssh into each cluster computer to see which one seems to be running and which one is not, then troubleshoot any hardware problems on a per-computer basis
  • turning the second server back on, everything started running again
  • pods on each surviving server are "supposed to stay running" (but this didn't seem entirely true), and anything that was running on the downed server was not automatically rescheduled
  • you can mostly trust that multiple pods (replicas >= 2) will land on different nodes (it won't blindly run both replicas on the same node even when no affinity is defined); see the anti-affinity sketch after this list if you want to make that explicit
  • metalLB doesn't seem to work when one of the 2 servers goes down, so multiple replicas aren't enough
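
If you want that spread to be explicit rather than just likely, here is a hedged sketch of a Deployment that prefers putting its replicas on different nodes; the app name, labels, and image are made up for illustration:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: website # hypothetical app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: website
  template:
    metadata:
      labels:
        app: website
    spec:
      affinity:
        podAntiAffinity:
          # prefer (not require) scheduling the replicas on different nodes
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: website
              topologyKey: kubernetes.io/hostname
      containers:
      - name: website
        image: nginx # placeholder image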

3 servers

  • mostly works as expected
  • 1 down : leaves enough nodes running to keep the website up, and metallb works well enough to reach the website
  • a few seconds of no response from kubectl, then it eventually works again
  • something feels wrong about hardcoding an ip address inside /user/.kube/k3s.yaml to connect to the cluster; if the first server goes down, are we just supposed to manually edit the file to point at one of the other 2? (see the kubeconfig sketch after this list)
  • 1 down : eventually it will reschedule some new pods on the 2 remaining nodes
  • 2 down : completely fails, behaving like the 2-server case with both down
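
For what it's worth, the piece of the kubeconfig that would need editing looks roughly like this; pointing it at a VIP or a DNS name shared by all three servers (neither of which I have set up) would avoid the hand-editing. The IP below is just the example address used earlier:

clusters:
- name: default
  cluster:
    # any reachable server node works here; a shared VIP or DNS name would
    # survive the first server going down without editing this file
    server: https://192.168.10.43:6443
    certificate-authority-data: ...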

Uninterruptible Power Supply

I wish I didn't have to think much about this, but I was forced to try to re-invent the wheel here. No operating system support exists yet (hopefully someday): rancher/k3os#548

Basic operating system support requires an addition to the k3os config.yaml:

write_files:
- encoding: ""
  content: |-
    #!/bin/bash
    # Watch for a "killpower" marker file and power the node off when it appears.
    DIR=/usr/local/killpower
    LOG=${DIR}/killpower.log
    MARK=${DIR}/killpower
    # If the marker survived the previous shutdown, clear it on boot so the
    # node doesn't immediately power itself off again.
    if test -f "$MARK"; then
      now=$(date +"%Y %m %d %T")
      echo "${now} : killpowerwatch boot found marker, cleared" >> $LOG
      rm $MARK
    fi
    # Poll for the marker every 3 seconds; power off as soon as it shows up.
    for (( ; ; ))
    do
      if test -f "$MARK"; then
        now=$(date +"%Y %m %d %T")
        echo "${now} : killpowerwatch, found marker, shutting down" >> $LOG
        poweroff &
        break
      fi
      sleep 3
    done
  owner: root
  path: /home/killpowerwatch
  permissions: '0700'
run_cmd:
- "echo launching killpowerwatch"
- "start-stop-daemon --background --start --exec /home/killpowerwatch --make-pidfile --pidfile /run/killpowerwatch.pid"

This is a super naive attempt at being minimal, and there are probably a ton of problems with doing it this way. A flag file gets created on the hostPath (by the NUT pods below) and detected by the watcher, which powers off the computer.
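
To sanity-check the watcher by hand (assuming you can ssh onto a node as the default rancher user), touching the marker should power the machine off within a few seconds; the IP is just the example address from above:

ssh rancher@192.168.10.43
sudo mkdir -p /usr/local/killpower   # make sure the marker directory exists
sudo touch /usr/local/killpower/killpower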

network-ups-tools

The meat of the logic was hopefully supplied in the helm chart:

#setup 
    helm repo add k8s-at-home https://k8s-at-home.com/charts/
#download to use values.yaml as example file
    helm pull k8s-at-home/network-ups-tools --untar
#also check their "common values" for all k8s at home charts 
    #   https://github.com/k8s-at-home/charts/blob/master/charts/common/values.yaml 
#see physical configuration https://networkupstools.org/docs/man/usbhid-ups.html

#the master NODE that is physically connected to UPS is marked with a label
    kubectl label nodes k3os-1351 ups-device=true
#test
    kubectl get node --selector='ups-device'

image:
  repository: cdeadspine/nut #check todo waiting pull request k8sathome/network-ups-tools
  tag: test #check todo waiting pull request v2.7.4-2124-g9defa7a9
  pullPolicy: Always #todo IfNotPresent

controllerType: daemonset #https://github.com/k8s-at-home/charts/blob/master/charts/common/values.yaml

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: ups-device
          operator: Exists

securityContext: #required to access hardware
  privileged: true

service:
  port:
    name: server
    port: 3493

env: {}
  # TZ: UTC

additionalVolumes:
  - name: ups
    hostPath:
      path: /dev/bus/usb
  - name: kp
    hostPath:
      path: /usr/local/killpower #k3os specific path
additionalVolumeMounts:
  - mountPath: /dev/bus/usb
    name: ups
  - mountPath: /killpower
    name: kp
  
# The main configuration files for network-ups-tools
config:
  # If set to 'values', the configuration will be read from these values.
  # Otherwise you have to mount a volume to /etc/nut containing the configuration files.
  mode: values

  # See https://github.com/networkupstools/nut/tree/master/conf for config sample files
  files:
    nut.conf: |
      MODE=netserver

    upsd.conf: |
      LISTEN 0.0.0.0

    ups.conf: |
      [ups1]
        driver = usbhid-ups
        port = auto
        desc = "apc on usb"
    
    upsd.users: |
      [monmaster]
        actions = SET
        instcmds = ALL
        password  = secret
        upsmon master
      [monslave]
        password  = secret
        upsmon slave

    upsmon.conf: |
      MONITOR ups1@localhost 1 monmaster secret master
      SHUTDOWNCMD "`/usr/local/ups/sbin/upsdrvctl -u root shutdown && touch /killpower/killpower`"
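
With the values above saved to a file, deploying would presumably look something like this (the release name and file name are mine, not from the chart):

helm install nut-master k8s-at-home/network-ups-tools -f values-master.yaml
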
  • Slave configuration
image:
  repository: cdeadspine/nut #check todo waiting pull request k8sathome/network-ups-tools
  tag: test #check todo waiting pull request v2.7.4-2124-g9defa7a9
  pullPolicy: Always #todo IfNotPresent

controllerType: daemonset #https://github.com/k8s-at-home/charts/blob/master/charts/common/values.yaml

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: ups-device
          operator: DoesNotExist

securityContext: #required to access /usr/local/killpower
  privileged: true

env: {}
  # TZ: UTC

additionalVolumes:
  - name: kp
    hostPath:
      path: /usr/local/killpower #k3os specific path
additionalVolumeMounts:
  - mountPath: /killpower
    name: kp
  
# The main configuration files for network-ups-tools
config:
  # If set to 'values', the configuration will be read from these values.
  # Otherwise you have to mount a volume to /etc/nut containing the configuration files.
  mode: values

  # See https://github.com/networkupstools/nut/tree/master/conf for config sample files
  files:
    nut.conf: |
      MODE=netclient
      ALLOW_NO_DEVICE=true
      export ALLOW_NO_DEVICE
    
    dummy-ups.dev: | #required if we don't set up an upsd configuration
      #nothing

    upsmon.conf: |
      MONITOR ups1@<master-node-address>:3493 1 monslave secret slave
      SHUTDOWNCMD "touch /killpower/killpower" 
  • This ends up being a failure: the slave shutdown works properly, but the master pod breaks (with only 1 of 3 cluster computers still running, pods apparently stop working) before it can shut itself and the UPS down. Perhaps a really short timer configuration would work, but it seems flaky.
  • The real fix might be some kind of time-delayed flag created on the master during the NOTIFY stage of the upsmon shutdown (sketched below), but this feels super hacky and gross.
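
A rough, untested sketch of that idea using NUT's upssched: upsmon hands notifications to upssched, upssched starts a timer when the UPS goes on battery, and if power has not returned when the timer fires it touches the killpower flag. All paths, the timer name, and the 120-second delay are assumptions, and the helper script would have to be baked into the image or mounted into the pod:

    upsmon.conf: |
      MONITOR ups1@localhost 1 monmaster secret master
      # SHUTDOWNCMD stays as in the master configuration above
      NOTIFYCMD /usr/sbin/upssched
      NOTIFYFLAG ONBATT SYSLOG+EXEC
      NOTIFYFLAG ONLINE SYSLOG+EXEC

    upssched.conf: |
      CMDSCRIPT /usr/local/bin/upssched-cmd
      PIPEFN /run/nut/upssched.pipe
      LOCKFN /run/nut/upssched.lock
      # start a countdown when power is lost, cancel it if power comes back
      AT ONBATT * START-TIMER killpower-delay 120
      AT ONLINE * CANCEL-TIMER killpower-delay

and the hypothetical CMDSCRIPT itself:

#!/bin/sh
# hypothetical /usr/local/bin/upssched-cmd; upssched calls it with the timer name as $1
case "$1" in
  killpower-delay)
    # power never came back within the delay: drop the flag that the
    # killpowerwatch script on the host is polling for
    touch /killpower/killpower
    ;;
esac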
