Rapid deployment of K3S on vSphere using K3OS

No fluff here, just a quick writeup of how to rapidly deploy a multi-master Kubernetes cluster using K3OS on vSphere. This can probably be done even faster by remastering ISOs, using Packer, or passing cloud-init directly to the VM, but that's something for the future. K3OS is a minimalistic operating system intended to run only the minimal footprint required for K3S, which makes it incredibly compact and easy to deploy and manage. Another advantage is that K3OS can be updated through the Kubernetes K3OS operator, which eliminates the overhead of having to manually manage an underlying operating system. For people familiar with Rancher, K3OS is the spiritual successor to RancherOS.

Prerequisites

Create prerequisites such as DNS names (not required but handy), IP addresses, etc. If you want you can use DHCP reservations which makes the configuration a bit simpler.

You also need a DHCP server to initially boot the machines so that they can get an IP address. I have a reserved range with an extremely short lease time (2 minutes) for this purpose.

Download the K3OS ISO from https://github.com/rancher/k3os/releases

Create a VM in vSphere without any OS installed, the ISO attached and connected, and the correct network configured. I use 2vCPU/2GB RAM, 8GB Disk but this obviously depends on your use case. Do not boot it yet.

Create a template from this VM.

Create a cloud-init template file for your master with the following content and host it somewhere reachable over http/https. I've personally stored mine in a GitHub gist, which is obviously not ideal since the file contains sensitive information, but with the gist set to secret this is not an issue for me. Tip: if you are using GitHub gists, create a short URL using https://git.io, as you will need to type in this URL manually. Make sure the short URL points to the RAW gist and not to the gist page itself.

ssh_authorized_keys:
- github:srobroek
write_files:
- path: /var/lib/connman/default.config
  content: |-
    [service_eth0]
    Type=ethernet
    IPv4=172.21.3.3/255.255.255.0/172.21.3.1
    IPv6=off
    Nameservers=172.20.11.9
    SearchDomains=int.vxsan.com
    Domain=int.vxsan.com
    TimeServers=pool.ntp.org
hostname: k3oscl01m01

k3os:
  k3sArgs:
    - server
    - --cluster-init
    - --tls-san=cl01-k3s.int.vxsan.com
    - --flannel-backend=host-gw

Now we'll go through this file section by section:

  • ssh_authorized_keys lets you set the SSH keys that are allowed to log in. I get my keys from my GitHub profile, but you can also enter keys here manually. This is critical, as by default you cannot log in using a password, either locally or over SSH.
  • write_files writes the connman config, which is the initial network setup. Note that you can also set nameservers and NTP servers in the k3os section using dns_nameservers and ntp_servers; both options are honestly fine. The IPv4 setting is in the format IP/NETMASK/GATEWAY.
  • The k3os section defines your K3OS config and determines how K3OS and K3S are set up initially (and on every reboot):
      • On the first master node we set --cluster-init, which initialises an embedded etcd cluster. Of course you can also deploy this using another backend such as MySQL/MariaDB, but if you're at that stage you should be able to figure this out.
      • --tls-san adds an extra Subject Alternative Name to the TLS certificate, in my case the name of my load balancer. If you are connecting directly to a node this is not required.
      • --flannel-backend=host-gw is set because I was having some issues with the vxlan backend, most likely due to NSX-T being installed on the physical hosts. If you do not run this on hosts prepared for NSX-T/NSX-V you should leave this line out.
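Before moving on to the install, it can be worth checking that the URL you are about to type in really serves the raw file and not an HTML page. The following is a minimal sketch, not something from my actual setup: looks_like_raw_config is a made-up helper name, the URL in the usage comment is a placeholder, and the check is only a rough heuristic (anything whose first non-empty line starts with "<" is treated as HTML).

```shell
# Hypothetical helper: succeeds if stdin looks like a raw cloud-init file
# rather than an HTML page (e.g. the gist page instead of the raw link).
looks_like_raw_config() {
  # Take the first non-empty line and strip its leading whitespace.
  first_line=$(sed -n '/[^[:space:]]/{s/^[[:space:]]*//;p;q;}')
  case "$first_line" in
    '<'*) return 1 ;;  # starts with a tag: probably an HTML page
    '')   return 1 ;;  # empty response
    *)    return 0 ;;
  esac
}

# Usage (the URL is a placeholder, replace it with your own):
#   curl -fsSL "https://example.com/master01.yaml" | looks_like_raw_config \
#     && echo "looks like a raw config" || echo "got HTML or nothing"
```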

Deployment

Once you've got your initial configuration set up, boot your first VM and log in using the username rancher. Once you're logged in, run the command sudo k3os install. Choose the option "install to disk" and select "yes" when asked whether to configure the system with a cloud-init file. Enter the URL of the cloud-init file that you created earlier, and after a short while your machine should reboot.

Now you should be able to log in to your first master using ssh with the ssh key provided and the user rancher. If not, check if the network configuration is done correctly and redeploy.

When you're logged into your first node, you should be able to run the command kubectl get nodes and see a single node with 3 roles, as follows. If not, wait a bit and try again. If it still does not show up, check the logs in /var/log/k3s-service.log.

k3oscl01m01 [~]$ kubectl get nodes
NAME          STATUS   ROLES                       AGE    VERSION
k3oscl01m01   Ready    control-plane,etcd,master   127m   v1.21.1+k3s1
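The "wait a bit and try again" part can be scripted if you prefer. This is a rough sketch under my own assumptions: count_ready_nodes and wait_for_nodes are made-up helper names, the 10 second interval and 30 attempts are arbitrary, and only nodes whose STATUS is exactly Ready are counted.

```shell
# Hypothetical helper: counts nodes whose STATUS column is exactly "Ready"
# in `kubectl get nodes` output read from stdin (the header line is skipped).
count_ready_nodes() {
  awk 'NR > 1 && $2 == "Ready" { n++ } END { print n + 0 }'
}

# Hypothetical helper: polls until at least $1 nodes are Ready,
# checking every 10 seconds for up to $2 attempts (default 30).
wait_for_nodes() {
  want=$1
  attempts=${2:-30}
  while [ "$attempts" -gt 0 ]; do
    ready=$(kubectl get nodes 2>/dev/null | count_ready_nodes)
    [ "$ready" -ge "$want" ] && return 0
    attempts=$((attempts - 1))
    sleep 10
  done
  return 1
}

# Usage on the first master:
#   wait_for_nodes 1 || tail -n 50 /var/log/k3s-service.log
```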

You can also copy the kubeconfig, located in /etc/rancher/k3s/k3s.yaml, to your local machine using scp. Note that you will have to adjust the server address in it to point to your load balancer or your first node.
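As a small sketch of that adjustment: this assumes the copied file still contains the default K3S entry https://127.0.0.1:6443. point_kubeconfig_at is a made-up helper name, and the local file names in the usage comment are just examples.

```shell
# Hypothetical helper: rewrites the kubeconfig server address read from stdin.
# Assumes the file contains the default https://127.0.0.1:6443 entry.
# $1 is the host name or IP of the load balancer or first node.
point_kubeconfig_at() {
  sed "s#https://127\.0\.0\.1:6443#https://$1:6443#"
}

# Usage from your local machine (local paths are examples):
#   scp rancher@172.21.3.3:/etc/rancher/k3s/k3s.yaml ./k3s-raw.yaml
#   point_kubeconfig_at cl01-k3s.int.vxsan.com < ./k3s-raw.yaml > ./k3s-cl01.yaml
#   KUBECONFIG=./k3s-cl01.yaml kubectl get nodes
```

If scp fails with a permission error, the file is probably only readable by root on your node. In that case reading it with sudo cat over ssh and redirecting the output locally should work as well.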

You will also need to retrieve the node token from the first node, which is located in /var/lib/rancher/k3s/server/node-token. Note that you cannot read this file as a regular user, so you'll have to use sudo: sudo cat /var/lib/rancher/k3s/server/node-token.

Now for the other master nodes you will need to create a slightly different cloud-init file as follows. You will have to create a cloud-init file for each node unless you are using DHCP reservations, as the network details are different for each node. If you are using DHCP reservations, you can use the same cloud-init across your cluster.

ssh_authorized_keys:
- github:srobroek
write_files:
- path: /var/lib/connman/default.config
  content: |-
    [service_eth0]
    Type=ethernet
    IPv4=172.21.3.4/255.255.255.0/172.21.3.1
    IPv6=off
    Nameservers=172.20.11.9
    SearchDomains=int.vxsan.com
    Domain=int.vxsan.com
    TimeServers=pool.ntp.org
hostname: k3oscl01m02

k3os:
  token: fill_in_your_token
  k3sArgs:
    - server
    - --flannel-backend=host-gw
  server_url: https://172.21.3.3:6443

As you can see the initial config is still the same, but the k3os configuration is significantly different:

  • The server option is still set, as we want this machine to be a master.
  • The token field is new here and should be filled with the token retrieved from the first master, as this machine will use it to join the existing cluster.
  • The server_url field is set to the first master so that these machines can join the cluster. Note that you can also run this through a load balancer, but this requires a bit more setup with complicated health checks, so for my lab this is fine.
  • Again, if you are not running on NSX enabled hosts you can leave the --flannel-backend option out.
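Since only the hostname, the IP address and the token differ between these files, you could generate them instead of editing each one by hand. The following is a sketch of that idea, not what I actually used: render_master_config is a made-up helper name, and the netmask, gateway, DNS and first-master URL are simply hardcoded to the values from this post.

```shell
# Hypothetical helper: prints a cloud-init file for an additional master.
# Arguments: $1 hostname, $2 IPv4 address, $3 join token.
# Netmask, gateway, DNS and server_url are the values used in this post.
render_master_config() {
  cat <<EOF
ssh_authorized_keys:
- github:srobroek
write_files:
- path: /var/lib/connman/default.config
  content: |-
    [service_eth0]
    Type=ethernet
    IPv4=$2/255.255.255.0/172.21.3.1
    IPv6=off
    Nameservers=172.20.11.9
    SearchDomains=int.vxsan.com
    Domain=int.vxsan.com
    TimeServers=pool.ntp.org
hostname: $1

k3os:
  token: $3
  k3sArgs:
    - server
    - --flannel-backend=host-gw
  server_url: https://172.21.3.3:6443
EOF
}

# Usage (the third hostname and address are made-up examples):
#   TOKEN=$(ssh rancher@172.21.3.3 sudo cat /var/lib/rancher/k3s/server/node-token)
#   render_master_config k3oscl01m02 172.21.3.4 "$TOKEN" > m02.yaml
#   render_master_config k3oscl01m03 172.21.3.5 "$TOKEN" > m03.yaml
```

Keep in mind that the generated files contain the join token, so treat them as sensitive wherever you host them.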

Repeat the above procedure of booting your machine from ISO, running sudo k3os install and pointing it to the cloud-init file. After a short while the node will reboot and you should be able to see an additional node appear in your kubectl get nodes.

As K3S does not taint its masters by default, this should be enough to start deploying a management plane such as Rancher. However, if you want to deploy additional non-master nodes, you can use the same cloud-init file and change server to agent in k3sArgs as follows. The --flannel-backend line is left out here because it is a server option:

k3os:
  token: fill_in_your_token
  k3sArgs:
    - agent
  server_url: https://172.21.3.3:6443
