Installing Rancher to create and manage a Kubernetes cluster
Update: When I went through this initially, I didn't realise I was setting up a high-availability cluster meant just to run Rancher... which no-one needs. So instead I checked out the guide on running a single node setup, and provisioned my Kubernetes cluster through the Rancher CLI. The way it was intended.
On the off-chance someone's interested in how I set up a high-availability cluster running Rancher, here goes.
I don’t know enough about Kubernetes or Rancher to create a guide, but here’s how I set them up so I could have a cluster of machines that can run Docker containers, for a growing project.
If you follow this and you come up against problems, check the guides listed in the Credits section, as I’m probably not knowledgeable enough to help.
Background
In March I wrote about Project Anvil, an attempt to create a Dockerised version of my web product. I've gone back and forth and round the houses with this project, and have come to the conclusion that the best thing to do is to keep the app as close to the current production version as possible. I've been making big changes to other parts of the system, and even though I'm still splitting a few things out into separate repos – which I think is the best course – I don't want to go the full SPA route.
So the first aim is to get the app to be deployed on a stack running Kubernetes (a system I don't really understand) via Rancher (an open source manager I've never used).
As I said up top, this is definitely not a guide, but a rundown of the steps I've taken so far that have led to a seemingly stable Rancher installation.
Provisioning the nodes and load balancer
I started by creating 4 DigitalOcean droplets, using the following naming scheme:
universe-node0
universe-node1
universe-node2
universe-node3
Then I SSHed into universe-node0 and ran the following to install NginX:
$ apt-get update && apt-get -y install nginx
I then SSHed into the remaining nodes and ran the following to install Docker:
$ apt-get remove -y docker docker-engine docker.io
$ apt-get update && apt-get install apt-transport-https ca-certificates curl software-properties-common
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add -
$ add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
$ apt-get update && apt-get install -y docker-ce=17.03.2~ce-0~ubuntu-xenial
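Rather than SSHing into each of the three Docker nodes by hand, the same commands could be driven from a loop. This is only a sketch, under a few assumptions: the node hostnames resolve, root SSH access works, and install-docker.sh is a hypothetical file holding the apt-get commands above. By default it only prints what it would run.

```shell
#!/bin/sh
# Sketch: run one script on every Docker node over SSH.
# Assumptions: hostnames resolve, root SSH works, and install-docker.sh
# (a hypothetical file) holds the apt-get commands listed above.

# Print the node names universe-node1 .. universe-nodeN.
docker_nodes() {
    count="$1"
    i=1
    while [ "$i" -le "$count" ]; do
        echo "universe-node$i"
        i=$((i + 1))
    done
}

# With DRY_RUN=1 (the default) only print the ssh commands;
# set DRY_RUN=0 to actually run the script on each node.
run_on_nodes() {
    script="$1"
    count="$2"
    for node in $(docker_nodes "$count"); do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ssh root@$node sh -s < $script"
        else
            ssh "root@$node" sh -s < "$script"
        fi
    done
}
```

Calling run_on_nodes install-docker.sh 3 lists the three ssh commands; universe-node0 is left out on purpose because it only runs NginX.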
Configuring NginX
On universe-node0, I replaced /etc/nginx/nginx.conf with the following:
worker_processes 4;
worker_rlimit_nofile 40000;

events {
    worker_connections 8192;
}

http {
    server {
        listen 80;
        return 301 https://$host$request_uri;
    }
}

stream {
    upstream rancher_servers {
        least_conn;
        server <node1 ip>:443 max_fails=3 fail_timeout=5s;
        server <node2 ip>:443 max_fails=3 fail_timeout=5s;
        server <node3 ip>:443 max_fails=3 fail_timeout=5s;
    }

    server {
        listen 443;
        proxy_pass rancher_servers;
    }
}
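The upstream block is the only part of that config that changes from one setup to the next, so it could be generated from a list of node IPs. A minimal sketch follows; the function name rancher_upstream is made up for illustration, and the addresses in the usage note are documentation-range placeholders rather than my real nodes.

```shell
#!/bin/sh
# Sketch: print an nginx "upstream" block for the given node IPs.
# rancher_upstream is a hypothetical helper, not part of nginx or Rancher.
rancher_upstream() {
    echo "upstream rancher_servers {"
    echo "    least_conn;"
    for ip in "$@"; do
        # One TCP backend per node, on the HTTPS port.
        echo "    server $ip:443 max_fails=3 fail_timeout=5s;"
    done
    echo "}"
}
```

For example, rancher_upstream 192.0.2.11 192.0.2.12 192.0.2.13 prints a block with three server lines. Running nginx -t before restarting is a cheap way to catch syntax errors in the finished file.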
I restarted NginX by running service nginx restart.
Setting up the FQDN
I added a DNS record pointing rancher.example.com at the IP address of the load balancer node (obviously I've redacted the domain name).
Installing RKE
I downloaded the Rancher Kubernetes installer, and in a terminal, entered the directory the binary was downloaded to and ran chmod +x rke_darwin-amd64 to make the binary executable. I verified RKE was working by running ./rke_darwin-amd64 --version. I then made it executable from any location by running mv rke_darwin-amd64 /usr/local/bin/rke.
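Those install steps can be wrapped up in a small function. This is a sketch only: install_rke is a name I've made up, and the destination defaults to /usr/local/bin as above, which may need sudo on some machines.

```shell
#!/bin/sh
# Sketch: make a downloaded RKE binary executable and move it onto the PATH.
# install_rke is a hypothetical helper; $1 is the downloaded file
# (e.g. ./rke_darwin-amd64), $2 an optional destination directory.
install_rke() {
    src="$1"
    dest_dir="${2:-/usr/local/bin}"
    chmod +x "$src" || return 1
    mv "$src" "$dest_dir/rke"
}
```

After install_rke ./rke_darwin-amd64, running rke --version from any directory should give the same output as the earlier check.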
Configuring Rancher
I created a file called rancher-cluster.yml, using the following template:
nodes:
  - address: <node1 ip>
    user: root
    role: [controlplane,etcd,worker]
    ssh_key_path:
  - address: <node2 ip>
    user: root
    role: [controlplane,etcd,worker]
    ssh_key_path:
  - address: <node3 ip>
    user: root
    role: [controlplane,etcd,worker]
    ssh_key_path:

addons: |-
  ---
  kind: Namespace
  apiVersion: v1
  metadata:
    name: cattle-system
  ---
  kind: ServiceAccount
  apiVersion: v1
  metadata:
    name: cattle-admin
    namespace: cattle-system
  ---
  kind: ClusterRoleBinding
  apiVersion: rbac.authorization.k8s.io/v1
  metadata:
    name: cattle-crb
    namespace: cattle-system
  subjects:
  - kind: ServiceAccount
    name: cattle-admin
    namespace: cattle-system
  roleRef:
    kind: ClusterRole
    name: cluster-admin
    apiGroup: rbac.authorization.k8s.io
  ---
  apiVersion: v1
  kind: Secret
  metadata:
    name: cattle-keys-ingress
    namespace: cattle-system
  type: Opaque
  data:
    tls.crt:
    tls.key:
  ---
  apiVersion: v1
  kind: Service
  metadata:
    namespace: cattle-system
    name: cattle-service
    labels:
      app: cattle
  spec:
    ports:
    - port: 80
      targetPort: 80
      protocol: TCP
      name: http
    - port: 443
      targetPort: 443
      protocol: TCP
      name: https
    selector:
      app: cattle
  ---
  apiVersion: extensions/v1beta1
  kind: Ingress
  metadata:
    namespace: cattle-system
    name: cattle-ingress-http
    annotations:
      nginx.ingress.kubernetes.io/proxy-connect-timeout: "30"
      nginx.ingress.kubernetes.io/proxy-read-timeout: "1800"
      nginx.ingress.kubernetes.io/proxy-send-timeout: "1800"
  spec:
    rules:
    - host: rancher.example.com
      http:
        paths:
        - backend:
            serviceName: cattle-service
            servicePort: 80
    tls:
    - secretName: cattle-keys-ingress
      hosts:
      - rancher.example.com
  ---
  kind: Deployment
  apiVersion: extensions/v1beta1
  metadata:
    namespace: cattle-system
    name: cattle
  spec:
    replicas: 1
    template:
      metadata:
        labels:
          app: cattle
      spec:
        serviceAccountName: cattle-admin
        containers:
        - image: rancher/rancher:latest
          args:
          - --no-cacerts
          imagePullPolicy: Always
          name: cattle-server
          ports:
          - containerPort: 80
            protocol: TCP
          - containerPort: 443
            protocol: TCP
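The nodes section at the top of that template repeats the same four lines per machine, so it's an easy thing to generate. A rough sketch, assuming every node uses the root user and the same SSH key; rke_nodes_yaml is a made-up helper name, and the key path and IPs in the usage note are placeholders.

```shell
#!/bin/sh
# Sketch: print the "nodes:" section of rancher-cluster.yml.
# $1 is the SSH private key path; the remaining arguments are node IPs.
# rke_nodes_yaml is a hypothetical helper, not part of RKE.
rke_nodes_yaml() {
    key_path="$1"
    shift
    echo "nodes:"
    for ip in "$@"; do
        # Every node takes all three roles, as in the template.
        printf '  - address: %s\n' "$ip"
        printf '    user: root\n'
        printf '    role: [controlplane,etcd,worker]\n'
        printf '    ssh_key_path: %s\n' "$key_path"
    done
}
```

For example, rke_nodes_yaml /root/.ssh/id_rsa 192.0.2.11 192.0.2.12 192.0.2.13 prints three node entries. The two-space indentation matters, since YAML uses it to tell where each entry starts and ends.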
Getting an SSL certificate for the Rancher instance
I SSHed into the load balancer and ran the following to install the Let’s Encrypt certbot, so I could obtain an SSL certificate for my Rancher installation:
$ apt-get update && apt-get install -y software-properties-common && add-apt-repository ppa:certbot/certbot
$ apt-get update && apt-get install -y python-certbot-nginx
I stopped NginX from running so I could spin up a temporary webserver:
$ service nginx stop
I then ran certbot certonly and when prompted, selected option 2 to spin up that temporary webserver.
I was then able to start the NginX server back up:
$ service nginx start
Once the certificate was registered, I had two files:
/etc/letsencrypt/live/rancher.example.com/fullchain.pem
/etc/letsencrypt/live/rancher.example.com/privkey.pem
I base64-encoded the certificate:
$ echo $(base64 /etc/letsencrypt/live/rancher.example.com/fullchain.pem)
I then pasted the value into the tls.crt key (line 49) of the rancher-cluster.yml file, remembering to remove any whitespace and line-breaks so it was pasted as a single, unbroken string (not a YAML multi-line string).
I base64-encoded the certificate key:
$ echo $(base64 /etc/letsencrypt/live/rancher.example.com/privkey.pem)
I then pasted that value into the tls.key value (line 50) of the rancher-cluster.yml file, in the same manner as before.
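The manual whitespace clean-up is needed because base64 wraps its output on some systems, and an unquoted $(...) passed to echo turns those line breaks into spaces. A small sketch that sidesteps this by stripping the newlines directly; b64_oneline is a name I've made up, and it relies only on base64 reading from standard input.

```shell
#!/bin/sh
# Sketch: base64-encode a file as one unbroken line.
# b64_oneline is a hypothetical helper. Piping through tr removes any
# line wraps that base64 adds, so the result can be pasted straight
# into tls.crt or tls.key.
b64_oneline() {
    base64 < "$1" | tr -d '\n'
}
```

Running b64_oneline /etc/letsencrypt/live/rancher.example.com/fullchain.pem gives the value for tls.crt, and the same call on privkey.pem gives the value for tls.key.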
I ran the following to provision the nodes:
$ rke up --config <path to rancher-cluster.yml>
After a while I ended up with an output that ended like this:
INFO[0021] [addons] Executing deploy job..
INFO[0031] [addons] User addons deployed successfully
INFO[0031] Finished building Kubernetes cluster successfully
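If the output is saved to a file, that last line can be checked automatically. A sketch, assuming the log was captured with something like rke up --config rancher-cluster.yml 2>&1 | tee rke.log (the file name rke.log and the function name are my own), and matching only on the message text shown above.

```shell
#!/bin/sh
# Sketch: succeed only if the saved rke output ends with the
# completion message. rke_finished_ok is a hypothetical helper.
rke_finished_ok() {
    tail -n 1 "$1" | grep -q 'Finished building Kubernetes cluster successfully'
}
```

For example, rke_finished_ok rke.log && echo "cluster is up". It says nothing about warnings earlier in the log, so it's still worth reading the whole output.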
The first few runs didn't work for me (they succeeded, but with warnings), because I’d stuffed up the certificate strings.
Once set up, I was able to pop my domain name into my browser and see my Rancher instance, and set up my password.