Recently we had the desire to cluster Consul (HashiCorp's K/V store) without calling out to Atlas. We deploy many clusters per day as we are constantly testing, and we want Consul to simply bring itself up without having to reach out over the internet.
So we made a few changes to our Consul setup. Here goes:
Dockerfile:
This is a typical Dockerfile for Consul running on Alpine. Nothing out of the ordinary.
FROM alpine:3.2
MAINTAINER Martin Devlin <martin.devlin@pearson.com>

ENV CONSUL_VERSION    0.6.3
ENV CONSUL_HTTP_PORT  8500
ENV CONSUL_HTTPS_PORT 8543
ENV CONSUL_DNS_PORT   53

RUN apk --update add openssl zip curl ca-certificates jq \
 && cat /etc/ssl/certs/*.pem > /etc/ssl/certs/ca-certificates.crt \
 && sed -i -r '/^#.+/d' /etc/ssl/certs/ca-certificates.crt \
 && rm -rf /var/cache/apk/* \
 && mkdir -p /etc/consul/ssl /ui /data \
 && wget http://releases.hashicorp.com/consul/${CONSUL_VERSION}/consul_${CONSUL_VERSION}_linux_amd64.zip \
 && unzip consul_${CONSUL_VERSION}_linux_amd64.zip \
 && mv consul /bin/ \
 && rm -f consul_${CONSUL_VERSION}_linux_amd64.zip \
 && cd /ui \
 && wget http://releases.hashicorp.com/consul/${CONSUL_VERSION}/consul_${CONSUL_VERSION}_web_ui.zip \
 && unzip consul_${CONSUL_VERSION}_web_ui.zip \
 && rm -f consul_${CONSUL_VERSION}_web_ui.zip

COPY config.json /etc/consul/config.json

EXPOSE ${CONSUL_HTTP_PORT}
EXPOSE ${CONSUL_HTTPS_PORT}
EXPOSE ${CONSUL_DNS_PORT}

COPY run.sh /usr/bin/run.sh
RUN chmod +x /usr/bin/run.sh

ENTRYPOINT ["/usr/bin/run.sh"]
CMD []
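Build and push the image to whatever registry you use; the registry and tag below are just placeholders, not part of our setup.

# hypothetical registry/tag - substitute your own
docker build -t your-registry.example.com/consul:0.6.3 .
docker push your-registry.example.com/consul:0.6.3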
config.json:
And here is the config.json referenced in the Dockerfile.
{ "data_dir": "/data", "ui_dir": "/ui", "client_addr": "0.0.0.0", "ports": { "http" : %%CONSUL_HTTP_PORT%%, "https" : %%CONSUL_HTTPS_PORT%%, "dns" : %%CONSUL_DNS_PORT%% }, "start_join":{ %%LIST_PODIPS%% }, "acl_default_policy": "deny", "acl_datacenter": "%%ENVIRONMENT%%", "acl_master_token": "%%MASTER_TOKEN%%", "key_file" : "/etc/consul/ssl/consul.key", "cert_file": "/etc/consul/ssl/consul.crt", "recursor": "8.8.8.8", "disable_update_check": true, "encrypt" : "%%GOSSIP_KEY%%", "log_level": "INFO", "enable_syslog": false }
If you have read my past Consul blog post, you might notice we have added the following.
"start_join":{ %%LIST_PODIPS%% },
This is important because we are going to have each Consul container query the Kubernetes API using a Kubernetes Token to pull in a list of IPs for the Consul cluster to join up.
Important note: if you are running more than just the default service account token per namespace, you need to explicitly grant READ access to the API for the token associated with this container.
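On clusters that enforce authorization this usually means a read-only role bound to the service account the pod runs under. A rough sketch, assuming an RBAC-enabled cluster and a dedicated 'consul' service account (both are placeholders, not part of the setup above):

# allow the 'consul' service account to list pods in kube-system
kubectl create role consul-pod-reader --verb=get --verb=list --resource=pods -n kube-system
kubectl create rolebinding consul-pod-reader --role=consul-pod-reader \
  --serviceaccount=kube-system:consul -n kube-system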
And here is run.sh:
#!/bin/sh
KUBE_TOKEN=`cat /var/run/secrets/kubernetes.io/serviceaccount/token`
NAMESPACE=`cat /var/run/secrets/kubernetes.io/serviceaccount/namespace`

if [ -z ${CONSUL_SERVER_COUNT} ]; then
  export CONSUL_SERVER_COUNT=3
fi

if [ -z ${CONSUL_HTTP_PORT} ]; then
  export CONSUL_HTTP_PORT=8500
fi

if [ -z ${CONSUL_HTTPS_PORT} ]; then
  export CONSUL_HTTPS_PORT=8243
fi

if [ -z ${CONSUL_DNS_PORT} ]; then
  export CONSUL_DNS_PORT=53
fi

if [ -z ${CONSUL_SERVICE_HOST} ]; then
  export CONSUL_SERVICE_HOST="127.0.0.1"
fi

if [ -z ${CONSUL_WEB_UI_ENABLE} ]; then
  export CONSUL_WEB_UI_ENABLE="true"
fi

if [ -z ${CONSUL_SSL_ENABLE} ]; then
  export CONSUL_SSL_ENABLE="true"
fi

if [ ${CONSUL_SSL_ENABLE} = "true" ]; then
  if [ ! -z ${CONSUL_SSL_KEY} ] && [ ! -z ${CONSUL_SSL_CRT} ]; then
    echo ${CONSUL_SSL_KEY} > /etc/consul/ssl/consul.key
    echo ${CONSUL_SSL_CRT} > /etc/consul/ssl/consul.crt
  else
    openssl req -x509 -newkey rsa:2048 -nodes -keyout /etc/consul/ssl/consul.key -out /etc/consul/ssl/consul.crt -days 365 -subj "/CN=consul.kube-system.svc.cluster.local"
  fi
fi

export CONSUL_IP=`hostname -i`

if [ -z ${ENVIRONMENT} ] || [ -z ${MASTER_TOKEN} ] || [ -z ${GOSSIP_KEY} ]; then
  echo "Error: ENVIRONMENT, MASTER_TOKEN and GOSSIP_KEY environment vars must be set"
  exit 1
fi

LIST_IPS=`curl -sSk -H "Authorization: Bearer $KUBE_TOKEN" https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_PORT_443_TCP_PORT/api/v1/namespaces/$NAMESPACE/pods | jq '.items[] | select(.status.containerStatuses[].name=="consul") | .status .podIP'`

#basic test to see if we have ${CONSUL_SERVER_COUNT} number of containers alive
VALUE='0'

while [ $VALUE != ${CONSUL_SERVER_COUNT} ]; do
  echo "waiting 10s on all the consul containers to spin up"
  sleep 10
  LIST_IPS=`curl -sSk -H "Authorization: Bearer $KUBE_TOKEN" https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_PORT_443_TCP_PORT/api/v1/namespaces/$NAMESPACE/pods | jq '.items[] | select(.status.containerStatuses[].name=="consul") | .status .podIP'`
  echo "$LIST_IPS" | sed -e 's/$/,/' -e '$s/,//' > tester
  VALUE=`cat tester | wc -l`
done

LIST_IPS_FORMATTED=`echo "$LIST_IPS" | sed -e 's/$/,/' -e '$s/,//'`

sed -i "s,%%ENVIRONMENT%%,$ENVIRONMENT," /etc/consul/config.json
sed -i "s,%%MASTER_TOKEN%%,$MASTER_TOKEN," /etc/consul/config.json
sed -i "s,%%GOSSIP_KEY%%,$GOSSIP_KEY," /etc/consul/config.json
sed -i "s,%%CONSUL_HTTP_PORT%%,$CONSUL_HTTP_PORT," /etc/consul/config.json
sed -i "s,%%CONSUL_HTTPS_PORT%%,$CONSUL_HTTPS_PORT," /etc/consul/config.json
sed -i "s,%%CONSUL_DNS_PORT%%,$CONSUL_DNS_PORT," /etc/consul/config.json
sed -i "s,%%LIST_PODIPS%%,$LIST_IPS_FORMATTED," /etc/consul/config.json

cmd="consul agent -server -config-dir=/etc/consul -dc ${ENVIRONMENT} -bootstrap-expect ${CONSUL_SERVER_COUNT}"

if [ "${CONSUL_DEBUG}" = "true" ]; then
  ls -lR /etc/consul
  cat /etc/consul/config.json
  echo "${cmd}"
  sed -i "s,INFO,DEBUG," /etc/consul/config.json
fi

consul agent -server -config-dir=/etc/consul -dc ${ENVIRONMENT} -bootstrap-expect ${CONSUL_SERVER_COUNT}
Let's go through the options here. Notice the script sets some defaults, so we may or may not include these when starting up the container.
LIST_PODIPS = a list of Consul IPs for the consul node to join to
CONSUL_WEB_UI_ENABLE = true|false – if you want a web ui
CONSUL_SSL_ENABLE = true|false – enable SSL for cluster communication
If true expects:
CONSUL_SSL_KEY – SSL Key
CONSUL_SSL_CRT – SSL Cert
First we pull in the Kubernetes token and namespace. These are mounted at the default service account location in every container, so this should work for your needs.
KUBE_TOKEN=`cat /var/run/secrets/kubernetes.io/serviceaccount/token`
NAMESPACE=`cat /var/run/secrets/kubernetes.io/serviceaccount/namespace`
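If you want to see this for yourself, the service account secret is mounted into every pod alongside the cluster CA cert:

# default service account mount inside any pod
ls /var/run/secrets/kubernetes.io/serviceaccount/
# ca.crt  namespace  token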
And then we use those ENV VARS with some fancy jq to get a list of IPs formatted so we can shove them into config.json.
LIST_IPS=`curl -sSk -H "Authorization: Bearer $KUBE_TOKEN" https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_PORT_443_TCP_PORT/api/v1/namespaces/$NAMESPACE/pods | jq '.items[] | select(.status.containerStatuses[].name=="consul") | .status .podIP'`
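For reference, the jq output is one quoted IP per line, and the sed further down simply turns that into a comma-separated list that drops straight into start_join. The addresses here are made up:

# hypothetical output of the curl | jq pipeline with three consul pods running
#   "10.2.1.5"
#   "10.2.2.7"
#   "10.2.3.9"
# after: echo "$LIST_IPS" | sed -e 's/$/,/' -e '$s/,//'
#   "10.2.1.5",
#   "10.2.2.7",
#   "10.2.3.9"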
And we wait until ${CONSUL_SERVER_COUNT} containers have started up:
#basic test to see if we have ${CONSUL_SERVER_COUNT} number of containers alive
echo "$LIST_IPS" | sed -e 's/$/,/' -e '$s/,//' > tester
VALUE=`cat tester | wc -l`

while [ $VALUE != ${CONSUL_SERVER_COUNT} ]; do
  echo "waiting 10s on all the consul containers to spin up"
  sleep 10
  LIST_IPS=`curl -sSk -H "Authorization: Bearer $KUBE_TOKEN" https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_PORT_443_TCP_PORT/api/v1/namespaces/$NAMESPACE/pods | jq '.items[] | select(.status.containerStatuses[].name=="consul") | .status .podIP'`
  echo "$LIST_IPS" | sed -e 's/$/,/' -e '$s/,//' > tester
  VALUE=`cat tester | wc -l`
done
You'll notice this could certainly be cleaner, but it's working.
Then we inject the IPs into the config.json:
sed -i "s,%%LIST_PODIPS%%,$LIST_IPS_FORMATTED," /etc/consul/config.json
That injection simplifies our consul runtime command quite nicely:
consul agent -server -config-dir=/etc/consul -dc ${ENVIRONMENT} -bootstrap-expect ${CONSUL_SERVER_COUNT}
Alright, so all of that goes into the Consul image.
Now let's have a look at the Kubernetes config files.
consul.yaml
apiVersion: v1
kind: ReplicationController
metadata:
  namespace: kube-system
  name: consul
spec:
  replicas: ${CONSUL_COUNT}          # number of consul containers
  # selector identifies the set of Pods that this
  # replication controller is responsible for managing
  selector:
    app: consul
  template:
    metadata:
      labels:
        # Important: these labels need to match the selector above
        # The api server enforces this constraint.
        pool: consulpool
        app: consul
    spec:
      containers:
        - name: consul
          env:
            - name: "ENVIRONMENT"
              value: "SOME_ENVIRONMENT_NAME"             # some name
            - name: "MASTER_TOKEN"
              value: "INITIAL_MASTER_TOKEN_FOR_ACCESS"   # UUID preferable
            - name: "GOSSIP_KEY"
              value: "ENCRYPTION_KEY_FOR_GOSSIP"         # some random key for encryption
            - name: "CONSUL_DEBUG"
              value: "false"                             # to debug or not to debug
            - name: "CONSUL_SERVER_COUNT"
              value: "${CONSUL_COUNT}"                   # integer value for number of containers
          image: 'YOUR_CONSUL_IMAGE_HERE'
          resources:
            limits:
              cpu: ${CONSUL_CPU}       # how much CPU are you giving the container
              memory: ${CONSUL_RAM}    # how much RAM are you giving the container
          imagePullPolicy: Always
          ports:
            - containerPort: 8500
              name: ui-port
            - containerPort: 8400
              name: alt-port
            - containerPort: 53
              name: udp-port
            - containerPort: 8543
              name: https-port
            - containerPort: 8500
              name: http-port
            - containerPort: 8301
              name: serflan
            - containerPort: 8302
              name: serfwan
            - containerPort: 8600
              name: consuldns
            - containerPort: 8300
              name: server
#      nodeSelector:                   # optional
#        role: minion                  # optional
You might notice we still need to move this to a Deployment/ReplicaSet instead of a ReplicationController.
These vars should look familiar by now; there is a quick sketch of how we fill them in after the list.
CONSUL_COUNT = number of consul containers we want to run
CONSUL_HTTP_PORT = set port for http
CONSUL_HTTPS_PORT = set port for https
CONSUL_DNS_PORT = set port for dns
ENVIRONMENT = consul datacenter name
MASTER_TOKEN = the root token you want to use for super admin access to the cluster
GOSSIP_KEY = an encryption key for cluster communication
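We fill in those ${...} placeholders as part of our deploy tooling before the yaml ever reaches kubectl. Here is a minimal sketch of that step using envsubst with made-up resource values; your pipeline may well do this differently:

# hypothetical values; render the placeholders, then create the replication controller
export CONSUL_COUNT=3
export CONSUL_CPU=100m
export CONSUL_RAM=256Mi
envsubst < consul.yaml | kubectl create -f -

Once the pods are running, it is worth a quick sanity check that the servers actually found each other (the pod name below is a placeholder):

# each server should show up as "alive" once -bootstrap-expect is satisfied
kubectl exec -n kube-system consul-xxxxx -- consul members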
consul-svc.yaml
---
apiVersion: v1
kind: Service
metadata:
  name: consul
  namespace: kube-system
  labels:
    name: consul
spec:
  ports:
    # the port that this service should serve on
    - name: http
      port: 8500
    - name: https
      port: 8543
    - name: rpc
      port: 8400
    - name: serflan
      port: 8301
    - name: serfwan
      port: 8302
    - name: server
      port: 8300
    - name: consuldns
      port: 53
  # label keys and values that must match in order to receive traffic for this service
  selector:
    pool: consulpool
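With the service in place, anything else in the cluster can reach Consul at a stable DNS name. A hypothetical in-cluster check from another pod (the token is whatever you set as MASTER_TOKEN):

# ACLs default to deny, so pass a valid token on the query string
curl -s "http://consul.kube-system.svc.cluster.local:8500/v1/catalog/nodes?token=${MASTER_TOKEN}"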
consul-ing.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: consul
  namespace: kube-system
  labels:
    ssl: "true"
    httpsBackend: "true"
    httpsOnly: "true"
spec:
  rules:
    - host: consul.%%ENVIRONMENT%%.%%DOMAIN%%
      http:
        paths:
          - backend:
              serviceName: consul
              servicePort: 8543
            path: /
We run ingress controllers, so this provisions an ingress that makes Consul externally available.
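Once DNS points at the ingress, a quick external sanity check; the hostname is a placeholder for whatever %%ENVIRONMENT%% and %%DOMAIN%% render to in your setup:

# -k because the cert generated in run.sh is self-signed
curl -sk https://consul.myenv.example.com/v1/status/leader
# should return the current leader, e.g. "10.2.1.5:8300"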
@devoperandi