Container orchestration with Kubernetes
Orchestration grew out of the need to automate, coordinate, and manage systems. Today, we’ll take a look at how we used to handle automation and delivery in the past, and at the main orchestration tool used in the dev scene today.
Why orchestration?#
Over time, we evolved from managing delivery of services manually through FTP, standardizing processes by writing complex shell scripts, delivering assets through rsync + ssh, or even using Git hooks and checking out the latest branch on remote servers.
More than a decade has passed since LAMP developer was a job title, and the traditional tools faded into better tooling for configuration management or provisioning. And eventually, container orchestration.
Configuration Management - Chef, Puppet, Ansible, and SaltStack were designed to help install and manage applications or software on servers.
Provisioning - AWS CloudFormation or Terraform were designed to provision the server infrastructure (load balancers, databases, network topology, etc.). CloudFormation is proprietary; Terraform, in turn, contacts the cloud provider’s API to accomplish this.
These terms are not mutually exclusive, as some of the configuration management tools can offer some degree of provisioning and vice-versa.
What’s important to understand thus far is that these formed the base for what is now understood as “Infrastructure as Code” (IaC). The purpose is to move away from non-standardized shell script processes or manual labour, simplifying management and configuration.
Container orchestration - is the process of automating the management of container-based service apps across clusters. To understand it we need to look at what containers are and why they exist.
A container guarantees that an application runs the same everywhere: it’s a distributable unit that includes libraries, binaries, dependencies, and configuration on top of a Linux operating system that can be stripped down to its bare minimum. In the past, to run an application we’d have to make sure the environment was set up correctly, which caused the classic “it works on my computer” problem (this was also addressed by tools like Vagrant, which use VMs instead to provide a portable development environment).
We often had to spend a lot of time troubleshooting and figuring out dependencies, so that our staging and production servers were set up correctly for our application to run.
Containers solved this by packaging every single requirement into their own contained and distributable package.
A single container in a development environment can easily be run through Docker and Dockerfiles. For multi-container applications you can even work with Docker Compose - I’m using Docker here as an example because its tools are easy to use through the CLI in the comfort of your machine.
So, while you can run multi-container applications this way, you wouldn’t want to run all of your container applications on one machine with limited resources in production. This is one of the main reasons we’d want to distribute our container applications across multiple machines, and with all the complexity that comes with this sort of setup, we need a container orchestration system!
A container orchestration system is a tool that helps manage how container instances are created, managed at runtime, scaled, placed on the underlying infrastructure (one or more servers, which we call the cluster), how they communicate with each other, and so on, beyond the “development” environment.
We’ll be looking at Kubernetes as a container orchestrator that delivers these capabilities.
Kubernetes, the container orchestrator#
Kubernetes (K8s) is an open-source system for automating deployment, scaling, and management of containerized applications.
There are different distributions of Kubernetes, such as the “vanilla upstream” (not a distro, but the GitHub repositories with the pure Kubernetes project source code), hosted offerings (EKS, GKE, DOKS), “vanilla upstream” installers (kubeadm, kops, kubicorn), kind (Kubernetes in Docker), Rancher k3s, etc.
Some basic use cases can be listed as:
- Run 2 container applications using the Docker image songs/api:v1.0
- Run 2 container applications using the Docker image songs/client:v1.0
- Add load balancers for internal and public services
- Basic autoscaling
- Update the containers with the latest images songs/xxx:v2.0
- Keep services running while upgrading
- Long-running services, batch (one-time) or CRON-like jobs
- Access control (who can access which resource)
- And much more..
A basic K8s architecture has a logical part and physical parts: the infrastructure and the applications required for K8s to work:
The master, also referred to as the control plane, functions as the brain of a cluster and is composed of the following components:
- The API server, used to interact with the cluster
- Scheduler
- Controller manager
- etcd, a key/value store and the database of the cluster. Most clusters store their state in an etcd service. Etcd needs at least one node to function, and three to provide high availability.
The control plane usually runs on a dedicated node, except on single-node development clusters, such as when running minikube. In AKS, GKE, and EKS, the control plane is invisible; we only have a Kubernetes API endpoint.
Nodes (formerly called minions) are where our applications actually run:
- Container runtime (typically Docker)
- The “node agent” (kubelet), the agent that connects to the API server, reports the node status, and obtains the list of containers to run
- Network proxy (kube-proxy)
Kubernetes API is mostly a RESTful API that allows us to CRUD resources (create, read, update, delete). Some of these are:
- Node, a machine that runs in the cluster
- Pod, group of containers running in a node
- Service, a network endpoint to connect to one or multiple containers
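To make the RESTful nature concrete, here’s a minimal sketch: once you have a cluster running (see the next section), kubectl proxy exposes the API locally and you can read resources with plain curl. The port and paths below are the standard ones, but treat this as an illustration rather than part of the original setup:
# expose the API server on localhost without dealing with authentication
kubectl proxy --port=8001 &
# list nodes and the pods in the default namespace via the REST API
curl http://localhost:8001/api/v1/nodes
curl http://localhost:8001/api/v1/namespaces/default/pods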
A Pod is an abstraction of one or more containers, the smallest deployable unit in Kubernetes.
- It is a concept that only exists in K8s, not tied to the container runtime (Docker)
- A pod must have at least one container, and can have multiple containers if necessary (we generally have just a single container in a Pod)
- Kubernetes can NOT manage containers directly
- IP addresses are assigned to Pods, not containers
- Containers inside a Pod share the same hostname and volumes
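As a hedged illustration (the names here are arbitrary, not something we’ll reuse later), a minimal Pod manifest with a single container looks like this:
apiVersion: v1
kind: Pod
metadata:
  name: hello-pod
spec:
  containers:
  - name: hello
    image: alpine
    command: ["ping", "1.1.1.1"]
It could be applied with kubectl apply -f hello-pod.yaml, although in practice we’ll mostly let higher-level resources create Pods for us.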
Installing Kubernetes for learning#
I’m a macOS user (visit the Kubernetes docs for other platforms) and this is the requirement to have Kubernetes running locally:
- Docker Desktop
Assuming you have Docker Desktop already installed, open Preferences > Kubernetes > Enable Kubernetes, apply, and restart!
You may want to disable the Kubernetes engine when not using it, since this takes about 5-10% of your CPU power.
Note: When enabling it on my machine (macOS Catalina), “Kubernetes is starting…” hung indefinitely, but uninstalling and re-installing Docker fixed it. More details here.
Once Docker displays that Kubernetes is running in your macOS system bar (top right), run the help command for the manual:
kubectl help
You can also find more information at: https://kubernetes.io/docs/reference/kubectl/overview/
Since we are going to use different tools besides kubectl, to make sure we have a consistent experience we can use shpod, which provides a container with a shell inside the cluster and standard Linux tools (jq, helm, stern, curl, shell auto-complete, etc.) that we’ll use throughout the article series.
To set up the container (it waits for attachment):
kubectl apply -f https://k8smastery.com/shpod.yaml
Attach to that container, giving a shell inside the cluster:
kubectl attach --namespace=shpod -ti shpod
To delete:
kubectl delete -f https://k8smastery.com/shpod.yaml
The shpod.sh script will:
- apply the shpod.yaml manifest to your cluster
- wait for the pod shpod to be ready
- attach to that pod
- delete resources created by the manifest when you exit the pod
Kubectl#
kubectl (“cube control”) is a rich CLI tool around the Kubernetes API, which means that whatever we do with the CLI, we can do directly with the API. kubectl uses a configuration file located at ~/.kube/config. The configuration file and parameters can either be passed as a file with --kubeconfig or as individual flags such as --server, --user, etc.
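For example, as a quick sketch (the kubeconfig path and server address below are illustrative, not from this setup), we can inspect the active configuration or point kubectl somewhere else explicitly:
# show the contexts and the configuration currently in use
kubectl config get-contexts
kubectl config view --minify
# use a different kubeconfig file, or target an API server directly
kubectl get nodes --kubeconfig ~/other-cluster.yaml
kubectl get nodes --server https://127.0.0.1:6443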
Let’s get started and check the composition of our cluster:
kubectl get node
Similarly, we could do no, node, or nodes.
This will return the hostname of the machine; for a single-node setup, which is our case so far, that’s the local machine.
NAME STATUS ROLES AGE VERSION
docker-desktop Ready master 12h v1.19.3
The get command is important, as it’s used to get resource types, such as node.
We can pass flags such as -o wide, -o yaml, or -o json. Or pipe the output to jq:
kubectl get node -o json | jq ".items[] | {name: .metadata.name} + .status.capacity"
To learn more about the jq CLI tool, check the documentation here. If you don’t have jq on your local machine, feel free to use the shpod mentioned above.
We can get extended, human-readable info by running kubectl describe <resource-type-name>/<resource-name>:
kubectl get no
NAME STATUS ROLES AGE VERSION
docker-desktop Ready master 12h v1.19.3
kubectl describe node/docker-desktop
Name: docker-desktop
Roles: master
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=docker-desktop
kubernetes.io/os=linux
node-role.kubernetes.io/master=
...
We can list all available resource types, as documented by kubectl help (“Print the supported API resources on the server”):
kubectl api-resources
Similarly, we can get a description about each type:
kubectl explain <resource-type-name>
kubectl explain pods
kubectl explain pods.spec
kubectl explain pods.spec.volumes
To list all sub fields:
kubectl explain <resource-type-name> --recursive
The Kubernetes documentation can be found here. Bear in mind that vendor K8s distributions extend the list of options that are accessible through the CLI and not available in the standard API docs.
We can get other resource types as stated, so let’s look at Services.
kubectl get services
There are shorter versions of services, such as svc.
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 12h
A service is explained as:
kubectl explain services
Service is a named abstraction of software service (for example, mysql)
consisting of local port (for example 3306) that the proxy listens on, and
the selector that determines which pods will answer requests sent through
the proxy.
If we try pods, we get the message “No resources found in default namespace”.
kubectl get pods
No resources found in default namespace.
At this point you can probably guess what comes next: running “kubectl get namespace”, or its short form “ns”:
kubectl get namespace
kubectl get ns
NAME STATUS AGE
default Active 12h
kube-node-lease Active 12h
kube-public Active 12h
kube-system Active 12h
shpod Active 60m
Pick the shpod namespace we started earlier (we can also use the short flag -n):
kubectl get pods --namespace=shpod
kubectl get pods -n shpod
NAME READY STATUS RESTARTS AGE
shpod 1/1 Running 0 62m
The explainer for namespaces says:
Namespace provides a scope for Names.
Use of multiple namespaces is optional.
You can get a list across all namespaces regardless, with --all-namespaces or the shorter flag -A:
kubectl get pods --all-namespaces
kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-f9fd979d6-4v6z6 1/1 Running 0 13h
kube-system coredns-f9fd979d6-gcbx6 1/1 Running 0 13h
kube-system etcd-docker-desktop 1/1 Running 0 12h
kube-system kube-apiserver-docker-desktop 1/1 Running 0 12h
kube-system kube-controller-manager-docker-desktop 1/1 Running 0 12h
kube-system kube-proxy-f5q94 1/1 Running 0 13h
kube-system kube-scheduler-docker-desktop 1/1 Running 0 12h
kube-system storage-provisioner 1/1 Running 0 12h
kube-system vpnkit-controller 1/1 Running 0 12h
shpod shpod 1/1 Running 0 63m
Although our concern should be within our own namespace, as we only want to see what’s specific to our service applications, this might come in handy.
In kube-system you’ll find the essentials mentioned above, like etcd, kube-scheduler, kube-apiserver, etc.
There are other namespaces apart from default and kube-system: kube-node-lease and kube-public.
The kube-public namespace, as explained in the answer here, contains a single ConfigMap object, cluster-info, that aids discovery and security bootstrap and is readable without authentication. It is mainly used during installation.
We’ll look into ConfigMaps a bit later.
kubectl -n kube-public get configmaps
NAME DATA AGE
cluster-info 2 13h
kubectl -n kube-public get configmap cluster-info -o yaml
apiVersion: v1
data:
jws-kubeconfig-abcdef: eyJhbGciOiJIUzI1NiIsImtpZCI6ImFiY2RlZiJ9..3--vDb1wbI7Lh-Lc_Q4kHTRUZTsiXDsBdJPLqtRW5C4
kubeconfig: |
apiVersion: v1
clusters:
...
kube-node-lease works as a keepalive/healthcheck ping system towards the control plane. You can read more about it here.
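Each node keeps a Lease object in that namespace, which we can list (a small sketch, using the standard resource name):
kubectl -n kube-node-lease get leases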
Running our first containers#
First, we can’t create a container directly; we need a Pod.
Similar to a docker run command, we can create a pod pingpong, use the Docker image alpine, and execute a command, such as ping.
kubectl run pingpong --image alpine ping 1.1.1.1
kubectl get all
NAME READY STATUS RESTARTS AGE
pod/pingpong 1/1 Running 0 56m
Alternatively, we can create a Deployment, which also creates a ReplicaSet. But to run a command as we did above, we need to pass a YAML file.
kubectl create deployment pingpong --image=alpine --replicas=1
kubectl get all
NAME READY STATUS RESTARTS AGE
pod/pingpong 1/1 Running 0 56m
pod/pingpong-85f7749846-hgf69 0/1 CrashLoopBackOff 12 38m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 14h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/pingpong 0/1 1 0 38m
NAME DESIRED CURRENT READY AGE
replicaset.apps/pingpong-85f7749846 1 1 0 38m
In the list provided by get all, we get all the resources in our cluster. The service/kubernetes already existed, as discussed earlier; it’s the API connection point for anything in our cluster that needs to communicate with the API.
Deployment resource provides a declarative interface for a Pod resource and a ReplicaSet resource. You describe a desired state in a Deployment, and the Deployment Controller changes the actual state to the desired state at a controlled rate. You can define Deployments to create new ReplicaSets, or to remove existing Deployments and adopt all their resources with new Deployments. Each Deployment resource requires a unique Deployment name. Kubernetes resources are identified by their names, so the name must be unique in the target namespace. More detailed information here.
- allows scaling, rolling updates, rollbacks
- canary deployments, details here
- delegates Pod management to ReplicaSets
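As a sketch of what that looks like in practice (assuming the pingpong Deployment created above, where kubectl create deployment names the container alpine after its image; the tag is just an example), a rolling update and rollback could be driven with:
# roll out a new image, watch the rollout, and undo it if needed
kubectl set image deployment/pingpong alpine=alpine:3.18
kubectl rollout status deployment/pingpong
kubectl rollout undo deployment/pingpong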
A ReplicaSet’s purpose is to maintain a stable set of replica Pods running at any given time. As such, it is often used to guarantee the availability of a specified number of identical Pods. A ReplicaSet resource monitors the Pod resources to ensure that the required number of instances are running. More details here.
- ensures that a given number of identical Pods are running
- allows scaling
- rarely used directly
Pods are the smallest deployable units of computing that you can create and manage in Kubernetes. A Pod resource configures one or more Container resources. Container resources reference a Docker container image and provide all the additional configuration required for Kubernetes to deploy, run, expose, monitor, and secure the Docker container. More details here.
Deployment > ReplicaSet > Pod: these are abstractions, layers of different functionality split into different purposes, that allow us flexibility in how we use Kubernetes.
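We can see that chain in the ownerReferences metadata; a small sketch, assuming the pingpong Deployment above is still running (its resources carry the app=pingpong label):
# a Pod is owned by a ReplicaSet, which in turn is owned by the Deployment
kubectl get pods -l app=pingpong -o jsonpath='{.items[0].metadata.ownerReferences[0].kind}'
kubectl get rs -l app=pingpong -o jsonpath='{.items[0].metadata.ownerReferences[0].kind}'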
It’s more important to understand the concepts at this point, as you can always use the kubectl reference, here, to run the commands.
You might not find all commands, such as:
kubectl delete deployment pingpong
As with kubectl run, we can override the Dockerfile CMD in a Deployment by passing a YAML file containing the spec, command, and args fields:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: alpine-deployment
  labels:
    app: alpine
spec:
  replicas: 3
  selector:
    matchLabels:
      app: alpine
  template:
    metadata:
      labels:
        app: alpine
    spec:
      containers:
      - name: alpine
        image: alpine
        command: ["ping"]
        args: ["1.1.1.1"]
        ports:
        - containerPort: 80
And then run the command to generate the Deployment, ReplicaSet, and Pods:
kubectl apply -f alpine-deployment.yaml
We can then get all the resources in the cluster and confirm:
kubectl get all
NAME READY STATUS RESTARTS AGE
pod/alpine-deployment-556cbc76fb-hx54q 1/1 Running 0 2m23s
pod/alpine-deployment-556cbc76fb-mnjds 1/1 Running 0 2m23s
pod/alpine-deployment-556cbc76fb-zcngw 1/1 Running 0 2m23s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 15h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/alpine-deployment 3/3 3 3 2m23s
NAME DESIRED CURRENT READY AGE
replicaset.apps/alpine-deployment-556cbc76fb 3 3 3 2m23s
Similarly, we can check the logs, where deploy is short for deployment and alpine-deployment is the name.
kubectl logs deploy/alpine-deployment
PING 1.1.1.1 (1.1.1.1): 56 data bytes
64 bytes from 1.1.1.1: seq=0 ttl=37 time=25.266 ms
64 bytes from 1.1.1.1: seq=1 ttl=37 time=14.619 ms
64 bytes from 1.1.1.1: seq=2 ttl=37 time=11.260 ms
This will only return the log output for one Pod, but we can check the logs for a specific Pod.
kubectl logs alpine-deployment-556cbc76fb-hx54q
To summarize, for kubectl logs we can either pass a pod name or a type/name. In the deploy/alpine-deployment example above, it picks the very first Pod by default.
Other options are --tail <number> or --tail <number> --follow.
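If you want logs from all the Pods of the Deployment rather than just the first one, you can also select by label; a sketch using the app=alpine label from the manifest above:
kubectl logs -l app=alpine --tail 5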
Scaling our application#
I’ve set the number of replicas to 3 in the YAML file shared previously, but we can scale this up.
kubectl scale deploy/alpine-deployment --replicas 10
We do this to change the declarative spec, which means scaling deployment alpine-deployment to a given number regardless of what was there before.
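Basic autoscaling, mentioned in the use cases earlier, can be layered on top of the same Deployment; a sketch that assumes a metrics server is installed and CPU requests are set on the containers (neither is part of this setup yet):
# scale between 3 and 10 replicas based on CPU usage
kubectl autoscale deployment alpine-deployment --min 3 --max 10 --cpu-percent 80
kubectl get hpa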
ReplicaSet in action#
Let’s see the ReplicaSet in action by watching the stream of logs from our Pod containers and deleting a Pod to see the effect it causes.
kubectl logs deploy/alpine-deployment --tail 1 --follow
Found 3 pods, using pod/alpine-deployment-556cbc76fb-zcngw
64 bytes from 1.1.1.1: seq=3301 ttl=37 time=13.381 ms
64 bytes from 1.1.1.1: seq=3302 ttl=37 time=12.077 ms
64 bytes from 1.1.1.1: seq=3303 ttl=37 time=11.508 ms
...
In a separate terminal window, while the log stream outputs, we execute:
kubectl delete pod/alpine-deployment-556cbc76fb-zcngw
We should get a response message stating the pod is deleted. Kubernetes does it gracefully, which means it takes some time to terminate the process. For a short period of time (Kubernetes by default gives Docker a 30-second grace period to stop the container), you should see a 4th Pod being created while the target is Terminating, as follows:
kubectl get all
NAME READY STATUS RESTARTS AGE
pod/alpine-deployment-556cbc76fb-598kj 1/1 Running 0 5m21s
pod/alpine-deployment-556cbc76fb-hx54q 1/1 Running 0 57m
pod/alpine-deployment-556cbc76fb-rfwsf 1/1 Running 0 7s
pod/alpine-deployment-556cbc76fb-zcngw 1/1 Terminating 0 57m
This is because the ReplicaSet resource monitors the Pod resources to ensure that the required number of instances are running.
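The 30-second grace period itself can be tuned per Pod; a minimal sketch of the relevant fragment of the Pod template spec (the value 10 is just an example, not something we set earlier):
    spec:
      terminationGracePeriodSeconds: 10   # default is 30
      containers:
      - name: alpine
        image: alpine
        command: ["ping"]
        args: ["1.1.1.1"]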
Single run containers#
In this section we’ll look at Pods that run a single time and do not restart; for these, we create Jobs or Pods instead of Deployments. We’ll also look into CronJobs.
Prior to version 1.18, the flags to achieve this were:
kubectl run --restart=OnFailure
kubectl run --restart=Never
Similarly, we can create CronJobs by using the flag:
kubectl run --schedule=...
Under the hood, these commands invoked “generators” to create resource descriptions that we could write ourselves in YAML (there are other ways, but YAML is more typical); we have an example of this in the previous topic.
For the current 1.18+, we use the resource description to achieve it; you can also check the original documentation here:
apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
  backoffLimit: 4
Applied by:
kubectl apply -f https://kubernetes.io/examples/controllers/job.yaml
kubectl describe jobs/pi
Name: pi
Namespace: default
Selector: controller-uid=c9948307-e56d-4b5d-8302-ae2d7b7da67c
Labels: controller-uid=c9948307-e56d-4b5d-8302-ae2d7b7da67c
job-name=pi
...
We can delete it by using one of the delete command formats:
kubectl delete jobs/pi
kubectl delete -f ./job.yaml
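CronJobs, mentioned earlier, follow the same pattern; here is a minimal sketch (the name and schedule are illustrative; on clusters older than 1.21, such as the v1.19 Docker Desktop one here, the apiVersion would be batch/v1beta1):
apiVersion: batch/v1
kind: CronJob
metadata:
  name: hello-cron
spec:
  schedule: "*/5 * * * *"   # every five minutes
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: alpine
            command: ["date"]
          restartPolicy: OnFailure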
Better logs#
Stern allows you to tail multiple pods on Kubernetes and multiple containers within the pod. Each result is color coded for quicker debugging.
To install it on macOS with Homebrew, run:
brew install stern
We then can run:
stern --tail 1 <deployment-name>
It’ll output log messages prefixed with the pod/container they come from, with better formatting. Check the documentation for details.
Be careful when using stern because, if not used properly, you might stream the logs of all the Pods in the current namespace, opening one connection for each container. If thousands of containers are running, this can put some stress on the API server.
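To limit that scope, you can select Pods by label instead of tailing everything; a sketch using the app=alpine label from before:
stern --selector app=alpine --tail 1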
WIP#
References:
https://www.digitalocean.com/community/curriculums/kubernetes-for-full-stack-developers