Network policy validation ensures that your cluster’s firewall actually works, which makes it an important topic if you are using Kubernetes and want to control the network behaviour of your pods. Over the last months we built on the work of Maximillian Bischoff and have now released our Kubernetes network policy validator: illuminatio.
Why Do You Have to Validate Network Policies?
Sometimes network policies are declared but not enforced. This can happen when the nodes of your cluster do not synchronize the policies in time, meaning a policy only takes effect after a considerable delay, maybe minutes, maybe hours. Because of how network policies are implemented in Kubernetes, there is currently no feedback on whether the network plugin has actually enforced a policy. If your network plugin does not support network policies or implements them incorrectly, you will not receive any error message; the policies could even have unwanted side effects. This is a security issue for your Kubernetes cluster if you rely on your policies to work properly, so it is best to validate them and make sure they are in effect.
What is illuminatio?
illuminatio is a command line tool that automatically tests all your network policies in a Kubernetes cluster. It is written in Python and uses the official Kubernetes Python client to interact with a cluster in a similar way to kubectl. It automatically fetches existing network policies from the cluster, creates and executes suitable test cases within the cluster, and reports the results.
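For illustration only, here is a minimal sketch (not illuminatio’s actual code) of how the official Kubernetes Python client can fetch all network policies of a cluster:

# Minimal sketch using the official Kubernetes Python client;
# not illuminatio's actual implementation.
from kubernetes import client, config

config.load_kube_config()                      # uses ~/.kube/config, like kubectl
networking = client.NetworkingV1Api()

# List every NetworkPolicy in the cluster and print where it lives.
for policy in networking.list_network_policy_for_all_namespaces().items:
    print(policy.metadata.namespace, policy.metadata.name)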
How Does illuminatio Work?
illuminatio fetches all network policies of a cluster, evaluates them and derives suitable test cases. It then checks whether pods affected by the policies already exist; if pods required for a test case are missing, illuminatio creates them as dummy pods.
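Purely as an illustration, such a dummy pod could be created with the Kubernetes Python client; the pod name, labels, and image below are placeholders chosen to match a policy’s podSelector, not what illuminatio actually uses:

# Illustrative sketch: create a dummy pod whose labels match a policy's
# podSelector (here: app=web). Name, labels, and image are placeholders.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

dummy = client.V1Pod(
    metadata=client.V1ObjectMeta(name="illuminatio-dummy", labels={"app": "web"}),
    spec=client.V1PodSpec(
        containers=[client.V1Container(name="dummy", image="nginx")]
    ),
)
core.create_namespaced_pod(namespace="default", body=dummy)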
illuminatio will launch the illuminatio runner as a pod inside the cluster in the illuminatio namespace. The pod needs to be run with the capability SYS_ADMIN because it jumps into Linux network namespaces (not to be confused with Kubernetes namespaces). Linux network namespaces are a Linux kernel feature to isolate different processes on the network layer.
The target pod’s network namespace is fetched either with the docker Python library or with crictl, depending on whether you are using Docker or another CRI-compliant runtime. Using the docker library is a workaround for the Docker runtime returning incomplete data when queried with crictl.
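As a rough sketch of the Docker path, the relevant container’s PID can be resolved with the docker Python library and turned into a network namespace path; the container name filter below is hypothetical and only illustrates the idea:

# Sketch: resolve a pod's network namespace path via the docker library.
# The name filter is hypothetical; Docker names pause containers like
# "k8s_POD_<pod>_<namespace>_...".
import docker

docker_client = docker.from_env()
pause = docker_client.containers.list(filters={"name": "k8s_POD_web"})[0]
pid = pause.attrs["State"]["Pid"]              # PID of the container process
netns_path = f"/proc/{pid}/ns/net"             # its network namespace
print(netns_path)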
To check whether a network policy is in effect, the illuminatio runner jumps into an affected pod’s network namespace and performs network requests that the policy should allow or block. The results indicating whether a policy is in effect are written into a dedicated ConfigMap within the cluster.
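Conceptually, the namespace jump plus connectivity check could look like the following sketch, assuming Linux and glibc’s setns(); the PID, target IP, and port are placeholders, and this is not illuminatio’s actual implementation:

# Sketch of the namespace jump; requires CAP_SYS_ADMIN.
import ctypes
import socket

libc = ctypes.CDLL("libc.so.6", use_errno=True)
CLONE_NEWNET = 0x40000000                      # from <sched.h>

# Placeholder PID: the container PID resolved in the previous step.
with open("/proc/12345/ns/net") as ns_file:
    if libc.setns(ns_file.fileno(), CLONE_NEWNET) != 0:
        raise OSError(ctypes.get_errno(), "setns failed (needs CAP_SYS_ADMIN)")

# Sockets created from now on live inside the pod's network namespace,
# so a connect attempt shows whether the policy blocks the traffic.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.settimeout(2)
reachable = sock.connect_ex(("10.96.0.10", 80)) == 0   # placeholder service IP
print("reachable:", reachable)
sock.close()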
The illuminatio CLI waits until all results have been written into the ConfigMap and prints the overall results to the command line.
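A simplified polling loop with the Kubernetes Python client might look like this; the ConfigMap name is an assumption for illustration, not necessarily illuminatio’s exact resource name:

# Sketch: wait for the runner's results in a ConfigMap.
# The ConfigMap name is a placeholder; "illuminatio" is the namespace
# in which illuminatio runs its test resources.
import time
from kubernetes import client, config
from kubernetes.client.rest import ApiException

config.load_kube_config()
core = client.CoreV1Api()

while True:
    try:
        cm = core.read_namespaced_config_map("illuminatio-results", "illuminatio")
        if cm.data:                # the runner has written its results
            print(cm.data)
            break
    except ApiException:
        pass                       # ConfigMap not created yet, keep waiting
    time.sleep(5)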
Getting Started
illuminatio is available as a package on PyPI and can easily be installed with pip:
pip install illuminatio
Now you only need access to a Kubernetes cluster and a suitable kubeconfig file located at
~/.kube/config
illuminatio will use your kubeconfig to interact with the cluster.
Let’s create some resources to perform tests with. First we will create an nginx server as a deployment:
kubectl create deployment web --image=nginx
However, this nginx server can only be reached via its pod IP, which is assigned randomly and can change.
With a Service we can create a stable endpoint for our nginx deployment; we can create it implicitly by exposing the deployment:
kubectl expose deployment web --port 80 --target-port 80
Our nginx server can now always be reached on http://web.default:80.
Finally we will create a network policy to prohibit any ingress traffic to our nginx deployment:
cat <<EOF | kubectl apply -f -
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: web-deny-all
spec:
  podSelector:
    matchLabels:
      app: web
  ingress: []
EOF
Now we are ready to test our setup with illuminatio:
illuminatio run
Starting test generation and run.
Got cases: [NetworkTestCase(from=ClusterHost(namespace=default, podLabels={'app': 'web'}), to=ClusterHost(namespace=default, podLabels={'app': 'web'}), port=-*)]
Generated 1 cases in 0.0730 seconds
FROM             TO               PORT
default:app=web  default:app=web  -*

Using existing cluster role
Creating cluster role binding
TestResults: {'default:app=web': {'default:app=web': {'-*': {'success': True}}}}
Finished running 1 tests in 34.6288 seconds
FROM             TO               PORT  RESULT
default:app=web  default:app=web  -*    success
As the success in the output suggests, the run passed, which means the tested network policy is in effect as expected. This is far more convenient than testing each policy manually by entering your pods interactively with kubectl exec.
During this run illuminatio created several resources in your cluster, which you might want to remove afterwards. This can easily be done with a single command:
illuminatio clean
Starting cleaning resources with policies ['on-request', 'always']
Deleting namespaces ['illuminatio'] with cleanup policy on-request
Deleting namespaces [] with cleanup policy always
...
Deleting SAs in default with cleanup policy always
Finished cleanUp
Note: If you run illuminatio again on the same cluster make sure to include the clean command, as existing resources will otherwise influence the results:
illuminatio clean run
You can run this command as often as you’d like without affecting other resources in the cluster.
Improvements Since the Original Implementation
Many things have changed and improved since the initial implementation which our last post described:
- The overall code quality has improved.
- CI pipelines have been set up.
- In addition to CRI-compliant runtimes like containerd, illuminatio now also supports the Docker runtime on your Kubernetes nodes.
- This was achieved by inspecting the so-called pause container.
- Every Kubernetes pod contains an additional pause container, which holds information such as the network namespace in which the pod runs.
- The functionality of illuminatio is now also tested inside the pipeline by spawning Kubernetes clusters, creating network policies with suitable resources and then running illuminatio against each cluster to validate the policies.
- Initially only Minikube was used as a test environment, both locally and in the pipeline. Despite our efforts, however, it was not possible to run Minikube with the containerd runtime inside a Travis CI VM; a newer Minikube version finally made this possible.
- illuminatio was also released as a Python package on PyPI using PyScaffold.
- illuminatio received documentation for both users and developers to get an overview of its design and use.
How illuminatio Differs from Other Network Policy Validation Tools
Another tool for network policy validation is netassert. Netassert requires you to provide a config file with the test cases to be executed, as well as direct SSH access to the Kubernetes node on which your test pods are running. A further restriction is that it only works if your pods were created by a Deployment.
Netassert uses the docker run --net feature to enter the network namespace of the pod’s pause container. Its biggest restriction, apart from the tedious work of writing your own test cases, is that it can only be used in Kubernetes clusters running the Docker runtime; clusters with other runtimes such as containerd cannot use netassert at all.
There is also Sonobuoy. Sonobuoy is primarily a diagnostic tool for tasks such as conformance testing of Kubernetes clusters, but you can also use it to execute the NetworkPolicy e2e tests.
You only need access to a Kubernetes cluster and a proper kubeconfig, and Sonobuoy is ready to use for this purpose. I have not had much experience with this tool, but it struck me how long it takes to get any results at all; a notice on the command line even tells you that a run can take up to 60 minutes to complete.
Future Work
illuminatio is still in its early days and has a bunch of issues we want to address in the future:
- Egress policies are not supported yet.
- There is no validation of the e2e test results yet.
- Each new run requires a clean beforehand because the runner does not continuously look for new cases.
- Only policies affecting intra-cluster traffic are examined.
- CIDR notation in policies is not supported yet.
You can find the entire issue list on GitHub.
The End
Thanks a lot for reading this blog post! If you liked it or want to provide feedback, make sure to check out our GitHub repository and share your wishes and experiences with illuminatio. We are excited to see your contributions in the next release of illuminatio!