Kubernetes and Docker are great tools to manage your microservices, but operators and developers need tools to debug those microservices if things go south. Log messages and application metrics are the usual tools in these cases. To centralize access to log events, the Elastic Stack with Elasticsearch and Kibana is a well-known toolset. In this blog post I want to show you how to integrate the logging of Kubernetes with the Elastic Stack. To start off, I will give an introduction to the logging mechanism of Kubernetes, then I'll show you how to collect the resulting log events and ship them into the Elastic Stack. I also provide a GitHub repository with a working demo. Finally, I highlight some considerations for production deployments.
Logging in Kubernetes
Kubernetes recommends that applications log to the standard streams stdout and stderr. Logging to these streams has many advantages. First of all, both streams have been part of Unix systems for decades, so the standard library of every major programming language supports writing to them. Nevertheless, it is advisable to use a mature logging framework to manage log verbosity and log format. Secondly, logging to stdout and stderr does not involve any network protocol. Network logging protocols like syslog or GELF solve the problem of shipping log messages to a central destination, but they come at a cost: developers need to implement those protocols and handle network errors. Finally, the two streams are endless by nature and therefore a natural fit for an endless stream of log messages. This is why many modern manifestos for good application design, such as the twelve factor app manifesto, recommend logging to stdout.
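For example, in Go a few lines of standard library code are enough to write log messages to stdout (a minimal sketch, not taken from the demo applications):

package main

import (
	"log"
	"os"
)

func main() {
	// The standard library logs to stderr by default; pointing it at
	// stdout is a one-liner. No network protocol is involved.
	logger := log.New(os.Stdout, "", log.LstdFlags|log.LUTC)
	logger.Println("Using HTTP port: 9090")
}

A dedicated logging framework adds log levels and structured output on top of this.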
Kubernetes logs the content of the stdout and stderr streams of a pod to a file. It creates one file for each container in a pod. The default location for these files is /var/log/containers. The filename contains the pod name, the namespace of the pod, the container name, and the container ID. The file contains one JSON object per line written to stdout or stderr. Kubernetes exposes the content of the log file to clients via its API. The following example shows the content of the file for a Kubernetes dashboard pod and the output of the kubectl logs command.
$ cat /var/log/containers/kubernetes-dashboard-udemm_kube-system_kubernetes-dashboard-5e4b95ee53705d1b664ad64540e9ad072bd8c8908373ad26c0679e178587a82b.log
{"log":"Using HTTP port: 9090\n","stream":"stdout","time":"2016-12-22T08:24:51.228867198Z"}
{"log":"Creating API server client for \n","stream":"stdout","time":"2016-12-22T08:24:51.232891038Z"}
{"log":"Successful initial request to the apiserver, version: v1.5.1\n","stream":"stdout","time":"2016-12-22T08:24:51.456582591Z"}
{"log":"Creating in-cluster Heapster client\n","stream":"stdout","time":"2016-12-22T08:24:51.456681353Z"}

$ kubectl logs --namespace=kube-system kubernetes-dashboard-udemm
Using HTTP port: 9090
Creating API server client for
Successful initial request to the apiserver, version: v1.5.1
Creating in-cluster Heapster client
Hence, logging to stdout and the kubectl logs command are a powerful combination for troubleshooting applications running inside a pod. However, Kubernetes deletes all log files of a pod when the pod itself is deleted, so you cannot troubleshoot errors in pods that no longer exist. To solve this problem I use fluentd together with the Elastic Stack to store and view the logs in a central place.
Fluentd
Fluentd is a flexible log data collector. It supports various inputs like log files or syslog and many outputs like Elasticsearch or Hadoop. Fluentd converts each log line to an event. Those events can be processed and enriched in the fluentd pipeline. I have chosen fluentd because there is a good Kubernetes metadata plugin: it parses the filename of the log file and uses this information to fetch additional metadata from the Kubernetes API. Metadata like labels and annotations is attached to the log event as additional fields, so you can search and filter by this information. Furthermore, we use the metadata to route the log events to the proper Elasticsearch indices. I use one index per pod, so I can implement a log rotation policy for each kind of pod. For example, you may want to store the logs of a back-end system for two weeks but the access logs of the front-end for two days only.
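To give a rough impression of the result, an enriched event for the dashboard pod from the example above could look like the following sketch. The exact field names depend on the plugin version, so treat it as an illustration only:

{
  "log": "Using HTTP port: 9090\n",
  "stream": "stdout",
  "time": "2016-12-22T08:24:51.228867198Z",
  "docker": {
    "container_id": "5e4b95ee53705d1b664ad64540e9ad072bd8c8908373ad26c0679e178587a82b"
  },
  "kubernetes": {
    "namespace_name": "kube-system",
    "pod_name": "kubernetes-dashboard-udemm",
    "container_name": "kubernetes-dashboard",
    "host": "minikube",
    "labels": {
      "k8s-app": "kubernetes-dashboard"
    }
  }
}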
To deploy fluentd into the Kubernetes cluster I have chosen a DaemonSet. A DaemonSet ensures that a certain pod is scheduled exactly once onto each kubelet. The fluentd pod mounts the /var/log/containers/ host volume to access the logs of all pods scheduled to that kubelet, as well as a host volume for a fluentd position file. This position file records which log lines have already been shipped to the central log store. The fluentd setup described here can be created with the following YAML file. It contains the configuration of the DaemonSet and a ConfigMap; the ConfigMap holds the fluentd configuration:
# This config should be kept as similar as possible to the one at
# cluster/addons/gci/fluentd-gcp.yaml
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: fluentd-logging
  namespace: kube-system
  labels:
    k8s-app: fluentd-logging
    version: v1
    #kubernetes.io/cluster-service: "true"
spec:
  template:
    metadata:
      name: fluentd-logging
      # If you want a different namespace, replace it.
      namespace: kube-system
      labels:
        k8s-app: fluentd-logging
    spec:
      dnsPolicy: Default
      containers:
      - name: fluentd-logging
        # For production use, replace the image name by an image in your private registry and set imagePullPolicy to a reasonable value.
        image: local/kubernetes-logging:1.19
        imagePullPolicy: Never
        volumeMounts:
        # Mount the log directory of the host read-only to access all log files.
        - name: varlog
          mountPath: /var/log/
          readOnly: true
        - name: varlibdockercontainers
          mountPath: /var/log/containers
          readOnly: true
        # Path where the position file is stored.
        - name: fluentdposfiles
          mountPath: /var/lib/fluentd
        # This mount is minikube specific since /var/log/containers/*.log are symlinks into this mount.
        # Probably you will not need it in production. Maybe you have to replace it by a mount of /var/lib/docker/containers since normal Kubernetes
        # symlinks into that directory.
        - name: minikubemount
          mountPath: /mnt/sda1/var/lib/docker/containers
          readOnly: true
        # Mount the configuration file.
        - name: config-volume
          mountPath: /etc/td-agent
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log/
      - name: varlibdockercontainers
        hostPath:
          path: /var/log/containers
      # In production put this position file somewhere where it survives node restarts.
      - name: fluentdposfiles
        hostPath:
          path: /tmp/fluentd
      - name: minikubemount
        hostPath:
          path: /mnt/sda1/var/lib/docker/containers
      - name: config-volume
        configMap:
          name: fluentd-config
---
apiVersion: v1
data:
  td-agent.conf: |-
    # Do not directly collect fluentd's own logs to avoid infinite loops.
    <match fluent.**>
      type null
    </match>

    # Example:
    # {"log":"[info:2016-02-16T16:04:05.930-08:00] Some log text here\n","stream":"stdout","time":"2016-02-17T00:04:05.931087621Z"}
    <source>
      type tail
      path /var/log/containers/*.log
      pos_file /var/lib/fluentd/es-containers.log.pos
      time_format %Y-%m-%dT%H:%M:%S.%NZ
      tag kubernetes.*
      format json
      read_from_head true
    </source>

    # This plugin fetches the metadata from the Kubernetes API.
    <filter kubernetes.**>
      type kubernetes_metadata
    </filter>

    # This configures the Elasticsearch output. It assumes the DNS name `elasticsearch-logging` as the target Elasticsearch.
    # The @type elasticsearch_dynamic allows routing log events to dynamic, per-pod indices.
    <match kubernetes.**>
      @type elasticsearch_dynamic
      host elasticsearch-logging
      port 9200
      # Save the tag key
      include_tag_key true
      # Use the logstash format
      logstash_format true
      # Saves the log events to the index kubernetes-<pod_name>.
      logstash_prefix kubernetes-${record['kubernetes']['pod_name']}
      # Set the chunk limit the same as for fluentd-gcp.
      buffer_chunk_limit 2M
      # Cap buffer memory usage to 2MiB/chunk * 32 chunks = 64 MiB
      buffer_queue_limit 32
      flush_interval 5s
      # Never wait longer than 30 seconds between retries.
      max_retry_wait 30
      retry_wait 10s
      # Disable the limit on the number of retries (retry forever).
      disable_retry_limit
      # Number of threads used for flushing the buffer.
      num_threads 1
    </match>
kind: ConfigMap
metadata:
  creationTimestamp: null
  name: fluentd-config
  namespace: kube-system
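Assuming the manifest above is saved as fluentd-logging.yaml (the filename is arbitrary), deploying and checking the DaemonSet could look like this:

# Create the ConfigMap and the DaemonSet (both live in the same file).
$ kubectl create -f fluentd-logging.yaml

# Verify that one fluentd pod is running on each node.
$ kubectl get pods --namespace=kube-system -l k8s-app=fluentd-logging -o wide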
The complete demonstration based on minikube can be found in this GitHub Repository.
Considerations for Production Deployments
In a production environment you have to implement log rotation for the stored log data. Since the fluentd configuration above generates one index per pod and day, this is easy: Elasticsearch Curator is a tool made for exactly this job. The minikube demonstration provides a good starting point for setting up Curator in a Kubernetes environment.
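A sketch of a Curator action file for such a policy might look as follows. It assumes Curator 4.x and the index naming scheme kubernetes-<pod_name>-YYYY.MM.DD produced by the configuration above; the retention of 14 days is just an example:

actions:
  1:
    action: delete_indices
    description: Delete per-pod log indices older than 14 days.
    options:
      ignore_empty_list: True
      continue_if_exception: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: kubernetes-
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 14

Different retention times per kind of pod can be implemented with one such action block per index prefix.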
As discussed above, logging to stdout is very easy. However, you will want to log the data in a structured fashion to allow more efficient searches over the log data. The demo linked above contains two minimal example applications, normal_logging and structured_logging. The following snippet shows that their log outputs contain the same amount of information.
$ kubectl logs normal-logging-2250448850-24m73 | tail -n 1
Textlen=609 prefix=2 words=100 level=info msg="In previous blog articles we talked about the new master. The additional slaves will be reconfigured automatically to use the new master. The application/clients that are using the Redis setup will be informed about the new address to use the new address. Configuration and example setup with three nodes The Sentinel process is a good start, so run your Redis Master and two Slaves before setting up the Sentinel which Redis master it should monitor. In short, these are the benefits of using Sentinel: Monitoring: Sentinel constantly checks if the Redis setup will be informed about the basic Redis features" time="2016-12-23T09:08:09Z"

$ kubectl logs structured-logging-4189333180-10lsn | tail -n 1
{"Textlen":599,"words":100,"prefix":2,"level":"info","msg":"In previous blog articles we talked about the basic Redis features and learned how to persist, backup and restore your dataset in cause of a distributed system. That means your Sentinel processes are a part of a distributed system. That means clients can connect to the Sentinel processes. On each of your Redis Master and two Slaves before setting up the Sentinel which Redis master it should monitor. In short, these are the benefits of using Sentinel: Monitoring: Sentinel constantly checks if the Redis setup will be informed about the basic Redis features and learned how to persist, backup and","time":"2016-12-23T09:08:35Z"}
The normal logging application prints the log message in a custom format resembling key=value pairs. The structured logging application logs the same information as JSON, one complete JSON object per log message. Fluentd recognizes the JSON object per line and uses it as the base for the log event stored in the Elasticsearch index, so the JSON keys become fields in the Kibana front-end. The following screenshot shows a log message of the structured logger in the Kibana front-end; no additional parsing was configured in the fluentd pipeline.
The same effect can be achieved by parsing the log messages of the normal logging application in the fluentd pipeline, but not without cost: somebody has to maintain the parsing code, which usually is a set of regular expressions. Since each application tends to have its own log format, this set of regular expressions will grow. Furthermore, fluentd has to spend CPU time on executing the parsers.
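For illustration, re-parsing the key=value output of the normal logging application could be done with a parser filter roughly like the one below. It assumes the fluentd parser filter plugin is available; the tag pattern and the capture group names are only examples, and the filter has to be placed before the Elasticsearch output:

<filter kubernetes.var.log.containers.normal-logging**>
  type parser
  # Re-parse the raw `log` field of the container log event.
  key_name log
  # Keep the already attached fields (e.g. the Kubernetes metadata).
  reserve_data true
  format /^Textlen=(?<textlen>\d+) prefix=(?<prefix>\d+) words=(?<words>\d+) level=(?<level>\w+) msg="(?<msg>.*)" time=/
</filter>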
My personal preference is to establish the following policy: logs to stdout have to be in JSON format. The disadvantage of this policy is that JSON is not very human-readable, and developers read logs a lot during development. A mature logging library that supports different, configurable log formats helps developers to implement this policy: they can choose a more readable log format while developing a new feature.
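A minimal sketch of such a switch, here with the Go library logrus and a hypothetical LOG_FORMAT environment variable (the demo applications may be built differently), could look like this:

package main

import (
	"os"

	log "github.com/sirupsen/logrus"
)

func main() {
	log.SetOutput(os.Stdout)

	// Default to JSON for the cluster; allow a human-readable
	// format during local development via LOG_FORMAT=text.
	if os.Getenv("LOG_FORMAT") == "text" {
		log.SetFormatter(&log.TextFormatter{})
	} else {
		log.SetFormatter(&log.JSONFormatter{})
	}

	log.WithFields(log.Fields{"words": 100, "prefix": 2}).Info("structured logging example")
}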
Read on
Have a look at our website to find out more about the services we offer in IT Engineering and Kubernetes (in German), write an email to info@inovex.de for more information or call +49 721 619 021-0.
Join us!
Looking for a job where you can work with cutting edge technology on a daily basis? We’re currently hiring Linux Systems Engineers (m/w/d) in Karlsruhe, Pforzheim, Munich, Cologne and Hamburg!