Notice:
This post is older than 5 years – the content might be outdated.
This article will guide you through updating the ELK stack from version 1.x to 2.x, taking into account the correct order of its components: Elasticsearch, Logstash and Kibana.
The ELK stack became popular in recent years as a centralized log solution. Based on open source tools, ELK enables the collection, storage and analysis of log files, big data style.
As part of the inovex operations team, we run ELK at a customer’s site. It stores about 11 terabytes of log data from the last 30 days. The Elasticsearch cluster consists of four data nodes and one master node. Kibana runs on two systems behind a load balancer, with nginx forwarding requests via proxy_pass. Logstash runs on ten systems and uses Redis as a cache for incoming log events.
End of Life?
As this infrastructure was built last year it is now past its prime. We see the following versions deployed in production:
- Elasticsearch 1.7.2
- Logstash 1.5.3
- Kibana 4.1.13
These versions are pretty common now, as they were cutting edge when people started to employ the ELK stack back in 2015. Yet there is a problem: the installed versions of both Logstash and Kibana are already close to their end-of-life date, after which no one will provide any updates. Elasticsearch 1.7 will be maintained until January 2017.
In this article we will demonstrate how to upgrade the individual components of an ELK stack to their latest released versions and provide you with all links necessary. Our target versions are:
- Elasticsearch 2.4.1
- Logstash 2.4.0
- Kibana 4.6.1
While it’s not difficult to upgrade the software itself (e.g. yum update logstash on a CentOS/Red Hat system), there are some prerequisites to meet.
Kibana
Let’s start at the top of the stack: the front end. As mentioned, we aim for the latest version, 4.6.1. Checking the support matrix page, we see that this version only works with Elasticsearch 2.4. As there are no further requirements, the upgrade is easy; it just has to be scheduled after the Elasticsearch update.
Logstash
We want to update Logstash from 1.5.3 to 2.4.0; the breaking changes are documented on the Elastic website. One major change affects the Elasticsearch output plugin. So before you work your update magic, change the Logstash config to something like this:
    elasticsearch {
      workers => 4
      index => "some-index%{+YYYY.MM.dd}-new"
    - host => ["Elasticsearch-host.fancy.domain"]
    - port => "9200"
    - protocol => "http"
    + hosts => ["http://Elasticsearch-host.fancy.domain:9200"]
      manage_template => "true"
      template_name => "template"
      template_overwrite => "true"
    }
After this, it’s just a matter of your package manager upgrading to the new Logstash version. Keep in mind that Logstash is able to work with any released Elasticsearch version, so it can be upgraded independently of the Elasticsearch upgrade.
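Before running the package upgrade, it can help to look for leftovers of the removed options. The following is a minimal sketch, not a definitive check: the function name find_removed_es_options and the paths in the comments are assumptions. It prints every line that sets host, port or protocol. Since other plugins (for example a redis input) legitimately use host and port, treat the output as a list of candidates to review.

```shell
#!/bin/sh
# Hypothetical helper: list config lines that set host, port or protocol,
# the options shown as removed from the elasticsearch output in the diff above.
# Other plugins may still use host/port, so review each hit by hand.
find_removed_es_options() {
  grep -nE '^[[:space:]]*(host|port|protocol)[[:space:]]*=>' "$1"
}

# Example usage (paths are assumptions for an RPM-based install):
#   find_removed_es_options /etc/logstash/conf.d/output.conf
#   /opt/logstash/bin/logstash --configtest -f /etc/logstash/conf.d/
```

The --configtest flag of Logstash 1.x and 2.x only validates the syntax of the config, so it complements the search above rather than replacing it.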
Elasticsearch
First of all, check the breaking changes doc. At this point it gets tricky, as we go from Elasticsearch 1.7 to 2.4, which means a lot of breaking changes.
There is a migration plugin to guide you through the migration from Elasticsearch 1.x to 2.x. This plugin, current version 1.18, checks the Elasticsearch configuration and the stored data. It can be installed via:
    /usr/share/elasticsearch/bin/plugin -i migration -u https://github.com/elastic/elasticsearch-migration/releases/download/v1.18/elasticsearch-migration-1.18.zip
It will look similar to this.
Here are the main changes:
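- We replaced the network.publish_host setting with network.host, which sets both the bind and the publish address. Since 2.0, Elasticsearch binds only to localhost by default, so the address has to be configured explicitly.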
- In case you need the Elasticsearch API for local API calls, e.g. for monitoring purposes, it is necessary to bind explicitly to 127.0.0.1:
      network:
        host: ['PublicIp', '127.0.0.1']
- Starting with Elasticsearch 2.0, dots in field names are not allowed. If you need them, start the Elasticsearch process with the option -Dmapper.allow_dots_in_name=true. You might need dots in field names if you feed log events from application logs into an Elasticsearch index without filtering their content. But keep in mind: if there are any dots in field names, Elasticsearch 2.x won’t start at all.
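One way to pass the start option mentioned above is the ES_JAVA_OPTS environment variable. This is a minimal sketch, assuming an RPM-based installation whose init script reads /etc/sysconfig/elasticsearch; that path is an assumption and may differ on your system.

```shell
# Assumption: this line lives in /etc/sysconfig/elasticsearch (RPM installs)
# or is exported in the environment before Elasticsearch is started.
# It hands the system property from the text to the Elasticsearch JVM.
ES_JAVA_OPTS="-Dmapper.allow_dots_in_name=true"
export ES_JAVA_OPTS
```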
After applying the changes, the migration plugin will go from red to green.
Keep in mind: This migration plugin just checks the Elasticsearch config and data stored. If you use other apps connecting to your Elasticsearch cluster you have to check them separately.
Full cluster restart
For the sake of this example, let’s assume we run more than one Elasticsearch host. Say we have an Elasticsearch cluster with 3 nodes that share all shards, and each shard has one replica. In this case a full cluster restart is necessary in order to upgrade all nodes. On a single node, just stop Elasticsearch, do an upgrade (e.g. yum upgrade elasticsearch) and restart.
As we’ve got a cluster that will go from Elasticsearch 1.7.2 to 2.4.1, we must consider one constraint: there is no interoperability between these versions. There is a thorough guide that shows and explains the necessary steps. Here’s the gist of it:
- Disable shard allocation within your cluster. This reduces the time until the cluster is fully recovered afterwards.
- Perform a synced flush so that the shards can recover faster after the restart.
- Remove the migration plugin. Elasticsearch won’t start if it’s still installed: /usr/share/elasticsearch/bin/plugin -r migration
- Stop all the nodes.
- Upgrade and start the nodes. Starting dedicated master nodes first will speed up the cluster start.
- Reactivate shard allocation.
- Wait until your cluster becomes green again.
      sh-4.1$ curl -XGET http://localhost:9200/_cluster/health?pretty=true
      {
        "cluster_name" : "elasticsearch",
        "status" : "yellow",
        "timed_out" : false,
        "number_of_nodes" : 2,
        "number_of_data_nodes" : 2,
        "active_primary_shards" : 6,
        "active_shards" : 12,
        "relocating_shards" : 0,
        "initializing_shards" : 0,
        "unassigned_shards" : 6,
        "delayed_unassigned_shards" : 0,
        "number_of_pending_tasks" : 0,
        "number_of_in_flight_fetch" : 0
      }
- Don’t forget to update Kibana so that your ELK is usable again.
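The REST calls behind the steps above can be sketched as a few shell functions. This is an illustration under stated assumptions, not the exact commands we ran: ES_URL and all function names are hypothetical, and the persistent setting follows the official full cluster restart guide.

```shell
#!/bin/sh
# Assumption: any node of the cluster is reachable under ES_URL.
ES_URL="${ES_URL:-http://localhost:9200}"

# Build the body for the cluster settings API; $1 is "none" or "all".
allocation_body() {
  printf '{"persistent":{"cluster.routing.allocation.enable":"%s"}}' "$1"
}

# Disable shard allocation before stopping the nodes.
disable_allocation() {
  curl -s -XPUT "$ES_URL/_cluster/settings" -d "$(allocation_body none)"
}

# Perform a synced flush on all indices.
synced_flush() {
  curl -s -XPOST "$ES_URL/_flush/synced"
}

# Reactivate shard allocation once all upgraded nodes have joined.
enable_allocation() {
  curl -s -XPUT "$ES_URL/_cluster/settings" -d "$(allocation_body all)"
}

# Ask the health API to block until the cluster is green
# or the timeout ($1, default 60s) expires.
wait_for_green() {
  curl -s "$ES_URL/_cluster/health?wait_for_status=green&timeout=${1:-60s}&pretty"
}
```

Call disable_allocation and synced_flush before stopping the nodes, and enable_allocation followed by wait_for_green after all of them are back. If the health call returns with "timed_out" : true, the cluster is not green yet and the call can simply be repeated.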
What we learned the hard way
Despite all the reading and preparations, there were some pitfalls we stepped right into. These were:
- After the update our Elasticsearch snapshot repository wasn’t accessible to all cluster nodes, so we weren’t able to generate backups. To solve this we had to recreate the repository.
- We use the Kopf plugin to manage our Elasticsearch cluster. Oddly enough, nobody had provided the cluster name in this plugin’s config. This caused Elasticsearch to refuse to start with the new version.
      sh-4.1$ cat /usr/share/elasticsearch/plugins/kopf/plugin-descriptor.properties
      description=kopf - simple web administration tool for Elasticsearch
      version=2.0.1
      site=true
      name=kopf
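Elasticsearch 2.x expects a plugin-descriptor.properties file like the one above in every plugin directory, so it can be worth checking all installed plugins before the first start. Below is a minimal sketch; the function name check_plugin_descriptors is hypothetical, and the plugin directory in the usage comment is an assumption based on the path shown above.

```shell
#!/bin/sh
# Hypothetical helper: report plugin directories under $1 that lack a
# plugin-descriptor.properties file or whose descriptor has no name entry.
check_plugin_descriptors() {
  dir="$1"
  for plugin in "$dir"/*/; do
    # Skip the unexpanded glob when the directory has no subdirectories.
    [ -d "$plugin" ] || continue
    descriptor="${plugin}plugin-descriptor.properties"
    name="$(basename "$plugin")"
    if [ ! -f "$descriptor" ]; then
      echo "$name: missing plugin-descriptor.properties"
    elif ! grep -q '^name=' "$descriptor"; then
      echo "$name: descriptor has no name entry"
    fi
  done
}

# Example usage (directory is an assumption):
#   check_plugin_descriptors /usr/share/elasticsearch/plugins
```

No output means that every plugin directory passed both checks. It does not guarantee that the plugin itself is compatible with the new Elasticsearch version.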
Summary
To wrap it up:
- Start by updating Logstash, as it’s compatible with newer Elasticsearch versions.
- Check your Elasticsearch (cluster) for breaking changes with the migration plugin.
- Upgrade the Elasticsearch cluster.
- Upgrade Kibana.