For my master’s thesis I investigated the potential of using Software-Defined Networking (SDN) to build security mechanisms like firewalls and intrusion detection systems (IDS) – in particular in the context of the Internet of Things (IoT). Find out about the role of SDN in IoT environments and how to build security mechanisms with it in this article!
Motivation
IoT is one of today’s most prominent buzzwords and rising technologies. Smart homes, smart buildings, smart cities, and almost any object or place prefixed with “smart” have something to do with IoT.
But as its popularity rises, so do security concerns. Insecam, for example, shows how easy it can be to access an unprotected device. Attacks like Mirai exploit common IoT weaknesses such as weak or default passwords and open ports to build a botnet of infected devices, which is then used to launch DDoS attacks. As the devices become more ubiquitous, it is not only electronic data that is at stake but also physical assets. What if a smart lock is hacked and opens the front door of your house? For this reason, securing such environments is a high priority (as it should be for any computer network).
Unfortunately, IoT devices themselves, as well as the environments where they are used, can vary drastically. Some small sensors run on microcontrollers with just a few KB of RAM, while other applications run on full-fledged Linux devices like a Raspberry Pi. This makes it difficult to create standards for device-level security. Instead, we could turn to network-level security, where resources like firewalls and IDS protect the hosts. But then again, the equipment available in a small smart home environment is completely different from what a smart city requires. The number of devices and the volume of traffic that needs to be handled differ too.
Because of all these issues, it is practically impossible to have a one-size-fits-all solution for IoT. But then, what is SDN and how can it help solve some of these problems?
SDN Crash Course
Traditionally, a network device (router, switch etc.) is in charge of both the control and data plane. That is, it physically forwards packets in the network (data plane) and also runs the logic that decides how those packets should be processed (control plane).
Software-Defined Networking (SDN) on the other hand is a concept/architecture that separates the control and data planes in a network device by introducing an SDN controller. The controller is the one in charge of the control plane and is where the logic of the network resides; SDN-enabled devices (referred to as SDN switches) constituting the data plane can be programmed by the controller to execute the control logic. The interface to program the SDN switches is called a southbound interface and OpenFlow is as of now the most common protocol for this purpose. The main components of an OpenFlow flow are the following:
| Component | Description |
| --- | --- |
| Match Fields | Match against packet headers (Eth, VLAN, IP, TCP) |
| Priority | Matching precedence of the flow |
| Counters | Total number of matched packets and bytes |
| Instructions | Actions to execute on matching packets (forward, drop, queue…) |
| Timeouts | Time until the flow expires (idle or hard timeout) |
For a full description of all available components please refer to the OpenFlow specification.
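To make the components above more concrete, here is a minimal Python sketch of a flow table lookup. It is a toy model, not the OpenFlow wire format or any controller API: the `FlowEntry` class, the `lookup` function and the string-based actions are names I made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class FlowEntry:
    """Toy model of one OpenFlow flow (illustrative, not a real API)."""
    match: dict            # header fields that must be equal, e.g. {"tcp_dst": 1883}
    priority: int          # higher value wins when several flows match
    actions: list          # e.g. ["output:1"]; an empty list means "drop"
    idle_timeout: int = 0  # seconds without a match before expiry (0 = never)
    hard_timeout: int = 0  # seconds after installation before expiry (0 = never)
    packet_count: int = 0  # counters, updated on every match
    byte_count: int = 0

    def matches(self, packet: dict) -> bool:
        # An empty match dict matches every packet (wildcard flow).
        return all(packet.get(k) == v for k, v in self.match.items())


def lookup(table, packet):
    """Return the highest-priority matching flow and update its counters."""
    candidates = [flow for flow in table if flow.matches(packet)]
    if not candidates:
        return None  # table miss
    best = max(candidates, key=lambda flow: flow.priority)
    best.packet_count += 1
    best.byte_count += packet.get("len", 0)
    return best


table = [
    FlowEntry(match={}, priority=0, actions=[]),  # default: drop everything
    FlowEntry(match={"ip_proto": 6, "tcp_dst": 1883}, priority=10,
              actions=["output:1"]),              # forward MQTT traffic
]

hit = lookup(table, {"ip_proto": 6, "tcp_dst": 1883, "len": 120})
miss = lookup(table, {"ip_proto": 6, "tcp_dst": 23, "len": 60})
print(hit.actions, hit.packet_count, hit.byte_count)
print(miss.actions)
```

The MQTT packet hits the priority-10 flow and is forwarded, while the Telnet packet only matches the low-priority wildcard flow and is dropped. Timeouts are part of the model but are not evaluated in this sketch.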
The introduction of a central controller and a standard interface (like OpenFlow) allows central management of the network devices and gives us the flexibility to implement new network applications very easily. An overview of the whole SDN architecture is shown in Figure 1.
SDN in IoT Environments
SDN is widely used in cloud environments, with major players like Google, AWS and OpenStack adopting it, for example, for multi-tenancy and to interconnect data centres. At inovex, it has also been used as the basis to explore AI routing algorithms. But what does the picture look like for IoT environments?
Before answering that question I will introduce one very important piece of software: Open vSwitch (OvS), a software switch for Linux. Besides being open source and included in Linux as a kernel module, a big advantage of OvS is its built-in support for many network protocols, including OpenFlow. This makes it suitable for many different scenarios.
As I mentioned earlier, IoT applications range from smart homes all the way to smart cities. OvS and OpenFlow help unify all of these environments by providing a common interface. As long as we have a Linux machine or a hardware switch with OpenFlow support, we can use the same technology stack to build network applications!
OpenFlow could then be found anywhere from a small smart home environment, powered by a Raspberry Pi or a router running OpenWrt (thanks to OvS), to a university or company network with enterprise-grade hardware switches. Going even further, we have projects such as the Cloud iNfrastructure Telco Taskforce (CNTT). Its goal is to provide a Network Functions Virtualisation (NFV) reference architecture for telecommunications operators using open-source technologies like Linux and Kubernetes. Some of the targeted scenarios are 5G and edge computing, both of which are important domains for IoT. So once again, with the same set of technologies and protocols, we can target much larger networks.
As a takeaway from this section, SDN not only plays a role in the cloud domain but is slowly spreading to other areas and is, or will become, important for IoT as well.
Security Mechanisms with SDN/OpenFlow
After a long preamble, how do we build security mechanisms based on an SDN architecture? I considered three different stages/types of security measures: prevention, detection and reaction. To cover these stages I built three components: a firewall, an IDS and a mitigation module.
To test the components I used Mininet to emulate a network running MQTT, a protocol frequently used in IoT. First, I looked at real-life implementations on which to base my testbed. One use case from Deutsche Bahn, for example, is to send the GPS coordinates of trains every 10 seconds in order to offer services like delay prediction. This pattern of sending regular measurement values is rather common, which is why I chose it.
The components of the testbed and how they are chained can be seen in Figure 2.
Firewall
A basic firewall can be created almost trivially with OpenFlow: A flow with a drop action will block all traffic that matches it, exactly what a firewall does. However, being implemented with OpenFlow, the firewall rules can be centrally updated by an SDN controller, giving us the flexibility to decide if we want to update all switches at once with the same rules, only a subset of them or whatever other custom logic we program.
For my implementation I made use of multiple flow tables, a feature introduced in OpenFlow 1.1 and therefore available in the widely supported version 1.3, which allows us to better separate processing stages. The first table is a block list: Anything not explicitly blocked will be forwarded to the next table. Known malicious traffic patterns can be added here up front; otherwise, this flow table is filled dynamically by the mitigation module.
The second table is an allow list: Based on topology and application knowledge, we explicitly state the traffic we want to forward. For example, in an MQTT scenario we can allow only traffic to and from the broker on the MQTT port, since clients do not need to communicate with each other directly or via other protocols. By doing this we already filter out many potential attacks that exploit, for example, Telnet or SSH ports that are not needed by the application but are (unnecessarily) open by default on some platforms. All traffic that has not been dropped at this point will be forwarded to a third table, which can be something like a learning switch, IP routing etc.
While I implemented the preventive rules for the allow list manually, there are some interesting approaches to add flows automatically, for example based on the Manufacturer Usage Description (MUD) profile of an IoT device [1]. That approach also uses OpenFlow and could therefore be integrated seamlessly into the proposed architecture.
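The table pipeline described above can be sketched as a small Python simulation. This only mimics the decision logic and is not how the flows are installed on a switch; the broker address, the function names and the packet dictionaries are assumptions made up for the example (1883 is the standard unencrypted MQTT port).

```python
MQTT_PORT = 1883
BROKER_IP = "10.0.0.1"  # hypothetical broker address


def block_list(packet, blocked_sources):
    """Table 0: drop known-bad sources, pass everything else on."""
    return "drop" if packet["src"] in blocked_sources else "next"


def allow_list(packet):
    """Table 1: only MQTT traffic to or from the broker may continue."""
    to_broker = packet["dst"] == BROKER_IP and packet["dst_port"] == MQTT_PORT
    from_broker = packet["src"] == BROKER_IP and packet["src_port"] == MQTT_PORT
    return "next" if (to_broker or from_broker) else "drop"


def pipeline(packet, blocked_sources):
    """Chain the tables in order; table 2 would do the actual forwarding."""
    if block_list(packet, blocked_sources) == "drop":
        return "dropped by block list"
    if allow_list(packet) == "drop":
        return "dropped by allow list"
    return "forwarded"


blocked = {"10.0.0.9"}  # would be filled by the mitigation module
publish = {"src": "10.0.0.5", "dst": BROKER_IP, "src_port": 40000, "dst_port": 1883}
telnet = {"src": "10.0.0.5", "dst": "10.0.0.6", "src_port": 40001, "dst_port": 23}
attacker = {"src": "10.0.0.9", "dst": BROKER_IP, "src_port": 40002, "dst_port": 1883}

for pkt in (publish, telnet, attacker):
    print(pkt["src"], "->", pkt["dst"], pkt["dst_port"], ":", pipeline(pkt, blocked))
```

Note that the blocked host is rejected even though its traffic would pass the allow list, which is why the block list has to come first in the pipeline.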
Monitoring
Monitoring is essential for a variety of purposes like alerting, debugging, etc. In this case it is where we will get the data for our IDS.
Native OpenFlow Monitoring
OpenFlow itself offers some monitoring capabilities via counters. The counters available vary depending on the specific OpenFlow element (table, flow, port, queue, etc.) and the switch implementation, but generally include duration (how much time has passed since the element started counting), number of bytes and number of packets.
While useful, this information is not granular enough for many packet analysis or anomaly detection methods. With flow-level counters it is also possible to extract information from the flow match, giving access to network layers 2-4. However, the support for counters and match fields can vary drastically, especially on hardware devices. For example, many hardware switches cannot match against TCP flags, although software switches like OvS can. In this sense, some information can get lost.
An alternative would be to program the SDN switches to send all traffic to the SDN controller, allowing a more thorough packet analysis. However, this would turn the SDN controller into a traffic bottleneck, slowing down performance and even increasing the risk of a control plane DoS attack.
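As a small illustration of what can be derived from such counters, the following sketch computes average rates from two consecutive snapshots of a flow's statistics. The field names `duration_sec`, `packet_count` and `byte_count` follow the OpenFlow flow statistics; the `rates` helper and the numbers are made up for the example.

```python
def rates(prev, curr):
    """Average packet and bit rate between two counter snapshots."""
    dt = curr["duration_sec"] - prev["duration_sec"]
    if dt <= 0:
        raise ValueError("snapshots must be taken at increasing durations")
    return {
        "packets_per_s": (curr["packet_count"] - prev["packet_count"]) / dt,
        "bits_per_s": 8 * (curr["byte_count"] - prev["byte_count"]) / dt,
    }


# Two hypothetical snapshots of the same flow, taken 10 seconds apart.
prev = {"duration_sec": 10, "packet_count": 100, "byte_count": 12_000}
curr = {"duration_sec": 20, "packet_count": 150, "byte_count": 18_000}

r = rates(prev, curr)
print(r)
```

This gives volume over time per flow, but nothing about individual packets.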
Dedicated Network Monitoring Protocols
Because of the problems explained above, I think it is best to use a dedicated network monitoring protocol such as NetFlow or sFlow, both of which are supported by OvS as well as by many hardware switches.
NetFlow is a mature monitoring protocol developed by Cisco. It is popular and well supported, but in my opinion it has two drawbacks: It can only access layers 3-4, and its data aggregation is rather inflexible.
For my experiments I chose sFlow instead. The protocol’s distinguishing characteristic is its configurable sampling mechanism (the “s” in sFlow stands for “sampled”). The purpose is to capture, on average, only 1 out of every N packets in the network. This greatly helps to scale monitoring in larger and/or very busy networks. sFlow data can also be converted into other popular formats like pcap or even NetFlow.
By default, sFlow already gives us access to data in layers 2-4, but we can also configure it to include part of the payload in the samples. By doing so we can potentially analyse features all the way up to layer 7, like HTTP, MQTT, etc. However, it should be noted that sampling is not optional in sFlow (as opposed to NetFlow, where it is an optional feature). This means that we cannot obtain certain network features, such as connection duration, that depend on capturing specific packets, because we don’t know exactly which packets we will sample. This limitation can hinder some anomaly detection methods.
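The effect of 1-in-N sampling can be illustrated with a short simulation. This is a simplification that samples each packet independently with probability 1/N; it is not the sFlow agent's actual algorithm, and the values of `N` and `total` are arbitrary.

```python
import random


def count_sampled(total_packets, n, rng):
    """Count how many packets are picked when each one is sampled with probability 1/n."""
    return sum(1 for _ in range(total_packets) if rng.random() < 1.0 / n)


rng = random.Random(42)  # fixed seed so the run is reproducible
N = 64                   # sampling rate: on average 1 out of every 64 packets
total = 100_000          # packets that actually crossed the switch

sampled = count_sampled(total, N, rng)
estimate = sampled * N   # scale the sample count back up to estimate the real volume

print(f"sampled {sampled} packets, estimated total {estimate}")
```

The collector only sees a small fraction of the packets, yet multiplying by N gives a reasonable estimate of the real volume. The price is that the estimate is noisy and that no individual packet is guaranteed to be observed.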
IDS
This module on its own is not directly related to SDN, but it is an important one since it inspects the traffic gathered from the switches and informs the mitigation module about suspicious activity.
Anomaly Detection
Detecting anomalies in network traffic is a very broad topic. The authors of [2] give a very good overview of the different types of detection, ranging from statistics to machine learning and information theory concepts. You can also check out this inovex blog series about network anomaly detection using clustering machine learning algorithms, including interesting topics like online vs offline learning.
The core idea of anomaly detection is to first create a baseline of what’s considered normal behaviour. This first step is called the training phase. Subsequent behaviour is then compared against that baseline, and if it deviates too much it is considered an anomaly.
Modeling the Network Traffic
For my implementation I wanted a simple and lightweight method that worked well with sFlow’s sampled traffic and was useful for IoT. The main idea I had is to statistically model the communication patterns of the hosts, taking into account the sampling.
Remember that the test scenario is an MQTT network where publishers send data at a more or less fixed rate. If we captured all packets from a host with sending rate r in a time window w, we could easily calculate the exact number of packets observed in every sampling period (time window): \(n_{pkts} = r \cdot w\). However, sFlow samples only 1 out of every N packets on average, so the IDS does not observe all of them. Instead, we can use the Poisson distribution, which expresses the probability that k events occur in an interval given the average number of events λ. Our theoretical λ can be calculated as follows: \(\lambda_p = \frac{r \cdot w}{N}\).
In Figure 3 you can observe the Poisson distribution with the theoretical λ compared to the normalised frequency distribution for one host in the emulated network. As we can see, after 12 minutes the general shape is already visible, and after one hour it fits almost perfectly.
During the training phase, the IDS learns the λ that best fits each pair of 〈host, protocol〉 and sets an upper threshold for what is considered a “normal” amount of traffic. I used the quantile function, which takes a probability p as its argument and returns the value x such that the probability of the random variable being less than or equal to x equals p. In other words, if we set p=0.95 for example, the quantile function answers the question “what is the maximum number of packets that I will observe in 95% of my sampling periods?”. Everything above the calculated threshold is considered an anomaly.
This can be used to detect DoS, brute-force or port scanning attacks, all of which send an abnormal number of packets. If an anomaly is detected in a sampling period, it is reported to the mitigation module.
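As a worked example of the formula, the sketch below computes λ and a few Poisson probabilities. The rate, window and sampling values are made up for illustration and are not the ones from my testbed; `expected_samples` and `poisson_pmf` are hypothetical helper names.

```python
import math


def expected_samples(rate_pps, window_s, sampling_n):
    """Theoretical lambda: average number of sampled packets per window."""
    return rate_pps * window_s / sampling_n


def poisson_pmf(k, lam):
    """Probability of observing exactly k events for a Poisson(lam) variable."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


# Hypothetical host: 10 packets/s, 32 s window, 1-in-64 sampling.
lam = expected_samples(rate_pps=10, window_s=32, sampling_n=64)
print("lambda =", lam)

for k in range(0, 11):
    print(f"P(k={k:2d}) = {poisson_pmf(k, lam):.4f}")
```

Without sampling we would see exactly 320 packets in every window. With sampling, the IDS sees about 5 on average, and the distribution tells us how likely each count around that average is.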
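A minimal sketch of this training step could look as follows. It assumes that λ is fitted with the sample mean of the observed counts (the maximum likelihood estimate for a Poisson distribution), which is one possible way to find the best fit. The function names and the training counts are invented for illustration.

```python
import math


def poisson_quantile(p, lam):
    """Smallest k such that P(X <= k) >= p for X ~ Poisson(lam)."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if not 0 < lam < 700:
        # exp(-lam) underflows for very large lam; fine for sampled IoT traffic
        raise ValueError("lam must be between 0 and 700 for this simple method")
    k = 0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    while cdf < p:
        k += 1
        term *= lam / k    # P(X = k) derived from P(X = k - 1)
        cdf += term
    return k


def fit_lambda(counts):
    """Estimate lambda as the mean number of sampled packets per window."""
    return sum(counts) / len(counts)


# Made-up sampled packet counts for one <host, protocol> pair during training.
training_counts = [3, 5, 4, 4, 6, 2, 4, 5, 3, 4]

lam = fit_lambda(training_counts)
threshold = poisson_quantile(0.95, lam)


def is_anomaly(count):
    """A window is anomalous if it exceeds the learned threshold."""
    return count > threshold


print("lambda =", lam, "threshold =", threshold)
print("window with 6 packets anomalous?", is_anomaly(6))
print("window with 30 packets anomalous?", is_anomaly(30))
```

Choosing p is a trade-off: A higher value produces fewer false alarms on benign bursts but lets smaller attacks slip through.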
Mitigation Module
This module is in charge of taking measures against detected anomalies and communicates with the SDN controller (Ryu) via its REST API. For my implementation I created flows (firewall rules) that drop the traffic matching the characteristics reported by the IDS. The flows include a timeout so that, if the host behaves correctly again, its packets will be forwarded normally. If the misbehaviour continues, the flow is renewed with a new, longer timeout. As stated before, these rules are inserted into the first flow table of the firewall and are therefore evaluated before all other rules.
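To give an idea of what such a rule could look like, the sketch below builds a drop flow as a JSON payload in the style of Ryu's `ofctl_rest` application, where an empty action list means drop. Whether this matches my actual implementation in detail is not implied: The doubling policy, the constants, the use of a hard timeout and the helper names are all assumptions chosen for illustration.

```python
import json

BASE_TIMEOUT_S = 30    # hypothetical timeout for the first offence
MAX_TIMEOUT_S = 3600   # hypothetical upper bound


def next_timeout(offences):
    """Double the timeout with every repeated offence, up to a maximum."""
    if offences < 1:
        raise ValueError("offences must be at least 1")
    return min(BASE_TIMEOUT_S * 2 ** (offences - 1), MAX_TIMEOUT_S)


def drop_flow(dpid, src_ip, ip_proto, offences):
    """Build a drop rule for the block list (table 0) of one switch."""
    return {
        "dpid": dpid,                  # ID of the switch to program
        "table_id": 0,                 # first table = block list
        "priority": 100,
        "hard_timeout": next_timeout(offences),
        "match": {
            "eth_type": 0x0800,        # IPv4
            "ipv4_src": src_ip,
            "ip_proto": ip_proto,      # 6 = TCP
        },
        "actions": [],                 # no actions: matching packets are dropped
    }


payload = drop_flow(dpid=1, src_ip="10.0.0.9", ip_proto=6, offences=3)
print(json.dumps(payload, indent=2))
```

With `ofctl_rest` running on the controller, a payload like this would be sent in a POST request to its `/stats/flowentry/add` endpoint. Matching on more fields reported by the IDS (for example the destination port) makes the rule more specific and reduces the risk of blocking benign traffic from the same host.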
This policy of blocking the traffic is effective but simple. However, it is possible to use any custom logic here: We could redirect traffic to another host for further inspection, or set a rate limiter instead of dropping the traffic, etc.
Results
To test this approach, I let the IDS observe the MQTT network for a few minutes. After that, one or more of the MQTT clients started an attack on the network. As can be seen in Figure 4, when an attack starts the throughput goes up, but within a few seconds the system recognises it and creates rules to mitigate it. In the case of the SYN flood attack, the rules created were not specific enough to block only the attack traffic, so we see a spike in the false positive rate (FPR), meaning some benign traffic was blocked as well.
Conclusion
This article gave a quick introduction into SDN, explored its role in an IoT context and described why it is a good choice to target many potential scenarios. It also showed how to create different security mechanisms on top of an SDN architecture.
SDN is very flexible and offers a lot of room to explore new algorithms and methods. It is therefore possible to extend any of the methods presented in this article. For example, one could automatically create preventive firewall rules or explore different anomaly detection algorithms and mitigation policies. Furthermore, the SDN architecture itself can be modified, for example by introducing a distributed data plane.
Also, while I tried to focus on IoT, SDN can of course be used in other types of networks. Overall, it is a very exciting technology that has yet to achieve its full potential.
References
[1] A. Hamza, H. H. Gharakheili, T. A. Benson, and V. Sivaraman, “Detecting Volumetric Attacks on IoT Devices via SDN-Based Monitoring of MUD Activity,” in Proceedings of the 2019 ACM Symposium on SDN Research, Apr. 2019, pp. 36–48, https://doi.org/10.1145/3314148.3314352
[2] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, et al., “A comprehensive survey on network anomaly detection,” Telecommunication Systems, vol. 70, pp. 447–489, 2019, https://doi.org/10.1007/s11235-018-0475-8