Thursday, 19 September 2013

Interworking between OpenFlow and legacy IP networks

In this blog post I return to the goal of selling OpenFlow technology. If it is to become accepted, it does need to be sold.

One of the objections raised about OpenFlow is that it doesn't interwork with "legacy" IP networks.

As with most sales objections, there is often some truth in these objections and also fear, uncertainty and doubt (FUD) spread by competitors that have something to lose.

So what does the word "interworking" actually mean? In order to answer that question, let's return to basics.

The world is full of IP networking equipment.  In fact the human race has become dependent on the Internet.  It would be impossible to build a parallel OpenFlow network and have a big bang switch over on a particular date and time.  Let's also be realistic - legacy IP networking does a reasonable job. The business case for ripping it out and replacing it with OpenFlow doesn't make sense if we apply it on a ubiquitous basis.

At least initially, OpenFlow is being deployed in very specific locations to solve particular problems. I am very familiar with mobile network architectures. In the mobile core network (the Gi network, if you are familiar with the terminology), OpenFlow will deliver simplicity, elegance and flexibility for mobile operators and remove the mountain of different boxes that attempt to control user policy. In a mobile feeder network (the Iu network before the SGSN), there are few gains to be made - at least today. So the core network is where OpenFlow has a sweet spot. It is likely that OpenFlow adoption will be on the periphery - places where the legacy technology is struggling or where it's a square peg in a round hole - in other words, places where lots of boxes exist purely as work-arounds.

Google actively uses OpenFlow in live operations.  If OpenFlow doesn't interwork then surely we wouldn't be able to use Google!

In my experience, there are real challenges getting legacy IP equipment to interwork. I don't mean interworking between different vendors.  I've seen network engineers struggle to get a Cisco box to talk with another Cisco box simply because one had a different software version!   Interworking challenges are simply part of the world of networking.

So let's look a little closer at OpenFlow. There are 2 types of interfaces: data and control.  One of the key differences is that OpenFlow separates the data plane from the control plane, whereas on a legacy router the control is usually embedded in with the data.

So if you have an OpenFlow switch and we look at the data plane interfaces, they will probably be Ethernet. They will support the IP protocol and the box will forward packets from one port to another.  So far no difference to a legacy switch or router so no challenges for interworking here. At least on first sight.

So what is this control I'm referring to?  Control is the decision making about how a packet is routed from A to B. In legacy networking there are a few ways to do this.

  1. Broadcast or flooding
  2. Someone manually configures it
  3. It is automatically discovered

Broadcasting or flooding works on very small networks but it doesn't scale. A basic hub does this. A packet arrives and the hub doesn't know what to do with it, so it sends it out of every other port.

Next someone manually programmes the equipment saying this address or address range can be reached down here.  Humans tend to make mistakes so traffic might be routed down a black-hole. Also the Internet is constantly changing so it's again not a scalable approach.
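
To make the manual approach concrete, here is a minimal sketch of a hand-written routing table with a longest-prefix lookup. The prefixes and interface names are made up for illustration and are not taken from any real router configuration.

```python
import ipaddress

# Hypothetical, hand-configured routes: (address range, outgoing interface).
STATIC_ROUTES = [
    (ipaddress.ip_network("192.168.1.0/24"), "eth1"),
    (ipaddress.ip_network("192.168.0.0/16"), "eth2"),
    (ipaddress.ip_network("0.0.0.0/0"), "eth0"),  # default route
]


def lookup(destination):
    """Return the interface of the most specific route that matches."""
    address = ipaddress.ip_address(destination)
    matches = [(net, iface) for net, iface in STATIC_ROUTES if address in net]
    if not matches:
        return None  # no route at all: the packet is dropped
    # Longest prefix wins, as on a conventional router.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


print(lookup("192.168.1.130"))
print(lookup("192.168.7.1"))
print(lookup("8.8.8.8"))
```

Every entry in the table is typed in by a person, so a single wrong prefix or interface name silently sends traffic the wrong way.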

Finally we have discovery.  Legacy IP equipment uses a variety of protocols, such as ARP, RIP, OSPF and BGP, to discover and advertise routing information.

These protocols form the control and routers make their own decisions based on this routing information.

OpenFlow however relies on a "central" intelligence to make routing decisions. When a packet arrives at an OpenFlow switch, if the switch hasn't been programmed, it doesn't know what to do with the packet.  (A legacy router will have the same issue unless it has been configured and has gathered routing information.)  The OpenFlow switch therefore sends a message via the control interface to a controller and says "I've got this packet - what do you want me to do with it?".  Controllers are programmable.  It is therefore possible to write a controller that behaves identically to a legacy router.  The only difference is there is either a logical or maybe a physical separation between the control and the data plane compared to a legacy router.
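
The exchange described above can be sketched as a toy model in plain Python. This is not real OpenFlow code: the class names, the dictionary used as a flow table and the static policy are all assumptions made for illustration.

```python
class ToyController:
    """Decides what to do with packets the switch does not recognise."""

    def __init__(self, policy):
        # policy: destination address -> output port (hypothetical)
        self.policy = policy
        self.packet_ins = 0

    def packet_in(self, switch, destination):
        # "I've got this packet - what do you want me to do with it?"
        self.packet_ins += 1
        out_port = self.policy.get(destination)
        if out_port is not None:
            # Program the switch so later packets skip the controller.
            switch.flow_table[destination] = out_port
        return out_port


class ToySwitch:
    """Forwards from its flow table and asks the controller on a miss."""

    def __init__(self, controller):
        self.controller = controller
        self.flow_table = {}

    def forward(self, destination):
        if destination in self.flow_table:
            return self.flow_table[destination]
        return self.controller.packet_in(self, destination)


controller = ToyController({"10.0.0.2": 2})
switch = ToySwitch(controller)
first = switch.forward("10.0.0.2")   # table miss, controller consulted
second = switch.forward("10.0.0.2")  # handled by the switch alone
print(first, second, controller.packet_ins)
```

Only the first packet reaches the controller. Once a flow is installed, the switch handles matching traffic on its own.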

So if we can develop an OpenFlow switch controller combination that is conceptually identical to a legacy switch, interworking issues are clearly being overstated!

However it doesn't make sense to go to all the trouble of creating OpenFlow to only do what a legacy router does - it wouldn't create the shift away from a box driven networking industry!

Today's legacy IP networks are in fact islands.  There are lots of fancy terms like routing domains and "autonomous systems" for these islands. The Internet is not a single thing - it is a collection of networks where each network has its own controls.

This is an important concept and it is one that will continue.  To achieve interworking between legacy networks and OpenFlow networks we need islands.  It is the joining point between islands where the challenges and complexities lie.  So at the demarcation point between a legacy network and an OpenFlow network, the legacy gateway router may be talking to the OpenFlow switch using OSPF, advertising what networks it can reach. The OpenFlow switch at the border needs to participate in this discussion.

If you look at the last sentence, it isn't 100% true.  The OpenFlow switch can in fact be pretty dumb. When an OSPF packet arrives from the legacy router, the switch can simply forward it to the controller and ask what to do with it. It is the controller, using software, that decides how to interact with the legacy router by programming the OpenFlow switch.
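
As a rough sketch of how dumb the border switch can be, the rule below only inspects the protocol field of the IPv4 header and hands anything that is OSPF (IP protocol number 89) to the controller. The function names and the returned action strings are invented for this example. A real deployment would express the same rule as an OpenFlow match on the IP protocol field.

```python
import socket
import struct

OSPF_IP_PROTOCOL = 89  # IANA-assigned IP protocol number for OSPF


def build_ipv4_header(protocol, src, dst):
    """Build a minimal 20-byte IPv4 header for the demo (checksum left at 0)."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20, 0, 0, 1, protocol, 0,
        socket.inet_aton(src), socket.inet_aton(dst),
    )


def border_switch_action(ipv4_header):
    """Hypothetical border rule: punt OSPF to the controller, forward the rest."""
    protocol = ipv4_header[9]  # byte 9 of the IPv4 header is the protocol field
    if protocol == OSPF_IP_PROTOCOL:
        return "send_to_controller"
    return "forward"


# 224.0.0.5 is the multicast address used for OSPF hellos.
ospf_hello = build_ipv4_header(OSPF_IP_PROTOCOL, "192.168.1.1", "224.0.0.5")
tcp_packet = build_ipv4_header(6, "192.168.1.130", "192.168.1.1")
print(border_switch_action(ospf_hello))
print(border_switch_action(tcp_packet))
```

All of the OSPF intelligence then lives in controller software. The switch only needs this one classification rule.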

So interworking with legacy networks is not an insoluble problem. It does require some thought, planning and intelligence but these are skills that you need in legacy networks anyway!


Wednesday, 18 September 2013

Don't buy this USB to Ethernet dongle!

In one of my first posts on how to build an OpenFlow switch using a Raspberry Pi, I suggested buying some USB to Ethernet adaptors to overcome the limitation that the Pi only has one Ethernet port and you need more than 1x Ethernet to do any switching!

The USB dongle I pointed to is, however, rubbish. Do not buy this. The reason not to buy it is that the company making these in China populates every one with the same MAC address! How stupid is that.

[UPDATE. I guess I'll eat my words. You may as well buy this dongle. I bought a more expensive dongle to see if it was better.  This time it was a white box with a USB fly lead.  It looks totally different on the outside but it is identical on the inside to the much cheaper blue one. Same chipset and MAC address.....So I've done another search to explore yet more alternatives. I paid £3 for the blue ones, £6 for the white one, and the next clearly different alternative dongle is £20.  I can't see the point in spending £20 on one of these more expensive ones when the £3 ones work, or rather can be made to work. See below for the work-around.]

It took me a while to figure out why my OpenFlow switch wasn't working how I wanted it.  First rule: Never make assumptions.  I assumed MAC addresses would be different. Wrong.

eth1      Link encap:Ethernet  HWaddr 00:e0:4c:53:44:58 
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

eth2      Link encap:Ethernet  HWaddr 00:e0:4c:53:44:58          <=== SAME!!! Ughhh
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

Here's a dump from ifconfig.  Note the MAC address is the same at 00:e0:4c:53:44:58

Several other people on the Internet have made the same discovery!

If you have bought one of these, the good news is that it is possible to "fix" the problem. On the Raspberry Pi type:

pi@raspberrypi ~/LINC-Switch $ sudo ifconfig eth2 down
pi@raspberrypi ~/LINC-Switch $ sudo ifconfig eth2 hw ether 00:e0:4c:53:44:59   # <- different MAC
pi@raspberrypi ~/LINC-Switch $ sudo ifconfig eth2 up
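
If you have several of these dongles plugged in, a few lines of Python can flag the clash before it costs you an evening. This is only a sketch: the helper name is made up, and the interface-to-MAC mapping is typed in from the ifconfig dump above rather than read from the system.

```python
from collections import defaultdict


def find_duplicate_macs(macs_by_interface):
    """Return {mac: [interfaces]} for every MAC used by more than one interface."""
    interfaces_by_mac = defaultdict(list)
    for interface, mac in sorted(macs_by_interface.items()):
        interfaces_by_mac[mac.lower()].append(interface)
    return {
        mac: interfaces
        for mac, interfaces in interfaces_by_mac.items()
        if len(interfaces) > 1
    }


# Values copied from the ifconfig output shown earlier.
observed = {"eth1": "00:e0:4c:53:44:58", "eth2": "00:e0:4c:53:44:58"}
before_fix = find_duplicate_macs(observed)
print(before_fix)

# After the work-around, eth2 has a different address.
observed["eth2"] = "00:e0:4c:53:44:59"
after_fix = find_duplicate_macs(observed)
print(after_fix)
```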

Also don't buy the  WiFi dongle I recommended based on the RTL8188CUS chipset.  The reason is although this dongle works, it is a pain to get it supporting AP mode on the Raspberry Pi. I wasted 1 hr and then switched to an RT5370 based one which I got working in AP mode in under 5 mins and they are cheaper too (approx £4).

I've just ordered a different type of USB to Ethernet adaptor to see whether this has the same problem! [UPDATE: Which it did.....]

Thursday, 12 September 2013

OpenFlow Switch on Raspberry Pi Part 5: First simple experiment

This is part 5 in the series of building an OpenFlow switch on the Raspberry Pi.

In part 4 we set up Ryu to be an L2 switch and it applied flow rules to the LINC switch. The traffic source which triggered the rules was port eth0, which is effectively the control port since it connects the Raspberry Pi to my network and ultimately to the Ryu controller. The flows were therefore applied as a result of noise and chatter on the local LAN.

For our first very simple experiment we need to have a more controlled environment so let's modify the LINC config so that eth0 is solely to connect the switch to the controller.

To shut down LINC:

(linc@raspberrypi)2> init:stop().

Let's edit the LINC config file

sudo vi /home/pi/LINC-Switch/rel/linc/releases/1.0/sys.config

       {ports,
        [
         %% - regular hardware interface
         {port, 1, [{interface, "eth1"}]},
         {port, 2, [{interface, "eth2"}]},
         %% {port, 4, [{interface, "eth0"}]}
         {port, 3, [{interface, "wlan0"}]}
         %% - hardware interface with explicit type

Comment out port 4. (I moved it above port 3, because otherwise port 3 becomes the last entry and its trailing comma causes the config file to fail.)

Restart the switch

pi@raspberrypi ~/LINC-Switch $ sudo rel/linc/bin/linc console

So I now have 3 ports for traffic on the Pi.  I need a traffic source so I connected a laptop via a cable to eth1. There is no DHCP, so the laptop will not be assigned an IP address automatically and you will need to give it a static IP address. I set mine to 192.168.1.130/24 with a default gateway of 192.168.1.1.
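
A quick sanity check on the static settings, as a sketch using Python's standard ipaddress module: the default gateway should sit inside the laptop's own subnet.

```python
import ipaddress

# The static settings used on the laptop in this post.
laptop = ipaddress.ip_interface("192.168.1.130/24")
gateway = ipaddress.ip_address("192.168.1.1")

# The gateway is only directly reachable if it is in the laptop's subnet.
same_subnet = gateway in laptop.network
print(laptop.network, same_subnet)
```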

So the controller spots the laptop

>  installing new source mac received from port 1

If we now look at the flow tables on the switch we can see what's happening in more detail.

Let's view LINC's flow table:

(linc@raspberrypi)1> ets:tab2list(linc:lookup(0, flow_table_0)).
[{flow_entry,{0,#Ref<0.0.0.374>},
             0,
             {ofp_match,[]},
             <<0,0,0,0,0,0,0,0>>,
             [],
             {1370,616692,527826},
             {infinity,0,0},
             {infinity,0,0},
             [{ofp_instruction_write_actions,4,
                                             [{ofp_action_output,16,controller,65535}]}]},
 {flow_entry,{123,#Ref<0.0.0.381>},
             123,
             {ofp_match,[{ofp_field,openflow_basic,in_port,false,
                                    <<0,0,0,1>>,
                                    undefined},
                         {ofp_field,openflow_basic,eth_src,false,
                                    <<32,207,48,0,192,96>>,
                                    undefined}]},
             <<0,0,0,0,0,0,0,0>>,
             [],
             {1370,616842,124920},
             {infinity,0,0},
             {infinity,0,0},
             [{ofp_instruction_goto_table,6,1}]}]

The line with

{ofp_field,openflow_basic,eth_src,false,
                                    <<32,207,48,0,192,96>>,
                                    undefined}]},

holds the laptop's MAC address with each byte written in decimal notation. In the usual hexadecimal form it is 20:CF:30:00:C0:60
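
The conversion is easy to check with a couple of lines of Python. The helper name is made up for this sketch, and the byte values are copied from the Erlang binaries in the flow table dump above.

```python
def mac_from_bytes(raw):
    """Format raw bytes as a colon-separated, upper-case hexadecimal MAC address."""
    return ":".join("{:02X}".format(byte) for byte in raw)


# <<32,207,48,0,192,96>> from the eth_src field
laptop_mac = mac_from_bytes(bytes([32, 207, 48, 0, 192, 96]))
# <<0,0,0,1>> from the in_port field, a 4-byte big-endian integer
in_port = int.from_bytes(bytes([0, 0, 0, 1]), "big")
print(laptop_mac, in_port)
```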

(linc@raspberrypi)1> ets:tab2list(linc:lookup(0, linc_ports)).
[{linc_port,1,<0.164.0>},
 {linc_port,2,<0.161.0>},
 {linc_port,3,<0.157.0>}]

OK. LINC isn't the most user friendly if you are a network engineer.  There are plans to improve this and adopt a more familiar user interface like Cisco IOS.

Right. Let's stop ryu and install a really simple controller configuration to show how things work.

ryu is written in Python.  I have to admit that it's taking me a while to get used to Python syntax, having used C, PHP and other languages that use {} structures for function declarations.  Python uses just spaces or tabs (indentation) to identify what belongs to a function! Seems crazy to me but that's how it's done.

## Simple ryu layer 2 hub
## All packets arriving at the OpenFlow switch are passed to the controller
## The controller simply floods all incoming messages out of all ports on the switch
## You would never do this in reality!
## No flows are installed on the switch to remember how to handle packets

from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import MAIN_DISPATCHER
from ryu.controller.handler import set_ev_cls

class L2Switch(app_manager.RyuApp):
    def __init__(self, *args, **kwargs):
        super(L2Switch, self).__init__(*args, **kwargs)

    ## The set_ev_cls decorator does all the work: it registers the function
    ## below as the handler for incoming packets (EventOFPPacketIn events)
    @set_ev_cls(ofp_event.EventOFPPacketIn, MAIN_DISPATCHER)
    def packet_in_handler(self, ev):
        ## packet_in_handler defines the rules which are processed when a packet arrives
        ## Everything below is part of the packet_in_handler function

        ## These are data structures for the incoming message
        ## ev.msg represents a packet_in
        msg = ev.msg
        ## msg.datapath represents the datapath for the switch
        dp = msg.datapath
        ## dp.ofproto represents the protocol to the switch which was negotiated
        ofp = dp.ofproto
        ofp_parser = dp.ofproto_parser

        ## OFPActionOutput(arg) is which port the message should be sent out of
        ## OFPP_FLOOD refers to all ports or a flood
        actions = [ofp_parser.OFPActionOutput(ofp.OFPP_FLOOD)]

        ## Build the packet to send using OFPPacketOut
        ## Note: msg.in_port is the OpenFlow 1.0 field; OpenFlow 1.3 packet_in
        ## messages carry the ingress port in msg.match['in_port'] instead
        out = ofp_parser.OFPPacketOut(
            datapath=dp, buffer_id=msg.buffer_id, in_port=msg.in_port,
            actions=actions)
        ## Send the built packet
        dp.send_msg(out)

You would never actually use OpenFlow like this. Here's what it does.

A packet arrives at the switch.  The switch checks what rules (flows) have been defined for the arriving packet. The controller hasn't actually installed any, so the switch refers the packet to the ryu controller.

ryu now dissects the packet passed over the OpenFlow protocol and the above programme tells ryu how to process packets.

The function packet_in_handler is called.

The key line here is
actions = [ofp_parser.OFPActionOutput(ofp.OFPP_FLOOD)]

What this is actually saying is to send the arriving packet to all interfaces.  We are building a hub which is exactly what it does - it floods arriving packets to all ports. The final line commits this.

Now we would never do this in reality since it is a massive overhead. The switch will copy every single packet to the controller asking what to do with it.  The switch never learns anything!

Given that in a real network the controller may be remote from the switch, you can see this would introduce massive latency and massive traffic duplication!
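
The cost is easy to see in a toy model. The sketch below is plain Python rather than ryu code, and the class and function names are invented. It counts how many times the controller is consulted when no flows are ever installed. A flood goes out of every port except the one the packet arrived on.

```python
def flood(ports, in_port):
    """A flood sends the packet out of every port except the ingress port."""
    return [port for port in ports if port != in_port]


class ToyHubController:
    """Answers every packet with 'flood' and never installs a flow."""

    def __init__(self):
        self.packet_ins = 0

    def packet_in(self, ports, in_port):
        self.packet_ins += 1
        return flood(ports, in_port)


ports = [1, 2, 3]
hub = ToyHubController()
# Five packets arrive on port 1; the switch has learned nothing in between.
outputs = [hub.packet_in(ports, in_port=1) for _ in range(5)]
print(outputs[0], hub.packet_ins)
```

Five packets cost five round trips to the controller, and that ratio never improves however long the hub runs.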

The point of this is really to show the logic of how OpenFlow works.

We can run ryu with more verbose logging to see more about what it is doing

ryu-manager --verbose l2hub.py 

Here's what it comes back with:

loading app l2hub.py
loading app ryu.controller.ofp_handler
instantiating app l2hub.py
instantiating app ryu.controller.ofp_handler
BRICK ofp_event
  PROVIDES EventOFPPacketIn TO {'L2Switch': ['main']}
  CONSUMES EventOFPEchoRequest
  CONSUMES EventOFPErrorMsg
  CONSUMES EventOFPHello
  CONSUMES EventOFPSwitchFeatures
BRICK L2Switch
  CONSUMES EventOFPPacketIn
connected socket:<socket fileno=4 sock=192.168.1.4:6633 peer=192.168.1.15:45743> address:('192.168.1.15', 45743)
hello ev <ryu.controller.ofp_event.EventOFPHello object at 0xf7e510>
move onto config mode
switch features ev version: 0x4 msg_type 0x6 xid 0x70f534ec
move onto main mode
EVENT ofp_event->L2Switch EventOFPPacketIn

Ignore the reference to L2Switch - this is a hub. L2Switch is from the class declaration at the beginning - I copied this example.

You can see ryu is initialising, then it connects to the Raspberry Pi OpenFlow switch running at 192.168.1.15.
It negotiates to use the OF1.3 protocol (0x04).
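
The version, msg_type and xid values in the log come from the fixed 8-byte header that starts every OpenFlow message: version (1 byte), type (1 byte), length (2 bytes) and transaction id (4 bytes), all in network byte order. Here is a minimal sketch of decoding that header with Python's struct module. The sample bytes are constructed from the values in the log, the length of 32 is only illustrative, and the lookup tables cover just the values seen here.

```python
import struct

# version, type, length, xid in network (big-endian) byte order
OFP_HEADER = struct.Struct("!BBHI")

VERSIONS = {0x01: "OpenFlow 1.0", 0x04: "OpenFlow 1.3"}
MESSAGE_TYPES = {0: "OFPT_HELLO", 6: "OFPT_FEATURES_REPLY"}


def parse_header(data):
    """Decode the common OpenFlow header at the start of a message."""
    version, msg_type, length, xid = OFP_HEADER.unpack_from(data)
    return {
        "version": VERSIONS.get(version, hex(version)),
        "type": MESSAGE_TYPES.get(msg_type, msg_type),
        "length": length,
        "xid": xid,
    }


# Header rebuilt from the log line: version 0x4, msg_type 0x6, xid 0x70f534ec
sample = OFP_HEADER.pack(0x04, 0x06, 32, 0x70F534EC)
parsed = parse_header(sample)
print(parsed)
```

So the "switch features" line in the log is the switch's features reply, sent with the OpenFlow 1.3 version number.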

The Raspberry Pi will report it has also connected to the controller.

16:07:49.296 [info] Connected to controller 192.168.1.4:6633/0 using OFP v4

Now on the laptop connected to the Raspberry Pi, if I set ping running to ping some address, this will be forwarded to the controller. In the controller window you'll see each ping packet event showing in the verbose log

EVENT ofp_event->L2Switch EventOFPPacketIn

In the next post I'll evolve our simple hub to at least not broadcast out of every port.

Wednesday, 4 September 2013

OpenFlow security - new exploits?

The past few weeks have seen several high profile DNS exploits.  Hackers have altered DNS entries to route traffic elsewhere, to either an unrelated site or a fake site.  Typically companies discover this when the traffic to their website disappears and their web servers are sitting there idle.

A more sophisticated exploit would be to leak only some of the traffic to an alternative site so it is less likely to be detected through traffic anomalies.

So what has this got to do with OpenFlow?  Well OpenFlow has the potential to abstract routing so that IP addresses are mobile and traffic can be routed programmatically. This is not a million miles from the DNS hack - it would be possible to move traffic routed to a particular valid IP address to another location, in other words it's possible for the network to be the man in the middle and move traffic to another server.

This idea isn't new: the same can happen with today's IP networks through route injection. However, the OpenFlow concepts make this a simpler task.

So how do we prevent this? OpenFlow has put some basic functionality in place to prevent some of this such as secure connections between the controller and the switch,  however the logic on how a network behaves is set at the application level on the controller. The challenge, as OpenFlow networks become more prolific, is to ensure that applications sitting on the controller can be trusted and are doing what we expect.  Imagine a world where the applications installed on the controller have a virus or are simply malicious  and are taking rogue actions. How can we detect this? How can we prevent this?

With the controller exposing north bound interfaces to other systems, trust in these "controllers of controllers" also needs to be established.

These are not real risks today since it is likely that any OpenFlow network will be closed, secure and tightly controlled by the network administrators, but it is definitely something which could emerge as a real threat within the next 5 years.