Openstack – Layer2 Gateway(VXLAN -> Real world bridge)

This article is the culmination of 100’s of hours of work, I hope it can save others some time.

Here are some super useful articles that got me across the line

https://networkop.co.uk/blog/2016/05/21/neutron-l2gw/

https://wiki.openstack.org/wiki/Ovs-flow-logic

http://kimizhang.com/neutron-l2-gateway-hp-5930-switch-ovsdb-integration/

https://drive.google.com/file/d/0Bx8nDIFktlzBRm0tV3pmYURnZ3M/view

https://github.com/openstack/networking-l2gw

 

Setting up and Openvswitch VTEP

Step1 – Kill all Openvswitch processes

use ps ax | grep ovs to find any ovs processes that are running and kill them all

Step 2 – Bring up Openvswitch as a VTEP

Now configure the script below to suit

This process will kill any OVS config you have in place, if you like your config… well… do something else!

Here we use ens4 as out ‘trunk port’ and the name of our ‘physical switch'(Actually OpenvSwitch running on a server\VM) is switch-l2gw02

172.0.0.170 is the IP of the machine running OVS(Presumable the machine running this script)

#!/bin/bash
modprobe openvswitch
ip link set up dev ens4
rm etc/openvswitch/*
ovsdb-tool create /etc/openvswitch/vtep.db /usr/share/openvswitch/vtep.ovsschema
ovsdb-tool create /etc/openvswitch/vswitch.db /usr/share/openvswitch/vswitch.ovsschema
mkdir /var/run/openvswitch/
ovsdb-server --pidfile --detach --log-file --remote ptcp:6632:172.0.0.170 \
 --remote punix:/var/run/openvswitch/db.sock --remote=db:hardware_vtep,Global,managers \
 /etc/openvswitch/vswitch.db /etc/openvswitch/vtep.db
ovs-vswitchd --log-file --detach --pidfile unix:/var/run/openvswitch/db.sock
ovs-vsctl add-br switch-l2gw02
vtep-ctl add-ps switch-l2gw02
vtep-ctl set Physical_Switch switch-l2gw02 tunnel_ips=172.0.0.170
ovs-vsctl add-port switch-l2gw02 ens4
vtep-ctl add-port switch-l2gw02 ens4
/usr/share/openvswitch/scripts/ovs-vtep \
 --log-file=/var/log/openvswitch/ovs-vtep.log \
 --pidfile=/var/run/openvswitch/ovs-vtep.pid \
 --detach switch-l2gw02

Install and Configure the Neutron L2 Agent

For me the l2agent was available in the APT repo, so the installation was nice and simple.

This is a configuration agent, it doesn’t move any packets itself but it just orchestrates the required changes to the VTEP’s, So i run this on the same VM as my Neutron server services

Set the following line in l2gateway_agent.ini

ovsdb_hosts ='l2gw01:172.0.0.169:6632,l2gw02:172.0.0.170:6632'

 

 

Inbound ARP bug

This is a biggie and it’s a giant PIA
Inbound ARP requests will hit your VTEP but will not be forwarded on, even if the VTEP was to forward them on, it does have a ovs table suitable for sending broadcast packets(That is a table that speciifies an output port of every VXLan endpoint)

So to achieve this we use a bit of a workaround, first set a kind of ‘failover’ for all multicast packets on the VTEP to forward these unknown packets(inbound ARP requests) to one of the ‘network nodes’ that is a Neutron node that’s got a line in "ovs-vfctl dump-flows br-tun" that looks like this
table=22, n_packets=15, n_bytes=1030, idle_age=11531, priority=1,dl_vlan=9 actions=strip_vlan,load:0x3fa->NXM_NX_TUN_ID[],output:9,output:2,output:4,output:13
This is a boradcast rule, anything that hits it will be sent to all relevant VXLAN endpoints.
(I say relevant because it seems that it out outputs to ports that have devices on the other end on the same VXLAN, E.g If you have a compute node that doesn’t have any VM’s using that VXLAN network, the output port entry for that vxlan tunnel wont appear)

To configure this ‘failover’ run
sudo vtep-ctl add-mcast-remote 818b4779-645c-49bb-ae4a-aa9340604019 unknown-dst 10.0.3.10
Where the UUID is the result of vtep-ctl list-ls and the IP address is the IP of the neutron network node with table22 in place

Helpful hint: To find the names and numbers of the ports use this command
ovs-vsctl -f table -- --columns=name,ofport,external-ids,options list interface

Ok, so now the ARP packets are heading to the Network node but we aren’t quite done, we need to convince the network node to shunt all ARP requests out to table 22 and table 10(See here for a more detailed explanation from someone who actually knows what they are talking about https://networkop.co.uk/blog/2016/05/21/neutron-l2gw/ under the heading “Programming Network Node as BUM replication service node“)

To achieve this we need to add the following rule to br-tun on table 4
table=4,arp,tun_id=0x3fa,priority=2,actions=mod_vlan_vid:9,resubmit(,10),resubmit(,22)

Where 0x3fa is the segmentation id of our network in HEX format and vlan9 is the vlan used on THAT node for processing, you can find this ID by running ovs-ofctl dump-flows br-tun | grep 0x3fa
You’ll see a few entries and they’ll all share then same vlan_id.. That’s what we are after for our custom rule
When you have all the info run
ovs-ofctl add-flow br-tun "table=4,arp,tun_id=0x3fa,priority=2,actions=mod_vlan_vid:9,resubmit(,10),resubmit(,22)"

And that’s it!, not he ARP requests form the outside world hit the VTEP, overflow to the network node which kindly broadcasts them out to the VXLAN endpoints for us.

I hope this has helped you in some way to join your Openstack VXLAN networks to the real world.