Working with the OpenStack metadata service when using OVN

Metadata agent

Running the agent

neutron-ovn-metadata-agent --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/neutron_ovn_metadata_agent.ini

Configure neutron_ovn_metadata_agent.ini.j2 on the compute node(s)

[ovn]
ovn_nb_connection=tcp:{{OVN Controller IP}}:6641
ovn_sb_connection=tcp:{{OVN Controller IP}}:6642
ovn_metadata_enabled = true

Configure neutron.conf on the Neutron server

[ovn]
ovn_metadata_enabled = true
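
To sanity-check things once the agent is running, the commands below are a rough verification sketch; depending on your release the agent may or may not show up in the Neutron agent list, and ovnmeta- is the namespace prefix the agent uses.

# Depending on your release, the OVN metadata agents may appear here
openstack network agent list | grep -i metadata

# On the compute node the agent creates one ovnmeta-<network-uuid> namespace
# per network that has ports bound to this hypervisor
ip netns list | grep ovnmeta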

Reading

https://docs.openstack.org/networking-ovn/latest/admin/refarch/refarch.html – For a nice diagram on how the bits fit together

https://man7.org/linux/man-pages/man7/ovn-architecture.7.html – Some more in-depth technical secrets hidden in this doc

https://patchwork.ozlabs.org/project/openvswitch/patch/1493118328-21311-1-git-send-email-dalvarez@redhat.com/

Specifically, the example of local ports:

- One logical switch sw0 with 2 ports (p1, p2) and 1 localport (lp)
- Two hypervisors: HV1 and HV2
- p1 will be in HV1 (OVS port with external-id:iface-id="p1")
- p2 will be in HV2 (OVS port with external-id:iface-id="p2")
- lp will be in both (OVS port with external-id:iface-id="lp")
- p1 should be able to reach p2 and vice versa
- lp on HV1 should be able to reach p1 but not p2
- lp on HV2 should be able to reach p2 but not p1


ovn-nbctl ls-add sw0
ovn-nbctl lsp-add sw0 p1
ovn-nbctl lsp-add sw0 p2
ovn-nbctl lsp-add sw0 lp
ovn-nbctl lsp-set-addresses p1 "00:00:00:aa:bb:10 10.0.1.10"
ovn-nbctl lsp-set-addresses p2 "00:00:00:aa:bb:20 10.0.1.20"
ovn-nbctl lsp-set-addresses lp "00:00:00:aa:bb:30 10.0.1.30"
ovn-nbctl lsp-set-type lp localport
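
A quick way to verify the logical topology before wiring up the ports (not part of the original example, just standard ovn-nbctl/ovn-sbctl checks):

# Logical switch and its three ports as seen by the northbound DB
ovn-nbctl show sw0

# Chassis and port bindings as seen by the southbound DB (run where ovn-sb is reachable)
ovn-sbctl show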

# Creates a namespace-backed OVS internal port and binds it to the given
# OVN logical port via external-ids:iface-id
add_phys_port() {
    name=$1
    mac=$2
    ip=$3
    mask=$4
    gw=$5
    iface_id=$6
    sudo ip netns add $name
    sudo ovs-vsctl add-port br-int $name -- set interface $name type=internal
    sudo ip link set $name netns $name
    sudo ip netns exec $name ip link set $name address $mac
    sudo ip netns exec $name ip addr add $ip/$mask dev $name
    sudo ip netns exec $name ip link set $name up
    sudo ip netns exec $name ip route add default via $gw
    sudo ovs-vsctl set Interface $name external_ids:iface-id=$iface_id
}

# Add p1 to HV1, p2 to HV2 and localport to both

# HV1
add_phys_port p1 00:00:00:aa:bb:10 10.0.1.10 24 10.0.1.1 p1
add_phys_port lp 00:00:00:aa:bb:30 10.0.1.30 24 10.0.1.1 lp

$ sudo ip netns exec p1 ping -c 2 10.0.1.20
PING 10.0.1.20 (10.0.1.20) 56(84) bytes of data.
64 bytes from 10.0.1.20: icmp_seq=1 ttl=64 time=0.738 ms
64 bytes from 10.0.1.20: icmp_seq=2 ttl=64 time=0.502 ms

--- 10.0.1.20 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.502/0.620/0.738/0.118 ms

$ sudo ip netns exec lp ping -c 2 10.0.1.10
PING 10.0.1.10 (10.0.1.10) 56(84) bytes of data.
64 bytes from 10.0.1.10: icmp_seq=1 ttl=64 time=0.187 ms
64 bytes from 10.0.1.10: icmp_seq=2 ttl=64 time=0.032 ms

--- 10.0.1.10 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.032/0.109/0.187/0.078 ms


$ sudo ip netns exec lp ping -c 2 10.0.1.20
PING 10.0.1.20 (10.0.1.20) 56(84) bytes of data.

--- 10.0.1.20 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1000ms


$ sudo ovs-ofctl dump-flows br-int | grep table=32
cookie=0x0, duration=141.939s, table=32, n_packets=2, n_bytes=196,
idle_age=123, priority=150,reg14=0x3,reg15=0x2,metadata=0x7 actions=drop
cookie=0x0, duration=141.939s, table=32, n_packets=2, n_bytes=196,
idle_age=129, priority=100,reg15=0x2,metadata=0x7
actions=load:0x7->NXM_NX_TUN_ID[0..23],set_field:0x2->tun_metadata0,move:NXM_NX_REG14[0..14]->NXM_NX_TUN_METADATA0[16..30],output:59



# On HV2

add_phys_port p2 00:00:00:aa:bb:20 10.0.1.20 24 10.0.1.1 p2
add_phys_port lp 00:00:00:aa:bb:30 10.0.1.30 24 10.0.1.1 lp

$ sudo ip netns exec p2 ping -c 2 10.0.1.10
PING 10.0.1.10 (10.0.1.10) 56(84) bytes of data.
64 bytes from 10.0.1.10: icmp_seq=1 ttl=64 time=0.810 ms
64 bytes from 10.0.1.10: icmp_seq=2 ttl=64 time=0.673 ms

--- 10.0.1.10 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.673/0.741/0.810/0.073 ms

$ sudo ip netns exec lp ping -c 2 10.0.1.20
PING 10.0.1.20 (10.0.1.20) 56(84) bytes of data.
64 bytes from 10.0.1.20: icmp_seq=1 ttl=64 time=0.357 ms
64 bytes from 10.0.1.20: icmp_seq=2 ttl=64 time=0.062 ms

--- 10.0.1.20 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.062/0.209/0.357/0.148 ms

$ sudo ip netns exec lp ping -c 2 10.0.1.10
PING 10.0.1.10 (10.0.1.10) 56(84) bytes of data.

--- 10.0.1.10 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 999ms

$ sudo ovs-ofctl dump-flows br-int | grep table=32
cookie=0x0, duration=24.169s, table=32, n_packets=2, n_bytes=196,
idle_age=12, priority=150,reg14=0x3,reg15=0x1,metadata=0x7 actions=drop
cookie=0x0, duration=24.169s, table=32, n_packets=2, n_bytes=196,
idle_age=14, priority=100,reg15=0x1,metadata=0x7
actions=load:0x7->NXM_NX_TUN_ID[0..23],set_field:0x1->tun_metadata0,move:NXM_NX_REG14[0..14]->NXM_NX_TUN_METADATA0[16..30],output:40

Open vSwitch – Remove a port from a bond

Short answer: There is no dedicated command for that, but it can be done.

Long answer:

ovs-vsctl --id=@eth0 get Interface eth0 -- remove Port bond0 interfaces @eth0

https://mail.openvswitch.org/pipermail/ovs-discuss/2015-May/037372.html
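
As a sanity check (assuming the bond lives on a bridge managed by the local ovs-vswitchd), you can dump the bond membership before and after removing the member:

# List the current members of bond0
ovs-appctl bond/show bond0

# Remove eth0 from bond0 (same command as above)
ovs-vsctl --id=@eth0 get Interface eth0 -- remove Port bond0 interfaces @eth0

# eth0 should no longer be listed as a member
ovs-appctl bond/show bond0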

OpenStack – Layer 2 Gateway (VXLAN -> real-world bridge)

This article is the culmination of hundreds of hours of work; I hope it can save others some time.

Here are some super useful articles that got me across the line:

https://networkop.co.uk/blog/2016/05/21/neutron-l2gw/

https://wiki.openstack.org/wiki/Ovs-flow-logic

http://kimizhang.com/neutron-l2-gateway-hp-5930-switch-ovsdb-integration/

https://drive.google.com/file/d/0Bx8nDIFktlzBRm0tV3pmYURnZ3M/view

https://github.com/openstack/networking-l2gw

Setting up an Open vSwitch VTEP

Step 1 – Kill all Open vSwitch processes

Use ps ax | grep ovs to find any OVS processes that are running and kill them all (a sketch is below).
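
Something along these lines works; the ovs-ctl path below is where the Open vSwitch helper scripts usually live, so treat it as an assumption and adjust for your distro:

# Find any running OVS daemons (ovsdb-server, ovs-vswitchd, ovs-vtep)
ps ax | grep -i ovs | grep -v grep

# Stop them cleanly if the helper script is present...
sudo /usr/share/openvswitch/scripts/ovs-ctl stop

# ...then kill anything left over
sudo kill $(pgrep -f 'ovsdb-server|ovs-vswitchd|ovs-vtep') 2>/dev/null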

Step 2 – Bring up Open vSwitch as a VTEP

Now configure the script below to suit your environment.

This process will kill any OVS config you have in place; if you like your config… well… do something else!

Here we use ens4 as our ‘trunk port’, and the name of our ‘physical switch’ (actually Open vSwitch running on a server/VM) is switch-l2gw02.

172.0.0.170 is the IP of the machine running OVS (presumably the machine running this script).

#!/bin/bash
modprobe openvswitch
ip link set up dev ens4
rm /etc/openvswitch/*
ovsdb-tool create /etc/openvswitch/vtep.db /usr/share/openvswitch/vtep.ovsschema
ovsdb-tool create /etc/openvswitch/vswitch.db /usr/share/openvswitch/vswitch.ovsschema
mkdir /var/run/openvswitch/
ovsdb-server --pidfile --detach --log-file --remote ptcp:6632:172.0.0.170 \
 --remote punix:/var/run/openvswitch/db.sock --remote=db:hardware_vtep,Global,managers \
 /etc/openvswitch/vswitch.db /etc/openvswitch/vtep.db
ovs-vswitchd --log-file --detach --pidfile unix:/var/run/openvswitch/db.sock
ovs-vsctl add-br switch-l2gw02
vtep-ctl add-ps switch-l2gw02
vtep-ctl set Physical_Switch switch-l2gw02 tunnel_ips=172.0.0.170
ovs-vsctl add-port switch-l2gw02 ens4
vtep-ctl add-port switch-l2gw02 ens4
/usr/share/openvswitch/scripts/ovs-vtep \
 --log-file=/var/log/openvswitch/ovs-vtep.log \
 --pidfile=/var/run/openvswitch/ovs-vtep.pid \
 --detach switch-l2gw02
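
If the script worked, the VTEP should be visible in both databases and listening for the L2GW agent. These are purely verification commands:

# The hardware_vtep database should show our 'physical switch' and its tunnel IP
vtep-ctl show

# The regular OVS database should show the bridge with ens4 attached
ovs-vsctl show

# ovsdb-server should be listening on TCP 6632 for the Neutron L2GW agent
ss -lntp | grep 6632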

Install and Configure the Neutron L2 Agent

For me the L2 gateway agent was available in the APT repo, so the installation was nice and simple.

This is a configuration agent: it doesn’t move any packets itself, it just orchestrates the required changes on the VTEPs, so I run it on the same VM as my Neutron server services.

Set the following line in l2gateway_agent.ini

ovsdb_hosts = 'l2gw01:172.0.0.169:6632,l2gw02:172.0.0.170:6632'
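
For reference, that option typically sits under the [ovsdb] section of l2gateway_agent.ini (check the sample config shipped with your package if yours differs), and the agent needs a restart to pick the change up. The service name below is an assumption; adjust for your distro:

[ovsdb]
# name:ip:port tuples, one per VTEP the agent should manage
ovsdb_hosts = 'l2gw01:172.0.0.169:6632,l2gw02:172.0.0.170:6632'

sudo systemctl restart neutron-l2gateway-agent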

Inbound ARP bug

This is a biggie and it’s a giant PIA.
Inbound ARP requests will hit your VTEP but will not be forwarded on. Even if the VTEP were to forward them on, it does not have an OVS table suitable for sending broadcast packets (that is, a table that specifies an output port for every VXLAN endpoint).

So to achieve this we use a bit of a workaround. First, set a kind of ‘failover’ for all multicast packets on the VTEP, forwarding these unknown packets (inbound ARP requests) to one of the ‘network nodes’, that is, a Neutron node that has a line in "ovs-ofctl dump-flows br-tun" that looks like this:
table=22, n_packets=15, n_bytes=1030, idle_age=11531, priority=1,dl_vlan=9 actions=strip_vlan,load:0x3fa->NXM_NX_TUN_ID[],output:9,output:2,output:4,output:13
This is a broadcast rule; anything that hits it will be sent to all relevant VXLAN endpoints.
(I say relevant because it seems to only output to ports that have devices on the other end of the same VXLAN, e.g. if you have a compute node that doesn’t have any VMs using that VXLAN network, the output port entry for that VXLAN tunnel won’t appear.)

To configure this ‘failover’, run:
sudo vtep-ctl add-mcast-remote 818b4779-645c-49bb-ae4a-aa9340604019 unknown-dst 10.0.3.10
Where the UUID is the result of vtep-ctl list-ls, and the IP address is the IP of the Neutron network node with the table 22 rule in place.
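
To find the logical switch identifier and confirm the remote got added, the standard vtep-ctl commands are enough (verification only):

# List the logical switches the VTEP knows about; with networking-l2gw these
# are named after the Neutron network UUID
vtep-ctl list-ls

# The unknown-dst entry pointing at 10.0.3.10 should now show up here
vtep-ctl list-remote-macs 818b4779-645c-49bb-ae4a-aa9340604019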

Helpful hint: To find the names and numbers of the ports, use this command:
ovs-vsctl -f table -- --columns=name,ofport,external-ids,options list interface

Ok, so now the ARP packets are heading to the network node, but we aren’t quite done: we need to convince the network node to shunt all ARP requests out to table 22 and table 10. (See here for a more detailed explanation from someone who actually knows what they are talking about: https://networkop.co.uk/blog/2016/05/21/neutron-l2gw/ under the heading “Programming Network Node as BUM replication service node”.)

To achieve this we need to add the following rule to br-tun in table 4:
table=4,arp,tun_id=0x3fa,priority=2,actions=mod_vlan_vid:9,resubmit(,10),resubmit(,22)

Where 0x3fa is the segmentation ID of our network in hex format and VLAN 9 is the VLAN used on THAT node for processing. You can find the VLAN in use by running ovs-ofctl dump-flows br-tun | grep 0x3fa
You’ll see a few entries and they’ll all share the same vlan_id. That’s what we are after for our custom rule.
When you have all the info, run:
ovs-ofctl add-flow br-tun "table=4,arp,tun_id=0x3fa,priority=2,actions=mod_vlan_vid:9,resubmit(,10),resubmit(,22)"
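
To confirm the rule is in place and actually matching, dump table 4 again and watch the packet counters climb while something outside ARPs for one of the VMs (verification only):

# The custom rule should appear with priority=2 and its n_packets counter
# should increase as inbound ARP requests arrive
sudo ovs-ofctl dump-flows br-tun table=4 | grep arp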

And that’s it! Now the ARP requests from the outside world hit the VTEP and overflow to the network node, which kindly broadcasts them out to the VXLAN endpoints for us.

I hope this has helped you in some way to join your OpenStack VXLAN networks to the real world.