User:Majavah/Cloud VPS Open vSwitch

Notes about migrating Cloud VPS to use Neutron's Open vSwitch agent instead of the current linuxbridge agent.

Test setup

Set up the networks:

$ sudo wmcs-openstack network create --project admin --share --provider-network-type vxlan lan-flat-cloudinstances3
$ sudo wmcs-openstack subnet create --network lan-flat-cloudinstances3 --subnet-range 172.16.129.0/24 --gateway 172.16.129.1 --dns-nameserver 172.20.254.1 cloud-instances-flat3-codfw-v4
$ sudo wmcs-openstack router create cloudinstances3-flat-gw
$ sudo wmcs-openstack router add subnet cloudinstances3-flat-gw cloud-instances-flat3-codfw-v4
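
To sanity-check what was created before moving on, the standard show subcommands should be enough (output not captured in these notes):

$ sudo wmcs-openstack network show lan-flat-cloudinstances3
$ sudo wmcs-openstack subnet show cloud-instances-flat3-codfw-v4
$ sudo wmcs-openstack router show cloudinstances3-flat-gw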

Set up the network and router on the correct agents:

taavi@cloudcontrol2001-dev ~ $ os network list
+--------------------------------------+---------------------------+------------------------------------------------------------------------------------------------------------------+
| ID                                   | Name                      | Subnets                                                                                                          |
+--------------------------------------+---------------------------+------------------------------------------------------------------------------------------------------------------+
| 05a5494a-184f-4d5c-9e98-77ae61c56daa | lan-flat-cloudinstances2b | 7adfcebe-b3d0-4315-92fe-e8365cc80668                                                                             |
| 57017d7c-3817-429a-8aa3-b028de82cdcc | wan-transport-codfw       | 9341619f-7725-48e6-ba9e-728a2b395528, a9439c35-f465-475c-85a0-8e0f0f41ac4d, b0a91a7b-2e0a-4e82-b0f0-7644f2cfa654 |
| d967e056-efc3-46f2-b75b-c906bb5322dc | HA network tenant admin   | 651250de-53ca-4487-97ce-e6f65dc4b8ec                                                                             |
| e40a1c9f-cc09-4751-a6b8-0469a52318b7 | lan-flat-cloudinstances3  | c2868c65-9af2-4ecb-96af-8de1aa54f530                                                                             |
+--------------------------------------+---------------------------+------------------------------------------------------------------------------------------------------------------+

taavi@cloudcontrol2001-dev ~ $ os network agent list --network e40a1c9f-cc09-4751-a6b8-0469a52318b7
+--------------------------------------+------------+------------------+-------------------+-------+-------+--------------------+
| ID                                   | Agent Type | Host             | Availability Zone | Alive | State | Binary             |
+--------------------------------------+------------+------------------+-------------------+-------+-------+--------------------+
| 06d461b8-b9ec-45a3-8c6e-ef56f22c721b | DHCP agent | cloudnet2006-dev | nova              | :-)   | UP    | neutron-dhcp-agent |
| 59bc1a4d-5bbe-4035-a1cc-5e9a0cc790b2 | DHCP agent | cloudnet2005-dev | nova              | :-)   | UP    | neutron-dhcp-agent |
+--------------------------------------+------------+------------------+-------------------+-------+-------+--------------------+
taavi@cloudcontrol2001-dev ~ $ os network agent list --router 4663da98-bd3d-4f5b-a91d-afc23ca507f6
+--------------------------------------+------------+------------------+-------------------+-------+-------+------------------+
| ID                                   | Agent Type | Host             | Availability Zone | Alive | State | Binary           |
+--------------------------------------+------------+------------------+-------------------+-------+-------+------------------+
| 43074b3b-48d0-430d-9271-4139c73a161f | L3 agent   | cloudnet2007-dev | nova              | :-)   | UP    | neutron-l3-agent |
| 4ce9e60e-797d-47db-8e60-5d01405799eb | L3 agent   | cloudnet2006-dev | nova              | :-)   | UP    | neutron-l3-agent |
+--------------------------------------+------------+------------------+-------------------+-------+-------+------------------+

taavi@cloudcontrol2001-dev ~ $ os network agent remove network --dhcp 59bc1a4d-5bbe-4035-a1cc-5e9a0cc790b2 e40a1c9f-cc09-4751-a6b8-0469a52318b7
taavi@cloudcontrol2001-dev ~ $ os network agent remove network --dhcp 06d461b8-b9ec-45a3-8c6e-ef56f22c721b e40a1c9f-cc09-4751-a6b8-0469a52318b7
taavi@cloudcontrol2001-dev ~ $ os network agent remove router 43074b3b-48d0-430d-9271-4139c73a161f 4663da98-bd3d-4f5b-a91d-afc23ca507f6
taavi@cloudcontrol2001-dev ~ $ os network agent remove router 4ce9e60e-797d-47db-8e60-5d01405799eb 4663da98-bd3d-4f5b-a91d-afc23ca507f6

taavi@cloudcontrol2001-dev ~ $ os network agent list | grep net2007
| 43074b3b-48d0-430d-9271-4139c73a161f | L3 agent           | cloudnet2007-dev  | nova              | :-)   | UP    | neutron-l3-agent          |
| b3bc28b7-e095-4753-8899-e1a6bdf88f40 | DHCP agent         | cloudnet2007-dev  | nova              | :-)   | UP    | neutron-dhcp-agent        |
| ec2a6cbf-27b8-4c10-bad9-278aa183943b | Open vSwitch agent | cloudnet2007-dev  | None              | :-)   | UP    | neutron-openvswitch-agent |
| eedc5b9c-2d68-4810-be07-4921cb1d9b08 | Metadata agent     | cloudnet2007-dev  | None              | :-)   | UP    | neutron-metadata-agent    |
taavi@cloudcontrol2001-dev ~ $ os network agent add network --dhcp b3bc28b7-e095-4753-8899-e1a6bdf88f40 e40a1c9f-cc09-4751-a6b8-0469a52318b7
taavi@cloudcontrol2001-dev ~ $ os network agent add router 43074b3b-48d0-430d-9271-4139c73a161f 4663da98-bd3d-4f5b-a91d-afc23ca507f6
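
Re-running the earlier agent list queries should now show the network and router scheduled only on the cloudnet2007-dev agents (output not captured here):

taavi@cloudcontrol2001-dev ~ $ os network agent list --network e40a1c9f-cc09-4751-a6b8-0469a52318b7
taavi@cloudcontrol2001-dev ~ $ os network agent list --router 4663da98-bd3d-4f5b-a91d-afc23ca507f6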

Schedule test VM:

# unset maintenance
$ sudo wmcs-openstack server create --os-compute-api-version 2.95 --os-project-id taavitestproject --flavor g3.cores1.ram2.disk20 --image debian-12.0-nocloud --security-group 4c29a64f-b883-4622-893c-eb3fd78b0b7f --nic net-id=e40a1c9f-cc09-4751-a6b8-0469a52318b7 --host cloudvirt2001-dev taavi-ovs-test
# set maintenance
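
To check that the instance actually booted and got an address from the new subnet, something along these lines should do (output omitted):

$ sudo wmcs-openstack server show taavi-ovs-test
$ sudo wmcs-openstack console log show taavi-ovs-test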


CloudGW / Network Setup

Some basic notes on how we will try to set this up for day one of the POC, with respect to routing to the outside and into the new instance.

Overview

The basic idea in terms of external connectivity is that the existing CloudGW hosts will continue to provide the NAT and external firewall function for the new OpenStack deploy. The CloudGW will thus connect to both the existing OpenStack instance and the new one. The diagram below shows roughly what this will look like.

The current setup, on the left of the diagram, remains as it is. CloudGW continues to run keepalived to provide an external GW IP to the existing Neutron, and the existing Neutron continues to run VRRP to provide a GW IP for the VM ranges (public/private) to the CloudGW. Routes for the ranges used by Neutron are configured statically on the cloudgw. In place of the existing vlan used between cloudsw and cloudgw (cloud-instance-transport1-b-eqiad) we can use the cloud-private vlan for the rack. And instead of the cloudgw running keepalived on that outside interface, we can establish BGP peering between cloudgw and cloudsw. The cloudgw can then announce the VM ranges to the switches with BGP, and we can remove the static routes for them on the switches.

New Vlan

To connect to the new OpenStack deploy, a new vlan will be deployed on the cloudgws, suggested name cloudgw-openstack2-eqiad (with the desire to rename the equivalent on the old deploy to 'openstack1' at some point). This vlan will be used to transport traffic between the hosts running the Neutron virtual router(s) in the new setup and the CloudGW hosts. Again the intention is to run EBGP between the cloudgws and the Neutron hosts to exchange routes and provide redundancy. A more detailed diagram of the new setup is shown below:

The new cloudgw-openstack2-eqiad vlan will be configured in both of the head-end racks for cloud services in the given datacenter, and trunked between the switches in those two racks. While this goes against netops' usual preference not to stretch layer-2 networks beyond a single rack, it is considered the simplest way forward in the medium term. As the vlan will only be present on two switches there are no layer-2 loops to worry about, so it is perfectly safe.

The reason for stretching it is so that each CloudGW has layer-2 adjacency with each Neutron/cloudnet host and can send traffic directly to it. This also enables the CloudGW to send a default route to the Neutron hosts directly. If the CloudSW were instead involved in the routing between the CloudGW and Neutron, we would need a separate vrf just for that (see below). For as long as dedicated hosts are responsible for the Neutron router functionality, they need to be placed adjacent to the cloudgws, in the same racks as the uplinks from cloudsw to the core routers, in other words connected at the "spine" layer of the cloud network. If the hosts are in those racks then it is far simpler to deploy a layer-2 vlan between the two head-end racks and not involve the cloudsw in routing between cloudnet and cloudgw.
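
As a rough illustration only: on a cloudgw the new vlan would end up as a tagged sub-interface towards the cloudsw, roughly like the below. The vlan id, parent interface name and address here are placeholders, not allocated values, and in practice this would presumably be managed via puppet rather than done by hand:

# vlan id 1120, parent eno1 and the 192.0.2.0/24 address are all placeholders
$ sudo ip link add link eno1 name vlan1120 type vlan id 1120
$ sudo ip link set vlan1120 up
$ sudo ip addr add 192.0.2.2/24 dev vlan1120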

Routing

Both CloudGWs will have an EBGP session to both Neutron routers/hosts. This ensures that either CloudGW can send traffic to the active Neutron router, and that if the active Neutron router changes, traffic can still flow via the same CloudGW (where the conntracks exist) and reach the new primary Neutron host.
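
Operationally this is plain EBGP. Assuming the routing daemon on the cloudgw ends up being FRR (an assumption, the daemon choice is not settled in these notes), verifying the sessions and the routes learned from the cloudnets would look roughly like:

# on a cloudgw (assuming FRR/vtysh)
$ sudo vtysh -c 'show bgp summary'
$ sudo vtysh -c 'show ip route bgp'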

Traffic between the hosts running the Neutron router and the other hypervisor hosts occurs within virtual self-service networks managed by Neutron. This traffic will be encapsulated with VXLAN and sent between the cloud hosts' interfaces on the relevant cloud-private vlans. As VXLAN only requires IP connectivity it can operate across subnets/vlans, so we retain the separate cloud-private vlan in each rack, with the cloudsw routing between them as it does already.
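
For reference, on a host running the OVS agent the VXLAN mesh shows up as tunnel ports on the br-tun bridge that neutron-openvswitch-agent creates by default, so a quick way to inspect it is roughly:

# on a cloudvirt/cloudnet running neutron-openvswitch-agent
$ sudo ovs-vsctl show
$ sudo ovs-vsctl list-ports br-tun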


Avoid stretching cloudgw-openstack2-eqiad between racks c8 and d5

If we were determined not to stretch the new cloudgw-openstack2-eqiad vlan between the two head-end racks, and instead brought the cloud switches in to participate in routing traffic from cloudnet to cloudgw hosts, then that vlan would need to be replaced with a new VRF. The new VRF would carry the traffic from cloudnet to cloudgw hosts, and is needed to accommodate a default route from the cloudgw without conflicting with the default route the switches currently have pointing at the CRs.

As can be seen above, the use of a VRF de-couples the cloudgw and cloudnet. They no longer need to be on the same vlan or to BGP peer directly with each other. Instead they both BGP peer with the cloud switches, and the cloudsw propagates routes from one to the other.

The separate VRF allows the cloudgw to announce a default route to cloudsw without interfering with the one the CRs announce at the top (as it's in a different vrf). The result is that, regardless of what rack a device is in, if it sends external traffic to the cloudsw in the internal vrf, the traffic will route out through the cloudgw. The cloudgws in turn have a leg in the (existing) cloud-external vrf, which enables them to send that traffic out via the CRs.

From a practical point of view this option does not bring many benefits. It does allow us to place hosts absolutely anywhere, but as long as the neutron routing function is centralised on a set of hosts (cloudnets) the optimal placement will always be in the 'head-end'/spine racks, beside the cloudgw and core routers. Putting them elsewhere only wastes bandwidth and increases latency.

For this reason this option is not being considered at the current time by netops. It requires significant additional work and complexity, for zero real benefit.

Distributed Virtual Routing (DVR)

The scenario in which such a setup would make sense is if we no longer centralise the Neutron external gateway functionality, for example if we were to implement Neutron Distributed Virtual Routing. In that scenario every cloudvirt is able to send external traffic directly out - it doesn't have to send it to a separate cloudnet host hosting the external Neutron router. Because we need to spread cloudvirt hosts across all racks we would not be able to connect them all to the same L2 vlan to send outbound traffic to the cloudgw. In other words there would be no option but to use the cloudsw to route between racks, and the 'internal' vrf would be required.

Of course that assumes the cloudgw itself is still needed. If we can remove both cloudnet and cloudgw then we only have one default route - out to the core routers - and the need for a second vrf also disappears.