Portal:Cloud VPS/Admin/Network
This page explains how the CloudVPS network works, including the Neutron OpenStack component.
For the sake of explanation, this document uses the eqiad1 deployment as the example; other deployments use the same mechanisms.
Network topology

There are 2 different kinds of networks involved:
- control plane networks: those used by physical servers for SSH, puppet, monitoring, etc. These are wiki-production networks, usually in the 10.x.x.x range.
- data plane networks: those used by CloudVPS virtual clients, and all the traffic doing ingress/egress through the edge of the network.
On the top-of-rack switches these networks are divided into separate VRFs, or routing-instances. This keeps them as two separate, private domains on the connected cloudsw devices. The CR (Core Router) routers provide a default route to the switches into both of the networks, and thus traffic that needs to route between the two networks flows via the CRs where ACLs/filters are used for policy control.
There are 3 routers involved:
- The neutron virtual router (by means of neutron-l3-agent, neutron-linuxbridge-agent, neutron-server, etc). This router connects the internal software-defined networks to the cloud edge network.
- The physical cloudgw router (a pair of Linux servers). This router is the main gateway for all CloudVPS ingress/egress traffic, and is the main network endpoint facing the public internet.
- The physical cloudsw routers. These devices connect cloudgw to the rest of the internet, including wiki-production networks.
Datacenter network
cloudvirts
- control plane: primary interface (for example eth0) connected to the physical switch in their rack. The switch port connecting to this interface doesn't need any specific configuration.
- data plane: secondary interface (for example eth1) connected to the physical switch in their rack. This switch port must be configured in VLAN tagged mode for vlan 1105.
There has been some research on whether we should collapse the 2 interfaces into one, aiming to reduce usage of 10G ports on the switches. The initial research showed promising results, but we haven't introduced this change yet.
cloudnet
- control plane: primary interface (for example eth0) connected to the physical switch in their rack. The switch port connecting to this interface doesn't need any specific configuration.
- data plane: secondary interface (for example eth1) connected to the physical switch in their rack. This switch port must be configured as a VLAN trunk with vlan 1105 and vlan 1107.
cloudgw
- control plane: primary interface (for example eth0) connected to the physical switch in their rack. The switch port connecting to this interface doesn't need any specific configuration.
- data plane: secondary interface (for example eth1) connected to the physical switch in their rack. The switch port must be configured as a VLAN trunk with vlan 1120 and vlan 1107.
- These hosts fail over automatically, so for a full reboot just take one down, wait for it to come back up, run the tests cookbook (wmcs.openstack.network.tests; see this to set it up on your laptop), and then reboot the other.
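The data plane VLAN lists for cloudvirt, cloudnet and cloudgw imply a chain of layer 2 adjacencies. The following is a minimal Python sketch of that reasoning; the dictionary and function names are hypothetical, and the VLAN IDs are only the ones listed on this page for eqiad1.

```python
from itertools import combinations

# Data plane VLANs expected on the switch port of each host role,
# as listed on this page (eqiad1). The structure itself is illustrative.
DATA_PLANE_VLANS = {
    "cloudvirt": {1105},
    "cloudnet": {1105, 1107},
    "cloudgw": {1107, 1120},
}


def shared_vlans(vlans_by_role):
    """Return {(role_a, role_b): common VLANs} for every pair sharing a VLAN."""
    shared = {}
    for a, b in combinations(sorted(vlans_by_role), 2):
        common = vlans_by_role[a] & vlans_by_role[b]
        if common:
            shared[(a, b)] = common
    return shared


if __name__ == "__main__":
    for (a, b), vlans in shared_vlans(DATA_PLANE_VLANS).items():
        print(f"{a} <-> {b}: vlan {sorted(vlans)}")
```

Read this way, cloudvirt and cloudnet meet on vlan 1105, cloudnet and cloudgw meet on vlan 1107, and cloudvirt and cloudgw share no VLAN, so traffic between them has to pass through cloudnet.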
ceph osd
- ceph control plane (ssh, monitoring, mon communication, client communication):
  - Primary interface on external card (for example ens2f0np0)
  - 10.64.20.0/24 network
  - Connected to the physical switch in their rack
  - The switch port connecting to this interface must be configured with untagged vlan 1118 (cloud-hosts1-eqiad).
- ceph data plane (osd to osd communication):
  - Secondary interface on external card (for example ens2f1np1)
  - 192.168.4.0/24 network
  - Connected to the physical switch in their rack
  - The switch port connecting to this interface must be configured with untagged vlan 1106 (cloud-storage1-eqiad).
ceph mons
- ceph control plane (ssh, monitoring, client communication, osd communication):
  - Primary interface on external card (for example ens2f0np0)
  - 10.64.20.0/24 network
  - Connected to the physical switch in their rack
  - The switch port connecting to this interface must be configured with untagged vlan 1118 (cloud-hosts1-eqiad).
Edge network

- neutron manages floating IP NAT and all the software defined network in the virtual realm.
- cloudgw handles routing_source_ip and dmz_cidr and connects neutron to cloudsw.
- cloudsw connects to the internet and the rest of wiki-production networks.
Virtual network
TODO. Inside the virtual realm.
Topology data example
Eqiad
In the case of the eqiad1 deployment, the relevant elements for the cloud network are:
What | Neutron network object | Neutron subnet object | Physical name | Addressing | Netbox |
---|---|---|---|---|---|
LAN for instances | lan-flat-cloudinstances2b | cloud-instances2-b-eqiad | cloud-instances2-b-eqiad (vlan 1105) | 172.16.0.0/21 | vlan 1105 cidr |
WAN for floating IPs | wan-transport-eqiad | cloud-eqiad1-floating | --- (no vlan) | 185.15.56.0/25 | cidr |
WAN for transport | wan-transport-eqiad | cloud-gw-transport-eqiad | cloud-gw-transport-eqiad (vlan 1107) | 185.15.56.236/30 | vlan 1107 cidr |
WAN for transport | --- (ignored by neutron) | --- (ignored by neutron) | cloud-instances-transport1-b-eqiad (vlan 1120) | 185.15.56.240/29 | vlan 1120 cidr |
Per-rack networks are shown below. 'Legacy' LAN ranges in the production realm connect existing hosts, but no new hosts should be added to them. The new, per-rack vlans/subnets get used instead.
Rack C8:
What | Neutron network object | Neutron subnet object | Physical name | Addressing | Netbox |
---|---|---|---|---|---|
LAN provider (control plane) | --- (ignored by neutron) | --- (ignored by neutron) | cloud-hosts1-c8-eqiad (vlan 1128) | 10.64.151.0/24, 2620:0:861:11f::/64 | vlan 1128 ipv4 ipv6 |
Legacy LAN provider (control plane) | --- (ignored by neutron) | --- (ignored by neutron) | cloud-hosts1-eqiad (vlan 1118) | 10.64.20.0/24, 2620:0:861:118::/64 | vlan 1118 ipv4 ipv6 |
Storage Network | --- | --- | cloud-storage1-eqiad (vlan 1106) | 192.168.4.0/24 | vlan 1106 N/A |
Rack D5:
What | Neutron network object | Neutron subnet object | Physical name | Addressing | Netbox |
---|---|---|---|---|---|
LAN provider (control plane) | --- (ignored by neutron) | --- (ignored by neutron) | cloud-hosts1-d5-eqiad (vlan 1127) | 10.64.150.0/24, 2620:0:861:11e::/64 | vlan 1127 ipv4 ipv6 |
Legacy LAN provider (control plane) | --- (ignored by neutron) | --- (ignored by neutron) | cloud-hosts1-eqiad (vlan 1118) | 10.64.20.0/24, 2620:0:861:118::/64 | vlan 1118 ipv4 ipv6 |
Storage Network | --- | --- | cloud-storage1-eqiad (vlan 1106) | 192.168.4.0/24 | vlan 1106 N/A |
Rack E4:
What | Neutron network object | Neutron subnet object | Physical name | Addressing | Netbox |
---|---|---|---|---|---|
LAN provider (control plane) | --- (ignored by neutron) | --- (ignored by neutron) | cloud-hosts1-e4-eqiad (vlan 1123) | 10.64.148.0/24, 2620:0:861:11c::/64 | vlan 1123 ipv4 ipv6 |
Storage Network | --- | --- | cloud-storage1-e4-eqiad (vlan 1121) | 192.168.5.0/24 | vlan 1121 [N/A] |
Rack F4:
What | Neutron network object | Neutron subnet object | Physical name | Addressing | Netbox |
---|---|---|---|---|---|
LAN provider (control plane) | --- (ignored by neutron) | --- (ignored by neutron) | cloud-hosts1-f4-eqiad (vlan 1124) | 10.64.149.0/24, 2620:0:861:11d::/64 | vlan 1124 ipv4 ipv6 |
Storage Network | --- | --- | cloud-storage1-f4-eqiad (vlan 1122) | 192.168.6.0/24 | vlan 1122 [N/A] |
Codfw
In the case of the codfw1dev deployment, the relevant elements are:
What | Neutron network object | Neutron subnet object | Physical name | Addressing | Netbox |
---|---|---|---|---|---|
LAN for instances | lan-flat-cloudinstances2b | cloud-instances2-b-codfw | cloud-instances2-b-codfw (vlan 2105) | 172.16.128.0/24 | vlan 2105 cidr |
WAN for floating IPs | wan-transport-codfw | cloud-codfw1dev-floating | --- (no vlan) | 185.15.57.0/29 | cidr |
WAN for transport | wan-transport-codfw | cloud-gw-transport-codfw | cloud-gw-transport-codfw (vlan 2107) | 185.15.57.8/30 | vlan 2107 cidr |
LAN provider (HW servers) | --- (ignored by neutron) | --- (ignored by neutron) | cloud-hosts1-b-codfw (vlan 2118) | 10.192.20.0/24 | vlan 2118 cidr |
WAN for transport | --- (ignored by neutron) | --- (ignored by neutron) | cloud-instances-transport1-b-codfw (vlan 2120) | 185.15.56.240/29 | vlan 2120 cidr |
Ingress & Egress
Some notes on the ingress & egress particularities.
routing_source_ip
By default, all the traffic from VMs to the Internet (egress) is source NATed using a single IPv4 address. This address is called routing_source_ip.
There are 2 cases in which this egress NAT is not applied:
- the VM->destination pair is part of the dmz_cidr exclusions
- the VM has an explicit floating IP associated (the floating IP will be used for both SNAT and DNAT)
dmz_cidr
The dmz_cidr mechanism allows us to define certain IP ranges that VMs can talk to directly, without NAT being involved.
A typical configuration per deployment looks like this (please refer to ops/puppet.git for actual hiera values):
profile::openstack::eqiad1::cloudgw::dmz_cidr:
  # VMs --> wiki (text-lb.eqiad)
  - "172.16.0.0/21 . 208.80.154.224"
  # VMs --> wiki (upload-lb.eqiad)
  - "172.16.0.0/21 . 208.80.154.240"
You can read this config as a list of "src . dst" pairs: do not apply NAT to connections going from src to dst.
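To make the "src . dst" reading concrete, here is a minimal Python sketch that parses such entries and tests whether a connection would be excluded from NAT. The function names are hypothetical, and the sketch only models how to read the configuration; it does not describe how cloudgw implements the exclusion.

```python
import ipaddress

# Example entries copied from the hiera snippet on this page;
# the real values live in ops/puppet.git.
DMZ_CIDR = [
    "172.16.0.0/21 . 208.80.154.224",  # VMs --> wiki (text-lb.eqiad)
    "172.16.0.0/21 . 208.80.154.240",  # VMs --> wiki (upload-lb.eqiad)
]


def parse_dmz_cidr(entries):
    """Turn 'src . dst' strings into (src_network, dst_network) pairs."""
    pairs = []
    for entry in entries:
        src, dst = (part.strip() for part in entry.split(" . "))
        # A bare address such as 208.80.154.224 becomes a /32 network.
        pairs.append((ipaddress.ip_network(src), ipaddress.ip_network(dst)))
    return pairs


def skips_nat(src_ip, dst_ip, pairs):
    """True if the connection matches any src/dst pair, i.e. no NAT is applied."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    return any(src in s and dst in d for s, d in pairs)


if __name__ == "__main__":
    pairs = parse_dmz_cidr(DMZ_CIDR)
    print(skips_nat("172.16.1.10", "208.80.154.224", pairs))
    print(skips_nat("172.16.1.10", "8.8.8.8", pairs))
```

With the example entries, a VM at 172.16.1.10 talking to 208.80.154.224 matches the first pair, while the same VM talking to any other destination does not.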
Please note that the dmz_cidr mechanism takes precedence over the routing_source_ip configuration.
A static route is required on the routers so return traffic knows what path to take to reach the Cloud Private IPs.
For example on cr1/2-eqiad: routing-options static route 172.16.0.0/21 next-hop 185.15.56.244/29
Floating IPs
This mechanism allows us to create an additional public IPv4 address in Neutron. This new IP address is then associated with a given instance, and all of its egress/ingress traffic will use it (both SNAT and DNAT).
Because these addresses are a limited resource, a quota must be assigned to a project beforehand.
Please note that the dmz_cidr mechanism overrides floating IP NAT configurations, and you can see non-NATed packets arriving at VMs with a floating IP assigned.
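The precedence between the three mechanisms (dmz_cidr first, then floating IP, then routing_source_ip) can be sketched as a small Python decision function. This is an illustrative model only: the function name and the floating IP mapping are hypothetical, while the dmz_cidr pair and the routing_source_ip value are taken from the eqiad1 examples on this page.

```python
import ipaddress

# dmz_cidr pair from the hiera example on this page.
DMZ_PAIRS = [
    (ipaddress.ip_network("172.16.0.0/21"), ipaddress.ip_network("208.80.154.224/32")),
]
# Hypothetical floating IP association, for illustration only.
FLOATING_IPS = {"172.16.1.20": "185.15.56.50"}
# routing_source_ip for eqiad1, as listed on this page.
ROUTING_SOURCE_IP = "185.15.56.1"


def egress_source(vm_ip, dst_ip, dmz_pairs, floating_ips, routing_source_ip):
    """Return the source address an egress packet is expected to leave with."""
    src = ipaddress.ip_address(vm_ip)
    dst = ipaddress.ip_address(dst_ip)
    # 1. dmz_cidr exclusions win: the packet keeps its private address.
    if any(src in s and dst in d for s, d in dmz_pairs):
        return vm_ip
    # 2. A floating IP, if associated, is used for SNAT.
    if vm_ip in floating_ips:
        return floating_ips[vm_ip]
    # 3. Otherwise the shared routing_source_ip is used.
    return routing_source_ip


if __name__ == "__main__":
    for vm, dst in [
        ("172.16.1.10", "8.8.8.8"),
        ("172.16.1.20", "8.8.8.8"),
        ("172.16.1.20", "208.80.154.224"),
    ]:
        print(vm, "->", dst, "leaves as",
              egress_source(vm, dst, DMZ_PAIRS, FLOATING_IPS, ROUTING_SOURCE_IP))
```

The last case is the one called out above: a VM with a floating IP still reaches a dmz_cidr destination with its private address.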
Here is an example of 3 software-defined floating IPs created by Neutron. The example uses the codfw1dev deployment rather than eqiad1 for brevity, but it works exactly the same:
root@cloudnet2003-dev:~ # nft -s list chain ip nat neutron-l3-agent-float-snat
table ip nat {
    chain neutron-l3-agent-float-snat {
        ip saddr 172.16.128.19 counter snat to 185.15.57.2 fully-random
        ip saddr 172.16.128.20 counter snat to 185.15.57.4 fully-random
        ip saddr 172.16.128.26 counter snat to 185.15.57.6 fully-random
    }
}
root@cloudnet2003-dev:~ # nft -s list chain ip nat neutron-l3-agent-OUTPUT
table ip nat {
    chain neutron-l3-agent-OUTPUT {
        ip daddr 185.15.57.2 counter dnat to 172.16.128.19
        ip daddr 185.15.57.4 counter dnat to 172.16.128.20
        ip daddr 185.15.57.6 counter dnat to 172.16.128.26
    }
}
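The two chains above are mirror images of each other: each SNAT rule has a matching DNAT rule. A minimal Python sketch that extracts both mappings from such output and checks that they are symmetric could look like this. The regular expressions assume the stateless (`nft -s`) rule layout shown above, and the function names are hypothetical.

```python
import re

# Patterns assume the rule layout of the `nft -s` output shown on this page.
SNAT_RULE = re.compile(r"ip saddr (\S+) counter snat to (\S+)")
DNAT_RULE = re.compile(r"ip daddr (\S+) counter dnat to (\S+)")

SNAT_OUTPUT = """\
table ip nat {
    chain neutron-l3-agent-float-snat {
        ip saddr 172.16.128.19 counter snat to 185.15.57.2 fully-random
        ip saddr 172.16.128.20 counter snat to 185.15.57.4 fully-random
        ip saddr 172.16.128.26 counter snat to 185.15.57.6 fully-random
    }
}
"""

DNAT_OUTPUT = """\
table ip nat {
    chain neutron-l3-agent-OUTPUT {
        ip daddr 185.15.57.2 counter dnat to 172.16.128.19
        ip daddr 185.15.57.4 counter dnat to 172.16.128.20
        ip daddr 185.15.57.6 counter dnat to 172.16.128.26
    }
}
"""


def snat_map(text):
    """Map private VM address -> floating IP, from the float-snat chain."""
    return dict(SNAT_RULE.findall(text))


def dnat_map(text):
    """Map floating IP -> private VM address, from the DNAT chain."""
    return dict(DNAT_RULE.findall(text))


def is_symmetric(snat, dnat):
    """True if every SNAT rule has the inverse DNAT rule and vice versa."""
    return {public: private for private, public in snat.items()} == dnat


if __name__ == "__main__":
    snat = snat_map(SNAT_OUTPUT)
    dnat = dnat_map(DNAT_OUTPUT)
    print(snat)
    print("symmetric:", is_symmetric(snat, dnat))
```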
traffic from instance to own floating IP
A VM instance may try to send traffic to its own floating IP. As described in T217681#5035533 - Cloud VPS instance with floating (public) IP can not ping that IP directly, this is not possible with the default configuration.
The packet arriving back at the VM instance would be a martian packet, because its source address is one of the instance's own local addresses.
A workaround for this is to instruct the network stack to accept this kind of martian packet:
sysctl net.ipv4.conf.all.accept_local=1
accept_local - BOOLEAN
    Accept packets with local source addresses. In combination with suitable routing, this can be used to direct packets between two local interfaces over the wire and have them accepted properly.
    default FALSE
ingress & egress data example
Some important IP addresses in the eqiad1 deployment:
Type | Name | Address | Explanation | Where is defined, where to change it | DNS FQDN |
---|---|---|---|---|---|
ingress | incoming gateway | 185.15.56.244/29 | neutron address in the WAN transport subnet for ingress | Core routers (static route) & neutron main router object | cloudinstances2b-gw.openstack.eqiad1.wikimediacloud.org |
egress | routing_source_ip | 185.15.56.1 | IP address for main source NAT for VMs (mind dmz_cidr exclusions) | /etc/neutron/l3_agent.ini in cloudnet nodes (puppet). No NIC has this IP assigned. | nat.openstack.eqiad1.wikimediacloud.org |
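As a quick sanity check, the eqiad1 addresses in this table can be compared against the ranges in the topology table using Python's ipaddress module. This is only a sketch: the constant and function names are hypothetical, and the checks merely observe which listed range each address falls into, without implying how the values are configured.

```python
import ipaddress

# Values copied from the eqiad1 tables on this page.
FLOATING_RANGE = ipaddress.ip_network("185.15.56.0/25")
TRANSPORT_VLAN_1120 = ipaddress.ip_network("185.15.56.240/29")
INCOMING_GATEWAY = ipaddress.ip_interface("185.15.56.244/29")
ROUTING_SOURCE_IP = ipaddress.ip_address("185.15.56.1")


def check_addresses():
    """Return simple membership checks between the listed addresses and ranges."""
    return {
        "gateway_in_vlan_1120_subnet": INCOMING_GATEWAY.network == TRANSPORT_VLAN_1120,
        "routing_source_ip_in_floating_range": ROUTING_SOURCE_IP in FLOATING_RANGE,
    }


if __name__ == "__main__":
    for name, ok in check_addresses().items():
        print(f"{name}: {ok}")
```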
Some important IP addresses in the codfw1dev deployment:
Type | Name | Address | Explanation | Where is defined, where to change it | DNS FQDN |
---|---|---|---|---|---|
ingress | incoming gateway | 208.80.153.190/29 | neutron address in the WAN transport subnet for ingress | Core routers (static route) & neutron main router object | cloudinstances2b-gw.openstack.codfw1dev.wikimediacloud.org |
egress | routing_source_ip | 185.15.57.1 | IP address for main source NAT for VMs (mind dmz_cidr exclusions) | /etc/neutron/l3_agent.ini in cloudnet nodes (puppet). No NIC has this IP assigned. | nat.openstack.codfw1dev.wikimediacloud.org |
What Neutron is doing
This section sheds some light on how Neutron implements our network topology under the hood, and what it does with all this configuration.
Neutron uses 2 specific boxes: cloudnetXXXX.site.wmnet and cloudnetXXXX.site.wmnet (active-standby).
The neutron-server service (daemon, API, etc) runs on cloudcontrol boxes.
All the agents run on cloudnet boxes; neutron-linuxbridge-agent also runs on cloudvirt boxes.
Example of running agents |
---|
root@cloudcontrol1003:~# neutron agent-list
+--------------------------------------+--------------------+---------------+-------------------+-------+----------------+---------------------------+
| id | agent_type | host | availability_zone | alive | admin_state_up | binary |
+--------------------------------------+--------------------+---------------+-------------------+-------+----------------+---------------------------+
| 468aef2a-8eb6-4382-abba-bc284efd9fa5 | DHCP agent | cloudnet1004 | nova | :-) | True | neutron-dhcp-agent |
| 601bef99-b53c-4e6a-b384-65d1feebedff | Metadata agent | cloudnet1003 | | :-) | True | neutron-metadata-agent |
| 8af5d8a1-2e29-40e6-baf0-3cd79a7ac77b | L3 agent | cloudnet1003 | nova | :-) | True | neutron-l3-agent |
| 970df1d1-505d-47a4-8d35-1b13c0dfe098 | L3 agent | cloudnet1004 | nova | :-) | True | neutron-l3-agent |
| 9f8833de-11a4-4395-8da5-f57fe8326659 | Linux bridge agent | cloudnet1003 | | :-) | True | neutron-linuxbridge-agent |
| ad3461d7-b79e-4279-921d-5a476e296767 | Linux bridge agent | cloudnet1004 | | :-) | True | neutron-linuxbridge-agent |
| b2f9da63-2f16-4aa5-9400-ae708a733f91 | Linux bridge agent | cloudvirt1021 | | :-) | True | neutron-linuxbridge-agent |
| d475e07d-52b3-476e-9a4f-e63b21e1075e | Metadata agent | cloudnet1004 | | :-) | True | neutron-metadata-agent |
| e382a233-e6a0-422e-9d2e-5651082783fc | Linux bridge agent | cloudvirt1022 | | :-) | True | neutron-linuxbridge-agent |
| ff2a8228-3748-4588-927b-4b6563da9ca0 | DHCP agent | cloudnet1003 | nova | :-) | True | neutron-dhcp-agent |
+--------------------------------------+--------------------+---------------+-------------------+-------+----------------+---------------------------+
|
When a virtual router is created, and assigned to an l3-agent, a linux network namespace (netns for short) will be created:
Example virtual router netns and l3 agents hosting routers |
---|
root@cloudnet1004:~# ip netns list | grep router
qrouter-d93771ba-2711-4f88-804a-8df6fd03978a
root@cloudcontrol1003:~# neutron l3-agent-list-hosting-router d93771ba-2711-4f88-804a-8df6fd03978a
+--------------------------------------+--------------+----------------+-------+----------+
| id | host | admin_state_up | alive | ha_state |
+--------------------------------------+--------------+----------------+-------+----------+
| 8af5d8a1-2e29-40e6-baf0-3cd79a7ac77b | cloudnet1003 | True | :-) | active |
| 970df1d1-505d-47a4-8d35-1b13c0dfe098 | cloudnet1004 | True | :-) | standby |
+--------------------------------------+--------------+----------------+-------+----------+
|

This netns will hold all the configuration: IP addresses (such as gateways, floating IPs), iptables rules (NAT, filtering, etc), and other information (static routes, etc).
Using virtual taps, this automatically-generated netns is connected to the main netns where the physical NICs live, along with bridges and vlan tagged subinterfaces.
All of this happens on the eth1 interface, while eth0 is left for connecting the cloudnet box to the provider network.
When a virtual router is created, Neutron decides which l3-agent to deploy it on, taking HA parameters into account.
In our active-standby setup, only one l3-agent is active at a time, which means that all this netns/interfaces/iptables configuration is deployed by Neutron to just one node.
The 'q' prefix in the netns name (as in qrouter-) dates from earlier development stages, when Neutron was called Quantum.
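To check which cloudnet node currently hosts the active router, the table printed by the l3-agent-list-hosting-router command shown earlier can be parsed. The following Python sketch works on a captured copy of that output; the function names are hypothetical, and it assumes the ASCII table layout shown on this page.

```python
# Captured output of `neutron l3-agent-list-hosting-router <router-id>`,
# copied from the example on this page.
L3_AGENT_TABLE = """\
+--------------------------------------+--------------+----------------+-------+----------+
| id                                   | host         | admin_state_up | alive | ha_state |
+--------------------------------------+--------------+----------------+-------+----------+
| 8af5d8a1-2e29-40e6-baf0-3cd79a7ac77b | cloudnet1003 | True           | :-)   | active   |
| 970df1d1-505d-47a4-8d35-1b13c0dfe098 | cloudnet1004 | True           | :-)   | standby  |
+--------------------------------------+--------------+----------------+-------+----------+
"""


def parse_table(text):
    """Parse an ASCII table into a list of dicts keyed by the header cells."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in text.splitlines()
        if line.strip().startswith("|")
    ]
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


def hosts_in_state(text, state):
    """Return the hosts whose ha_state column equals the given state."""
    return [row["host"] for row in parse_table(text) if row["ha_state"] == state]


if __name__ == "__main__":
    print("active:", hosts_in_state(L3_AGENT_TABLE, "active"))
    print("standby:", hosts_in_state(L3_AGENT_TABLE, "standby"))
```

In a healthy active-standby setup one would expect exactly one host in the active list; anything else is worth investigating.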
Security policy
TODO: talk about security groups, dmz_cidr exclusion, core route filtering, etc