Portal:Cloud VPS/Admin/Neutron

This page explains how the neutron Openstack component is used in our Cloud VPS service.

For the sake of explanation, this document uses the eqiad1 deployment as an example, but other deployments use the same mechanisms.

Network topology

(Diagram: WMCS eqiad1 network topology)

There are 4 main networks involved:

  • LAN for instances. Each VM gets one IP from this network when created. Private addressing.
  • WAN for instances transport. This subnet connects the Neutron virtual router with the external core router. Private addressing.
  • WAN for floating IPs. This subnet consists of a pool of IP addresses to be used for internet->VM NATs. Public addressing.
  • LAN for physical servers. Each physical net/virt Openstack server is wired to this subnet. AKA provider network. Private addressing.

There are also 2 main routers involved:

  • The Neutron virtual router (by means of l3-agents, neutron-server, etc). This router connects LAN and WANs.
  • The external (and physical) core router. This router is the final gateway between the deployment networks and the rest of the WMF networks (and internet).

The main Neutron router is deployed in HA (active-standby). The relevant IP addresses (gateways, etc) are associated with a router object which is managed by Neutron and can be moved from one cloudnetXXXX node to another.
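
These objects can be inspected with the usual OpenStack CLI from a cloudcontrol node (a minimal sketch, assuming admin credentials are already sourced; the router ID is a placeholder):

 openstack network list
 openstack subnet list
 openstack router list
 neutron l3-agent-list-hosting-router <router-id>   # shows which cloudnet node hosts the active/standby router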

Compat networking

This setup was originally developed in T202636 to allow communication between VMs in the nova-network and Neutron deployments. It was later dropped in T223923.

Physical connections

All physical servers in the deployment have an eth0 interface (usually 1G) connected to the physical switch in their rack. This is the interface used for ssh management (LAN provider network). The switch port connecting to this interface doesn't need any specific configuration.

Additionally, both cloudnet and cloudvirt servers have an eth1 interface (usually 10G in cloudnets, 1G in cloudvirts) which is connected to the physical switch in their rack using a trunk with several vlans (for transport subnet, instances, etc). The switch port connecting to eth1 needs this specific configuration active (cloud-virt-instance-trunk in asw2 and cloud-instance-ports in asw) in order for packets to circulate. Also, servers need all the vlan tagged interfaces and bridges created (this is done with puppet).
Usually this switch trunk contains all cloud related vlans, so we can move cloudvirt servers between deployments (different vlans) without having to change switch configurations.
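
To verify the trunk setup on a given cloudnet/cloudvirt, the vlan tagged subinterfaces and bridges can be inspected with standard iproute2 tools (a sketch; the exact interface names, e.g. eth1.1105, depend on the host and the puppet configuration):

 ip -d link show      # shows vlan subinterfaces and their parent device / vlan id
 bridge link show     # shows which interfaces are enslaved to which bridge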

Topology data example

In the case of the eqiad1 deployment, the relevant elements are:

What | Neutron network object | Neutron subnet object | Physical name | Addressing | Netbox
LAN for instances | lan-flat-cloudinstances2b | cloud-instances2-b-eqiad | cloud-instances2-b-eqiad (vlan 1105) | 172.16.0.0/21 | vlan 1105, cidr
WAN for transport | wan-transport-eqiad | cloud-instances-transport1-b-eqiad | cloud-instances-transport1-b-eqiad (vlan 1120) | 208.80.155.88/29 | vlan 1120, cidr
WAN for floating IPs | wan-transport-eqiad | cloud-eqiad1-floating | --- (no vlan) | 185.15.56.0/25 | cidr
LAN provider (HW servers) | --- (ignored by neutron) | --- (ignored by neutron) | cloud-hosts1-b-eqiad (vlan 1118) | 10.64.20.0/24 | vlan 1118, cidr

In the case of the codfw1dev deployment, the relevant elements are:

What | Neutron network object | Neutron subnet object | Physical name | Addressing | Netbox
LAN for instances | lan-flat-cloudinstances2b | cloud-instances2-b-codfw | cloud-instances2-b-codfw (vlan 2105) | 172.16.128.0/24 | vlan 2105, cidr
WAN for transport | wan-transport-codfw | cloud-instances-transport1-b-codfw | cloud-instances-transport1-b-codfw (vlan 2120) | 208.80.153.184/29 | vlan 2120, cidr
WAN for floating IPs | wan-transport-codfw | cloud-codfw1dev-floating | --- (no vlan) | 172.16.129.0/24 | cidr
LAN provider (HW servers) | --- (ignored by neutron) | --- (ignored by neutron) | cloud-hosts1-b-codfw (vlan 2118) | 10.192.20.0/24 | vlan 2118, cidr
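
These values can be cross-checked against what Neutron actually has configured, for example with the OpenStack CLI (a sketch, assuming admin credentials; object names are taken from the tables above):

 openstack network show lan-flat-cloudinstances2b
 openstack subnet show cloud-instances2-b-eqiad -c cidr -c gateway_ip -c allocation_pools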

Ingress & Egress

(Diagram: eqiad1 transport)

The ingress traffic is handled by the core router, which has an explicit static route pointing to the address of the Neutron router in the WAN transport subnet. The same happens in the case of floating IPs.

By default, all the traffic from VMs to the Internet (egress) is source NATed using a single IPv4 address. This address is called routing_source_ip.

There are 2 cases in which this egress NAT is not applied:

  • the destination is an internal WMF network listed in the dmz_cidr exclusions (see below)
  • the VM has an explicit floating IP associated (the floating IP will be used as both SNAT and DNAT)

These mechanisms (routing_source_ip and dmz_cidr) are custom additions to Neutron; see the section below for further details on this customization.
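
To see the NAT rules that actually implement this, you can look inside the qrouter netns on the active cloudnet node (a sketch; <router-uuid> is a placeholder for the Neutron router UUID):

 sudo ip netns list
 sudo ip netns exec qrouter-<router-uuid> iptables -t nat -S POSTROUTING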

dmz_cidr

The dmz_cidr mechanism allows us to define certain IP ranges that VMs can talk to directly, without NAT being involved. This lets us offer services to VMs easily and implement access control in those services.
One classic example is NFS servers, which need to see actual VM IP addresses rather than the generic NAT address.

A typical configuration per deployment looks like (please refer to ops/puppet.git for actual hiera values):

profile::openstack::eqiad1::neutron::dmz_cidr:
 - 172.16.0.0/21:91.198.174.0/24
 - 172.16.0.0/21:198.35.26.0/23
 - 172.16.0.0/21:10.0.0.0/8
 - 172.16.0.0/21:208.80.152.0/22
 - 172.16.0.0/21:103.102.166.0/24
 - 172.16.0.0/21:172.16.0.0/21

You can read this config as a list of src:dst pairs: do not apply NAT to connections from src to dst.

Please note that the dmz_cidr mechanism takes precedence over both floating IPs and the routing_source_ip configuration. This means that no NAT will be applied if the packet source/destination addresses match any of the pairs configured in dmz_cidr.
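
Conceptually, the l3-agent translates this into nat table rules in the qrouter netns, where the dmz_cidr exclusions are evaluated before the SNAT rule. A rough hand-written sketch of the idea (not the literal rules generated by the agent; qg-XXXX and the example addresses are illustrative):

 # dmz_cidr exclusion: matching traffic leaves the chain without being NATed
 iptables -t nat -A POSTROUTING -s 172.16.0.0/21 -d 10.0.0.0/8 -j ACCEPT
 # everything else from the instances LAN is source-NATed to routing_source_ip
 iptables -t nat -A POSTROUTING -s 172.16.0.0/21 -o qg-XXXX -j SNAT --to-source 185.15.56.1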

In addition, a static route is required on the routers so return traffic knows what path to take to reach the Cloud Private IPs.

For example on cr1/2-eqiad: routing-options static route 172.16.0.0/21 next-hop 208.80.155.92/29

Floating IPs

This mechanism allows us to allocate an additional public IPv4 address in Neutron. This new IP address is then associated with a given instance, and all of its ingress/egress traffic will use it (both SNAT and DNAT).

Because these are a limited resource, a quota needs to be assigned to the project beforehand.

Please note that the dmz_cidr mechanism overrides the floating IP NAT configuration, so you may see non-NATed packets arriving at VMs that have a floating IP assigned.
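
Allocating and attaching a floating IP can be done with the regular OpenStack CLI (a sketch, assuming the project already has floating IP quota; the network name is taken from the topology tables above, <instance> is a placeholder):

 openstack floating ip create wan-transport-eqiad
 openstack server add floating ip <instance> <new-floating-ip>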

Traffic from instance to own floating IP

A VM instance may try to send traffic to its own floating IP. As described in T217681#5035532 (Cloud VPS instance with floating (public) IP can not ping that IP directly), this is not possible with the default configuration:
the packet arriving at the VM instance would be a martian packet.

A workaround for this is to instruct the network stack to accept this kind of martian packet:

sysctl net.ipv4.conf.all.accept_local=1
accept_local - BOOLEAN
	Accept packets with local source addresses. In combination with
	suitable routing, this can be used to direct packets between two
	local interfaces over the wire and have them accepted properly.
	default FALSE
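
To make the workaround survive reboots, the setting can be dropped into a sysctl.d file (a sketch; the file name here is arbitrary, and in practice this would normally be managed via puppet):

 echo 'net.ipv4.conf.all.accept_local = 1' > /etc/sysctl.d/99-accept-local.conf
 sysctl --system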

Proxies

TODO: fill me

Ingress & egress data example

Some important IP addresses in the eqiad1 deployment:

Type | Name | Address | Explanation | Where it is defined, where to change it | DNS FQDN
ingress | incoming gateway | 208.80.155.92/29 | Neutron address in the WAN transport subnet for ingress | Core routers (static route) & Neutron main router object | cloudinstances2b-gw.openstack.eqiad1.wikimediacloud.org
egress | routing_source_ip | 185.15.56.1 | IP address for the main source NAT for VMs (mind dmz_cidr exclusions) | /etc/neutron/l3_agent.ini in cloudnet nodes (puppet). No NIC has this IP assigned. | nat.openstack.eqiad1.wikimediacloud.org

Some important IP addresses in the codfw1dev deployment:

Type | Name | Address | Explanation | Where it is defined, where to change it | DNS FQDN
ingress | incoming gateway | 208.80.153.190/29 | Neutron address in the WAN transport subnet for ingress | Core routers (static route) & Neutron main router object | cloudinstances2b-gw.openstack.codfw1dev.wikimediacloud.org
egress | routing_source_ip | 172.16.129.254 | IP address for the main source NAT for VMs (mind dmz_cidr exclusions) | /etc/neutron/l3_agent.ini in cloudnet nodes (puppet). No NIC has this IP assigned. | TBD
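
The DNS FQDNs above offer an easy way to cross-check these addresses, for example (a sketch using dig; any resolver will do):

 dig +short cloudinstances2b-gw.openstack.eqiad1.wikimediacloud.org
 dig +short nat.openstack.eqiad1.wikimediacloud.org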

What Neutron is doing

This section tries to shed some light on how Neutron implements our network topology under the hood, and what it does with all this configuration.

Neutron uses 2 dedicated boxes, cloudnetXXXX.site.wmnet (active-standby).
The neutron-server service (daemon, API, etc) runs on cloudcontrol boxes. All the agents run on cloudnet boxes, except neutron-linuxbridge-agent, which runs on cloudvirt boxes.

When a virtual router is created, and assigned to an l3-agent, a linux network namespace (netns for short) will be created:

(Diagram: example of NICs in Neutron)

This netns will hold all the configuration: IP addresses (such as gateways, floating IPs), iptables rules (NAT, filtering, etc), and other information (static routes, etc).
Using virtual taps, this automatically-generated netns is connected to the main netns where the physical NICs live, along with bridges and vlan tagged subinterfaces.

All of this is done on the eth1 interface, while eth0 is left for connecting the cloudnet box to the provider network.

When a virtual router is created, Neutron decides which l3-agent will deploy it, taking into account HA parameters.
In our active-standby setup, only one l3-agent is active at a time, which means that all this netns/interfaces/iptables configuration is deployed by Neutron to just one node.

The 'q-' prefix in the netns name comes from earlier development stages, when Neutron was called Quantum.
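
A quick way to see all this on the active cloudnet node (a sketch; qrouter-<router-uuid> is a placeholder for the actual netns name):

 sudo ip netns list
 sudo ip netns exec qrouter-<router-uuid> ip addr     # gateway addresses, floating IPs, etc
 sudo ip netns exec qrouter-<router-uuid> ip route    # static routes inside the netns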

Security policy

TODO: talk about security groups, dmz_cidr exclusion, core route filtering, etc

Neutron customizations

Our Neutron has been customized to bring back (forward-port) functionality from the old nova-network days.
The two features added are routing_source_ip and dmz_cidr, and their behavior is explained in the rest of this document. In a nutshell, we get a couple of extra config options in /etc/neutron/l3_agent.ini which allow us to implement our Cloud VPS use cases.
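
An illustrative sketch of how these options might look in /etc/neutron/l3_agent.ini (the values and the exact option syntax here are assumptions; refer to ops/puppet.git and the actual patch for the authoritative format):

 [DEFAULT]
 routing_source_ip = 185.15.56.1
 dmz_cidr = 172.16.0.0/21:10.0.0.0/8,172.16.0.0/21:208.80.152.0/22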

Currently, these modifications are for the Openstack Mitaka release, and can be found at:

Related phabricator tasks: T168580.

See also