Peering management

From Wikitech

Peering management is still a largely manual process, mostly because communication (about new sessions, changes, or issues) happens over email.

Finding peering candidates

peering@ email alias

The easy one, as we usually accept all peering requests.

Equinix peering opportunities portal

https://ix.equinix.com/portal/peering/peering-opportunities

Using their own flow data, Equinix brings to light networks we peer with at some of the Equinix IXPs, but where sessions are missing from other IXPs.

E.g. we peer with AS X at Equinix Ashburn, and both Wikimedia and X are present at Equinix Chicago, but we don't have a BGP session there yet.
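The comparison the portal performs can be sketched as plain set arithmetic. This is only an illustration of the idea, not Equinix's actual method; the function name and the sample IXP sets are made up for the example.

```python
def missing_sessions(our_ixps, peer_ixps, session_ixps):
    """Return the IXPs where both networks are present but no session exists."""
    common = set(our_ixps) & set(peer_ixps)
    return sorted(common - set(session_ixps))


# Illustrative data only (hypothetical presence and session lists).
ours = {"Equinix Ashburn", "Equinix Chicago", "Equinix Singapore"}
peer_x = {"Equinix Ashburn", "Equinix Chicago"}
established = {"Equinix Ashburn"}

print(missing_sessions(ours, peer_x, established))  # ['Equinix Chicago']
```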

Netflow

See also Netflow

https://turnilo.wikimedia.org/#wmf_netflow lets you filter and sort all our external traffic by BGP criteria.

For example, https://w.wiki/5mj6 sorts outbound drmrs traffic by AS_PATH (showing the first 2 ASes the traffic transits through) and by destination AS (the final AS of the path).

It's also possible to filter for BGP communities matching peering or transit traffic.

This can help reveal networks (ASNs) that we currently reach through our transit providers. It needs to be combined with PeeringDB to identify the IXPs where those networks are present.
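As a rough sketch of that analysis outside Turnilo, the snippet below ranks destination ASNs by the volume that leaves through a transit provider. The record shape (an AS path plus a byte count) and the rule "first AS in the path is a transit provider" are assumptions made for the example; the real netflow data could equally be filtered on BGP communities. The ASNs come from the documentation range.

```python
from collections import Counter


def top_transit_destinations(flows, transit_asns, limit=10):
    """Sum bytes per destination AS for flows whose first hop is a transit AS.

    flows: iterable of (as_path, nbytes), where as_path is a list of ASNs
    ordered from the first AS we hand traffic to, down to the destination AS.
    """
    totals = Counter()
    for as_path, nbytes in flows:
        if as_path and as_path[0] in transit_asns:
            totals[as_path[-1]] += nbytes
    return totals.most_common(limit)


# Hypothetical sample records.
sample_flows = [
    ([64500, 64510], 700),
    ([64500, 64501, 64511], 200),
    ([64511], 900),  # reached directly, so not counted
    ([64500, 64510], 50),
]

print(top_transit_destinations(sample_flows, transit_asns={64500}))
```

The ASNs at the top of the result are the ones worth looking up in PeeringDB first.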

PeeringDB

https://www.peeringdb.com/asn/14907

This is the most comprehensive list of networks present at a given IXP, and thus of candidate networks.
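PeeringDB data is also available through its REST API, which makes this lookup scriptable. The sketch below assumes the `netixlan` endpoint and its `ix_id` field behave as described in the PeeringDB API documentation; the helper names are hypothetical, and anonymous queries may be rate limited.

```python
import json
import urllib.request

PEERINGDB_API = "https://www.peeringdb.com/api"


def fetch_netixlans(asn):
    """Fetch the IXP LAN presences ("netixlan" objects) of an ASN."""
    url = f"{PEERINGDB_API}/netixlan?asn={asn}"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)["data"]


def ix_ids(netixlans):
    """Return the set of IXP ids found in a list of netixlan objects."""
    return {entry["ix_id"] for entry in netixlans}


def common_ixps(ours, theirs):
    """Return the sorted IXP ids where both networks are present."""
    return sorted(ix_ids(ours) & ix_ids(theirs))


# Usage (performs network requests; the candidate ASN is a placeholder):
# shared = common_ixps(fetch_netixlans(14907), fetch_netixlans(64510))
```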

Peering News

A Python script checks for any new routers at the IXPs where we're present. It currently runs weekly on diffscan02 (cloud VPS) with a systemd timer (Puppet profile) and sends the output by email to peering@.

IX mailing lists

New IXP members are announced on the IXP's mailing list.

Peering workflow

This workflow is quite flexible, as peers' behavior varies greatly.

Setting up new sessions

New peerings workflow

Blue: via cookbooks, yellow: manual.

Notes

  • Even though we prefer not to use MD5 keys, some peers require one. It needs to be added manually after running the configure cookbook.
  • If the peer's ETA is far in the future, wait until closer to that date before configuring our side, to limit the risk of alerting/log noise.

Managing down sessions

Flowchart diagram of actions to take when a peering is down

Notes

  • Icinga BGP status alerts (this workflow only applies to alerts in WARNING state): https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?search_string=bgp%20status
  • It's also possible to look directly at the peer's PeeringDB page to check whether the peer is still present at the IXP.
  • The configured prefix limit can be seen by running the following command on the alerting router:
    • show bgp neighbor <peerIP> | match Prefixlimit (eg. "inet-unicast Limit: 10000")
    • To be compared with the IPv4/IPv6 Prefixes fields of the ASN's PeeringDB page.
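The comparison in the last note can be sketched as follows. The regular expression assumes the router output contains a fragment shaped like the example above ("inet-unicast Limit: 10000"); the exact layout may differ between Junos versions, and the function names are hypothetical.

```python
import re

# Matches fragments such as "inet-unicast Limit: 10000" or "inet6-unicast Limit: 500".
LIMIT_RE = re.compile(r"(inet6?-unicast)\s+Limit:\s*(\d+)")


def parse_prefix_limits(cli_output):
    """Map each address family found in the CLI output to its configured limit."""
    return {family: int(limit) for family, limit in LIMIT_RE.findall(cli_output)}


def limit_too_low(configured_limit, peeringdb_prefixes):
    """True when the peer's PeeringDB prefix count exceeds our configured limit."""
    return configured_limit < peeringdb_prefixes


limits = parse_prefix_limits("inet-unicast Limit: 10000")
# 12000 stands in for the value of the PeeringDB "IPv4 Prefixes" field.
print(limit_too_low(limits["inet-unicast"], 12000))  # True
```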

Manually increase the prefix limit

If the current limit is too low, set a new custom limit for that peer. It is generally set to the PeeringDB limit + 20%.

1. Commands:

configure
set protocols bgp group IX4 neighbor <IP> family inet unicast prefix-limit maximum <new_limit>
set protocols bgp group IX4 neighbor <IP> family inet unicast prefix-limit teardown 80
set protocols bgp group IX4 neighbor <IP> family inet unicast prefix-limit teardown idle-timeout forever
commit
exit

NOTE: For an IPv6 peer replace 'IX4' with 'IX6' and 'inet' with 'inet6'

2. Once the new limit is set, clear the BGP session to the peer: clear bgp neighbor <IP>

3. After a minute or so, the BGP peer should show the status 'established' if all goes well:

cmooney@cr3-eqsin> show bgp summary | match 27.111.228.33
27.111.228.33          4800        201          8       0       6        1:03 Establ
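The limit calculation and the commands from step 1 can be generated with a small helper, sketched here. It applies the "PeeringDB limit + 20%" rule (rounded up) and picks IX4/inet or IX6/inet6 from the peer address, as described in the note above. The function names are hypothetical, and the output is meant to be reviewed before being pasted into a router.

```python
import ipaddress


def new_prefix_limit(peeringdb_prefixes, headroom_percent=20):
    """Return the PeeringDB prefix count plus headroom, rounded up (integer math)."""
    return (peeringdb_prefixes * (100 + headroom_percent) + 99) // 100


def prefix_limit_commands(peer_ip, limit):
    """Build the Junos 'set' commands for a custom per-peer prefix limit."""
    version = ipaddress.ip_address(peer_ip).version
    group, family = ("IX4", "inet") if version == 4 else ("IX6", "inet6")
    base = (
        f"set protocols bgp group {group} neighbor {peer_ip} "
        f"family {family} unicast prefix-limit"
    )
    return [
        f"{base} maximum {limit}",
        f"{base} teardown 80",
        f"{base} teardown idle-timeout forever",
    ]


# Example: a peer whose PeeringDB page lists 10000 IPv4 prefixes.
for line in prefix_limit_commands("27.111.228.33", new_prefix_limit(10000)):
    print(line)
```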

Providers not using emails to manage peering

Google - https://isp.google.com/

Cloudflare - https://peering.cloudflare.com/

Microsoft - https://learn.microsoft.com/en-us/azure/internet-peering/howto-exchange-portal

Netflix - https://openconnect.zendesk.com/hc/en-us/requests/new?ticket_form_id=360001023311

Possible improvements

  • Extensively automate the peering candidate search by joining PeeringDB data, Hive/netflow data, the list of current peers (from the routers), and a manual list of exceptions.
    • An even more advanced version could show the benefits of extending our peering to new IXPs, based on each IXP's peer list.
  • Automate the "BGP sessions down" workflow