Nova Resource:Deployment-prep/Blocking and unblocking

From Wikitech

The Beta Cluster websites are protected against various types of abuse by configuration in our Content Delivery Network (CDN) edge. This page describes the procedures for finding abusive traffic sources, blocking based on IP network, and unblocking based on IP network.

Finding abusive traffic sources

The grafana load graph for deployment-mediawiki14 is usually helpful for determining whether the server is overloaded when folks are reporting unusually slow responses. Sustained load above 3-4 is a common sign of unusually heavy traffic.

There are a number of helper scripts in deployment-mediawiki14.deployment-prep.eqiad1.wikimedia.cloud:/root that can be used to identify networks that are sending the largest share of traffic. These scripts output YAML lists suitable for pasting into the hiera data in Horizon.

big-ban-hammer.sh
Find the top class A networks (0.0.0.0/8) sending more than 500 out of the last 50,000 requests
mid-ban-hammer.sh
Find the top class B networks (0.0.0.0/16) sending more than 100 out of the last 50,000 requests
small-ban-hammer.sh
Find the top class C networks (0.0.0.0/24) sending more than 50 out of the last 50,000 requests
user@laptop:~$ ssh deployment-mediawiki14.deployment-prep.eqiad1.wikimedia.cloud
user@deployment-mediawiki14:~$ sudo -i
root@deployment-mediawiki14:~# ./big-ban-hammer.sh
    - 57.0.0.0/8
    - 74.0.0.0/8
    - 91.0.0.0/8
    - 94.0.0.0/8
    - 101.0.0.0/8
    - 110.0.0.0/8
    - 111.0.0.0/8
    - 119.0.0.0/8
    - 124.0.0.0/8
    - 159.0.0.0/8
    - 166.0.0.0/8
    - 172.0.0.0/8
    - 217.0.0.0/8
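To illustrate what these helper scripts do, here is a minimal Python sketch (not the actual script source) that counts request sources per /8 network from JSON-formatted access log lines and reports networks over a threshold. The `X-Client-IP` field name is taken from the grep example later on this page; the thresholds mirror big-ban-hammer.sh.

```python
import collections
import ipaddress
import json

def top_networks(log_lines, prefix_len=8, threshold=500):
    """Return YAML-style CIDR entries for networks sending more
    than `threshold` of the given requests."""
    counts = collections.Counter()
    for line in log_lines:
        try:
            ip = ipaddress.ip_address(json.loads(line)["X-Client-IP"])
        except (ValueError, KeyError):
            continue  # skip malformed or incomplete log lines
        # Collapse the client IP into its containing /8 network
        net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        counts[net] += 1
    return [f"- {net}" for net, n in counts.most_common() if n > threshold]
```

Lowering `prefix_len` to 16 or 24 and the threshold accordingly gives the mid- and small-ban-hammer behavior described above.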

Ideally we will block smaller networks rather than giant class A and class B blocks, but when there are attackers using residential networks as their attack base we can end up needing to use larger blocks. T392534 was an example of blocking a large number of class A networks. A number of these have subsequently been split into smaller blocks to allow networks carrying friendly traffic to be unblocked.

To scan more than just the last 50k requests, you can use something like sudo grep -oP '"X-Client-IP": "\d+\.\d+\.\d+\.\d+' /var/log/apache2/other_vhosts_access-json.log|sort|uniq -c|sort -nr|head -n10 instead.

Blocking an IP or network

Blocking an abusing IP can be done by adding the IP or more commonly a CIDR network containing the IP to the abuse_networks:blocked_nets:networks Hiera configuration at https://horizon.wikimedia.org/project/puppet/.

To ensure the Hiera change is picked up quickly, check the cache server deployment-cache-text*:

  • Check /etc/haproxy/ipblocks.d/all.map to ensure your changes are present
    • If they are not listed yet force a Puppet run: sudo run-puppet-agent
  • Reload HAProxy's configuration with sudo service haproxy reload

Repeat the forced Puppet run and HAProxy reload on deployment-cache-upload* to fully block the new ranges.
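Before pasting networks into the Hiera list, it can help to sanity-check the entries. This is an illustrative helper, not an official tool: `strict=True` rejects entries with host bits set (e.g. `69.92.197.1/24`), which would otherwise be ambiguous in a block list.

```python
import ipaddress

def validate_nets(entries):
    """Split proposed blocked_nets entries into (valid, invalid) lists."""
    valid, invalid = [], []
    for entry in entries:
        try:
            # strict=True refuses CIDRs whose host bits are set
            valid.append(str(ipaddress.ip_network(entry, strict=True)))
        except ValueError:
            invalid.append(entry)
    return valid, invalid
```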

Unblocking an IP or network

Unblocking works the same way as blocking: adjust the abuse_networks:blocked_nets:networks Hiera configuration at https://horizon.wikimedia.org/project/puppet/, run Puppet on deployment-cache-text* and deployment-cache-upload*, and reload the haproxy service.

First we need to identify the specific IP or network CIDR to unblock. Typically a user will report a single IP, and a whois lookup can identify the containing network; one convenient way to do that lookup is the whois-dev tool. For example, if the IP requesting to be unblocked is 69.92.197.73, whois will tell us that the address is part of the 69.92.197.0/24 network. There is also a command line tool at deployment-mediawiki14.deployment-prep.eqiad1.wikimedia.cloud:/root/whoisit.sh that can be used:

user@laptop:~$ ssh deployment-mediawiki14.deployment-prep.eqiad1.wikimedia.cloud
user@deployment-mediawiki14:~$ sudo -i
root@deployment-mediawiki14:~# ./whoisit.sh 69.92.197.73
AS      | IP               | BGP Prefix          | CC | Registry | Allocated  | AS Name
11492   | 69.92.197.73     | 69.92.197.0/24      | US | ARIN     | 0001-01-01 | CABLE ONE, INC.

Once we have the network to unblock we need to find the currently blocked network that contains it. To do this we need to review the abuse_networks:blocked_nets:networks data in Hiera configuration at https://horizon.wikimedia.org/project/puppet/ or the cloud/instance-puppet.git:deployment-prep/_.yaml file that tracks the Beta Cluster global Hiera data. For the sake of this example let's assume that the list currently contains an entry for the 69.92.0.0/16 network.
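One way to do that containment check without eyeballing the list (a sketch, not part of the documented tooling) is with the standard library's ipaddress module:

```python
import ipaddress

def find_blocking_entry(blocked_nets, target_cidr):
    """Return the blocked network that contains target_cidr, or None."""
    target = ipaddress.ip_network(target_cidr)
    for entry in blocked_nets:
        # subnet_of is True when target is fully inside the blocked entry
        if target.subnet_of(ipaddress.ip_network(entry)):
            return entry
    return None
```

Running it against the example data returns `69.92.0.0/16` for `69.92.197.0/24`, confirming which Hiera entry needs to be split.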

Now we need to figure out what smaller network blocks should remain blocked after splitting 69.92.0.0/16 to remove 69.92.197.0/24. The subtractNetworks.py script can do that for you. We keep a copy on deployment-mediawiki14.deployment-prep.eqiad1.wikimedia.cloud with the scripts we use to find ranges to block:

user@laptop:~$ ssh deployment-mediawiki14.deployment-prep.eqiad1.wikimedia.cloud
user@deployment-mediawiki14:~$ sudo -i
root@deployment-mediawiki14:~# ./subtractNetworks.py 69.92.0.0/16 69.92.197.0/24
abuse_networks:
  blocked_nets:
    networks:
    - 69.92.0.0/17
    - 69.92.128.0/18
    - 69.92.192.0/22
    - 69.92.196.0/24
    - 69.92.198.0/23
    - 69.92.200.0/21
    - 69.92.208.0/20
    - 69.92.224.0/19

Now that we have the list of networks to remain blocked, we paste it into the abuse_networks:blocked_nets:networks Hiera configuration at https://horizon.wikimedia.org/project/puppet/ in place of the prior 69.92.0.0/16 network.

Finally we need to run Puppet and reload haproxy on the relevant hosts to pick up the changes:

user@laptop:~$ for host in {deployment-cache-text08,deployment-cache-upload08}.deployment-prep.eqiad1.wikimedia.cloud; do echo Processing $host; ssh $host 'sudo run-puppet-agent && sudo systemctl reload haproxy'; done

Processing deployment-cache-text08.deployment-prep.eqiad1.wikimedia.cloud
Info: Using environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for deployment-cache-text08.deployment-prep.eqiad1.wikimedia.cloud
Info: Applying configuration version '(5327bea896) gitpuppet - puppetserver: Generalize git-rebase fix to work for labs/private'
Notice: /Stage[main]/Prometheus::Varnishkafka_exporter/Service[prometheus-varnishkafka-exporter]/ensure: ensure changed 'stopped' to 'running' (corrective)
Info: /Stage[main]/Prometheus::Varnishkafka_exporter/Service[prometheus-varnishkafka-exporter]: Unscheduling refresh on Service[prometheus-varnishkafka-exporter]
Notice: Applied catalog in 16.00 seconds
Processing deployment-cache-upload08.deployment-prep.eqiad1.wikimedia.cloud
Info: Using environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for deployment-cache-upload08.deployment-prep.eqiad1.wikimedia.cloud
Info: Applying configuration version '(5327bea896) gitpuppet - puppetserver: Generalize git-rebase fix to work for labs/private'
Notice: /Stage[main]/Prometheus::Varnishkafka_exporter/Service[prometheus-varnishkafka-exporter]/ensure: ensure changed 'stopped' to 'running' (corrective)
Info: /Stage[main]/Prometheus::Varnishkafka_exporter/Service[prometheus-varnishkafka-exporter]: Unscheduling refresh on Service[prometheus-varnishkafka-exporter]
Notice: Applied catalog in 13.88 seconds

See also