Liberica

From Wikitech

Liberica is the next-generation load balancer replacing PyBal. It is designed with a modular approach and written in Go. The source code is available on GitLab.

Daemons

Control Plane

The control plane (cp) daemon orchestrates the rest of the Liberica daemons. It uses a configuration file to define a list of services and relies on etcd for real server discovery. Communication with other Liberica daemons and gobgpd occurs over gRPC.

Forwarding Plane

The forwarding plane (fp) daemon exposes the following forwarding planes to Liberica’s control plane via its gRPC API:

  • IPVS
  • Katran

Healthcheck Forwarder

The healthcheck forwarder (hcforwarder) daemon forwards health check traffic to real servers in the same way production traffic reaches them (using IPIP encapsulation). It consists of two components: a Go daemon that exposes a gRPC API and Prometheus metrics, and an eBPF program that handles the network traffic. The healthcheck daemon targets a specific real server by setting an SO_MARK on the health check socket. The hcforwarder uses this mark to identify the real server's IP and perform the IPIP encapsulation.

Healthcheck

The Liberica healthcheck daemon performs health checks and tracks their results. It notifies subscribers (usually the control plane daemon) of any changes in the state of real servers. Additionally, it exposes health check results as Prometheus metrics.
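
The notify-subscribers pattern can be sketched with Go channels. This is only a minimal illustration of the idea (the real daemon streams updates to subscribers over gRPC); all type and function names here are invented:

```go
package main

import "fmt"

// StateChange is an illustrative event emitted when a real server's
// health state flips.
type StateChange struct {
	RealServer string
	Healthy    bool
}

// notifier fans out health state changes to its subscribers, similar in
// spirit to how the healthcheck daemon notifies the control plane.
type notifier struct {
	subs []chan StateChange
}

// subscribe registers a new subscriber and returns its event channel.
func (n *notifier) subscribe() <-chan StateChange {
	ch := make(chan StateChange, 16)
	n.subs = append(n.subs, ch)
	return ch
}

// publish delivers an event to every subscriber.
func (n *notifier) publish(ev StateChange) {
	for _, ch := range n.subs {
		ch <- ev
	}
}

func main() {
	var n notifier
	cp := n.subscribe() // the control plane daemon is the usual subscriber
	n.publish(StateChange{RealServer: "10.64.0.10", Healthy: false})
	ev := <-cp
	fmt.Printf("%s healthy=%v\n", ev.RealServer, ev.Healthy)
}
```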

Operating Liberica

Liberica provides a CLI tool called liberica for fetching the current state of its various daemons. This tool uses the same gRPC API employed by the Liberica control plane daemon to gather insights from different components.

General operating status can be found at the Grafana dashboard for load balancers.

Pooling a liberica instance

The liberica control plane automatically pools the load balancer when it is started. It can be started manually using systemctl:

$ systemctl start liberica-cp.service

The sre.loadbalancer.admin cookbook is also provided for convenience:

$ sudo -i cookbook sre.loadbalancer.admin --query 'P{lvsXXXX.ulsfo.wmnet}' --reason "finished LB maintenance" pool

On a successful start, liberica should eventually configure the BGP paths for all the configured services; the journal log should contain something like this:

time=2025-01-28T14:03:20.036Z level=INFO msg="bootstrapping completed, configuring BGP path" service=ncredirlb6_80
time=2025-01-28T14:03:20.037Z level=INFO msg="bootstrapping completed, configuring BGP path" service=ncredir-httpslb_443
time=2025-01-28T14:03:20.037Z level=INFO msg="bootstrapping completed, configuring BGP path" service=ncredirlb_80
time=2025-01-28T14:03:20.037Z level=INFO msg="bootstrapping completed, configuring BGP path" service=ncredir-httpslb6_443

Additionally, BGP status can be checked using the gobgp CLI:

$ gobgp neighbor # list BGP neighbors
Peer           AS  Up/Down State       |#Received  Accepted
10.64.130.1 64810 00:26:03 Establ      |        0         0

$ gobgp neighbor 10.64.130.1 adj-out # check exported IPv4 prefixes 
   ID  Network              Next Hop             AS_PATH              Attrs
   4   208.80.154.232/32    10.64.130.16         64600                [{Origin: i} {Communities: 14907:11}]
   
$ gobgp neighbor 10.64.130.1 adj-out -a ipv6 # check exported IPv6 prefixes
   ID  Network                Next Hop                    AS_PATH              Attrs
   4   2620:0:861:ed1a::9/128 2620:0:861:109:10:64:130:16 64600                [{Origin: i} {Communities: 14907:11}]

Depooling a liberica instance

Depooling is usually only needed for primary instances; backup instances generally don't need to be depooled.

You can always check which lvs hosts are primary and which are secondary for a given DC using cumin, e.g. sudo cumin 'A:lvs-secondary-eqiad'


Depooling an lvs (liberica) host can be done in two main ways, depending on the type of intervention to be performed on the load balancer host:

  • Temporary depool (for smaller interventions like reboots, upgrades, ...)
  • Full depool (for longer interventions that need the host to be offline for quite some time)

Consider the role of the liberica host you are going to depool first: if the secondary (backup) has already been promoted to primary and something happens to another primary or to the secondary itself, the best way to proceed is to depool the whole site instead.

Temporary depooling

Temporary depooling of primary instances can be useful for reboots or other types of short time frame maintenance. It is recommended not to leave a host depooled this way for a long time, because it disables (among other things) Puppet.

Depooling an instance is easily performed using the sre.loadbalancer.admin cookbook, e.g. cookbook sre.loadbalancer.admin --query 'P{lvs4008.ulsfo.wmnet}' --reason "LB maintenance" depool

Full depooling

When an lvs (liberica) host needs to be depooled for a considerable amount of time, the following steps are required:

  • Commit and merge a custom hiera key for that host, setting the lowest-priority BGP community (14907:0), e.g. for lvs3009:
profile::liberica::bgp_config:
  communities:
    - "14907:0" # lowest BGP priority, depooled for T393616
  peers:
    - 10.80.1.1 # asw1-by27-esams
profile::liberica::include_services:
  - upload
  - upload-https

Example patches for this

  • Run the usual puppet agent on the affected host (run-puppet-agent)
  • Reload the liberica configuration with the sre.loadbalancer.admin cookbook, e.g.: sudo cookbook sre.loadbalancer.admin --query 'P{lvs3008*}' --reason "BGP config reload" -t T393616 config_reload

Reverting a depool (pooling)

  • Revert the depool patch and merge on puppetserver.
  • Run run-puppet-agent on the affected host (such as lvs3008).
  • Reload the liberica configuration again, repeating the same config_reload command above.
    • sudo cookbook sre.loadbalancer.admin --query 'P{lvs3008*}' --reason "BGP config reload" -t T393616 config_reload
If you had previously also used the sre.loadbalancer.admin cookbook for temporary depooling, be sure to re-run it with pool instead of depool.

Alternative way to temporarily depool an instance

Depooling a liberica-based load balancer should be as easy as stopping the control plane using the following command:

$ systemctl stop liberica-cp.service

After a successful depool, gobgp should show an empty list of neighbors. This can be verified using the gobgp CLI:

$ sudo -i gobgp neighbor 
Peer AS Up/Down State       |#Received  Accepted

Alerts

LibericaDiffFPCheck

This alert is triggered when a mismatch is found between the real servers that should be pooled according to the control plane and the ones actually pooled on the forwarding plane. Both can be queried using the liberica CLI tool:

vgutierrez@lvs4009:~$ liberica cp services  # control plane status
upload-httpslb6_443:
        2620:0:863:101:10:128:0:12      1 healthy: true | pooled: yes
        2620:0:863:101:10:128:0:14      1 healthy: true | pooled: yes
        2620:0:863:101:10:128:0:35      1 healthy: true | pooled: yes
        2620:0:863:101:10:128:0:21      1 healthy: true | pooled: yes
        2620:0:863:101:10:128:0:36      1 healthy: true | pooled: yes
        2620:0:863:101:10:128:0:24      1 healthy: true | pooled: yes
        2620:0:863:101:10:128:0:10      1 healthy: true | pooled: yes
        2620:0:863:101:10:128:0:37      1 healthy: true | pooled: yes
[...]
vgutierrez@lvs4009:~$ liberica fp services  # forwarding plane status
[...]
2620:0:863:ed1a::2:b:443 mh
        2620:0:863:101:10:128:0:37      1
        2620:0:863:101:10:128:0:10      1
        2620:0:863:101:10:128:0:24      1
        2620:0:863:101:10:128:0:36      1
        2620:0:863:101:10:128:0:21      1
        2620:0:863:101:10:128:0:35      1
        2620:0:863:101:10:128:0:14      1
        2620:0:863:101:10:128:0:12      1
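
The check the alert performs amounts to a set difference between the two listings above. A minimal Go sketch of that comparison (illustrative only; the `diff` helper and sample addresses are invented for the example):

```go
package main

import (
	"fmt"
	"sort"
)

// diff compares the real servers the control plane says should be pooled
// (want) against the ones the forwarding plane actually has (got), returning
// servers missing from the fp and unexpected extras -- the condition
// LibericaDiffFPCheck alerts on.
func diff(want, got []string) (missing, extra []string) {
	w := map[string]bool{}
	for _, s := range want {
		w[s] = true
	}
	g := map[string]bool{}
	for _, s := range got {
		g[s] = true
		if !w[s] {
			extra = append(extra, s)
		}
	}
	for _, s := range want {
		if !g[s] {
			missing = append(missing, s)
		}
	}
	sort.Strings(missing)
	sort.Strings(extra)
	return missing, extra
}

func main() {
	cp := []string{"2620:0:863:101:10:128:0:12", "2620:0:863:101:10:128:0:14"}
	fp := []string{"2620:0:863:101:10:128:0:12"}
	missing, extra := diff(cp, fp)
	fmt.Println("missing from fp:", missing, "unexpected on fp:", extra)
}
```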

This alert is usually an indicator of a bug or a transient issue in Liberica; the easiest way of fixing it is to restart liberica using the sre.loadbalancer.upgrade cookbook from a cumin host:

$ sudo cookbook sre.loadbalancer.upgrade --query "P{lvs4009.ulsfo.wmnet}" --reason "clear LibericaDiffFPCheck alert" restart

LibericaStaleConfig

This alert is triggered if a configuration deployed by Puppet hasn't been loaded within an hour. This can be fixed by reloading the Liberica configuration using the sre.loadbalancer.admin cookbook from a cumin host:

$ sudo cookbook sre.loadbalancer.admin --query "P{lvs4009.ulsfo.wmnet}" --reason "clear LibericaStaleConfig alert" config_reload

LibericaEtcdErrors

This alert is triggered if the liberica control plane is having issues watching an etcd key. Please open a Phabricator task if you notice this alert being triggered.

If you suspect this is a transient issue, a seamless restart (no depool required) of liberica on the impacted instance should solve it. This can be done using the sre.loadbalancer.upgrade cookbook from a cumin host:

$ sudo cookbook sre.loadbalancer.upgrade --query "P{lvs4009.ulsfo.wmnet}" --reason "clear LibericaEtcdErrors alert" --seamless restart

LibericaUnhealthyRealserverPooled

This alert is triggered when the liberica control plane is forced to keep unhealthy real servers serving traffic. This usually happens when liberica enforces the depool threshold, and indicates an issue with the service itself rather than with liberica.
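
The depool-threshold behaviour can be sketched as follows. This is a simplified illustration under assumed semantics (a fractional threshold, with unhealthy servers force-pooled until the minimum pool size is met), not Liberica's actual policy code:

```go
package main

import "fmt"

// keepPooled applies a depool-threshold policy: unhealthy servers are only
// depooled as long as at least threshold (a fraction of the pool) stays up.
func keepPooled(healthy map[string]bool, threshold float64) map[string]bool {
	up := 0
	for _, h := range healthy {
		if h {
			up++
		}
	}
	minPooled := int(threshold * float64(len(healthy))) // smallest acceptable pool

	pooled := map[string]bool{}
	for s, h := range healthy {
		pooled[s] = h
	}
	// Too many servers are down: force-pool unhealthy ones until the
	// threshold is met. This is the state the alert reports.
	for s := range pooled {
		if up >= minPooled {
			break
		}
		if !pooled[s] {
			pooled[s] = true
			up++
		}
	}
	return pooled
}

func main() {
	state := map[string]bool{"rs1": true, "rs2": false, "rs3": false, "rs4": false}
	pooled := keepPooled(state, 0.5)
	n := 0
	for _, p := range pooled {
		if p {
			n++
		}
	}
	fmt.Println("pooled servers:", n)
}
```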

A detailed state of the service can be retrieved using the liberica CLI tool:

vgutierrez@lvs5006:~$ sudo liberica cp services
upload-httpslb6_443:
        2001:df2:e500:101:10:132:0:16   1 healthy: true | pooled: yes
        2001:df2:e500:101:10:132:0:36   1 healthy: true | pooled: yes
        2001:df2:e500:101:10:132:0:37   1 healthy: true | pooled: yes
        2001:df2:e500:101:10:132:0:38   1 healthy: true | pooled: yes
        2001:df2:e500:101:10:132:0:25   1 healthy: true | pooled: yes
        2001:df2:e500:101:10:132:0:26   1 healthy: true | pooled: yes
        2001:df2:e500:101:10:132:0:27   1 healthy: true | pooled: yes
        2001:df2:e500:101:10:132:0:28   1 healthy: true | pooled: yes

Debugging

Debugging traffic on Liberica instances can be tricky. The first thing you need to do is check whether liberica is configured to use Katran: in the hiera key profile::liberica::fp_config, forwarding_plane will be set to katran. Locally, it can be checked by grepping the liberica config file:

# lvs1013 configured to use katran
vgutierrez@lvs1013:~$ fgrep forwarding_plane /etc/liberica/config.yaml 
  forwarding_plane: katran
# lvs6001 configured to use ipvs
vgutierrez@lvs6001:~$ fgrep forwarding_plane /etc/liberica/config.yaml 
  forwarding_plane: ipvs

IPVS

If IPVS is being used, inbound traffic can be inspected on the main network interface of the load balancer. Outbound traffic can also be checked there, but keep in mind that it will be encapsulated with an IPIP header (or IP6IP6 for IPv6 traffic).

# inspecting TCP syn packets incoming to port 80 from users
vgutierrez@lvs6001:~$ sudo -i tcpdump -n -i ens3f0np0 "port 80 and tcp[tcpflags] & tcp-syn != 0"                                                                           
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode                                                                                                        
listening on ens3f0np0, link-type EN10MB (Ethernet), snapshot length 262144 bytes                                                                                                
09:55:35.809655 IP REDACTED.48724 > 185.15.58.224.80: Flags [S], seq 1131418028, win 65535, options [mss 1452,sackOK,TS val 1299563729 ecr 0,nop,wscale 10], length 0 
09:55:35.809759 IP REDACTED.60470 > 185.15.58.224.80: Flags [S], seq 2829960310, win 14600, options [mss 1452,sackOK,TS val 120479635 ecr 0,nop,wscale 3], length 0
09:55:35.810316 IP REDACTED.61394 > 185.15.58.224.80: Flags [S], seq 1426643605, win 65535, options [mss 1460,sackOK,TS val 3238768359 ecr 0,nop,wscale 9], length 0

If we want to inspect outbound traffic (from the load balancers to the real servers), we can filter IPIP traffic like this:

  • proto 4 matches IPIP encapsulation
  • ip[20]: skip the 20-byte outer IP header and check that the inner packet starts with another 20-byte IPv4 header (0x45)
  • ip[29]: check that the inner IP header's protocol field is set to 6 (TCP)
  • ip[53]: check that the inner TCP flags field is set to 2 (SYN)

vgutierrez@lvs6001:~$ sudo tcpdump -n -i ens3f0np0 "proto 4 and ip[20] = 0x45 and ip[29] = 6 and ip[53] = 2"
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on ens3f0np0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
10:16:36.505441 IP 172.16.6.102 > 10.136.0.12: IP 147.161.184.190.1654 > 185.15.58.224.443: Flags [S], seq 2404180515, win 65535, options [mss 1460,nop,wscale 5,sackOK,TS val 85026473 ecr 0], length 0
10:16:36.533562 IP 172.16.6.27 > 10.136.0.10: IP 90.85.224.21.1547 > 185.15.58.224.80: Flags [S], seq 2025563303, win 29200, options [mss 1460,sackOK,TS val 40808629 ecr 0,nop,wscale 5], length 0
10:16:36.635833 IP 172.16.6.5 > 10.136.1.13: IP 154.121.55.110.1557 > 185.15.58.224.443: Flags [S], seq 3802172249, win 65535, options [mss 1400,sackOK,TS val 563094647 ecr 0,nop,wscale 10], length 0
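
The byte offsets used above can be double-checked by re-implementing the filter over a raw packet buffer in Go. This is a self-contained sketch for verifying the offsets, not part of Liberica:

```go
package main

import "fmt"

// matchesIPIPSyn re-implements the tcpdump filter
// "proto 4 and ip[20] = 0x45 and ip[29] = 6 and ip[53] = 2"
// over a raw IPv4 packet: outer protocol IPIP, inner packet starting with a
// plain 20-byte IPv4 header, inner protocol TCP, and only the SYN flag set.
func matchesIPIPSyn(pkt []byte) bool {
	if len(pkt) < 54 {
		return false
	}
	return pkt[9] == 4 && // outer IP protocol field: 4 = IPIP
		pkt[20] == 0x45 && // inner header: IPv4, IHL 5 (20 bytes)
		pkt[29] == 6 && // inner IP protocol field: 6 = TCP
		pkt[53] == 2 // inner TCP flags byte (offset 40+13): SYN only
}

func main() {
	pkt := make([]byte, 54)
	pkt[0], pkt[9] = 0x45, 4   // outer IPv4 header, protocol IPIP
	pkt[20], pkt[29] = 0x45, 6 // inner IPv4 header, protocol TCP
	pkt[53] = 2                // inner TCP SYN
	fmt.Println(matchesIPIPSyn(pkt))
}
```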

Katran

Katran leverages the eXpress Data Path (XDP), so unless Katran triggers an XDP_PASS action and the inbound packet is handed to the kernel networking stack, traffic won't appear in tcpdump. This means that TCP and ICMP traffic directed to any configured VIP on the load balancer won't be visible unless we use xdpdump:

# inspecting TCP syn packets incoming to port 80 from users
vgutierrez@lvs1013:~$ sudo -i xdpdump -i enp4s0f0 -w - |tcpdump -r - -n "port 80 and tcp[tcpflags] & tcp-syn != 0"
listening on enp4s0f0, ingress XDP program ID 788 func balancer_ingress, capture mode entry, capture size 262144 bytes
reading from file -, link-type EN10MB (Ethernet), snapshot length 262144
13:46:34.641742 IP REDACTED.36062 > 208.80.154.232.80: Flags [S], seq 4124548088, win 64240, options [mss 1460,sackOK,TS val 2087457777 ecr 0,nop,wscale 8], length 0
13:46:34.943854 IP REDACTED.40948 > 208.80.154.232.80: Flags [S], seq 3137424210, win 64240, options [mss 1460,sackOK,TS val 2468684832 ecr 0,nop,wscale 11], length
13:46:35.243008 IP REDACTED.39106 > 208.80.154.232.80: Flags [S], seq 1992686468, win 35844, options [mss 1460,nop,nop,TS val 1440660353 ecr 0,nop,wscale 7], length 0

If we want to capture the traffic going from the load balancer to the real servers, we need to append --rx-capture exit to see the outbound traffic from the XDP program:

# capture SYN packets IPIP encapsulated and targeting port 80
vgutierrez@lvs1013:~$ sudo -i xdpdump -i enp4s0f0 --rx-capture exit -w - |tcpdump -r - -n "proto 4 and ip[20] = 0x45 and ip[29] = 6 and ip[53] = 2"
listening on enp4s0f0, ingress XDP program ID 788 func balancer_ingress, capture mode exit, capture size 262144 bytes
reading from file -, link-type EN10MB (Ethernet), snapshot length 262144
13:54:24.508781 IP 172.16.180.72 > 10.64.32.89: IP REDACTED.39116 > 208.80.154.232.80: Flags [S], seq 712928429, win 64240, options [mss 1460,sackOK,TS val 2087927777 ecr 0,nop,wscale 8], length 0
13:54:24.535049 IP 172.16.196.115 > 10.64.0.153: IP REDACTED.41518 > 208.80.154.232.443: Flags [S], seq 211167461, win 42340, options [mss 1460,sackOK,TS val 2445851461 ecr 0,nop,wscale 8], length 0
13:54:24.573691 IP 172.16.85.254 > 10.64.32.89: IP REDACTED.5216 > 208.80.154.232.443: Flags [S], seq 4293813065, win 42600, options [mss 1420,sackOK,TS val 3509097350 ecr 0,nop,wscale 8], length 0

See also