SystemTap

From Wikitech

SystemTap is a scripting language and tool to instrument production Linux systems. SystemTap scripts are compiled into loadable kernel modules and can instrument the execution of functions or statements in the kernel or user-space.

Examples

The following SystemTap script prints successful and failed TCP Fast Open connections by instrumenting the execution of the tcp_try_fastopen kernel function:

#! /usr/bin/env stap
#

function printconn(skb) {
    iphdr  = __get_skb_iphdr(skb)
    tcphdr = __get_skb_tcphdr(skb)
    daddr  = format_ipaddr(__ip_skb_daddr(iphdr), AF_INET())
    saddr  = format_ipaddr(__ip_skb_saddr(iphdr), AF_INET())
    dport  = __tcp_skb_dport(tcphdr)
    sport  = __tcp_skb_sport(tcphdr)

    printf(" TFO connection %s:%d -> %s:%d\n", saddr, sport, daddr, dport);
}

probe kernel.function("tcp_try_fastopen").return {
    if ($foc->len > 0) {
        if ($return) {
            printf("Successful");
        } else {
            printf("Failed");
        }
        printconn($skb);
    }
}

The user-space probing script below prints the Content-Length HTTP header as Varnish rewrites it in the function cnt_vdp. Only responses with status code 200 are considered. Note that this is a statement-level probe that uses line 104 of cache_req_fsm.c as its probe point. See the SystemTap Language Reference for details about the language, in particular section 4, "Probe points", for how to specify which portions of the code to instrument.

#! /usr/bin/env stap
#

probe process("/usr/sbin/varnishd").statement("cnt_vdp@cache/cache_req_fsm.c:104")
{
    if ($req->resp->status == 200) {
        printf("pid: %d time: %d func: %s -> ", pid(), gettimeofday_s(), ppfunc())
        printf("CL: %d -> %d\n", $resp_len, $req->resp_len)
    }
}

It is also possible to list all available probe points in a given function, together with the variables visible at each of them, using stap -L. For example:

stap -L 'process("/usr/sbin/varnishd").statement("VRB_Cache@cache/cache_req_body.c:*")'

Building and running SystemTap scripts

On Debian systems, the systemtap package needs to be installed on the machine where probes are being developed. The stap-prep command prepares the system for SystemTap use by installing the kernel headers, debug symbols and build tools that match the currently running kernel or, optionally, a kernel version given by the user. Note that, for user-space probing, the kernel debug symbols do not necessarily need to be installed.

The following example shows how to build the h2_spdy_stats.ko kernel module from a .stp script called h2-spdy-stats.stp. The -p 4 option stops stap after pass 4 (compilation), so the module is built but not loaded:

stap -v -m h2_spdy_stats h2-spdy-stats.stp -p 4

Specific kernel versions can be targeted using the -r option in case the production kernel version differs from the kernel version running on the development machine:

stap -v -r 4.4.0-1-amd64 -m h2_spdy_stats h2-spdy-stats.stp -p 4
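
The two build commands above can be wrapped in a small helper. The following is only a sketch: the function names stap_module_name and stap_build and the STAP override variable are hypothetical, introduced here for illustration. It derives the module name from the script name the same way the example does (h2-spdy-stats.stp becomes h2_spdy_stats) and passes -r only when a kernel release is given.

```shell
# Derive a module name from a .stp file name by replacing every character
# other than letters, digits and underscores with an underscore.
stap_module_name() {
    basename "$1" .stp | tr -c 'A-Za-z0-9_\n' '_'
}

# Build (but do not load) a module from a script.
#   $1: path to the .stp script
#   $2: optional kernel release to build for, passed to stap -r
# STAP can be set to another command (e.g. STAP=echo) to preview the call.
stap_build() {
    script=$1
    release=${2:-}
    name=$(stap_module_name "$script")
    if [ -n "$release" ]; then
        ${STAP:-stap} -v -r "$release" -m "$name" -p 4 "$script"
    else
        ${STAP:-stap} -v -m "$name" -p 4 "$script"
    fi
}
```

For instance, stap_build h2-spdy-stats.stp 4.4.0-1-amd64 would run the same command as the second example above, and STAP=echo stap_build h2-spdy-stats.stp only prints the arguments that would be used.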

On production machines, only the systemtap-runtime package needs to be installed in order to run compiled SystemTap scripts.

The staprun command executes kernel modules produced by stap on production machines. Users do not need to be root in order to do so, provided that the following conditions are met:

  • The user is a member of the stapusr group
  • The kernel module is owned by root and located under /lib/modules/`uname -r`/systemtap/
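
The two conditions above can be checked before attempting to load a module. The helpers below are a minimal sketch, assuming GNU coreutils (id -nG and stat -c, as found on Debian); the function names are hypothetical and not part of SystemTap.

```shell
# Directory in which members of stapusr may load modules.
stap_module_dir() {
    printf '/lib/modules/%s/systemtap\n' "$(uname -r)"
}

# Succeeds if the given user (default: the current user) is in stapusr.
in_stapusr() {
    id -nG "${1:-$(id -un)}" 2>/dev/null | tr ' ' '\n' | grep -qx stapusr
}

# Succeeds if the module given by absolute path exists, is owned by
# root (uid 0) and is located directly in stap_module_dir.
module_ok() {
    [ -f "$1" ] || return 1
    [ "$(stat -c %u "$1")" = 0 ] || return 1
    [ "$(dirname "$1")" = "$(stap_module_dir)" ]
}
```

A typical check would be in_stapusr && module_ok "$(stap_module_dir)/h2_spdy_stats.ko", which returns a non-zero status if either condition is not met.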

The script can then be executed like this:

staprun -v /lib/modules/`uname -r`/systemtap/h2_spdy_stats.ko

For user-space probes, execution can be limited to a specific PID by passing -x $pid to staprun.
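
As an illustration of the -x option, the following sketch wraps staprun in a function that validates the PID first. The function name stap_run_for_pid and the STAPRUN override variable are hypothetical, introduced here only for the example.

```shell
# Run a compiled SystemTap module against a single process.
#   $1: PID passed to staprun -x
#   $2: path to the compiled .ko module
# STAPRUN can be set to another command (e.g. STAPRUN=echo) to preview the call.
stap_run_for_pid() {
    pid=$1
    module=$2
    case $pid in
        # Reject empty or non-numeric PIDs before calling staprun.
        ''|*[!0-9]*) echo "invalid pid: $pid" >&2; return 2 ;;
    esac
    ${STAPRUN:-staprun} -v -x "$pid" "$module"
}
```

The PID can come from any source, for example pgrep varnishd. Which varnishd process to target depends on where the probed code runs, so check the PID before using it.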