.. title: BPF Superpowers for Linux
.. slug: bpf-superpowers-for-linux
.. date: 2019-07-11 23:36:32 UTC+02:00
.. tags: linux, kernel
.. type: text

A powerful set of tools is helpful not only for developers.  On
GNU/Linux, `strace <https://strace.io/>`_ is one such tool, and it has
already helped me with a wide variety of problems.  However, it is
always confined to one process or a process tree and cannot help with
system-wide questions.  After many attempts in the past, the Linux
developers now seem to have settled on a common underlying
infrastructure for such tools in the form of `BPF
<https://www.kernel.org/doc/html/latest/bpf/index.html>`_, which
allows flexible and high-performance probing.

Already back in 2016, the Netflix developer Brendan Gregg showed what
is possible in his talk `Linux BPF Superpowers
<http://www.brendangregg.com/blog/2016-03-05/linux-bpf-superpowers.html>`_.

.. image:: /images/bcc_tracing_tools_2016.png
   :alt: Linux bcc/BPF Tracing Tools
   :name: Linux bcc/BPF Tracing Tools
   :align: center
   :width: 400

.. TEASER_END

If you want to get a quick idea of what is possible with this tool set,
you can install it easily on a Debian Buster GNU/Linux system:

.. code-block:: console

  dzu@krikkit:~$ sudo apt install bpfcc-tools
  [sudo] password for dzu: 
  Reading package lists... Done
  Building dependency tree       
  Reading state information... Done
  The following additional packages will be installed:
    ieee-data libbpfcc python-bpfcc python-netaddr
  Suggested packages:
    ipython python-netaddr-docs
  The following NEW packages will be installed:
    bpfcc-tools ieee-data libbpfcc python-bpfcc python-netaddr
  0 upgraded, 5 newly installed, 0 to remove and 0 not upgraded.
  Need to get 15.9 MB of archives.
  After this operation, 66.7 MB of additional disk space will be used.
  Do you want to continue? [Y/n] 
  Get:1 http://deb.debian.org/debian buster/main amd64 libbpfcc amd64 0.8.0-4 [13.4 MB]
  Get:2 http://deb.debian.org/debian buster/main amd64 python-bpfcc all 0.8.0-4 [29.4 kB]
  Get:3 http://deb.debian.org/debian buster/main amd64 ieee-data all 20180805.1 [1590 kB]
  Get:4 http://deb.debian.org/debian buster/main amd64 python-netaddr all 0.7.19-1 [229 kB]
  Get:5 http://deb.debian.org/debian buster/main amd64 bpfcc-tools all 0.8.0-4 [654 kB]
  Fetched 15.9 MB in 1s (11.4 MB/s)   
  Selecting previously unselected package libbpfcc.
  (Reading database ... 307420 files and directories currently installed.)
  Preparing to unpack .../libbpfcc_0.8.0-4_amd64.deb ...
  Unpacking libbpfcc (0.8.0-4) ...
  Selecting previously unselected package python-bpfcc.
  Preparing to unpack .../python-bpfcc_0.8.0-4_all.deb ...
  Unpacking python-bpfcc (0.8.0-4) ...
  Selecting previously unselected package ieee-data.
  Preparing to unpack .../ieee-data_20180805.1_all.deb ...
  Unpacking ieee-data (20180805.1) ...
  Selecting previously unselected package python-netaddr.
  Preparing to unpack .../python-netaddr_0.7.19-1_all.deb ...
  Unpacking python-netaddr (0.7.19-1) ...
  Selecting previously unselected package bpfcc-tools.
  Preparing to unpack .../bpfcc-tools_0.8.0-4_all.deb ...
  Unpacking bpfcc-tools (0.8.0-4) ...
  Setting up ieee-data (20180805.1) ...
  Setting up python-netaddr (0.7.19-1) ...
  Setting up libbpfcc (0.8.0-4) ...
  Setting up python-bpfcc (0.8.0-4) ...
  Setting up bpfcc-tools (0.8.0-4) ...
  Processing triggers for man-db (2.8.5-2) ...
  Processing triggers for libc-bin (2.28-10) ...
  dzu@krikkit:~$ 
  
One simple but very powerful tool is *execsnoop*, which shows every
exec call on the system.  On Debian, the package maintainer has decided
to give all the tools a *-bpfcc* suffix so that they do not collide
with tools from the perf suite, so we have to call it like this:

.. code-block:: console

  dzu@krikkit:~$ sudo execsnoop-bpfcc
  PCOMM            PID    PPID   RET ARGS
  nikola           11710  7479     0 /home/dzu/src/python/nikola/bin/nikola build
  ldconfig         11711  11710    0 /sbin/ldconfig -p
  sh               11715  11714    0 /bin/sh -c command -v debian-sa1 > /dev/null && debian-sa1 1 1
  debian-sa1       11716  11715    0 /usr/lib/sysstat/debian-sa1 1 1
  nikola           11717  7479     0 /home/dzu/src/python/nikola/bin/nikola build
  ldconfig         11718  11717    0 /sbin/ldconfig -p
  nikola           11724  7479     0 /home/dzu/src/python/nikola/bin/nikola build
  ldconfig         11725  11724    0 /sbin/ldconfig -p
  sshd             11741  797      0 /usr/sbin/sshd -D -R
  sshd             11766  797      0 /usr/sbin/sshd -D -R
  sh               11768  767      0 /bin/sh -c iptables -w -n -L INPUT | grep -q 'f2b-sshd[ \t]'
  iptables         11769  11768    0 /usr/sbin/iptables -w -n -L INPUT
  grep             11770  11768    0 /usr/bin/grep -q f2b-sshd[ \t]
  sh               11771  767      0 /bin/sh -c iptables -w -I f2b-sshd 1 -s 61.177.172.158 -j REJECT --reject-with icmp-port-unreachable
  iptables         11772  11771    0 /usr/sbin/iptables -w -I f2b-sshd 1 -s 61.177.172.158 -j REJECT --reject-with icmp-port-unreachable
  nikola           11788  7479     0 /home/dzu/src/python/nikola/bin/nikola build
  ldconfig         11789  11788    0 /sbin/ldconfig -p

This run shows the blog software Nikola rebuilding the page during my
edits, some unsolicited attempts to log in to my PC, and `fail2ban
<https://www.fail2ban.org/wiki/index.php/Main_Page>`_ finally kicking
in to block the IP address of the offending ssh attempt.

The nifty little tool is in fact a readable `Python
<https://github.com/iovisor/bcc/blob/tag_v0.8.0/tools/execsnoop.py>`_
script `dynamically
<https://github.com/iovisor/bcc/blob/tag_v0.8.0/tools/execsnoop.py#L166>`_
loading `C code
<https://github.com/iovisor/bcc/blob/tag_v0.8.0/tools/execsnoop.py#L61>`_
to `hook
<https://github.com/iovisor/bcc/blob/tag_v0.8.0/tools/execsnoop.py#L168>`_
the entry to and the return from the kernel's ``execve`` system call.
``syscall__execve`` and ``do_ret_sys_execve`` are the names of the two
C handler functions in the script that get attached to these points.
This is indeed very elegant.  Keep in mind, however, that the C code is
not compiled into native machine code but into byte code for the BPF
virtual machine, which the kernel verifies before running it.  This
virtual machine is intentionally *not* a general-purpose one, in order
to preserve safety and security guarantees.
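
To make the mechanism more tangible, here is a minimal sketch of the
same idea.  It is not the execsnoop source, only an illustration under
a few assumptions: the ``bcc`` Python module from the packages
installed above is importable, and the handler name ``trace_execve``
and its message text are made up for this example.

.. code-block:: python

  """Minimal bcc sketch: log one line for every execve() on the system."""

  # C source of the BPF program.  bcc compiles it to BPF byte code at
  # run time.  ``trace_execve`` is a hypothetical name for this sketch.
  BPF_PROGRAM = r"""
  int trace_execve(void *ctx)
  {
      bpf_trace_printk("execve called\n");
      return 0;
  }
  """


  def main():
      # Imported here so the file can be inspected without bcc installed.
      from bcc import BPF

      # Compile the C text and load the resulting byte code into the kernel.
      b = BPF(text=BPF_PROGRAM)

      # Look up the architecture-specific kernel symbol for execve.
      execve_fnname = b.get_syscall_fnname("execve")

      # Run our handler every time that kernel function is entered.
      b.attach_kprobe(event=execve_fnname, fn_name="trace_execve")

      print("Tracing execve()... press Ctrl-C to stop.")
      try:
          # Print the kernel trace pipe line by line until interrupted.
          b.trace_print()
      except KeyboardInterrupt:
          pass


  if __name__ == "__main__":
      main()

The script has to run as root, using a Python interpreter that can
import the ``bcc`` module.  Unlike execsnoop it prints only a fixed
message per call; reading the arguments and the return value is what
the remaining C code in the real tool is for.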

Other interesting tools are `biotop
<https://github.com/iovisor/bcc/blob/master/tools/biotop_example.txt>`_
and `cachetop
<https://github.com/iovisor/bcc/blob/master/tools/cachetop_example.txt>`_.
The first shows block I/O operations, whereas the latter shows Linux
page cache hits and misses, both in a *top*-like fashion.

If you are interested in BPF and are a fan of printed books, you will
be glad to hear that O'Reilly just published `Linux
Observability with BPF
<http://shop.oreilly.com/product/0636920242581.do>`_ for your perusal.

All in all, BPF is an exciting new Linux kernel technology, and it is
interesting to reflect on how this subsystem, which originated as a
"packet filter", has gained (and will continue to gain) completely new
applications.  It certainly is a powerful tool for Linux developers.
