Netconsole

What is netconsole? What is it used for?

Quoting the manual:

This module logs kernel printk messages over UDP allowing debugging of problem where disk logging fails and serial consoles are impractical.

It can be used either built-in or as a module. As a built-in, netconsole initializes immediately after NIC cards and will bring up the specified interface as soon as possible. While this doesn't allow capture of early kernel panics, it does capture most of the boot process.

Production setup

The syslog::centralserver hosts run a rudimentary but fully working netconsole server to receive kernel logs. Clients send logs to the syslog anycast IP address (syslog.anycast.wmnet) on port 6666, so the logs may land on either the eqiad or the codfw syslog hosts.

To display the netconsole log, log into the centrallog hosts and run journalctl -u netconsole

As of Dec 2021 netconsole is opt-in for clients; see the documentation on profile::netconsole::client for how to enable it for your role.

Manual setup

This is a quick guide to setting up a rudimentary netconsole logger over netcat for a one-off troubleshooting need.

There is no need to install extra tools for this setup.

For this test we will use the following two servers:

db2034 - server that needs troubleshooting
db2048 - server that will receive the messages

On db2048 we need netcat listening on a UDP port of our choice. Let's use 6666 and send the output to a log file so we can check it later:

db2048: nc -l -u -p 6666 | tee /var/log/netconsole.log

From db2034 we can test if the port is able to receive UDP connections:
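One way to run this check is to send a single UDP datagram with netcat and watch for it in the listener on db2048. This is a minimal sketch: the helper name send_udp_test is made up for illustration, and it assumes the nc on db2034 accepts -u (UDP) and -w (timeout in seconds).

```shell
# Hypothetical helper: send one UDP datagram to HOST:PORT using netcat.
# $1 = receiver address, $2 = receiver port
send_udp_test() {
    # -u selects UDP; -w 1 makes nc give up after one second of inactivity
    echo "netconsole test from $(uname -n)" | nc -u -w 1 "$1" "$2"
}

# Usage on db2034 (address and port taken from this guide):
#   send_udp_test 10.192.32.108 6666
```

Because UDP is connectionless, success is only visible on the receiving side: the test line should show up in the nc output on db2048.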

Now let's configure netconsole on db2034 to send the messages to db2048. If possible, use a receiver on the same subnet to keep things simple.

Data we need before we set up the kernel module:

db2034 IP: 10.192.32.5
db2034 MAC: a0:d3:c1:fa:2d:80
db2048 IP: 10.192.32.108
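These values can be collected with the ip tool on db2034. The sketch below assumes the interface is eth0 and that both hosts share a subnet, so db2048 appears in db2034's neighbour table. The helper names parse_ipv4 and parse_lladdr are made up for illustration.

```shell
# Hypothetical helper: print the first IPv4 address found in the output of
# `ip -4 -o addr show dev IFACE` (read from stdin), without the prefix length.
parse_ipv4() {
    awk '{ split($4, a, "/"); print a[1]; exit }'
}

# Hypothetical helper: print the MAC address (the field after "lladdr") from
# the output of `ip neigh show ADDRESS` (read from stdin).
parse_lladdr() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "lladdr") { print $(i + 1); exit } }'
}

# Usage on db2034:
#   ip -4 -o addr show dev eth0 | parse_ipv4
#   ping -c 1 10.192.32.108 >/dev/null    # populate the neighbour table
#   ip neigh show 10.192.32.108 | parse_lladdr
```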

Let's go ahead and load the kernel module:

db2034: modprobe netconsole netconsole=6666@10.192.32.5/eth0,6666@10.192.32.108/40:a8:f0:35:22:8c

Now we can check that it was enabled:

 db2034: dmesg | tail -n5
 <snip>
 [ 5205.486828] console [netcon0] enabled
 [ 5205.486829] netconsole: network logging started

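We are using the same local and remote port (6666) to keep things simple. The parameter reads as follows: db2034 sends from its own IP (10.192.32.5), source port 6666, out of eth0, to port 6666 on db2048's IP (10.192.32.108). The MAC address after the final slash is the destination MAC for the packets, that is, the receiver's MAC (or the gateway's, when the receiver is on a different subnet), not db2034's own.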
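The layout of the parameter can be made explicit by assembling it from its six parts. This is only an illustrative sketch: netconsole_param is a hypothetical helper, not part of any tool.

```shell
# Hypothetical helper: build the value for the netconsole= module parameter.
# $1 = source port    $2 = source IP    $3 = source interface
# $4 = target port    $5 = target IP    $6 = target MAC address
netconsole_param() {
    printf '%s@%s/%s,%s@%s/%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Usage on db2034 (values taken from this guide):
#   modprobe netconsole \
#       netconsole="$(netconsole_param 6666 10.192.32.5 eth0 6666 10.192.32.108 40:a8:f0:35:22:8c)"
```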

Before testing, let's raise the console log level on db2034 so that every message is sent to the console:

db2034: dmesg -n 8
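To confirm the change took effect, the first field of /proc/sys/kernel/printk holds the current console log level. A minimal sketch; the helper name console_loglevel is made up for illustration, and the optional argument exists only so the function can be pointed at another file.

```shell
# Hypothetical helper: print the current console log level, i.e. the first
# field of /proc/sys/kernel/printk (or of the file given as $1).
console_loglevel() {
    awk '{ print $1; exit }' "${1:-/proc/sys/kernel/printk}"
}

# Usage on db2034, before and after running `dmesg -n 8`:
#   console_loglevel
```

Noting the value before changing it makes it easy to restore the previous level once troubleshooting is done.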

How can I test if it is working?

Let's try to log a test message to see if we can see it on db2048:

root@db2034:/srv# modprobe floppy
modprobe: ERROR: could not insert 'floppy': No such device
root@db2034:/srv#

And we can see it on the other side:

root@db2048:~# nc -l -u -p6666 | tee /var/log/netconsole.log
[ 8644.321281] Floppy drive(s): fd0 is 1.44M
[ 8647.363068] floppy0: no floppy controllers found
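Once troubleshooting is finished, netconsole can be unloaded again and the console log level put back. This is a sketch, assuming the module was loaded with modprobe as shown earlier. netconsole_teardown is a hypothetical helper, and the level to restore is whatever was in effect before dmesg -n 8 was run.

```shell
# Hypothetical helper: stop network logging and restore the console log level.
# $1 = console log level to restore (the value in effect before `dmesg -n 8`)
netconsole_teardown() {
    # Unload the module first; only touch the log level if that succeeded
    modprobe -r netconsole && dmesg -n "$1"
}

# Usage on db2034, as root:
#   netconsole_teardown <previous-level>
```

The netcat listener on db2048 can then be stopped with Ctrl-C; the captured messages remain in /var/log/netconsole.log.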