SysRq

source: https://access.redhat.com/node/2023

How to use the SysRq facility to collect information from a server which has hung

Solution Verified - Updated April 1 2015 at 3:40 AM

Environment

    • Red Hat Enterprise Linux 6

    • Red Hat Enterprise Linux 5

    • Red Hat Enterprise Linux 4

    • Red Hat Enterprise Linux 3

Issue

    • How to use the SysRq facility to collect information from a server which is frozen or unresponsive

    • How to manually force a crash in server hang conditions

    • If the system goes into a hung state, can a vmcore or core dumps be captured?

    • How do I enable SysRq to force a kernel panic?

Resolution

What is the "Magic" SysRq key?

    • According to the Linux kernel documentation:

      • It is a 'magical' key combo you can hit which the kernel will respond to regardless of whatever else it is doing, even if the console is unresponsive.

    • The sysrq key is one of the best (and sometimes the only) way to determine what a machine is really doing. It is useful when a server appears to be "hung" or for diagnosing elusive, transient, kernel-related problems.

How do I enable and disable the SysRq key?

    • For security reasons, Red Hat Enterprise Linux disables the SysRq key by default. To enable it, run:

        • # echo 1 > /proc/sys/kernel/sysrq

    • To disable it:

        • # echo 0 > /proc/sys/kernel/sysrq

    • To enable it permanently, set the kernel.sysrq value in /etc/sysctl.conf to 1.

        • # grep sysrq /etc/sysctl.conf
          kernel.sysrq = 1

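    • To make the change both live and persistent, set the value in /etc/sysctl.conf and also write it to /proc/sys/kernel/sysrq as shown above, or reload the file as root with sysctl -p.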

Security Note: Because enabling sysrq gives someone with physical console access extra abilities, it is recommended to disable it when not troubleshooting a problem or to ensure that physical console access is properly secured.
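As a rough illustration of the persistent setting described above, the following sketch rewrites the kernel.sysrq line in a sysctl-style file so that exactly one entry remains. The function name set_sysrq_conf and its arguments are assumptions made for this example and are not part of any Red Hat tooling. Against the real /etc/sysctl.conf it would have to be run as root.

```shell
# Hypothetical helper: set kernel.sysrq in a sysctl-style config file.
# $1 = value to store (for example 0 or 1)
# $2 = config file (defaults to /etc/sysctl.conf)
set_sysrq_conf() (
    value=$1
    conf=${2:-/etc/sysctl.conf}
    tmp=$(mktemp) || exit 1
    # Create the file if it does not exist yet.
    [ -f "$conf" ] || : > "$conf"
    # Keep every line except an existing kernel.sysrq assignment.
    grep -v '^[[:space:]]*kernel\.sysrq[[:space:]]*=' "$conf" > "$tmp"
    # Append the single new assignment.
    printf 'kernel.sysrq = %s\n' "$value" >> "$tmp"
    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp" > "$conf" && rm -f "$tmp"
)

# Usage (as root), followed by a reload of /etc/sysctl.conf:
#   set_sysrq_conf 1
#   sysctl -p
```

Removing the old line before appending keeps the helper idempotent: running it twice leaves one kernel.sysrq entry rather than two.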

How do I trigger a sysrq event?

    • There are several ways to trigger a sysrq event. On a normal system, with an AT keyboard, sysrq events can be triggered from the console with the following key combination, even if other commands are hanging:

        • Alt+PrintScreen+[CommandKey]

    • For instance, to tell the kernel to dump memory info (command key "m"), you would hold down the Alt and Print Screen keys, and then hit the m key.

    • Note that this will not work from an X Window System screen. You should first change to a text virtual terminal. Hit Ctrl+Alt+F1 to switch to the first virtual console prior to hitting the sysrq key combination.

    • On a serial console, you can achieve the same effect by sending a Break signal to the console and then hitting the command key within 5 seconds. This also works for virtual serial console access through an out-of-band service processor included in many servers, or a remote console such as HP iLO, Sun ILOM, or IBM RSA. Refer to the documentation for your service processor for details on how to send a Break signal.

    • If you have a root shell on the machine (and the system is responding enough for you to do so), you can also write the command key character to the /proc/sysrq-trigger file. This is useful for triggering this info when you are not on the system console or for triggering it from scripts.

        • # echo 'm' > /proc/sysrq-trigger

Note: The above method of utilizing the SysRq facility is not affected by the previously indicated kernel.sysrq tunable. The only method available for fully deactivating the SysRq facility would be a custom kernel with the following compilation option deactivated:

CONFIG_MAGIC_SYSRQ=y
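For scripted use of /proc/sysrq-trigger, a small wrapper can reject anything that is not a single command key before it reaches the kernel. This is only a sketch: the name sysrq_trigger and the optional second argument (an alternative target file, handy for dry runs) are assumptions made for this example.

```shell
# Hypothetical wrapper around /proc/sysrq-trigger.
# $1 = single command key (for example m, t, or w)
# $2 = trigger file (defaults to /proc/sysrq-trigger)
sysrq_trigger() (
    key=$1
    trigger=${2:-/proc/sysrq-trigger}
    # Accept exactly one letter or digit; reject empty or longer input.
    case $key in
        [a-z0-9]) ;;
        *) echo "sysrq_trigger: expected one command key, got '$key'" >&2
           exit 2 ;;
    esac
    if [ ! -w "$trigger" ]; then
        echo "sysrq_trigger: cannot write to $trigger (root required?)" >&2
        exit 1
    fi
    printf '%s\n' "$key" > "$trigger"
)

# Usage (as root): dump memory information.
#   sysrq_trigger m
```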

When I trigger a sysrq event that generates output, where does it go?

When a sysrq command is triggered, the kernel will print out the information to the kernel ring buffer and to the system console. This information is normally logged via syslog to /var/log/messages.

Unfortunately, when dealing with machines that are extremely unresponsive, syslogd is often unable to log these events. In these situations, provisioning a serial console is often recommended for collecting the data.
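When syslog did manage to record the events, the output can be pulled back out of the log afterwards. The sketch below assumes that each sysrq event starts with a header line containing the word "SysRq" in some capitalisation and that the dump follows on the next lines. Both the function name sysrq_log_lines and the default of 20 context lines are assumptions for this example, and the -A option requires GNU or BSD grep.

```shell
# Hypothetical helper: show sysrq headers plus the lines that follow them.
# $1 = log file (defaults to /var/log/messages)
# $2 = number of lines to print after each header (defaults to 20)
sysrq_log_lines() (
    log=${1:-/var/log/messages}
    after=${2:-20}
    # -i matches the header regardless of capitalisation.
    grep -i -A "$after" 'sysrq' "$log"
)

# Usage: the same search works on a saved copy of the kernel ring buffer.
#   dmesg > /tmp/dmesg.txt
#   sysrq_log_lines /tmp/dmesg.txt 40
```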

What sort of sysrq events can be triggered?

There are several sysrq events that can be triggered once the sysrq facility is enabled. These vary somewhat between kernel versions, but there are a few that are commonly used:

    • m - dump information about memory allocation

    • t - dump thread state information

    • p - dump current CPU registers and flags

    • c - intentionally crash the system (useful for forcing a diskdump or netdump)

    • s - immediately sync all mounted filesystems

    • u - immediately remount all filesystems read-only

    • b - immediately reboot the machine

    • o - immediately power off the machine (if configured and supported)

    • f - start the Out Of Memory Killer (OOM)

    • w - dump tasks that are in uninterruptible (blocked) state [Introduced with kernel 2.6.32]

Before using the SysRq facility, please consult with your vendors as third party applications may be impacted.
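The keys listed above fall into two groups: those that only print diagnostic information and those that change the state of the system. A collection script might want to refuse the second group unless explicitly told otherwise. The sketch below encodes that split. The function name sysrq_key_class and the labels "dump" and "action" are assumptions for this example, and the grouping is read off the descriptions above rather than taken from the kernel.

```shell
# Hypothetical classifier for the command keys listed in this article.
# Prints "dump" for keys that only print information, "action" for keys
# that change system state, and "unknown" for anything else.
sysrq_key_class() {
    case $1 in
        m|t|p|w)     echo dump ;;
        c|s|u|b|o|f) echo action ;;
        *)           echo unknown ;;
    esac
}

# Usage: only proceed automatically for diagnostic keys.
#   [ "$(sysrq_key_class "$key")" = "dump" ] || echo "refusing '$key'" >&2
```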

ON COMMENTS:

Dannie Obbink

A useful sequence of commands for rebooting an otherwise unresponsive system is known by the "Raising Elephants" mnemonic.

The following commands would be entered in this sequence:

unRaw (take control of keyboard back from X),

tErminate (send SIGTERM to all processes, allowing them to terminate gracefully),

kIll (send SIGKILL to all processes, forcing them to terminate immediately),

Sync (flush data to disk),

Unmount (remount all filesystems read-only),

reBoot.

A mnemonic for this sequence is "Raising Elephants Is So Utterly Boring"
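The sequence above is meant to be typed at the keyboard. From a root shell, only its tail (Sync, Unmount, reBoot) lends itself to scripting, because the tErminate and kIll steps signal all processes and would most likely end the shell running the script before it reached the reboot. The following is a minimal sketch of that tail. The names sysrq_sub_keys and sysrq_sub, the optional trigger file argument, and the 2 second default delay are assumptions for this example.

```shell
# Hypothetical helper: the keys for Sync, Unmount (remount read-only), reBoot.
sysrq_sub_keys() { printf '%s\n' s u b; }

# WARNING: against the real /proc/sysrq-trigger this reboots the machine
# immediately, without shutting services down cleanly.
# $1 = trigger file (defaults to /proc/sysrq-trigger)
# $2 = seconds to wait after each key (defaults to 2)
sysrq_sub() (
    trigger=${1:-/proc/sysrq-trigger}
    delay=${2:-2}
    for key in $(sysrq_sub_keys); do
        printf '%s\n' "$key" > "$trigger" || exit 1
        # Give the kernel time to finish the sync or remount.
        sleep "$delay"
    done
)

# Usage (as root, last resort only):
#   sysrq_sub
```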

When running a kernel with SysRq compiled in, /proc/sys/kernel/sysrq controls the functions allowed to be invoked via the SysRq key. The default value in this file is set by the CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE config symbol, which itself defaults to 1. Here is the list of possible values in /proc/sys/kernel/sysrq:

    • 0 - disable sysrq completely

    • 1 - enable all functions of sysrq

    • >1 - bitmask of allowed sysrq functions (see below for detailed function description):

        • 2 = 0x2 - enable control of console logging level

        • 4 = 0x4 - enable control of keyboard (SAK, unraw)

        • 8 = 0x8 - enable debugging dumps of processes etc.

        • 16 = 0x10 - enable sync command

        • 32 = 0x20 - enable remount read-only

        • 64 = 0x40 - enable signalling of processes (term, kill, oom-kill)

        • 128 = 0x80 - allow reboot/poweroff

        • 256 = 0x100 - allow nicing of all RT tasks

You can set the value in the file by the following command:

    • # echo "number" > /proc/sys/kernel/sysrq
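A bitmask value is the bitwise OR of the individual function values listed above. As a worked example, allowing only sync (16), remount read-only (32), and reboot/poweroff (128) gives 16 + 32 + 128 = 176. The sketch below does the same arithmetic from symbolic names. The function name sysrq_mask and the short names it accepts are assumptions for this example and are not kernel identifiers.

```shell
# Hypothetical helper: build a kernel.sysrq bitmask from symbolic names.
# Accepted names (made up for this sketch) map to the documented values.
sysrq_mask() (
    mask=0
    for name in "$@"; do
        case $name in
            loglevel) bit=2 ;;    # control of console logging level
            keyboard) bit=4 ;;    # control of keyboard (SAK, unraw)
            dump)     bit=8 ;;    # debugging dumps of processes etc.
            sync)     bit=16 ;;   # sync command
            remount)  bit=32 ;;   # remount read-only
            signal)   bit=64 ;;   # term, kill, oom-kill
            reboot)   bit=128 ;;  # reboot/poweroff
            nice)     bit=256 ;;  # nicing of all RT tasks
            *) echo "sysrq_mask: unknown function '$name'" >&2
               exit 2 ;;
        esac
        # OR the bit in, so repeated names do not change the result.
        mask=$((mask | bit))
    done
    echo "$mask"
)

# Usage (as root): allow only sync, remount read-only, and reboot.
#   echo "$(sysrq_mask sync remount reboot)" > /proc/sys/kernel/sysrq
```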