Linux Shell Scripts
Monitoring Linux with Nagios
How to parse SNMP traps
Example 1
This example shows how to configure snmptrapd to parse MIBs from a third-party vendor.
By default, Net-SNMP clients and servers only understand a set of default MIBs.
Identify the MIBs for your platform/product.
As an example, we'll use the Tripwire Enterprise MIB, and in particular its alertTrap block:
alertTrap TRAP-TYPE
ENTERPRISE tripwireEnterprise
VARIABLES { twNodeName, twRuleName, twElementName, twSeverity,
twNodeId, twRuleId, twElementId, twChangeType }
DESCRIPTION
"This event is generated when Tripwire detected a
change in an element. The event includes the origin
node, the rule that detected the change, the element
for which the change was detected, and the severity
of the change as indicated by the rule."
::= 0
Install them into your $MIBDIR, usually $PREFIX/share/snmp/mibs/
/usr/share/snmp/mibs/TRIPWIRE-ENTERPRISE.MIB.txt
Update the snmptrapd configuration file (/etc/snmp/snmptrapd.conf)
traphandle TRIPWIRE-ENTERPRISE-MIB::tripwireEnterprise /usr/sbin/mparser.sh
Restart the snmptrapd service.
Here's the skeleton of the bash script /usr/sbin/mparser.sh:
DATE_VERBOSE="$(date +"%Y.%m.%d %H:%M:%S (%a)")"
while read oid val; do
case "$oid" in
TRIPWIRE-ENTERPRISE-MIB::*)
eval ${oid/*::/}="'$(echo "$val")'" ;;
SNMP-COMMUNITY-MIB::snmpTrapAddress.0)
snmpTrapAddress="$val" ;;
*) ;;
esac
done
# remove the leading string "Wrong Type (should be IpAddress):"
[[ "$twNodeName" =~ ^Wrong\ Type ]] && twNodeName="${twNodeName/*\ /}"
(
echo -e "\
[$DATE_VERBOSE - MIB alertTrap]
snmpTrapAddress\t$snmpTrapAddress"
for snmptrapvar in \
twNodeName twRuleName twElementName twSeverity twChangeType; do
echo -en "$snmptrapvar\t"; echo "${!snmptrapvar}"
done
echo
) >> /var/log/mibparser.log
In the log you'll find a list of blocks similar to the following one:
[2012.08.31 22:15:30 (Fri) - MIB alertTrap]
snmpTrapAddress	192.168.1.100
twNodeName	"server01.domain.com"
twRuleName	"RHEL System Configuration Files"
twElementName	"/etc/passwd"
twSeverity	100
twChangeType	2
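snmptrapd passes a trap to a traphandle program on standard input: the first line is the sender's host name, the second its transport address, and each following line holds one OID and its value. The sketch below replays such an input from a here-document so that the parsing logic can be tried without a running daemon. It is only an illustration, not the mparser.sh script itself: the function name parse_trap and the sample values are hypothetical, and it matches an explicit list of variable names instead of using eval.

```shell
#!/bin/bash
# Hypothetical helper: read a trap in the layout snmptrapd uses for
# traphandle programs and print selected Tripwire fields, tab-separated.
parse_trap() {
  local host addr oid val
  local twNodeName="" twRuleName="" twSeverity=""
  read -r host               # line 1: host name of the sender
  read -r addr               # line 2: transport address of the sender
  while read -r oid val; do  # remaining lines: "<OID> <value>"
    case "$oid" in
      # the trailing * also accepts an instance suffix such as ".0"
      TRIPWIRE-ENTERPRISE-MIB::twNodeName*) twNodeName="$val" ;;
      TRIPWIRE-ENTERPRISE-MIB::twRuleName*) twRuleName="$val" ;;
      TRIPWIRE-ENTERPRISE-MIB::twSeverity*) twSeverity="$val" ;;
      *) ;;                  # ignore every other varbind
    esac
  done
  printf 'twNodeName\t%s\n' "$twNodeName"
  printf 'twRuleName\t%s\n' "$twRuleName"
  printf 'twSeverity\t%s\n' "$twSeverity"
}

# Sample input (made-up values) replayed from a here-document.
parse_trap <<'EOF'
server01.domain.com
UDP: [192.168.1.100]:161->[192.168.1.1]:162
TRIPWIRE-ENTERPRISE-MIB::twNodeName "server01.domain.com"
TRIPWIRE-ENTERPRISE-MIB::twRuleName "RHEL System Configuration Files"
TRIPWIRE-ENTERPRISE-MIB::twSeverity 100
EOF
```

Matching known names keeps untrusted trap content out of eval, at the cost of listing every variable of interest explicitly.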
Example 2
This example explains how to configure the Linux snmptrapd daemon (net-snmp) to catch a specific SNMP trap sent by a QRadar cluster when the master role switches from one node to another.
The relevant part of the QRadar MIB follows.
systemTrapsGroup NOTIFICATION-GROUP
NOTIFICATIONS { xcbLogin, xcbLoginFailure, xcbLogout, xcbConfigChange,
xcbAlert, xcbError, xcbBackupFailed, xcbArchiveFailed,
xcbDBError, xcbLimitReached, xcbHaNodeChanged,
xcbTimestampError, xcbTimeSyncLost, xcbRaidStatus,
xcbHWError, xcbFirmwareTainted, xcbDiskFull }
STATUS current
DESCRIPTION "System related traps"
::= { groups 1 }
xcbHaNodeChanged NOTIFICATION-TYPE
OBJECTS { description, node }
STATUS current
DESCRIPTION "HA node state changed"
::= { traps 12 }
One line has to be added to the snmptrapd configuration file (/etc/snmp/snmptrapd.conf):
traphandle XCB-SNMP-MIB::xcbHaNodeChanged /usr/sbin/qradar_trap_handler.sh
A shell script is then executed each time this SNMP trap is caught by the snmptrapd daemon, and Nagios is notified via an external command. Here's the script.
#!/bin/bash
# Send a notification to Nagios when an SNMP trap
# "XCB-SNMP-MIB::xcbHaNodeChanged" is received.
# Copyright (C) 2014 Davide Madrisan <davide.madrisan@gmail.com>
[ $(id -u) -eq 0 ] || {
  echo "${0##*/}: This script requires root. Aborting..." 1>&2 && exit 1; }
NOW=`date +%s 2>/dev/null`
NAGIOS_CMDFILE="/var/spool/nagios/cmd/nagios.cmd"
LOGFILE="/var/log/snmptrapd_qradar_node_changed.log"
HOST_NAME="QRADAR_VIP"
SERVICE_NAME="SNMPTRAP_QRADARP_NODE_CHANGED"
RETCODE="2"
MSG="CRITICAL: Node has changed"
# for debug only - this block can be removed
while test -n "$1"; do
  case "$1" in
    --debug|-d)
      RETCODE="0"
      MSG="OK: No SNMP traps caught" ;;
  esac
  shift
done
# submit the passive check result via the Nagios external command file;
# anything printed by this step (errors included) goes to the log file
(
  /usr/bin/printf "[%lu] PROCESS_SERVICE_CHECK_RESULT;$HOST_NAME;$SERVICE_NAME;$RETCODE;$MSG\n" \
    $NOW > $NAGIOS_CMDFILE
) >> $LOGFILE 2>&1
See the Nagios documentation on external commands for more information.
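The external command used above has a fixed layout: a Unix timestamp in square brackets, the command name, then the host name, service description, return code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and plugin output, separated by semicolons. The following is a minimal sketch of a formatter for that line. The function name format_passive_result and the scratch file are hypothetical, chosen so the example can run on a machine without Nagios.

```shell
#!/bin/bash
# Hypothetical helper: print one passive service check result in the
# Nagios external command format:
#   [<unix-time>] PROCESS_SERVICE_CHECK_RESULT;<host>;<service>;<return-code>;<output>
format_passive_result() {
  local host="$1" service="$2" retcode="$3" output="$4"
  local now="${5:-$(date +%s)}"   # optional timestamp, defaults to "now"
  printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
    "$now" "$host" "$service" "$retcode" "$output"
}

# Write to a scratch file here; on a real poller the target would be the
# Nagios command file (a named pipe) used in the script above.
cmdfile="$(mktemp)"
format_passive_result QRADAR_VIP SNMPTRAP_QRADARP_NODE_CHANGED 2 \
  "CRITICAL: Node has changed" 1395307951 > "$cmdfile"
cat "$cmdfile"
rm -f "$cmdfile"
```

Keeping the formatting in one function makes it easy to check the line by eye before pointing the redirection at the real command file.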
We can simulate such an SNMP trap by executing the following command on the Nagios poller command line:
/usr/bin/snmptrap -v 2c -c public 127.0.0.1 '' XCB-SNMP-MIB::xcbHaNodeChanged
The Nagios log will report a couple of lines:
[1395307951] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;QRADAR_VIP;SNMPTRAP_QRADARP_NODE_CHANGED;2;CRITICAL: Node has changed
[1395307956] PASSIVE SERVICE CHECK: QRADAR_VIP;SNMPTRAP_QRADARP_NODE_CHANGED;2;CRITICAL: Node has changed
Nagios/NRPE plugin skeleton
#!/bin/bash
# check_something.sh: Check something...
# Copyright (C) 2014 ... <...@...>
PROGNAME=`/bin/basename $0`
PROGPATH=`echo $0 | sed -e 's,[\\/][^\\/][^\\/]*$,,'`
REVISION="<SCRIPT-VERSION>"
. $PROGPATH/utils.sh
print_usage() {
echo "Usage: $PROGNAME <...plugin-options...>"
echo " $PROGNAME --help"
echo " $PROGNAME --version"
echo
}
print_help() {
echo "$PROGNAME v$REVISION"
echo "Nagios Plugin: check something..."
echo "Copyright (C) 2012 ... <...@...>"
echo ""
print_usage
}
# Make sure the correct number of command line
# arguments have been supplied
if [ $# -lt 1 ]; then
print_usage
exit $STATE_UNKNOWN
fi
# Set default values
SCRIPT_OPTION=
EXIT_STATUS=$STATE_UNKNOWN #default
# Grab the command line arguments
while test -n "$1"; do
case "$1" in
--help|-h)
print_help
exit $STATE_OK
;;
--version|-V)
echo "$PROGNAME v$REVISION"
exit $STATE_OK
;;
--<long-option>|-<short-option>)
SCRIPT_OPTION="$2"
shift
;;
*)
echo "Unknown argument: $1"
print_usage
exit $STATE_UNKNOWN
;;
esac
shift
done
# <...SCRIPT-CODE...>
# if OK:
echo "OK"; EXIT_STATUS=$STATE_OK
# if WARNING:
echo "WARNING: <WARNING-MESSAGE>"; EXIT_STATUS=$STATE_WARNING
# if critical:
echo "CRITICAL: <CRITICAL-MESSAGE>"; EXIT_STATUS=$STATE_CRITICAL
exit $EXIT_STATUS
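The skeleton above leaves the decision between OK, WARNING and CRITICAL to the plugin code. One possible way to fill that gap is sketched here. It assumes integer values and thresholds, treats a threshold of 0 as "not set", and defines the STATE_* constants locally with the standard plugin exit codes instead of sourcing utils.sh. The function name check_threshold is hypothetical.

```shell
#!/bin/bash
# Standard Nagios plugin exit codes (normally provided by utils.sh).
STATE_OK=0; STATE_WARNING=1; STATE_CRITICAL=2; STATE_UNKNOWN=3

# Hypothetical helper: print a status line for an integer value and return
# the matching state. A threshold of 0 means "not set".
check_threshold() {
  local value="$1" warn="${2:-0}" crit="${3:-0}" v
  for v in "$value" "$warn" "$crit"; do
    case "$v" in
      ""|*[!0-9]*) echo "UNKNOWN: non-numeric input"; return $STATE_UNKNOWN ;;
    esac
  done
  if [ "$crit" -ne 0 ] && [ "$value" -ge "$crit" ]; then
    echo "CRITICAL: value is $value"; return $STATE_CRITICAL
  elif [ "$warn" -ne 0 ] && [ "$value" -ge "$warn" ]; then
    echo "WARNING: value is $value"; return $STATE_WARNING
  fi
  echo "OK: value is $value"; return $STATE_OK
}

# Made-up measurement: 85 with warning at 80 and critical at 90.
EXIT_STATUS=$STATE_OK
check_threshold 85 80 90 || EXIT_STATUS=$?
echo "exit status: $EXIT_STATUS"
```

In a real plugin the last line would be "exit $EXIT_STATUS", as in the skeleton.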
Nagios/NSCA plugin skeleton
#!/bin/bash
# monitor_something.sh: Send alerts to Nagios ...
# Copyright (C) 2014 ... <...@...>
NSCA_HOSTNAME="SERVER_NAME"
NSCA_TARGET_IP="192.168.1.1" # this is the nagios IP address
NSCA_SERVICE_NAME="NAME-OF-THE-NAGIOS-PASSIVE-SERVICE"
NSCA_MESSAGE_OK="this is an OK message"
NSCA_MESSAGE_CRITICAL="this is a CRITICAL message"
LOG="/var/log/nsca_logging.log"
NSCA_STATE_OK=0
NSCA_STATE_CRITICAL=2
# <code-that-does-something-and-sets-RET>
if [ "$RET" = 0 ]; then
echo -e "\
$NSCA_HOSTNAME\t$NSCA_SERVICE_NAME\t$NSCA_STATE_OK\tOK: $NSCA_MESSAGE_OK" | \
/usr/sbin/send_nsca $NSCA_TARGET_IP -c /etc/nagios/send_nsca.cfg \
>> $LOG 2>&1
else
echo -e "\
$NSCA_HOSTNAME\t$NSCA_SERVICE_NAME\t$NSCA_STATE_CRITICAL\tCRITICAL: $NSCA_MESSAGE_CRITICAL" | \
/usr/sbin/send_nsca $NSCA_TARGET_IP -c /etc/nagios/send_nsca.cfg \
>> $LOG 2>&1
fi
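send_nsca reads one check result per line from standard input, with the host name, service description, return code and plugin output separated by tabs. The two branches of the skeleton above differ only in the state and the message, so the line can be built in one place. The sketch below shows that idea; the function name nsca_line and the value of RET are hypothetical, and the call to send_nsca is left out so the example runs anywhere.

```shell
#!/bin/bash
# Hypothetical helper: build one service check line for send_nsca:
#   <host>TAB<service>TAB<return-code>TAB<plugin-output>
nsca_line() {
  printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4"
}

# Map the result of the monitored action (RET) to a state and a message.
RET=1   # made-up value standing in for the real check
if [ "$RET" = 0 ]; then
  state=0; text="OK: this is an OK message"
else
  state=2; text="CRITICAL: this is a CRITICAL message"
fi

# Print the line; in the real script it would be piped to
# /usr/sbin/send_nsca as shown above.
nsca_line "SERVER_NAME" "NAME-OF-THE-NAGIOS-PASSIVE-SERVICE" "$state" "$text"
```

Using printf with explicit tabs avoids relying on the escape handling of "echo -e", which differs between shells.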
Monitoring Linux boxes via SNMP
#!/bin/bash
# check the health of a Linux appliance
# Copyright (C) 2013-2014 Davide Madrisan <davide.madrisan@gmail.com>
PROGNAME=`/bin/basename $0`
PROGPATH=`echo $0 | sed -e 's,[\\/][^\\/][^\\/]*$,,'`
REVISION=
. $PROGPATH/utils.sh
print_usage() {
echo "Usage: $PROGNAME -H <hostname> -C <snmp-community> --check <action>"
echo " $PROGNAME --help"
echo " $PROGNAME --version"
echo " $PROGNAME -H targetsrv -C public --check uptime"
echo " $PROGNAME -H targetsrv -C public --check disk:37"
echo " $PROGNAME -H targetsrv -C public --check netif:1"
echo
echo "List of known checks:"
echo " disk"
echo " load15"
echo " memory"
echo " netif"
echo " swap"
echo " uptime"
}
print_help() {
echo "$PROGNAME v$REVISION"
echo "Nagios Plugin: check the health of a Linux appliance"
echo "Copyright (C) 2013 Davide Madrisan <davide.madrisan@gmail.com>"
echo
print_usage
}
EXIT_STATUS=$STATE_UNKNOWN #default
WARN=0
CRIT=0
# Grab the command line arguments
while test -n "$1"; do
case "$1" in
--help|-h)
print_help
exit $STATE_OK
;;
--version|-V)
echo "$PROGNAME v$REVISION"
exit $STATE_OK
;;
--host|-H)
HOST="$2"; shift
;;
--community|-C)
COMMUNITY="$2"; shift
;;
--check)
SNMP_CHECK="$2"; shift
;;
--warning|-w)
WARN="$2"; shift
;;
--critical|-c)
CRIT="$2"; shift
;;
*)
echo "Unknown argument: $1"
print_usage
exit $STATE_UNKNOWN
;;
esac
shift
done
[ "$SNMP_CHECK" ] || { print_usage; exit $STATE_UNKNOWN; }
die() {
echo "CRITICAL: ${1:-SNMPWALK error while querying $HOST}"
exit $STATE_CRITICAL
}
snmp_helper() {
local OID="$1" snmp_opts="Ov"
[ "$2" = "alltokens" ] || snmp_opts="${snmp_opts}q"
snmp_output="$(\
LC_ALL=C \
/usr/bin/snmpwalk -v2c -c $COMMUNITY -$snmp_opts $HOST $OID 2>/dev/null)"
case "$snmp_output" in
"No Such Instance currently exists at this OID"|"") die ;;
*) echo "$snmp_output" ;;
esac
}
round0() { printf "%.0f" "$1"; }
case "$SNMP_CHECK" in
disk*)
[ "${SNMP_CHECK/*:}" = "$SNMP_CHECK" ] &&
{ echo "UNKNOWN: no filesystem set! (configuration BUG)";
exit $STATE_UNKNOWN; }
snmp_diskid="${SNMP_CHECK/*:}"
mountpoint="$(\
snmp_helper HOST-RESOURCES-MIB::hrStorageDescr.${snmp_diskid})" || die
disk_size_total="$(\
snmp_helper HOST-RESOURCES-MIB::hrStorageSize.${snmp_diskid})" || die
disk_size_used="$(\
snmp_helper HOST-RESOURCES-MIB::hrStorageUsed.${snmp_diskid})" || die
disk_used_perc=$(( (100 * $disk_size_used) / $disk_size_total ))
if [ $CRIT -ne 0 -a $CRIT -le $disk_used_perc ]; then
EXIT_STATUS=$STATE_CRITICAL
echo -n "CRITICAL: "
elif [ $WARN -ne 0 -a $WARN -le $disk_used_perc ]; then
EXIT_STATUS=$STATE_WARNING
echo -n "WARNING: "
else
EXIT_STATUS=$STATE_OK
echo -n "OK: "
fi
echo -n "\
Filesystem $mountpoint: $disk_used_perc% used \
($disk_size_used / $disk_size_total) | $mountpoint=$disk_size_used;"
[ "$WARN" -eq 0 ] && echo -n ";" || \
echo -n "$(($WARN * $disk_size_total / 100));"
[ "$CRIT" -eq 0 ] && echo -n ";" || \
echo -n "$(($CRIT * $disk_size_total / 100));"
echo "0;$disk_size_total"
;;
load15)
load15="$(snmp_helper UCD-SNMP-MIB::laLoad.3)" || die
load15_rounded="$(round0 $load15)"
if [ $CRIT -ne 0 -a $CRIT -le $load15_rounded ]; then
EXIT_STATUS=$STATE_CRITICAL
echo -n "CRITICAL: "
elif [ $WARN -ne 0 -a $WARN -le $load15_rounded ]; then
EXIT_STATUS=$STATE_WARNING
echo -n "WARNING: "
else
EXIT_STATUS=$STATE_OK
echo -n "OK: "
fi
echo -n "load average (15min): $(round0 $load15) | load_15min=$load15_rounded;"
[ "$WARN" -eq 0 ] && echo -n ";" || echo -n "$WARN;"
[ "$CRIT" -eq 0 ] && echo -n ";" || echo -n "$CRIT;"
echo
;;
memory)
ram_total="$(snmp_helper UCD-SNMP-MIB::memTotalReal.0)" || die
ram_free="$(snmp_helper UCD-SNMP-MIB::memAvailReal.0)" || die
ram_cache="$(snmp_helper UCD-SNMP-MIB::memCached.0)" || die
ram_used=$(( $ram_total - ($ram_free + $ram_cache) ))
ram_used_perc=$(( (100 * $ram_used) / $ram_total ))
if [ $CRIT -ne 0 -a $CRIT -le $ram_used_perc ]; then
EXIT_STATUS=$STATE_CRITICAL
echo -n "CRITICAL: "
elif [ $WARN -ne 0 -a $WARN -le $ram_used_perc ]; then
EXIT_STATUS=$STATE_WARNING
echo -n "WARNING: "
else
EXIT_STATUS=$STATE_OK
echo -n "OK: "
fi
echo -n "\
RAM used: $ram_used_perc% ($ram_used / $ram_total) | ram_used=$ram_used;"
[ "$WARN" -eq 0 ] && echo -n ";" || \
echo -n "$(( $WARN * $ram_total / 100 ));"
[ "$CRIT" -eq 0 ] && echo -n ";" || \
echo -n "$(( $CRIT * $ram_total / 100 ));"
echo "0;$ram_total"
;;
netif*)
[ "${SNMP_CHECK/*:}" = "$SNMP_CHECK" ] &&
{ echo "UNKNOWN: no network interface set! (configuration BUG?)"
exit $STATE_UNKNOWN; }
snmp_netifid="${SNMP_CHECK/*:}"
netif_ifName="$(snmp_helper IF-MIB::ifName.${snmp_netifid})" || die
netif_ifOperStatus="$(snmp_helper IF-MIB::ifOperStatus.${snmp_netifid})" \
|| die
netif_ifInOctets="$(snmp_helper IF-MIB::ifInOctets.${snmp_netifid})" \
|| die
netif_ifInErrors="$(snmp_helper IF-MIB::ifInErrors.${snmp_netifid})" \
|| die
netif_ifOutOctets="$(snmp_helper IF-MIB::ifOutOctets.${snmp_netifid})" \
|| die
netif_ifOutErrors="$(snmp_helper IF-MIB::ifOutErrors.${snmp_netifid})" \
|| die
case "$netif_ifOperStatus" in
down*)
EXIT_STATUS=$STATE_CRITICAL
echo -n "CRITICAL: network interface '$netif_ifName' is DOWN"
;;
up*)
EXIT_STATUS=$STATE_OK
echo -n "OK: network interface '$netif_ifName' is UP"
;;
*)
echo "UNKNOWN: unexpected SNMP output: $netif_ifOperStatus"
exit $STATE_UNKNOWN
;;
esac
echo " | \
${netif_ifName}_ifInOctets=$netif_ifInOctets \
${netif_ifName}_ifInErrors=$netif_ifInErrors \
${netif_ifName}_ifOutOctets=$netif_ifOutOctets \
${netif_ifName}_ifOutErrors=$netif_ifOutErrors"
;;
swap)
swap_total="$(snmp_helper UCD-SNMP-MIB::memTotalSwap.0)" || die
swap_free="$(snmp_helper UCD-SNMP-MIB::memAvailSwap.0)" || die
if [ $swap_total -eq 0 ]; then
swap_used=0
swap_used_perc=0
else
swap_used=$(( $swap_total - $swap_free ))
swap_used_perc=$(( (100 * $swap_used) / $swap_total ))
fi
if [ $CRIT -ne 0 -a $CRIT -le $swap_used_perc ]; then
EXIT_STATUS=$STATE_CRITICAL
echo -n "CRITICAL: "
elif [ $WARN -ne 0 -a $WARN -le $swap_used_perc ]; then
EXIT_STATUS=$STATE_WARNING
echo -n "WARNING: "
else
EXIT_STATUS=$STATE_OK
echo -n "OK: "
fi
echo -n "SWAP used: $swap_used_perc% ($swap_used / $swap_total) | swap_used=$swap_used;"
[ "$WARN" -eq 0 ] && echo -n ";" || \
echo -n "$(( $WARN * $swap_total / 100 ));"
[ "$CRIT" -eq 0 ] && echo -n ";" || \
echo -n "$(( $CRIT * $swap_total / 100 ));"
echo "0;$swap_total"
;;
uptime)
uptimemsg="$(snmp_helper DISMAN-EVENT-MIB::sysUpTimeInstance alltokens)" \
|| die
echo "UPTIME: $uptimemsg"
EXIT_STATUS=$STATE_OK
;;
*)
echo "UNKNOWN: unsupported SNMP check: $SNMP_CHECK"
exit $STATE_UNKNOWN
;;
esac
exit $EXIT_STATUS
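Several checks in the script above print performance data after the "|" separator in the form label=value;warn;crit;min;max, converting the percentage thresholds into absolute values and leaving a field empty when the threshold is 0. The sketch below isolates that conversion in one place. The function name perfdata_abs and the sample numbers are hypothetical, and only integer arithmetic is used.

```shell
#!/bin/bash
# Hypothetical helper: print "label=value;warn;crit;min;max" where the
# warning and critical levels are given as percentages of 'total' and are
# converted to absolute values. A level of 0 means "not set" and its
# field is left empty.
perfdata_abs() {
  local label="$1" used="$2" total="$3" warn_pct="${4:-0}" crit_pct="${5:-0}"
  local warn="" crit=""
  [ "$warn_pct" -eq 0 ] || warn=$(( $warn_pct * $total / 100 ))
  [ "$crit_pct" -eq 0 ] || crit=$(( $crit_pct * $total / 100 ))
  printf '%s=%s;%s;%s;0;%s\n' "$label" "$used" "$warn" "$crit" "$total"
}

# Made-up numbers: 2048 of 8192 units used, warning at 80%, critical at 90%.
used=2048; total=8192
echo "RAM used: $(( 100 * $used / $total ))% ($used / $total) | $(perfdata_abs ram_used $used $total 80 90)"
```

With these numbers the warning level becomes 6553 and the critical level 7372, because integer division drops the fractional part.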
IP Subnet Calculator
#!/bin/bash
# NETCMPT (netcompute) v2.1.2
# Copyright (C) 2001-2004 Davide Madrisan <davide.madrisan@gmail.com>
# NETCMPT is free software; you can redistribute it and/or modify it under the
# terms of the GNU General Public License as published by the Free Software
# Foundation; either version 2 of the License, or (at your option) any later
# version. This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
# See the GNU General Public License for more details.
script_name="netcmpt"
version_num="2.1.2"
function usage() {
echo "
$(echo $script_name | tr a-z A-Z) (NETcompute), version $version_num
Computes mask, wildcard mask, network and broadcast addresses
Copyright (C) 2004 Davide Madrisan <davide.madrisan@gmail.com>
Usage: $script_name <ipv4_address ipv4_mask>
$script_name <ipv4_address/bitmask>
$script_name --help
Examples are:
$script_name 192.168.229.161 255.255.248.0
$script_name 192.168.229.161/21
For bugs and suggestions, please contact me by e-mail"
exit 0
}
# Check $1 and $2 (if not empty) for syntax errors.
# The first or the first two args must contain a valid ipv4 address.
# If no errors are detected the function returns 0 and sets
# 'ip', 'mask_bits', and 'mask'. Otherwise it returns 1.
function check_args() {
# is $1 == "N1.N2.N3.N4" or == "N1.N2.N3.N4/M"?
if ! echo $1 | grep -Eq "^([0-9]+\.){3}[0-9]+(/[0-9]+){0,1}$"; then
return 1
fi
ip=$(echo $1 | cut -f1 -d/)
for i in 1 2 3 4; do
local byte=$(echo $ip | cut -f$i -d.)
[ $byte -le 255 ] || return 1
done
mask_bits=$(echo $1 | cut -f2 -s -d/)
if [ $mask_bits ]; then # the bitmask has been provided
[ $mask_bits -le 32 ] || return 1
# fill `mask' with the extended version of the given mask
# i.e. mask_bits = 24 --> mask = 255.255.255.0
local mb=$mask_bits
for i in 1 2 3 4; do
if [ $mb -ge 8 ]; then
mask=$mask"255"
let "mb -= 8"
else
mask=$mask$[256 - (2 << (7-$mb))]
[ $mb -gt 0 ] && let mb=0
fi
[ $i -lt 4 ] && mask=$mask"."
done
else
if ! echo $2 | grep -Eq "^([0-9]+\.){3}[0-9]+$"; then
return 1
fi
mask=$2
local prev_byte=255
local prev_bit=1
# calculate `mask_bits' using `mask'
let "mask_bits = 0"
for i in 1 2 3 4; do
local byte=$(echo $mask | cut -f$i -d.)
[ $byte -gt 255 ] && return 1
# the previous byte != '255'? --> the current one must be '0'
[ $prev_byte -lt 255 -a $byte -ne 0 ] && return 1
for j in $(seq 7 -1 0); do
local bit_value=$[($byte >> j) & 1]
# the previous bit = '0'? --> the current one cannot be set to '1'
[ $prev_bit -eq 0 -a $bit_value -eq 1 ] && return 1
[ $bit_value -eq 1 ] && let mask_bits+=1
prev_bit=$bit_value
done
prev_byte=$byte
done
fi
return 0
}
[ "$1" = "--help" -o "$1" = "-h" -o -z "$1" ] && usage
check_args $1 $2 ||
{ echo "\
ERROR: illegal IP address or mask
enter \`$script_name --help' if you need help"; exit 1; }
# calculate wildcard mask, broadcast and network addresses
for i in 1 2 3 4; do
ip_byte=$(echo $ip | cut -f$i -d.) # select the byte number 'i' from 'ip'
mask_byte=$(echo $mask | cut -f$i -d.) # same for 'mask'
let "ip_and_mask = $(($ip_byte & $mask_byte))"
let "mask_cplm = (255 - $mask_byte)"
# create the strings 'wildcard', 'broadcast', and 'network'
wildcard=$wildcard$mask_cplm
network=$network$ip_and_mask
broadcast=$broadcast$[ip_and_mask + mask_cplm]
if [ $i -lt 4 ]; then
network=$network"."
broadcast=$broadcast"."
wildcard=$wildcard"."
fi
done
echo "\
IP and mask $ip $mask (/$mask_bits)
wildcard mask $wildcard
network $network
broadcast $broadcast"
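The script above works byte by byte on the dotted-quad strings. The same results can be obtained by treating the address as a single 32-bit integer, which the following sketch illustrates. It is an alternative formulation, not part of NETCMPT: the function names ip_to_int, int_to_ip and netinfo are hypothetical, and no input validation is done (octets are assumed to be plain decimal numbers without leading zeros, the prefix length between 0 and 32).

```shell
#!/bin/bash
# Hypothetical helpers: convert between dotted-quad notation and a
# 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<EOF
$1
EOF
  echo $(( ($a << 24) | ($b << 16) | ($c << 8) | $d ))
}
int_to_ip() {
  echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# Print mask, wildcard mask, network and broadcast addresses for
# <ipv4_address> <prefix_length>.
netinfo() {
  local ip_int mask wildcard network broadcast
  ip_int=$(ip_to_int "$1")
  mask=$(( (0xFFFFFFFF << (32 - $2)) & 0xFFFFFFFF ))  # prefix bits set
  wildcard=$(( $mask ^ 0xFFFFFFFF ))                  # complement of the mask
  network=$(( $ip_int & $mask ))
  broadcast=$(( $network | $wildcard ))
  printf '%-10s %s\n' mask "$(int_to_ip $mask)"
  printf '%-10s %s\n' wildcard "$(int_to_ip $wildcard)"
  printf '%-10s %s\n' network "$(int_to_ip $network)"
  printf '%-10s %s\n' broadcast "$(int_to_ip $broadcast)"
}

# Same address as in the usage text of the script above.
netinfo 192.168.229.161 21
```

For 192.168.229.161/21 this gives the mask 255.255.248.0, the wildcard mask 0.0.7.255, the network 192.168.224.0 and the broadcast address 192.168.231.255.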