Posts Tagged ‘Nagios’

An unlikely correlation

February 5th, 2010 Jonathan No comments

I just spotted that my Nagios/RRD graphs of my home server are showing a strange correlation.

From these graphs, it seems that the higher the outdoor temperature, the more free memory the system has available. I’m sure this is just a coincidence, though…

[Graph: outdoor temperature]

[Graph: free memory]

Categories: Gadgets, Linux

Various Nagios plugins

October 15th, 2009 Jonathan No comments

I’ve now written several Nagios plugins and submitted them all to MonitoringExchange.

Here’s a quick summary:

  • check_temper for monitoring the temperature with a TEMPer USB thermometer
  • check_kernel for checking that the currently running kernel on an RPM-based system is the most recent installed kernel (not necessarily the latest available kernel in the repository)
  • check_aql_balance for monitoring the number of SMS text message credits on your AQL account[1]
  • check_k8temp for monitoring the temperature of an AMD K8 (e.g. Athlon or Sempron) CPU

[1] See my blog post if you are interested in setting up SMS alerts with Nagios

Nagios plugin for TEMPer USB thermometer

October 12th, 2009 Jonathan 2 comments

As I said in a previous post, I finally got my TEMPer USB thermometer to work on Fedora, thanks to a patch by Tollef Fog Heen that has now been incorporated into the Fedora kernel.

I’m not familiar with C so I only made minor tweaks to Tollef Fog Heen’s code, which returns a temperature as a number. I wrote a wrapper in Perl that crudely interfaces this program to Nagios. In reality, I should wise up on my C a little and write the whole thing in C. When I do this, I’ll submit it to Monitoring Exchange.

For the time being, I’ll publish my Nagios plugin on this blog, in the hope that it might be useful to someone, despite being incredibly hacky.

First you’ll need the code for the program that reads the temperature from the TEMPer. Compile it like this:

g++ -o get_temper TEMPer2.c

Note that the path to the TEMPer device is hard-coded in the C source. If yours isn’t at /dev/ttyUSB0 then you’ll need to change the source before compiling.

Then download my Nagios plugin (check_temper), and put both the plugin and the program get_temper in your Nagios plugin directory. This is likely to be /usr/local/nagios/libexec if you built from source, and /usr/lib/nagios/plugins if you installed from RPM in the Fedora repository.

Now all you have to do is the usual Nagios magic for adding any other plugin. Simple!

Update

Forget all that you’ve read above! I’ve now rewritten the entire plugin in C, so there is no need for the Perl wrapper. You can download it from MonitoringExchange.

Setting up NRPE remote Linux monitoring with Nagios

August 18th, 2009 Jonathan No comments

This is a short and simple guide, explaining how to set up remote monitoring of Linux hosts using NRPE in Nagios. The procedure is simple, but having searched for information on this earlier today I didn’t find a straightforward all-inclusive guide, so I’ve written my own.

These instructions were written with Nagios 3.0.6, and they assume that you already have a working Nagios monitoring server. They assume that the monitoring server was installed from RPM, not from source (some paths will vary).

Configuring the remote server

First, we install the NRPE on the remote server to be monitored. This comes as standard in the Fedora repositories, but on CentOS you’ll need to add the EPEL repository first.

yum install nrpe

We’ll need to make one or two changes to get it working. First open up /etc/nagios/nrpe.cfg and find the allowed_hosts directive. Set it to the IP address of your Nagios monitoring server.

allowed_hosts=123.123.123.123
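
If you have several remote hosts to set up, the same edit can be scripted. This is only a minimal sketch: set_allowed_hosts is a made-up helper name, not something that ships with NRPE, and it assumes the file already contains an uncommented allowed_hosts line.

```shell
#!/bin/sh
# Sketch: set allowed_hosts in an NRPE config file.
# set_allowed_hosts is a hypothetical helper, not part of NRPE.
set_allowed_hosts() {
    cfg_file=$1      # e.g. /etc/nagios/nrpe.cfg
    monitor_ip=$2    # IP address of the Nagios monitoring server

    # Keep a backup, then rewrite only the allowed_hosts line
    cp "$cfg_file" "$cfg_file.bak" || return 1
    sed "s/^allowed_hosts=.*/allowed_hosts=$monitor_ip/" "$cfg_file.bak" > "$cfg_file"
}

# Example (run as root on the remote server):
#   set_allowed_hosts /etc/nagios/nrpe.cfg 123.123.123.123
```

The original file is left alongside as nrpe.cfg.bak, so a bad edit is easy to undo.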

Edit your /etc/sysconfig/iptables and add a line to allow port 5666/TCP from the monitoring server’s IP address.

-A INPUT -m tcp -p tcp -s 123.123.123.123 --dport 5666 -j ACCEPT

Finally, restart iptables and start NRPE to get it working. We also tell NRPE to start on boot.

service iptables restart
service nrpe start
chkconfig nrpe on

Configuring the Nagios server

Edit your commands.cfg (usually in /etc/nagios/objects/ if you installed from RPM) and add the following command definition:

define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }

If this is your first remote Linux host to monitor, create a new host definition file in the same directory as commands.cfg, e.g. linux.cfg. Make a host definition for your new server:

define host{
        use                     linux-server
        host_name               myserver
        alias                   My Server
        address                 234.234.234.234
        }

Add the following to it as a test to show it works:

define service{
        use                         generic-service
        host_name                   myserver
        service_description         PING
        check_command               check_ping!100.0,20%!500.0,60%
        }

define service{
        use                         generic-service
        host_name                   myserver
        service_description         Load
        check_command               check_nrpe!check_load
        }

Restart Nagios and ensure that both tests work OK. If so, we can move on to some custom tests.

Custom checks

The default NRPE client comes with a handful of built-in tests. You can see these near the bottom of nrpe.cfg on your remote machine. But they’re not very exciting, and you’ll probably want to use some of the other checks. If you want to see a list of the available checks in your yum repo, try this:

yum list available 'nagios-plugins-*'

Install any that take your fancy. You’ll need to set up a definition for them in your nrpe.cfg. Use the examples in the file, and try running the Nagios plugin yourself to see if it gives you any clues about the arguments it wants.
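
To see at a glance which check names are already defined in nrpe.cfg (these are the names you pass to check_nrpe on the Nagios server), something like the following should do. It is a rough sketch; list_nrpe_commands is a made-up name, and it only picks up uncommented command[...] lines.

```shell
# Sketch: list the check names defined in an NRPE config file.
# list_nrpe_commands is a hypothetical helper, not part of NRPE.
list_nrpe_commands() {
    # Definitions look like: command[check_load]=/path/to/plugin args
    # Print only the name between the square brackets.
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Example:
#   list_nrpe_commands /etc/nagios/nrpe.cfg
```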

Please note, in the default config of NRPE, you cannot use placeholders like $ARG1$, for security reasons. Either hardcode the values in, like

command[check_hda1]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/hda1

or enable dont_blame_nrpe=1 further up in the file. There is a security risk associated with doing this. Your funeral!

Restart NRPE again, and let’s move on to setting up your Nagios server. There is no need to create a new command definition, since we are using NRPE again. So open up linux.cfg and let’s add a service definition for the check_hda1 that exists in nrpe.cfg.

define service{
        use                             generic-service
        host_name                       myserver
        service_description             Disk status
        check_command                   check_nrpe!check_hda1
        }

Restart Nagios again and your new checks should appear. Go ahead and install any useful plugins from your yum repository, or have a look at Monitoring Exchange, a great source of free Nagios plugins.

I wrote my own plugins for monitoring your account balance with AQL and checking for the latest installed kernel. One day I will probably get round to uploading them to Monitoring Exchange.

Categories: Linux, Nagios

Checking for the latest kernel with Nagios

August 17th, 2009 Jonathan No comments

I’ve just written a module for Nagios that will determine if the currently running kernel is the latest kernel available on the system. It will not tell you if there is a newer kernel in a yum repository or similar.

The main gotcha is that you need an RPM-based system for my script to work, e.g. RHEL, CentOS, Fedora and many others. It is most certainly not bulletproof, but it works on my systems.

All feedback welcome.

N.B. I’ve now published this module on Monitoring Exchange. Please download the plugin from there, as I will keep that copy up to date if there are changes in the future (and the copy on this page is likely to go out of date).

check_kernel

#!/usr/bin/perl -w

# Usage:   check_kernel

use strict;
use lib "/usr/local/nagios/libexec";
use utils qw(%ERRORS);

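# Check for rpm first. Depending on the version, a failing "which" prints
# either nothing or "no rpm in ..." on stdout.
my $rpm = `which rpm 2>/dev/null`;

if (!$rpm || $rpm =~ m/no rpm in/i) {
   print "UNKNOWN - You must be running an RPM-based system\n";
   exit $ERRORS{'UNKNOWN'};
}

my $running_kernel=`uname -r`;
my $installed_kernel=`rpm -q kernel | tail -n 1`;

# Backticks return an empty string when a command prints nothing,
# so test for that as well as undef
if (!$running_kernel || !$installed_kernel) {
   print "UNKNOWN - Test failed\n";
   exit $ERRORS{'UNKNOWN'};
}

chomp $running_kernel;
chomp $installed_kernel;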

# Strip off the "kernel-" prefix so the strings will match
$installed_kernel =~ s/kernel-//gi;

# Do the test
if ($running_kernel eq $installed_kernel) {
   print "OK - running latest installed kernel ($running_kernel)\n";
   exit $ERRORS{'OK'};
} else {
   print "WARNING - reboot to run latest installed kernel ($installed_kernel)\n";
   exit $ERRORS{'WARNING'};
}
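
The listing above takes the last line of rpm -q kernel as the newest kernel. An alternative worth considering is rpm -q --last, which orders packages by install time with the most recent first. The following is a sketch only: the function names are made up, and it assumes that the most recently installed kernel is also the newest one, which is usual but not guaranteed.

```shell
# Sketch: pick the most recently installed kernel from "rpm -q --last kernel".
# newest_installed_kernel and kernel_is_current are hypothetical helpers.
newest_installed_kernel() {
    # Reads the listing on stdin. The first line is the newest install and
    # the first field is the package name; strip the "kernel-" prefix.
    awk 'NR == 1 { sub(/^kernel-/, "", $1); print $1 }'
}

# Succeeds when the running kernel matches the installed one.
kernel_is_current() {
    [ "$1" = "$2" ]
}

# Example:
#   installed=$(rpm -q --last kernel | newest_installed_kernel)
#   kernel_is_current "$(uname -r)" "$installed" && echo "OK" || echo "WARNING"
```
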

Categories: Guides, Linux, Nagios

Monitoring AQL SMS credit with Nagios

August 12th, 2009 Jonathan No comments

Further to yesterday’s post about setting up SMS alerts from Nagios, I decided I wanted to monitor how many SMS credits I have left in my account.

AQL provide a way of finding out via an HTTP request, so I set about writing a perl module to check and return the result to Nagios.

N.B. I’ve now published this module on Monitoring Exchange. Please download the plugin from there, as I will keep that copy up to date if there are changes in the future (and the copy on this page is likely to go out of date).

check_aql_balance

#! /usr/bin/perl -w
# Usage: check_aql_balance [username] [password] [warning] [critical]
# Example: check_aql_balance fred bloggs 100 50
#         CRITICAL, Balance 23 credits

use strict;
use LWP::Simple;
use lib "/usr/local/nagios/libexec";
use utils qw(%ERRORS);

my $username = $ARGV[0];
my $password = $ARGV[1];
my $warningval;
my $criticalval;
$warningval = $ARGV[2] or $warningval = 20;
$criticalval = $ARGV[3] or $criticalval = 10;
$warningval =~ s/[^0-9]//gi;
$criticalval =~ s/[^0-9]//gi;

if (!defined $username || !defined $password) {
    print "UNKNOWN, Unable to retrieve account balance\n";
    exit $ERRORS{'UNKNOWN'};
}

my $url = "http://gw1.aql.com/sms/postmsg.php?username=$username&password=$password&cmd=credit";
my $content = get $url;

if (!defined $content) {
    print "UNKNOWN, Unable to retrieve account balance\n";
    exit $ERRORS{'UNKNOWN'};
} elsif ($content =~ m/AUTHERROR/i) {
    print "UNKNOWN, Unable to retrieve account balance\n";
    exit $ERRORS{'UNKNOWN'};
}

$content =~ s/[^0-9]//gi;
# An empty string means the response contained no digits at all
if ($content ne '') {
    if ($content < $criticalval) {
        # critical
        print "CRITICAL, Balance $content credits\n";
        exit $ERRORS{'CRITICAL'};
    } elsif ($content < $warningval) {
        # warning
        print "WARNING, Balance $content credits\n";
        exit $ERRORS{'WARNING'};
    } else {
        # ok
        print "OK, Balance $content credits\n";
        exit $ERRORS{'OK'};
    }
} else {
    # invalid number
    print "UNKNOWN, Unable to retrieve account balance\n";
    exit $ERRORS{'UNKNOWN'};
}

The only required arguments are the AQL username and password, but you can optionally specify the limits that trigger Warning or Critical status. If you omit these, the script defaults to values of 20 and 10.
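
The threshold logic on its own can be tried out without an AQL account. Here is a minimal sketch of it in shell; classify_balance is a made-up name and the credit count is passed in by hand rather than fetched over HTTP. The return values follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```shell
# Sketch of the threshold logic only; classify_balance is a hypothetical name.
# Usage: classify_balance CREDITS [WARNING] [CRITICAL]
classify_balance() {
    credits=$1
    warn=${2:-20}    # default warning threshold, as in the Perl script
    crit=${3:-10}    # default critical threshold, as in the Perl script

    # Anything that is not a whole number is reported as UNKNOWN
    case $credits in
        ''|*[!0-9]*)
            echo "UNKNOWN, Unable to retrieve account balance"
            return 3 ;;
    esac

    if [ "$credits" -lt "$crit" ]; then
        echo "CRITICAL, Balance $credits credits"
        return 2
    elif [ "$credits" -lt "$warn" ]; then
        echo "WARNING, Balance $credits credits"
        return 1
    fi
    echo "OK, Balance $credits credits"
}

# Example:
#   classify_balance 15          (uses the defaults of 20 and 10)
#   classify_balance 23 100 50
```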

In your commands.cfg, add a block like this to define the command. Again, you can omit the last 2 parameters if you are happy with the defaults.

define command{
    command_name    check_aql_balance
    command_line    $USER1$/check_aql_balance $ARG1$ $ARG2$ $ARG3$ $ARG4$
}

And finally, in the localhost.cfg (or any other config file for hosts/services) you can add the service like this.

define service{
    use                             local-service
    host_name                       localhost
    service_description             AQL account balance
    check_command                   check_aql_balance!fred!bloggs!20!10
    notifications_enabled           1
}

Simples!

Categories: Guides, Linux, Nagios

SMS alerts with Nagios

August 11th, 2009 Jonathan 2 comments

In a previous post I mentioned how easy it is to increase functionality in Nagios.

Today I was asked to set up SMS alerts in Nagios, as well as the existing email alerts. I am by no means the first person to write about this, but this post is intended to be a start-to-finish guide, covering all aspects.

Choosing a provider

The first step is choosing a provider which has a decent API for sending SMS messages. I chose AQL, as I have used them in the past. They allow you to send SMS messages via a web interface, HTTP GET, HTTP POST, or email.

Given that, perhaps the easiest way to get SMS alerts is to have Nagios email its alerts to the AQL SMS gateway. But I wanted to do it “properly”.

So I signed up for an account with AQL and bought a small number of SMS credits for the account. It’s also possible to have a contract for heavy usage, but I can always upgrade to that if I need to.

Defining a new method of alerting in Nagios

In the file /usr/local/nagios/etc/objects/commands.cfg there is a section where notification commands are defined. I added a couple more definitions for SMS alerts for hosts and services. My SMS script would have a calling syntax like /path/to/script phone_number message.

So I added a couple of definitions like this, obviously using a real mobile phone number:

define command{
        command_name    alert-service-by-sms
        command_line    /usr/local/nagios/libexec/alert-by-sms 01234567890 "Nagios Service $NOTIFICATIONTYPE$ Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$"
        }

define command{
        command_name    alert-host-by-sms
        command_line    /usr/local/nagios/libexec/alert-by-sms 01234567890 "Nagios Host $NOTIFICATIONTYPE$ Alert: $HOSTALIAS$ is $HOSTSTATE$"
        }

The script

Now all that remains is to write the script that will do the legwork. If you decide to go with AQL as your provider, you need to install their Perl module from CPAN. Use a command like this:

cpan SMS::AQL

And then make a Perl script like this. You can save it anywhere you like; I chose to put it with the other Nagios executables in /usr/local/nagios/libexec just to keep it with the rest. Adjust the username and password to match your AQL credentials, and set the sender parameter to be either a UK mobile number (so the recipient can reply to the message) or simply a text string which appears as a name to the recipient, and does not allow them to reply.

#!/usr/bin/perl -wT

use strict;
use SMS::AQL;

my $to = $ARGV[0];
my $msg = $ARGV[1];
$to =~ s/[^0-9]//gi;

my $sms = SMS::AQL->new({
    username => 'fred',
    password => 'bloggs',
    options => {
        sender => 'Nagios',
    },
});

my ($ok, $error) = $sms->send_sms($to, $msg);
if (!$ok) {
    print "Failed to send the message, error: $error\n";
} else {
    print "Success!\n";
}

It is, of course, wise to test that your script works by calling it from the command line. Once you’re happy it works, it’s time to tell Nagios to start sending alerts.
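
For a command-line test it helps to send a message shaped like the one Nagios will produce. The sketch below builds such a string by hand. service_alert_text is a made-up helper; in real use Nagios expands the $...$ macros itself when it runs the notification command.

```shell
# Sketch: build a message shaped like the alert-service-by-sms command line.
# service_alert_text is a hypothetical helper, not part of Nagios.
service_alert_text() {
    # Arguments: notification type, host alias, service description, state
    printf 'Nagios Service %s Alert: %s/%s is %s\n' "$1" "$2" "$3" "$4"
}

# Example (this sends a real SMS, so it uses a credit):
#   /usr/local/nagios/libexec/alert-by-sms 01234567890 \
#       "$(service_alert_text PROBLEM 'My Server' PING CRITICAL)"
```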

Enabling SMS alerts

This time, we need to edit /usr/local/nagios/etc/objects/contacts.cfg. Modify your contact entry to add the service_notification_commands and host_notification_commands lines shown below. This assumes you have only one user with a mobile phone – remember their mobile number is hard-coded into the command definition!

If you have more than one user and you set alert-service-by-sms or alert-host-by-sms for both, you will get two text messages sent to the same phone number for each Nagios alert. As I only have one user it’s not a problem for me, but in the future I will probably post a more elegant solution where each user can get an individual text message.

define contact{
        contact_name                    jonathan
        use                             generic-contact
        alias                           Jonathan
        email                           alerts@example.com
        service_notification_commands   alert-service-by-sms
        host_notification_commands      alert-host-by-sms
        }

And that should be everything! Now you have to test it, either by breaking a host or service, or setting up a bogus one that will definitely fail.

Categories: Guides, Linux, Nagios

On Nagios

August 11th, 2009 Jonathan No comments

The things I’m about to say will almost certainly be common knowledge to anyone who has used Nagios before, but I’ll say them anyway.

In short, Nagios is a network/server monitoring tool. It’s web based and can monitor almost any network device. It comes with dozens of modules included, covering everything from a simple ping to more complicated tests such as executing MySQL queries.

A friend of mine runs a web design and hosting business, and I look after his Linux boxes and some aspects of his network infrastructure in the data centre. He asked me to look into a monitoring tool for his various devices, which include Windows servers, Linux servers, managed switches, routers and firewalls, a couple of NAS boxes and some data centre kit such as an IP KVM and PDU.

I had heard of Nagios before but never used it, so I thought I’d give it a go. I was delighted at how easy it was to compile and install on CentOS, and to get a handful of basic tests set up on localhost. A small amount of fiddling later, I had the majority of the “advanced” tests set up, such as monitoring of HTTP, FTP, MySQL and other services on the Windows boxes. A slightly larger amount of fiddling later and I was interrogating several of the infrastructure devices via SNMP.

I was also very pleased at how modular and extendible the system is. Each test is simply defined in a config file, and an appropriate executable for the test is provided. By “executable” I mean anything that can be called by Nagios, and provides a return code for yay or nay. Many of the included executables were binary files, but I found many free downloadable modules online, many written in Perl. I have written some of my own in bash and Perl.
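
To make the “return code” idea concrete: by the usual plugin convention, an exit code of 0 means OK, 1 WARNING, 2 CRITICAL and 3 UNKNOWN. The sketch below runs any executable and names the status Nagios would assign to it. status_name and run_check are made-up helper names, not part of Nagios.

```shell
# Sketch: run any check and name its Nagios status from the exit code.
# status_name and run_check are hypothetical helpers, not part of Nagios.
status_name() {
    case $1 in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

run_check() {
    check_rc=0
    "$@" || check_rc=$?      # run the plugin with its arguments
    echo "exit code $check_rc = $(status_name "$check_rc")"
    return "$check_rc"
}

# Example:
#   run_check /usr/local/nagios/libexec/check_ping -H 127.0.0.1 -w 100.0,20% -c 500.0,60%
```

Anything that prints a line of text and exits with one of those codes can be dropped in as a test, whatever language it is written in.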

I’ve also downloaded other extensions, such as the ability to have an RSS feed of status alerts.

If any of the Nagios developers read this, well done, and keep up the good work. This is the ideal tool – quick to set up, yet with endless possibility for expansion once you have a little familiarity.

Categories: Linux, Nagios