
Nagios: Monitor more with NRPE (Nagios Remote Plugin Executor)

July 13th, 2009 | 3 Comments | Posted in General Site, Nagios, Networking, Security

Previous reading:

Nagios set-up guide #1 – http://vkernel.co.uk/?p=36

Nagios set-up guide #2 - http://vkernel.co.uk/?p=41

NRPE (Nagios Remote Plugin Executor), as the name suggests, lets us execute plugins remotely and retrieve data that is useful to us in a monitoring capacity.

Normal check_nt! style Nagios commands work well but are pretty primitive. NRPE gives you the framework to do much more with your newly created monitoring solution, such as monitoring Microsoft Exchange functions, checking the status and completion state of Veritas and Symantec BackupExec jobs, and lots of other neat little features.

There is very little point in me reiterating how to install NRPE, as I would only be spoon-feeding you information from the fantastic guide I learned from, written by Jonathan Lydard and Tony Reix of bull.net: http://nfsv4.bullopensource.org/doc/admin_tools/latex_doc/installNRPE.pdf

The basic gist of the guide is to download the NRPE files from http://www.nagios.org/download and do a “tar -zxvf nagiosfilename.tar.gz”.

Then “cd nagios…”. Once inside the nagios directory, do a “./configure”, followed by a “make && make install” command.

Next, run “cp src/nrpe /usr/bin”, “cp src/check_nrpe /usr/local/nagios/libexec” and “cp sample-config/nrpe.cfg /etc”. This copies the relevant files into directories where they integrate with the original Nagios install. You will then need to start the NRPE service as a daemon by running “nrpe -c /etc/nrpe.cfg -d”.

Finally, open up “/usr/local/nagios/etc/objects/commands.cfg” and add the following definition:

# ’check_nrpe’ command definition
define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}

This is all you need to do on your server to enable NRPE. Easy eh, and people say Linux is hard (sarcasm). If you hit missing dependencies, make sure you read the error and either Google it (other people will have had the same issue), or try “yum install missing-package” or “apt-get install missing-package” for RHEL/CentOS and Ubuntu respectively, where missing-package is the package named in the error message, e.g. gcc.

Now you will need to configure the remote server to listen for NRPE and respond to it correctly. Firstly, make sure that TCP 5666/5667 on the external IP address is being forwarded to the LAN IP of your remote host, i.e. your server. 5666 is the port NRPE listens on by default (the NSClient++ listener used by check_nt listens on 12489 by default). Once we have ensured connectivity, open up your NSC.ini file (in C:/Program Files/NSClient++). Towards the bottom of the file you will see the following lines:

[NRPE Handlers]
;# COMMAND DEFINITIONS
;# Command definitions that this daemon will run.
;# Can be either NRPE syntax:
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
;# Or simplified syntax:
test=c:\test.bat foo $ARG1$ bar
;check_disk1=/usr/local/nagios/libexec/check_disk -w 5 -c 10
check_cpu=inject checkCPU warn=80 crit=90 5 10 15
...
nrpe_cpu=inject checkCPU warn=80 crit=90 5 10 15

These are the commands that Nagios will run via NRPE, and the commands we will be specifying in our objects file. For a command to be retrievable, you need to uncomment it by deleting the “;” in front of it. Save the file and restart the “NSClient++” service via services.msc (Start -> Run ->..).

On the Linux machine, you can now check NRPE connectivity to your host by running the commands:

cd /usr/local/nagios/libexec
./check_nrpe -H host.ip.add.ress -p 5666

You need to [c]hange [d]irectory to libexec as that’s where the check_nrpe executable lives. check_nrpe then queries the external IP on port 5666 for an NRPE response. You can check individual commands by doing:

./check_nrpe -H host.ip.add.ress -p 5666 -c nrpe_cpu

If everything is working, this will return something like so:

OK CPU Load ok.|'5'=0%;80;90; '10'=0%;80;90; '15'=0%;80;90;
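If you want to use that output in your own scripts, the part after the “|” is performance data in label=value;warn;crit form. Below is a minimal Python sketch of pulling such a line apart. parse_plugin_output is my own helper name (it is not part of Nagios or NRPE), and it assumes the labels contain no spaces.

```python
def parse_plugin_output(line):
    """Split one line of plugin output into its status text and perfdata."""
    text, _, perf = line.partition("|")
    metrics = {}
    for item in perf.split():
        # Each item looks like 'label'=value;warn;crit;
        label, _, rest = item.partition("=")
        fields = rest.split(";")
        metrics[label.strip("'")] = {
            "value": fields[0],
            "warn": fields[1] if len(fields) > 1 else "",
            "crit": fields[2] if len(fields) > 2 else "",
        }
    return text.strip(), metrics


text, metrics = parse_plugin_output(
    "OK CPU Load ok.|'5'=0%;80;90; '10'=0%;80;90; '15'=0%;80;90;"
)
print(text)
print(metrics["5"])
```

The keys 5, 10 and 15 are the averaging periods passed to checkCPU in NSC.ini, and 80 and 90 are the warn and crit values from the same line.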

You can now add this NRPE check to your monitoring of the server. Open “/usr/local/nagios/etc/objects/theserver.cfg” (where theserver is the name of the server) and add a new service to monitor the newly created NRPE command (in our example nrpe_cpu). You need to format it like so:

define service{
 use                     generic-service
 host_name               External Server on Site 1
 service_description     CPU Performance via NRPE
 check_command           check_nrpe!nrpe_cpu
 }

NOTE: The check_command is a pitfall for many. You need to specify it as “check_nrpe![command_specified_on_remote_server]”. Only an exclamation mark separates the two, nothing else. Once you have reloaded Nagios, the NRPE CPU check will be added to the list of items monitored on your server.
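To see why only the exclamation mark matters, it helps to picture what Nagios does with the check_command: everything before the first “!” is the command name, and each piece after it becomes $ARG1$, $ARG2$ and so on in the matching command_line. The Python below is only a rough sketch of that substitution, not Nagios code. expand_check_command is a made-up name, 192.0.2.10 is a placeholder address, and I am assuming $USER1$ points at /usr/local/nagios/libexec as in this install.

```python
def expand_check_command(check_command, command_line, host_address, user1):
    """Expand a service's check_command into the command line that gets run."""
    # "check_nrpe!nrpe_cpu" -> name "check_nrpe", args ["nrpe_cpu"]
    name, *args = check_command.split("!")
    expanded = command_line.replace("$USER1$", user1)
    expanded = expanded.replace("$HOSTADDRESS$", host_address)
    for index, value in enumerate(args, start=1):
        expanded = expanded.replace(f"$ARG{index}$", value)
    return name, expanded


name, command = expand_check_command(
    "check_nrpe!nrpe_cpu",
    "$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$",
    host_address="192.0.2.10",          # placeholder address
    user1="/usr/local/nagios/libexec",  # assumed value of $USER1$
)
print(name)
print(command)
```

The expanded command has the same shape as the check_nrpe command we ran by hand earlier, with nrpe_cpu ending up after -c.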

You can take this one step further, and monitor lots of other cool things. In this blog, I will talk about how I have setup Nagios to monitor Symantec and Veritas BackupExec scheduled backups.

Monitoring Symantec and Veritas BackupExec (V10 and V12) Backups with Nagios and NRPE

First of all, you need to navigate to the page http://exchange.nagios.org/directory/Plugins/Backup-and-Recovery/BackupExec/Symantec-BackupExec-job-check/details on the server where BackupExec is installed. You need to download the check_be.exe program and save it on C:/ root drive, i.e. the full path of check_be will be C:/check_be.exe (this is for simplicity). Secondly, you now need to go back into NSC.ini and where above we uncommented the default commands, add the command:

check_be=check_be.exe "C:\Program Files\Veritas\Backup Exec\Data" "Job Name" -w1 -c3

In this example we are using Veritas BackupExec, with a scheduled backup with a job name of “Job Name”. It is IMPERATIVE that this is correct, along with the C:/Progr… location. The location you need to put in here is the “..\Data\” folder within Symantec/Veritas\Backup Exec where all the .xml files are stored. Save the file, and restart the NSClient++ service via services.msc.

Now, on the “/usr/local/nagios/etc/objects/theserver.cfg” file (where theserver is your monitored server), we need to specify how we are going to monitor. We do this by the following entry:

define service{
host_name Server 1 on External Site
use generic-service
service_description BackupExec Job Check
check_command check_nrpe!check_be
normal_check_interval 60
retry_check_interval 30
}

This service will go off and check your backup status, and will return a warning for a failed backup or a green light for a successful backup. Save your .cfg and reload Nagios and go to your web interface to see your hard work in action. Next up I will go through advanced monitoring of Exchange 2003/2007 and mail servers in general.

Sam


Nagios Remote WAN Monitoring: Multiple Servers, 1 IP

July 9th, 2009 | 1 Comment | Posted in General Site, Nagios

OK, now you have a server being monitored over the WAN, you want to monitor the rest of the devices in that organisation, right? The apparent problem is that you can only monitor whichever server port 12489 is forwarded to. Wrong.

12489 is the port used by the default “check_nt!” command which is found in “commands.cfg” (towards the bottom).

# ‘check_nt’ command definition
define command{
command_name check_nt
command_line $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -s securepassword -v $ARG1$ $ARG2$
}

You will see the -p 12489 here, which is the port your Nagios server will try to get data from on the external IP. If ExternalIP:12489 is forwarded to Server1, you can bet that Server1 will be sending the data back, not Server2.

The way around this is, in the Server2 config, to stop using “check_nt!check_cpu” and instead use a separate command that specifies a different port, which will be port forwarded to Server2, for example port 10001.

In commands.cfg, you will need to create a “check_nt2!” if you like, and change the port like so:

define command{
command_name check_nt2
command_line $USER1$/check_nt -H $HOSTADDRESS$ -p 10001 -s securepassword -v $ARG1$ $ARG2$
}

Now, on the firewall at the client site, forward port 10001 to the IP of Server2. Finally, in the Windows-server.cfg file (or Server2.cfg etc), instead of defining commands using check_nt!, use the newly created check_nt2! command, like so:

define service{
use generic-service
host_name Server2
service_description Avast Update Service
check_command check_nt2!PROCSTATE!-d SHOWALL -l aswUpdSv.exe
}

This is all you need to do on the Nagios server. Now, on SERVER2, you must install NSClient++ and configure it as described in my earlier post. However, in NSC.ini, you need to change the port that NSClientListener.dll listens on to the new port, 10001, like so:

[NSClient]
;# ALLOWED HOST ADDRESSES
; This is a comma-delimited list of IP address of hosts that are allowed to talk to NSClient deamon.
; If you leave this blank the global version will be used instead.
;allowed_hosts=
;
;# NSCLIENT PORT NUMBER
; This is the port the NSClientListener.dll will listen to.
port=10001

Restart the NSClient++ service and bingo, you will now be able to monitor multiple servers behind a single external IP using port forwarding.
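Once you have more than a couple of servers behind one IP, typing out a command definition per port gets tedious. As a rough sketch (nothing Nagios ships with), the Python below prints one check_nt style definition per forwarded port. check_nt_command is my own helper name, and check_nt3 on port 10002 is a hypothetical extra server added purely for illustration.

```python
def check_nt_command(name, port, password="securepassword"):
    """Return a command definition that runs check_nt against one port."""
    lines = [
        "define command{",
        f"command_name {name}",
        f"command_line $USER1$/check_nt -H $HOSTADDRESS$ -p {port} "
        f"-s {password} -v $ARG1$ $ARG2$",
        "}",
    ]
    return "\n".join(lines)


# command name -> externally forwarded port (check_nt3/10002 is hypothetical)
forwards = {"check_nt2": 10001, "check_nt3": 10002}

for name, port in forwards.items():
    print(check_nt_command(name, port))
    print()
```

Paste the output into commands.cfg, then forward each port to the matching server and set the same port in that server's NSC.ini, exactly as above.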

In my next blog, I will talk about the different services and things we can monitor, ranging from Exchange 03/07 through to BackupExec and other programs.

Enjoy!

sam


Nagios Remote Monitoring Over the WAN to Multiple Windows Servers

July 1st, 2009 | 2 Comments | Posted in General Site, Nagios

Hi All,
Firstly, my greatest apologies for the long time away. I have been completing my thesis on virtualization. In my role, I’ve spent the last 3 weeks deploying a Nagios solution to multiple customer locations to monitor Windows servers. Here, I will post how I did it, the issues that arose and how I dealt with them, so you won’t have to.

First of all, you will need to forward ports not only to the remote server, but also to the local server.
Your local Nagios box needs to have TCP ports 5666-5667, 1248-1249 and 12489 forwarded to its IP.
The remote server will need to have the above ports forwarded to it as well. On Netgears you need to create the service and forward it through the firewall; on Drayteks it is hidden under “NAT -> Port Redirection”, etc.
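Before going any further it is worth confirming that a forward actually works, as a typo on the router will cost you hours later. Here is a small Python sketch that simply tries to open a TCP connection. port_is_open is my own helper name, and a True result only tells you that something accepted the connection, not that it is NSClient++ or NRPE answering.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

From the Nagios box you would call, for example, port_is_open("host.ip.add.ress", 12489), swapping in the external IP of the remote site and each port you forwarded.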

Once your ports are forwarded correctly, we need to look at how the Nagios solution works. Basically, you need to install Linux (your choice) onto a system that will act as your monitoring box. It does not have to be powerful, although running it on an Atom processor is not advisable!

I have my Nagios solution running on a 5 year old desktop PC with 1GB of RAM and an 80GB IDE hard disk, to give you an understanding of how “low spec” I’m talking.

Once you have installed your Linux of choice (I used Ubuntu 9.04 for this operation, but normally I’m firmly a CentOS/RHEL man), you will need to install Nagios. This is very simple: all you need to do is follow the instructions on the Nagios website and ensure that all your dependencies are met during make, make install etc.

Once you have Nagios configured, and you can access it via http://itsipaddress/nagios using nagiosadmin as the user and [whatever] as the password you set a second ago, you should be greeted with a Nagios web page. This is now your working Nagios solution.

The best way to import hosts to monitor into Nagios is via the config files, so you will need to do large amounts of this through SSH/terminal to make sure it’s done correctly (no surprise there).
Nagios (on my Ubuntu 9.04 server) is stored by default at /usr/local/nagios. Within here there are many folders, but the one we are interested in is the “etc” folder. In here, the config files or “.cfg” files are stored. The big config file we need to take great care of is nagios.cfg, as this is the master that controls everything. If you open it up with vi (“vi nagios.cfg”), you will see lots of options, most of which are commented out, e.g.

#cfg_file=/usr/local/nagios/etc/objects/windows.cfg

To enable the monitoring of Windows servers, you will need to uncomment this option (press “i” to begin editing; to save the file, press ESC, then type a colon followed by wq! and press enter). To reload Nagios (on Ubuntu in our case), use “service nagios restart”. This is the same in CentOS.

Within the folder “objects” there is a file called “windows.cfg”; this is what the above uncommenting refers to. By adding hosts to this file, you are adding hosts which are going to be monitored. If you open it up with vi again, you will see an example host template.

A bit of advice here: if you are going to be monitoring a lot of hosts (servers to you and me), you don’t want to add them all into the windows.cfg file as it will get VERY messy very quickly. It is best to keep the windows.cfg file to a bare minimum and use a separate cfg file per server, which I will explain later. In windows.cfg, you should remove everything except:

# Define a hostgroup for Windows machines
# All hosts that use the windows-server template will automatically be a member of this group

define hostgroup{
hostgroup_name  windows-servers ; The name of the hostgroup
alias           Windows Servers ; Long name of the group
}

This just sets the hostgroup that will be used later when adding other Windows servers. You will see in nagios.cfg (/usr/local/nagios/etc/nagios.cfg) that the line cfg_file=/usr/local/nagios/etc/objects/windows.cfg points directly at the file we just edited. By adding a .cfg file per server and merely adding another look-up line like the one above, you can add as many servers as you want without cluttering your organisational structure. For example, I have the directory “Servers” within objects, in which I store all my server configs; I then link them back into nagios.cfg so the files are read and the hosts processed, like so:

# You can specify individual object config files as shown below:

cfg_file=/usr/local/nagios/etc/objects/commands.cfg

cfg_file=/usr/local/nagios/etc/objects/contacts.cfg

cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg

cfg_file=/usr/local/nagios/etc/objects/templates.cfg

# Definitions for monitoring the local (Linux) host

#cfg_file=/usr/local/nagios/etc/objects/localhost.cfg

# Definitions for monitoring a Windows machine

cfg_file=/usr/local/nagios/etc/objects/windows.cfg

cfg_file=/usr/local/nagios/etc/objects/Servers/Accountants.cfg

cfg_file=/usr/local/nagios/etc/objects/Servers/SystemX.cfg

#cfg_file=/usr/local/nagios/etc/objects/Servers/SamPC.cfg


The files “SystemX.cfg”, “SamPC.cfg” and “Accountants.cfg” each describe a server I set up Nagios to monitor: I created a separate cfg file for each one and tied it into nagios.cfg. It’s that simple.
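If the Servers directory grows, you can let a script write the look-up lines for you. This is just a sketch under the directory layout used in this post, and cfg_file_lines is a name I made up. It lists every .cfg file in a directory and prints the matching cfg_file= line. (Nagios also has a cfg_dir directive that reads every .cfg file in a directory, which may suit you better if you never want to leave a file disabled.)

```python
from pathlib import Path


def cfg_file_lines(servers_dir):
    """Return one cfg_file= line for each .cfg file found in servers_dir."""
    paths = sorted(Path(servers_dir).glob("*.cfg"))
    return [f"cfg_file={path}" for path in paths]


if __name__ == "__main__":
    # Directory taken from the layout described in this post.
    for line in cfg_file_lines("/usr/local/nagios/etc/objects/Servers"):
        print(line)
```

Copy the lines you want into nagios.cfg and put a # in front of any server you want to leave out for now.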

Now onto the actual config files themselves.

As mentioned before, each host you want to monitor should have its own config file, to make it manageable and easy to add/remove extra hosts without risk of damaging others. A sample host configuration file is as follows:

#
# Client Y’s Server – Configuration file
#

define host{
use                 windows-server
host_name           ClientServer
alias               Client Y’s Server
address             194.168.4.100
}

define service{
use            generic-service
host_name        ClientServer
service_description    Uptime
check_command        check_nt!UPTIME
}

define service{
use            generic-service
host_name        ClientServer
service_description    CPU Load
check_command        check_nt!CPULOAD!-l 5,80,90
}

define service{
use            generic-service
host_name        ClientServer
service_description    Memory Usage
check_command        check_nt!MEMUSE!-w 80 -c 90
}

define service{
use            generic-service
host_name        ClientServer
service_description    C:\ Drive Space
check_command        check_nt!USEDDISKSPACE!-l c -w 80 -c 90
}

define service{
use            generic-service
host_name        ClientServer
service_description    Explorer.exe
check_command        check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
}

The first stanza refers to the actual host we are setting up to monitor. The “use” line names the windows-server host template, which is what places the host in the windows-servers hostgroup we left in windows.cfg. The host_name is the name given to the server we wish to monitor; it can contain spaces but no special characters, i.e. “Client server” is fine, “Client(Server)” is not. The alias is a descriptive string used to identify the server; it is not used for lookups elsewhere in the config. The important field is “address”, which is the external IP address of the client’s server you wish to monitor. This is the internet IP of the site where you forwarded port 12489 through to the server; in our example, www.whatismyip.com shows it as 194.168.4.100. It is imperative this is correct.

Once the host has been defined, you can create as many “services” as you like. In our example, I have set up monitoring of the uptime of the server and of its CPU and memory usage (notice the -w and -c flags, which set the warning and critical thresholds, shown as yellow and red in the web interface). The version of “NSClient++” (the executable which runs on the Windows server) can be monitored in the same way with check_nt!CLIENTVERSION.
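To make the -w and -c idea concrete: a plugin compares the value it measured against the two thresholds and reports a state through its exit code, where 0 is OK, 1 is WARNING, 2 is CRITICAL and 3 is UNKNOWN. The Python below is only an illustration of that logic. classify is my own function, and I am assuming “at or above the threshold” triggers the state, which may not match check_nt exactly.

```python
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def classify(value, warn, crit):
    """Return the plugin exit code for a value checked against two thresholds."""
    if value >= crit:
        return 2
    if value >= warn:
        return 1
    return 0


# Memory usage percentages checked against -w 80 -c 90
for usage in (42, 85, 95):
    code = classify(usage, warn=80, crit=90)
    print(f"{usage}% -> {STATES[code]}")
```

With the MEMUSE thresholds from the sample config, 42% comes out OK, 85% WARNING and 95% CRITICAL.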

I am also monitoring disk space usage on C:/, along with the state of the explorer.exe program – this can be changed to anything you like, such as:

define service{
use            generic-service
host_name        ClientServer
service_description    SQL Server
check_command        check_nt!PROCSTATE!-d SHOWALL -l Sqlservr.exe
}

Which will monitor the status of the SQL Server process, etc. Once the configuration files are done and the ports are forwarded, you can now start the Nagios service on your server, by typing “service nagios restart” or “service nagios start” in the command line. You can access the web interface by “http://ipaddressofserver/nagios” and using the username “nagiosadmin” and the password you set earlier. Now all you need to do is configure the NSC.ini file and the NSclient++ on the server(s) you wish to monitor.

For ease, it is advised that you find out beforehand the IP address of the site on which the Nagios server is located. You can do this again by going onto www.whatismyip.com .

Now, go onto the server you wish to monitor and open up a web browser such as Firefox and go to the URL:

http://sourceforge.net/project/downloading.php?group_id=131326&filename=NSClient%2B%2B-0.3.6-Win32.msi

Install the program using the defaults as they will all change during the configuration anyway. Once the program is installed, go to “Start -> Run ->” and type services.msc and press enter. Find the service NSClient++ and double click on it. On the second tab along “Log On”, check the box “Allow service to interact with Desktop” and click apply. Now go to C:/Program Files/NSClient++ and edit the file NSC.ini. This is where we need to make the final configurations in order to ensure the host can talk properly with the Nagios box.

Once you are in editing mode, remove the semi-colons / uncomment all the .dll files except “CheckWMI.dll” and “RemoteConfiguration.dll”. Next, scroll down and edit the “allowed_hosts=” field to point to the IP address of the site the Nagios server is on (we found it out earlier). Once this has been done, click save. Now, go to services.msc again and make sure that “NSclient++” is started.

If it is, you should now go to “http://serverip/nagios”, log in, and be able to see the host you have just added. If you click on the traffic lights next to the host, you should be able to see the service health status, green through amber all the way to red. This allows you to monitor the health of remote servers quickly and easily. In the next post, I will talk about how we can get it to work with multiple servers behind the same external IP by forwarding non-default ports.

Sam
