Nagios plugins and Splunk.. wait, what?

[Screenshot: sparkline server stats]

Recently I've been on a bit of a tear with my infrastructure, moving from Apache to Nginx and migrating to new hardware (I moved from my beloved 25 kg Fractal Define XL to a new mATX box that is 25% of the size.. I call it 'wife-friendly infrastructure'!).

In my infrastructure of many ridiculous things, I use Opsview to monitor server temperatures (CPU/HDD/RAM), free space on my logical volumes, SMART status, RAID status and a few other things (systemd service status, etc). I then use Splunk Light to parse and display information gathered from logs for my web applications: ownCloud, Opsview, etc., and also the logs forwarded from my router which handles port forwarding into the LAN (so I can see all the naughty port scanners.. tsk tsk).

One thing I was always curious about was how I could get Splunk to analyse and interpret data generated by the Nagios® or Monitoring Plugins run by software such as Opsview, Nagios, Icinga 2, or pretty much any monitoring tool out there.

This blog will show you, at a basic level, how to take the plugin secret sauce and smother your Splunk in it, so you can analyse everything! (Oh, and how to create some funky graphics, because I like colours.. mmm, pretty.)

This guide only covers returning Nagios® / Monitoring Plugin data from the local Splunk system. If you want to gather remote plugin data, download the NRPE agent / NSClient and configure check_nrpe.

1. Configuration

Firstly, let's look at the command line. Scary, I know.

Splunk scripts can be stored in one of four locations (as per the UI):

[Screenshot: the four script locations listed in the Splunk UI]

Now, on my server $SPLUNK_HOME is "/opt/splunk" (if you are lazy, install 'locate', run 'updatedb' and then run 'locate "/bin/scripts"' to find the path).

Splunk expects the scripts you want to run (Nagios / Monitoring plugins) to be 'in' these directories. In Opsview and Nagios variants, the plugins live in /usr/local/nagios/libexec – so you can either copy the plugins into the Splunk directory, or you can do the preferred option and symlink those badboys:
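Assuming the paths mentioned above (plugins in /usr/local/nagios/libexec, $SPLUNK_HOME as /opt/splunk), something like this does the trick:

    ln -s /usr/local/nagios/libexec/* /opt/splunk/bin/scripts/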

This gives you the benefit of only one version of the plugin on the system, so your results will be the same between monitoring tools (you never know!), and it also makes maintenance a lot simpler.

Now that the plugins are ‘in’ the Splunk directory, you can begin to work with them at the GUI level.

2. Data inputs

On your Splunk system, navigate to ‘Data Inputs’ via the hamburger menu icon in the top left:

[Screenshot: the Data Inputs menu]

Within this section you can choose to add 'Files and Directories', TCP/UDP ports, and also SCRIPTS! That looks interesting, right? Let's click on it.

[Screenshot: the Scripts data input listing]

In here you will see all of your existing scripts whose output Splunk is collecting and parsing, and also a big, ominous 'New' button, deceptively named as that's what we need to click on to add a new script input. Crazy, right? *click*

In the first step, we need to give Splunk the source. Click on the ‘Script path’ dropdown and choose ‘bin/scripts’, then select the plugin you want to configure.

In the example below, I am going to show you the configuration needed to take ‘check_lm_sensors’ data (temperature data grepped from the ‘sensors’ command output) and add it into Splunk. I’ll be using the temperature of my /dev/sda hard drive as an example.

[Screenshot: configuring the check_lm_sensors script input]

The command must be entered as it would be run on the command line, i.e. on the CLI I can run:
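For example (the exact flags depend on your plugin version and sensor names, so treat this as illustrative and check the plugin's --help; 'sdaTemp' here is whatever your drive's temperature sensor is called):

    $ /usr/local/nagios/libexec/check_lm_sensors --high sdaTemp=60
    check_lm_sensors OK - sdaTemp=30 | sdaTemp=30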

Therefore in the UI ‘Command:’ section I need to enter:
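That is, the same command but via the symlinked path from earlier (flags illustrative as above):

    /opt/splunk/bin/scripts/check_lm_sensors --high sdaTemp=60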

Next, as I want somewhat granular data I'm going to run this plugin every 60 seconds (i.e. get a temperature data plot every 60 seconds). Finally, give the 'Source name override' a unique value, i.e. 'sdaTemp'. Next!

[Screenshot: the input settings page]

Next we need to classify the data. I took a simple approach to the temperature-based data I'm gathering and simply called the source type 'nagiosplugin'. After configuring the source type, click 'Review' and then complete the wizard, after which you will be presented with the following screen:

[Screenshot: the 'input created' confirmation screen]

The best thing to do from here is click on that 'Start searching' button, which will take you to a prefiltered view that you can begin to parse and filter further.

3. Extract

Firstly, have a nosy at the log data generated:

[Screenshot: raw plugin output in the search view]

In order to create pretty pictures from this data, we need to parse it and turn the temperature value into a custom field. To do this, click on ‘Extract New Fields’ in the bottom left:

[Screenshot: the 'Extract New Fields' link]

Then click on an example row of data and click next (ignore the colours on mine, I've already done these steps):

[Screenshot: selecting a sample event]

In the next screen, click on 'Regular Expression' and click 'Next'. On the screen after that, drag your mouse over the temperature information, after which a popup will appear asking for a field name:

[Screenshot: highlighting the temperature value]

Select the '30' and call it 'temperature', and select the 'sdaTemp' and call it 'temperaturesource', then click 'Next', 'Next' and 'Finish'.
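For reference, Splunk generates the regular expression for you behind the scenes; for output shaped like the example above it will look something along the lines of:

    (?<temperaturesource>sdaTemp)=(?<temperature>\d+)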

4. Create reports

Now that we have our parsed data, let's open up a search and begin to carve it up. To get the 'latest' value, use the 'stats' filter as below:
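A sketch of the search, assuming the source name override and field names from earlier:

    source="sdaTemp" | stats latest(temperature) AS sdaTemp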

This will show you a single number as below:

[Screenshot: the single-value result]

We can then graph this by adding the next filter ‘| gauge sdaTemp’, giving us:

[Screenshot: the gauge visualisation]
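In full, the search so far is something like this (the gauge range values are illustrative – match them to your colour bands):

    source="sdaTemp" | stats latest(temperature) AS sdaTemp | gauge sdaTemp 0 45 60 75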

We can then pimp this up by changing the format using the 'Format' drop down, adding a label, background colour ranges (i.e. 0-45°C is green, 45-60°C is yellow, 60°C+ is red), and more. Finally, click 'Save as' in the top right, and select 'Report':

[Screenshot: saving the search as a report]

Once saved, your report will look something similar to mine below. You can view all of your reports via the 'Reports' menu option at the top:

[Screenshot: the finished report]

5. Create dashboards

Finally, to bring it all together, click on the ‘Dashboards’ tab at the top of your Splunk GUI, and once loaded click ‘Create new dashboard’ which will prompt for a name and other options.

We are going to create a simple dashboard by adding all of our reports onto a single screen (think of reports as widgets or dashlets). To do this, select 'Add Panel > New from report' as below:

[Screenshot: the 'Add Panel > New from report' menu]

Simply click on your report and click 'Add to dashboard' and voilà, it's added! Now repeat this step X number of times depending on how many reports you have, and before you know it you've got your own Nagios-plugin data-based dashboard (that's a mouthful):

[Screenshot: the finished dashboard]

You can also graph these values historically using timechart and the search query:
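Something along these lines (span to taste – mine polls every 60 seconds):

    source="sdaTemp" | timechart span=1m avg(temperature) AS sdaTemp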

Which gives us:

[Screenshot: the historical timechart]

Now, go forth and blend Nagios plugins into your log based world.

PS: An update to the blog. Since writing this I found out that you can get sparklines on the 'number' reports above by using 'timechart' as the query and selecting 'Single value' in the drop down, as below. This allows you to select 'Show sparkline: Yes' in the 'Format' section. Very cool!

[Screenshot: single value with sparkline]
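The query behind that single value is along these lines:

    source="sdaTemp" | timechart span=1m latest(temperature)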

Converting SSL-enabled Apache vhosts to Nginx

So I recently became beyond the point of fed up with Apache2; it is slow and clunky and has been doing a shitty job recently of hosting my 7-8 virtualhosts (4 of which are SSL-enabled), so I thought I'd move them over to Nginx. Simple, right? You'd think so, but…

Some of the directives in Apache don't map very nicely to Nginx, but there is a lot to love about Nginx (namely, it's a LOT faster!). This guide will show you how to migrate the trickier parts of your Apache configs to Nginx.

Pre-reading

In order to start Nginx you need to stop Apache2, as in:
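On Ubuntu, something like:

    sudo service apache2 stop
    sudo service nginx start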

Nginx won't start if Apache2 has taken control of *:80.

Migrating ProxyPass

In my setup, I run Opsview and Splunk Light behind a ProxyPass virtualhost (one runs on a VM, one runs as a web app on port 8000). The config in Apache2 looks like:
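Mine looked roughly like this (the hostname and backend address are placeholders):

    <VirtualHost *:80>
        ServerName opsview.example.com

        ProxyPreserveHost On
        ProxyPass        / http://192.168.0.10/
        ProxyPassReverse / http://192.168.0.10/
    </VirtualHost>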

The config for Nginx is simpler still:
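A minimal sketch (again, placeholder names):

    server {
        listen 80;
        server_name opsview.example.com;

        location / {
            proxy_pass http://192.168.0.10;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
        }
    }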

Simply copy the code above to your site's config in /etc/nginx/conf.d/website.conf (for example) and modify accordingly. Very simple.

Migrating SSL

This one was a lot trickier and a total faff. In my Apache2 vhosts, I had the following entries:
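That is, three file-based directives (the filenames here are the StartSSL examples from below; yours will differ):

    SSLEngine on
    SSLCertificateFile      /etc/apache2/ssl/2_name.uk.crt
    SSLCertificateKeyFile   /etc/apache2/ssl/name.key
    SSLCertificateChainFile /etc/apache2/ssl/1_root_bundle.crt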

Whereas Nginx only supports the 'ssl_certificate' and 'ssl_certificate_key' directives. Two directives, three files.. you see the problem, right?

What you have to do is simply combine the .crt files into a single 'bundle.pem' file. For those using StartSSL, you will get two files on download:

  • 2_name.uk.crt
  • 1_root_bundle.crt

Cat those files into 'bundle.pem' as below:
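Order matters: your certificate first, then the bundle:

    cat 2_name.uk.crt 1_root_bundle.crt > bundle.pem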

And then update Nginx:
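The relevant directives become (paths are examples; point them wherever you keep your certs):

    ssl_certificate     /etc/nginx/ssl/bundle.pem;
    ssl_certificate_key /etc/nginx/ssl/name.key;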

Basically your key stays the same; you simply combine your .crt files into a .pem and reference that.

Hope this helps – and don't forget to use the Qualys SSL Labs test to check your SSL strength. For reference, my 'hardened' Nginx config for Splunk/Opsview/ownCloud is below.
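A sketch of it follows – treat it as a starting point rather than gospel; the server name, backend port and paths are placeholders, and the cipher list reflects 2016-era advice:

    server {
        listen 443 ssl;
        server_name cloud.example.com;            # placeholder

        ssl_certificate     /etc/nginx/ssl/bundle.pem;
        ssl_certificate_key /etc/nginx/ssl/name.key;

        # protocol/cipher hardening
        ssl_protocols TLSv1.1 TLSv1.2;
        ssl_prefer_server_ciphers on;
        ssl_ciphers "EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH";
        ssl_dhparam /etc/nginx/ssl/dhparam.pem;
        ssl_session_cache shared:SSL:10m;
        add_header Strict-Transport-Security "max-age=31536000; includeSubDomains";

        location / {
            proxy_pass http://127.0.0.1:8000;     # e.g. Splunk Light
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }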

Note: There are some items you will need to do, such as generating a stronger Diffie-Hellman param file, etc. Google is your friend.
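For the Diffie-Hellman part, that's:

    openssl dhparam -out /etc/nginx/ssl/dhparam.pem 4096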


Fail2ban for ownCloud: Brute force prevention and alerting

In this guide, I will show you how to configure your ownCloud server so that brute force attacks are one less thing to worry about. Not only will fail2ban block someone after X failed login attempts to your ownCloud server, it will also notify you via Pushbullet that an attempt has been blocked.

So let's begin!

Step 1: Install fail2ban and ownCloud filters

Following this excellent guide here (src: https://github.com/AykutCevik/owncloud-fail2ban) you can have fail2ban and ownCloud filters configured in no time. Simply run the command:

.. and follow the instructions. Make sure you enter the logfile path correctly, i.e. /var/log/owncloud.log, or /var/www/owncloud/owncloud.log, etc.

Once configured, this will add two main files of interest:
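  • /etc/fail2ban/filter.d/owncloud.conf
  • /etc/fail2ban/jail.d/owncloud.conf

(Exact filenames may differ slightly depending on the version of the setup script.)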

The first, in 'filter.d/', is the regex that fail2ban will use to parse your specified ownCloud log for failed login attempts.

The second, in 'jail.d/', tells fail2ban to actively use that filter, which port to monitor, and the logpath (edit the logpath here if you entered it wrong during setup).

The logfile, /var/log/owncloud.log (for example) must log using a few specifics. To set these specifics, crack open your config.php (i.e. /var/www/owncloud/config/config.php) and set the following directives:
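At a minimum, something like the below (the timezone is an example – use your own):

    'logtimezone' => 'Europe/London',
    'logfile'     => '/var/log/owncloud.log',
    'loglevel'    => 2,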

Make sure that you have the logtimezone set to your local timezone, otherwise fail2ban won't work.

.. and that's fail2ban set up. So how does it work?

Fail2ban will create a new iptables chain called 'f2b-owncloud', which is visible if you run 'iptables -nL --line-numbers' as below:
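With no bans active, the chain just returns traffic to the main flow (output trimmed):

    Chain f2b-owncloud (1 references)
    num  target   prot opt source      destination
    1    RETURN   all  --  0.0.0.0/0   0.0.0.0/0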

Basically, iptables will detect if traffic has come into the server on the port specified earlier (https), and will direct all of that traffic to the new chain 'f2b-owncloud'. When fail2ban detects an IP has had 5 failed logins, it will then add an iptables REJECT rule for that specific IP at the top of the f2b-owncloud chain. This means that the IP will be rejected, but all other traffic will pass.

Obviously this means that HTTPS traffic will bypass all other iptables rules in your INPUT chain as it is being diverted to the f2b-owncloud chain, so ensure you add your rules to that chain also, i.e.:
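For example (the ipset name 'gb_ips' is a placeholder for whatever you called your set):

    iptables -A f2b-owncloud -m set --match-set gb_ips src -j ACCEPT
    iptables -A f2b-owncloud -j REJECT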

Using ipset, I only allow GB IP addresses access to HTTPS, so I've had to add this rule to the f2b-owncloud chain, along with a REJECT all. Now when fail2ban bans an IP, it'll get added to the top. The flow will then be:

  1. Is the source IP banned? If yes, then reject, if not ..
  2. Is the source IP from the GB IP range? If no, then reject, if yes, then allow.
  3. Reject everything else.

Step 2: Testing

Next we need to test that fail2ban will actually block the malicious IP. Simply attempt some invalid logins, and then run the command ‘fail2ban-client status owncloud’:
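You should see something along these lines (trimmed; the IP is illustrative):

    Status for the jail: owncloud
    |- Filter
    |  |- Currently failed: 0
    |  `- Total failed:     5
    `- Actions
       |- Currently banned: 1
       `- Banned IP list:   134.x.x.x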

Here you can see you have now been banned from accessing the ownCloud server via HTTPS. You can confirm this via 'iptables -nL --line-numbers':
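The banned IP now sits at rule 1 of the chain (output trimmed, IP illustrative):

    Chain f2b-owncloud (1 references)
    num  target   prot opt source      destination
    1    REJECT   all  --  134.x.x.x   0.0.0.0/0   reject-with icmp-port-unreachable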

You should now be unable to access your ownCloud server. To unban yourself, simply run:
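    sudo fail2ban-client set owncloud unbanip 134.x.x.x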

Where 134.. is your IP. If you're not being banned, double check the regex is valid and picking up your logs correctly using the command:
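    fail2ban-regex /var/log/owncloud.log /etc/fail2ban/filter.d/owncloud.conf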

This should show an output similar to the below:
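(Abridged – the important bit is a non-zero match count:)

    Results
    =======
    Failregex: 5 total
    ...
    Lines: 120 lines, 0 ignored, 5 matched, 115 missed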

If you have issues, double check /var/log/fail2ban.log. If you see an error like the one below, then double check the config.php file for your ownCloud server:
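(Paraphrased from memory – the key phrase is 'no valid date/time':)

    WARNING Found a match for '...' but no valid date/time found for '...'.
    Please try setting a custom date pattern.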

I saw this issue when I had a specific 'logdateformat' entry in config.php. Simply removing that entry solved my problem.

Step 3: Notifications

Now that fail2ban is actively blocking brute force attackers, we want to be notified about it. Doing this is rather simple, thanks to this excellent guide here (src: http://blog.meinside.pe.kr/How-to-get-Pushbullet-notification-on-Fail2ban-ban-actions/).

Firstly, install Go:
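    sudo apt-get install golang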

Next, set up your user's .bashrc profile to contain the necessary paths by adding the following lines to ~/.bashrc:
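A typical pair of entries (adjust GOPATH to taste), then 'source ~/.bashrc' to pick them up:

    export GOPATH=$HOME/go
    export PATH=$PATH:$GOPATH/bin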

Next, download the pushbullet-fail2ban.go code:
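(I won't reproduce the exact URL here – grab the raw link from the guide above:)

    wget -O pushbullet-fail2ban.go <raw-url-from-the-guide>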

Next, edit that file and enter your Pushbullet API key (to find your key, visit https://www.pushbullet.com/account).

Find the line MyPushbulletToken = "" and add your API key within the quotes (no spaces at the start or end).

Once done, build the notification script:
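    go build pushbullet-fail2ban.go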

This will leave you with a file called ‘pushbullet-fail2ban’. Move this file to /etc/fail2ban/ using the command:
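    sudo mv pushbullet-fail2ban /etc/fail2ban/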

Next, test that the notification works using the command:
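(I believe the binary simply takes the message text as its arguments – check the guide or the source if nothing fires:)

    /etc/fail2ban/pushbullet-fail2ban "test message"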

You should get an alert on your mobile phone similar to:

[Screenshot: the Pushbullet test notification]

Next, we need to modify fail2ban to use this new pushbullet-fail2ban binary. To do this, simply copy the action fail2ban uses to ban IPs with iptables, and edit the copy:
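Something like this (the new action name is my own choice – anything sensible works):

    cd /etc/fail2ban/action.d
    sudo cp iptables-multiport.conf iptables-multiport-pushbullet.conf
    sudo vi iptables-multiport-pushbullet.conf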

Within this file, find the line ‘actionban =’ and replace it with the following:
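Keep your file's existing iptables line (it varies between fail2ban versions – the one below is from a 0.9-era iptables-multiport.conf) and append the Pushbullet call on a second, indented line:

    actionban = iptables -I f2b-<name> 1 -s <ip> -j <blocktype>
                /etc/fail2ban/pushbullet-fail2ban "banned <ip>"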


Next, let's tell fail2ban to use this new action and not the existing 'iptables-multiport.conf'. Edit the file /etc/fail2ban/jail.conf:

.. and amend the 'banaction' line (around line 157) to look like:
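    banaction = iptables-multiport-pushbullet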

And that's it. Simply restart fail2ban, update your iptables rules (if you have modified them as per my step 1) and your fail2ban-protected ownCloud is now ready:
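    sudo service fail2ban restart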

Next time a naughty hacker tries to brute force your box, you’ll get a push notification similar to:

[Screenshot: a Pushbullet ban notification]

Notes

My iptables rules for fail2ban/owncloud look like the below:
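(A sketch – the ipset name is a placeholder, and fail2ban inserts the first rule for you when it starts:)

    # all inbound HTTPS is diverted to the f2b-owncloud chain
    iptables -I INPUT -p tcp --dport 443 -j f2b-owncloud
    # inside the chain: GB-only allow via ipset, then reject the rest
    iptables -A f2b-owncloud -m set --match-set gb_ips src -j ACCEPT
    iptables -A f2b-owncloud -j REJECT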

These rules send all HTTPS traffic to the f2b-owncloud chain, where it is then filtered. I have an ipset rule there, and then a REJECT to end the chain. Hope this helps!

Fail2ban for Opsview Monitor

This guide will show you a very quick and dirty way to use Fail2ban to prevent brute-force attacks on your Opsview Monitor 5.0 server. This should work the same for Opsview 4.x servers, but I haven't tested it.

Fail2ban, for those who aren't familiar, is "an intrusion prevention framework written in the Python programming language. It works by reading SSH, ProFTP, Apache logs etc.. and uses iptables profiles to block brute-force attempts." (src: https://help.ubuntu.com/community/Fail2ban).

Firstly, install fail2ban. In my example I am using Ubuntu 14.04, so simply:
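    sudo apt-get install fail2ban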

Next, go to the fail2ban directory and create the opsview ‘filter’:
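    cd /etc/fail2ban/filter.d
    sudo vi opsview.conf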

Within here, copy and paste the following:
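A sketch of the filter – the failregex below is a guess at the opsview-web.log login-failure message, so verify it against your own log with fail2ban-regex before relying on it:

    [Definition]
    failregex = Invalid username or password.*<HOST>
    ignoreregex =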

This is a simple regex that filters for the source IP of the 'hacker', using the standard syslog message left in opsview-web.log (check your own log for the exact message and adjust the regex if needed).

Next, let's tell fail2ban to actually use this rule. Create a new file called '/etc/fail2ban/jail.local' and add the following:
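Something like the below – the logpath is where my opsview-web.log lives; double check yours:

    [opsview]
    enabled  = true
    port     = https
    filter   = opsview
    logpath  = /var/log/opsview/opsview-web.log
    maxretry = 10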

Obviously, if you aren't using https then change this to 'http'. Next, modify /etc/fail2ban/jail.conf and change the line

to

Finally, start fail2ban using the command:
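    sudo service fail2ban start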

You can view the 'jail' by running the script here: https://gist.github.com/kamermans/1076290. Simply clone this file, chmod +x, and then run, as below:
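For example (the filename is arbitrary):

    wget -O fail2ban-status.sh https://gist.github.com/kamermans/1076290/raw
    chmod +x fail2ban-status.sh
    sudo ./fail2ban-status.sh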

This will give an output similar to:
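It essentially wraps 'fail2ban-client status' for each jail, so expect something like (trimmed):

    Status for the jail: opsview
    |- Filter
    |  `- Total failed:     0
    `- Actions
       |- Currently banned: 0
       `- Banned IP list: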

Now, go and try to brute force 10-15 times and see what happens when you run the command above again:
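(Trimmed, with an illustrative IP:)

    Status for the jail: opsview
    |- Filter
    |  `- Total failed:     10
    `- Actions
       |- Currently banned: 1
       `- Banned IP list:   192.168.0.99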

Here you can see that 10 failed attempts have been made, and an IP address has now been banned from trying to log in. To prove this, run the iptables command below:
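(Assuming fail2ban's default chain name for the jail, and an illustrative IP:)

    sudo iptables -nL f2b-opsview --line-numbers

    Chain f2b-opsview (1 references)
    num  target   prot opt source         destination
    1    REJECT   all  --  192.168.0.99   0.0.0.0/0   reject-with icmp-port-unreachable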

To unblock an IP address, simply run the command:
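    sudo fail2ban-client set opsview unbanip 192.168.0.99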

Sorted. Now, go forth and fail2ban!