Nagios and passive checks

Using Nagios to monitor systems gives you a freedom that you will marvel at if you’ve not had it before. No longer must you search out problems with your systems; they are flagged and noted for you immediately – and you can be notified any way you can think of.

However, sometimes there is no way to reach out to the remote host and check details. One could perhaps log in using SSH and get check information that way, but that might not be the best way.

If you are checking for one-time notifications – such as security notifications – an active check of the remote host is either impossible or too time- and resource-intensive for the results you get. What you need are passive checks.

Passive checks originate at the monitored host and are sent back to the Nagios server. In my opinion, the best way to do this is to use the send_nsca program provided by the Nagios team.

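The format of service check data sent to send_nsca is the following:

host \t service description \t status \t check output

Each field is separated by a tab character, and each result ends with a newline. The status is the numeric plugin return code: 0 (OK), 1 (WARNING), 2 (CRITICAL), or 3 (UNKNOWN).

The service description will be used (with the host name) to put the information in the right slot. Thus a misspelling will cause the service check information to “disappear.”

Performance data is optional. By the usual plugin convention it is appended to the check output after a “|” character rather than sent as a separate tab-delimited field, and it can be used by other programs through Nagios.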
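As a rough illustration of the tab-separated format, here is a minimal Python sketch that builds one service check line. The function name and the host and service names are made up for the example, and the optional performance data is left out to keep it short.

```python
# Numeric states as Nagios plugins report them.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def format_service_check(host, service, status, output):
    """Return one tab-separated service check line for send_nsca."""
    if status not in (OK, WARNING, CRITICAL, UNKNOWN):
        raise ValueError("status must be 0, 1, 2 or 3")
    fields = [host, service, str(status), output]
    for field in fields:
        # A tab or newline inside a field would shift every later field.
        if "\t" in field or "\n" in field:
            raise ValueError("fields must not contain tabs or newlines")
    return "\t".join(fields) + "\n"


if __name__ == "__main__":
    # Hypothetical names; they must match the Nagios configuration exactly.
    line = format_service_check("web01", "Security Updates", WARNING,
                                "3 updates pending")
    print(line, end="")
```

Rejecting embedded tabs up front is a design choice for this sketch: a stray tab would silently push the status or output into the wrong field.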

To send check data to the Nagios server, use the send_nsca program. There is a Perl version of this that should work almost anywhere Perl does – though I’ve yet to test this in an OpenVMS environment!
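Sending a result amounts to piping one line into send_nsca on standard input. The sketch below wraps that in Python. It assumes send_nsca is on the PATH and that its client configuration lives at /etc/nagios/send_nsca.cfg; both are assumptions, so adjust them for your installation. The server, host, and service names are hypothetical.

```python
import subprocess


def build_send_nsca_command(server, config="/etc/nagios/send_nsca.cfg",
                            binary="send_nsca"):
    """Return the argv list for send_nsca; the default paths are assumptions."""
    # -H names the Nagios server, -c the send_nsca configuration file.
    return [binary, "-H", server, "-c", config]


def send_check(server, host, service, status, output):
    """Pipe one tab-separated check result into send_nsca on stdin."""
    line = "\t".join([host, service, str(status), output]) + "\n"
    result = subprocess.run(
        build_send_nsca_command(server),
        input=line,
        text=True,
        capture_output=True,
    )
    # A zero exit status means send_nsca accepted and sent the data.
    return result.returncode == 0


# Example call (hypothetical names), left commented out so nothing is sent:
# send_check("nagios.example.com", "web01", "Security Updates", 0, "no updates pending")
```

Run from cron or at the end of a job, a call like the commented one reports the job's outcome to Nagios.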

There are certain Nagios configuration options that must be enabled before passive checks work. In nagios.cfg:

accept_passive_service_checks=1
check_service_freshness=1
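
Since a single missed option quietly disables passive checks, a small script can confirm that both settings are present. This is only a sketch: it treats nagios.cfg as simple key=value lines, and the function names are invented for the example.

```python
REQUIRED = {
    "accept_passive_service_checks": "1",
    "check_service_freshness": "1",
}


def parse_settings(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def missing_options(text):
    """Return the required options that are absent or not set to 1."""
    settings = parse_settings(text)
    return sorted(k for k, v in REQUIRED.items() if settings.get(k) != v)


if __name__ == "__main__":
    sample = "accept_passive_service_checks=1\ncheck_service_freshness=0\n"
    print(missing_options(sample))
```

To check a real installation, read your own nagios.cfg into a string and pass it to missing_options; an empty list means both options are enabled.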

Make sure that your passive services have the following entries:

        active_checks_enabled           0                       ; Active service checks are disabled
        passive_checks_enabled          1
        parallelize_check               1
        is_volatile                     1
        obsess_over_service             0
        check_freshness                 1                       ; Check service 'freshness'
        freshness_threshold             4800                    ; How fresh must the check be?
        check_command                   gone_stale              ; Report staleness

Even with active checks disabled (0), Nagios runs the check_command when the last result goes stale, so the command can flag the stale service instead of leaving it silently in its last state. Freshness is how recent the last check result must be, in seconds, in order to be valid.
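
The freshness idea can be modelled in a few lines of Python. This is a simplified sketch, not Nagios's actual calculation, which takes further factors into account. The function name is made up, and 4800 is the freshness_threshold from the service definition above.

```python
import time


def is_stale(last_result_time, freshness_threshold, now=None):
    """True when the newest result is older than the threshold (in seconds)."""
    if now is None:
        now = time.time()
    return (now - last_result_time) > freshness_threshold


if __name__ == "__main__":
    threshold = 4800  # seconds, as in the service definition
    print(is_stale(0, threshold, now=600))   # result is 600 seconds old
    print(is_stale(0, threshold, now=5000))  # result is 5000 seconds old
```

In this model a result 600 seconds old is still fresh, while one 5000 seconds old is stale, so the sending side needs to report comfortably more often than the threshold.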

Then you have to make sure that the NSCA daemon is running on the Nagios server; on the Red Hat installation I’m using, that requires:

/etc/init.d/nsca start

This should be enough to make your passive checks start working. Don’t forget to restart Nagios to make sure your configuration changes have taken effect.

One more thing: don’t confuse NSCA (Nagios Service Check Acceptor) with NCSA (National Center for Supercomputing Applications). NCSA is where the web browser Mosaic originated, and where Apache derived from (starting with NCSA httpd), but for Nagios, NCSA is wrong. If you find yourself typing the National Center for Supercomputing Applications’ initials, swap the middle letters: the Nagios addon and its programs are spelled nsca.

6 thoughts on “Nagios and passive checks”

Brian McManus says:

9 December 2009 at 12:48 pm

This is more of a quick question than a comment. I am implementing passive host checks for some remote Linux servers behind a firewall. It works great.

I set my check freshness to 600 seconds.

The only problem is it’s not alerting when the host check goes stale. Would you be kind enough to share your gone_stale command or explain the configuration that alerts for passive host checks that have gone stale?

Great blog. Thank you!

1. ddouthitt says:
  
  10 December 2009 at 6:32 pm
  
  Having a “stale” command was always problematical for me; I don’t know that I ever got it to work.
  
  The alternative is to have the passive check always return something – including an error when there is no data from the check. Along with that, use the active checks against the host to check for proper activities and so on.
  
  Check your freshness interval, your check run interval, and your command. Also: I believe that the command has to be set up in much the same way as an active check, but with some of the passive check options.
  
  Once you figure out the appropriate settings, use m4 (with make) so you don’t have to repeat the proper commands every time.
  
2. ddouthitt says:
  
  10 December 2009 at 6:33 pm
  
  Thanks for the compliment as well. You made my day.
  
Bas says:

16 June 2011 at 10:45 am

And if the freshness threshold is 4800 how much should the send check interval be?

Thnx,

Bas

Isaac says:

4 October 2011 at 3:06 pm

The “stale” command should just be a dummy command that returns a fixed status (here 2, CRITICAL).
I added this to the “/etc/nagios-plugins/config/dummy.cfg”:
# return-stale definition
define command {
        command_name    return-stale
        command_line    /usr/lib/nagios/plugins/check_dummy 2 "There hasn't been an update recently!"
}

I wonder if anyone knows how to remove the red flags on the “tactical overview” though… I don’t care that a service is scheduled to only be passive. That’s actually what I want.
