Expanding Integrity Virtual Machine Disk Volumes (HP-UX)

When an Integrity Virtual Machine (IVM) is set up, the disks that the VM will see internally can be backed by any of several things: a DVD device, an ISO file, a regular file, a physical disk, or a logical volume. Using a logical volume can be the most flexible option.

However, when expanding a disk volume for an Integrity Virtual Machine, the obvious solution is the wrong one. The logical volume on the VM host can be expanded, but it is all in vain: there is no way to adjust the size of the disk as seen from inside the VM. Once pvcreate has been run in the guest and the device is in place, the virtual disk cannot be expanded.

So even though the logical volume backing the VM disk has been expanded, there is no way to make the VM utilize the “new” space (which it can’t see).

It may be possible to do a pvremove on the disk, then remove the disk from the VM (using hpvmmodify) and add it back as a new, larger disk. It might also be wise to zero out the start of the disk (carefully!) so no stale LVM structures remain on the VM disk. In this case, that would mean using a command like this:

dd if=/dev/zero of=/dev/vgvdisk/lvol1 bs=1024 count=50

This lays zeros over the first 50 KB of the disk (bs=1024 count=50), which is where the LVM structures live. Be very careful about which disk you do this to! Once this command completes, there will be no usable data on that disk – so if you choose the wrong one you could wreck your system completely.

The recommended way of increasing storage in an IVM is to create a new disk for the virtual machine, then add it to the VM (using hpvmmodify with a -a option). Once this new disk is presented to the guest HP-UX environment, add the new disk to the old volume group using vgextend, extend the logical volume with lvextend and extend the filesystem with fsadm.
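Sketched out, the procedure might look like this. The VM name (myvm), volume group names, device files, and sizes are all hypothetical, and the guest steps assume OnlineJFS for the online fsadm resize:

```shell
# On the VM host: create a new backing logical volume and attach it
lvcreate -L 10240 -n lvol2 /dev/vgvm            # 10 GB backing volume
hpvmmodify -P myvm -a disk:scsi::lv:/dev/vgvm/rlvol2

# Inside the guest: find the new disk, then grow VG, LV, and filesystem
ioscan -fnC disk                                # locate the new device file
pvcreate /dev/rdsk/c0t1d0                       # device name will vary
vgextend vg00 /dev/dsk/c0t1d0
lvextend -L 20480 /dev/vg00/lvol5               # new total size in MB
fsadm -F vxfs -b 20971520 /data                 # new size in 1 KB blocks
```

Because all of the guest-side steps operate online, the filesystem stays mounted throughout.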

No downtime and no problems!

Powered by ScribeFire.

About NagiosGrapher (v1.7.1)

NagiosGrapher is a tool that fits in with Nagios and adds graphing capabilities. Anything that is reported by a Nagios check can be graphed. The software is available from NagiosForge.

NagiosGrapher uses RRDTool to maintain its data, so disk space is not typically an issue: RRD files are fixed in size, and old data that falls off the end of a graph is consolidated or discarded. For example, details reported every 15 minutes don't have to be kept at full resolution beyond about 48 hours (two days).

Graphs are available for daily, weekly, monthly, and yearly reports.

If you install NagiosGrapher under a Red Hat system, you may have to tell it to use the layout "red_hat" (the automatic detection may report "redhat" instead of "red_hat"). Once installed, a number of files and directories are created.

/etc/init.d/nagios_grapher

This is the startup script for NagiosGrapher. Make sure that the location of the daemon is set correctly; I had to change it to read this way:

DAEMON=/usr/lib/nagios/plugins/contrib/collect2.pl

The script as delivered does not support Red Hat’s chkconfig utility; add these lines to make it support chkconfig (just under the top line):

#
# chkconfig: 345 99 01
# description: NagiosGrapher

Then these commands need to be run to activate NagiosGrapher:

# chkconfig --add nagios_grapher
# chkconfig nagios_grapher on

Next time the system starts, NagiosGrapher should start as well.

/etc/nagios/ngraph.ncfg

This is the general configuration file for NagiosGrapher itself. Perhaps the main thing to edit in this file is the log_level option (a bit-mapped value), which may be set quite high initially. I have it set to 63; it could probably be reduced further.

/var/log/nagios/ngraph.log

This is where the NagiosGrapher log is stored. The log details the operations of NagiosGrapher and the amount of detail is directly related to the log_level value in /etc/nagios/ngraph.ncfg.

/var/log/nagios/service-perfdata*

These files are the performance data stored by Nagios. These are actually Nagios files (not NagiosGrapher files) but are read by NagiosGrapher to create its information. If these start building up, then NagiosGrapher is not processing them; it may be worthwhile to restart the service using the initialization script.

/etc/nagios/ngraph.d

This directory contains the graph configuration. This configuration data is used to create the actual graphs. Many of these items will directly correlate with RRDTool commands and their options.

The defaults are set in ngraph.ncfg, and the specific graphs can be found under the templates directory. To disable a graph configuration, just add _disabled to the end of the file name. Presumably this works because the file no longer ends in .ncfg (the standard NagiosGrapher configuration file extension). Likewise, to re-enable a disabled configuration, remove the _disabled from the end (and make sure it ends in .ncfg).
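The rename mechanics can be tried out on a scratch directory first; check_load.ncfg here is a hypothetical template name standing in for a real one:

```shell
#!/bin/sh
# Demonstrate the _disabled rename trick on a scratch directory;
# check_load.ncfg stands in for a real template file name.
dir=$(mktemp -d)
touch "$dir/check_load.ncfg"

# Disable: the name no longer ends in .ncfg, so NagiosGrapher skips it
mv "$dir/check_load.ncfg" "$dir/check_load.ncfg_disabled"

# Re-enable: restore the .ncfg ending
mv "$dir/check_load.ncfg_disabled" "$dir/check_load.ncfg"

ls "$dir"
```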

/etc/nagios/serviceext

This is a directory where NagiosGrapher stores the service extended info configurations it generates for each host, derived from the performance data it processes. These files are created automatically; for this configuration to be loaded into Nagios, Nagios must be reloaded.

/usr/lib/nagios/plugins/contrib/

NagiosGrapher adds several files to this directory: udpecho, fifo_write.pl, fifo_write, collect2.pl, and the directory perl (which contains library files for NagiosGrapher). The most notable of these is collect2.pl: this is the actual program that runs and which scans the performance data for appropriate details.

/usr/lib/nagios/cgi/

NagiosGrapher adds several files to this directory as well: graphs.cgi, rrd2-graph.cgi, and rrd2-system.cgi.

I found that rrd2-system.cgi would not work: it would not present any PNG graphs. Adding this “fix” took care of the problem:

--- rrd2-system.cgi     2009-02-04 15:34:39.000000000 -0600
+++ /usr/lib/nagios/cgi/rrd2-system.cgi 2009-02-04 18:47:23.000000000 -0600
@@ -97,6 +97,7 @@
        $image_bin =~ s/^(\[.*?\])//;
 }

+if (1 == 0) {
 if ($image_format eq 'PNG' && $code == 0 && !$only_graph && !$no_legend) {
        $ng->time_need(type => 'start');
        # Adding brand
@@ -128,6 +129,7 @@
                $image_bin = $blobs[0];
        $ng->time_need(type => 'stop', msg => 'Adding PerlMagick');
 }
+}

 # no buffered operations
 STDOUT->autoflush(1);

It is rather a brute-force fix, but I didn't want to wrestle with it further. The change simply skips the image post-processing (the brand overlay done via PerlMagick) that isn't needed for NagiosGrapher to work. It would have been nice to actually fix the underlying problem, though.

/var/lib/nagios/rrd/

This is where the RRDTool files are kept. All RRD files are kept in a directory by hostname, and each has a hexadecimal name with the extension .rrd. The seemingly random names are correlated to services by the file index.ngraph, also in this directory.


Configuring Nagios with m4

When using m4 to configure Nagios, great advantages can be realized. One of the easiest places to gain an advantage from m4 is in defining a new host.

Typically, a new host has not only a host definition but a number of fairly standardized services – such as ping, FTP, telnet, SSH, and so forth. Thus, when defining a new host configuration, you not only have to add the new host, but all of the relevant services as well, and possibly host extra info and service extra info entries too.

#----------------------------------------
# HOST: marco
#----------------------------------------
define host{
        use                     hpux-host               ; Name of host template
        host_name               marco
        address                 192.168.4.1
        }
define hostextinfo{
        host_name               marco
        action_url              http://marco-mp/
}
define service{
        use                             passive-service          ; Name of service
        host_name                       marco
        service_description             System Load
        servicegroups                   Load
        }
define service{
        use                             hpux-service          ; Name of service
        host_name                       marco
        service_description             PING
        check_command                   check_ping!100.0,20%!500.0,60%
        }
define service{
        use                             hpux-service          ; Name of service
        host_name                       marco
        service_description             TELNET
        servicegroups                   TELNET
        check_command                   check_telnet
        }
define serviceextinfo{
        host_name                       marco
        service_description             TELNET
        action_url                      telnet://marco
}
define service{
        use                             hpux-service          ; Name of service
        host_name                       marco
        service_description             FTP
        servicegroups                   FTP
        check_command                   check_ftp
        }
define service{
        use                             hpux-service          ; Name of service
        host_name                       marco
        service_description             NTP
        servicegroups                   NTP
        check_command                   check_ntp
        }
define service{
        use                             hpux-service          ; Name of service
        host_name                       marco
        service_description             SSH
        servicegroups                   SSH
        check_command                   check_ssh
        }

Compare that output with the m4 code that generated it:

DEFHPUX(`marco',`192.168.4.1')

Another benefit is that if DEFHPUX is coded correctly (with each service in an independent m4 macro, such as DOSSH for SSH), then a single change to the m4 file, propagated to the Nagios config file, can alter a service for every HP-UX host (in this example).

Here is a possible definition of DEFHPUX:

define(`DEFHPUX',`
#----------------------------------------
# HOST: $1
#----------------------------------------
define host{
        use                     hpux-host               ; Name of host template
        host_name               $1
        address                 $2
        }
define hostextinfo{
        host_name               $1
        action_url              http://$1-mp/
}
DOLOAD(`$1')
DOPING(`$1')
DOTELNET(`$1')
DOFTP(`$1')
DONTP(`$1')
DOSSH(`$1')')

There is a lot more that m4 can do; this is just the tip of the iceberg.


Nagios and m4

The macro processor m4 is perhaps one of the most underappreciated programs in a typical UNIX environment. Sendmail may be the only reason it still exists, unless GNU autotools deserves that distinction too.

Configuring Nagios can be dramatically simplified by using m4 with Nagios templates. Even with Nagios service and host templates, there is a lot of repetition in the typical services file, perhaps an extreme amount of duplicated entries.

Using m4 can reduce the amount of work that is required to enter items.

Here is an example:

DEFHPUX(`red',`10.1.1.1')
DEFHPUX(`green',`10.1.1.2')
DEFHPUX(`blue',`10.1.1.3')
DEFHPUX(`white',`10.1.1.4')
DEFHPUX(`black',`10.1.1.5')
DEFHPUX(`orange',`10.1.1.6')

In my configuration, each line above expands into 64 lines (including three lines of header in the comments). So the result of those six lines is 384 lines of output.

Every DEFHPUX creates a host, complete with standard service checks such as PING, SSH, and TELNET. All of this is done with just a few macro definitions at the beginning of the file.

Read about m4 and understand it, and your Nagios configurations will become much easier to manage. You can use the program make to automate generating the actual config files and running the check and reload necessary for Nagios to incorporate the changes.
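A minimal Makefile sketch of that automation; the file names, config path, and reload mechanism are all assumptions about your particular layout:

```make
# Hypothetical layout: hosts.m4 holds the macro definitions and DEFHPUX calls
services.cfg: hosts.m4
	m4 hosts.m4 > services.cfg
	nagios -v /etc/nagios/nagios.cfg    # verify before reloading
	/etc/init.d/nagios reload
```

With this in place, editing hosts.m4 and running make is all it takes to regenerate, check, and activate the configuration.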

Nagios and passive checks

Using Nagios to monitor systems gives you a freedom that you will marvel at if you’ve not had it before. No longer must you search out problems with your systems; they are flagged and noted for you immediately – and you can be notified any way you can think of.

However, sometimes there is no way to reach out to the remote host and check details. One could perhaps log in using SSH and get check information that way, but that might not be the best way.

If you are checking for one-time notifications (such as security notifications), an active check of the remote host is either impossible or too time- and resource-intensive for the results you get. What you need are passive checks.

Passive checks originate at the host and are sent back to the Nagios server. In my opinion, the best way to do this is to utilize the send_nsca script provided by the Nagios team.

The format of service check data sent to send_nsca is the following:

host \t service description \t status \t check output \t performance data

The service description will be used (with the host name) to put the information in the right slot. Thus a misspelling will cause the service check information to “disappear.”

Each field is separated by a tab character. The performance data field is optional; if present, it can be used by other programs through Nagios.

To send check data to the Nagios server, use the send_nsca program. There is a Perl version that should work almost anywhere Perl does – though I've yet to test this in an OpenVMS environment!
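A sketch of building one check result in shell; the host, service, and output values are examples, and the commented send_nsca invocation assumes a standard NSCA client configuration:

```shell
#!/bin/sh
# Build one passive service check result in the send_nsca input format:
# host <TAB> service description <TAB> status <TAB> check output
HOST="marco"
SERVICE="System Load"
STATUS=0                     # 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN
OUTPUT="OK - load average: 0.21"

line=$(printf '%s\t%s\t%s\t%s' "$HOST" "$SERVICE" "$STATUS" "$OUTPUT")
echo "$line"

# Delivery would then be (assuming a configured NSCA client):
#   printf '%s\n' "$line" | send_nsca -H nagios-server -c /etc/nagios/send_nsca.cfg
```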

There are certain Nagios configuration options that must be enabled before passive checks work. In nagios.cfg:

accept_passive_service_checks=1
check_service_freshness=1

Make sure that your passive services have the following entries:

        active_checks_enabled           0                       ; Active service checks are disabled
        passive_checks_enabled          1
        parallelize_check               1
        is_volatile                     1
        obsess_over_service             0
        check_freshness                 1                       ; Check service 'freshness'
        freshness_threshold             4800                    ; How fresh must the check be?
        check_command                   gone_stale              ; Report staleness

With freshness checking enabled, when a passive result goes stale, Nagios runs the service's check_command, which can flag the staleness rather than just failing outright. Freshness is simply how recent the last check result must be in order to still be considered valid.
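The gone_stale command referenced above is not a stock command; one simple way to define it is with the standard check_dummy plugin (the plugin path here is an assumption about your install):

```
define command{
        command_name    gone_stale
        command_line    /usr/lib/nagios/plugins/check_dummy 2 "Passive check result has gone stale"
        }
```

When the freshness threshold is exceeded, this forces the service into a CRITICAL state with an explanatory message, rather than leaving the old passive result standing.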

Then you have to make sure that the NSCA service daemon is running; on the Red Hat installation I'm using, that means:

/etc/init.d/nsca start

This should be enough to make your passive checks start working. Don’t forget to restart Nagios to make sure your configuration changes have taken effect.

One more thing: don't confuse NSCA (Nagios Service Check Acceptor) with NCSA (National Center for Supercomputing Applications). NCSA is where the web browser Mosaic originated, and where Apache derived from (starting with NCSA httpd), but as far as Nagios is concerned, NCSA is wrong. If you remember the National Center for Supercomputing Applications, then the Nagios addon and its programs become NSCA.