Turning off NET-SNMP overlogging

In the default configuration on both Red Hat and Ubuntu, you’ll find that snmpd fills your logs with a steady stream of entries, especially if you have monitoring tools that poll via SNMP every five minutes. Each poll generates messages like this:

Jan  8 13:45:02 example snmpd[2048]: Connection from UDP: [10.0.0.1]:51890
Jan  8 13:45:02 example snmpd[2048]: Received SNMP packet(s) from UDP: [10.0.0.1]:51890
Jan  8 13:45:02 example last message repeated 2 times

To get rid of these, change the priority levels that are logged by NET-SNMP. This can be done by changing the options sent to SNMP.

Look for a file /etc/default/snmpd or /etc/sysconfig/snmpd or similar. There should be a set of SNMP options – probably with an option like this one:

-Ls d

Change this option to be:

-LS5d

This will log everything at level NOTICE or higher (that is, severity level 5 down to severity 0). The severity levels used are those used by syslog; they are described in syslog(3).

This works because the messages being seen are logged at level INFO; by not logging items at that severity level the log entries no longer clutter the syslog files.
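As a rough sketch, the option change could be scripted rather than made by hand. The function name quiet_snmpd_logging is made up for illustration, and the pattern assumes the existing option is written as -Lsd or -Ls d; check the file before relying on it.

```shell
# Hypothetical helper (not part of NET-SNMP): switch the syslog option
# from "-Lsd" / "-Ls d" to "-LS5d" in the given options file.
# A copy of the original is kept next to it with a .bak suffix.
quiet_snmpd_logging() {
    opts_file="$1"
    cp "$opts_file" "$opts_file.bak" || return 1
    sed 's/-Ls *d/-LS5d/' "$opts_file.bak" > "$opts_file"
}

# Example (the path depends on the distribution):
#   quiet_snmpd_logging /etc/default/snmpd
```

The daemon only reads its options at startup, so snmpd has to be restarted before the new log level takes effect.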

However, there is another set of messages that are common to NET-SNMP logs:

Sep  7 09:47:29 burp snmpd[19242]: diskio.c: don't know how to handle 9 request
Sep  7 09:47:29 burp snmpd[19242]: diskio.c: don't know how to handle 10 request
Sep  7 09:47:29 burp snmpd[19242]: diskio.c: don't know how to handle 11 request

This is the result of a bug (Red Hat Bugzilla #474093 – login required) which causes these “errors” when the SNMP diskIOTable is traversed. Red Hat fixed this bug back in September of 2009.

According to this message by Chris Rizzo, Red Hat stated:

The message that you see here is a result of querying for statistics
that are not available on the linux system. Requests 9, 10, and 11 are
defined as:

#define DISKIO_LA1 9
#define DISKIO_LA5 10
#define DISKIO_LA15 11

You can see where these statistics pop up by querying the SNMP diskIOTable using this command:

snmptable -v 2c -c public host diskIOTable

The output will look like this:

SNMP table: UCD-DISKIO-MIB::diskIOTable

 diskIOIndex diskIODevice diskIONRead diskIONWritten diskIOReads diskIOWrites diskIOLA1 diskIOLA5 diskIOLA15 diskIONReadX diskIONWrittenX
           1         ram0           0              0           0            0         ?         ?          ?            0               0
           2         ram1           0              0           0            0         ?         ?          ?            0               0
           3         ram2           0              0           0            0         ?         ?          ?            0               0
           4         ram3           0              0           0            0         ?         ?          ?            0               0
           5         ram4           0              0           0            0         ?         ?          ?            0               0
           6         ram5           0              0           0            0         ?         ?          ?            0               0
           7         ram6           0              0           0            0         ?         ?          ?            0               0
           8         ram7           0              0           0            0         ?         ?          ?            0               0
           9         ram8           0              0           0            0         ?         ?          ?            0               0
          10         ram9           0              0           0            0         ?         ?          ?            0               0
          11        ram10           0              0           0            0         ?         ?          ?            0               0
          12        ram11           0              0           0            0         ?         ?          ?            0               0
          13        ram12           0              0           0            0         ?         ?          ?            0               0
          14        ram13           0              0           0            0         ?         ?          ?            0               0
          15        ram14           0              0           0            0         ?         ?          ?            0               0
          16        ram15           0              0           0            0         ?         ?          ?            0               0
          17        loop0           0              0           0            0         ?         ?          ?            0               0
          18        loop1           0              0           0            0         ?         ?          ?            0               0
          19        loop2           0              0           0            0         ?         ?          ?            0               0
          20        loop3           0              0           0            0         ?         ?          ?            0               0
          21        loop4           0              0           0            0         ?         ?          ?            0               0
          22        loop5           0              0           0            0         ?         ?          ?            0               0
          23        loop6           0              0           0            0         ?         ?          ?            0               0
          24        loop7           0              0           0            0         ?         ?          ?            0               0
          25          sr0           0              0           0            0         ?         ?          ?            0               0
          26          sda  2840214016     1178369536     8946299      2080062         ?         ?          ? 990682692096     18358238720
          27         sda1      598016         208896          82            8         ?         ?          ?       598016          208896
          28         sda2        2048              0           2            0         ?         ?          ?         2048               0
          29         sda3     2286592        4649984         449          332         ?         ?          ?      2286592         4649984
          30         sda5  2836463104     1173473792     8945636      2079527         ?         ?          ? 990678941184     18353342976
          31          sdb  2960873984     1173473792     1940422      2079476         ?         ?          ? 990803352064     18353342976
          32         sdb1      638976              0          83            0         ?         ?          ?       638976               0
          33         sdb2      798720              0         153            0         ?         ?          ?       798720               0
          34         sdb3        6144              0           2            0         ?         ?          ?         6144               0
          35         sdb5  2958672384     1173473792     1940080      2079284         ?         ?          ? 990801150464     18353342976
          36          md0  3051716608     1727252992       93765      1641387         ?         ?          ?   3051716608     14612154880
          37          sdc  3067452416      425118208     1640751      3638614         ?         ?          ? 101851700224    438511782400
          38         sdc1  3067317248      425118208     1640721      3638613         ?         ?          ? 101851565056    438511782400
          39          sdd      307712              0          76            0         ?         ?          ?       307712               0
          40         sdd1      147968              0          37            0         ?         ?          ?       147968               0
          41         sdd2      131072              0          32            0         ?         ?          ?       131072               0

Towards the right side of center, you can see the metrics diskIOLA1, diskIOLA5, diskIOLA15; these are unsupported on Linux (as marked by the ? in each column). These are the 1 minute average disk load (as a percentage), the 5 minute average disk load, and the 15 minute average disk load respectively.

The three have SNMP OIDs of .1.3.6.1.4.1.2021.13.15.1.1.9 and .1.3.6.1.4.1.2021.13.15.1.1.10 and .1.3.6.1.4.1.2021.13.15.1.1.11 respectively – thus, the logged complaint of not knowing how to handle request 9 (or 10 or 11).
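To make the link between the request numbers and the OIDs concrete, here is a small sketch that builds each column OID from its number. The helper name diskio_column_oid is hypothetical; the prefix is the one shared by the three OIDs above.

```shell
# Common prefix of the diskIOTable columns listed above.
DISKIO_ENTRY=.1.3.6.1.4.1.2021.13.15.1.1

# Hypothetical helper: print the OID for a given diskIOTable column number.
diskio_column_oid() {
    printf '%s.%s\n' "$DISKIO_ENTRY" "$1"
}

# The three columns that trigger the "don't know how to handle" messages.
for column in 9 10 11; do
    diskio_column_oid "$column"
done
```

Walking just one of these columns on an affected system, for example with snmpwalk -v 2c -c public host "$(diskio_column_oid 9)", should be enough to reproduce the corresponding log entry.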

Without changing the code, there doesn’t seem to be any way to eradicate this message if you are querying the diskIOTable. Red Hat fixed the bug; perhaps others will follow. Unfortunately, the bug remains on Ubuntu Lucid Lynx.

Net-SNMP SMUX Fails on Ubuntu: A Fix

When trying to set up Dell OpenManage on some servers, I found that SMUX was not working. SMUX is a protocol that allows agents to connect to an SNMP daemon and provide answers to SNMP queries in a portion of the tree.

When working, SNMP should generate this log message in /var/log/daemon.log:

May 11 17:14:18 serverx snmpd[29678]: accepted smux peer: oid SNMPv2-SMI::enterprises.674.10892.1, descr Systems Management SNMP MIB Plug-in Manager

In my case, I saw this instead:

May 11 17:06:59 serverx snmpd[29471]: /etc/snmp/snmpd.conf: line 370: Warning: Unknown token: smuxpeer.

After a long time fussing with SNMP and Dell OpenManage, it turned out that the problem was that the SMUX subsystem was being disabled at daemon startup by an option set in /etc/default/snmpd. Using the -I option will turn on (or off) a particular module used by snmpd. In this case, the line looked like this:

SNMPDOPTS='-Lsd -Lf /dev/null -u snmp -g snmp -I -smux -p /var/run/snmpd.pid'

With this configuration, the SMUX module is disabled. For snmpd to support SMUX, the line should look like this instead (removing the -I option and its argument):

SNMPDOPTS='-Lsd -Lf /dev/null -u snmp -g snmp -p /var/run/snmpd.pid'
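The same edit can be expressed as a one-line filter. This is only a sketch: the function name is invented, and it handles only the exact "-I -smux" form shown above, not an -I option that lists several modules.

```shell
# Hypothetical helper: drop "-I -smux" from an SNMPDOPTS-style line read
# on standard input, leaving every other option untouched.
enable_smux_in_opts() {
    sed 's/ -I -smux//'
}

# Demonstration on the line from /etc/default/snmpd quoted above.
echo "SNMPDOPTS='-Lsd -Lf /dev/null -u snmp -g snmp -I -smux -p /var/run/snmpd.pid'" |
    enable_smux_in_opts
```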

After making the change, restart the daemon:

service snmpd restart

This should then fix problems in using Dell OpenManage (or other SMUX agents). You don’t have to restart Dell OpenManage to make this work, but it should have SNMP enabled (which provides the smuxpeer line in snmpd.conf).

To enable SNMP for Dell OpenManage, use this:

service dataeng enablesnmp

To restart the Dell OpenManage services, don’t use the usual service command; use this command instead (it takes care of all Dell OpenManage services):

srvadmin-services.sh restart

Why SMUX should be disabled I couldn’t say. The system is running Ubuntu 10.04.2 LTS with snmpd 5.4.2.1-dfsg0ubuntu1-0ubuntu2.1.

Installing VMware vSphere CLI 4.0 in Ubuntu 10.04 LTS

Installing the VMware vSphere Command Line Interface (CLI) has the potential for problems. In my case, it generated an error – a three-year-old error. Perl returns the error:

undefined symbol: Perl_Tstack_sp_ptr

Not only has this error been around for three years, it also has shown up in numerous other instances. Ed Haletky wrestled with the error in VMware vSphere CLI back in June of 2008. The error surfaced in Arch Linux in 2008, both in running their package manager and in running cpan itself. This error also came up (again in 2008) in attempting to build and run Zimbra. (The response from Zimbra support was cold and unwavering: we don’t support that environment and won’t discuss it. How unfortunate.) The error also affected the installation of Bugzilla according to this email thread from 2009.

On the Perl Porters mailing list, there is an in-depth response as to what causes this error. From reading these messages, it appears that there are two related causes:

  • Using modules compiled for Perl 5.8 with Perl 5.10
  • Using modules compiled against a threaded Perl with an unthreaded Perl

One recommended solution is to recompile the modules using the cpan utility:

cpan -r

That may or may not be enough; it depends on whether there were other errors. In attempting to run the vSphere CLI, I got this error:

IO::Compress::Gzip version 2.02 required--this is only version 2.005

To fix this, I ran cpan this way:

cpan IO::Compress::Gzip

In my case, that loaded IO::Compress::Gzip version 2.033.
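To see why 2.005 fell short of 2.02 while 2.033 satisfies it, it helps to remember that these module versions compare as decimal numbers. The sketch below assumes exactly that and nothing more; both function names are hypothetical, and the second one simply asks Perl for the module's $VERSION variable.

```shell
# Hypothetical helper: succeed if version "have" is at least version
# "need", treating both as decimal numbers (2.005 < 2.02 < 2.033).
version_at_least() {
    awk -v have="$1" -v need="$2" 'BEGIN { exit !(have + 0 >= need + 0) }'
}

# Hypothetical helper: print the version of a module as Perl sees it
# (prints nothing if the module cannot be loaded).
installed_module_version() {
    perl -M"$1" -e "print \$$1::VERSION" 2>/dev/null
}

# Example:
#   if version_at_least "$(installed_module_version IO::Compress::Gzip)" 2.02
#   then echo "new enough"
#   else echo "run: cpan IO::Compress::Gzip"
#   fi
```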

I also loaded the libxml2-dev package; I don’t know if that was necessary or not:

apt-get install libxml2-dev

Whenever using cpan, I always wonder how it affects my packaged installations and whether it installs for all users or just me (and how to control that) – but I’ve never had any problems and installs as root seem to go into /usr/local – which makes sense.

Having done all this, I can now use the vSphere CLI to activate SNMP on the ESXi 4 servers. For the record, this is an integral part of ESXi 4 and supports all SNMP polling and traps – previously, only SNMP traps were supported. Certainly a nice improvement.

Using NET-SNMP and HP-UX SNMP together

Why would one want to do such a thing? A major reason is that the HP-UX SNMP daemon only supports the EMANATE protocol for subagents; this means that subagents that support the AgentX protocol (which NET-SNMP – provided as part of HP’s Internet Express – supports) are not supported and cannot be accessed via HP’s SNMP daemon.

However, the HP-UX specific information is only available via the HP-UX native SNMP daemon. What is the answer?

The answer is to change one or the other to run on a non-native port. With the two daemons listening on different ports (in essence, acting like two discrete daemons), the capabilities of both can be exploited. Since the native HP-UX SNMP daemon does not provide a way to specify the port, the NET-SNMP daemon is the one to move, and doing so is relatively trivial.

In the NET-SNMP snmpd.conf, there is probably already a line that says:

agentaddress 161

Change this line to a new port – I used 166:

agentaddress 166
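If the change has to be repeated on several hosts, it could be scripted along these lines. This is a sketch under assumptions: the function name is invented, the line is assumed to read exactly "agentaddress 161", and a temporary file is used because an in-place option is not part of POSIX sed.

```shell
# Hypothetical helper: rewrite "agentaddress 161" to use another port.
# The original file is preserved as FILE.orig.
move_agent_port() {
    conf_file="$1"
    new_port="$2"
    sed "s/^agentaddress[[:space:]][[:space:]]*161[[:space:]]*\$/agentaddress $new_port/" \
        "$conf_file" > "$conf_file.new" &&
        cp "$conf_file" "$conf_file.orig" &&
        cat "$conf_file.new" > "$conf_file" &&
        rm -f "$conf_file.new"
}

# Example (the path is whatever snmpd.conf your NET-SNMP daemon reads):
#   move_agent_port /path/to/snmpd.conf 166
```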

Restart the daemon. Once the NET-SNMP daemon has been moved, enable HP’s SNMP daemon (if you’ve not already done so) and start it up again:

cd /sbin/init.d
./SnmpMaster start

This should enable your two SNMP daemons on different ports. Now you can access whichever one holds the data you want. For example, using the command snmpwalk, getting Caché data can be as simple as:

snmpwalk -m ALL -v 2c -c public my:166 .1.3.6.1.4.1.16563

HP-specific data, on the other hand, can be retrieved this way:

snmpwalk -m ALL -v 2c -c public my .1.3.6.1.4.1.11

Note the contrast between the two commands: one accesses the host my with the standard port (my); one uses the host my with the port 166 (my:166).
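If remembering which daemon serves which subtree becomes a nuisance, a tiny wrapper can choose the target. The sketch below is purely illustrative: the function name is made up, and it hard-codes the assumption from this setup that the Caché subtree (.1.3.6.1.4.1.16563) is answered on port 166 and everything else on the standard port.

```shell
# Hypothetical helper: print the snmpwalk target for a host and an OID,
# sending Caché's subtree to the NET-SNMP daemon on port 166.
snmp_target_for() {
    host="$1"
    oid="$2"
    case "$oid" in
        .1.3.6.1.4.1.16563|.1.3.6.1.4.1.16563.*) echo "$host:166" ;;
        *) echo "$host" ;;
    esac
}

# Example:
#   oid=.1.3.6.1.4.1.16563
#   snmpwalk -m ALL -v 2c -c public "$(snmp_target_for my "$oid")" "$oid"
```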

As a side note, Caché provides AgentX subagents, and OpenVMS supports SNMP and AgentX as of v8.x. Thus, there’s no fighting with the SNMP daemon on OpenVMS.

Using SNMP with Intersystems Caché

Intersystems Caché can be monitored using SNMP, but its SNMP agent must be started first. The details are covered in Appendix B of the Caché Monitoring Guide.

Firstly, to make life easier, the SNMP MIB for Caché is included in the installation of a Caché instance. Go to the top level directory (which contains the CPF file) and then change to the directory SNMP. This directory contains the SNMP MIB (named ISC-CACHE.mib).

Put this file with the other MIBs that your client uses. This will provide names and details for your SNMP client. If using net-snmp under Red Hat Enterprise Linux, put the MIB file in /usr/share/snmp/mibs/.

To start using SNMP in Caché (assuming your SNMP server supports AgentX and is already running), use this command (in the %SYS namespace):

%SYS> d start^SNMP(705,20)

The first parameter is the standard port for AgentX (705), and the second is a timeout value (default of 20). When you look at the jobs running in %SYS (using THIS^%SS) you will see a job named SNMP.

To stop SNMP, just enter (again, in the %SYS namespace):

%SYS> d stop^SNMP()

(Don’t forget the parentheses; it won’t work otherwise.) Logs are written to the mgr/SNMP.log file.

Once SNMP is started, you can check Caché data:

snmpwalk -m ALL -v 2c -c public server .1.3.6.1.4.1.16563.1.1

This command is a net-snmp command, and assumes a server running SNMP v2 with a “public” community and Caché SNMP running. If SNMP is fully set up, you will get a variety of details about your Caché instance. The MIB file is well-documented as to what each element is and means.

AgentX and SNMP on HP-UX

A recent (relatively speaking) protocol for supporting SNMP agents is called AgentX. This protocol is an attempt to standardize the communication between a “master” SNMP agent (or server daemon) and the client agents. This style of SNMP configuration makes SNMP support extensible, providing the ability to add and remove whole SNMP subtrees as desired. Intersystems Caché is one product that provides an agent that uses the AgentX protocol.

Unfortunately, HP-UX does not support AgentX, but rather a commercial protocol called EMANATE.

To be able to use subagents that support AgentX, the native SNMP server must be disabled, and one that supports AgentX installed to take its place. Thankfully, this is not difficult.

First, disabling SNMP requires modifying a number of files in /etc/rc.config.d:

  • SnmpHpunix
  • SnmpMaster
  • SnmpMib2
  • SnmpNaa
  • SnmpTrpDst
  • cmsnmpagt
  • emsagtconf

Change the entry that enables the affiliated agent or server: typically, this is by setting an entry such as SNMP_MASTER_START to 0.
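A sketch of how that edit might be automated for one file follows. It assumes the entries have the form NAME_START=1, as with SNMP_MASTER_START; the function name is hypothetical, and the backup is written to a separate directory on the assumption that the startup system may read every file it finds in /etc/rc.config.d.

```shell
# Hypothetical helper: set every NAME_START=1 entry to 0 in one
# rc.config.d-style file, saving the original in a separate directory.
disable_start_flags() {
    rc_file="$1"
    backup_dir="$2"
    backup="$backup_dir/$(basename "$rc_file").orig"
    cp "$rc_file" "$backup" || return 1
    sed 's/^\([A-Za-z0-9_]*_START\)=1/\1=0/' "$backup" > "$rc_file"
}

# Example (backup directory is an arbitrary choice):
#   for f in SnmpHpunix SnmpMaster SnmpMib2 SnmpNaa SnmpTrpDst cmsnmpagt emsagtconf
#   do
#       disable_start_flags "/etc/rc.config.d/$f" /var/tmp/snmp-rc-backup
#   done
```

Check each file afterwards; any entry that does not follow the NAME_START=1 pattern still has to be changed by hand.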

This completely disables all agents as well as the master SNMP daemon at the next boot. Don’t forget to stop the running daemons as well, using their init scripts:

cd /sbin/init.d
./SnmpHpunix stop
./SnmpIpv6 stop
./SnmpMib2 stop
./SnmpTrpDst stop
./SnmpNaa stop
./SnmpMaster stop
./cmsnmpagt stop
./emsa stop

After this, get HP’s Internet Express and install Net-SNMP, which supports AgentX. Strangely enough, the Net-SNMP documentation states that Tru64 and OpenVMS servers come with AgentX support built-in.

Once installed, use the utility snmpconf to configure the agent – it creates the appropriate configuration file in the directory you are in. You’ll want to set these configuration parameters, either through the use of snmpconf or directly:

master yes
agentuser root
agentgroup sys
agentaddress 161
agentXSocket tcp:localhost:705

Better yet, create a special user and group for SNMP – installation of Net-SNMP does not set this up for you.
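For illustration, the settings above could be written out by a small function that takes the user and group as arguments instead of fixing them to root and sys. The function name and the snmp user and group in the example are assumptions; creating the account itself is system-specific and not shown.

```shell
# Hypothetical helper: write the settings listed above to a config file,
# with the run-as user and group supplied by the caller.
write_snmpd_conf() {
    conf_file="$1"
    agent_user="$2"
    agent_group="$3"
    cat > "$conf_file" <<EOF
master yes
agentuser $agent_user
agentgroup $agent_group
agentaddress 161
agentXSocket tcp:localhost:705
EOF
}

# Example, using a dedicated (hypothetical) snmp user and group:
#   write_snmpd_conf /opt/iexpress/net-snmp/etc/snmpd.conf snmp snmp
```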

Once the snmpd.conf is configured, then run snmpd. Make sure you are running the right snmpd; reread the path if you need to:

# type snmpd
snmpd is /opt/iexpress/net-snmp/sbin/snmpd
# snmpd -Lsdaemon -c /opt/iexpress/net-snmp/etc/snmpd.conf

You’ll have to create an initialization script in /sbin/init.d; we’ve discussed how to do this before.
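As a reminder of the general shape, here is a minimal sketch of what the body of such a script might look like. HP-UX init scripts conventionally answer start_msg and stop_msg in addition to start and stop; the function name, the message text, and the PID file location are assumptions made for this example, and a production script would want more error handling.

```shell
# Settings can be overridden from the environment; the PID file path is
# an assumption, not something the Net-SNMP installation sets up.
SNMPD=${SNMPD:-snmpd}
SNMPD_CONF=${SNMPD_CONF:-/opt/iexpress/net-snmp/etc/snmpd.conf}
SNMPD_PID=${SNMPD_PID:-/var/run/netsnmpd.pid}

# Hypothetical dispatcher for the four standard init script arguments.
netsnmp_rc() {
    case "$1" in
        start_msg) echo "Starting the Net-SNMP daemon" ;;
        stop_msg)  echo "Stopping the Net-SNMP daemon" ;;
        start)     "$SNMPD" -Lsdaemon -c "$SNMPD_CONF" -p "$SNMPD_PID" ;;
        stop)      [ -f "$SNMPD_PID" ] && kill "$(cat "$SNMPD_PID")" ;;
        *)         echo "usage: $0 {start|stop|start_msg|stop_msg}" >&2
                   return 1 ;;
    esac
}

# In the real script, the last line would dispatch on the argument:
#   netsnmp_rc "$1"; exit $?
```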

You should not expect the rich set of HP-UX-specific entries that are provided with the standard installation, but in trade you get extensibility – which allows you to run subagents such as that provided with Intersystems Caché.

Securing your network traffic

If you want to start some exciting discussion in a security forum, just say you use telnet: you’ll find that every admin knows that telnet is insecure, that one should use OpenSSH or similar to encrypt the traffic, and that telnet should be banned from the server environment entirely.

However, telnet is not the only service that transmits passwords or data in the clear. There are a lot of others. Here’s a list I came up with:

  • FTP
  • HTTP
  • IMAP
  • IPP
  • LDAP
  • LPD
  • NFS
  • POP3
  • rsync
  • SMTP
  • SNMP
  • syslog
  • VNC
  • X11
  • XDMCP

I won’t cover all of these here (more about these items can be found in my book) but I do want to cover just a few.

Consider, for example, the mail protocols: SMTP, POP3, and IMAP. SSL encryption is available with all three – but do you use it? And what about your logins to your mailbox at your ISP? Without SSL, every time you log in, your mailbox password goes across the wire in the clear.

What about NFS – particularly NFS home directories? If you have unencrypted secrets in your home directory, then these items will be transmitted across the network in the clear as well. What about private SSH keys? Unfortunately, NFS as traditionally deployed (NFSv3 and earlier) has no way to encrypt its traffic; NFSv4 with Kerberos privacy (sec=krb5p) is the main exception.

VNC is another one to watch for: if you type passwords for your root logins over VNC – even if you are using SSH in your VNC session – the passwords are in the clear. The only way to secure VNC entirely is to use an SSH tunnel to encrypt it.
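A sketch of what such a tunnel looks like: VNC display N listens on TCP port 5900 + N, so forwarding that port over SSH is enough. The helper name and the host name in the example are placeholders, and the function only prints the command rather than running it.

```shell
# Hypothetical helper: print the ssh command that forwards a local port
# to the given VNC display on a remote host (display N = port 5900 + N).
vnc_tunnel_command() {
    host="$1"
    display="$2"
    port=$((5900 + display))
    echo "ssh -N -L $port:localhost:$port $host"
}

vnc_tunnel_command server.example.com 1
```

With the tunnel up, the VNC viewer is pointed at the same display on localhost instead of at the server, so the session travels inside the SSH connection.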

X11 is insecure in the same way, but presents special problems. However, OpenSSH handles X transparently through the use of special tunnels just for X.

syslog is another unencrypted service; do you have passwords put into the system logs? What about secret doings of your servers? How much information leakage can you handle? Unfortunately, syslog is another service that cannot be secured unless you use something such as syslog-ng which permits you to use TCP (and thus, an OpenSSH tunnel).