Puppet error: already in progress; skipping

Sometimes, you may try to run your puppet agent, and get an error like this:

# puppet agent --test
notice: Run of Puppet configuration client already in progress; skipping

If there is indeed another puppet agent running, it is simple enough to stop it and try again. However, what if this message appears, and there aren’t any other puppet instances running?

This happens because there is a flag stored in a file that didn’t get erased. Do this – but only if puppet is not in fact running:

# cd /var/lib/puppet/state
# rm -f puppetdlock

This will delete the lock, and puppet should start cleanly the next time. This tip works with puppet version 2.6.3.

Puppet refuses to run: “run already in progress”

Recently, one of the servers appeared to not be keeping up with configuration changes. Since it runs Puppet, this is a problem – it means that the changes at the puppet server are not getting propagated to the clients. The server is running Ubuntu Lucid Lynx Server 10.04.3 and Puppet 2.6.3.

So I shut down the puppet agent and tried running it manually:

# service puppet stop
 * Stopping puppet agent
   ...done.
# puppet agent --test
notice: Run of Puppet configuration client already in progress; skipping
#

Since puppet was definitely not running, I had to do some research to find out why it refused to start.

I found this bug (Puppet bug #2888) that stated sometimes puppet does not remove its lockfile /var/lib/puppet/state/puppetdlock. Sure enough, on my system, the lockfile was still there. I deleted it and puppet ran normally.

There was also a bug report (Puppet bug #5246) that suggested puppet sometimes does not remove its pidfile /var/lib/puppet/run/agent.pid. Some of the testing suggests that this bug is confined to running puppet --onetime (without other options). I don’t think this affected me: after removing the lockfile, puppet ran normally.
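
The check-then-delete step can be scripted. Here is a minimal sketch, not anything shipped with Puppet: it assumes the agent records its PID in the pidfile mentioned above and that it runs as root (kill -0 only probes whether a process exists). The function names are made up for illustration.

```shell
#!/bin/bash
# Paths as quoted in this post; the helper names are hypothetical.
LOCKFILE=/var/lib/puppet/state/puppetdlock
PIDFILE=/var/lib/puppet/run/agent.pid

# Succeeds only if PIDFILE names a process that is still alive.
agent_is_running() {
    local pidfile=$1 pid
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # Reject empty or non-numeric contents before probing
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    kill -0 "$pid" 2>/dev/null
}

# Removes the lock file unless the agent recorded in the pidfile is alive.
remove_stale_lock() {
    local lockfile=$1 pidfile=$2
    if agent_is_running "$pidfile" ; then
        echo "puppet agent still running; leaving $lockfile alone" >&2
        return 1
    fi
    rm -f "$lockfile"
}

# Usage: remove_stale_lock "$LOCKFILE" "$PIDFILE" && puppet agent --test
```

A stale pidfile (the second bug above) is harmless here as long as its PID has not been reused by some other process.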

Investigating Mysterious Network Traffic

I discovered in our network a host that was generating a huge amount of traffic compared to every other host in the network. All I had to go on initially was the IP address – so how do we track down the culprit and see what is going on?

First, we have the IP address and access to the network. Thus, my next step was to log into the firewall and sniff the network to see where the traffic is going. That did not prove to be helpful.

However, if we have an IP address, we have a MAC address in the ARP table. There are lots of ways to get this, including the arp command. I used the -e option to tcpdump to get it during the sniff of the network.

If you have a MAC address, you can look up the manufacturer of the network card, which in many cases may be the computer manufacturer. In this case, it was AsusTek. A search of the premises for Asus equipment turned up nothing.
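
The manufacturer lookup works because the first three octets of a MAC address form the vendor prefix (the OUI). A small sketch for pulling that prefix out in a consistent form before searching a vendor list; the function name and the sample address are made up:

```shell
#!/bin/bash
# Print the vendor prefix (OUI) of a MAC address as XX:XX:XX.
# Accepts either colon- or dash-separated input, in any case.
mac_oui() {
    printf '%s\n' "$1" | tr 'a-f-' 'A-F:' | cut -c1-8
}

# Usage (hypothetical address): mac_oui 0a:1b:2c:dd:ee:ff
```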

Since this was almost certainly a Windows machine, using nmblookup -A $ip may turn up something useful: in particular, it may return a host identifier or name that can identify its owner. In this case, it turned up a name that had no meaning to me.

The traffic will be viewable at the switch where the equipment is plugged in. Thus, we can go to the switch where all of the workstations are plugged in, and plug a laptop into a port to sniff traffic. Once I had done this, I logged into the web interface, then mirrored 12 of the 24 ports to the port the laptop was in even as the laptop was sniffing the network for the suspect host.

Doing this will send all traffic on those ports also to the port the laptop is listening on – and in this case, turned up the suspect host. Performing a binary search will narrow it down – that is, mirror half (6) of the 12 ports and see if the traffic continues to flow, then mirror half of that (3).
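
The halving procedure is an ordinary binary search. As a rough sketch (I did this by hand in the switch's web interface; nothing below talks to a switch), the bookkeeping could look like this. The function names are hypothetical, traffic_seen simply asks the operator, and the search assumes the suspect host really is somewhere in the starting range:

```shell
#!/bin/bash
# Ask the operator whether the suspect traffic shows up while only
# ports FIRST..LAST are mirrored. Replace with an automated check if
# your switch offers one.
traffic_seen() {
    local answer
    printf 'Mirror only ports %s-%s. Suspect traffic visible? [y/n] ' "$1" "$2" >&2
    read -r answer
    [ "$answer" = "y" ]
}

# Narrow the port range LO..HI down to the single port carrying the traffic.
find_port() {
    local lo=$1 hi=$2 mid
    while [ "$lo" -lt "$hi" ]; do
        mid=$(( (lo + hi) / 2 ))
        if traffic_seen "$lo" "$mid"; then
            hi=$mid              # suspect is in the lower half
        else
            lo=$(( mid + 1 ))    # suspect must be in the upper half
        fi
    done
    echo "$lo"
}

# Usage: find_port 1 12
```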

Once I narrowed down the ports to a single port, I tested it. Does the traffic stop when we stop mirroring that port? Once the mirroring of the port was stopped, the network traffic seen by the laptop stopped. Mirroring again resumed the suspect traffic.

Now we know what port it is using. Following the cable from the switch to the patch panel shows which physical outlet is connected, and with the map showing where all of the outlets are we can track the outlet down.

Going to the outlet, I found that there were several devices plugged into a cheap hub. Since there was no one in that office, I pulled each computer’s link to the hub and plugged it into a laptop. This laptop, again, was used to sniff the traffic coming off of the host. A couple of tests and the host was identified.

Next step is to get a virus checker on it and run that to see if anything is running that shouldn’t be.

Investigating Mysterious Outbound TCP Connections

Recently, I had a situation where a firewall had outgoing TCP connections I knew nothing about. If you are to maintain a secure system and a secure network, this sort of thing demands investigation. (I won’t report full details in order to maintain anonymity for various entities.)

Where to start, then? First, use tcpdump and capture the traffic. It may be useful to capture it into a file for looking at with Wireshark. I watched the traffic flow across all interfaces by using tcpdump:

tcpdump -s0 -n -i any host 999.999.999.999
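
To keep the packets for Wireshark, tcpdump's -w option writes the raw capture to a file instead of printing it. A small sketch; the function name, the default file name, and the address in the usage line (from the documentation range) are placeholders:

```shell
#!/bin/bash
# Capture all traffic to or from HOST on every interface into a pcap file.
capture_host() {
    local host=$1 outfile=${2:-capture.pcap}
    tcpdump -s0 -n -i any -w "$outfile" host "$host"
}

# Usage: capture_host 192.0.2.10 /tmp/suspect.pcap   (stop with Ctrl-C)
```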

I noticed that there was no traffic coming from anywhere except the outgoing port on the firewall.

Then I became more interested in the IP address being connected to and the port (443 or HTTPS in this case). Connecting to the IP on port 443 didn’t turn up anything interesting (except they used Red Hat Enterprise Linux). Looking up the IP address in a whois listing showed that the IP address was very similar to that of the firewall maker – very interesting indeed. Looking up the IP in reverse DNS showed it to be an Amazon AWS host in Ireland.

Then I wrote a script that used lsof to watch for a connection and find the program making the connection:

#!/bin/bash

WORK=/tmp/work.$$
PORT=":443"

# Prep: erase if present, and clean up when the script exits or is interrupted
rm -f "$WORK"
trap 'rm -f "$WORK"' EXIT

while true ; do
    # lsof exits 0 only when it finds a matching connection
    if lsof -ni "$PORT" > "$WORK" ; then
        ( echo "Found ports open:" ; echo ; cat "$WORK" ; echo ; echo "Process data:" ; echo
          # Skip the header line and turn each PID (second column) into "-p PID"
          lsof $(sed -n '1d; s/^[^ ]*  *\([^ ]*\).*$/-p \1 /p;' "$WORK") ) | \
        mail -s "Found something on port $PORT" me@myhost.example.com
        echo "Sent message at $(date)..."
    fi
done

Because lsof returns 0 only if it has something to report, this works beautifully. I could have slowed it down with a sleep command, but this worked for my purposes.
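
The sed expression in the script is the dense part: it drops the header line and pulls the PID out of the second column. An equivalent sketch with awk, which may be easier to read and also removes duplicate PIDs; the function name is made up:

```shell
#!/bin/bash
# Read lsof output on standard input and print each PID once.
pids_from_lsof() {
    # NR > 1 skips the header line; the PID is the second column
    awk 'NR > 1 { print $2 }' | sort -un
}

# Usage: lsof -ni :443 | pids_from_lsof
```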

It showed a program being run that was part of the firewall. Since it was running periodically, I went and looked for it in the crontab files:

grep -r program /etc/*cron*

I found this program in a file in the /etc/anacron.hourly directory. If I had wanted to, I could have stopped the program from running at all by changing this file. I ran the commands independently of the crontab file to see what the output would be.

I was also able to get help from the program by using the option --help. The program was actually a python script located in /usr/bin, and I searched out the actual code that was called: it was compiled python source (a *.pyc file) found in /usr/lib/python2.4/site-packages/ – the compiled source can be decompiled and investigated.

If I wanted to take complete control, the program could have been renamed and a script put in its place which called the original script and did a little extra – such as report by mail every time the command runs, what the command line was, what the output was, and more.
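
I did not actually do this, but a wrapper along those lines might look like the following sketch. The renamed path /usr/bin/program.real is hypothetical, as is the function name; the mail address is the placeholder used earlier.

```shell
#!/bin/bash
# Hypothetical location of the renamed original program
REAL_PROGRAM=${REAL_PROGRAM:-/usr/bin/program.real}
REPORT_TO=${REPORT_TO:-me@myhost.example.com}

# Run the original with the same arguments, mail a report, and pass the
# original output and exit status back to the caller.
run_and_report() {
    local output status=0
    output=$("$REAL_PROGRAM" "$@" 2>&1) || status=$?
    {
        echo "Command line: $REAL_PROGRAM $*"
        echo "Exit status:  $status"
        echo "Output:"
        printf '%s\n' "$output"
    } | mail -s "$(basename "$REAL_PROGRAM") ran at $(date)" "$REPORT_TO"
    [ -z "$output" ] || printf '%s\n' "$output"
    return "$status"
}

# In the replacement script itself, the last line would be:
#   run_and_report "$@"
```

Note that this sketch merges standard error into standard output, which the original program's caller might not expect.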

There’s a lot that can be found out if you just know where to look.

OpenLDAP with SSL in Ubuntu Lucid Lynx

In researching configuration tasks for OpenLDAP, I found this article about using sudo with OpenLDAP. As I am going to implement SSL with OpenLDAP, problems with sudo and SSL could be fatal, so I decided to investigate further.

It turns out that the problem is embedded in GnuTLS, a GPL-licensed OpenSSL replacement. GnuTLS was used by Debian because of a licensing conflict between OpenSSL and OpenLDAP. A backend library used by GnuTLS (libgcrypt11) causes problems based on the way it is initialized and the way it handles the dropping of privilege (that is, it gets rid of its “root” access). This shows up as suid applications failing when run against LDAP users; Ubuntu bugs 926350 (GnuTLS) and 423252 (sudo) are this exact problem.

GnuTLS is replacing libgcrypt11 with the nettle backend as of 2.11.x, but Ubuntu Lucid Lynx continues to use GnuTLS with the original (flawed) backend. The “fix” espoused by some is to use nscd – but this is acknowledged to be a workaround.

There is a GnuTLS version compiled not against libgcrypt11 but libnettle which should fix the problem. I did not test this PPA; if you wish to stick with OpenLDAP and GnuTLS this might be the way to go.

However, there is also a version of OpenLDAP compiled against OpenSSL. Add the PPA to your APT configuration and then perform a system update (apt-get update && apt-get upgrade). This will upgrade your OpenLDAP and cause it to stop at least temporarily; thus, make sure you allow for some server downtime.

This version of OpenLDAP works except for a few initial problems that must be overcome. When you first install, it may refuse to run.

First, it seems to add an openldap user and group – and while this is good, it does not completely change appropriate files to give access to the openldap user. There are two locations that need to be fixed:

  • /var/lib/ldap (the location of the LDAP data store)
  • /etc/ssl (the location of the SSL certificates)

Fixing the first is simple:

chown -R openldap:openldap /var/lib/ldap

The second is not as straight-forward; in my case, I added the user openldap to the group ssl-cert which has access into the /etc/ssl directory and subdirectories. Use the vigr command to make this happen: add openldap to the end of the ssl-cert group line (your group id might be different):

ssl-cert:x:108:openldap
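
If you would rather not edit the group file by hand, usermod can append a supplementary group. A minimal sketch (the function name is made up) that also confirms the membership afterwards:

```shell
#!/bin/bash
# Add USER to GROUP as a supplementary group, then verify the result.
add_to_group() {
    local user=$1 group=$2
    usermod -a -G "$group" "$user" || return 1
    # id -nG prints the user's group names separated by spaces
    id -nG "$user" | tr ' ' '\n' | grep -qxF "$group"
}

# Usage (as root): add_to_group openldap ssl-cert
```

Either way, the new membership only applies to processes started afterwards, so slapd has to be restarted to pick it up.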

Note that if the SSL certificates aren’t set up right, then running the new OpenLDAP will not work – even if LDAPS is not enabled at startup. (There is a fantastic message showing how to make sure your certificates match from OpenVPN.net.) You also need to make sure that the certificates are not expired; it is reported that OpenLDAP will also fail to start with expired certificates.
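
Both conditions can be checked from the command line with openssl before starting slapd. This is a sketch of one common approach, not the procedure from the OpenVPN.net message: it compares the public key embedded in the certificate with the one derived from the private key, and uses -checkend to test for expiry. The function names and file names are placeholders.

```shell
#!/bin/bash
# Succeeds if the certificate and the private key belong together.
cert_matches_key() {
    local cert=$1 key=$2 cert_pub key_pub
    cert_pub=$(openssl x509 -in "$cert" -noout -pubkey) || return 2
    key_pub=$(openssl pkey -in "$key" -pubout) || return 2
    [ "$cert_pub" = "$key_pub" ]
}

# Succeeds if the certificate has not expired yet.
cert_not_expired() {
    openssl x509 -in "$1" -noout -checkend 0 > /dev/null
}

# Usage (hypothetical file names):
#   cert_matches_key /etc/ssl/certs/ldap.pem /etc/ssl/private/ldap.key
#   cert_not_expired /etc/ssl/certs/ldap.pem
```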

As the final step, change the /etc/default/slapd file to start LDAPS:

SLAPD_SERVICES="ldap:/// ldapi:/// ldaps:///"

Eventually, the best thing to do is to remove LDAP support entirely (and use LDAPS completely):

SLAPD_SERVICES="ldapi:/// ldaps:///"

Tips and Tidbits About LDAP

Setting up and understanding LDAP is not easy. In my opinion, nothing is more obfuscated, and more unnecessarily so, than LDAP. There are a number of tips that can help you to understand LDAP.

LDAP is not authentication. This was the number one problem I had when I started (a while back). The first time user might search for documents on setting up LDAP when in fact they are looking for documents on how to set up UNIX and Linux authentication using LDAP. An LDAP server at its most basic doesn’t understand UNIX uids, doesn’t understand GECOS fields, doesn’t understand UNIX gids, and doesn’t understand Linux shadow files. Support for all of this must be added.

Support for UNIX authentication must be added. You would think that the most common usage for LDAP would come bundled and ready to go with the server; however, often this is not the case. Even if it is the case, you may find that for certain capabilities you are expected to add new LDIF files to support the fields in LDAP.

LDAP is not just another database server. Virtually everything in LDAP has a different name; it is unlike anything you’ve done before. Take heart: X.500 (where LDAP comes from) was worse. You’ll have to slog through a pile of new terms, but after a while it will become easier to understand.

OpenLDAP is not synonymous with LDAP. There are other servers out there. OpenLDAP does come with virtually every Linux platform; there are, however, many others – many of which may be easier to use. There is the 389 Directory Server from Red Hat, the ApacheDS (part of the Apache Directory Project) from Apache, and OpenDJ from ForgeRock. OpenDJ itself is a spin-off from OpenDS, originally from Sun.

OpenLDAP is known for making non-backwards-compatible changes. The most recent example is the complete replacement of the configuration system.

OpenLDAP no longer uses slapd.conf. This will cause you no end of problems: there are a lot of people trying to explain how to set up OpenLDAP, and with a single stroke (as of version 2.3) OpenLDAP made all of that documentation obsolete and useless. This is incomprehensible, but it is a fact.

Using and administering LDAP requires command line expertise. This is basically true, but like many things, it is not the complete truth. There are many programs designed to make it easy to browse LDAP stores, along with editing capabilities. Some of the more interesting products include Jxplorer, Luma, and the Apache Directory Studio. Of these, the Apache Directory Studio is the most capable, robust, and actively developed – and by far the largest.

Some LDAP entries can be present more than once or have more than one value. If you are comparing LDAP to a database, then this will come as a surprise. One valuable example is UNIX groups: the original UNIX systems only had one group per user; later, secondary groups were added – thus presenting a single user with multiple groups. This is handled in LDAP in a variety of ways, but they all amount to having multiple entries with different values.
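
To make the multi-valued idea concrete: in LDIF output (what ldapsearch prints), a multi-valued attribute simply appears on several lines of the same entry. Here is a naive sketch that lists every value of one attribute; the function name is made up, and it ignores base64-encoded values and folded lines.

```shell
#!/bin/bash
# Print every value of attribute ATTR found in LDIF text on standard input.
ldif_values() {
    awk -v attr="$1" '{
        i = index($0, ": ")
        # Attribute names are case-insensitive, so compare in lower case
        if (i > 0 && tolower(substr($0, 1, i - 1)) == tolower(attr))
            print substr($0, i + 2)
    }'
}

# Usage: ldapsearch -x -b dc=bigcorp,dc=com '(cn=developers)' | ldif_values memberUid
```

For a posixGroup entry with two memberUid lines, this prints two values, one per line; the group name in the usage line is invented.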

Limiting user logins by host is not available in LDAP. This capability is most likely to be done by using the client host. There are a number of ways to do it, but all require LDAP client configuration, and all are limited in their application. Without client configuration, all LDAP users will have authenticated access to the host.

Be prepared to do a lot of web searches for documentation and solutions. The best places to go for searches are: Google (of course) and Ubuntu Documentation.

There are also very good articles and documents on using LDAP for authentication. There is an article about OpenLDAP authentication on FreeBSD (FreeBSD articles tend to be very well-written). Similarly, Ubuntu documentation is well-written as well; each of the Ubuntu versions has a section in the documentation on using and configuring OpenLDAP for authentication. Ubuntu 11.04 documentation has a good article on OpenLDAP for example.

Ubuntu documentation also includes a lot of well-written (and current) articles. For example, there are articles on OpenLDAP Server (a general article), LDAP Client Authentication, Samba and LDAP (from the 10.04 Server documentation), and Securing OpenLDAP Connections. If you plan to use 389 Server instead, there are even a couple of articles on using it with Ubuntu: Fedora Directory Server (the original name of 389 Server) and Fedora Directory Server Client Howto.

A nice overview of LDAP comes from Brian Jones at O’Reilly: specifically, his 2006 articles on Demystifying LDAP and Demystifying LDAP Data. Linux Journal also has myriad different articles on LDAP (not to mention an OpenLDAP theme for the December 2002 issue). Linux Journal also has an article from 2007 on Fedora Directory Server (now 389 Server).

Lastly, an excellent resource is the “book” LDAP for Rocket Scientists from Zytrax.com. You simply must go and read portions of this book. One very apt quote from the introduction to the book which sums up the state of LDAP documentation generally:

The bad news is that [in our humble opinion] never has so much been written so incomprehensibly about a single topic with the possible exceptions of BIND and … and …

(It should be noted that the other book at Zytrax is about DNS. Is it any surprise?)

UPDATE:

Yet another trick to LDAP:

The cn= attribute is not solely a leaf attribute. This can be seen in OpenLDAP’s cn=config tree with OpenLDAP configuration. For example, a typical admin user can be designated like so:

cn=admin,dc=bigcorp,dc=com

However, when you use OpenLDAP’s configuration, the designation for the admin user is this:

cn=admin,cn=config

When you look into the configuration tree, there are more cn= entries – like this:

cn={4}misc,cn=schema,cn=config
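
A quick way to see this is to split a DN into its components and count how many of them start with cn=. The sketch below is deliberately naive (it splits on every comma, so DNs with escaped commas will confuse it), and the function names are made up:

```shell
#!/bin/bash
# Print each component (RDN) of a DN on its own line, leaf first.
dn_rdns() {
    printf '%s\n' "$1" | tr ',' '\n'
}

# Count how many components of a DN use the cn attribute.
count_cn_rdns() {
    dn_rdns "$1" | grep -c '^cn='
}

# Usage: count_cn_rdns 'cn={4}misc,cn=schema,cn=config'
```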

Resetting the MacOS X 10.4 (Tiger) Admin Password (without disk!)

Resetting the MacOS X Tiger administrator password can be done by booting with the Installation Disk, and selecting the appropriate menu option. This is the most commonly referred to option, with a lot of high-quality instructions available via Google.

The problem is what to do when you have no disk – or it is too inconvenient to get it. In my case, the PowerPC Mac Mini that runs MacOS X Tiger has a bad DVD drive.

In most cases, resetting a password just requires physical access to the machine and a reboot. (This is why nearly all security professionals say, If you’ve physical access to the box, it’s over.)

With Tiger, you can indeed do this. (In fact, Leopard and Snow Leopard can too – it’s just more complicated.)

Start your MacOS X 10.4 system, and at the gray screen hit (and hold) Cmd-S to enter single user mode. The screen should go black, and white writing commences – kernel messages. You should eventually get a root prompt:

#

At this prompt, type in these three commands (terminated with a return, of course):

sh /etc/rc
passwd admin
reboot

(Replace admin with your administrator user’s short username.) When I did this, I found that if you waited too long after doing the command sh /etc/rc, then the system would take away your prompt. So don’t lag!

This article (from 2009) over at MacYourself is one of the most complete descriptions I’ve seen; this 2007 article at MacOSX Tips is nearly as complete and adds some more thoughts too.

These articles saved me; I hope they can be of some use to you too.

Using SSH Agents with GNU Screen (and Byobu)

SSH is an encryption tool that allows you to connect to machines using an authenticated and encrypted connection. With an SSH agent, your authentication (and keys) can be “carried” from one system to the next. You load all of your keys on your local system into the agent, then connect to a remote system with the agent. Even though none of your private keys are present on the remote system, the forwarded agent makes all of them available for authenticating to yet another system.

This capability that the SSH agent gives you is very useful: you can keep all of your keys on a laptop or other personal system and only keep public keys on remote systems.

Running the agent is as simple as:

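eval `ssh-agent`

This starts the agent and sets SSH_AUTH_SOCK and SSH_AGENT_PID in your shell. The agent starts out empty: running ssh-add with no arguments loads the default keys from your .ssh directory. You can add specific keys with: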

ssh-add mykey

Replace mykey with your specific key name. If there is a passphrase on the key, you only have to enter it once – when you add the key.

Once the agent is configured, you can connect to a remote system with:

ssh -A host

The -A option tells SSH to use “Agent Forwarding” which is what allows us to take our keys “with” us from one host to the next.

Here is the really nice part: once you’ve connected to the place where your GNU screen sessions are located, copy the value of the SSH_AUTH_SOCK variable:

# set | grep SSH
SSH_AUTH_SOCK=/tmp/ssh-ttQal19039/agent.19039
SSH_CLIENT='192.168.6.181 42243 22'
SSH_CONNECTION='192.168.6.181 42243 192.168.6.161 22'
SSH_TTY=/dev/pts/1

Take the value of SSH_AUTH_SOCK and input it into GNU screen:

:setenv SSH_AUTH_SOCK /tmp/ssh-ttQal19039/agent.19039

After this command is executed, start new sessions to your remote hosts. For the local host, it may be easiest just to restart the session – but you could also just set the variable SSH_AUTH_SOCK in your shell – such as this command for the Korn shell:

export SSH_AUTH_SOCK=/tmp/ssh-ttQal19039/agent.19039

To verify that the agent now works, use this command:

ssh-add -l

You should see all of your keys; if instead you see

Could not open a connection to your authentication agent.

then you should check the setting of SSH_AUTH_SOCK.
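
The same check can be scripted using the exit status of ssh-add, which its manual page documents as 0 on success, 1 if the command fails, and 2 if the agent cannot be contacted. A small sketch, with a made-up function name and message texts:

```shell
#!/bin/bash
# Report whether an agent is reachable through SSH_AUTH_SOCK.
check_agent() {
    local status=0
    ssh-add -l > /dev/null 2>&1 || status=$?
    case $status in
        0) echo "agent reachable, keys loaded" ;;
        1) echo "agent reachable, but no keys loaded" ;;
        *) echo "cannot reach agent; check SSH_AUTH_SOCK (currently: ${SSH_AUTH_SOCK:-unset})" ;;
    esac
    return "$status"
}

# Usage inside a screen window: check_agent
```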

With SSH agents, agent forwarding, and GNU screen, you will find your authentication difficulties eased considerably.

UPDATE: Added information about not always having to restart screen sessions.

Restricting Users to SFTP Only and to Home Directories Using a Chroot

A chroot jail is a miniature environment which provides just enough resources for a particular program to run. In this case, the sftp program is the one that is chroot’ed. The process can be made easier with tools like rssh or scponly – both of which are available in Red Hat Enterprise Linux 5.

However, the problem comes up when you want to use multiple users in this environment and you want to restrict each user so that they cannot see the rest of the environment, including other users. Neither rssh nor scponly will handle this sort of thing.

There is the program MySecureShell but it is not included with either RHEL or Ubuntu Server. One would have to compile it from source or use their packages for these environments.

However, as of OpenSSH 4.9 this problem was resolved: the Match option (added in OpenSSH 4.4), combined with the ChrootDirectory option and an internal SFTP server (in contrast to the external helper program otherwise used), allows the kind of restrictions wanted here.

Ubuntu Server Lucid Lynx (10.04) and Red Hat Enterprise Linux 6 both come with OpenSSH 5.3p1 which includes both of these options. However, Red Hat Enterprise Linux 5 only comes with OpenSSH 4.3p2. To see which version you have, use a command like this:

$ ssh -V
OpenSSH_4.3p2, OpenSSL 0.9.8e-fips-rhel5 01 Jul 2008
$
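
If you need to check many hosts, the version banner can be tested in a script. This is a rough sketch with a made-up function name; it only looks at the upstream version number, so it knows nothing about vendor backports such as the one described next.

```shell
#!/bin/bash
# Succeeds if the banner from "ssh -V" reports OpenSSH 4.9 or later.
# Returns 2 if the banner cannot be parsed.
openssh_at_least_4_9() {
    local banner=$1 ver major minor
    ver=$(printf '%s\n' "$banner" |
          sed -n 's/^OpenSSH_\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p')
    [ -n "$ver" ] || return 2
    major=${ver%%.*}
    minor=${ver#*.}
    [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 9 ]; }
}

# Usage (ssh -V writes to standard error):
#   openssh_at_least_4_9 "$(ssh -V 2>&1)" && echo "Match and ChrootDirectory available"
```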

Red Hat did backport the ChrootDirectory option into OpenSSH in RHEL 5 back in 2009, but not the Match option. Thus the ChrootDirectory option is good for everyone or no one. It is possible to use the option on the command line or in a specially crafted configuration file, but this requires a special server for SFTP instead of using the normal one – meaning two instances of SSH running if you want standard SSH logins.

It doesn’t seem to be possible to pass such options into rssh; rssh doesn’t allow you to pass options to ssh.

If you are using RHEL 5, it appears that the best option is to compile the most recent OpenSSH from source. To build the source, unpack the archive and copy the file contrib/redhat/openssh.spec from the unpacked openssh directory to the /usr/src/redhat/SPECS directory. Edit the file to your desire (such as disabling all of the X11 capabilities) and then perform this command:

rpmbuild -ba openssh.spec

In my case, the SPEC file contained an error in 5.9p1: there is a line that says:

%doc CREDITS ChangeLog INSTALL LICENCE OVERVIEW README* PROTOCOL* TODO WARNING*

Since there are no files named WARNING* this fails. Remove that entry from the end of the line:

%doc CREDITS ChangeLog INSTALL LICENCE OVERVIEW README* PROTOCOL* TODO

Then it compiles without problem.

If you want to use a recent version of OpenSSH on RHEL 5, you’ll have to make some configuration changes – and watch out for some changes. Changes include:

  • ServerKeyBits increased from 768 to 1024 (knock back down with specific entry if this worries you)
  • Protocol only supports version 2 by default
  • ShowPatchLevel not supported by standard OpenSSH (requires patch)
  • New ECDSA keys not supported fully or properly; `ssh-keygen -A` generates errors (can be ignored)
  • MaxSessions added to configuration file
  • AuthorizedKeysFile changes meaning slightly
  • UseDNS is included in the configuration file (instead of being hidden – set it to “no” for speed)
  • Banner is set to “none” by default

Change the Subsystem configuration entry for sftp to look like this:

Subsystem sftp internal-sftp

Add this to the end of the configuration:

Match group sftp-only
 ChrootDirectory %h
 AllowTCPForwarding no
 X11Forwarding no
 ForceCommand internal-sftp

Don’t forget to restart the ssh server:

service sshd restart
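
To lower the risk of locking yourself out, sshd can check the configuration file first: its -t option (test mode) validates the configuration and exits without starting a server. A small sketch that only restarts when the check passes; the function name is made up:

```shell
#!/bin/bash
# Restart sshd only if the configuration passes sshd's own test mode.
restart_sshd_checked() {
    if sshd -t ; then
        service sshd restart
    else
        echo "sshd configuration has errors; not restarting" >&2
        return 1
    fi
}

# Usage (as root): restart_sshd_checked
```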

When you do this, make sure to have a root shell already open on the system so that you don’t get locked out. The user should exist in the password file like so:

test:x:521:521::/home/test:/sbin/nologin

The home directory must be owned by root and must not be writable by any other user or group – and the same applies to every directory in the path leading to it. The shell does not matter, as SSH will take over before the shell is activated; however, if there are other ways to log in with this user aside from SSH, then a restricted shell like /sbin/nologin or /bin/false is necessary.
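
Ownership problems along the path are the usual reason a chrooted login is refused, so it can help to check the whole path at once. A rough sketch, assuming a stat that supports -c (GNU coreutils); the function name is made up:

```shell
#!/bin/bash
# Walk from DIR up to /, checking that every directory is owned by uid 0
# and is not writable by group or others. Prints the first offender.
check_chroot_path() {
    local dir=$1 owner mode
    case $dir in
        /*) ;;
        *)  echo "need an absolute path" >&2 ; return 2 ;;
    esac
    while : ; do
        owner=$(stat -c '%u' "$dir") || return 2
        mode=$(stat -c '%a' "$dir") || return 2
        # 022 masks the group-write and other-write permission bits
        if [ "$owner" -ne 0 ] || [ $(( 0$mode & 022 )) -ne 0 ]; then
            echo "$dir: owner uid $owner, mode $mode" >&2
            return 1
        fi
        [ "$dir" = "/" ] && return 0
        dir=$(dirname "$dir")
    done
}

# Usage: check_chroot_path /home/test
```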

For the specified configuration the user must be a member of the sftp-only group – either primary or secondary, it doesn’t matter. The Match option can also be used against a single user instead.

If you want the user – or anyone else – to be able to write files in this tree, create a subdirectory in the home directory with the appropriate permissions. The user – and others – will not be able to write to the home directory, but will be able to write to the subdirectory.

Don’t forget to test the login first – and don’t forget to have a root shell open before you start changing OpenSSH configuration or you may be unable to SSH into the system!

The Wheel Group: Updated

Working with Ubuntu Server (Lucid Lynx) the wheel group has been changed slightly.

Firstly, there doesn’t seem to be any wheel group at all – not by name. The group is now called root by default, and is enabled the same as before: uncomment the appropriate line in /etc/pam.d/su so it looks like this:

auth required pam_wheel.so

The system uses the root group because that is the group name for group 0, and because there is no group named wheel. However, if you want to maintain the original standard – make the entry look like this instead:

auth required pam_wheel.so group=wheel

Then rename (or duplicate) the group in /etc/group with id 0:

root:x:0:root
wheel:x:0:root

This maintains the highest level of compatibility: the group root remains as before, but the name wheel is also available. Having two groups with the same group ID is not typically recommended, but it doesn’t necessarily break anything either as long as the two groups are seen as completely equivalent. The first group in the list will normally be used when names are given for GIDs, but both names will be recognized when supplied by the user.

According to the documentation, this is overkill – but it does force the issue and make su work with the actual wheel group rather than a renamed one.

What pam_wheel actually does is search for group wheel first, then if it can’t find that, searches for group 0 (zero) next. It is this configuration that allows the renaming of the wheel group.
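
That lookup order is easy to mimic if you want to know which group a given system will end up using. The sketch below is only an imitation of the default behaviour, not pam_wheel's code: it reads a group file directly (so it will not see groups served from LDAP), and the function name is made up.

```shell
#!/bin/bash
# Print the group pam_wheel would use by default: a group named wheel if
# one exists, otherwise the first group with GID 0. Fails if neither exists.
wheel_group_name() {
    local groupfile=${1:-/etc/group}
    awk -F: '
        $1 == "wheel"             { name = $1; exit }
        $3 == 0 && fallback == "" { fallback = $1 }
        END {
            if (name != "")          print name
            else if (fallback != "") print fallback
            else                     exit 1
        }
    ' "$groupfile"
}

# Usage: wheel_group_name            (reads /etc/group)
```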

Apparently Debian or Ubuntu named the group sudo at one point, now root. The best thing to do – when there is no distinct advantage to change – is to go with the status quo: in doing so, any administrator that comes along will be able to learn and adapt to the system rapidly, leading to quicker completion of administration tasks, simple and complex.