Why You Should Learn Perl

Previously I spoke on why one should learn vi (summary: learn vi because it’s on every UNIX and Linux system you’ll install…). Well, why should one learn Perl?

Because it’s on every UNIX and Linux system you’ll install… and on OpenVMS… and available for Windows, too.

Unlike with vi, I’m not as big a fan of Perl as I once was: having been interested in (and a fan of) object-oriented programming (OOP) for years, it took Ruby very little time to dislodge me from my interest in Perl (that would have been just prior to Perl 5).

Yet, this does not matter: Ruby is nice, but not ubiquitous. In particular, making Ruby run on HP-UX has proven to be extremely difficult in recent years – and it is not loaded by default in any case. I don’t know of any UNIX that installs Ruby as part of the base package (or that makes it available at all).

Learning Perl is not as hard as it may seem: since it is ubiquitous, there are many excellent books from which to learn Perl – and excellent references as well:

I have all of these, and find all of them to be useful. In my progression of learning Perl (or relearning it…) I find that Effective Perl Programming is fantastic. Specifically, it presents a series of items (or tips), then shows you how to use and understand each tip in detail. I recommend this book fully.

Don’t neglect Perl, as it is everywhere, unlike any other language (including Korn shell!). If all you write is Korn shell, then your program will be unusable in any environment that does not provide ksh (think Linux, FreeBSD, and OpenVMS for three). It’s true: ksh is not installed on Linux by default (bash is), and FreeBSD uses the C shell. However, all three environments provide Perl.

Using Perl to make big files

A while ago, I talked about making big files. This was, by nature, UNIX-specific – that’s what I deal with all day, and the focus of this blog.

However, not all systems are UNIX (or Linux) – and not all the systems I deal with all day long are, either. Perl, though, is everywhere – and can be used quite easily to generate large files on whatever platform you might be using.

For example, to make a 5M file, try this:

perl -e "open(FD, '>', 'myfile') or die 'cannot open myfile'; print FD 'x' x (1024 * 1024 * 5); close(FD);"
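For much larger files, building the whole string in memory first becomes wasteful. Here is a minimal sketch of the same idea as a script that writes in chunks. The helper name make_big_file, the 64 KB chunk size, and the single-character fill are assumptions chosen for illustration, not anything the one-liner above requires.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: write $size bytes of a single fill character to $path
# in fixed-size chunks, so the whole file never sits in memory at once.
sub make_big_file {
    my ($path, $size, $fill) = @_;
    $fill = 'x' unless defined $fill;
    my $chunk_size = 64 * 1024;            # assumed chunk size; tune as needed

    open(my $fh, '>', $path) or die "Cannot open $path for writing: $!";
    binmode($fh);                          # no line-ending translation on any platform

    my $chunk     = $fill x $chunk_size;
    my $remaining = $size;
    while ($remaining > 0) {
        my $n = $remaining < $chunk_size ? $remaining : $chunk_size;
        print {$fh} substr($chunk, 0, $n) or die "Write to $path failed: $!";
        $remaining -= $n;
    }
    close($fh) or die "Cannot close $path: $!";
    return $size;
}

# Example: a 5 MB file, the same size as the one-liner produces.
make_big_file('myfile', 5 * 1024 * 1024);
```

The three-argument form of open keeps the mode separate from the file name, which avoids surprises if the name ever contains unusual characters.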

If you are inside of vim (a vi-clone which also runs everywhere), try this:

:r !perl -e "print '-' x (1024 * 1024 * 5);"

This gives you a single line (5 megabytes in size). To make multiple lines:

:r !perl -e 'for ($i = 1; $i <= 500; $i++) { print "x" x 39, "\n"; }'

This makes 500 lines of 40 characters each (including the single-character line terminator). If the system line terminator is two characters, then use 38 instead of 39. In total, this gives 20000 characters (about 19.5 kilobytes).
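The arithmetic is easy to get wrong by one, so here is a small sketch that works out the filler width from the terminator length instead of hard-coding 39 or 38. The helper name make_lines and its parameters are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: return $count lines, each exactly $width characters long
# *including* the line terminator $eol ("\n" or "\r\n").
sub make_lines {
    my ($count, $width, $eol) = @_;
    $eol = "\n" unless defined $eol;
    my $body = $width - length($eol);      # filler characters per line
    die "Width $width is too small for the terminator\n" if $body < 0;
    return join '', map { ('x' x $body) . $eol } 1 .. $count;
}

my $text = make_lines(500, 40);            # 500 lines of 39 'x' plus "\n"
print length($text), "\n";                 # 500 * 40 = 20000
```

Passing "\r\n" as the third argument gives 38 filler characters per line and the same 20000-character total.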

Perl is quite useful for creating portable scripts – but it is by no means the only option. The ideas given here carry over to other languages that may be available. For instance, tcl, python, and ruby are also available in other environments, and can do the same things as perl does here.

Of course, perl’s repetition operator ‘x’ makes it particularly easy here.
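For anyone who has not met it before, here is a brief sketch of what the operator does with strings and with lists; the variable names are only for illustration.

```perl
use strict;
use warnings;

my $rule  = '-' x 10;               # string repetition: ten dashes
my @zeros = (0) x 5;                # list repetition: (0, 0, 0, 0, 0)
my $row   = join ',', ('ab') x 3;   # "ab,ab,ab"

print "$rule\n";
print scalar(@zeros), "\n";         # 5 elements
print "$row\n";
```

With a plain string on the left the result is one long string; with a parenthesized list on the left the result is a longer list.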

Update: corrected perl one-liner.

Writing Portable Administration Scripts

Writing portable scripts for UNIX and Linux is fairly easy – Korn shell is everywhere, and ksh scripts work the same and have the same basic tools available (sed, awk, pipes, etc.).

What about writing portable scripts to work on UNIX, Linux, and OpenVMS? UNIX and Linux are similar enough that things will work across the different platforms – the same holds for the BSD platforms and for Windows with the Cygwin utilities.  But radically different platforms such as OpenVMS require a different approach.

The first thing I did in looking at OpenVMS was to search out the languages and utilities that were available.  HP offers a number of open source tools, and has Freeware CDROMs available as well. SAIC has a large OpenVMS archive, including the contents of the HP Freeware CDROMs.

Under OpenVMS, I found these languages available:

  • Java (built-in)
  • GNU awk
  • Perl
  • Perl/tk
  • tcl/tk
  • Python

I doubt that Java would be used for scripting purposes, but it is becoming ubiquitous and if it is well-known by the scripter, it is possible that it could be used.

However, the other (add-on) alternatives seem to be much more likely.  Perl, Python, GNU awk, and tcl have extensive capabilities, and with tk visual displays are possible.  My main choice would probably be Perl.

The next step is to make sure that any coding that is done is truly portable.  Perl, with its extensive documentation, includes documentation specifically for portability (perlport) and for OpenVMS (perlvms) as well.

The Perl portability documentation goes into complete detail about the various points that may trip a programmer up; in short, several of the main points cover:

  • Data representation (big-endian order? little-endian order? line terminator?)
  • File path representation
  • Character sets and encoding (including order)
  • Time and date representation
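A few of these points can be sketched with modules that ship with Perl. This is only an illustration under assumptions: the file name report.txt is invented, and the choice of File::Spec, pack templates, and POSIX strftime is one reasonable approach, not the only one the portability documentation describes.

```perl
use strict;
use warnings;
use Config;                    # build-time configuration, including byte order
use File::Spec;                # portable path handling
use POSIX qw(strftime);

# File paths: let File::Spec pick the separator and syntax for this platform.
# 'report.txt' is a made-up name for the example.
my $path = File::Spec->catfile(File::Spec->tmpdir(), 'report.txt');

# Data representation: pack with an explicit byte order rather than native order.
my $big    = pack('N', 1);     # 32-bit unsigned, big-endian ("network" order)
my $little = pack('V', 1);     # 32-bit unsigned, little-endian

# Time and date: format explicitly instead of relying on a platform default.
my $stamp = strftime('%Y-%m-%d %H:%M:%S', gmtime(0));

print "OS:          ", $^O, "\n";
print "Byte order:  ", $Config{byteorder}, "\n";
print "Path:        ", $path, "\n";
print "Time zero:   ", $stamp, " UTC\n";
```

Nothing here branches on the operating system; the modules absorb the differences, and $^O is printed only for information.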

The best thing to do is to follow the guidelines in the Perl portability document (even if using other languages) and then to test the portable code on all systems affected. Only in extreme circumstances should code be written specifically for the target system and selected based on target OS type.  Better to make it portable at its core.