A Map of the IPv4 Routing Space

The nixCraft blog has an article about a very interesting map of the IPv4 address space from the Measurement Factory as part of the RouteViews project. The most recent map looks like this:

Map of the IPv4 Address Space

The map is very detailed (the above image is only a small version) and is available from CafePress as a full-size poster.

The map shows the IPv4 space as understood by the BGP routing protocol. The black areas are unroutable (at least from the Measurement Factory’s location at the University of Oregon) and the gray areas are either reserved or unallocated space.

Blogging and the law

Turns out there are a lot of things to watch out for!

I recently read this blog post on Steve Tobak’s blog “Train Wreck” over at CNet. Turns out there is a lot of legal liability (ouch) that can arise from posting. A most interesting source of information is the Electronic Frontier Foundation’s Bloggers page (with a Legal Handbook to boot).

Reporters Without Borders has a handbook, too: Handbook for Bloggers and Cyber-Dissidents. (Their entry page has links to versions in other languages as well; the organization itself is French.)

If you value digital rights, I’d recommend a donation or two to the Electronic Frontier Foundation.

Quickly creating large files

I’m surprised how many people never think to do this, but it makes life quite easy.

If you need a large text file, perhaps with thousands of lines (or even bigger), just use doubling to your advantage! For example, create 10 lines. Then use vi (or another editor) to copy the entire file and append it to itself – now 20 lines. If you remember how a geometric progression goes, you’ll have your thousands of lines rather fast:

  1. 10 lines…
  2. 20 lines…
  3. 40 lines…
  4. 80 lines…
  5. 160 lines…
  6. 320 lines…
  7. 640 lines…
  8. 1280 lines…
  9. 2560 lines…
  10. 5120 lines…
  11. 10240 lines…

Eleven steps and we’re at 10,000+ lines. In the right editor (vi, emacs, etc.) this could be a macro for even faster doubling. The same doubling can also be done at the command line:

cat file.txt file.txt > file.tmp && mv file.tmp file.txt

Combined with shell history, repeating that command doubles the file each time. (Appending a file directly to itself with cat file.txt >> file.txt is a trap: GNU cat refuses with an “input file is output file” error, and other versions can loop until the disk fills.) Using an editor would still be more efficient, with fewer disk reads and writes.
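As a rough sketch (assuming a POSIX shell; the temporary name file.tmp is just an illustration), the whole recipe can be scripted, seeding ten lines and then doubling ten times:

for i in 1 2 3 4 5 6 7 8 9 10; do echo "line $i"; done > file.txt
n=0
while [ $n -lt 10 ]; do
    cat file.txt file.txt > file.tmp && mv file.tmp file.txt
    n=$((n + 1))
done
wc -l file.txt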

When writing code, often programmers will want to set things off with a line of asterisks, hash marks, dashes, or equals signs. Since I use vi, I like to type in five characters, then copy those five into 10, then copy those 10 and place the result three times. There you have 40 characters just like that.
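In vi, a rough sketch of those keystrokes looks like this (the exact motions may vary with your editor and settings):

i*****<Esc>     (insert five asterisks)
0y$ $p          (yank the five characters, paste them at the end: 10)
0y$ $3p         (yank the ten, paste three more copies: 40)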

If a file of a specific size is needed, use dd. Reading from /dev/urandom keeps it quick (/dev/random can block while waiting for entropy):

dd if=/dev/urandom of=myfile.dat bs=1024 count=10

With bs=1024, each block counted is one kilobyte, so the example creates a 10K file. Using shell arithmetic (available in the Korn shell, bash, and other POSIX shells), the same command scales to megabytes:

dd if=/dev/urandom of=myfile.dat bs=$(( 1024 * 1024 )) count=100

This command will create a 100M file (since bs=1048576 and count=100).

If you want files filled with nulls, just substitute /dev/zero for /dev/urandom in the previous commands (/dev/null would give you an empty file, since reading it returns end-of-file immediately).
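For example, this creates the same 100M file, but filled with nulls:

dd if=/dev/zero of=myfile.dat bs=$(( 1024 * 1024 )) count=100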

You could use a word dictionary for words, or a Markov chain for pseudo-prose. In any case, if you only want a certain size, do this:

~/bin/datasource | dd of=myfile.dat bs=$(( 1024 * 1024 )) count=100

This will give you a 100M file of whatever your data source is pumping out; dd reads standard input by default, so no if= is needed. Note that when dd reads from a pipe it may get short blocks and the result can come up a little small; with GNU dd, adding iflag=fullblock guarantees the full size.
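As a concrete (if silly) stand-in for a data source, yes(1) works fine; this sketch assumes GNU dd for the iflag option:

yes "some sample text" | dd of=myfile.dat bs=$(( 1024 * 1024 )) count=100 iflag=fullblock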

OpenSolaris on a MacBook

OpenSolaris is very interesting, and since the introduction of dtrace and ZFS it has enthralled many. I tried to install it on my HP Compaq E300 laptop (for which it turned out to be unsuitable), and then on an HP Compaq 6910p laptop. There the networking was unsupported: neither the ethernet nor the wireless driver was included with OpenSolaris Express (Developer Edition).

In any case, I expect I might just be shopping for a laptop in the next year – and it’s nice to see that OpenSolaris does run on the Apple MacBook.  This article goes into detail about how the writer got it to work, and each of the steps that were taken to make it happen.  Paul Mitchell from Sun discusses dual-partitioning a MacBook in this context as well.  Alan Perry (also from Sun) had done the same thing with a Mac Mini, and Paul extended it to the MacBook.  Both entries are detailed and have to do with MacOS X and Solaris dual-booting.

On a different note, check out the graph of library calls from dtrace in this article.  From what I’ve heard of dtrace, it’s the ultimate when it comes to debugging…

New header yeah!

We’ll see how this looks.  I changed the header – now it is all picture, and I put it together myself using the Gimp and a public domain photo.  I may yet change the text (it just doesn’t seem smooth here…).

Open Source Math

This recent post talks about a paper from the American Mathematical Society arguing in favor of using more open source software in mathematics. Traditionally, academia has been open; indeed, the Internet itself (in part), and even Usenet, were created to let researchers share information back and forth.

If today’s ideas about patents and profiting from ideas had prevailed then, none of these technologies would have been created. It was scientists (including computer scientists, mathematicians, and others) who shared their research, with researchers at one university exchanging freely with researchers at others.

In mathematics, we have proprietary software like MATLAB, Mathematica, S-PLUS, and others.  However, there are suitable replacements for almost everything: there is R (instead of S) and GNU Octave (instead of MATLAB), for example.

The author of the previously mentioned post also mentions SciLab, one I’ve not heard of before.

As a system administrator, I can gather statistical data from various sources (uptime, disk usage, trends, etc.) and plot it with some of these tools.  I’ve seen articles about using R to do just that.
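As a small sketch of the gathering side (the log path is just an illustration, and the Use% column position can vary by platform), something like this in a daily cron job builds up data that R or Octave could plot later:

df -k / | awk -v d="$(date +%Y-%m-%d)" 'NR==2 {print d, $5}' >> "$HOME/disk-usage.log"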

Otherwise, encouraging others to use open source is always (in my mind) a Good Thing.

Tips on using the UNIX find command

When I first started using find, it took a while before I could use it regularly without looking things up.  For a smart introduction, this article from the Debian/Ubuntu Tips and Tricks site is good.  The GNU project has all of their manuals on the web, including the GNU find manual.

There is much more to the find command than just these introductory topics, however.  First, let us consider the tricks and traps of the find command:

  • The original find command required the -print option or nothing was printed at all.  Today, the GNU find does not require -print, and most other find commands seem to have followed suit.
  • Using the -exec option to find is less efficient than using the xargs command; in the Sun Manager’s mailing list there was a nice summary from Steve Nelson of this contrast.
  • Watch out for filenames with spaces and other things; the GNU find contains a -print0 option (and GNU xargs has a -0 option to match) just for this reason.  These options use an ASCII NUL to separate filenames.
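For example (a sketch assuming GNU find and GNU xargs; the directory and pattern are just placeholders), this removes matching files safely even when their names contain spaces:

find /var/tmp -name "*.bak" -type f -print0 | xargs -0 rm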

Some tips for using find:

Multiple options can be placed in sequence with AND and OR boolean operators (and parentheses). For example, to find all files with “house” in the name that were modified within the last two days and are larger than 10K, try this:

find . -name "*house*" -size +10240c -mtime -2

This is where some of the power of find can be seen.
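To see the boolean operators and parentheses in action (a hypothetical example; note the backslash-quoted parentheses and -o for OR), this finds .log or .out files not modified in more than 30 days:

find . \( -name "*.log" -o -name "*.out" \) -mtime +30 -print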

Use all appropriate options.  The more you can narrow down the selection, the less you have to look at.  For example, the -type and -xdev options can be quite useful.  The -type option selects files based on their type, and -xdev prevents the scan from wandering onto another disk volume (it refuses to cross mount points).  Thus, you can look for all directories on the current disk from a starting point like this:

find /var/tmp -xdev -type d -print

Get to know all of find’s options.

Use xargs instead of -exec.  find will spawn a new process for each file handled by -exec (though GNU find’s -exec ... + form batches arguments much like xargs does).  xargs loads a single binary into memory, parcels out its standard input (one item per line) into sets of command arguments, and runs the binary as many times as necessary.

For example, an -exec of rm would spawn a process for each file: load the rm binary, run it on that one file, and release the process memory, over and over.  With xargs, rm is started with as many arguments as will fit on a command line, read from standard input; if arguments remain, xargs repeats the process.
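Side by side (the directory and pattern are placeholders; the xargs form assumes no whitespace in the names, otherwise use -print0 and -0 as above):

find /var/tmp -name "*.tmp" -exec rm {} \;
find /var/tmp -name "*.tmp" -print | xargs rm

The first runs rm once per file; the second batches many names into each rm invocation.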

Don’t use find / .  Running find across a huge number of files can slow the system down drastically, and typically an administrator does it just to locate one file somewhere on the disk.  Better to run this overnight instead:

find / -print > /.masterfile

Then the /.masterfile can be searched using grep instead of tying the system up with lots of disk I/O during the day when users are counting on excellent system performance.
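Searching the list later is then just a grep (the file name being searched for is hypothetical):

grep somefile.conf /.masterfile

The overnight run itself fits naturally into root’s crontab; for example, this entry rebuilds the list at 2 a.m. each night:

0 2 * * * find / -print > /.masterfile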

Remember to quote special characters.  In particular, any patterns and the left and right parentheses should be quoted.  Typically, patterns go into double quotes, and the parentheses are each escaped with a backslash.

Be wary of extensions to POSIX.1 find.  It’s not that they are bad, but rather that you cannot count on them being present.  Unfortunately, some of the most useful options fall into this category – but as long as you are aware of them, they can be used appropriately.  Some options in this category are:

  • -print0
  • -maxdepth
  • -mindepth
  • -iname
  • -ls

In particular, the -print0 is the most useful of the lot.

The BSD man page also brings up an interesting point about find and find options:

Historically, the -d, -L and -x options were implemented using the primaries -depth, -follow, and -xdev. These primaries always evaluated to true. As they were really global variables that took effect before the traversal began, some legal expressions could have unexpected results. An example is the expression -print -o -depth. As -print always evaluates to true, the standard order of evaluation implies that -depth would never be evaluated. This is not the case.

This has been a source of confusion in the past; considering them as global options (and placing them first) will provide some relief. Note that the -d, -L and -x options are likely BSD-specific.
