Duplicate Passwords? Just Say No!

With the recent server compromise at Automattic (the fine people behind wordpress.com) and the compromise of the commenter accounts at Lifehacker not that long ago, I’ve had to change a lot of passwords. Using Lastpass makes it much easier; I can only imagine how hard it would be without it.

The nicest thing about having Lastpass store accounts is that you don’t have to try and remember all the places you used a particular password. The Lastpass audit feature scans your list of stored passwords and determines which web sites share the same password, so you can change them all – one by one, of course.
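To make the idea of a duplicate-password audit concrete, here is a minimal sketch in Python. It is not how Lastpass implements its audit; the function name `find_reused_passwords` and the sample entries are made up for illustration, and it assumes you already have your accounts as plain (site, password) pairs.

```python
from collections import defaultdict


def find_reused_passwords(entries):
    """Group site names by password and keep only groups with more than one site.

    `entries` is an iterable of (site, password) pairs. This is a hypothetical
    helper for illustration, not part of any Lastpass API.
    """
    sites_by_password = defaultdict(list)
    for site, password in entries:
        sites_by_password[password].append(site)
    # A password used on two or more sites is a duplicate worth changing.
    return [sorted(sites) for sites in sites_by_password.values() if len(sites) > 1]


# Made-up sample data; these are not real credentials.
entries = [
    ("wordpress.com", "hunter2"),
    ("lifehacker.com", "hunter2"),
    ("msn.com", "s3cret!"),
    ("yahoo.com", "T7#kq9Lp"),
]
reused = find_reused_passwords(entries)
print(reused)  # [['lifehacker.com', 'wordpress.com']]
```

Each inner list is a set of sites that need a new password, which is exactly the one-by-one to-do list described above.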

One other nice thing about Lastpass is their Android application – it provides all of the nice capabilities of their browser extensions but on the small Android platform and for all applications as well.

With Lastpass you can have a different random password for all of your web service accounts – and don’t even have to remember them. Just remember your Lastpass account password and you are good to go.
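For the curious, generating a random password is not much code. The sketch below uses Python’s standard `secrets` module; the `random_password` name, the 20-character default and the choice of alphabet are my own assumptions, not what Lastpass does internally.

```python
import secrets
import string

# Assumed alphabet: letters, digits and punctuation. Some sites reject
# certain symbols, so this may need trimming in practice.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def random_password(length=20):
    """Return a random password built from a cryptographically secure source."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


password = random_password()
print(len(password))  # 20
```

The point is that nobody has to remember the result; the password manager does that part.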

Without Lastpass… I dread the thought of having to remember all the web services, much less try to change all the passwords to go with them.

Side note to web service designers: Could you make it just a little easier to change passwords? Please? Several services had the change password selection hidden quite well – strangely, msn.com was one of these! Yahoo.com wasn’t any better – and Yahoo has the most exasperating password setup I’ve ever seen – requesting your password almost every two minutes. There simply must be a better way.

A Single Character Causes Downtime for… WordPress.com!

Last Thursday, an error in the wordpress.com software caused some user settings to be overwritten, which resulted in loss of settings for some customers. The site was taken down for checks, and an hour later, 99% of users were back online.

The cause of the error? A coding error of a single character. Certainly checks and balances are needed, but according to Matt Mullenweg, founder of WordPress.com, they are already using reviews and testing.

It was less than a month ago that Toni Schneider, CEO of Automattic, wrote in glowing terms about the use of “continuous deployment” at wordpress.com. Is this event going to lead to the death of “continuous deployment” at WordPress? I suspect not.

In fact, Paul Graham described in a paper how he used Lisp for Viaweb in just this fashion. Viaweb was bought by Yahoo! and became the Yahoo Store. Viaweb was shipping fully implemented features this way before the practice had even become mainstream.

Let this WordPress.com downtime be a lesson as to what a single character can do, and also a lesson in how none of us are immune from such mistakes.

Downtime Reports: How Did They Respond?

I like reading downtime reports, because they show what can happen and how people and departments respond to the crisis. There were two sites that experienced downtime over the weekend – one very well known and one not.

WordPress.com went down over the weekend, disrupting thousands of blogs, including VIP subscribers. According to the report, the data hosting company had an unscheduled change take place in a router, resulting in wordpress.com responding to only a fraction of the requests coming in. This meant that wordpress.com was not down, just inaccessible to 90% of incoming traffic. The failover mechanism was not activated, presumably because the host was not down: the server itself was running fine, but its ability to serve up web pages was hampered.

This suggests the following improvement areas (speaking overall):

  • Use some sort of change control – and test changes when made. This unscheduled change very likely did not just affect wordpress.com, but perhaps many others.
  • Monitor not just the server, but paths into the server – everything between the customer and the server.
  • Failover mechanisms should be sensitive to not just server performance, but anything that affects the presenting of web pages to the public (or whatever service is being offered).
  • Relying on a single hosting provider (at one time) means that any problems that arise at that hosting provider affect your service in its entirety; relying on multiple providers in a cluster configuration means that if one hosting provider drops, your service continues (though degraded slightly).
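The second and third points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of how wordpress.com or its host does monitoring: `probe` and `should_fail_over` are hypothetical names, the 50% threshold is arbitrary, and the probes are assumed to run from several locations outside the hosting provider so that they travel the same paths customers do.

```python
import http.client
import urllib.request


def probe(url, timeout=5.0):
    """Return True only if the page is actually served (HTTP 2xx) in time.

    Run this from outside the data center: a check made on the server
    itself would have reported "fine" during a router problem.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, http.client.HTTPException):
        # Covers DNS failures, refused connections, timeouts and HTTP errors.
        return False


def should_fail_over(probe_results, min_success_ratio=0.5):
    """Decide on failover from what outside probes saw, not from server health.

    `probe_results` is a list of booleans, one per external probe.
    """
    if not probe_results:
        raise ValueError("no probe results to judge from")
    success_ratio = sum(probe_results) / len(probe_results)
    return success_ratio < min_success_ratio


# Roughly the situation described above: about 1 request in 10 gets through.
results = [True] + [False] * 9
print(should_fail_over(results))  # True
```

With a check like this, a server that is up but unreachable for most visitors counts as a failure, which is the case the failover mechanism missed.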

The other site that went down was jdorganizer.com (the web site for Jeri Dansky: Professional Organizer). Since she used to be a system administrator before being a professional organizer, she knows IT. As a user, she had to respond to the outage she experienced (again caused by the data hosting provider).

Jeri explains on her blog what happened, and how she responded as a user of services. She lists the things she learned from the experience, in particular preparing a disaster plan and reviewing it.

Another thing she did was to switch providers when she no longer trusted hers to provide reliable services; being of a technical bent, she was able to make the switch and configure things reasonably easily. She had someone check availability and fixed the problems that arose.

Both of these experiences provide a window into how companies and other users of hosting services can respond when things fail. In both of these cases, the providers failed; the response from the users of the hosting provider services can help us learn what to do if and when it happens to us.

Kudos to the WordPress.com team for keeping the blogs running, and kudos to both for being willing to tell us what happened (in delightfully complete technical detail…).