I Am Not A Serial Killer

I just finished the John Cleaver series by Dan Wells: I Am Not A Serial Killer, Mr. Monster, and I Don’t Want To Kill You. They’re supposedly horror, but they lack the thing that normally makes me dislike horror books: point-of-view characters that just react to events, and spend half the book being afraid. All the trappings of horror are present—a contemporary, run-down environment, dead bodies, blood and gore, and a scary (possibly supernatural) killer closing in on you—but it feels like fantasy. The protagonist is actively trying to achieve things, and you soon realize that this isn’t the kind of story where he’ll ultimately fail.

But it is the kind of story where “protagonist” is at times a worryingly inaccurate synonym for “hero”. See the book titles.

Highly recommended.

How much entropy in that password?

On his Security Now podcast, Steve Gibson recently disagreed (transcript; search for “xkcd”) with the way Randall Munroe calculated the number of bits of entropy of passwords in this excellent XKCD comic:

XKCD: Password Strength

The disagreement boils down to Randall giving a lower bound on the entropy, and Steve giving an upper bound (and not realizing—or neglecting to mention—the difference). Randall is right, and Steve is wrong. Let me explain:

Lower bound

In both passwords featuring in the comic, Randall assumes a particular password template—in effect, an automatic procedure where you input some number of random bits and get a password. An example, similar to (but simpler than) the first of the comic’s password templates: Pick a six-letter dictionary word (in lowercase), capitalize one of the letters, and stick a number at the end. The number of possible passwords generated by this template is the number of words in the dictionary times six times ten; the number of bits of entropy is the base-2 logarithm of this number.

This is a lower bound of the entropy in the passwords generated by this template, because we assume that the adversary knows everything—in particular, knows which template the password was made from—except the random numbers we used to pick the particular dictionary word, the letter to capitalize, and the last character. Any particular adversary may know less, but no attacker will know more. The lower bound on the entropy describes how many guesses even a maximally well-informed adversary must make.

Of course, there are caveats: In particular, where the template asks for randomness (such as in “pick a dictionary word”), you are expected to use high-quality randomness such as produced by coin flips, dice, or good computer programs. Just trying to think of a random word won’t cut it; you’ll be more likely to think of some words than others, and thus get less entropy than you thought.

Upper bound

An upper bound on the entropy of a given password can be obtained by finding a simple template compatible with the password (i.e., such that the password could have been generated from that template) and then counting how many bits of randomness the template uses to produce the password. For example, given the password “zoMbie8”, one candidate template is “pick seven characters that are either lower- or uppercase letters or numbers”; this gives an upper bound of (26+26+10)^7 = 3,521,614,606,208 possible passwords, or about 42 bits.

Another candidate template is “pick a six-letter dictionary word (in lowercase), capitalize any one letter, and add a digit at the end”. The dictionary on my computer has 9300 six-letter words, times six possible positions for the capitalized letter, times ten possible digits at the end, gives 558,000 possible passwords, or about 19 bits.

Now, we can all agree that 19 bits is quite a bit less than 42—meaning that the former is a tighter bound. But how low can we go? Given a specific enough template, the entropy becomes arbitrarily small; consider e.g. the template “take the word ‘zombie8’ and choose at random whether to capitalize the third letter”, which has just two possible passwords, for a single bit of entropy. How can we meaningfully speak of an “upper bound”, if we can make it as small as we want? (Specifically, consider the case where the password was in fact generated by the second template, and the lower bound is 19 bits. If our upper bound can be lower than the lower bound, we have a problem.)

The catch is the provision that the template be “simple”, as I sneakily specified in the first sentence describing the upper bound. The precise meaning of “simple” is “describable with few bits”, because strictly speaking, the upper bound is the sum of the number of bits needed to describe the template and the number of random bits used by the template. (This what the parenthesis in the first panel of the comic is talking about.) As long as the latter is much larger than the former we don’t make much of an error by omitting it, but as we make the template more and more specific, the error grows.

A useful rule of thumb for evaluating the simpleness of a template is to consider how many other templates just like it there are. The first template, with seven random characters, has very few variants (you could choose to include other characters than letters and digits or change the length, but that’s about it). The second template has more, but still not many (you could e.g. change the number of capitalized characters, and change the position of the digit). The third template obviously has a great many very similar variants—just choose another word or another digit!

So, which one of them is the password strength?

Which number do you go by when choosing a password? A lower bound, which describes how strong your password is guaranteed to be, or an upper bound, which describes how strong it isn’t?

Which number is more useful printed on elevators? The number of people they’re guaranteed to hold, or the number of people guaranteed to be enough to make the cable snap?

Password strength meters

So, the lower bound on the entropy is the useful number when creating a pasword. But which number do you get when testing your password in a password strength meter (such as this one)? An upper bound! There’s a simple reason for this: when calculating the lower bound, you need to know what template was used; just seeing one particular password produced by the template isn’t enough.

The password strength meter tries a number of templates. If one of them is close enough to what was actually used, its entropy estimate can be quite good; but if not, it can be way, way off (and it’s always an overestimate, never an underestimate).

Password strength meters are useful for identifying weak passwords, but they can’t guarantee that a given password isn’t weak. That can only be done by generating the password from a template that uses enough random bits.

And that’s why Randall is right and Steve is wrong—Randall gets his numbers from the amount of random bits poured into the password template to generate the password (lower bounds), while Steve gets his numbers by feeding Randall’s passwords to a password strength meter (upper bounds). It’s not that Steve isn’t doing his math right, it’s that he isn’t doing the right math.

How hard is it to crack?

The whole point of a password is that it shouldn’t be guessable, so in a very real sense, the strength of a password is defined by how hard it is to crack.

Password cracking programs work essentially the same way as password strength meters, but backwards: using a succession of templates, they try every combination of random bits that can be fed to that template and see if the generated password is the right one. They start by using slightly complex templates (such as ones based on a random dictionary word), and if those fail, fall back to simpler templates (based on trying every possible password of some bounded length). If the template actually used to generate the password is close to one of the templates the cracker tries, the number of tries needed won’t be all that much more than what’s guaranteed by the lower bound on the password’s entropy. But if the actual template isn’t close to any of the ones the cracker tries, it needs many more tries.

Consider for example the password “zoMbie8”; if the cracker tries templates based on a single dictionary word with a few simple tweaks, it should be able to find it in only millions of tries, but if it has to resort to trying raw combinations of letters and digits, the tries will number in the millions of millions.

This means that there are two ways to make a secure password: use a template the password crackers don’t know about (or don’t bother to try, because so few people use it for their passwords), or use any old template and feed it with enough random bits. The former strategy relies on outwitting smart people who spend much of their time coming up with better ways to crack passwords; the latter just takes more coin flips. It’s security by obscurity vs. real security.

(Excercise for the reader: What’s the problem with Steve’s Password Haystack scheme?)

Postscript: The actual entropy

I’ve talked a lot about lower and upper bounds on a password’s entropy, so you might be wondering the obvious: What’s the actual entropy, and why aren’t I talking about it?

The entropy of a string of characters (which is just what a password is) is closely related to its Kolmogorov complexity—in a sentence, the entropy of your password is equal to the length of the shortest computer program that generates it. However, this definition is useless to us, since it can be shown that Kolmogorov complexity isn’t computable.

A password template, being essentially just a string, has its entropy computed the same way, which is why I was just waving my hands when discussing it earlier. Approximating it by zero as we do gives lower bounds that are lower than strictly necessary, and upper bounds that are lower than they should be and thus run the risk of not being correct (but nothing much in the analysis depends on them being correct).

Bitcoin: Trust can shorten the transaction delay

I’ve touched on this before, but it just occurred to me again: Normally, the receiver of a Bitcoin payment can’t trust that everything’s OK just because she sees the transaction (which is signed with the sender’s private key); until she sees the transaction becoming part of the main block chain, there is always the possibility that the sender is cheating by double-spending her coins.

However, since only the account owner (who has the private key) can create transactions that withdraw coins from an account, it’s possible to build up trust over time: if Alice has made several payments to Bob in the past, and they all worked out, Bob might start to acknowledge Alice’s transactions before the block chain proves that she didn’t cheat, cutting the delay to almost zero. This would make Bitcoin practical for a number of applications where a half-hour transaction verification time is unacceptable, such as micropayments in peer-to-peer networks.

Using a web of trust or similar, Bob could even start trusting Alice without having to have a personal history with her—she just needs to have a history of non-cheating with someone he already trusts.

Interestingly, if Alice ever does try to double-spend, the conflicting transactions she produces will (when seen together) provide absolute proof of this, so a trust protocol would probably be complemented by a protocol for gathering these proofs of double-spending. Since these proofs don’t rely on trust, they will be effective enough that no account can be used to attempt double-spending more than once.

Drink without a name

I invented a new drink today, and the result was good enough that I thought I’d share the recipe:

For six drinks
4 dl coconut milk
1 lime (juice and zest)
3 tbsp sugar
6 dl grapefruit soda
ice

  1. Pour the coconut milk into a small bowl. Add lime zest, lime juice, and sugar. (I used brown sugar, but I doubt it matters much.)
  2. Stir vigorously.
  3. Split the mix between six glasses.
  4. Add the ice.
  5. Add the grapefruit soda.

I haven’t tried it, but I guess light rum or vodka might be the sort of thing you could add to this drink if you wanted an alcoholic variant.

Since this drink only has like three ingredients, I figure there’s virtually no chance I’m the first to invent it. If anyone knows one or more names for it, I’d be interested to hear it.

Making your program testable

Ian Lance Taylor has a nice post about DejaGNU, the test harness used for GCC and GDB. One of his main points—the one I found the most interesting—is that it was a mistake to try and test the actual user interaction with GDB, with a test script that essentially pretends to be a human typing commands and reading GDB’s responses. What should have been done instead is to create a programmatic interface. This would have been much easier to write test scripts for; in particular, parsing the replies would have been much easier and much less fragile. (Naturally, it’d still be prudent to have some tests for the user interaction, but for testing the bulk of GDB’s functionality a programmatic interface would be superior.)

This perfectly matches my experience writing tests for Simics, where we have both programmatic and interactive interfaces; testing the former is such a pleasure compared to testing the latter.

I guess the leasson to learn here is that testing gets so much easier if you have a programmatic interface to write your tests against that it’s even worth it to create such an interface if it didn’t already exist, just to be able to use it in tests. That might involve serious redesigning, but testing is that important!

Bitcoin and peer-to-peer networks

Recently, it occurred to me that Bitcoin could be used for nearly overhead-free, arbitrarily small payments between clients in a peer-to-peer network.

The thing that made BitTorrent succeed where other peer-to-peer filesharing protocols failed was the fact that receivers could pay senders for data chunks they wanted by offering to send them other data chunks in return (“I’ve got A and C, and want B and D; you’ve got B and D, and want A and C; so let’s swap”). This pretty much solved the problem of freeloading that other protocols suffered from.

The way BitTorrent does this is clever: when n receivers want to download a file, the node that seeds the file gives each of them one nth of the file, and tells them to swap chunks among each other until they all have all n chunks. If you want all the chunks of the file, you have to give chunks to the other receivers.

However, the node that seeds the file does so without getting anything in return. And in most peer-to-peer networks (not just filesharing networks), that’s the way it is for all nodes: it works only so long as enough nodes aren’t selfish. But Bitcoin could change that, by enabling nodes to simply charge for services: sending data chunks, storing data chunks (such as for peer-to-peer backup), internet access (your neighbor’s wireless access point), and so on.

Many of these uses would of course be prohibited: your ISP typically doesn’t allow you to resell your bandwidth, and pirating movies is even more illegal if you make money off of it. But that hasn’t stopped peer-to-peer network innovation in the past, and a lot of use cases would be perfectly legal.

One technical challenge is that Bitcoin payments take many minutes to be sufficiently verified, so protocols can’t rely on immediate payments. But that’s just the same situation we have in the real world, with invoices and credit cards (and I hear that they still use cheques in some countries); it involves keeping track of who’s been paying their bills on time in the past, requiring advance payments from people you don’t know, and so on.

Frankenstein

I just finished reading Frankenstein by Mary Shelley—the classic tale of a man who gives life to a monster. I guess it’s the prototypical mad scientist story.

Those two sentences pretty much sum up my total knowledge of the book prior to reading it, so I had a few surprises lined up: (Warning: mild spoilers)

  • Victor Frankenstein, the creator of the monster, isn’t your typical mad scientist. He’s very social, rather young, mentally balanced, and generally happy.
  • The monster is initially a monster only by way of his hideous appearence; he is described as being very intelligent, kind, and patient, and only becomes a true monster because he can’t stand that everyone he meets is frightened and disgusted by the way he looks. Victor Frankenstein himself starts off by running away from the newly created monster, and doesn’t even consider the question of his mental qualities until years later, when it’s far too late.
  • The book intentionally glosses over the entire process of the monster’s creation. It mentions body parts from dead people—I think—but even electricity isn’t mentioned. This is because Frankenstein is telling his story to a man who rescued him from a sheet of ice, and he doesn’t want to risk anyone following in his scientific footsteps.

I get the sense that the book’s argument is that Frankenstein was foolish to try to play God, and was made to suffer for it—at least that’s what he himself seems to believe. But it seems to me that the only real problem was that he completely abandoned a newborn being instead of taking care of him; as I said, the monster was initially very benign. In other words, the problem isn’t that Frankenstein tries to play God and fails, it’s that he fails to even try being a parent.

As a science fiction-minded reader, it also bothered me that the scientist Frankenstein never even considered a technological solution to his problems. He developed the process of creating a mentally perfect but physically flawed sentient being in just a few years—who’s to say what a few years further research could’ve accomplished?

Nevertheless, I enjoyed reading the book, but I did feel much more sorry for the monster than for Victor Frankenstein.

Follow

Get every new post delivered to your Inbox.