I’ve been trying to come up with strong passwords since forever, and have failed to find a magic alternative to entropy. Recently, I took Diceware for a roll, but wasn’t entirely happy with passwords like “wn rare swung strop situs slept”—wn isn’t even a word, is it? I also tried the Swedish dictionary but wasn’t much happier.
How about Mandarin Chinese, written using pinyin? There are only around 400 pinyin syllables, but thousands of characters with different meanings, so I guessed that for a random sequence of syllables it should often be possible to come up with a somewhat meaningful phrase.
The kHanyuPinlu property from the Unihan database turned out to be an excellent source for the character-to-syllable mapping, so I wrote syllables.py to reverse that mapping. The output is a list of 392 pinyin syllables with example characters in traditional and simplified Chinese.
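For illustration, here is a minimal sketch of what reversing that mapping could look like. It is not syllables.py itself. It assumes the tab-separated layout of Unihan's readings file (code point, property name, value), with each kHanyuPinlu value being a space-separated list of tone-marked readings followed by a frequency in parentheses. The function names and the frequencies in the sample are made up.

```python
import re
import unicodedata
from collections import defaultdict

# Combining marks for the four pinyin tones: macron, acute, caron, grave.
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def strip_tone(syllable):
    """Remove tone marks but keep other diacritics, such as the dots on ü."""
    decomposed = unicodedata.normalize("NFD", syllable)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)


def invert_readings(lines):
    """Map each toneless syllable to the characters that can be read that way."""
    by_syllable = defaultdict(list)
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        # Skip comments, blank lines, and every property except kHanyuPinlu.
        if len(parts) != 3 or parts[1] != "kHanyuPinlu":
            continue
        char = chr(int(parts[0][2:], 16))  # "U+6700" -> the character itself
        for reading in parts[2].split():
            pinyin = re.sub(r"\(\d+\)$", "", reading)  # drop "(frequency)"
            syllable = strip_tone(pinyin)
            if char not in by_syllable[syllable]:
                by_syllable[syllable].append(char)
    return dict(by_syllable)


# Hypothetical sample lines; the frequencies are invented for the example.
sample = [
    "U+6700\tkHanyuPinlu\tzuì(100)",
    "U+7F6A\tkHanyuPinlu\tzuì(10)",
    "U+516B\tkHanyuPinlu\tbā(50)",
]
for syllable, chars in sorted(invert_readings(sample).items()):
    print(syllable, "".join(chars))
```

Keeping the diaeresis means lü and lu stay distinct syllables; whether the real list makes the same choice is an assumption.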
Unfortunately, 392 is not a power of 6, so using real dice to generate the numbers is a bit complicated, albeit possible. Instead I wrote roll.py, which uses as few bytes as possible from /dev/random to roll a die with an arbitrary number of sides.
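One common way to roll a fair die with an arbitrary number of sides is rejection sampling: read just enough random bits to cover the range, and try again if the value falls outside it. The sketch below shows that idea and is not necessarily how roll.py works. It uses os.urandom as a stand-in for reading /dev/random directly, and the `read_bytes` parameter is a hypothetical hook so the byte source can be swapped.

```python
import os


def roll(sides, read_bytes=os.urandom):
    """Return a uniformly random integer in 1..sides using rejection sampling."""
    if sides < 1:
        raise ValueError("sides must be at least 1")
    bits = (sides - 1).bit_length()  # bits needed to represent 0..sides-1
    if bits == 0:
        return 1  # a one-sided die needs no randomness
    nbytes = (bits + 7) // 8  # smallest whole number of bytes holding those bits
    mask = (1 << bits) - 1
    while True:
        value = int.from_bytes(read_bytes(nbytes), "big") & mask
        if value < sides:
            return value + 1
        # Otherwise the value is out of range: discard it and draw again,
        # which keeps every face equally likely.


print(roll(392))
```

For 392 sides this reads 2 bytes per attempt and keeps 9 bits, so an attempt succeeds with probability 392/512, roughly 77%. A six-syllable phrase would then be six rolls, each used as an index into the syllable list.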
Using my list and my virtual D-392, here are the first 6-syllable pinyin phrases I generated, each with a memorable (?) Chinese phrase and a rough English translation.
- yan kai bo ren se dai—眼開撥任色帶—eyes open, poke any ribbon
- zui ku ba ge mei xu—最酷八個沒序—the coolest eight have no order
- ban zhai dian die keng bao—搬宅殿爹吭抱—moving villa/palace, dad says hold (this)
- you mu sa kang xu su—有母萨扛需速—(things) carried by mother Bodhisattva need speed
A native speaker would probably be able to come up with better phrases, but I think that I could remember any of these, with 最酷八個沒序 being the easiest. If this is a representative sample, I think the scheme works.
How about the entropy? With 392 syllables, each syllable contributes log2(392) ≈ 8.6 bits, so these 6-syllable phrases have about 51.7 bits of entropy, slightly better than a completely random 8-character alphanumeric password (about 47.6 bits with upper and lower case letters plus digits). English Diceware has 12.9 bits of entropy per word, so to get about as much entropy as a 6-word English phrase (77.5 bits), a 9-syllable Chinese phrase is needed. The average word and syllable lengths are 4.2 and 3.2 characters respectively, so the average phrase lengths (including spaces) would be 30.2 characters for English and 36.8 for Chinese. (Removing spaces blindly will lose some entropy if the pinyin becomes ambiguous.)
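The arithmetic is easy to check with a few lines of Python. This is only a sketch: the helper names are made up, the Diceware list is taken to have 6^5 = 7776 words, and "alphanumeric" is taken to mean 62 symbols.

```python
import math


def phrase_bits(vocabulary_size, units):
    """Entropy in bits of `units` independent uniform picks from a vocabulary."""
    return units * math.log2(vocabulary_size)


def phrase_chars(units, average_unit_length):
    """Average phrase length in characters, with single spaces between units."""
    return units * average_unit_length + (units - 1)


print(f"{phrase_bits(392, 6):.1f}")     # 6 pinyin syllables   -> 51.7
print(f"{phrase_bits(62, 8):.1f}")      # 8 alphanumeric chars -> 47.6
print(f"{phrase_bits(6 ** 5, 6):.1f}")  # 6 Diceware words     -> 77.5
print(f"{phrase_bits(392, 9):.1f}")     # 9 pinyin syllables   -> 77.5
print(f"{phrase_chars(6, 4.2):.1f}")    # English phrase length -> 30.2
print(f"{phrase_chars(9, 3.2):.1f}")    # pinyin phrase length  -> 36.8
```

Strictly speaking, nine syllables come out about 0.02 bits short of six Diceware words, which is close enough to call it a tie.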
Feel free to use/improve my lists and scripts, and never forget: the coolest eight have no order!