Combination Counting - ligos/readablepassphrasegenerator GitHub Wiki
The passphrase generator can give a (reasonably) accurate count of how many combinations of words are possible, based on the dictionary and phrase strength selected.
It reports this using three numbers: the minimum, maximum and weighted average combinations.
You can see these numbers on the options screen, when you run the console application, or by calling the PassphraseGenerator.CalculateCombinations() method of the API.
C:\Users\Murray\PassphraseGenerator.exe -s strong
Readable Passphrase Generator
Generating 1 phrase(s) of strength 'Strong'...
Dictionary contains 10,482 words (loaded in 356.58ms)
Average combinations ~9.612E+017 (~59.74 bits)
Total combinations 4.901E+013 - 9.090E+019 (45.48 - 66.30 bits)
...
Obviously, the numbers you actually see will be different depending on what dictionary you're using.
Simply put, the minimum and maximum represent the worst and best cases respectively (based on optional components in the phrase), and the weighted average is the "middle ground" showing an approximate average case over lots of passphrases.
They are calculated based on:
- The number of words in the dictionary.
- The components of the passphrase (Normal has fewer components than Strong).
- The components which are optional vs required.
- The likelihood of optional vs required components being chosen.
Each number reports:
- Minimum assumes no optional components (ie: the least complex phrase).
- Maximum assumes all optional components (ie: the most complex phrase).
- Average weights optional and required components by their relative likelihood.
Versions of the generator before 0.9.0 would always report the maximum number as the combinations, and give no minimum or average. But that doesn't accurately reflect the true number of combinations. And knowing the worst case is arguably more useful than the best case.
The Random phrase strength is different to all the rest, obviously. Random, RandomShort and so on choose from a list of the base phrase strengths at random. They result in a strange amalgamation of phrases from many strengths, from the simplest (just 2 words) to the insanely complex (according to the statistics I gathered, 21 words and 128 characters is the longest).
So, when it calculates combinations for Random, it works like this:
- Min: is the smallest minimum count for any phrase strength. Effectively, this is the minimum for Normal.
- Max: is the sum of all maximum counts for all phrase strengths. Technically, this is wrong, because we should multiply them together rather than adding. But there's so much overlap between different strengths, and I know people won't choose a 15 word phrase as often as a 6 letter one, that I just add to be conservative.
- Avg: is the average of all average bit counts for all phrase strengths. This means it's working from the log base 2 numbers rather than the raw combinations. Averaging the logs means we don't end up with quite as high a weighted average. This number is very approximate, and isn't counting combinations in the same sense as all the others, but it is a simple (if inaccurate) way to represent the middle ground.
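The three rules above can be sketched in Python as follows. This is only an illustration, not the generator's code: the function name and the (min, max, avg) input shape are assumptions, and the sample figures are the Strong, StrongEqual and StrongRequired numbers quoted further down this page, not the real list of strengths behind Random.

```python
from math import log2

def combine_random(strengths):
    """Combine per-strength (min, max, avg) combination counts.

    Hypothetical helper: mirrors the Min / Max / Avg rules described above.
    """
    mins, maxs, avgs = zip(*strengths)
    lowest = min(mins)                                  # smallest minimum
    highest = sum(maxs)                                 # sum (not product) of maximums
    avg_bits = sum(log2(a) for a in avgs) / len(avgs)   # mean of the bit counts
    return lowest, highest, avg_bits

# Illustrative input only: three strengths whose figures appear on this page.
sample = [
    (44_853_399_953_682, 64_720_092_128_167_100_000, 811_875_709_692_926_000),
    (44_853_399_953_682, 64_720_092_128_167_100_000, 7_737_134_600_467_430_000),
    (5_938_590_153_867_540_000, 53_447_311_384_807_900_000, 24_178_545_626_460_700_000),
]

lowest, highest, avg_bits = combine_random(sample)
print(f"Min {lowest:,}  Max {highest:,}  Avg ~{avg_bits:.2f} bits")
```

Averaging the bit counts is the same as taking the log of the geometric mean of the averages, which is never larger than the log of their arithmetic mean. That is why this approach gives a lower "average" than averaging the raw combinations would.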
The full list of what PhraseStrengths each Random version is working from is at the bottom of this page.
The Strong phrase strength looks like this (simplified):
<noun> <verb> [optional adjective] <noun>
And its numbers are (for the 0.9.0 dictionary):
Min = 44,853,399,953,682 (or 45.35 bits)
Max = 64,720,092,128,167,100,000 (or 65.81 bits)
Avg = 811,875,709,692,926,000 (or 59.49 bits)
To calculate:
- Minimum: the generator assumes any optional components are not present, and multiplies the number of words together (ie: nouns * verbs * nouns).
- Maximum: the generator assumes any optional components are present, and multiplies the number of words together (ie: nouns * verbs * adjectives * nouns).
- Average: the generator weights optional components by their probability. In Strong, adjectives appear 1/3 of the time and are absent 2/3 of the time, so the adjective factor is scaled by 1/3 (ie: nouns * verbs * adjectives * 1/3 * nouns).
(The actual formulas are more complex than this because they take more factors into account, but the principle remains the same).
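A minimal Python sketch of the three simplified calculations above. The word counts are made-up round numbers rather than the real dictionary sizes, the names are hypothetical, and the average follows the simplified formula on this page rather than the generator's actual code.

```python
from math import prod

# Hypothetical word counts, chosen only to keep the arithmetic readable.
WORDS = {"noun": 5000, "verb": 3000, "adjective": 2000}

# Simplified Strong template: (part of speech, probability of appearing).
# Required components have probability 1.0.
STRONG = [("noun", 1.0), ("verb", 1.0), ("adjective", 1 / 3), ("noun", 1.0)]

def minimum(template):
    # Optional components are assumed absent, so only required ones count.
    return prod(WORDS[part] for part, p in template if p == 1.0)

def maximum(template):
    # Every component, optional or not, is assumed present.
    return prod(WORDS[part] for part, _ in template)

def average(template):
    # Each component's word count is scaled by its probability.
    return prod(WORDS[part] * p for part, p in template)

print(f"Min {minimum(STRONG):.3E}")
print(f"Max {maximum(STRONG):.3E}")
print(f"Avg {average(STRONG):.3E}")
```

With these made-up counts the maximum is exactly 2000 times the minimum (the adjective count), and the average is one third of the maximum.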
StrongEqual has identical minimum and maximum, but its average is higher because it has a 50% chance of having an adjective instead of 33%:
Min = 44,853,399,953,682 (or 45.35 bits)
Max = 64,720,092,128,167,100,000 (or 65.81 bits)
Avg = 7,737,134,600,467,430,000 (or 62.75 bits)
StrongRequired has a slightly lower maximum (because technically an optional component counts as another combination; it might be there or not), but a much higher minimum and average. It also has a much narrower range; that is, it generates complex phrases more consistently.
Min = 5,938,590,153,867,540,000 (or 62.36 bits)
Max = 53,447,311,384,807,900,000 (or 65.53 bits)
Avg = 24,178,545,626,460,700,000 (or 64.39 bits)
(For those wondering, the "bits" simply takes the logarithm of the combinations to the base 2; it's a common way of measuring big numbers in the computing industry.)
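As a quick check of the "bits" figures, this Python snippet takes the base 2 logarithm of the Strong and StrongRequired minimum and maximum counts quoted above and prints how wide each range is. The `bits` helper and the `figures` table are just names chosen for this sketch.

```python
from math import log2

# (minimum, maximum) combination counts copied from the figures above.
figures = {
    "Strong": (44_853_399_953_682, 64_720_092_128_167_100_000),
    "StrongRequired": (5_938_590_153_867_540_000, 53_447_311_384_807_900_000),
}

def bits(combinations):
    # "Bits" is the logarithm of the combination count to the base 2.
    return log2(combinations)

for name, (low, high) in figures.items():
    width = bits(high) - bits(low)
    print(f"{name}: {bits(low):.2f} - {bits(high):.2f} bits (range {width:.2f} bits)")
```

Strong spans roughly 20 bits between its worst and best case, while StrongRequired spans only about 3, which is the narrower range described above.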
Another way to count combinations is to generate some statistics. There's code in the test application to generate histograms as CSV files of how long each passphrase is. I've included all the Random strengths here as a reference (based on the 0.12.0 version).
Mutators make slight changes to the final passphrase such as adding upper case letters or numbers. The use of mutators does increase the number of combinations (a little; probably less than you think), but is not reflected in any figures or graphs.
The intention of mutators is to make a passphrase pass complexity rules that require upper, lower, numbers, etc. They are not there to increase the strength of your passphrase (although they might, it's entirely by accident).
And I'd prefer to err on the side of caution and show slightly fewer combinations than there actually are.
From version 1.3.0, a number of fake words are included in the dictionary (taken from ThisWordDoesNotExist.com).
You can exclude these fake words when building a passphrase using the configuration screen in the KeePass plugin, or the --noFake option with the console app.
Excluding the fake words reduces the size of the dictionary and therefore the number of possible combinations.
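To get a rough feel for the size of that reduction, here is a small Python sketch. It assumes, purely for illustration, that the same fraction of words is removed from every word category; the 10% figure and the 3 word phrase are made-up inputs, not the real proportion of fake words in the dictionary.

```python
from math import log2

def bits_lost(fraction_removed, words_in_phrase):
    # Each word slot keeps (1 - fraction_removed) of its choices, so the
    # combinations shrink by that factor once per word in the phrase.
    return -words_in_phrase * log2(1 - fraction_removed)

# Hypothetical example: 10% of words removed, 3 word phrase.
print(f"{bits_lost(0.10, 3):.2f} bits lost")
```

Under that assumption the loss grows linearly with the number of words in the phrase, so longer phrases lose more bits in absolute terms but start from a much larger total.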
For the security geeks out there, here's an exhaustive list of all phrase strength templates.
This lists which of the base strengths each random one is made up from. They are based on average lengths of generated phrases.
- Random (default): Normal, NormalAnd, NormalSpeech, Strong, StrongAnd, StrongSpeech, Insane, InsaneAnd, InsaneSpeech
- RandomShort: Normal, NormalEqual, NormalRequired, Strong, Insane, StrongEqual
- RandomLong: NormalAnd, NormalSpeech, NormalEqualSpeech, NormalRequiredAnd, NormalRequiredSpeech, InsaneEqual, NormalEqualAnd, StrongRequired, StrongSpeech, StrongAnd
- RandomForever: StrongEqualSpeech, InsaneAnd, InsaneSpeech, StrongEqualAnd, InsaneRequired, StrongRequiredSpeech, InsaneEqualSpeech, InsaneEqualAnd, StrongRequiredAnd, InsaneRequiredSpeech, InsaneRequiredAnd
I haven't included the weightings for optional components; if you want those you'll have to look at the source code.
(Keep in mind that a noun will include factors for an article / demonstrative, plural / singular. And verbs have factors for tense, transitive / intransitive and interrogative).
- Normal: <noun> <verb> <noun>
- Strong: <noun> <verb> [preposition] [adjective] <noun>
- Insane: <noun> [adjective] [adverb] <verb> [adverb] [preposition] [adjective] <noun>
- NormalAnd: <noun> <verb> <noun> and <noun>
- NormalSpeech: <noun> <speech verb> <noun> <verb> <noun>
- NormalEqual: <noun> <verb> <noun>
- NormalEqualAnd: <noun> <verb> <noun> and <noun>
- NormalEqualSpeech: <noun> <speech verb> <noun> <verb> <noun>
- NormalRequired: <noun> <verb> <noun>
- NormalRequiredAnd: <noun> <verb> <noun> and <noun>
- NormalRequiredSpeech: <noun> <speech verb> <noun> <verb> <noun>
- StrongAnd: <noun> <verb> [preposition] [adjective] <noun> and <noun>
- StrongSpeech: <noun> <speech verb> <noun> <verb> [preposition] [adjective] <noun>
- StrongEqual: <noun> <verb> [preposition] [adjective] <noun>
- StrongEqualAnd: <noun> <verb> [preposition] [adjective] <noun> and <noun>
- StrongEqualSpeech: <noun> <speech verb> <noun> <verb> [preposition] [adjective] <noun>
- StrongRequired: <noun> <verb> <preposition> <adjective> <noun>
- StrongRequiredAnd: <noun> <verb> <preposition> <adjective> <noun> and <noun>
- StrongRequiredSpeech: <noun> <speech verb> <noun> <verb> <preposition> <adjective> <noun>
- InsaneAnd: <noun> [adjective] [adverb] <verb> [adverb] [preposition] [adjective] <noun> and <noun>
- InsaneSpeech: <noun> <speech verb> <noun> [adjective] [adverb] <verb> [adverb] [preposition] [adjective] <noun>
- InsaneEqual: <noun> [adjective] [adverb] <verb> [adverb] [preposition] [adjective] <noun>
- InsaneEqualAnd: <noun> [adjective] [adverb] <verb> [adverb] [preposition] [adjective] <noun> and <noun>
- InsaneEqualSpeech: <noun> <speech verb> <noun> [adjective] [adverb] <verb> [adverb] [preposition] [adjective] <noun>
- InsaneRequired: <noun> <adjective> <adverb> <verb> <adverb> <preposition> <adjective> <noun>
- InsaneRequiredSpeech: <noun> <speech verb> <noun> <adjective> <adverb> <verb> <adverb> <preposition> <adjective> <noun>
- InsaneRequiredAnd: <noun> <adjective> <adverb> <verb> <adverb> <preposition> <adjective> <noun> and <noun>