Just a brief note about some distractions from my PhD, which may be useless. For those who haven't heard of these interesting ideas, or want to know more about them, I'd suggest the Wikipedia article on the Infinite monkey theorem or, for a more entertaining medium, Borges's short story "The Library of Babel" (1941).
Basically, the idea behind both of these concepts is one of combinatorial (or, more accurately, permutational: see this combination and permutation calculator for the difference) explosion through exponentiation, and it tells us interesting things about probability, complexity and the nature of infinity. Quoting Wikipedia:
"The infinite monkey theorem states that a monkey hitting keys at random on a typewriter keyboard for an infinite amount of time will almost surely type a given text, such as the complete works of William Shakespeare."
As this problem has had some nice formal treatment (as can be seen on the wiki page alone, without literature hunting), I won't pretend to offer a new proof that this is the case, but rather just go through some of the basics to elucidate the problem to myself and to the few people who will read this.
In this autodidactic exercise, I did a naive calculation with the following sort of maths, which seems a sensible way to begin working out the extent of the problem for our monkey tapping away randomly. In cryptographic (code-cracking) terms, this is equivalent to working out how hard it would be to crack, by brute-force attack, a password as long in characters as Shakespeare's complete works.
If we take length(WS) to be the length of Shakespeare's complete works in characters, and 26 as the size of the Latin alphabet (forgetting punctuation, not distinguishing upper from lower case, and ignoring any type-setting keys for now, so relaxing the problem a bit, albeit not its complexity order), we have the following:

Different character sequences of length length(WS) = 26^length(WS)
This works in the same way that a 4-digit numerical lock has 10^4 different permutations (often wrongly called combinations: order matters here!). In general, the number of permutations of length k with arbitrary repeats of any symbol allowed is n^k, where n is the number of symbol values and k is the length of the sequence. The simple addition to the formula above used to calculate the probability of a monkey producing the complete works in a set order from the start of the experiment just requires the fact that there is only one unique string of letters that gets us to the desired result. And, to be empirically accurate, it also requires the total word count wonderfully provided by the Shakespeare text statistics website:

1/Different character sequences of length length(WS) = 1/26^(884,429 x mean word length)
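As a sanity check on the n^k rule, here's a minimal Python sketch; the function name and the round 5-million-character figure are just for illustration. Big integers make the small cases exact, while for the full case only the number's size is practical to report:

```python
import math

def sequence_count(n: int, k: int) -> int:
    """Number of length-k sequences over an alphabet of n symbols: n**k."""
    return n ** k

# The 4-digit numerical lock from the text: 10 symbols, length 4.
print(sequence_count(10, 4))  # 10000 possible codes

# The monkey's chance of a given 5-letter string such as "Henry":
p_five = 1 / sequence_count(26, 5)  # 1 in 11,881,376

# 26**5_000_000 is too unwieldy to print in full, but its size isn't:
digits = 5_000_000 * math.log10(26)
print(f"26^5,000,000 has about {digits:,.0f} decimal digits")
```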
If we make some approximations for discussion, saying we've got 1 million words and that the average word length in English text is 5 letters, we have 26^5,000,000 different possible sequences of 5 million characters the monkey could come up with, leaving the probability of finding the correct permutation first time at 1/26^5,000,000 (the scale of quantum foam, the smallest known structure in the universe, expressed in metres, is massive compared to this number!). This shows the awesome explosive power of exponentiation.
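To see how extreme that denominator is on a log scale, a short sketch; the Planck length, ~1.6e-35 m, is my assumed stand-in for the "quantum foam" scale mentioned above:

```python
import math

# log10 of the monkey's first-time success probability, 1/26**5_000_000.
log10_p = -5_000_000 * math.log10(26)
print(f"log10(p) is about {log10_p:,.0f}")

# The Planck length in metres (assumed stand-in for the quantum foam scale).
log10_planck = math.log10(1.6e-35)  # about -34.8

# The probability is smaller, as a bare number, by ~7 million orders of magnitude.
print(f"gap: {log10_planck - log10_p:,.0f} orders of magnitude")
```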
Our initial estimation is a bit mean to the monkey, as we presume a set order for Shakespeare's 43 works: he could produce all the texts directly back-to-back but in a different order from the one we prescribe, and it wouldn't count. In fact, the space he has to work with at this higher level of work ordering is number of works! / (number of works - works chosen)!, i.e. 43!/(43-43)!, which, since 0! = 1 by normal convention, is simply 43! (43 factorial), i.e. 43 x 42 x 41 x ... x 1. Check this excellent permutation and combination calculator to confirm the difference between order mattering (permutation) and not mattering (combination).
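The ordered-arrangement formula can be checked directly in Python; the helper name npr is just my shorthand for n!/(n-k)!:

```python
import math

def npr(n: int, k: int) -> int:
    """Ordered arrangements (permutations) of k items drawn from n: n!/(n-k)!."""
    return math.factorial(n) // math.factorial(n - k)

# Ordering all 43 works: n = k = 43, and since 0! = 1 this is just 43!.
orderings = npr(43, 43)
print(orderings == math.factorial(43))  # True
print(len(str(orderings)))  # 53 -- i.e. 43! is about 6e52
```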
If we give him the option of producing the works in any order (combination) instead of the permutational constraint where order matters, we have many more possible good results, and we reduce the space complexity (and time complexity) by the factor he would otherwise have had to search for the correct order, i.e. 43!, meaning the monkey's chance of success at the outset is roughly 1/(26^5,000,000/43!). Since 43! is about 6 x 10^52, this knocks roughly 50 off the decimal exponent, increasing his chances somewhat, but with the order still so large it hasn't helped him much. Further relaxing the conditions by removing the stipulation that the works be produced back-to-back, so that he could type nonsense between successful works, could reduce the complexity of the task further, but we're still into the hundreds of thousands of characters for correct sequences (and no further down the computational complexity order, as it's still exponential).
Estimating how long it could take by simply multiplying the time taken to type one attempt by the entire number of possibilities, or by some fraction of that number using a probability distribution, seems unfair as a measure of his success, since it ignores near misses. We give the monkey a better chance by reviewing his attempts incrementally, i.e. character by character. When he begins typing, he has a 1/26 chance of hitting the correct first character; if he does, he has another 1/26 chance of hitting the second one correctly (the probability of getting the first 2 characters right is 1/26 x 1/26 = 1/676), and so on, because there is exactly one successful permutation out of the whole permutational space that needs to be typed. Should he type an incorrect character, the matched string length goes back down to 0 (or 1, if that letter happens to start a Shakespeare play). To illustrate the ever-changing probability of success during the task, he might type:
Henry
There's a probability of 1/26^5 of him typing this. He has done very well here, as it forms the start of the title of 7 Shakespeare works (bless the bard's love of the British monarchy!). Now the monkey has only 4,999,995 more characters to get right in a row, and, given the work-ordering relaxation, he has started 7 works correctly and still has a chance of producing any ordering of works that begins with one of these 7 Henry plays. From here, any of 7 x 42! orderings could successfully complete his remaining 4,999,995 characters, and this number should be factored out of the count of possibilities. His simultaneous production of successful start sequences for 7 possible works in our relaxed problem reduces the exponent in the probability's denominator by roughly 50, which is small, but something.
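The incremental, character-by-character view can be sketched as a toy simulation; the target string, alphabet and keystroke budget below are hypothetical stand-ins, and the reset rule follows the simplification above (a wrong key drops the match back to 0, or 1 if that key restarts the target):

```python
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def prefix_probability(k: int, n: int = 26) -> float:
    """Chance of hitting k specific characters in a row at random."""
    return (1 / n) ** k

print(prefix_probability(1))  # 1/26
print(prefix_probability(2))  # 1/676

def longest_match(target: str, keystrokes: int, seed: int = 0) -> int:
    """Longest prefix of `target` a random typist completes in `keystrokes`
    presses; a wrong key resets the match (to 1 if it restarts the target)."""
    rng = random.Random(seed)
    matched = best = 0
    for _ in range(keystrokes):
        key = rng.choice(ALPHABET)
        if key == target[matched]:
            matched += 1
            best = max(best, matched)
            if matched == len(target):
                break
        else:
            matched = 1 if key == target[0] else 0
    return best

# Best run achieved on a (hypothetical) 5-letter target in 100,000 keys.
print(longest_match("henry", 100_000))
```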
This idea of multiple possible future successes with each typed letter really makes this an interesting probability problem over time. Essentially, at each point we look at the last n letters to see whether they constitute a valid starting string of one or more of the possible 43! successful orderings, and that determines the probability of the monkey's future success at that point. When there are several possible future successful outcomes, the probabilities improve, as mentioned above. In practice such situations are few and far between in this particular task, as fewer than half of Shakespeare's plays share more than their first 2 letters with another play's title. Still, given the exponential nature of the improvement in the monkey's chances, it is nice to know that should he type any of the following strings at the beginning, more than one possible starting work could be in progress, and hence the probability of his success is greater than if he had typed a string of the same length beginning only one title.
Start String | No. of Works still possible (NoW) | p(monkey's success for remaining letters)
Henry | 7 | 1/(26^4,999,995/(7 x 42!))
Henry V | 5 | 1/(26^4,999,994/(5 x 42!))
Henry VI | 4 | 1/(26^4,999,993/(4 x 42!))
Henry VI Part I | 3 | 1/(26^4,999,988/(3 x 42!))
Henry VI Part II | 2 | 1/(26^4,999,987/(2 x 42!))
Henry VII | 2 | 1/(26^4,999,992/(2 x 42!))
Richard | 2 | 1/(26^4,999,993/(2 x 42!))
Love | 2 | 1/(26^4,999,996/(2 x 42!))
King | 2 | 1/(26^4,999,996/(2 x 42!))
Ti | 2 | 1/(26^4,999,998/(2 x 42!))
Me | 2 | 1/(26^4,999,998/(2 x 42!))
While not doing the full maths here: factoring the NoW x 42! term (where NoW = number of works sharing the start string) out of the overall permutational space increases the probability of success by many orders of magnitude, but that gain grows only linearly with NoW. So if one string starts more works than another (e.g. "Henry" (7) vs "Richard" (2)), the former does not gain exponentially: dividing by 7 x 42! rather than 2 x 42! does not reduce the number of remaining possibilities by an exponent, whereas the two extra correct letters in "Richard" do. Sharing of start strings only boosts probabilities when comparing start strings of the same length: "Henr", say, is a better sequence than "Love" in the table above. So, given that the length of the current successful string is the principal determiner of future chances of successful completion, the monkey's best current chance of future success among those shown above comes after typing "Henry VI Part II", the start string of 13 letters that represents the longest confirmed progress while still leaving multiple possible future successes. In terms of the overall task, in the combinatorial setting where any order of works is permitted but none repeats, the overall probability of generating the complete works is the same, i.e. 1/(26^5,000,000/43!), regardless of the order in which he generates them. What is interesting is how dividing the sequence into subsequences and allowing them to be interchanged (combination) reduces the complexity of the task relative to the set-order (permutational) setting and increases the probability of it getting done faster, perhaps suggesting the need for things that unitize information, such as words and syntactic phrases, for human beings.
What's also interesting is the agony we experience as bystanders watching our stoic simian friend tap away: we could watch him reach the end of a long valid start sequence and then see him fail instantly. Even if he reached the end of an entire work successfully, having watched his probabilities increase exponentially with each correct character, his subsequent typing of a letter that doesn't begin a remaining Shakespeare play shatters his chances by resetting the denominator of the probability back to its old exhaustive self. Plotting the change in the probability of success over time could be very interesting (its time axis would be rather long and the data fairly sparse!). Whether the task could be done in infinite time remains the subject of debate by minds greater than my own. I just found thinking about this interesting and a good way of sharpening my maths (forgive any incorrect numbers/calculations!).
As an end note, while it changes the nature of the problem slightly, an interesting idea in complexity research put to me by lecturers at my university is whether you could reduce the time by giving the monkey a basic programmable computer with a simple text terminal and the option of writing assembly code (or even binary), rather than a typewriter. A program able, say, to permute any substring written so far with any other, greatly increasing generative capacity and the chance of success with each subsequent keystroke, might have a much smaller text length than the target itself; the monkey might write the program, leave it running, and add to it again later after some interim meaningless characters, rather than producing the exhaustive sequence of characters we require here. He might even stumble across a program that can generate grammatically correct English words and sentences before he comes across the entirety of 'Hamlet'. The level of language permitted would be fairly important for the project: Python and Java may give the monkey a better chance than machine code or binary. Perhaps this has a corollary in poverty-of-the-stimulus arguments in linguistics: how much innate structure for information processing can we give a child so that a grammar may be 'found', even by a stochastic process?