A citizen and mathematician offers views on social issues, literature and science.
Thursday 16 April 2015
The libraries of Babel, Mendel and Turing
- The Library of Babel goes back to Jorge Luis Borges' 1941 short story of the same name. It consists of all 25^1,312,000 books with 1,312,000 characters each (we learn from the story that "each book is of four hundred and ten pages; each page, of forty lines, each line, of some eighty letters which are black in color"; 1,312,000 = 410·40·80) taken from a 25-letter alphabet, including blank space. 25^1,312,000 is an enormously large number, so large that the number of books that would fit in the observable universe is ridiculously small by comparison. We can still try to imagine the library. It contains many wonderful books, including Shakespeare's Hamlet and Solzhenitsyn's One Day in the Life of Ivan Denisovich, but it contains far more books consisting of pure gibberish. The library is so huge that, even if we assume the books to be ordered alphabetically, the interesting books are virtually impossible to find unless you know exactly what you are looking for. To know which of the 25 buildings to go to, you need to know the book's first letter; to know which of the 25 floors of that building to go to, you need to know the book's second letter, and so on and so forth. If you are sitting in front of a word processor, you actually have the Library of Babel at your fingertips: if you first specify the book's first letter by typing it, then specify the second letter similarly, all the way down to the 1,312,000th and last, then - ta-dah! - there is your book.
- The Library of Mendel, invented by Daniel Dennett in his 1995 book Darwin's Dangerous Idea, consists of all DNA sequences of length up to (say) 10,000,000,000. Here we have only four characters (A, C, G and T), but the greater lengths allowed, compared to the Library of Babel, more than make up for that, so the number of DNA sequences in the Library of Mendel is even larger than the number of books in the Library of Babel (a back-of-the-envelope comparison follows after this list). Just as there are very many interesting books in the Library of Babel, there are very many DNA sequences in the Library of Mendel that code for viable complex organisms. These two subsets of the respective libraries are what Dennett calls Vast but Vanishing, where Vast means roughly that their number is far, far bigger than the number of elementary particles in the observable universe, and Vanishing means that their proportion of the entire library is so small that the reciprocal of that proportion is Vast.
- The Library of Turing - Thore Husfeldt's new invention - consists of all sequences of length up to (say) 10^20 (a very large number, but far from Vast) from a 128-character alphabet - sequences that are interpreted as computer programs written in a given programming language, which Husfeldt takes to be Lisp, but any Turing-complete language will do. As in the libraries of Babel and Mendel, most sequences are just gibberish, but there is a Vast but Vanishing subset of interesting programs.
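Since these size claims do real work in the argument - in particular, the claim that the Library of Mendel outstrips the Library of Babel - here is a quick back-of-the-envelope check in Python. It is my own sketch, not part of Husfeldt's essay: sequences of length up to N are dominated by those of exactly length N, so only the top term is counted, and everything is compared on a log10 scale, since the numbers themselves are unwritable.

```python
# Back-of-the-envelope sizes of the three libraries on a log10 scale.
# (My own sketch; lengths and alphabet sizes are those quoted above.
# "Length up to N" is approximated by length exactly N, which dominates.)
from math import log10

log_babel  = 1_312_000 * log10(25)        # 25^1,312,000 books
log_mendel = 10_000_000_000 * log10(4)    # ~4^(10^10) DNA sequences
log_turing = 1e20 * log10(128)            # ~128^(10^20) programs
log_universe = 80                         # ~10^80 elementary particles

print(f"Library of Babel:    ~10^{log_babel:.2e}")    # ~10^(1.8 million)
print(f"Library of Mendel:   ~10^{log_mendel:.2e}")   # ~10^(6.0 billion)
print(f"Library of Turing:   ~10^{log_turing:.2e}")   # ~10^(2.1·10^20)
print(f"Observable universe: ~10^{log_universe} particles")
```

All three libraries are Vast - each exponent alone dwarfs the roughly 80-digit particle count of the observable universe - and Mendel exceeds Babel by the same sort of margin by which Turing exceeds Mendel.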
- Given that the description of a superintelligence exists somewhere in the library, all that remains is to find it. If you don’t work in computer science, this task seems to be relatively trivial. After all, even a mindless exhaustive search will find the volume in question. While this approach is not a particularly tempting exercise for a human, computers are supposed to be good at performing mindless, well-defined, repetitive, and routine tasks at high speed and without complaining. Thus, sooner or later we stumble upon the monster; the only debate is whether the search time is measured in years, generations, or geological time scales.
But this is a fallacy. It is a fallacy because of the unreasonable growth rate of the exponential function, and the powerlessness of exhaustive search.
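To see just how powerless, grant exhaustive search assumptions far friendlier than anything physics allows. The sketch below (my own numbers, not Husfeldt's) considers only the 100-character shelf of the Library of Turing, and a machine testing 10^18 candidate programs per second for the entire age of the universe - and still covers only about one part in 10^175 of that single shelf.

```python
# How far does blind exhaustive search get through the Library of Turing?
# Grant it every advantage: programs of only 100 characters, and a machine
# testing 10^18 candidates per second for the entire age of the universe.
# (Illustrative numbers of my own choosing, not Husfeldt's.)
from math import log10

candidates   = 128 ** 100       # programs of length exactly 100: ~10^211
rate         = 10 ** 18         # candidate programs tested per second
age_universe = 4 * 10 ** 17     # seconds since the Big Bang, roughly

checked = rate * age_universe
print(f"candidates:        ~10^{log10(candidates):.0f}")   # ~10^211
print(f"checked:           ~10^{log10(checked):.0f}")      # ~10^36
print(f"fraction searched: {checked / candidates:.1e}")    # ~4e-175
```

Speeding the machine up by any realistic factor changes nothing: each extra character multiplies the search space by another factor of 128.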
- Could there be another way of discovering the superintelligence than exhaustive search? Certainly. After all, nature has discovered one of the monsters, the brain of Homo sapiens, starting with very simple "brains" hundreds of millions of years ago and refining them stepwise by natural selection and mutation in environments that favoured cognition. Thus, there is some gradient, some sensible set of stepwise refinements, that Nature has followed through the Library of Turing to arrive at Einstein's brain. We just don't know how to recognise this gradient, nor how to follow it efficiently (a toy contrast between gradient-following and blind search is sketched after the next paragraph). In the enthusiasm surrounding artificial intelligence in the 1960s we might have thought that we were on such a gradient. If so, we have abandoned it. The current trend in artificial intelligence, away from symbolic reasoning and towards statistical methods such as machine learning, which do not aim to build cognitive models, strikes me as an unlikely pathway.
Given our current understanding of the limits of computation, there are many relatively innocent-looking computational problems that are computationally hard. After more than a generation of very serious research into these problems, nobody knows anything better than exhaustive search for many of them. In fact, while the amazing features of your smartphone may tell you a different story, the main conclusion of a generation of research in algorithms is bleak: for most computational problems we don’t know what to do.
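What does it mean, concretely, to have a gradient rather than having to search blindly? The following toy sketch (my own illustration, not Husfeldt's, and with a fitness landscape far kinder than anything in the Library of Turing) makes the contrast vivid: with single-bit stepwise refinements available, twenty bits cost a few dozen evaluations instead of a million.

```python
# A deliberately friendly toy landscape: search bit-strings of length n for
# the all-ones string. Blind exhaustive search inspects up to 2^n candidates;
# stepwise refinement (flip a bit whenever that helps) needs on the order of
# n evaluations. (My own illustration; real landscapes are far less kind.)
import itertools
import random

n = 20

def fitness(s):
    return sum(s)                  # number of ones; maximal at (1, ..., 1)

# Blind exhaustive search: the all-ones string comes last, after 2^20 tries.
exhaustive_steps = 0
for candidate in itertools.product((0, 1), repeat=n):
    exhaustive_steps += 1
    if fitness(candidate) == n:
        break

# Hill climbing: follow the gradient of single-bit improvements.
current = [random.randint(0, 1) for _ in range(n)]
climb_steps = 0
improved = True
while improved:
    improved = False
    for i in range(n):
        neighbour = current.copy()
        neighbour[i] ^= 1          # flip one bit
        climb_steps += 1
        if fitness(neighbour) > fitness(current):
            current, improved = neighbour, True

print(f"exhaustive search: {exhaustive_steps:,} evaluations")  # 1,048,576
print(f"hill climbing:     {climb_steps:,} evaluations")       # a few dozen
```

Without such a gradient, every added bit doubles the work - which is precisely the bleak regime the previous paragraph describes.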
- Computation is a resource, exponential growth is immense, and the universe is finite.
Bostrom pays little attention to this problem. For instance, in his discussion of current efforts at brain emulation, he writes:
- Success at emulating a tiny brain, such as that of C. elegans, would give us a better view of what it would take to emulate larger brains. At some point in the technology development process, once techniques are available for automatically emulating small quantities of brain tissue, the problem reduces to one of scaling.