torsdag 16 april 2015

The libraries of Babel, Mendel and Turing

There is much disagreement concerning the dangers associated with a breakthrough in artificial intelligence. Could such an event lead to our extinction, or the End of the Human Era as James Barrat put it in the subtitle of his book Our Final Invention? Thinkers like Eliezer Yudkowsky and Nick Bostrom argue that the risk is real and that we need to solve the (very difficult) problem of how to avoid an AI Armageddon. As is clear from my review of Bostrom's book Superintelligence, I find their arguments sufficiently compelling to warrant serious attention.

Many others consider (for a wide variety of reasons) the risk to be an illusion. The latest contributor to this highly diverse camp is computer scientist Thore Husfeldt, with his essay The Monster in the Library of Turing, which is a review of Superintelligence to appear in the Swedish philosophy journal Filosofisk tidskrift and published in English translation on his blog. I find Husfeldt's essay very interesting, and what sets him apart from nearly every other critic of the Bostromian-Yudkowskian position (such as Selmer Bringsjord, Rodney Brooks, David Deutsch, Steven Pinker, John Searle and David Sumpter, just to mention a few of those whose criticisms I debunk at shorter or greater length in my upcoming book Here Be Dragons: Science, Techonlogy and the Future of Humanity (Oxford University Press, to appear)) is that neither does his argument consist in attacking straw men, nor does it succumb to any obvious fallacy. Husfeldt has read Superintelligence carefully and seems to accept many of its arguments, but disagrees on (at least) one crucial point. To make his case, he invokes a new metaphor: the Library of Turing, building on Jorge Luis Borges' Library of Babel and Daniel Dennett's Library of Mendel. Let me breifly describe these three libraries:
  • The Library of Babel goes back to Jorge Luis Borges' short story from 1941 with the same name. It contains of all of the 251,312,000 books with 1,312,000 characters (we learn from the story that "each book is of four hundred and ten pages; each page, of forty lines, each line, of some eighty letters which are black in color"; 1,312,000=410·40·80) taken from a 25-letter alphabet, including blank space. 251,312,000 is an enormously large number, so large that the number of books that would fit in the observable universe is ridiculously small by comparison. We can still try to imagine the library. It contains many wonderful books, including Shakespeare's Hamlet and Solzhenitsyn's One Day in the Life of Ivan Denisovich, but it contains far more books consisting of pure gibberish. The library is so huge that, even if we assume the books to be ordered alphabetically, the interesting books are virtually impossible to find, unless you know exactly what you are looking for. To know which of the library's 25 buildings to go to, you need to know the book's first letter; to know which of the 25 floors of that building to go to, you need to know the book's second letter, and so on and so forth. If you are sitting in front of a word processor, you actually have the Library of Babel at your fingertips: if you first specify the book's first letter by typing it, then specify the second letter similarly, all the way down to the 1,312,000th and last, then - ta-dah! - there is your book.

  • The Library of Mendel, invented by Daniel Dennett in his 1995 book Darwin's Dangerous Idea, consists of all DNA sequences of length up to (say) 10,000,000,000. Here we have only four characters (A, C, G and T) but the greater lengths allowed compared to the Library of Babel more than make up for that, so that the number of DNA sequences in the Library of Mendel is even larger than the number of books in the Library of Babel. Just like there are very many interesting books in the Library of Babel, there are very many DNA sequences in the Library of Mendel that code for viable complex organisms. These two subsets from the respective libraries are what Dennett call Vast but Vanishing, where Vast means roughly that their number is far far bigger than the number of elementary particles in the observable universe, and Vanishing means that their proportion of the entire library is so small that the reciprocal of that proportion is Vast.

  • The Library of Turing - Thore Husfeldt's new invention - consists of all sequences of length up to (say) 1020 (a very large number, but far from Vast) from an 128-character alphabet - sequences that are interpreted as computer programs written in a given programming language, which Husfeldt takes to be Lisp, but any Turing-complete language will do. As is the case in the libraries of Babel and Mendel, most sequences are just gibberish, but there is a Vast but Vanishing subset of interesting programs.
Husfeldt accepts that the human brain is probably far from an optimal arrangement of matter to produce intelligence, and he accepts functionalism. This leads him to belive (as Bostrom does, and Yudkowsky, and me) that somewhere in the Library of Turing, there is some program that far outstrips our feeble human powers of intelligence. But can we find it? Here is Husfeldt:
    Given that the description of a superintelligence exists somewhere in the library, all that remains is to find it. If you don’t work in computer science, this task seems to be relatively trivial. After all, even a mindless exhaustive search will find the volume in question. While this approach is not a particularly tempting exercise for a human, computers are supposed to be good at performing mindless, well-defined, repetitive, and routine tasks at high speed and without complaining. Thus, sooner or later we stumble upon the monster; the only debate is whether the search time is measured in years, generations, or geological time scales.

    But this is a fallacy. It is a fallacy because of the unreasonable growth rate of the exponential function, and the powerlessness of exhaustive search.

The point here is that, although presumably Vast, the set of superhumanly intelligent programs in the Library of Turing is also Vanishing, and therefore geological time scales will be far far from sufficient for brute force to succeed.

If the Library of Turing entirely lacked structure, this would be the end of the argument: if there were no structure, there would be no way to improve on brute force search, so there would be no hope (or, to be pedantically precise, a Vanishingly small hope) to come up with a superhumanly intelligent computer program. Intelligent design proponent William Dembski, in his book No Free Lunch, tried (for the purpose of concluding that we have been deliberately designed by an intelligent being) to employ this type of argument to prove that Darwinian evolution is impossible. He hid away the (utterly unreasonable) unstructuredness assumption behind mathematical smokescreens, so as to render it undetectable to most readers, but he failed to make it undetectable to yours truly.

The unstructuredness assumption is just as unreasonable for the Library of Turing as it is for the Library of Mendel, since if the former lacked structure, we would not be able to write useful computer programs. Now, Husfeldt, unlike Dembski, is an honest thinker, so of course he makes no attempt to pretend that the Library of Turing is unstructured, and he admits that there might be ways to improve on brute force search sufficiently well that a superhumanly intelligent program can be found, but intuits that it can't be done:
    Could there be another way of discovering the superintelligence than exhaustive search? Certainly. After all, nature has discovered one of the monsters, the brain of Homo sapiens, starting with very simple "brains" hundreds of millions of years ago, and refining them stepwise by natural selection and mutation in environments that favoured cognition. Thus, there is some gradient, some sensible set of stepwise refinements, that Nature has followed through the Library of Turing to arrive at Einstein’s brain. We just don’t know how to recognise, nor efficiently follow this gradient. In the enthusiasm surrounding artificial intelligence in the 1960s we might have thought that we were on such a gradient. If so, we have abandoned it. The current trend in artificial intelligence, away from symbolic reasoning and towards statistical methods like machine learning, that do not aim to build cognitive models, strikes me as an unlikely pathway.

    Given our current understanding of the limits of computation, there are many relatively innocent-looking computational problems that are computationally hard. After more than a generation of very serious research into these problems, nobody knows anything better than exhaustive search for many of them. In fact, while the amazing features of your smartphone may tell you a different story, the main conclusion of a generation of research in algorithms is bleak: for most computational problems we don’t know what to do.

Here Husfeldt's intuition and mine part from each other. I agree with him that most likely P≠NP, but I don't see why that would be such a blow to our prospects of finding a superintelligent program. We can do a lot of wonderful things without solving NP-hard problems in polynomial time, and I think the success that Nature has had in coming up with the brain of Homo sapiens and many other impressive things is very suggestive. The Library of Turing is probably at least as structured as the Library of Babel, offering at least as navigable a landscape to search in if we should choose to imitate Nature's method of evolution by natural selection. And we have the bonus option of using our own intelligence to make clever macro-moves in the landscape. I am not in a position to tell whether symbolic reasoning is a more promising approach than machine learning, but there seem to be many potential ways forward. I think there's a good chance (or risk!) we'll create superintelligence before the end of the persent century, and perhaps a lot sooner than that.

This divergence of our intuitions notwithstanding, I find Husfeldt's essay interesting and stimulating, and it ends on a constructive note, suggesting a number of research directions that may turn out worthwhile even in case he is right about the inaccessability of "the monster in the Library of Turing". There's just one passage I find a little bit misleading:
    Computation is a resource, exponential growth is immense, and the universe is finite.

    Bostrom pays little attention to this problem. For instance, in his discussion of current efforts on brain simulation, he writes:

      Success at emulating a tiny brain, such as that of C. elegans, would give us a better view of what it would take to emulate larger brains. At some point in the technology development process, once techniques are available for automatically emulating small quantities of brain tissue, the problem reduces to one of scaling.
    Well, to me, scaling is the problem.
Here (and in combination with the essay's overall focus on exponential complexity), Husfeldt seems to suggest that the problem of emulating a brain of size n would suffer from a combinatorial explosion and the accompanying exponential increase in computational complexity. Does he really mean this? I don't want to claim that whole brain emulation is an easy project (it most certainly isn't!), but to me the scaling problem here sounds like it should land in a linear complexity in n, or perhaps a low-degree polynomial, but not exponential, as I don't see where the combinatorial explosion would be coming from.

22 kommentarer:

  1. This seems a bit unfounded: "The current trend in artificial intelligence, away from symbolic reasoning and towards statistical methods like machine learning, that do not aim to build cognitive models, strikes me as an unlikely pathway." The very reason we are making this shift is that it works better.

    1. Well, it works better but to me it is far from AI and that kind of is the "hard question"? Will just doing enough algorithms make it... ofc we do not know... but it is a valid point imho.

  2. Once you have the ”whole brain” in the computer, I suppose the time for running it for m moments (where a ”moment” is a time span short enough for the brain to regarded as incapable of changing state in any way noticeable to any part of itself) would be nm, where n is its number of neurophysically atomic parts. This should be enough to sequentialize the parallel operation of the brain.

    But the first problem is getting to that state. How to find the layout of the brain emulation graph and its connection to its I/O devices (the body). I hope I'm not misinterpreting Thore's argument, but this is what I take as the analog of finding the right book in the library – the problem that plausibly does not scale.

    1. Why would the problem of finding "the layout of the brain emulation graph and its connection to its I/O devices" succumb to a combinatorial explosion?

    2. The question is: why wouldn't it? For all we know, finding just the right configuration (or one of a few right configurations) usually does succumb to a combinatorial explosion. Experience from first few decades of computer science tells us that if it's not immediately obvious how to find something without almost-exhaustive-search, it's usually at least NP-hard.

    3. Nah... the task of setting up the initial state of the emulation looks to me a lot more like copying (taking essentially linear time) than like combinatorial optimization.

    4. Think that the way the human brain project in EU is a bit telling... which I had said so but I did only think it... making smart code for simulations of complex things is harder then it might seam. My personal gut feeling that is based on basically nothing :) sorry for taking space... is that it is really hard to say which way things are going. Could haven an AI this century or we might not... how do you set a good Bayesian test for that? Perhaps it is in your book? :)

    5. To do the copying you have to have something to copy, so then you need to already have the superintelligent monster. Nothing indicates that copying a human brain (if it can be done) would bring anything spectacular. It would be an interesting and perhaps useful experiment, however, if it can be done. I think it would tell us a lot about ourselves.

    6. Aha ctail, you've missunderstood what the Bostrom passage quoted by Husfeldt is about. It's not about building a superintelligence, it's about emulating a human brain.

    7. Ok, but a) if it's in the book Superintelligence I suppose it's supposed to have some bearing on that subject? and b) I took Thore's reaction “scaling is the problem” as a general comment on the “it's just scaling” attitude. But I sort of regret getting myself into defending my interpretation of Thore, since it might not be spot-on what he meant.

    8. Human brain emulation does have potential bearing on the prospects of creating a superintelligence. I wholeheartedly recommend reading the book.

    9. Yes, the passage I quoted from Bostrom was to establish that he doesn’t much care about issues of scaling. I cherry-picked the most blatant of those passages, it just happens to be in the section on brain simulation.

      At the danger of continuing along a fruitless tangent: I have no idea how computationally difficult brain simulation is. N-body problems of very simple things in physics is much harder than just NP. But one can approximate it, after a fashion. So maybe neurons (which are more complicated that balls, and which we don’t currently know how to describe) happen to be in some sweet spot where (because of physical distances) most interactions can be ignored, so maybe we can get a reasonable simulation in quadratic time.

      So I’d be willing to entertain the idea that we can kinda-sorta simulate a brain to some degree of accuracy in quadratic time. There are 10^9 or 10^10 neurons in the human brain, so we’re looking at 10^18 operations. You need a billion modern computers to simulate “one iteration” in a second. How fast is the human brain clocked? No idea if that even makes sense to discretise. 100 Hz?

      So, with these back-of-the envelope calculations, I can simulate Einstein’s brain using 100 billion computers, assuming neuroanatomy ever gives me a model and a description. I only kinda-sorta simulate it, so it’s unclear if my iEinstein can do anything else than drool, intellectually speaking.

      Now, how much faster would we need to run Einstein before we have something that is even close to a superintelligence? Einstein on 10 times the speed? Is that better than 10 Einsteins? (Say, Princeton’s faculty in the 1940s.) So I need 100 billion computers to simulate Einstein drooling really fast. Compare this to the perfectly pleasant way of producing new brains by having intercourse.

      Make no mistake: I like to play these games. But whenever I start, my brain just says “why should that be easy?” As I tried to make explicit: the main conclusion of half a decade of research is that most problems are really, really hard. And quadratic time is quite enough.

      (Disclaimer: we have no idea how to describe the functions of a neuron, much less in one-“operation”-per-clock cycle speed. The entire above calculation assumes many very charitable interpretations of neuroanatomy probably rendering the entire analysis void. Not even mentioning the required technological breakthrough in “how to slice up brains.”)

    10. Why quadratic time? Each neuron only has about 1000 connections, so reading inputs and updating outputs should only take about 1000 operations per neuron per clock tick, for the most generous assumption of neuron simplicity.

    11. No reason for quadratic time, I just wanted to entertain the idea that brain simulation doesn’t need to be NP-hard (or PSPACE-hard, like n-body simulation) to be considered a problem that can be computationally infeasible already – disregarding the neuroanatomical part. (That‘s the danger of making these calculations – it makes concrete what doesn’t deserve concreteness.)

  3. Om vi kan läsa vikterna på kopplingarna med ett verktyg så är problemet antagligen polynomiskt att simulera. Om vi bara kan se att det finns en koppling mellan neuroner, men inte om den är positiv/negativ eller dess storlek så måste man på något sätt välja dessa vikter. Att träna ett neuralt nätverk (d.v.s. att välja vettiga vikter) är ett NP-problem.

    Förr eller senare lär dock ändå tekniken bli tillräckligt bra för att mäta hur stark en koppling mellan två neuroner är. Simulationen lär knappast vara värre än om varenda neuron är kopplad till alla andra neuroner, vilket är O(n²) i tidskomplexitet.

    Vill för övrigt tacka dig Olle för att ha skrivit "Slumpens skördar". Jag tror att den boken har inspirerat mig, för mitt kandidatarbete (här på Chalmers) handlar om perkolation! Mycket intressant blogg, jag är besviken att jag inte upptäckt den tidigare.

  4. I may have time for a longer response tomorrow, but let me just state that I don’t claim brain simulation takes exponential time. (Neither do I claim that it’s feasible.)

  5. In response to, " ... as I don't see where the combinatorial explosion would be coming from. "

    I think if you compare the neuronal structure of C. Elegans to Homo Sapiens, the number of dendrites plays a key role here.

    It seems to me that you get a combinatorial explosion from increasing the number of neurons, which increases the need for each neuron to have more dendrites.

    So, rather than just thinking about how many neurons you need to simulate a human brain, you also need to consider the number of synapses.

    1. The human brain has about 10^11 neurons and about 10^14 synapses. That is not the kind of discrepancy that warrants talk of combinatorial explosion.

  6. Torsten Ehrenmark skriver i sina memoarer (antagligen inspirerad av Borges) om hur han grips av högmod inför sin första skrivmaskin (en Halda om jag minns rätt) där han potentiellt kan skriva de största litterära verken - det gäller bara att trycka på tangenterna i rätt ordning.

  7. One thing to always keep in mind when thinking through Superintelligence is the immensely, counterintuitively, hardgraspably large value at stake. It is not *merely* the continued existence of human life on earth. It is the prospect of valuable life in *all* reachable parts of the universe *forever* (i.e. for the duration the universe has features that allow for the existence of beings that instantiate such value) versus bad scenarios such as the entire universe filled with suffering forever or universe with no more sentient life at all anywhere forever.

  8. I’d like to point to an argument that I think has been lacking, at least in the parts of the AGI-discussion that I have seen.

    Part of the problem of AGI is that there is a big difference between computation and reasoning. This is obvious today but was perhaps not quite anticipated in the early days of AI-optimism.

    But there is a kind of “universality” of reasoning too, similar to Turing universality, that we mathematicians take for granted because we have become used to it. The idea, going back to ancient Greece, is that everyone can understand every mathematical result that has ever been established. There are research papers that only experts can read, but there aren’t any results that only smart people can understand.

    To the outside observer, this is manifested in our apparent obsession with proofs. But the reason we insist on proving things is that we often can, which is rather amazing if we pause and think about it.

    This universality of reasoning does not seem to have become “established” like the Church-Turing thesis of computation. Some results of logic even seem to point in the opposite direction. By the Gödel incompleteness theorem we cannot have a complete theory of integer arithmetic. This seems to indicate that there isn’t a single “right” axiomatisation of standard mathematics, a view that becomes strengthened by artificial lists of axioms like Zermelo-Fraenkel and the weird status of the axiom of choice.

    Still, we don’t believe that a superintelligent being could know (using some “super-axioms”) that the Goldbach conjecture is true without being able in principle to explain it in terms of earthly mathematics.

    In my opinion this universality strongly suggests that there is a rather short program that provides a core from which we can start “surfing” the Library of Turing (by adding layers of stuff like Siri, Watson, deep learning or whatever) and eventually reach a monster.

    I’m talking only about mathematical reasoning here, and there is probably some disagreement about how general this sort of reasoning is. But this shouldn’t be relevant here since the standard argument for an intelligence explosion is about a computer program doing math-stuff like writing and analysing computer programs, or perhaps designing hardware.

  9. Douglas Adams speaks of a superintelligent shade of the color blue, which I suppose is supposed to be nonsensical, but it just occurs to me how this might actually be possible. Digitally, a color is represented as a number, for example a combination of the intensities of red, green, and blue. Our standard today is to use eight bits for each, which makes a 24 bit number, not enough to encode a superintelligence, but to encode a precise shade you need arbitrarily many bits. So, if a superintelligence can be encoded digitally, it can be encoded as a shade of blue.