Tuesday, 15 October 2013

Reading the Hanson-Yudkowsky debate

Robin Hanson and Eliezer Yudkowsky are, in my opinion, two of the brightest, most original and most important thinkers of our time. Sadly, they are almost entirely ignored in mainstream discourse. They both consider it fairly likely that the ongoing computer revolution will continue and, at some point in the next 100 years or so, cause profound changes to society and to what it means to be human. They differ drastically, however, in their estimates of how likely various more detailed scenarios are. Hanson expects scenarios where we learn how to emulate human brains as computer software, thus enabling us to cheaply copy our minds onto computers, quickly causing vast changes to the labor market and to the economy as a whole. Yudkowsky gives more credence to even more radically transformative scenarios of the kind that are typically labeled intelligence explosion or the Singularity, where we humans eventually manage to construct an AI (an artificial intelligence) that is superior to us in terms of general intelligence, including the construction of AIs. Such an AI quickly manages to create an even smarter AI, and so on in a rapidly escalating spiral, resulting within perhaps weeks or even hours in an AI so smart that it can take over the world.

The Hanson-Yudkowsky AI-Foom Debate is a recently published 730-page ebook, provided free of charge by the Machine Intelligence Research Institute (founded in 2000 and the brainchild of Yudkowsky). The bulk of the book consists of a debate between Hanson and Yudkowsky that took place on the blog Overcoming Bias during (mostly) November-December 2008. The core topic of the debate is the likelihood of the intelligence explosion scenario outlined by Yudkowsky. Hanson considers this scenario quite unlikely, while Yudkowsky considers it reasonably likely provided that civilisation does not collapse first (by a nuclear holocaust or by any of a host of other dangers), but they are both epistemically humble and open to the possibility of being simply wrong. They mostly alternate as authors of the blog posts presented in the book, with just a couple of critical posts by other authors (Carl Shulman and James Miller) interspersed in the first half of the debate. The blog posts are presented mostly chronologically, except in a few cases where chronology is sacrificed in favor of logical flow.1 The bulk is followed by some supplementary material, including (a) a transcript of an oral debate between Hanson and Yudkowsky conducted in June 2011, (b) a 56-page summary by Kaj Sotala of the debate (both the blog exchange in 2008 and the oral debate in 2011), and (c) Yudkowsky's 2013 manuscript Intelligence Explosion Microeconomics, which I've recommended in an earlier blog post. Sotala's summary is competent and fair, but adds very little to the actual debate, so to be honest I don't quite see the point of including it in the book. Yudkowsky's manuscript, on the other hand, is very much worth reading after going through the debate; while it does repeat many of the points raised there, the arguments are put a bit more rigorously and systematically than in the more relaxed style of the debate, and Yudkowsky offers quite a few additional insights that he has gained since 2008 and 2011.

My overall assessment of The Hanson-Yudkowsky AI-Foom Debate is strongly favorable. What we get is an engaging dialogue between two exceptionally sharp minds. And the importance of their topic of discussion can hardly be overestimated. While the possibility of an intelligence explosion is far from the only lethal danger to humanity in the 21st century that we need to be aware of, it is certainly one of those that we very much need to take seriously and figure out how to tackle. I warmly - no, insistently - recommend The Hanson-Yudkowsky AI-Foom Debate to anyone who has above-average academic capabilities and who cares deeply about the future of humanity.

So, given the persistent disagreement between Hanson and Yudkowsky on the likelihood of an intelligence explosion, who is right and who is wrong? I don't know. It is not obvious that this question has a clear-cut answer, and of course it may well be that they are both badly wrong about the likely future. That said, throughout most of the book I find myself somewhat more convinced by Yudkowsky's arguments than by Hanson's. The following are some of my more specific reactions to the book.
  • Hanson and Yudkowsky are both driven by an eager desire to understand each other's arguments and to pinpoint the source of their disagreement. This drives them to bring a good deal of meta into their discussion, which I think (in this case) is mostly a good thing. The discussion gains an extra edge from the facts that they both aspire to be (as best they can) rational Bayesian agents and that they are both well-acquainted with Robert Aumann's Agreeing to Disagree theorem, which states that whenever two rational Bayesian agents have common knowledge of each other's estimates of the probability of some event, their estimates must in fact agree.2 Hence, as long as Hanson's and Yudkowsky's disagreement persists, this is a sign that at least one of them is irrational. Yudkowsky in particular tends to obsess over this.

  • All scientific reasoning involves both induction and deduction, although the proportions can vary quite a bit. Concerning the difference between Hanson's and Yudkowsky's respective styles of thinking, what strikes me most is that Hanson relies much more on induction,3 whereas Yudkowsky is much more willing to engage in deductive reasoning. When Hanson reasons about the future, he prefers to find some empirical trend in the past and to extrapolate it into the future. Yudkowsky engages much more in mechanistic explanations of various phenomena, and in combining them in order to deduce other (and hitherto unseen) phenomena.4 (This difference in thinking styles is close or perhaps even identical to what Yudkowsky calls the "outside" versus "inside" views.)

  • Yudkowsky gave, in his very old (1996 - he was a teenager then) paper Staring into the Singularity, a beautiful illustration based on Moore's law of the idea that a self-improving AI might take off towards superintelligence very fast:
      If computing speeds double every two years, what happens when computer-based AIs are doing the research?

      Computing speed doubles every two years.
      Computing speed doubles every two years of work.
      Computing speed doubles every two subjective years of work.

      Two years after Artificial Intelligences reach human equivalence, their speed doubles. One year later, their speed doubles again.

      Six months - three months - 1.5 months ... Singularity.

    The simplicity of this argument makes it tempting to put forth when explaining the idea of an intelligence explosion to someone who is new to it. I have often done so, but have come to think it may be a pedagogical mistake, because it is very easy for an intelligent layman to come up with compelling arguments against the possibility of the suggested scenario, and then walk away from the issue thinking (erroneously) that he/she has shot down the whole intelligence explosion idea. In Chapter 5 of the book (taken from the 2008 blog post The Weak Inside View), Yudkowsky takes care to distance himself from his 1996 argument. It is naive to just extrapolate Moore's law indefinitely into the future, and, perhaps more importantly, Yudkowsky holds software improvements to be a more likely main driver of an intelligence explosion than hardware improvements. Related to this is the notion of hardware overhang discussed by Yudkowsky in Chapter 33 (Recursive Self-Improvement) and elsewhere: at the time when the self-improving AI emerges, there will likely exist huge amounts of poorly protected hardware available via the Internet, so why not simply go out and pick that up rather than taking the cumbersome trouble of inventing better hardware?
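
    As a rough check on the arithmetic in the 1996 passage quoted above (a back-of-the-envelope sketch, not a calculation taken from Yudkowsky's paper): if every doubling takes two subjective years and each doubling halves the calendar time needed for the next one, the calendar times form a geometric series with a finite sum. That finiteness is what turns "very fast growth" into a Singularity in the quoted argument. The symbols $t_k$ (calendar years needed for doubling number $k$) and $T$ (total calendar time) are notation introduced only for this illustration.

    ```latex
    t_k = 2 \cdot 2^{-k} \quad (k = 0, 1, 2, \dots), \qquad
    T = \sum_{k=0}^{\infty} t_k = 2 \sum_{k=0}^{\infty} 2^{-k} = \frac{2}{1 - \tfrac{1}{2}} = 4 \text{ years}.
    ```

    Under these idealized assumptions, infinitely many doublings fit into the four calendar years following human equivalence, which also makes plain how much the conclusion depends on Moore's law holding indefinitely.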

  • In Chapter 18 (Surprised by Brains), Yudkowsky begins...
      Imagine two agents who've never seen an intelligence - including, somehow, themselves - but who've seen the rest of the universe up until now, arguing about what these newfangled "humans" with their "language" might be able to do
    ...and then goes on to give a very entertaining dialogue between his own alter ego and a caricature of Hanson, concerning what this funny new thing down on Earth might lead to. The point of the dialogue is to answer the criticism that it is extremely shaky to predict something (an intelligence explosion) that has never happened before - a point made by showing that new and surprising things have happened before. I really like this chapter, but cannot quite shake off a nagging feeling that the argument is similar to the so-called Galileo gambit, a favorite ploy among pseudoscientists:
      They made fun of Galileo, and he was right.
      They make fun of me, therefore I am right.

  • The central concept that leads Yudkowsky to predict an intelligence explosion is the new positive feedback introduced by recursive self-improvement. But it isn't really new, says Hanson, recalling (in Chapter 2: Engelbart as UberTool?) the case of Douglas Engelbart, his 1962 paper Augmenting Human Intellect: A Conceptual Framework, and his project to create computer tools (many of which are commonplace today) that would improve the power and efficiency of human cognition. Take word processing as an example. Writing is a non-negligible part of R&D, so if we get an efficient word processor, we will get (at least a bit) better at R&D, so we can then devise an even better word processor, and so on. The challenge here to Yudkowsky is this: Why hasn't the invention of the word processor triggered an intelligence explosion, and why is the word processor case different from the self-improving AI feedback loop?

    An answer to the last question might be that the writing part of the R&D process is not really all that crucial to the process as a whole, taking up maybe just 2% of the time involved, as opposed to the stuff going on in the AI's brain, which makes up maybe 90% of the R&D work. In the word processor case, no more than 2% improvement is possible, and after each iteration the percentage decreases, quickly fizzling out to undetectable levels. But is there really a big qualitative difference between 0.02 and 0.9 here? Won't the 90% part of the R&D taking place inside the AI's brain similarly fizzle out after a number of iterations of the feedback loop, with other factors (external logistics) taking over the role of dominant bottleneck? Perhaps not, if the improved AI brain figures out ways to improve the external logistics as well. But then again, why doesn't that same argument apply to word processing? I think this is an interesting criticism from Hanson, and I'm not sure how conclusively Yudkowsky has answered it.
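
The fizzling-out question in the last bullet can be made concrete with a toy model. The sketch below is not from the book; it is an Amdahl's-law-style illustration under loudly stated assumptions: a fixed fraction of R&D time is sped up by the tool, the remaining fraction (the "external logistics") never improves, and after each round the tool itself gets faster in proportion to the overall R&D speedup. The function names and the starting speedup of 2 are hypothetical choices made for this example.

```python
from typing import List


def overall_speedup(fraction: float, tool_speedup: float) -> float:
    """Amdahl's-law speedup of the whole R&D process.

    `fraction` is the share of R&D time the tool accelerates;
    `tool_speedup` is how much faster that share has become.
    The remaining share (1 - fraction) is assumed never to improve.
    """
    return 1.0 / ((1.0 - fraction) + fraction / tool_speedup)


def feedback_loop(fraction: float, rounds: int,
                  initial_tool_speedup: float = 2.0) -> List[float]:
    """Return the overall R&D speedup observed in each round.

    Assumption (for illustration only): faster R&D feeds back into
    the tool, multiplying its speedup by the current overall speedup.
    """
    tool_speedup = initial_tool_speedup
    history = []
    for _ in range(rounds):
        gain = overall_speedup(fraction, tool_speedup)
        history.append(gain)
        tool_speedup *= gain  # better R&D yields a better tool
    return history


if __name__ == "__main__":
    for fraction in (0.02, 0.9):
        gains = feedback_loop(fraction, rounds=10)
        ceiling = 1.0 / (1.0 - fraction)
        print(f"fraction={fraction}: last gain={gains[-1]:.4f}, "
              f"ceiling={ceiling:.4f}")
```

In this model the tool's own speedup grows without bound, yet the overall speedup can never exceed 1 / (1 - fraction): roughly 1.02 for the word processor's 2% and 10 for the AI's 90%. So under the fixed-bottleneck assumption the difference between 0.02 and 0.9 is one of degree, and a genuine explosion requires the loop to shrink the non-accelerated share as well, which is exactly the point on which the bullet leaves the question open.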


1) The editing is gentle but nontrivial, so it is slightly strange that there is no mention of who did it (or even a signature (or two) indicating who wrote the foreword). It might be that Hanson and Yudkowsky did it together, but I doubt it; a prime suspect (if I may speculate) is Luke Muehlhauser.

2) At least, this is what the theorem says given what seems to be Hanson's and Yudkowsky's idea of what a rational Bayesian agent is. I'm not sure I buy into this, because one of the assumptions going into Aumann's theorem is that both agents are born with the same prior. On one hand, I can have some sympathy with the idea that two agents that are exposed to the exact same evidence should have the same beliefs, implying identical priors. On the other hand, as long as no one is able to pinpoint what exactly the objectively correct prior to be born with is, I see no compelling reason for giving the verdict "at least one of them is irrational" as soon as we encounter two agents with different priors. The only suggestion I am aware of for a prior to elevate to this universal objective correctness status is the so-called Solomonoff prior, but the theoretical arguments in its support are unconvincing and the practical difficulties in implementing it are (most likely) insurmountable.

3) A cornerstone in Hanson's study of the alternative future mentioned in the first paragraph above is an analysis of the growth of the global economy (understood in a suitably wide sense) in the past few million years or so. This growth, he finds, is mostly exponential at a constant rate, except at a very small number of disruptive events when the growth rate makes a more-or-less sudden jump; these events are the emergence of the modern human brain, the invention of farming, and the beginning of the industrial era. From the relative magnitudes and timing of these events, he tries to predict a fourth one - the switch to an economy based on uploaded minds. In my view, as a statistician, this is not terribly convincing - the data set is too small, and there is too little in terms of mechanistic explanations that suggest a continued regularity. Nevertheless, his ideas are very interesting and very much worth discussing. He does so in a draft book that has not been made publicly available but only circulated privately to a number of critical readers (including yours truly). Prospective readers may try to contact Hanson directly. See also the enlightening in-depth YouTube discussion on this topic between Hanson and Nikola Danaylov.

4) Although I realize that the analogy is grossly unfair to Hanson, I am reminded of the incessant debate between climate scientists and climate denialists. A favorite method of the denialists is to focus on a single source of data (most often a time series of global average temperature), ignore everything else, and proclaim that the evidence in favor of continued warming is statistically inconclusive. Climate scientists, on the other hand, are much more willing to look at the mechanisms underlying weather and climate, and to deductively combine these into computer models that provide scenarios for future climate - models that the denialists tend to dismiss as mere speculation. (Come to think of it, this comparison is highly unfair not only to Hanson, but even more so to climate science, whose computer models are incomparably better validated, and stand on incomparably firmer ground, than Yudkowsky's models of an intelligence explosion.)

4 comments:

  1. I did the editing (and the conversion to LaTeX) but did not write the foreword. I'm glad you like it!

  2. Olle,

    You know, a positive outlook on life improves your quality of life substantially. All this doomsday gloom that you seem to be so fond of with regard to technological advancements does little more than make your life a misery.

    1. Thank you, Anonymous 21:09, for your concern. Two points of disagreement, however:

      1. I care deeply about the future, I am convinced that it is possible for us at present to influence the future, and I want us to do so in such a way as to improve the odds of a favorable outcome. I am furthermore convinced that in order to do so, we should strive to attain beliefs about the world that are as realistic as possible given the available evidence - in sharp contrast to your suggestion of striving for beliefs that make us feel good.

      2. My life is not a misery.