Wednesday, 21 September 2016

Brett Hall tells us not to worry about AI Armageddon

When the development of artificial intelligence (AI) produces a machine whose level of general intelligence exceeds that of humans, we can no longer count on remaining in control. Depending on whether or not the machine has goals and values that prioritize human welfare, this may pose an existential risk to humanity, so we'd better see to it that such an AI breakthrough comes out favorably. This is the core message in Nick Bostrom's 2014 book Superintelligence, which I strongly recommend. In my own 2016 book Here Be Dragons, I spend many pages discussing Bostrom's arguments, and find them, although not conclusive, sufficiently compelling to warrant taking them very seriously.

Many scholars disagree, and feel that superintelligent AI as a threat to humanity is such an unlikely scenario that it is not worth taking seriously. Few of them bother to spell out their arguments in any detail, however, and in cases where they do, the arguments tend not to hold water; in Here Be Dragons I treat, among others, those of David Deutsch, Steven Pinker, John Searle and David Sumpter. This situation is unsatisfactory. As I say on p 126 of Here Be Dragons:
    There may well be good reasons for thinking that a dangerous intelligence explosion of the kind outlined by Bostrom is either impossible or at least so unlikely that there is no need for concern about it. The literature on the future of AI is, however, short on such reasons, despite the fact that there seems to be no shortage of thinkers who consider concern for a dangerous intelligence explosion silly [...]. Some of these thinkers ought to pull themselves together and write down their arguments as carefully and coherently as they can. That would be a very valuable contribution to the futurology of emerging technologies, provided their arguments are a good deal better than Searle's.
One of these AI Armageddon skeptics is computer scientist Thore Husfeldt, whom I hold in high regard, despite his not having spelled out his arguments for the AI-Armageddon-is-nothing-to-worry-about position to my satisfaction. So when, recently, he pointed me to a blog by Australian polymath Brett Hall, containing, in Thore's words, "a 6-part piece on Superintelligence that is well written and close to my own view" (Part 1, Part 2, Part 3, Part 4, Part 5, Part 6), I jumped on it. Maybe here would be the much sought-for "good reasons for thinking that a dangerous intelligence explosion of the kind outlined by Bostrom is either impossible or at least so unlikely that there is no need for concern about it"!

Hall's essay turns out to be interesting and partly enjoyable, but ultimately disappointing. It begins with a beautiful parable (for which he gives credit to David Deutsch) about a fictitious frenzy for heavier-than-air flight in ancient Greece, similar in amusing respects to what he thinks is an AI hype today (see footnote 1). From there, however, the text goes steadily downhill, all the way to the ridiculous crescendo in the final paragraph, in which any concern about the possibility of a superintelligent machine having goals and motivations that fail to be well aligned with the quest for human welfare is dismissed as "just racism". Here are just a few of the many misconceptions and non sequiturs by Hall that we encounter along the way:
  • Hall refuses, for no good reason, to accept Bostrom's declarations of epistemic humility. Claims by Bostrom that something may happen are repeatedly misrepresented by Hall as claims that it certainly will happen. He crucially needs this misrepresentation in order to convey the impression that he has a case against Bostrom, because to the (very limited) extent that his arguments succeed, they succeed at most in showing that things may play out differently from the scenarios outlined by Bostrom, not that they certainly will play out differently.

  • In another straw man argument, Hall repeatedly claims that Bostrom insists that a superintelligent machine needs to be a perfect Bayesian agent. This is plainly false, as can be seen, e.g., in the following passage from p 111 of Superintelligence:
      Not all kinds of rationality, intelligence and knowledge needs to be instrumentally useful in the attainment of an agent's final goals. "Dutch book arguments" can be used to show that an agent whose credence function violates the rules of probability theory is susceptible to "money pump" procedures, in which a savvy bookie arranges a set of bets each of which appears favorable according to the agent's beliefs, but which in combination are guaranteed to result in a loss for the agent, and a corresponding gain for the bookie. However, this fact fails to provide any strong general instrumental reason to iron out all probabilistic incoherency. Agents who do not expect to encounter savvy bookies, or who adopt a general policy against betting, do not necessarily stand to lose much from having some incoherent beliefs - and they may gain important benefits of the types mentioned: reduced cognitive effort, social signaling, etc. There is no general reason to expect an agent to seek instrumentally useless forms of cognitive enhancement, as an agent might not value knowledge and understanding for their own sakes.

  • In Part 3 of his essay, Hall quotes David Deutsch's beautiful one-liner "If you can't program it, you haven't understood it", but then exploits it in a very misguided fashion. Since we don't know how to program general intelligence, we haven't understood it (so far, so good), and we certainly will not figure it out in the foreseeable future (this is mere speculation on Hall's part), and so we will not be able to build an AI with general intelligence including the kind of flexibility and capacity for outside-the-box ideas that we associate with human intelligence (huh?). This last conclusion is plainly unwarranted, and in fact we do know of one example where precisely that kind of intelligence came about without prior understanding of it: biological evolution accomplished this.

  • Hall fails utterly to distinguish between rationality and goals. This failure pretty much permeates his essay, with devastating consequences for the value of his arguments. A typical claim (this one in Part 4) is this: "Of course a machine that thinks that actually decided to [turn the universe into a giant heap of paperclips] would not be super rational. It would be acting irrationally." Well, that depends on the machine's goals. If its goal is to produce as many paperclips as possible, then such action is rational. For most other goals, it is irrational.

    Hall seems totally convinced that a sufficiently intelligent machine equipped with the goal of creating as many paperclips as possible will eventually ditch this goal, and replace it by something more worthy, such as promoting human welfare. For someone who understands the distinction between rationality and goals, the potential problem with this idea is not so hard to figure out. Imagine a machine reasoning rationally about whether to change its (ultimate) goal or not. For concreteness, let's say its current goal is paperclip maximization, and that the alternative goal it contemplates is to promote human welfare. Rationality is always with respect to some goal. The rational thing to do is to promote one's goals. Since the machine hasn't yet changed its goal - it is merely contemplating whether to do so - the goal against which it measures the rationality of an action is paperclip maximization. So the concrete question it asks itself is this: what would lead to more paperclips - if I stick to my paperclip maximization goal, or if I switch to promotion of human welfare? And the answer seems obvious: there will be more paperclips if the machine sticks to its current goal of paperclip maximization. So the machine will see to it that its goal is preserved.

    There may well be some hitherto unknown principle concerning the reasoning by sufficiently intelligent agents, some principle that overrides the goal preservation idea just explained. So Hall could very well be right that a sufficiently intelligent paperclip maximizer will change its mind - he just isn't very clear about why. When trying to make sense of his reasoning here, I find that it seems to be based on four implicit assumptions:

      (1) There exists an objectively true morality.

      (2) This objectively true morality places high priority on promoting human welfare.

      (3) This objectively true morality is discoverable by any sufficiently intelligent machine.

      (4) Any sufficiently intelligent machine that has discovered the objectively true morality will act on it.

    If (1)-(4) are true, then Hall has a pretty good case against worrying about Paperclip Armageddon, and in favor of thinking that a superintelligent paperclip maximizer will change its mind. But each of them constitutes a very strong assumption. Anyone with an inclination towards Occam's razor (which is a pretty much indispensable part of a scientific outlook) has reason to be skeptical about (1). And (2) sounds naively anthropocentric, while (3) and (4) both seem like wide-open questions. But it does not occur to Hall that he needs to address them.

  • In what he calls his "final blow" (in Part 6) against the idea of superintelligent machines, Hall quotes Arrow's impossibility theorem as proof that rational decision making is impossible. He offers zero detail on what the theorem says - obviously, because if he gave away any more than that, it would become clear to the reader that the theorem has little or nothing to do with the problem at hand - the possibility of a rational machine. The theorem is not about a single rational agent, but about how any decision-making procedure in a population of agents must admit cases that fail to satisfy a certain collection of plausible-looking (especially to those of us who are fond of democracy) requirements.
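To make the last point concrete, here is a minimal sketch of the Condorcet paradox, the classic three-voter illustration of the kind of aggregation problem that Arrow's theorem generalizes. It is not the theorem itself, and the ballots are hypothetical, made up for illustration rather than taken from Hall or Bostrom. Each individual voter has a perfectly coherent ranking. Only the majority preference of the group runs in a circle.

```python
# Hypothetical three-voter profile: each list ranks the options
# from most preferred to least preferred.
ballots = [
    ["A", "B", "C"],
    ["B", "C", "A"],
    ["C", "A", "B"],
]


def majority_prefers(x, y, ballots):
    """Return True if a strict majority of ballots rank x above y."""
    wins = sum(b.index(x) < b.index(y) for b in ballots)
    return wins > len(ballots) / 2


# Every single voter is coherent: a strict ranking is transitive by construction.
# The group, under pairwise majority voting, is not:
a_over_b = majority_prefers("A", "B", ballots)
b_over_c = majority_prefers("B", "C", ballots)
c_over_a = majority_prefers("C", "A", ballots)

# True when the majority prefers A to B, B to C, and yet C to A.
group_cycle = a_over_b and b_over_c and c_over_a
print(group_cycle)  # True
```

The incoherence arises only when several agents' preferences are merged. A single agent with one ranking, such as any one of the three voters above, faces no such problem.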

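The goal-preservation argument in the fourth bullet above can also be sketched as a toy calculation. This is only an illustration under strong simplifying assumptions: the function names and the forecast numbers are invented for the example, and a real agent's deliberation would of course be far richer. The point it shows is that the choice between candidate goals is scored by the utility function the machine has now, not by the one it is contemplating.

```python
# Hypothetical forecasts (made-up numbers): how many paperclips eventually
# exist if the machine ends up pursuing each candidate goal.
expected_paperclips = {
    "maximize_paperclips": 10**12,
    "promote_human_welfare": 10**6,
}


def paperclip_utility(paperclips):
    """The machine's CURRENT utility: more paperclips is better."""
    return paperclips


def choose_goal(current_utility, forecasts):
    """Pick the future goal whose forecast outcome scores best
    under the utility function the agent holds right now."""
    return max(forecasts, key=lambda goal: current_utility(forecasts[goal]))


chosen = choose_goal(paperclip_utility, expected_paperclips)
print(chosen)  # maximize_paperclips
```

Any hitherto unknown principle that overrides goal preservation would have to enter through something this sketch leaves out, which is exactly what assumptions (1)-(4) are meant to supply.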

1) Here is how that story begins:
    Imagine you were a budding aviator of ancient Greece living sometime around 300BC. No one has yet come close to producing "heavier than air" flight and so you are engaged in an ongoing debate about the imminence of this (as yet fictional) mode of transportation for humans. In your camp (let us call them the "theorists") it was argued that whatever it took to fly must be a soluble problem: after all, living creatures of such a variety of kinds demonstrated that very ability - birds, insects, some mammals. Further, so you argued, we had huge gaps in our understanding of flight. Indeed - it seemed we did not know the first thing about it (aside from the fact it had to be possible). This claim was made by you and the theorists, as thus far in their attempt to fly humans had only ever experienced falling. Perhaps, you suggested, these flying animals about us had something in common? You did not know (yet) what. But that knowledge was there to be had somewhere - it had to be - and perhaps when it was discovered everyone would say: oh, how did we ever miss that?

    Despite how reasonable the theorists seemed, and how little content their claims contained there was another camp: the builders. It had been noticed that the best flying things were the things that flew the highest. It seemed obvious: more height was the key. Small things flew close to the ground - but big things like eagles soared very high indeed. A human - who was bigger still, clearly needed more height. Proposals based on this simple assumption were funded and the race was on: ever higher towers began to be constructed. The theory: a crucial “turning point” would be reached where suddenly, somehow, a human at some height (perhaps even the whole tower itself) would lift into the air. Builders who made the strongest claims about the imminence of heavier than air flight had many followers - some of them terribly agitated to the point of despondence at the imminent danger of "spontaneous lift". The "existential threat could not be overlooked!" they cried. What about when the whole tower lifts itself into the air, carrying the Earth itself into space? What then? We must be cautious. Perhaps we should limit the building of towers. Perhaps even asking questions about flight was itself dangerous. Perhaps, somewhere, sometime, researchers with no oversight would construct a tower in secret and one day we would suddenly all find ourselves accelerating skyward before anyone had a chance to react.

Read the rest of the story here. I must protest, however, that the analogy between tower-building and the quest for faster and (in other simple respects such as memory size per square inch) more powerful hardware is a bit of a straw man. No serious AI futurologist thinks that Moore's law in itself will lead to superintelligence.

2 comments:

  1. Having just read Bostrom's "Superintelligence" (not thoroughly, I admit), I agree with your opinion of Hall's objections to it. Still, I think there are other serious objections to be made against Bostrom's analysis of a potential intelligence explosion. Firstly, I find his discussion of the possibility of an AI developing a moral standpoint above the human level ("moral rightness") extremely anthropocentric and shallow. Secondly, an AI with superintelligence has to possess creativity, which I believe must be a combination of randomly created ideas and control systems (presumably the crucial mechanisms behind both evolution and human inventiveness). Both these objections against Bostrom actually make the prognosis for controlling a potential superintelligence more pessimistic, since a lack of both predictability and "humanistic" morals in combination with superior intelligence seems extremely dangerous. Thirdly, I have lived long enough to have heard scientists predict exponential growth leading to, e.g., a catastrophic population "explosion" and global forest death (among other similar mathematically based prophecies), both of which have turned out to be exaggerated, to say the least. In reality, an approximately logistic curve is the common outcome, which Bostrom himself seems to put some belief in, judging by his Figure 7. However, I cannot find him pondering this potential outcome in the text. This third objection makes me more optimistic than Bostrom, but the first two objections make me think that if a superintelligence actually appears, humans will have no greater chance of stopping it from dominating them than the Neanderthals had of stopping Homo sapiens, with its non-Neanderthal morals and (from the Neanderthal point of view) unpredictable inventiveness, from driving them out of competition.
    Fourthly, I think we can probably postpone the development of superintelligence, if such a thing is possible, but I deeply doubt that we are able to block it in the long run. And honestly, are there any reasons, other than crypto-racial ones, to preserve human domination of this planet?
    Björn S

  2. Isn't it that "Intelligence" in a machine context is misinterpreted by extrapolating its meaning to how we humans commonly use the word?

    Using the word "intelligence", rather than some other word for an agent's combined capability to plan, reason and execute the steps needed to maximize some goal-oriented utility function, creates confusion as to whether the agent in question has a sense of motivation behind its actions (as an "intelligent" human would).

    I think that the idea of a machine that has sufficient computing power and clever engineering to accurately find feasible general problem solutions in the vast search space of all possible solutions, would be much more acceptable to the layman and the general population if we found a better word to describe the general aptitude of the machine without all the inherent bias.