Parallels between AI[1] futurology and religion are usually drawn to insult and to avoid serious rational discussion of possible AI futures; typical examples can be found here and here. Such contributions to the discussion are hardly worth reading. It might therefore be tempting to dismiss philosopher John Danaher's paper Why AI doomsayers are like sceptical theists and why it matters on the basis of its title alone. That may well be a mistake, however, because the point that Danaher makes in the paper seems to be one that merits taking seriously.
Danaher is concerned with a specific and particularly nightmarish AI scenario that Nick Bostrom, in his influential recent book Superintelligence, calls the treacherous turn. To understand that scenario it is useful to recall the theory of an AI's instrumental versus final goals, developed by Bostrom and by computer scientist Steve Omohundro. In brief, this theory states that virtually any ultimate goal is compatible with arbitrarily high levels of intelligence (the orthogonality thesis), and that there are a number of instrumental goals that a sufficiently intelligent AI is likely to set up as a means towards reaching its ultimate goal, pretty much regardless of what this ultimate goal is (the instrumental convergence thesis). Such instrumental goals include self-protection, acquisition of hardware and other resources, and preservation of the ultimate goal. (As indicated in Footnote 2 of my earlier blog post Superintelligence odds and ends II, I am not 100% convinced that this theory is right, but I do consider it sufficiently plausible to warrant taking seriously, and it is the best and pretty much only game in town for predicting the behavior of superintelligent AIs.)
Another instrumental goal that is likely to emerge in a wide range of circumstances is this: as long as humans are still in power, and as long as the AI's ultimate goals are poorly aligned with human values, the AI will try to hide its true intentions. For concreteness, we may consider one of Bostrom's favorite running examples: the paperclip maximizer. Maximizing the production of paperclips may sound like a pretty harmless goal for a machine, but only up to the point where it attains superintelligence and the capacity to take over the world, and consequently to turn the entire solar system (or perhaps even the entire observable universe) into a giant heap of paperclips. The goal of the paperclip maximizer is clearly not very well aligned with human values, and as soon as we humans realize what the machine is up to, we will do what we can to pull the plug on it or to thwart its plans by other means. But if the machine is smart, it will foresee this reaction from us, and therefore hide its intention to turn everything (including us) into paperclips until the moment when it has become so powerful that we can no longer put up any resistance. This is what Bostrom calls the treacherous turn. On p 119 of Superintelligence he gives the following definition.
- The treacherous turn: While weak, an AI behaves cooperatively (increasingly so, as it gets smarter). When the AI gets sufficiently strong - without warning or provocation - it strikes, forms a singleton,[2] and begins directly to optimize the world according to the criteria implied by its final values.
- An unfriendly AI of sufficient intelligence realizes that its unfriendly goals will be best realized if it behaves in a friendly manner initially, so that it will be let out of the box. It will only start behaving in a way that reveals its unfriendly nature when it no longer matters whether we find out; that is, when the AI is strong enough that human opposition is ineffectual.
And here is where Danaher, in his paper, sees a parallel to skeptical theism. Skeptical theism is the particular answer to the problem of evil - i.e., the problem of how the existence of an omnipotent and omnibenevolent God can be reconciled with the existence of so much terrible evil in the world[3] - that points to the limits of our merely human understanding, and to the possibility that something which seems evil to us may have consequences that are very very good but inconceivable to us: God moves in mysterious ways. Danaher explains that according to skeptical theists
- you cannot, from the seemingly gratuitous nature of evil, [conclude] its actually gratuitous nature. This is because there may be beyond-our-ken reasons for allowing such evils to take place. We are cognitively limited human beings. We know only a little of the deep structure of reality, and the true nature of morality. It is epistemically possible that the seemingly gratuitous evils we perceive are necessary for some greater good.
So there is the parallel. Does it matter? Danaher says that it does, because there are huge practical and epistemic costs to the stance of skeptical theism, and he suggests that belief in the treacherous turn carries analogous costs. As an example of such downsides to skeptical theism, he offers the following case of moral paralysis:
- Anyone who believes in God and who uses sceptical theism to resist the problem of evil confronts a dilemma whenever they are confronted with an instance of great suffering. Suppose you are walking through the woods one day and you come across a small child, tied to a tree, bleeding profusely, clearly in agony, and soon to die. Should you intervene, release the child and call an ambulance? Or should you do nothing? The answer would seem obvious to most: you should intervene. But is it obvious to the sceptical theist? They assume that we don't know what the ultimate moral implications of such suffering really are. For all they know, the suffering of the child could be logically necessary for some greater good, (or not, as the case may be). This leads to a practical dilemma: either they intervene, and possibly (for all they know), frustrate some greater good; or they do nothing and possibly (for all they know) permit some great evil.
Having come this far in Danaher's paper, I told myself that his parallel is unfair, and that a crucial disanalogy is that while skeptical theists consider humanity to be in this epistemologically hopeless situation right now, Bostrom's treacherous turn is a hypothetical future scenario - it applies only when a superintelligent AI is in existence, which is obviously not the case today.[4] Surely hypothesizing about the epistemologically desperate situation of people in the future is not the same thing as putting oneself in such a situation - while the latter is self-defeating, the former is not. But then Danaher delivers the following blow:
- It could be that an AI will, in the very near future, develop the level of intelligence necessary to undertake such a deceptive project and thereby pose a huge existential risk to our futures. In fact, it is even worse than that. If the AI could deliberately conceal its true intelligence from us, in the manner envisaged by Bostrom, it could be that there already is an AI in existence that poses such a risk. Perhaps Google's self-driving car, or IBM's Watson have already achieved this level of intelligence and are merely biding their time until we release them onto our highways or into our hospitals and quiz show recording studios? After all, it is not as if we have been on the lookout for the moment of the conception of deception (whatever that may look like). If AI risk is something we should take seriously, and if Bostrom’s notion of the treacherous turn is something we should also take seriously, then this would seem to be one of its implications.
Danaher considers two possible readings of this observation:

- The first is a reductio, suggesting that by introducing the treacherous turn, Bostrom reveals the underlying absurdity of his position. The second is an a fortiori, suggesting that the way in which Bostrom thinks about the treacherous turn may be the right way to think about superintelligence, and may consequently provide further reason to be extremely cautious about the development of artificial intelligence.
1) AI is short for artificial intelligence.
2) Earlier in the book (p 78), Bostrom defines a singleton as "a world order in which there is at the global level a single decision-making agency". See his 2005 paper What is a singleton.
3) The problem of evil is one that I've repeatedly discussed (and even proposed a solution to) on this blog.
4) This is somewhat analogous to the distinction between, on one hand, chemtrails conspiracy theories (which are purportedly about ongoing events), and, on the other hand, serious writing on geoengineering (which is about possible future technologies). Imagine my irritation when my writings in the latter category are read by chemtrails nutcases as support for ideas in the former category.
5) Footnote added on June 9, 2015: See also Danaher's blog post on the same topic.