Häggström hävdar: A disturbing parallel between AI futurology and religion

måndag 8 juni 2015

A disturbing parallel between AI futurology and religion

Parallels between AI¹ futurology and religion are usually meant to insult and to avoid serious rational discussion of possible AI futures; typical examples can be found here and here. Such contributions to the discussion are hardly worth reading. It might therefore be tempting to dismiss philosopher John Danaher's paper Why AI doomsayers are like sceptical theists and why it matters just from its title. That may well be a mistake, however, because the point that Danaher makes in the paper seems to be one that merits taking seriously.

Danaher is concerned with a specific and particularly nightmarish AI scenario that Nick Bostrom, in his influential recent book Superintelligence, calls the treacherous turn. To understand that scenario it is useful to recall the theory of an AI's instrumental versus final goals, developed by Bostrom and by computer scientist Steve Omohundro. In brief, this theory states that virtually any ultimate goal is compatible with arbitrarily high levels of intelligence (the orthogonality thesis), and that there are a number of instrumental goals that a sufficiently intelligent AI is likely to set up as a means towards reaching its ultimate goal pretty much regardless of what this ultimate goal is (the instrumental convergence thesis). Such instrumental goals include self-protection, acquisition of hardware and other resources, and preservation of the ultimate goal. (As indicated in Footnote 2 of my earlier blog post Superintelligence odds and ends II, I am not 100% convinced that this theory is right, but I do consider it sufficiently plausible to warrant taking seriously, and it is the best and pretty much only game in town for predicting behavior of superintelligens AIs.)

Another instrumental goal that is likely to emerge in a wide range of circumstances is that as long as humans are still in power, and as long as the AI's ultimate goals are poorly aligned with human values, the AI will try to hide its true intentions. For concreteness, we may consider one of Bostrom's favorite running examples: the paperclip maximizer. Maximizing the production of paperclips may sound like a pretty harmless goal of a machine, but only up to the point where it attains superintelligence and the capacity to take over the world and to consequently turn the entire solar system (or perhaps even the entire observable universe) into a giant heap of paperclips. The goal of the paperclip maximizer is clearly not very well aligned with human values, and as soon as we humans realize what the machine is up to, we will do what we can to pull the plug on it or to prevent its plans by other means. But if the machine is smart, it will forsee this reaction from us, and therefore hide its intention to turn everything (including us) into paperclips, until the moment when it has become so powerful that we can no longer put up any resistance. This is what Bostrom calls the treacherous turn. On p 119 of Superintelligence he gives the following definition.

The treacherous turn:

Or as he puts it two pages earlier:

An unfriendly AI of sufficient intelligence realizes that its unfriendly goals will be best realized if it behaves in a friendly manner initially, so that it will be let out of the box. It will only start behaving in a way that reveals its unfriendly nature when it no longer matters whether we find out; that is, when the AI is strong enough that human opposition is ineffectual. So the essence of the treacherous turn is that an increasingly powerful AI may seem perfectly benevolent, while in actual fact it has plans that run counter to human values, perhaps including our extinction. Things may be very different from what they seem.

And here is where Danaher, in his paper, sees a parallel to skeptical theism. Skeptical theism is the particular answer to the problem of evil - i.e., the problem of how the existence of an omnipotent and omnibenevolent God can be reconciled with the existence of so much terrible evil in the world³ - that points to limits of our mere human understanding, and the fact that something that seems evil to us may have consequences that are very very good but inconceivable to us: God moves in mysterious ways. Danaher explains that according to skeptical theists

beyond-our-ken reasons

So here, too, things may be very different from what they seem. And in both cases - the AI treacherous turn, as well as in the worldview of the skeptical theists - the radical discrepancy between how the world seems to us and how it actually is stems from the same phenomenon: the existence of a more intelligent being who understands the world incomparably better than we do.

So there is the parallel. Does it matter? Danaher says that it does, because he points to huge practical end epistemic costs to the stance of skeptical theism, and suggests analogous costs to belief in the treacherous turn. As an example of such downsides to skeptical theism, he gives the following example of moral paralysis:

Anyone who believes in God and who uses sceptical theism to resist the problem of evil confronts a dilemma whenever they are confronted with an instance of great suffering. Suppose you are walking through the woods one day and you come across a small child, tied to tree, bleeding profusely, clearly in agony, and soon to die. Should you intervene, release the child and call an ambulance? Or should you do nothing? The answer would seem obvious to most: you should intervene. But is it obvious to the sceptical theist? They assume that we don't know what the ultimate moral implications of such suffering really is. For all they know, the suffering of the child could be logically necessary for some greater good, (or not, as the case may be). This leads to a practical dilemma: either they intervene, and possibly (for all they know), frustrate some greater good; or they do nothing and possibly (for all they know) permit some great evil. Anyone who believes we may be the soon-to-be-slaughtered victims of an AI's treacherous turn suffers from similar epistemological and practical dilemmas.

Having come this far in Danaher's paper, I told myself that his parallel is unfair, and that a crucial disanalogy is that while skeptical theists consider humanity to be in this epistemologically hopeless situation right now, Bostrom's treacherous turn is a hypothetical future scenario - it applies only when a superintelligent AI is in existence, which is obviously not the case today.⁴ Surely hypothesizing about the epistemologically desperate situation of people in the future is not the same thing as putting oneself in such a situation - while the latter is self-defeating, the former is not. But then Danaher delivers the following blow:

It could be that an AI will, in the very near future, develop the level of intelligence necessary to undertake such a deceptive project and thereby pose a huge existential risk to our futures. In fact, it is even worse than that. If the AI could deliberately conceal its true intelligence from us, in the manner envisaged by Bostrom, it could be that there already is an AI in existence that poses such a risk. Perhaps Google's self-driving car, or IBM's Watson have already achieved this level of intelligence and are merely biding their time until we release them onto our highways or into our hospitals and quiz show recording studios? After all, it is not as if we have been on the lookout for the moment of the conception of deception (whatever that may look like). If AI risk is something we should take seriously, and if Bostrom’s notion of the treacherous turn is something we should also take seriously, then this would seem to be one of its implications. Regular readers of this blog will know that I am highly influenced by the sort of AI futurology that Bostrom respresents in his book Superintelligence. I therefore find Danaher's parallel highly worrying, and I do not know what to make of it. Danaher doesn't quite know either, but suggests two possible conclusions, "tending in opposite directions":

reductio

a fortiori

Danaher goes on to develop these opposing ideas in his paper, which I recommend reading in full.⁵ And I would very much like to hear what others think about his highly disturbing parallel and what to make of it.

Footnotes

1) AI is short for artificial intelligence.

2) Earlier in the book (p 78), Bostrom defines a singleton as "a world order in which there is at the global level a single decision-making agency". See his 2005 paper What is a singleton.

3) The problem of evil is one that I've repeatedly discussed (and even proposed a solution to) on this blog.

4) This is somewhat analogous to the distinction between, on one hand, chemtrails conspiracy theories (which are purportedly about ongoing events), and, on the other hand, serious writing on geoengineering (which is about possible future technologies). Imagine my irritation when my writings in the latter category are read by chemtrails nutcases as support for ideas in the former category.

5) Footnote added on June 9, 2015: See also Danaher's blog post on the same topic.

7 kommentarer:

John Danaher8 juni 2015 kl. 21:32
It is worth not in that I have slightly weakened my position on this now. I think it is unlikely that any present AI has crossed the threshold (it is still a bare epistemic possibility). Nevertheless, I think if we take Bostrom seriously, we should be very sceptical about our ability to detect when the relevant threshold has been crossed.
SvaraRadera
Svar
Anonym10 juni 2015 kl. 22:18
Göran Lambertz har gett sig på sannolikhetsberäkningar igen...

http://goranlambertz.se/book-extra/bevisvardet-av-hundsokningarna/
SvaraRadera
Svar

Lägg till kommentar