- There may well be good reasons for thinking that a dangerous intelligence explosion of the kind outlined by Bostrom is either impossible or at least so unlikely that there is no need for concern about it. The literature on the future of AI is, however, short on such reasons, despite the fact that there seems to be no shortage of thinkers who consider concern for a dangerous intelligence explosion silly [...]. Some of these thinkers ought to pull themselves together and write down their arguments as carefully and coherently as they can. That would be a very valuable contribution to the futurology of emerging technologies, provided their arguments are a good deal better than Searle's.
- Hall refuses, for no good reason, to accept Bostrom's declarations of epistemic humility. Claims by Bostrom that something may happen are repeatedly misrepresented by Hall as claims that it certainly will happen. He crucially needs this misrepresentation in order to convey the impression that he has a case against Bostrom, because to the (very limited) extent that his arguments succeed, they succeed at most in showing that things may play out differently from the scenarios outlined by Bostrom, not that they certainly will play out differently.
- In another straw man argument, Hall repeatedly claims that Bostrom insists that a superintelligent machine needs to be a perfect Bayesian agent. This is plainly false, as can be seen, e.g., in the following passage from p. 111 of Superintelligence:
- Not all kinds of rationality, intelligence and knowledge needs to be instrumentally useful in the attainment of an agent's final goals. "Dutch book arguments" can be used to show that an agent whose credence function violates the rules of probability theory is susceptible to "money pump" procedures, in which a savvy bookie arranges a set of bets each of which appears favorable according to the agent's beliefs, but which in combination are guaranteed to result in a loss for the agent, and a corresponding gain for the bookie. However, this fact fails to provide any strong general instrumental reason to iron out all probabilistic incoherency. Agents who do not expect to encounter savvy bookies, or who adopt a general policy against betting, do not necessarily stand to lose much from having some incoherent beliefs - and they may gain important benefits of the types mentioned: reduced cognitive effort, social signaling, etc. There is no general reason to expect an agent to seek instrumentally useless forms of cognitive enhancement, as an agent might not value knowledge and understanding for their own sakes.
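For readers unfamiliar with the "money pump" mentioned in the passage, here is a minimal sketch in Python. The specific numbers (a credence of 3/5 in both A and not-A, and a ticket price of 59/100) are my own illustrative assumptions and appear nowhere in Bostrom's text; the point is only to show how bets that each look favorable can add up to a sure loss.

```python
from fractions import Fraction

# Hypothetical incoherent agent: its credences in A and in not-A sum to 6/5,
# which violates the rules of probability theory. All numbers are made up.
CREDENCE = {"A": Fraction(3, 5), "not A": Fraction(3, 5)}

PAYOUT = Fraction(1)        # each ticket pays 1 if its event occurs
PRICE = Fraction(59, 100)   # the bookie's asking price per ticket


def expected_gain(event):
    """Gain the agent expects from buying one ticket, by its own credences."""
    return CREDENCE[event] * PAYOUT - PRICE


def net_result(a_is_true):
    """Agent's actual net gain after buying both tickets, in a given world."""
    winnings = sum(PAYOUT for event in CREDENCE if (event == "A") == a_is_true)
    return winnings - PRICE * len(CREDENCE)


if __name__ == "__main__":
    for event in CREDENCE:
        print(event, "looks worth", expected_gain(event), "to the agent")
    for world in (True, False):
        print("A is", world, "-> net result", net_result(world))
```

Each ticket looks like a small gain to the agent, yet exactly one ticket pays out whichever way A turns out, so the agent ends up with a loss in both cases. An agent that never meets such a bookie, or that refuses to bet, never runs `net_result` at all, which is the point of the quoted passage.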
- In Part 3 of his essay, Hall quotes David Deutsch's beautiful one-liner "If you can't program it, you haven't understood it", but then exploits it in a very misguided fashion. Since we don't know how to program general intelligence, we haven't understood it (so far, so good), and we certainly will not figure it out within any foreseeable future (this is mere speculation on Hall's part), and so we will not be able to build an AI with general intelligence including the kind of flexibility and capacity for outside-the-box ideas that we associate with human intelligence (huh?). This last conclusion is plainly unwarranted, and in fact we do know of one example where precisely that kind of intelligence came about without prior understanding of it: biological evolution accomplished this.
- Hall fails utterly to distinguish between rationality and goals. This failure pretty much permeates his essay, with devastating consequences for the value of his arguments. A typical claim (this one in Part 4) is this: "Of course a machine that thinks that actually decided to [turn the universe into a giant heap of paperclips] would not be super rational. It would be acting irrationally." Well, that depends on the machine's goals. If its goal is to produce as many paperclips as possible, then such action is rational. For most other goals, it is irrational.
Hall seems totally convinced that a sufficiently intelligent machine equipped with the goal of creating as many paperclips as possible will eventually ditch this goal, and replace it with something more worthy, such as promoting human welfare. For someone who understands the distinction between rationality and goals, the potential problem with this idea is not so hard to figure out. Imagine a machine reasoning rationally about whether to change its (ultimate) goal or not. For concreteness, let's say its current goal is paperclip maximization, and that the alternative goal it contemplates is to promote human welfare. Rationality is always with respect to some goal. The rational thing to do is to promote one's goals. Since the machine hasn't yet changed its goal - it is merely contemplating whether to do so - the goal against which it measures the rationality of an action is paperclip maximization. So the concrete question it asks itself is this: what would lead to more paperclips - if I stick to my paperclip maximization goal, or if I switch to promotion of human welfare? And the answer seems obvious: there will be more paperclips if the machine sticks to its current goal of paperclip maximization. So the machine will see to it that its goal is preserved.
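The goal preservation argument can be sketched as a toy decision problem. Everything in the code below is a hypothetical illustration of my own: the forecast numbers are invented, and a real agent would of course not have such a tidy table of outcomes. What the sketch shows is only the structure of the argument, namely that the choice between policies is scored by the goal the agent has now.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Outcome:
    paperclips: float
    human_welfare: float


# Made-up forecasts of where each policy leads; only their ordering matters.
FORECAST: Dict[str, Outcome] = {
    "keep paperclip goal": Outcome(paperclips=1_000_000, human_welfare=0),
    "switch to welfare goal": Outcome(paperclips=10, human_welfare=1_000_000),
}


def paperclip_utility(outcome: Outcome) -> float:
    return outcome.paperclips


def welfare_utility(outcome: Outcome) -> float:
    return outcome.human_welfare


def choose(current_utility: Callable[[Outcome], float],
           forecast: Dict[str, Outcome]) -> str:
    """Pick the policy whose forecast outcome scores best under the agent's
    CURRENT utility function, not under the one it is contemplating."""
    return max(forecast, key=lambda policy: current_utility(forecast[policy]))


if __name__ == "__main__":
    print(choose(paperclip_utility, FORECAST))
```

With `paperclip_utility` as the current goal, `choose` returns "keep paperclip goal". Passing `welfare_utility` instead returns the other policy, but that only describes an agent that already cares about human welfare, which is precisely what the paperclip maximizer does not.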
There may well be some hitherto unknown principle concerning the reasoning of sufficiently intelligent agents, some principle that overrides the goal preservation idea just explained. So Hall could very well be right that a sufficiently intelligent paperclip maximizer will change its mind - he just isn't very clear about why. When trying to make sense of his reasoning here, I find that it seems to be based on four implicit assumptions:
- (1) There exists an objectively true morality.
(2) This objectively true morality places high priority on promoting human welfare.
(3) This objectively true morality is discoverable by any sufficiently intelligent machine.
(4) Any sufficiently intelligent machine that has discovered the objectively true morality will act on it.
- In what he calls his "final blow" (in Part 6) against the idea of superintelligent machines, Hall invokes Arrow's impossibility theorem as proof that rational decision making is impossible. He offers zero detail on what the theorem says - obviously, because if he gave away any more than that, it would become clear to the reader that the theorem has little or nothing to do with the problem at hand, namely the possibility of a rational machine. The theorem is not about a single rational agent, but about how any decision-making procedure in a population of agents must admit cases that fail to satisfy a certain collection of plausible-looking (especially to those of us who are fond of democracy) requirements.
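To see why this is a statement about populations and not about individual agents, consider the classic Condorcet paradox, which is a standard special case of the kind of trouble Arrow's theorem generalizes (it is not the theorem itself). The three voters and their rankings in the sketch below are the usual textbook example, chosen by me for illustration.

```python
from itertools import permutations

OPTIONS = ("A", "B", "C")

# Three voters, each with a perfectly coherent ranking (best option first).
VOTERS = [
    ("A", "B", "C"),
    ("B", "C", "A"),
    ("C", "A", "B"),
]


def prefers(ranking, x, y):
    """True if this single voter ranks x above y."""
    return ranking.index(x) < ranking.index(y)


def majority_prefers(x, y):
    """True if a strict majority of the voters rank x above y."""
    return sum(prefers(r, x, y) for r in VOTERS) > len(VOTERS) / 2


def is_transitive(beats):
    """Check that x beats y and y beats z always implies x beats z."""
    return all(
        beats(x, z) or not (beats(x, y) and beats(y, z))
        for x, y, z in permutations(OPTIONS, 3)
    )


if __name__ == "__main__":
    for r in VOTERS:
        print(r, "transitive:", is_transitive(lambda x, y: prefers(r, x, y)))
    print("majority transitive:", is_transitive(majority_prefers))
```

Every individual voter passes the transitivity check, while pairwise majority voting does not: a majority prefers A to B and B to C, yet a majority also prefers C to A. The incoherence arises only in the aggregation step, so it says nothing against the possibility of a single agent, human or machine, having coherent preferences.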
- Imagine you were a budding aviator of ancient Greece living sometime around 300BC. No one has yet come close to producing "heavier than air" flight and so you are engaged in an ongoing debate about the imminence of this (as yet fictional) mode of transportation for humans. In your camp (let us call them the "theorists") it was argued that whatever it took to fly must be a soluble problem: after all, living creatures of such a variety of kinds demonstrated that very ability - birds, insects, some mammals. Further, so you argued, we had huge gaps in our understanding of flight. Indeed - it seemed we did not know the first thing about it (aside from the fact it had to be possible). This claim was made by you and the theorists, as thus far in their attempt to fly humans had only ever experienced falling. Perhaps, you suggested, these flying animals about us had something in common? You did not know (yet) what. But that knowledge was there to be had somewhere - it had to be - and perhaps when it was discovered everyone would say: oh, how did we ever miss that?
Despite how reasonable the theorists seemed, and how little content their claims contained there was another camp: the builders. It had been noticed that the best flying things were the things that flew the highest. It seemed obvious: more height was the key. Small things flew close to the ground - but big things like eagles soared very high indeed. A human - who was bigger still, clearly needed more height. Proposals based on this simple assumption were funded and the race was on: ever higher towers began to be constructed. The theory: a crucial “turning point” would be reached where suddenly, somehow, a human at some height (perhaps even the whole tower itself) would lift into the air. Builders who made the strongest claims about the imminence of heavier than air flight had many followers - some of them terribly agitated to the point of despondence at the imminent danger of "spontaneous lift". The "existential threat could not be overlooked!" they cried. What about when the whole tower lifts itself into the air, carrying the Earth itself into space? What then? We must be cautious. Perhaps we should limit the building of towers. Perhaps even asking questions about flight was itself dangerous. Perhaps, somewhere, sometime, researchers with no oversight would construct a tower in secret and one day we would suddenly all find ourselves accelerating skyward before anyone had a chance to react.