With the publication in Axess 6/2014 of my review (which I advertised in two blog posts in July) of Nick Bostrom's Superintelligence: Paths, Dangers, Strategies, I am now ready to present it to the readers of this blog.1,2 The book is, in my humble opinion, terribly important, and it deserves to be as widely discussed as possible. Hoping to stimulate such discussion, I decided to translate the review into English; my translation is given below. The issues I wanted to discuss vastly outnumber what I could reasonably fit into the review, and for this reason I will, during the next week or so, follow up this review with a handful of further blog posts devoted to what I will call "Superintelligence odds and ends".
In Axess 3/2014, readers were treated to a theme section on future robotization and its consequences for society. Machines replacing human labor is of course not a new phenomenon, and we have always found new tasks for human workers at about the same rate as machines have taken over the old ones (albeit with some variation in the booms and recessions of the economy). Today, however, things are happening faster than ever, and it is not clear what to expect when advances in artificial intelligence (AI) cause the automation not only of manual labor, but also of an increasing number of increasingly advanced intellectual tasks. On a basic level, our liberation from the hardship of labor is a good thing, letting us focus instead on art, culture, sports, love or whatever we wish to fill our lives with, but can the transition to such a utopia be accomplished without negative social consequences of monstrous proportions? These are some of the issues discussed by the eight writers of the Axess theme section, with Erik Brynjolfsson's and Andrew McAfee's important and influential recent book The Second Machine Age as the starting point.
There is, however, a longer and more radical perspective on AI, where far greater values than merely a turbulent labor market are at stake. Earlier this year, physicist Stephen Hawking and three coauthors wrote in The Independent that "whereas the short-term impact of AI depends on who controls it, the long-term impact depends on whether it can be controlled at all", and they warned that while "it's tempting to dismiss the notion of highly intelligent machines as mere science fiction, [...] this would be a mistake, and potentially our worst mistake in history".
As early as 1951, Alan Turing, the father of modern computer science, anticipated in his essay Intelligent Machinery, A Heretical Theory how machines would eventually reach superhuman intelligence levels and then quickly take control of the world. Since then, the field has been haunted by a series of overly optimistic and, in retrospect, somewhat embarrassing predictions about how soon an AI breakthrough could be expected. No AI with a general intelligence on the level of a human or higher is anywhere to be seen. On the other hand, AI research has made impressive advances in more specialized areas. Nowadays, computers beat even the strongest human opposition in chess as well as in the quiz game Jeopardy. Google, with their driverless cars' flawless performance both on highways and in city traffic, has achieved results that only ten years ago would have been considered utopian. The list of examples goes on and on, but what makes it easy to underestimate the achievements of AI research is the phenomenon that AI pioneer John McCarthy summarized by saying that "as soon as it works, no one calls it AI anymore".
What, then, can we expect from the promised breakthrough? There are theoretical arguments supporting the conclusion that an AI exceeding human general intelligence can eventually be built, and that from that point onwards, the development will escalate very quickly, towards levels of superintelligence leaving all human cognitive abilities far behind. What makes the latter development likely are variations of the following feedback mechanism: once an AI has a general intelligence exceeding ours, it is also better than us at building AI, so it will be in a position to build an even better AI, and so on in a spiral climbing towards higher and higher intelligence levels, in what is sometimes called the Singularity.
Few, if any, have thought more deeply and systematically about these issues than the Swedish-born Oxford philosopher Nick Bostrom. In his groundbreaking new book Superintelligence: Paths, Dangers, Strategies, he emphasizes what a crucial turning point in human history an AI breakthrough may become. On the one hand, the creation of a superintelligence has the potential to solve all our problems and to give us all that we wish for. On the other hand, a breakthrough can turn out to be extremely dangerous, and in the worst case lead to the extinction of humanity. A central message in the book is the importance of thinking things through very carefully and taking suitable action before the breakthrough hits us, because when it does it may well be too late.
Talk of the danger of AI may prompt us to think about drones and other military technology, but Bostrom emphasizes that even an AI with seemingly harmless tasks may bring about disaster. His example is a machine designed to produce paperclips. If such a machine becomes the seed of an intelligence explosion, then, unless we have planned it with extreme caution, it may well result in the entire solar system (including ourselves) being turned into a grotesque heap of paperclips.
No reliable timelines are available for when the big AI breakthrough can be expected - it is even hard to say which of the years 2030 or 2200 is the more realistic prediction. More generally, making concrete predictions about what a breakthrough will entail is a very shaky undertaking. Bostrom maintains an exemplary attitude of epistemic humility when he subjects even the most obvious-sounding arguments - his own and others' - to his utmost scrutiny, and there is hardly a prediction or a recommendation in the book that isn't followed by a qualification along the lines of "on the other hand, it might be that...". In the preface he stresses that many of his points are likely to be outright wrong (although he is unable to tell which ones). And then he adds the following.
- This is no false modesty: for while I believe that my book is likely to be seriously wrong and misleading, I think that the alternative views that have been presented in the literature are substantially worse - including the default view, or "null hypothesis", according to which we can for the time being safely or reasonably ignore the prospect of superintelligence.
A cornerstone of the book is the chapter in which Bostrom argues for two theses which he calls, respectively, the orthogonality thesis and the instrumental convergence thesis. To understand these, we need to distinguish between the AI's means and its ends. Its ends consist in its ultimate drive, while its means are whatever instrumental drives serve as tools to support its ultimate drive. The orthogonality thesis states that superintelligence is compatible with pretty much any ultimate drive, be it paperclip production or the maximization of the total hedonistic level (pleasure minus suffering) of all sentient beings in the entire universe. The instrumental convergence thesis states that there are a number of instrumental drives that any sufficiently intelligent AI will tend to acquire, almost regardless of its ultimate drive. An example is the desire not to be turned off, because an AI that is turned off will not be able to do anything to promote its ultimate aim. For similar reasons, an AI can be expected to wish to improve its intelligence, to copy its code to other machines, and to take control of as much hardware and other resources as possible. In addition, it will want to preserve its ultimate drive.
The problem we need to solve, according to Bostrom, is what he calls the control problem: how do we turn the intelligence explosion into a controlled detonation, with consequences that are beneficial to humanity? To do so, we have to instill the right drives in the AI (and to do so before it reaches superintelligence levels, because by then it will not allow such tampering). The more one looks at this challenge, the less innocuous and the more difficult it seems, and even the slightest mistake may trigger complete disaster. For example, an AI equipped with the ultimate drive of doing things that make us happy may decide to rewire our brains in such a way that we are constantly happy regardless of circumstances.
The difficulties that are lined up in chapter after chapter may eventually be too much for some readers, and cause them to throw up their hands and declare the situation hopeless. But Bostrom refuses to give in, convinced as he is that he is working on one of the most important problems ever encountered, and that it needs to be solved. Yet he cannot do it on his own. Unfortunately, among the many thousands of researchers working on AI today, only a tiny fraction show a serious interest in the control problem. Bostrom is eager to change this, and speaks of "a research challenge worthy of some of the next generation's best mathematical talent".
There are a number of recent books, other than Bostrom's, that treat the intelligence explosion and related radical AI scenarios. These include Singularity Rising by economist James Miller, and Our Final Invention by documentary filmmaker James Barrat. The latter especially is written in a more popular and less technical language than Superintelligence. Bostrom makes up for this with his extraordinary sagacity and clarity, enabling him to combine his wide-ranging knowledge over an impressively broad spectrum of disciplines - engineering, natural sciences, medicine, social sciences and philosophy - into a comprehensible whole. For anyone content with reading just one of these books, I do not hesitate to recommend Superintelligence. If this book gets the reception that it deserves, it may turn out to be the most important alarm bell since Rachel Carson's Silent Spring from 1962, or ever.
1) The editors of Axess gave my review the headline En ren kontrollfråga, which translates into English as Purely a matter of control. The word ren (pure) here is a bit misplaced, possibly in an attempt to create a (rather pointless) pun that I do not care to explain here.
2) Among other reviews and comments on Bostrom's book, two of the most interesting ones I've come across so far are Max Tegmark's recent manuscript Friendly Artificial Intelligence: the Physics Challenge and Robin Hanson's blog post I Still Don’t Get Foom. Tegmark gives reasons why Bostrom's control problem may be even more difficult than suggested in the book, while Hanson expresses skepticism about the very rapid ascension from human-level artificial intelligence that Bostrom calls "fast takeoff" and that many others call "the Singularity". Hanson's argument, which is clearly worthy of serious attention, echoes his reasoning in The Hanson-Yudkowsky AI Foom Debate (which I reviewed here).