- The present
paper is concerned with scenarios where AI development
has succeeded in creating a machine that is superintelligent, in the sense of vastly outperforming humans across the full range of cognitive skills that we associate with intelligence, including
prediction, planning and the elusive quality we speak of as creativity. [...]
While important, [...] matters of timing and suddenness of AI development will mostly be abstracted away in the present paper, in order to focus on issues about what happens next, once a superintelligent machine has been created. A widely accepted thesis in contemporary AI futurology is that we humans can then no longer expect to be in control of our own destiny, which will instead be up to the machine. [...]
This leads to the crucial issue of what the superintelligent machine will want – what will it be motivated to do? The question is extraordinarily difficult and any answer at present is bound to come with a high degree of uncertainty – to a large extent due to our limited understanding of how a superintelligent machine might function, but also because the answer may depend on what we choose to do during the development stage, up to the point when we lose control. [...]
So our inexperience with superintelligence puts us in a situation where any reasoning about such a machine’s goals and motivations need to be speculative to a large degree. Yet, we are not totally in the dark, and the discussion need not be totally speculative and ungrounded. The main (and pretty much only) theoretical framework available today for grounding our reasoning on this topic is what in an earlier publication I decided to call the Omohundro—Bostrom theory on instrumental vs final AI goals [...]. The two cornerstones of the theory are what Bostrom (2012) dubbed the Orthogonality Thesis (OT) and the Instrumental Convergence Thesis (ICT), which together give nontrivial and (at least seemingly) useful predictions – not about what will happen, but about what might plausibly happen under various circumstances. The OT and the ICT are, however, not precise and definite on the level that mathematical theorems can be: they are not written in stone, and they have elements of vagueness and tentativeness. The purpose of the present paper is to discuss some reasons to doubt the two theses – not with the intent of demonstrating that they are wrong or useless, but mostly to underline that further work is needed to evaluate their validity and range of applicability.