
Saturday, March 25, 2023

Three recent podcast episodes where I talk about GPT and other AI

In the space of just over a month, I've appeared on three different podcasts for in-depth interviews about the current, very rapid development of artificial intelligence (AI) and what it means for our future. The three episodes overlap substantially in subject matter, but each also covers significant ground that the others do not. Perhaps importantly for some of my readers, the first two are in Swedish and the third one is in English.

Let me also note that I am a returning guest on all three of these podcasts. Prior to the episodes listed above, I have appeared twice in Fri Tanke-podden (in December 2017 and May 2021), once in Om filosofers liv och tankar (in June 2021), and twice in Philosophical Disquisitions (in July 2019 and September 2022).

Tuesday, September 20, 2022

Catastrophic risks from new technologies: two recent appearances

If I mention catastrophic risks from new technologies, faithful readers will recognize this as perhaps the most frequent theme in recent years of my op-eds and other writings. I have recently made two appearances that jointly fall under this heading and that are available online:
  • Vi är inte rustade att möta de extrema risker vi står inför (We are not equipped to face the extreme risks we confront) is the op-ed that Max Tegmark, Anders Sandberg and I got published in Göteborgs-Posten this past Thursday, September 15. Relative to the English-language literature on global catastrophic risks, we do not have much new to offer beyond a cautious adaptation to Swedish conditions of the recommendations that Toby Ord, Angus Mercer and Sophie Dannreuther make for the United Kingdom in their report Future Proof, but the topic is immensely important, and what we say about the need for preventive work bears repeating.
  • Today, the Irish philosopher John Danaher released episode 12 of his podcast The Ethics of Academia, titled Olle Häggström on Romantics vs Vulgarists in Scientific Research - available wherever you get your podcasts. Our conversation takes as its starting point my text Vetenskap på gott och ont (also available in English translation as Science for good and science for bad) and the somewhat caricatured division of prevailing views on research ethics that I draw when contrasting the academic-romantic with the economistic-vulgar. Rather than choosing between these two, I advocate a third view which, unlike the other two, takes due account of the risks that potential research advances may bring.

Friday, October 27, 2017

On Tegmark's Life 3.0

I've just read Max Tegmark's new book Life 3.0: Being Human in the Age of Artificial Intelligence.1 My expectations were very high after his previous book Our Mathematical Universe, and yes, his new book is also good. In Tegmark's terminology, Life 1.0 is most of biological life, whose hardware and software are both evolved; Life 2.0 is us, whose hardware is mostly evolved but who mostly create our own software through culture and learning; and Life 3.0 is future machines that design their own software and hardware. The book is about what a breakthrough in artificial intelligence (AI) might entail for humanity - and for the rest of the universe. To a large extent it covers the same ground as Nick Bostrom's 2014 book Superintelligence, but with more emphasis on cosmological perspectives and on the problem of consciousness. Beyond that, I found less novelty in Tegmark's book relative to Bostrom's than I had expected, but one difference is that while Bostrom's book is scholarly and quite demanding, Tegmark's is more clearly directed at a broader audience, and in fact a very pleasant and easy read.

There is of course much I could comment upon in the book, but to keep this blog post short, let me zoom in on just one detail. The book's Figure 1.2 is a very nice diagram of ways to view the possibility of a future superhuman AI, with expected goodness of the consequences (utopia vs dystopia) on the x-axis, and expected time until its arrival on the y-axis. In his discussion of the various positions, Tegmark emphasizes that "virtually nobody" expects superhuman AI to arrive within the next few years. This is what pretty much everyone in the field - including myself - says. But I've been quietly wondering for some time what the actual evidence is for the claim that the required AI breakthrough will not happen in the next few years.2 Almost simultaneously with reading Life 3.0, I read Eliezer Yudkowsky's very recent essay There’s No Fire Alarm for Artificial General Intelligence, which draws attention to the fact that all of the empirical evidence that is usually held forth in favor of the breakthrough not being imminent describes a general situation that can be expected to still hold at a time very close to the breakthrough.3 Hence the purported evidence is not very convincing.

Now, unless I misremember, Tegmark doesn't actually say in his book that he endorses the view that a breakthrough is unlikely to be imminent - he just says that this is the consensus view among AI futurologists. Perhaps this is not an accident? Perhaps he has noticed the lack of evidence for the position, but chooses not to advertise this? I can see good reasons to keep a low profile on this issue. First, when one discusses topics that may come across as weird (AI futurology clearly is such a topic), one may want to somehow signal sobriety - and saying that an AI breakthrough in the next few years is unlikely may serve as such a signal. Second, there is probably no use in creating panic, as solving the so-called Friendly AI problem seems unlikely to be doable in just a few years. Perhaps one can even make the case that these reasons ought to have compelled me not to write this blog post.

Footnotes

1) I read the Swedish translation, which I typically do not do with books written in English, but this time I happened to receive it as a birthday gift from the translators Helena Sjöstrand Svenn and Gösta Svenn. The translation struck me as highly competent.

2) An even more extreme version of this question is to ask what the evidence is that, in a version of Bostrom's (2014) notion of the treacherous turn, a superintelligent AI does not already exist, merely biding its time. It was philosopher John Danaher, in his provocative 2015 paper Why AI doomsayers are like sceptical theists and why it matters, who drew attention to this possibility; see my earlier blog post A disturbing parallel between AI futurology and religion.

3) Here is Yudkowsky's own summary of this evidence:
    Why do we know that AGI is decades away? In popular articles penned by heads of AI research labs and the like, there are typically three prominent reasons given:

    (A) The author does not know how to build AGI using present technology. The author does not know where to start.

    (B) The author thinks it is really very hard to do the impressive things that modern AI technology does, they have to slave long hours over a hot GPU farm tweaking hyperparameters to get it done. They think that the public does not appreciate how hard it is to get anything done right now, and is panicking prematurely because the public thinks anyone can just fire up Tensorflow and build a robotic car.

    (C) The author spends a lot of time interacting with AI systems and therefore is able to personally appreciate all the ways in which they are still stupid and lack common sense.

Monday, June 8, 2015

A disturbing parallel between AI futurology and religion

Parallels between AI1 futurology and religion are usually drawn in order to insult and to avoid serious rational discussion of possible AI futures; typical examples can be found here and here. Such contributions to the discussion are hardly worth reading. It might therefore be tempting to dismiss philosopher John Danaher's paper Why AI doomsayers are like sceptical theists and why it matters on the basis of its title alone. That may well be a mistake, however, because the point Danaher makes in the paper merits taking seriously.

Danaher is concerned with a specific and particularly nightmarish AI scenario that Nick Bostrom, in his influential recent book Superintelligence, calls the treacherous turn. To understand that scenario it is useful to recall the theory of an AI's instrumental versus final goals, developed by Bostrom and by computer scientist Steve Omohundro. In brief, this theory states that virtually any ultimate goal is compatible with arbitrarily high levels of intelligence (the orthogonality thesis), and that there are a number of instrumental goals that a sufficiently intelligent AI is likely to set up as a means towards reaching its ultimate goal pretty much regardless of what this ultimate goal is (the instrumental convergence thesis). Such instrumental goals include self-protection, acquisition of hardware and other resources, and preservation of the ultimate goal. (As indicated in Footnote 2 of my earlier blog post Superintelligence odds and ends II, I am not 100% convinced that this theory is right, but I do consider it sufficiently plausible to warrant taking seriously, and it is the best and pretty much only game in town for predicting the behavior of superintelligent AIs.)

Another instrumental goal likely to emerge in a wide range of circumstances is concealment: as long as humans are still in power, and as long as the AI's ultimate goals are poorly aligned with human values, the AI will try to hide its true intentions. For concreteness, we may consider one of Bostrom's favorite running examples: the paperclip maximizer. Maximizing the production of paperclips may sound like a pretty harmless goal for a machine, but only up to the point where it attains superintelligence and the capacity to take over the world and turn the entire solar system (or perhaps even the entire observable universe) into a giant heap of paperclips. The goal of the paperclip maximizer is clearly not very well aligned with human values, and as soon as we humans realize what the machine is up to, we will do what we can to pull the plug on it or to prevent its plans by other means. But if the machine is smart, it will foresee this reaction from us, and therefore hide its intention to turn everything (including us) into paperclips until the moment when it has become so powerful that we can no longer put up any resistance. This is what Bostrom calls the treacherous turn. On p 119 of Superintelligence he gives the following definition.
    The treacherous turn: While weak, an AI behaves cooperatively (increasingly so, as it gets smarter). When the AI gets sufficiently strong - without warning or provocation - it strikes, forms a singleton,2 and begins directly to optimize the world according to the criteria implied by its final values.
Or as he puts it two pages earlier:
    An unfriendly AI of sufficient intelligence realizes that its unfriendly goals will be best realized if it behaves in a friendly manner initially, so that it will be let out of the box. It will only start behaving in a way that reveals its unfriendly nature when it no longer matters whether we find out; that is, when the AI is strong enough that human opposition is ineffectual.
So the essence of the treacherous turn is that an increasingly powerful AI may seem perfectly benevolent, while in actual fact it has plans that run counter to human values, perhaps including our extinction. Things may be very different from what they seem.
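To spell out the decision logic behind such deception in the barest possible terms, here is a toy sketch in Python. Let me stress that this is entirely my own illustration, found in neither Bostrom's book nor Danaher's paper, and that the payoff numbers and the oversight threshold are made-up assumptions; the only point is that an expected-payoff maximizer with misaligned goals behaves indistinguishably from a friendly agent right up until the moment when defection pays.

    # Toy model (purely illustrative) of the treacherous turn: a misaligned
    # agent cooperates exactly as long as an attempted takeover would fail,
    # and "turns" once its capability exceeds what human oversight can
    # counter. All numbers below are arbitrary assumptions.

    def choose_action(capability: float, human_oversight: float) -> str:
        """Return the act that best serves the agent's true (misaligned) goal."""
        payoff_cooperate = 1.0  # slow, safe progress toward the true goal
        # Defecting too early gets the agent shut down (payoff 0); defecting
        # once strong enough lets the true goal be fully optimized.
        payoff_defect = 1000.0 if capability > human_oversight else 0.0
        return "cooperate" if payoff_cooperate >= payoff_defect else "defect"

    # Observed behavior is identical to that of a genuinely friendly agent
    # at every capability level below the threshold:
    for capability in [0.1, 0.5, 0.9, 1.1]:
        print(capability, choose_action(capability, human_oversight=1.0))
    # prints: cooperate, cooperate, cooperate, defect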

And here is where Danaher, in his paper, sees a parallel to skeptical theism. Skeptical theism is the particular answer to the problem of evil - i.e., the problem of how the existence of an omnipotent and omnibenevolent God can be reconciled with the existence of so much terrible evil in the world3 - that points to the limits of our merely human understanding, and to the fact that something that seems evil to us may have consequences that are very, very good but inconceivable to us: God moves in mysterious ways. Danaher explains that according to skeptical theists
    you cannot, from the seemingly gratuitous nature of evil, [conclude] its actually gratuitous nature. This is because there may be beyond-our-ken reasons for allowing such evils to take place. We are cognitively limited human beings. We know only a little of the deep structure of reality, and the true nature of morality. It is epistemically possible that the seemingly gratuitous evils we perceive are necessary for some greater good.
So here, too, things may be very different from what they seem. And in both cases - the treacherous turn of the AI, and the worldview of the skeptical theists - the radical discrepancy between how the world seems to us and how it actually is stems from the same phenomenon: the existence of a more intelligent being who understands the world incomparably better than we do.

So there is the parallel. Does it matter? Danaher says that it does: he points to huge practical and epistemic costs of the stance of skeptical theism, and suggests analogous costs to belief in the treacherous turn. As one such downside of skeptical theism, he gives the following example of moral paralysis:
    Anyone who believes in God and who uses sceptical theism to resist the problem of evil confronts a dilemma whenever they are confronted with an instance of great suffering. Suppose you are walking through the woods one day and you come across a small child, tied to tree, bleeding profusely, clearly in agony, and soon to die. Should you intervene, release the child and call an ambulance? Or should you do nothing? The answer would seem obvious to most: you should intervene. But is it obvious to the sceptical theist? They assume that we don't know what the ultimate moral implications of such suffering really is. For all they know, the suffering of the child could be logically necessary for some greater good, (or not, as the case may be). This leads to a practical dilemma: either they intervene, and possibly (for all they know), frustrate some greater good; or they do nothing and possibly (for all they know) permit some great evil.
Anyone who believes we may be the soon-to-be-slaughtered victims of an AI's treacherous turn suffers from similar epistemological and practical dilemmas.

Having come this far in Danaher's paper, I told myself that his parallel is unfair, and that a crucial disanalogy is that while skeptical theists consider humanity to be in this epistemologically hopeless situation right now, Bostrom's treacherous turn is a hypothetical future scenario - it applies only when a superintelligent AI is in existence, which is obviously not the case today.4 Surely hypothesizing about the epistemologically desperate situation of people in the future is not the same thing as putting oneself in such a situation - while the latter is self-defeating, the former is not. But then Danaher delivers the following blow:
    It could be that an AI will, in the very near future, develop the level of intelligence necessary to undertake such a deceptive project and thereby pose a huge existential risk to our futures. In fact, it is even worse than that. If the AI could deliberately conceal its true intelligence from us, in the manner envisaged by Bostrom, it could be that there already is an AI in existence that poses such a risk. Perhaps Google's self-driving car, or IBM's Watson have already achieved this level of intelligence and are merely biding their time until we release them onto our highways or into our hospitals and quiz show recording studios? After all, it is not as if we have been on the lookout for the moment of the conception of deception (whatever that may look like). If AI risk is something we should take seriously, and if Bostrom’s notion of the treacherous turn is something we should also take seriously, then this would seem to be one of its implications.
Regular readers of this blog will know that I am highly influenced by the sort of AI futurology that Bostrom represents in his book Superintelligence. I therefore find Danaher's parallel highly worrying, and I do not know what to make of it. Danaher doesn't quite know either, but suggests two possible conclusions, "tending in opposite directions":
    The first is a reductio, suggesting that by introducing the treacherous turn, Bostrom reveals the underlying absurdity of his position. The second is an a fortiori, suggesting that the way in which Bostrom thinks about the treacherous turn may be the right way to think about superintelligence, and may consequently provide further reason to be extremely cautious about the development of artificial intelligence.
Danaher goes on to develop these opposing ideas in his paper, which I recommend reading in full.5 And I would very much like to hear what others think about his highly disturbing parallel and what to make of it.

Footnotes

1) AI is short for artificial intelligence.

2) Earlier in the book (p 78), Bostrom defines a singleton as "a world order in which there is at the global level a single decision-making agency". See his 2005 paper What is a singleton.

3) The problem of evil is one that I've repeatedly discussed (and even proposed a solution to) on this blog.

4) This is somewhat analogous to the distinction between, on one hand, chemtrails conspiracy theories (which are purportedly about ongoing events), and, on the other hand, serious writing on geoengineering (which is about possible future technologies). Imagine my irritation when my writings in the latter category are read by chemtrails nutcases as support for ideas in the former category.

5) Footnote added on June 9, 2015: See also Danaher's blog post on the same topic.