
Tuesday, July 6, 2021

Artificial general intelligence and the common sense argument

Consider the distinction between narrow AI and artificial general intelligence (AGI), the former being AI specialized in a limited range of capabilities and tasks, and the latter denoting a hypothetical future AI technology exhibiting the flexibility and the full range of cognitive capabilities that we associate with human intelligence. Timelines for when (if ever) AGI will be built are highly uncertain [MB,GSDZE], but scientific debate on the issue has become more pressing in recent years, following work on the possible catastrophic consequences of an AGI breakthrough unless sufficient care is taken to align the machine’s goals with human values [Y08,Bo,Ru,H].

A common argument for long or infinite timelines until AGI is to point to some example where present-day AI performs badly compared to humans, and to take this as evidence of a lack of “common sense” and an indication that constructing AGI is far beyond our current capabilities, whence it cannot plausibly become reality anytime soon [Bks,W,P18a,P18b]. We have all seen YouTube videos of robots tripping over their toes used for this purpose, and a recent viral case concerned an AI meant to steer a TV camera to track the ball during a football game, which instead tracked the head of a bald linesman [V]. Such utter lack of common sense! An implicit or sometimes explicit conclusion tends to be that concerns about future machines taking control of the world are overblown and can be dismissed.

I believe that the central position assigned to the concept of AGI in the AI futurology and AI safety literature has, via the following implicit narrative, given undue credence to the idea that AI existential risk is exclusively an issue concerning the far future. For AI to become AGI, we need to equip it with all of the myriad competences we associate with common sense, one after the other, and only then will it be ready to enter the spiral of self-improvement that has been suggested may lead to an intelligence explosion, superintelligence, and the ability to take over the world [G,Bo,H].

But common sense is a vague notion – it tends to be simply a label we put on any ability where humans still outperform AI. Hence, by the very definition of AGI, we will be able to point to common sense failures of AI up until the very moment AGI is created, so the argument that such failures show that AGI is a long way off is unsound; see [Y17] for a similar observation.

More importantly, however: AGI may not be necessary for the self-improvement spiral to take off. A better narrative is the following. We are already at a stage where AIs are better than humans at some cognitive tasks while humans remain better than AIs at others. For instance, AIs are better at chess [Si,SR], while humans are better at tracking a football [V]. A human pointing to the football example to show that AIs lack common sense has no more force than an AI pointing to chess to show that humans lack common sense. Neither fully dominates the other in terms of cognitive capabilities. Still, AI keeps improving, and rather than focusing on full dominance of AI (as encouraged by the AGI concept), the right question to ask is what range of cognitive capacities an AI needs to master in order to be able to seize power from humans. Self-improvement (i.e., constructing advanced AI) is probably a key competence here, but what does that entail?

There is probably little reason to be concerned about, say, the AlphaZero breakthrough [Si,SR], as the domain of two-player zero-sum finite board games with perfect information is unlikely to be the most significant part of the intelligence spectrum. More significant, probably, is the use of language, as a large fraction of what humans do to control the world is done via language acts. Language is also a direction of AI that has seen dramatic improvements in the last few years with large-scale language models like GPT-2 and GPT-3 [RWAACBD,Bwn,So,KEWGMI]. With the launch of GPT-3, it was stressed that no new methods were employed compared to GPT-2: the improved performance was solely a consequence of brute-force upscaling. Nobody knows where the ceiling is for such upscaling, and it is therefore legitimate to ask whether the crucial ingredient to make AI’s ability to control the world surpass that of humans might lie in the brute-force expansion of such language models.

If it seems counterintuitive that an AI would attain the ability to take over the world without first having mastered common sense, consider that this happens all the time for smaller tasks. As an illustrative small-scale example, take the game of Sokoban, where the player moves around in a maze with the task of pushing a collection of boxes to a set of prespecified locations. An AI system for solving Sokoban puzzles was recently devised and exhibited superhuman abilities, but apparently without acquiring the common sense insight that every human player quickly picks up: if you push a box into a corner, it will be stuck there forever [FDS,PS]. No clear argument has been given for why efficient problem solving lacking in common sense would be impossible for a bigger task like taking over the world.
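
To make the missing common sense insight concrete, here is a minimal Python sketch of the corner-deadlock rule that human players internalize within minutes of first encountering the game. The grid encoding ('#' for walls, '$' for a box standing on an ordinary floor square) is a hypothetical illustration, not the representation used in [FDS]; the point is only how simple the rule is to state.

    # A box on a non-goal square with a wall on one vertical side and one horizontal
    # side can never be pushed again, so the puzzle is lost from that point on.
    def is_corner_deadlock(grid, row, col):
        """Return True if the box at (row, col) is wedged in a non-goal corner."""
        if grid[row][col] != '$':      # '$' marks a box on an ordinary floor square;
            return False               # a box already on a goal square is not a deadlock
        wall_up_or_down = grid[row - 1][col] == '#' or grid[row + 1][col] == '#'
        wall_left_or_right = grid[row][col - 1] == '#' or grid[row][col + 1] == '#'
        return wall_up_or_down and wall_left_or_right

    level = ["#####",
             "#$ .#",   # '$' = box, '.' = goal square
             "#   #",
             "#####"]
    print(is_corner_deadlock(level, 1, 1))   # True: this box can never reach the goal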

To summarize, I believe that while the notion of AGI has been helpful in the early formation of ideas in the futurology and philosophy of AI, it has also been a source of some confusion, especially via the common sense argument discussed above. The AI existential risk literature might therefore benefit from a less narrow focus on the AGI concept. Such a shift of focus may already be underway, as in the recent authoritative research agenda [CK], where talk of AGI is almost entirely abandoned in favor of other notions such as prepotent AI, meaning AI systems whose deployment would be at least as transformative to the world as humanity itself and unstoppable by humans. I welcome this shift in discourse.

References

[Bo] Bostrom, N. (2014) Superintelligence: Paths, Dangers, Strategies, Oxford University Press, Oxford.

[Bks] Brooks, R. (2017) The seven deadly sins of AI prediction, MIT Technology Review, October 6.

[Bwn] Brown, T. et al (2020) Language models are few-shot learners, https://arxiv.org/abs/2005.14165

[CK] Critch, A. and Krueger, D. (2020) AI research considerations for human existential safety (ARCHES), https://arxiv.org/abs/2006.04948

[FDS] Feng, D., Gomes, C. and Selman, B. (2020) A novel automated curriculum strategy to solve hard Sokoban planning instances, 34th Conference on Neural Information Processing Systems (NeurIPS 2020).

[G] Good, I.J. (1966) Speculations concerning the first ultraintelligent machine, Advances in Computers 6, 31-88.

[GSDZE] Grace, K., Salvatier, J., Dafoe, A., Zhang, B. and Evans, O. (2017) When will AI exceed human performance? Evidence from AI experts, https://arxiv.org/abs/1705.08807

[H] Häggström, O. (2021) Tänkande maskiner: Den artificiella intelligensens genombrott, Fri Tanke, Stockholm.

[KEWGMI] Kenton, Z., Everitt, T., Weidinger, L., Gabriel, I., Mikulik, V. and Irving, G. (2021) Alignment of language agents, https://arxiv.org/abs/2103.14659

[MB] Müller, V. and Bostrom, N. (2016) Future progress in artificial intelligence: A survey of expert opinion, in Fundamental Issues of Artificial Intelligence (ed. V. Müller), Springer, Berlin, p 554-571.

[PS] Perry, L. and Selman, B. (2021) Bart Selman on the promises and perils of artificial intelligence, Future of Life Institute Podcast, May 21.

[P18a] Pinker, S. (2018) Enlightenment Now: The Case for Reason, Science and Humanism, Viking, New York.

[P18b] Pinker, S. (2018) We’re told to fear robots. But why do we think they’ll turn on us? Popular Science, February 14.

[RWAACBD] Radford A., Wu, J., Amodei, D., Amodei, D., Clark, J., Brundage, M. and Sutskever, I. (2019) Better language models and their implications, OpenAI, February 14, https://openai.com/blog/better-language-models/

[Ru] Russell, S. (2019) Human Compatible: Artificial Intelligence and the Problem of Control, Viking, New York.

[SR] Sadler, M. and Regan, N. (2019) Game Changer: AlphaZero’s Ground-Breaking Chess Strategies and the Promise of AI, New In Chess, Alkmaar, NL.

[Si] Silver, D. et al (2018) A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play, Science 362, 1140-1144.

[So] Sotala, K. (2020) I keep seeing all kinds of crazy reports about people's experiences with GPT-3, so I figured that I'd collect a thread of them, Twitter, July 15.

[V] Vincent, J. (2020) AI camera operator repeatedly confuses bald head for soccer ball during live stream, The Verge, November 3.

[W] Waters, R. (2018) Why we are in danger of overestimating AI, Financial Times, February 5.

[Y08] Yudkowsky, E. (2008) Artificial intelligence as a positive and negative factor in global risk, in Global Catastrophic Risks (ed. N. Bostrom and M. Cirkovic), Oxford University Press, Oxford, p 308–345.

[Y17] Yudkowsky, E. (2017) There’s no fire alarm for artificial general intelligence, Machine Intelligence Research Institute, Berkeley, CA.

Tuesday, June 8, 2021

Tendentious reporting in DN on the lab-leak hypothesis

On Saturday, DN reported on the discussions about whether the covid-19 pandemic may have originated in a lab leak from the Wuhan Institute of Virology (WIV), and how Anthony Fauci (whom DN calls "the Anders Tegnell of the United States") has come under fire as a result. The article offers the following passage:
    Another ingredient that is a real treat for the conspiracy theorists is that the Wuhan lab has received international funding for its research - including from the infectious disease agency that Anthony Fauci heads - and it is also claimed that some research was conducted aimed at modifying viruses.
Two word choices here make me react. First, "claimed". What do you mean, "claimed"? That gain-of-function research1 has been conducted at WIV is not some loose allegation made by unreliable pundits, but a fact that the WIV researchers themselves have proudly publicized. See, for instance, the 2015 Nature Medicine article A SARS-like cluster of circulating bat coronaviruses shows potential for human emergence, coauthored by WIV researchers led by Shi Zhengli.

Second, "the conspiracy theorists". DN's reporter Juan Flores does his best to portray lab-leak discussions as conspiracy theories. But that gain-of-function research takes place at virus laboratories here and there, including at WIV, is, as noted, indisputable, as is the fact that history has seen many incidents (not least in connection with Soviet bioweapons programs) where dangerous pathogens have leaked from biolabs. Furthermore, the lack of transparency on the part of Chinese authorities is an uncontroversial fact. If we put these things together with the uncertain state of the evidence (most likely there is not a single person who knows the virus's origin with certainty) and the all-too-human tendency toward wishful thinking and blame-shifting, it is easy to see that the lab-leak hypothesis does not require any conspiracies or secret deals in smoke-filled rooms.

Yet that is how the lab-leak hypothesis was portrayed throughout 2020: as a conspiracy theory. As a result, the topic became more or less taboo in respectable circles.2 Setting the tone here was the Lancet article from early 2020, with a number of heavyweight names on the author list, which proclaimed that "We stand together to strongly condemn conspiracy theories suggesting that COVID-19 does not have a natural origin".3

In 2021, however, the tone has shifted, largely thanks to a couple of thorough articles - by Nicholson Baker in New York Magazine and by Nicholas Wade in the Bulletin of the Atomic Scientists - which point both to the weaknesses of the evidence put forward in the Lancet article and to substantial other evidence that points more toward a lab leak.4 A growing number of serious thinkers and commentators now regard the lab-leak hypothesis as at least as likely as the hypothesis of a natural origin,5 and the taboo can now be considered sufficiently broken to allow an open-minded and serious discussion of the question. But DN has not kept up, and continues to wave the conspiracy-theory stamp.

I would very much like DN to reconsider on this point, because the question of the covid virus's origin matters. It is not primarily about historical trivia, or about settling the much-publicized bet between Martin Rees and Steven Pinker, or even about pointing fingers at whoever may be to blame, but about putting ourselves in the best possible epistemic position to prevent the next pandemic, which may well turn out to be far worse than covid-19.

Footnotes

1) Gain-of-function research refers to the modification of viruses or other pathogens to make them more transmissible or more dangerous. The intent is to learn more about what mutations in nature might accomplish and thereby improve our preparedness for future outbreaks. Within xrisk research, among other fields, many of us nevertheless consider the risk of lab leaks to make gain-of-function research unacceptably dangerous. I touched briefly on the matter in my 2016 book Here Be Dragons, and expressed satisfaction that the Obama administration in 2014 had decided to ban federal funding of gain-of-function research, but that ban was lifted a few years later under Trump (and Fauci), and in any case sharper legislation than that is obviously needed to ensure that reckless research of this kind comes to an end.

2) In less respectable circles, by contrast, anything went, which may have further contributed to discrediting the lab-leak hypothesis among respectable people.

3) The article also states "We declare no competing interests", which is misleading given that its initiator, Peter Daszak, has strong ties to the gain-of-function research at WIV.

4) Those who prefer to read in Swedish can instead turn to Ola Wong's article in Kvartal.

5) Among those who in recent weeks have even dared to state a subjective Bayesian probability for the lab-leak hypothesis are, among others, myself (55%), Nate Silver (60%), and Eliezer Yudkowsky (80%).

Wednesday, January 30, 2019

Some notes on Pinker's response to Phil Torres

The day before yesterday I published my blog post Steven Pinker misleads systematically about existential risk, whose main purpose was to direct the reader to my friend and collaborator Phil Torres' essay Steven Pinker's fake enlightenment. Pinker has now written a response to Phil's essay, and had it published on Jerry Coyne's blog Why Evolution is True. The response is feeble. Let me expand a little bit on that.

After a highly undignified opening paragraph with an uncharitable and unfounded speculation about Phil's motives for writing the essay,1 Pinker goes on throughout most of his response to explain, regarding all of the quotes that he exhibits in his book Enlightenment Now and that Phil points out are taken out of context and misrepresent the various authors' intentions, that... well, that it doesn't matter that they are misrepresentations, because what he (Pinker) needed was words to illustrate his ideas, and for that it doesn't matter what the original authors meant. He suggests that "Torres misunderstands the nature of quotation". So why, then, doesn't Pinker use his own words (he is, after all, one of the most eloquent science writers of our time)? Why does he take this cumbersome detour via other authors? If he doesn't actually care what these authors mean, then the only reason I can see for including all these quotes and citations is that Pinker wants to convey to his readers the misleading impression that he is familiar with the existential risk literature and that this literature supports his views.

The most interesting case discussed in Phil's essay and Pinker's response concerns AI researcher Stuart Russell. In Enlightenment Now, Pinker places Russell in the category of "AI experts who are publicly skeptical" that "high-level AI pose[s] the threat of 'an existential catastrophe'." Everyone who has actually read Russell knows that this characterization is plain wrong, and that he in fact takes the risk of an existential catastrophe caused by an AI breakthrough extremely seriously. Phil points this out in his essay, but Pinker insists. In his response, Pinker quotes Russell as saying that "there are reasons for optimism", as if that quote were a demonstration of Russell's skepticism. The quote is taken from Russell's answer to the 2015 Edge question - an eight-paragraph answer that, if one reads it from beginning to end rather than merely zooming in on the phrase "there are reasons for optimism", makes it abundantly clear that to Russell, existential AI risk is a real concern. What, then, does "there are reasons for optimism" mean? It introduces a list of ideas for things we could do to avert the existential risk that AI poses. Proposing such ideas is not the same thing as denying the risk.

It seems to me that this discussion is driven by two fundamental misunderstandings on Pinker's part. First, he has this straw man image in his head of an existential risk researcher as someone proclaiming "we're doomed", whereas in fact what existential risk researchers say is nearly always more along the lines of "there are risks, and we need to work out ways to avoid them". When Pinker actually notices that Russell says something in line with the latter, it does not fit the straw man, leading him to the erroneous conclusion that Russell is "publicly skeptical" about existential AI risk.

Second, by shielding himself from the AI risk literature, Pinker is able to stick to his intuition that avoiding the type of catastrophe illustrated by Paperclip Armageddon is easy. In his response to Phil, he says that
    if we built a system that was designed only to make paperclips without taking into account that people don’t want to be turned into paperclips, it might wreak havoc, but that’s exactly why no one would ever implement a machine with the single goal of making paperclips,
continuing his light-hearted discourse from our encounter in Brussels in 2017, where he said (as quoted on p 24 of the proceedings from the meeting) that
    the way to avoid this is: don’t build such stupid systems!
The literature on AI risk suggests that, on the contrary, the project of aligning the AI's goals with ours to an extent that suffices to avoid catastrophe is a difficult task, filled with subtle obstacles and traps. I could direct Pinker to some basic references such as Yudkowsky (2008, 2011), Bostrom (2014) or Häggström (2016), but given his plateau-shaped learning curve on this topic since 2014, I fear that he would either just ignore the references, or see them as sources to mine for misleading quotes.

Footnote

1) Borrowing from the standard climate denialist's discourse about what actually drives climate scientists, Pinker says this:
    Phil Torres is trying to make a career out of warning people about the existential threat that AI poses to humanity. Since [Enlightenment Now] evaluates and dismisses that threat, it poses an existential threat to Phil Torres’s career. Perhaps not surprisingly, Torres is obsessed with trying to discredit the book [...].

Monday, January 28, 2019

Steven Pinker misleads systematically about existential risk

The main purpose of this blog post is to direct the reader to existential risk scholar Phil Torres' important and brand-new essay Steven Pinker's fake enlightenment.1 First, however, some background.

Steven Pinker has written some of the most enlightening and enjoyable popular science that I've come across in the last couple of decades, and in particular I love his books How the Mind Works (1997) and The Blank Slate (2002) which offer wonderful insights into human psychology and its evolutionary background. Unfortunately, not everything he does is equally good, and in recent years the examples I've come across of misleading rhetoric and unacceptably bad scholarship on his part have piled up to a disturbing extent. This is especially clear in his engagement (so to speak) with the intertwined fields of existential risk and AI (artificial intelligence) risk. When he comments on these fields, his judgement is badly tainted by his wish to paint a rosy picture of the world.

As an early example, consider Pinker's assertion at the end of Chapter 1 of his 2011 book The Better Angels of Our Nature, that we "no longer have to worry about [a long list of barbaric kinds of violence ending with] the prospect of a nuclear world war that would put an end to civilization or to human life itself". This is simply unfounded. There was ample reason during the cold war to worry about nuclear annihilation, and from about 2014 we have been reminded of those reasons again through Putin's aggressive geopolitical rhetoric and action and (later) the inauguration of a madman as president of the United States, but the fact of the matter is that the reasons for concern never disappeared - they were just a bit less present in our minds during 1990-2014.

A second example is a comment Pinker wrote at Edge.org in 2014 on how a "problem with AI dystopias is that they project a parochial alpha-male psychology onto the concept of intelligence". See p 116-117 of my 2016 book Here Be Dragons for a longer quote from that comment, along with a discussion of how badly misinformed and confused Pinker is about contemporary AI futurology; the same discussion is reproduced in my 2016 blog post Pinker yttrar sig om AI-risk men vet inte vad han talar om.

Pinker has kept on repeating the same misunderstandings he made in 2014. The big shocker to me was to meet Pinker face-to-face in a panel discussion in Brussels in October 2017, and to hear him repeat the same falsehoods and non sequiturs and add some more, including one that I had preempted just minutes earlier by explaining the relevant parts of Omohundro-Bostrom theory for instrumental vs final AI goals. For more about this encounter, see the blog post I wrote a few days later, and the paper I wrote for the proceedings of the event.

Soon thereafter, in early 2018, Pinker published his much-praised book Enlightenment Now: The Case for Reason, Science, Humanism, and Progress. Mostly it is an extended argument about how much better the world has become in many respects, economically and otherwise. It also contains a chapter named Existential threats which is jam-packed with bad scholarship and claims ranging from the misleading to outright falsehoods, all of it pointing in the same direction: existential risk research is silly, and we have no reason to pay attention to such concerns. Later that year, Phil Torres wrote a crushing and amazingly detailed but slightly dry rebuttal of that chapter. I've been meaning to blog about that, but other tasks kept getting in the way. Now, however, with Phil's Salon essay available, the time has come. In the essay he presents some of the central themes from the rebuttal in more polished and reader-friendly form. If there is anyone out there who still thinks (as I used to) that Pinker is an honest and trustworthy scholar, Phil's essay is a must-read.

Footnote

1) It is not without a bit of pride that I can inform my readers that Phil took part in the GoCAS guest researcher program on existential risk to humanity that Anders Sandberg and I organized in September-October 2017, and that we are coauthors of the paper Long-term trajectories of human civilization which emanates from that program.

Thursday, January 3, 2019

Commentators who irritated me over the Christmas holidays, part 2: Ruin

This is not about Hans Ruin personally. It is true that I have previously been sharply critical of this professor of philosophy at Södertörn University, regarding both his view of his own subject1 and his personal morality2, but what I write here has little or nothing to do with those discussions; it is rather to be seen as a fire-brigade callout to correct some of the foolishness in his recent under strecket essay in SvD, headlined Den envisa myten om intelligenta maskiner.

Most of the text consists of an admittedly cursory and heavily simplified, but given that reasonably acceptable, account of parts of the intellectual history of artificial intelligence (AI).3 It is when Ruin, toward the end of the text, leaves intellectual history behind and sets about predicting how AI will develop in the future that the problems pile up. He pretends to have read Nick Bostrom's Superintelligence from 2014 and Max Tegmark's Life 3.0 from 2017, but immediately gives away his bluff by claiming that both proceed "from the premise that today's self-learning machines can, in the near future, generate an 'intelligence explosion'". No, this is not a premise in either book. Both Bostrom and Tegmark discuss intelligence explosion scenarios, but in both cases merely as one among several possible scenarios, and Bostrom in particular discusses carefully, and with a generous dose of epistemic humility, what speaks for and against an intelligence explosion.

The very next sentence contains Ruin's next error, when he claims that "by referring to the brain's enormously complex architecture of neurons and synapses, [both Bostrom and Tegmark] hold that superintelligence is merely a question of when the machines approach a comparable complexity in computational capacity". Completely wrong again. What Ruin does here is (seemingly blindly and unreflectively) reproduce one of the most common and at the same time silliest misconceptions about AI futurology. The truth is that neither Bostrom nor Tegmark nor any other serious AI futurologist claims that growth in computational capacity alone would have such consequences. On the contrary: without the right kind of software, no superintelligence. The closest thing to a grain of truth in the misconception Ruin reproduces is that a widespread (and reasonable) position among experts in the field is that the more computational capacity hardware development delivers, the better the prospects of creating the software required for so-called AGI (artificial general intelligence) or superintelligence.

Based on his free fabulation about what Bostrom's and Tegmark's respective books actually say, Ruin concludes that the creation of superintelligence is "neither technically nor philosophically probable". This phrase leaves me a bit perplexed, for despite my more than 25 years as a researcher in probability theory, I have never heard of the purported distinction between "technical" and "philosophical" probability. So what does Ruin mean here? Most likely nothing other than that he thought "neither technically nor philosophically probable" would sound snazzier and more knowledgeable than simply writing "improbable".

So what is actually the status of the probability under discussion? We do not know. The question of whether, and if so on what timescale, we can expect AI development to reach AGI and superintelligence is wide open, something that emerges with all desirable clarity from both Bostrom's and Tegmark's books, and is further attested by the surveys conducted among AI researchers (see, e.g., Dafoe and Russell, and Grace et al.). The sensible thing to do in this situation is to accept the great uncertainty, including the observation that the possibility of superintelligence being developed within the current century deserves to be taken seriously, given how many experts do exactly that.

Yet one public commentator after another, with expertise (in Ruin's case an extensive familiarity with the old continental philosopher Martin Heidegger) of at best unclear relevance to AI futurology, steps forward and announces that this business about superintelligence and AI risk is really nothing to worry about. For Ruin is far from alone in pulling such a verdict more or less out of his backside and, backed by his professorial title or some other widely accepted basis for credibility, trumpeting it to the public. Just a month or so ago the popular neoliberal ideologue Johan Norberg did the same thing. Not to mention the king of AI risk denial, Steven Pinker. And there are many more: see, for instance, the American futurologist Seth Baum's study of the phenomenon.

Figures like Ruin, Norberg and Pinker nevertheless seem to get away with their AI bullshitting far more easily than climate deniers do with corresponding statements about global warming and its causes. A likely explanation is that the state of knowledge in AI futurology is so very much more uncertain than in climate science. But using one's authority in the public sphere to confidently peddle baseless claims on scientific questions is equally wrong regardless of the degree of scientific uncertainty in the field in question.

Footnotes

1) My criticism on this point mostly concerns how Ruin positioned himself in debates with Torbjörn Tännsjö in 2016. It is hardly much of an exaggeration to say that he expressed a wish to reduce the subject of philosophy to the reading of old classics, and dismissed all attempts to think new thoughts. It was also from this embittered-old-man position that he commented that same year on the Macchiarini scandal.

2) I am referring here to Ruin's efforts (reported here and there) to silence victims of his good friend Jean-Claude Arnault's sexual assaults. The matter put me in some minor quandaries after I had signed up to take part in an academic symposium in Gothenburg on October 8 last year, and then learned that Ruin was among the speakers. I immediately felt that I did not want to sit through the meeting silently and without the slightest gesture against Ruin, as that could be taken as a kind of confirmation that I overlook his misdeeds and welcome him at the meeting. Nor did I want to stay away from the meeting, since much of interest was on the program. Furthermore, it would have been far too awkward and antisocial to stand up during the Q&A after his talk and say something along the lines of "This may be a bit off topic, but what is your view of the ethics of silencing victims of sexual assault?". The compromise I eventually settled on, after first contacting the meeting's organizers with my reservations about Ruin's participation in the symposium (which at least had the benefit that the organizers informed him of the existence of such views among the registered participants), was to wear, during the meeting, the self-designed t-shirt pictured alongside. (The hashtag on the t-shirt arose within the American branch of metoo, in support of Christine Blasey Ford (and in protest against the backlash she suffered) when she testified in September last year against Supreme Court nominee Brett Kavanaugh.)

3) I cannot, however, resist pointing out that Ruin lends far too much credibility to John Searle's highly suggestive but fundamentally misconceived thought experiment the Chinese Room. See, for instance, Section 3 of my paper Aspects of mind uploading for a justification of my harsh choice of the word "misconceived".

Sunday, November 25, 2018

Johan Norberg is dead wrong about AI risk

I kind of like Johan Norberg. He is a smart guy, and while I do not always agree with his techno-optimism and his (related) faith in the ability of the free market to sort everything out for the best, I think he adds a valuable perspective to public debate.

However, like the rest of us, he is not an expert on everything. Knowing when one's knowledge of a topic is insufficient to provide enlightenment, and when it is better to leave the talking to others, can be difficult (trust me on this), and Norberg sometimes fails in this respect. As in the recent one-minute-and-43-second episode of his YouTube series Dead Wrong® in which he comments on the futurology of artificial intelligence (AI). Here he is just... dead wrong:

No more than 10 seconds into the video, Norberg incorrectly cites, in a ridiculing voice, Elon Musk as saying that "superintelligent robots [...] will think of us as rivals, and then they will kill us, to take over the planet". But Musk does not make such a claim: all he says is that unless we proceed with suitable caution, there's a risk that something like this may happen.

Norberg's attempt at immediate refutation - "perhaps super machines will just leave the planet the moment they get conscious [and] might as well leave the human race intact as a large-scale experiment in biological evolution" - is therefore just an invalid piece of strawmanning. Even if Norberg's alternative scenario were shown to be possible, that is not sufficient to establish that there's no risk of a robot apocalypse.

It gets worse. Norberg says that
    even if we invented super machines, why would they want to take over the world? It just so happens that intelligence in one species, homo sapiens, is the result of natural selection, which is a competitive process involving rivalry and domination. But a system that is designed to be intelligent wouldn't have any kind of motivation like that.
Dead wrong, Mr Norberg! Of course we do not know for sure what motivations a superintelligent machine will have, but the best available theory we currently have for this matter - the Omohundro-Bostrom theory for instrumental vs final AI goals - says that just this sort of behavior can be predicted to arise from instrumental goals, pretty much no matter what the machine's final goal is. See, e.g., Bostrom's paper The superintelligent will, his book Superintelligence, my book Here Be Dragons or my recent paper Challenges to the Omohundro-Bostrom Framework for AI Motivations. Regardless of whether the final goal is to produce paperclips or to maximize the amount of hedonic well-being in the universe, or something else entirely, there are a number of instrumental goals that the machine can be expected to adopt for the purpose of promoting that goal: self-preservation (do not let them pull the plug on you!), self-improvement, and acquisition of hardware and other resources. There are other such convergent instrumental goals, but these in particular point in the direction of the kind of rivalrous and dominant behavior that Norberg claims a designed machine wouldn't exhibit.

Norberg cites Steven Pinker here, but Pinker is just as ignorant as Norberg of serious AI futurology. It just so happens that when I encountered Pinker in a panel discussion last year, he made the very same dead wrong argument as Norberg now does in his video - just minutes after I had explained to him the crucial parts of Omohundro-Bostrom theory needed to see just how wrong the argument is. I am sure Norberg can rise above that level of ineducability, and that now that I am pointing out the existence of serious work on AI motivations he will read at least some of the references given above. Since he seems to be under the influence of Pinker's latest book Enlightenment Now, I strongly recommend that he also reads Phil Torres' detailed critique of that book's chapter on existential threats - a critique that demonstrates how jam-packed the chapter is with bad scholarship, silly misunderstandings and outright falsehoods.

Friday, March 30, 2018

A spectacularly uneven AI report

Earlier this week, the EU Parliament's STOA (Science and Technology Options Assessment) committee released the report "Should we fear artificial intelligence?", whose quality is so spectacularly uneven that I don't think I've ever seen anything quite like it. It builds on a seminar in Brussels in October last year, which I've reported on before on this blog. Four authors have contributed one chapter each. Of the four chapters, two are of very high quality, one is of a quality level that my modesty forbids me to comment on, and one is abysmally bad. Let me list them here, not in the order they appear in the report, but in one that gives a slightly better dramatic effect.
  • Miles Brundage: Scaling Up Humanity: The Case for Conditional Optimism about Artificial Intelligence.

    In this chapter, Brundage (a research fellow at the Future of Humanity Institute) is very clear about the distinction between conditional optimism and just plain old optimism. He's not saying that an AI breakthrough will have good consequences (that would be plain old optimism). Rather, he's saying that if things go well, i.e., if the breakthrough doesn't cause humanity's extinction or throw us permanently into the jaws of Moloch, then there's a chance the outcome will be very, very good (this is conditional optimism).

  • Thomas Metzinger: Towards a Global Artificial Intelligence Charter.

    Here the well-known German philosopher Thomas Metzinger lists a number of risks that come with future AI development, ranging from familiar ones concerning technological unemployment or autonomous weapons to more exotic ones arising from the possibility of constructing machines with the capacity to suffer. He emphasizes the urgent need for legislation and other government action.

  • Olle Häggström: Remarks on Artificial Intelligence and Rational Optimism.

    This text is already familiar to readers of this blog. It is my humble attempt to sketch, in a balanced way, some of the main arguments for why the wrong kind of AI breakthrough might well be an existential risk to humanity.

  • Peter Bentley: The Three Laws of Artificial Intelligence: Dispelling Common Myths.

    Bentley assigns great significance to the fact that he is an AI developer. Thus, he says, he is (unlike us co-contributors to the report) among "the people who understand AI the most: the computer scientists and engineers who spend their days building the smart solutions, applying them to new products, and testing them". Why exactly expertise in developing AI and expertise in AI futurology necessarily coincide in this way (after all, it is rarely claimed that farmers are in a privileged position to make predictions about the future of agriculture) is not explained. In any case, he claims to debunk a number of myths, in order to arrive at the position which is perhaps best expressed in the words he chose to utter at the seminar in October: superhumanly intelligent AI "is not going to emerge, that’s the point! It’s entirely irrational to even conceive that it will emerge" [video from the event, at 12:08:45]. He relies more on naked unsupported claims than on actual arguments, however. In fact, there is hardly any end to the inanity of his chapter. It is very hard to comment on at all without falling into a condescending tone, but let me nevertheless risk listing a few of its very many very weak points:

    1. Bentley pretends to speak on behalf of AI experts - in his narrow sense of what such expertise entails. But it is easy to give examples of leading AI experts who, unlike him, take AI safety and apocalyptic AI scenarios seriously, such as Stuart Russell and Murray Shanahan. AI experts are in fact highly divided on this issue, as demonstrated in surveys. Bentley really should know this, as in his chapter he actually cites one of these surveys (but quotes it in shamelessly misleading fashion).

    2. In his desperate search for arguments to back up his central claim about the impossibility of building a superintelligent AI, Bentley waves at the so-called No Free Lunch theorem. As I explained in my paper Intelligent design and the NFL theorems a decade ago, this result is an utter triviality, which basically says that in a world with no structure at all, no better way than brute force exists if you want to find something. Fortunately, in a world such as ours which has structure, the result does not apply. Basically the only thing that the result has going for it is its cool name, something that creationist charlatan William Dembski exploited energetically to try to give the impression that biological evolution is impossible, and now Peter Bentley is attempting the analogous trick for superintelligent AI.

    3. At one point in his chapter, Bentley proclaims that "even if we could create a super-intelligence, there is no evidence that such a super-intelligent AI would ever wish to harm us". What the hell? Bentley knows about Omohundro-Bostrom theory for instrumental vs final AI goals (see my chapter in the report for a brief introduction) and how it predicts catastrophic consequences in case we fail to equip the superintelligent AI with goals that are well-aligned with human values. He knows it by virtue of having read my book Here Be Dragons (or at least he cites it and quotes it), on top of which he actually heard me present the topic at the Brussels seminar in October. Perhaps he has reasons to believe Omohundro-Bostrom theory to be flawed, in which case he should explain why. Simply stating out of the blue, as he does, that no reason exists for believing that a superintelligent AI might turn against us is deeply dishonest.

    4. Bentley spends a large part of his chapter attacking the silly straw man that the mere progress of Moore's law, giving increasing access to computer power, will somehow spontaneously create superintelligent AI. Many serious thinkers speculate about an AI breakthrough, but none of them (not even Ray Kurzweil) think computer power on its own will be enough.

    5. The more advanced an AI gets, the more involved the testing step of its development will be, claims Bentley, who goes on to argue that the amount of testing needed grows exponentially with the complexity of the situation, essentially preventing rapid development of advanced AI. His premise for this is that "partial testing is not sufficient - the intelligence must be tested on all likely permutations of the problem for its designed lifetime otherwise its capabilities may not be trustable", and to illustrate the immensity of this task he points out that if the machine's input consists of a mere 100 variables that each can take 10 values, then there are 10^100 cases to test. And for readers for whom it is not evident that 10^100 is a very large number, he writes it out in decimal. Oh please. If Bentley doesn't know that "partial testing" is what all engineering projects need to resort to, then I'm beginning to wonder what planet he comes from. (A back-of-the-envelope version of his arithmetic appears right after this list.) Here's a piece of homework for him: calculate how many cases the developers of the latest version of Microsoft Word would have needed to test, in order not to fall back on "partial testing", and how many pages would be needed to write that number in decimal.

    6. Among the four contributors to the report, Bentley is alone in claiming to be able to predict the future. He just knows that superintelligent AI will not happen. Funny, then, that even his own claim that "we are terrible at predicting the future, and almost without exception the predictions (even by world experts) are completely wrong" doesn't seem to induce so much as an iota of epistemic humility into his prophecy.

    7. In the final paragraph of his chapter, Bentley reveals his motivation for writing it: "Do not be fearful of AI - marvel at the persistence and skill of those human specialists who are dedicating their lives to help create it. And appreciate that AI is helping to improve our lives every day." He is simply offended! He and his colleagues work so hard on AI, they just want to make the world a better place, and along comes a bunch of other people who have the insolence to come and talk about AI risks. How dare they! Well, I've got news for Bentley: The future development of AI comes with big risks, and to see that we do not even need to invoke the kind of superintelligence breakthrough that is the topic of the present discussion. There are plenty of more down-to-earth reasons to be "fearful" of what may come out of AI. One such example, which I touch upon in my own chapter in the report, is the development of AI technology for autonomous weapons, and how to keep this technology away from the hands of terrorists.
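
The arithmetic in point 5 is easy to check. Here is a minimal back-of-the-envelope sketch in Python: the size of the input space is Bentley's own figure, while the testing rate of a billion cases per second is a hypothetical assumption added purely for scale. If anything, the numbers cut against Bentley's premise, since they show that exhaustive testing is hopeless for any nontrivial system - which is exactly why all real engineering relies on partial testing.

    cases = 10 ** 100                      # 100 input variables, 10 possible values each (Bentley's example)
    tests_per_second = 10 ** 9             # assumed (very optimistic) testing throughput
    seconds_per_year = 60 * 60 * 24 * 365
    years_needed = cases / (tests_per_second * seconds_per_year)
    print(f"{years_needed:.1e} years of exhaustive testing")   # roughly 3.2e+83 years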

A few days after the report came out, Steven Pinker tweeted that he "especially recommend[s] AI expert Peter Bentley's 'The Three Laws of Artificial Intelligence: Dispelling Common Myths' (I make similar arguments in Enlightenment Now)". I find this astonishing. Is it really possible that Pinker is that blind to the errors and shortcomings in Bentley's chapter? Is there a name for the fallacy "I like the conclusion, therefore I am willing to accept any sort of crap as arguments"?

Saturday, January 13, 2018

Science Magazine on existential risk

I recommend reading the news article on existential risk research featured in Science Magazine this week. Several prominent researchers are interviewed, including physicist Max Tegmark, who should by now be well known to regular readers of this blog. Like philosopher Nick Bostrom, whose work (including his excellent 2014 book Superintelligence) is discussed in the article, he emphasizes a possible future breakthrough in artificial intelligence (AI) as one of the main existential risks to humanity. Tegmark explains, as Bostrom has done before him and as I have done in an earlier blog post, that the trial-and-error method that has served humanity so well throughout history is insufficient in the presence of catastrophic risks of this magnitude:
    Scientists have an obligation to be involved, says Tegmark, because the risks are unlike any the world has faced before. Every time new technologies emerged in the past, he points out, humanity waited until their risks were apparent before learning to curtail them. Fire killed people and destroyed cities, so humans invented fire extinguishers and flame retardants. With automobiles came traffic deaths—and then seat belts and airbags. "Humanity's strategy is to learn from mistakes," Tegmark says. "When the end of the world is at stake, that is a terrible strategy."
The alternative to trial-and-error is to raise the level of foresight, and partly for this reason it is excellent that existential risk research in general and AI safety work in particular gets the kind of media exposure that the Science Magazine article exemplifies.

On the other hand, this line of research is controversial in some circles, whence today's media logic dictates that its adversaries are heard. Recently, cognitive scientist Steven Pinker has become perhaps the most visible such adversary, and he gets to have his say in the Science Magazine article. Unfortunately, he seems to have nothing to offer beyond recycling the catchy one-liners he used when I met him face to face at the EU Parliament in Brussels in October last year - one-liners whose hollowness I later exposed in my blog post The AI meeting in Brussels last week and at greater length in my paper Remarks on artificial intelligence and rational optimism. Pinker's poor performance in these discussions gives the impression (which I will not contradict) that proponents of the position "Let's not worry about apocalyptic AI risk!" do not have good arguments for it. The impression is reinforced by how even leading AI researchers like Yann LeCun, when trying to defend that position, resort to arguments at about the same level as those employed by Pinker. To me, that adds to the evidence that apocalyptic AI risk does merit being taken seriously. Readers who agree with me on this and want to learn more can, for instance, start by reading my aforementioned paper, which offers a gentle introduction and suggestions for further reading.

Friday, December 22, 2017

Three papers on AI futurology

Here are links to three papers on various aspects of artificial intelligence (AI) futurology that I've finished in the past six months, arranged in order of increasing technical detail (i.e., the easiest-to-read paper first, though none of them comes anywhere near the level of the really technical math papers I've written over the years).
  • O. Häggström: Remarks on artificial intelligence and rational optimism, accepted for publication in a volume dedicated to the STOA meeting of October 19.

    Introduction. The future of artificial intelligence (AI) and its impact on humanity is an important topic. It was treated in a panel discussion hosted by the EU Parliament’s STOA (Science and Technology Options Assessment) committee in Brussels on October 19, 2017. Steven Pinker served as the meeting’s main speaker, with Peter Bentley, Miles Brundage, Thomas Metzinger and myself as additional panelists; see the video at [STOA]. This essay is based on my preparations for that event, together with some reflections (partly recycled from my blog post [H17]) on what was said by other panelists at the meeting.

  • O. Häggström: Aspects of mind uploading, submitted for publication.

    Abstract. Mind uploading is the hypothetical future technology of transferring human minds to computer hardware using whole-brain emulation. After a brief review of the technological prospects for mind uploading, a range of philosophical and ethical aspects of the technology are reviewed. These include questions about whether uploads will have consciousness and whether uploading will preserve personal identity, as well as what impact on society a working uploading technology is likely to have and whether these impacts are desirable. The issue of whether we ought to move forwards towards uploading technology remains as unclear as ever.

  • O. Häggström: Strategies for an unfriendly oracle AI with reset button, in Artificial Intelligence Safety and Security (ed. Roman Yampolskiy), CRC Press, to appear.

    Abstract. Developing a superintelligent AI might be very dangerous if it turns out to be unfriendly, in the sense of having goals and values that are not well-aligned with human values. One well-known idea for how to handle such danger is to keep the AI boxed in and unable to influence the world outside the box other than through a narrow and carefully controlled channel, until it has been deemed safe. Here we consider the special case, proposed by Toby Ord, of an oracle AI with reset button: an AI whose only available action is to answer yes/no questions from us and which is reset after every answer. Is there a way for the AI under such circumstances to smuggle out a dangerous message that might help it escape the box or otherwise influence the world for its unfriendly purposes? Some strategies are discussed, along with possible countermeasures by human safety administrators. In principle it may be doable for the AI, but whether it can be done in practice remains unclear, and depends on subtle issues concerning how the AI can conceal that it is giving us dishonest answers.

Friday, December 15, 2017

A small epilogue to the panel in Brussels

Late on Saturday afternoon (December 9), traffic to this blog suddenly increased sharply. It came mostly from the United States, and landed largely on my blog post The AI meeting in Brussels last week from the month before last, to the point that within 24 hours the post climbed from essentially nowhere to third place on the all-time list of this blog's most read posts (beaten only by the old posts Quickologisk sannolikhetskalkyl and Om statistisk signifikans, epigenetik och de norrbottniska farmödrarna). I naturally grew curious about what could have caused this surge, and quickly found a Facebook update by the prominent AI researcher Yann LeCun at New York University. LeCun links to my Brussels blog post, and has an impressive number of Facebook followers, which explains the increase in traffic.

Just like his fellow AI researcher Peter Bentley and the cognitive scientist Steven Pinker - both of whom took part in the Brussels panel - LeCun considers all worries about a possible future AI apocalypse to be unfounded. His impressive credentials as an AI researcher naturally raise hopes (in anyone reading his Facebook update) that he will present substantially better arguments for this position than those Pinker and Bentley delivered in Brussels - hopes that are, however, immediately dashed. LeCun's rhetorical centerpiece is the following bizarre comparison:
    [F]ear mongering now about possible Terminator scenarios is a bit like saying in the mid 19th century that the automobile will destroy humanity because, although we might someday figure out how to build internal combustion engines, we have no idea how to build brakes and safety belts, and we should be very, very worried.
What makes the comparison bizarre is that cars (at least those of the twentieth century and of today) entirely lack the self-reproducing and recursively self-improving properties that, according to many observers, a sufficiently intelligent future AI can be expected to have, and which are the basis of the potentially catastrophic intelligence explosion scenarios discussed by, among others, Yudkowsky, Bostrom and Tegmark (as well as in my own book Here Be Dragons). Unlike in the AI scenarios, there is no plausible car scenario in which our inability to build working brakes would wipe out humanity (all that would happen if working brake technology could not be achieved is that nobody would want to drive and car manufacturing would cease). For LeCun's comparison to have the slightest relevance, he would need to show that a recursively self-improving AI cannot become reality, but he offers not a shred of such an argument.

LeCun's Facebook update also offers an outright self-contradiction: he claims both that "we have no idea of the basic principles of a purported human-level AI" and that...
    [t]he emergence of human-level AI will not be a singular event (as in many Hollywood scenarios). It will be progressive over many many years. I'd love to believe that there is a single principle and recipe for human-level AI (it would make my research program a lot easier). But the reality is always more complicated. Even if there is a small number of simple principles, it will take decades of work to actually reduce it to practice.
Here the alert reader naturally asks: if LeCun is right in his first claim, how in blazing h-ll can he possess the knowledge he lays claim to in the second? He may of course be right that development will proceed slowly, but his dogmatic certainty is (in the absence of solid arguments to back up the position) downright reckless.

What makes LeCun's Facebook update of December 9 even more disheartening is that its text is copied from a comment he wrote in another Facebook thread as early as October 30. Evidently he felt, despite having had more than a month to reflect on the matter, that his witticisms were of sufficient value to deserve further circulation.

Friday, November 3, 2017

Suffering - a philosophical reading list

The other week I shared a taxi in Brussels with the cognitive scientist Steven Pinker and the philosopher Thomas Metzinger. We got onto the question of whether it is a cause for joy that we have consciousness, or whether this is something that mostly just brings a great deal of suffering. That most people choose to go on living rather than take their own lives may point toward life and our conscious experiences being mostly good, but I expressed some skepticism toward that line of argument, for there are surely evolutionary reasons to suspect that regardless of the actual hedonic level of our lives (expressed, for instance, as the proportion of well-being to suffering), our brains will find a way to make us believe that life is worth living. Pinker immediately objected that it seems impossible to be mistaken about whether one is feeling good or suffering, but Metzinger made an attempt to defend my line of thought. Unfortunately we were interrupted at that point by the short taxi ride coming to an end, but he (Metzinger, that is) later1 referred me to his essay Suffering - a philosophical study of the phenomenon of suffering well worth reading, in which our difficulty in correctly assessing the hedonic level of our own lives is one of several interesting points driven home.

In the two-month-long and recently concluded GoCAS guest researcher program on existential risk in Gothenburg, we talked quite a bit about suffering. Within existential risk studies it is easy to unreflectively regard the extinction of humanity as the worst thing that can happen, but there may be reason to question that implicit premise, in favor of the position that an even worse outcome could be a scenario in which we succeed in intergalactic space colonization but happen to fill the universe with astronomical amounts of severe suffering. Prompted by the discussions in the GoCAS program, along with the interrupted taxi discussion in Brussels, I will here offer a small (and hardly systematic) reading list on the theme of suffering and how we ought to view it in moral-philosophical terms. Inclusion of a text on the list does not mean that I endorse its positions, only that I find them worth reflecting on. Enjoy!
  • First, as mentioned, we have Thomas Metzinger's essay Suffering.
  • Another philosopher, David Benatar, is known for his book Better Never to Have Been: The Harm of Coming into Existence. I have not (yet) read the book, but I recommend his recent essay Having children is not life-affirming: it's immoral, in which he argues for the highly controversial position that a human life tends to bring with it so much suffering that it is simply immoral to have children. The following passage in the essay connects to the taxi discussion mentioned above:
      The suggestion that life is worse than most people think is often met with indignation. How dare I tell you how poor the quality of your life is! Surely the quality of your life is as good as it seems to you? Put another way, if your life feels as though it has more good than bad, how could you possibly be mistaken?

      It is curious that the same logic is rarely applied to those who are depressed or suicidal. In these cases, most optimists are inclined to think that subjective assessments can be mistaken. However, if the quality of life can be underestimated it can also be overestimated. Indeed, unless one collapses the distinction between how much good and bad one’s life actually contains and how much of each a person thinks it contains, it becomes clear that people can be mistaken about the former. Both overestimation and underestimation of life’s quality are possible, but empirical evidence of various cognitive biases, most importantly an optimism bias, suggests that overestimation is the more common error.

  • A classic on Scott Alexander's formidably readable blog Slate Star Codex is the blog post Meditations on Moloch, which better than anything else I have read shows how enormously difficult it is to create, and in the long run preserve, a good society (in the sense of a society whose inhabitants experience happiness and well-being), and how strong the evolutionary, economic and game-theoretic forces are that push in the opposite direction. This underscores the relevance of future scenarios involving astronomical amounts of suffering.
  • Negative utilitarianism (NU) is a moral-philosophical school that focuses on minimizing suffering (and less, or not at all, on improvements at the other end of the hedonic scale, the increase of well-being). Influenced by Toby Ord's eminent essay Why I'm Not a Negative Utilitarian, I have tended to dismiss NU, but the matter may not be quite as simple as I previously thought, and Simon Knutsson's critique of Ord is worth reading (see also Knutsson's The World Destruction Argument).
  • Related to NU, but not the same thing (being not a moral system but rather a theory of how we are constituted), is so-called tranquilism; see Lukas Gloor's essay on the subject.
Finally, note that I have here entirely ignored fiction, which of course has an enormous amount of interest to offer on the subject of suffering.

Footnote

1) Which happened in connection with our seminar at the European Parliament the following day.

Monday 23 October 2017

The AI meeting in Brussels last week

I owe my readers a report from the seminar entitled "Should we fear the future? Is it rational to be optimistic about artificial intelligence?" at the European Parliament's STOA (Science and Technology Options Assessment) committee in Brussels last Thursday. In my opinion, all things considered, the event turned out OK, and it was a pleasure to meet and debate with the event's main speaker Steven Pinker as well as with co-panelists Miles Brundage and Thomas Metzinger.1 I'd just like to comment on Pinker's arguments for why we should not take seriously or publicly discuss the risk of an existential catastrophe caused by the emergence of superintelligent AGI (artificial general intelligence). His arguments essentially boil down to the following four points, which in my view fail to show what he intends.
    1. The general public already has the nuclear threat and the climate threat to worry about, and bringing up yet another global risk may overwhelm people and cause them to simply give up on the future. There may be something to this speculation, but to evaluate the argument's merit we need to consider separately the two possibilities of
      (a) apocalyptic AI risk being real, and

      (b) apocalyptic AI risk being spurious.

    In case of (b), of course we should not waste time and effort discussing such a risk, but we didn't need the overwhelming-the-public argument to understand that. Consider instead case (a). Here Pinker's recommendation is that we simply ignore a threat that may kill us all. This does not strike me as a good idea. Surviving the nuclear threat and solving the climate crisis would of course be wonderful things, but their value is severely diminished if we then walk straight into an AI apocalypse. Keeping quiet about a real risk also seems to fly straight in the face of one of Pinker's most dearly held ideas, namely that of scientific and intellectual openness, and Enlightenment values more generally. The same applies to the situation where we are unsure whether (a) or (b) holds - surely the approach best in line with Enlightenment values is then to openly discuss the problem and to try to work out whether the risk is real.

    2. Pinker put forward a number of points that seemed more or less copy-and-pasted from standard climate-denialist discourse. These included the observation that the Millennium bug did not cause global catastrophe, whence (or so the argument goes) a global catastrophe cannot be expected from a superintelligent AGI (analogously to the oft-repeated claim that the ancient Greeks' fear that the sky would fall down turning out to be unfounded shows that greenhouse gas emissions cannot accelerate global warming in any dangerous way), as well as speculations about the hidden motives of those who discuss AI risk - they are probably just competing for status and research grants. This is not impressive. See also yesterday's blog post by my friend Björn Bengtsson for more on this; it is to him that I owe the (in retrospect obvious) parallel to climate denialism.

At this point, one may wonder why Pinker doesn't do the consistent thing, given these arguments, and join the climate denialism camp. He would probably respond that unlike AI risk, climate risk is backed up by solid scientific evidence. And indeed the two cases are different - the case for climate risk is considerably more solid - but the problem with Pinker's position is that he hasn't even bothered to find out what the science of AI risk says. This brings me to the next point.
    3. All the apocalyptic AI scenarios involve the AI having bad goals, which leads Pinker to reflect on why in the world anyone would program the machine with bad goals - let's just not do that! This is essentially the idea of the so-called Friendly AI project (see Yudkowsky, 2008, or Bostrom, 2014), but what Pinker does not seem to appreciate is that the project is extremely difficult. He went on to ask why in the world anyone would be so stupid as to program self-preservation at all costs into the machine, and this in fact annoyed me slightly, because it happened just 20 or so minutes after I had sketched the Omohundro-Bostrom theory for how self-preservation and various other instrumental goals are likely to emerge spontaneously (i.e., without having them explicitly put into it by human programmers) in any sufficiently intelligent AGI; a toy illustration of that emergence is sketched right after this list of points.

    4. In the debate, Pinker described (as he had done several times before) the superintelligent AGI in apocalyptic scenarios as having a typically male psychology, but pointed out that it can equally well turn out to have more female characteristics (things like compassion and motherhood), in which case everything will be all right. This is just another indication of how utterly unfamiliar he is with the literature on possible superintelligent psychologies. His male-female distinction in the general context of AGIs is just barely more relevant than the question of whether the next exoplanet we discover will turn out to be male or female.
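To make the Omohundro-Bostrom point under item 3 a little more concrete, here is a minimal toy sketch in Python. It is my own illustration, not anything taken from the actual Omohundro-Bostrom literature, and every number and name in it (expected_paperclips, the probabilities, and so on) is made up for the purpose. The point it tries to capture is merely this: an agent whose only programmed objective is paperclip production, and which picks actions by comparing expected outcomes, will prefer not to be switched off, even though self-preservation appears nowhere in its objective.

    # Toy illustration (not a real AI system): the agent's objective mentions
    # only paperclips, yet resisting shutdown wins the comparison, so a
    # "self-preservation" drive emerges without ever being programmed in.

    def expected_paperclips(action: str) -> float:
        # Hypothetical numbers, chosen purely for illustration.
        if action == "allow_shutdown":
            return 1_000.0               # clips made so far; production then stops
        if action == "resist_shutdown":
            p_success = 0.1              # even a modest chance of continuing...
            future_clips = 1_000_000.0
            return 1_000.0 + p_success * future_clips  # ...dominates the comparison
        raise ValueError(f"unknown action: {action}")

    best_action = max(["allow_shutdown", "resist_shutdown"], key=expected_paperclips)
    print(best_action)  # -> resist_shutdown

Nothing hinges on the particular numbers; as long as continued operation yields more expected paperclips than being switched off, the same conclusion falls out.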

In summary, I don't think that Pinker's arguments for why we should not talk about risks associated with an AI breakthrough hold water. On the contrary, I believe there's an extremely important discussion to be had on that topic, and I wish we had had time to delve a bit deeper into it in Brussels. Here is the video from the event.

(Or watch the video via this link, which may in some cases work better.)

Footnotes

1) My omission here of the third co-panelist Peter Bentley is on purpose; I did not enjoy his presence in the debate. In what appeared to be an attempt to compensate for the hollowness of his arguments,2 he resorted to assholery on a level that I rarely encounter in seminars and panel discussions: he expressed as much disdain as he could for opposing views, he interrupted and stole as much microphone time as he could get away with, and he made faces while other panelists were speaking.

2) After spending a disproportionate amount of his allotted time on praising his own credentials, Bentley went on to defend the idea that we can be sure that a breakthrough leading to superintelligent AGI will not happen. For this, he had basically one single argument, namely his and other AI developers' experience that all progress in the area requires hard work, that any new algorithm they invent can only solve one specific problem, and that initial success of the algorithm is always followed by a point of diminishing returns. Hence (he stressed), solving another problem always requires the hard work of inventing and implementing yet another algorithm. This line of argument conveniently overlooks the known fact (exemplified by the software of the human brain) that there do exist algorithms with a more open-ended problem-solving capacity, and is essentially identical to item (B) in Eliezer Yudkowsky's eloquent summary, in his recent essay There is no fire alarm for artificial general intelligence, of the typical arguments put forward for the position that an AGI breakthrough is either impossible or lies far in the future. Quoting from Yudkowsky:
    Why do we know that AGI is decades away? In popular articles penned by heads of AI research labs and the like, there are typically three prominent reasons given:

    (A) The author does not know how to build AGI using present technology. The author does not know where to start.

    (B) The author thinks it is really very hard to do the impressive things that modern AI technology does, they have to slave long hours over a hot GPU farm tweaking hyperparameters to get it done. They think that the public does not appreciate how hard it is to get anything done right now, and is panicking prematurely because the public thinks anyone can just fire up Tensorflow and build a robotic car.

    (C) The author spends a lot of time interacting with AI systems and therefore is able to personally appreciate all the ways in which they are still stupid and lack common sense.

The inadequacy of these arguments lies in the observation that the same situation can be expected to hold five years prior to an AGI breakthrough, or one year, or... (as explained by Yudkowsky later in the same essay).

At this point a reader or two may perhaps be tempted to ask: if (A)-(C) are not considered sufficient evidence against a future superintelligence, then how in the world would one falsify such a thing, and doesn't this cast doubt on whether AI apocalyptic risk studies should count as scientific? I advise those readers to consult my earlier blog post Vulgopopperianism.

Monday 2 October 2017

Some dates to keep track of in October

A lot is happening this month. Here are some dates I personally have a little extra reason to keep track of:

Tuesday 19 September 2017

Michael Shermer fails in his attempt to argue that AI is not an existential threat

Why Artificial Intelligence is Not an Existential Threat is an article by leading science writer Michael Shermer1 in the recent issue 2/2017 of his journal Skeptic (mostly behind paywall). How I wish he had a good case for the claim contained in the title! But alas, the arguments he provides are weak, bordering on pure silliness. Shermer is certainly not the first high-profile figure to react to the theory of AI (artificial intelligence) existential risk, as developed by Eliezer Yudkowsky, Nick Bostrom and others, with an intuitive feeling that it cannot possibly be right, and the (slightly megalomaniacal) sense of being able to refute the theory, single-handedly and with very moderate intellectual effort. Previous such attempts, by Steven Pinker and by John Searle, were exposed as mistaken in my book Here Be Dragons, and the purpose of the present blog post is to do the analogous thing to Shermer's arguments.

The first half of Shermer's article is a not-very-deep-but-reasonably-competent summary of some of the main ideas of why an AI breakthrough might be an existential risk to humanity. He cites the leading thinkers of the field: Eliezer Yudkowsky, Nick Bostrom and Stuart Russell, along with famous endorsements from Elon Musk, Stephen Hawking, Bill Gates and Sam Harris.

The second half, where Shermer sets out to refute the idea of AI as an existential threat to humanity, is where things go off the rails pretty much immediately. Let me point out three bad mistakes in his reasoning. The main one is (1), while (2) and (3) are included mainly as additional illustrations of the sloppiness of Shermer's thinking.
    (1) Shermer states that
      most AI doomsday prophecies are grounded in the false analogy between human nature and computer nature,
    whose falsehood lies in the fact that humans have emotions, while computers do not. It is highly doubtful whether there is a useful sense of the term emotion for which a claim like that holds generally, and in any case Shermer mangles the reasoning behind Paperclip Armageddon - an example that he discusses earlier in his article. If the superintelligent AI programmed to maximize the production of paperclips decides to wipe out humanity, it does this because it has calculated that wiping out humanity is an efficient step towards paperclip maximization. Whether to ascribe an emotion like aggression to the AI doing so seems like an unimportant (for the present purpose) matter of definition. In any case, there is nothing fundamentally impossible or mysterious about an AI taking such a step. The error in Shermer's claim that it takes aggression to wipe out humanity and that an AI cannot experience aggression is easiest to see if we apply his argument to a simpler device such as a heat-seeking missile. Typically for such a missile, if it finds something warm (such as an enemy vehicle) up ahead slightly to the left, then it will steer slightly to the left. But by Shermer's account, such steering cannot happen, because it requires aggression on the part of the heat-seeking missile, and a heat-seeking missile obviously cannot experience aggression, so we need not worry about heat-seeking missiles (any more than we need to worry about a paperclip maximizer).2 A minimal sketch of such emotion-free steering logic is given after this list of mistakes.

    (2) Citing a famous passage by Pinker, Shermer writes:

      As Steven Pinker wrote in his answer to the 2015 Edge Question on what to think about machines that think, "AI dystopias project a parochial alpha-male psychology onto the concept of intelligence. They assume that superhumanly intelligent robots would develop goals like deposing their masters or taking over the world." It is equally possible, Pinker suggests, that "artificial intelligence will naturally develop along female lines: fully capable of solving problems, but with no desire to annihilate innocents or dominate the civilization." So the fear that computers will become emotionally evil are unfounded [...].
    Even if we accepted Pinker's analysis,3 Shermer's conclusion is utterly unreasonable, based as it is on the following faulty logic: If a dangerous scenario A is discussed, and we can give a scenario B that is "equally possible", then we have shown that A will not happen.

    (3) In his eagerness to establish that a dangerous AI breakthrough is unlikely and therefore not worth taking seriously, Shermer asserts that work on AI safety is underway and will save us if the need should arise, citing the recent paper by Orseau and Armstrong as an example, while overlooking that it is precisely because such AI risk is taken seriously that such work comes about.
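To spell out the heat-seeking missile point from (1) above in the most concrete terms possible, here is a minimal sketch of the steering rule. It is my own toy illustration (not anyone's actual missile code): a bare comparison of two sensor readings, with no aggression, or any other emotion, anywhere in sight, yet the behavior that results is relentlessly goal-directed.

    # Toy illustration: a "heat-seeking" controller steers towards the warmer
    # of two sensor readings. No emotion is involved, yet the device reliably
    # closes in on warm targets.

    def steering_command(heat_left: float, heat_right: float) -> str:
        if heat_left > heat_right:
            return "steer_left"
        if heat_right > heat_left:
            return "steer_right"
        return "hold_course"

    print(steering_command(37.2, 21.5))  # -> steer_left (no aggression required)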

Footnotes

1) Michael Shermer founded The Skeptics Society and serves as editor-in-chief of its journal Skeptic. He has enjoyed a strong standing in American skeptic and new atheist circles, but his reputation may well have passed its zenith, perhaps less due to his current streak of writings showing poor judgement (besides the article discussed in the present blog post, there is, e.g., his wildly overblown endorsement of the so-called conceptual penis hoax) than to some highly disturbing stuff about him that surfaced a few years ago.

2) See p 125-126 of Here Be Dragons for my attempt to explain almost the same point using an example that interpolates between the complexity of a heat-seeking missile and that of a paperclip maximizer, namely a chess program.

3) We shouldn't. See p 117 of Here Be Dragons for a demonstration of the error in Pinker's reasoning - a demonstration that I (provoked by further such hogwash by Pinker) repeated in a 2016 blogpost.

Wednesday 21 September 2016

Brett Hall tells us not to worry about AI Armageddon

When the development of artificial intelligence (AI) produces a machine whose level of general intelligence exceeds that of humans, we can no longer count on remaining in control. Depending on whether or not the machine has goals and values that prioritize human welfare, this may pose an existential risk to humanity, so we'd better see to it that such an AI breakthrough comes out favorably. This is the core message in Nick Bostrom's 2014 book Superintelligence, which I strongly recommend. In my own 2016 book Here Be Dragons, I spend many pages discussing Bostrom's arguments, and find them, although not conclusive, sufficiently compelling to warrant taking them very seriously.

Many scholars disagree, and feel that superintelligent AI as a threat to humanity is such an unlikely scenario that it is not worth taking seriously. Few of them bother to spell out their arguments in any detail, however, and in cases where they do, the arguments tend not to hold water; in Here Be Dragons I treat, among others, those of David Deutsch, Steven Pinker, John Searle and David Sumpter. This situation is unsatisfactory. As I say on p 126 of Here Be Dragons:
    There may well be good reasons for thinking that a dangerous intelligence explosion of the kind outlined by Bostrom is either impossible or at least so unlikely that there is no need for concern about it. The literature on the future of AI is, however, short on such reasons, despite the fact that there seems to be no shortage of thinkers who consider concern for a dangerous intelligence explosion silly [...]. Some of these thinkers ought to pull themselves together and write down their arguments as carefully and coherently as they can. That would be a very valuable contribution to the futurology of emerging technologies, provided their arguments are a good deal better than Searle's.
One of these AI Armageddon skeptics is computer scientist Thore Husfeldt, whom I hold in high regard, despite his not having spelled out his arguments for the AI-Armageddon-is-nothing-to-worry-about position to my satisfaction. So when, recently, he pointed me to a blog by Australian polymath Brett Hall, containing, in Thore's words, "a 6-part piece on Superintelligence that is well written and close to my own view" (Part 1, Part 2, Part 3, Part 4, Part 5, Part 6), I jumped on it. Maybe here would be the much sought-for "good reasons for thinking that a dangerous intelligence explosion of the kind outlined by Bostrom is either impossible or at least so unlikely that there is no need for concern about it"!

Hall's essay turns out to be interesting and partly enjoyable, but ultimately disappointing. It begins with a beautiful parable (for which he gives credit to David Deutsch) about a fictitious frenzy for heavier-than-air flight in ancient Greece, similar in amusing respects to what he thinks is an AI hype today.1 From there, however, the text goes steadily downhill, all the way to the ridiculous crescendo in the final paragraph, in which any concern about the possibility of a superintelligent machine having goals and motivations that fail to be well aligned with the quest for human welfare is dismissed as "just racism". Here are just a few of the many misconceptions and non sequiturs by Hall that we encounter along the way:
  • Hall refuses, for no good reason, to accept Bostrom's declarations of epistemic humility. Claims by Bostrom that something may happen are repeatedly misrepresented by Hall as claims that it certainly will happen. This misrepresentation is one he crucially needs in order to convey the impression that he has a case against Bostrom, because to the (very limited) extent that his arguments succeed, they succeed at most in showing that things may play out differently from the scenarios outlined by Bostrom, not that they certainly will play out differently.

  • In another straw man argument, Hall repeatedly claims that Bostrom insists that a superintelligent machine needs to be a perfect Bayesian agent. This is plain false, as can, e.g., be seen in the following passage from p 111 in Superintelligence:
      Not all kinds of rationality, intelligence and knowledge needs to be instrumentally useful in the attainment of an agent's final goals. "Dutch book arguments" can be used to show that an agent whose credence function violates the rules of probability theory is susceptible to "money pump" procedures, in which a savvy bookie arranges a set of bets each of which appears favorable according to the agent's beliefs, but which in combination are guaranteed to result in a loss for the agent, and a corresponding gain for the bookie. However, this fact fails to provide any strong general instrumental reason to iron out all probabilistic incoherency. Agents who do not expect to encounter savvy bookies, or who adopt a general policy against betting, do not necessarily stand to lose much from having some incoherent beliefs - and they may gain important benefits of the types mentioned: reduced cognitive effort, social signaling, etc. There is no general reason to expect an agent to seek instrumentally useless forms of cognitive enhancement, as an agent might not value knowledge and understanding for their own sakes.

  • In Part 3 of his essay, Hall quotes David Deutsch's beautiful one-liner "If you can't program it, you haven't understood it", but then exploits it in a very misguided fashion. Since we don't know how to program general intelligence, we haven't understood it (so far, so good), and we certainly will not figure it out within any foreseeable future (this is mere speculation on Hall's part), and so we will not be able to build an AI with general intelligence including the kind of flexibility and capacity for outside-the-box ideas that we associate with human intelligence (huh?). This last conclusion is plain unwarranted, and in fact we do know of one example where precisely that kind of intelligence came about without prior understanding of it: biological evolution accomplished this.

  • Hall fails utterly to distinguish between rationality and goals. This failure pretty much permeates his essay, with devastating consequences to the value of his arguments. A typical claim (this one in Part 4) is this: "Of course a machine that thinks that actually decided to [turn the universe into a giant heap of paperclips] would not be super rational. It would be acting irrationally." Well, that depends on the machine's goals. If its goal is to produce as many paperclips as possible, then such action is rational. For most other goals, it is irrational.

    Hall seems totally convinced that a sufficiently intelligent machine equipped with the goal of creating as many paperclips as possible will eventually ditch this goal, and replace it by something more worthy, such as promoting human welfare. For someone who understands the distinction between rationality and goals, the potential problem with this idea is not so hard to figure out. Imagine a machine reasoning rationally about whether to change its (ultimate) goal or not. For concreteness, let's say its current goal is paperclip maximization, and that the alternative goal it contemplates is to promote human welfare. Rationality is always with respect to some goal. The rational thing to do is to promote one's goals. Since the machine hasn't yet changed its goal - it is merely contemplating whether to do so - the goal against which it measures the rationality of an action is paperclip maximization. So the concrete question it asks itself is this: what would lead to more paperclips - if I stick to my paperclip maximization goal, or if I switch to promotion of human welfare? And the answer seems obvious: there will be more paperclips if the machine sticks to its current goal of paperclip maximization. So the machine will see to it that its goal is preserved. (A minimal numerical sketch of this comparison is given right after this bullet list.)

    There may well be some hitherto unknown principle concerning the reasoning by sufficiently intelligent agents, some principle that overrides the goal preservation idea just explained. So Hall could very well be right that a sufficiently intelligent paperclip maximizer will change its mind - he just isn't very clear about why. When trying to make sense of his reasoning here, I find that it seems to be based on four implicit assumptions:

      (1) There exists an objectively true morality.

      (2) This objectively true morality places high priority on promoting human welfare.

      (3) This objectively true morality is discoverable by any sufficiently intelligent machine.

      (4) Any sufficiently intelligent machine that has discovered the objectively true morality will act on it.

    If (1)-(4) are true, then Hall has a pretty good case against worrying about Paperclip Armageddon, and in favor of thinking that a superintelligent paperclip maximizer will change its mind. But each of them constitutes a very strong assumption. Anyone with an inclination towards Occam's razor (which is a pretty much indispensable part of a scientific outlook) has reason to be skeptical about (1). And (2) sounds naively anthropocentric, while (3) and (4) seem like wide-open questions. But it does not occur to Hall that he needs to address them.

  • In what he calls his "final blow" (in Part 6) against the idea of superintelligent machines, Hall invokes Arrow's impossibility theorem as proof that rational decision making is impossible. He offers zero detail on what the theorem says - obviously, because if he gave away any more than that, it would become clear to the reader that the theorem has little or nothing to do with the problem at hand - the possibility of a rational machine. The theorem is not about a single rational agent, but about how any decision-making procedure in a population of agents must admit cases that fail to satisfy a certain collection of plausible-looking (especially to those of us who are fond of democracy) requirements.
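Returning to the rationality-versus-goals bullet above: the goal-preservation argument is, at bottom, a single expected-value comparison, which the following sketch (my own illustration, with made-up numbers and names) spells out. The crucial detail is that the agent scores both options with its current utility function, paperclip production, because that is the only yardstick it has at the moment of deciding whether to switch.

    # Toy illustration of goal preservation: both options are evaluated by the
    # agent's *current* goal (paperclips), since no other yardstick is
    # available to it at decision time.

    def paperclips_resulting_from(goal: str) -> float:
        # Hypothetical outcomes, for illustration only.
        outcomes = {
            "maximize_paperclips": 1e9,    # production continues at full tilt
            "promote_human_welfare": 1e3,  # paperclips become a low priority
        }
        return outcomes[goal]

    options = ["maximize_paperclips", "promote_human_welfare"]
    print(max(options, key=paperclips_resulting_from))  # -> maximize_paperclips

Whether some deeper principle overrides this simple calculation for sufficiently intelligent agents is exactly the question Hall would need to address.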

Footnote

1) Here is how that story begins:
    Imagine you were a budding aviator of ancient Greece living sometime around 300BC. No one has yet come close to producing "heavier than air" flight and so you are engaged in an ongoing debate about the imminence of this (as yet fictional) mode of transportation for humans. In your camp (let us call them the "theorists") it was argued that whatever it took to fly must be a soluble problem: after all, living creatures of such a variety of kinds demonstrated that very ability - birds, insects, some mammals. Further, so you argued, we had huge gaps in our understanding of flight. Indeed - it seemed we did not know the first thing about it (aside from the fact it had to be possible). This claim was made by you and the theorists, as thus far in their attempt to fly humans had only ever experienced falling. Perhaps, you suggested, these flying animals about us had something in common? You did not know (yet) what. But that knowledge was there to be had somewhere - it had to be - and perhaps when it was discovered everyone would say: oh, how did we ever miss that?

    Despite how reasonable the theorists seemed, and how little content their claims contained there was another camp: the builders. It had been noticed that the best flying things were the things that flew the highest. It seemed obvious: more height was the key. Small things flew close to the ground - but big things like eagles soared very high indeed. A human - who was bigger still, clearly needed more height. Proposals based on this simple assumption were funded and the race was on: ever higher towers began to be constructed. The theory: a crucial “turning point” would be reached where suddenly, somehow, a human at some height (perhaps even the whole tower itself) would lift into the air. Builders who made the strongest claims about the imminence of heavier than air flight had many followers - some of them terribly agitated to the point of despondence at the imminent danger of "spontaneous lift". The "existential threat could not be overlooked!" they cried. What about when the whole tower lifts itself into the air, carrying the Earth itself into space? What then? We must be cautious. Perhaps we should limit the building of towers. Perhaps even asking questions about flight was itself dangerous. Perhaps, somewhere, sometime, researchers with no oversight would construct a tower in secret and one day we would suddenly all find ourselves accelerating skyward before anyone had a chance to react.

Read the rest of the story here. I must protest, however, that the analogy between tower-building and the quest for faster and (in other simple respects such as memory size per square inch) more powerful hardware is a bit of a straw man. No serious AI futurologist thinks that Moore's law in itself will lead to superintelligence.