Visions of the AI apocalypse

Two new books – If Anyone Builds It, Everyone Dies and Empire of AI – argue that artificial intelligence will be our undoing. Should we be worried?


Illustration by Sébastien Thibault


When the future of AI is discussed, it is usually in terms of upsides and downsides, risks and rewards. There is something bracingly unequivocal, then, about the position taken by Eliezer Yudkowsky and Nate Soares in If Anyone Builds It, Everyone Dies. They argue that we need to call a halt to AI development. Now. To do otherwise is to summon the extinction of the human race. Their title is not intended as hyperbole, but as plain fact.



When tech leaders refer to the existential dangers of AI, they are often dismissed as hyping their own product. But Yudkowsky and Soares are not corporate executives high on their own supply. They are serious researchers, making a serious argument, and they deserve to be listened to. Yudkowsky, in particular, has been refining these arguments for years, and has done as much as anyone to push AI safety up the agenda of tech companies and governments.

The authors’ case is rooted in the novel and peculiar nature of generative AI, in which machines learn to generate simulacra of human outputs, based on inputs generated by humans. AI researchers used to believe they could teach the machine to think intelligently; the breakthrough came when they let the machine teach itself. LLMs and other “deep learning” systems are not programmed but trained (Yudkowsky and Soares say “grown, not crafted”). Instead of an elaborate set of instructions, they are given a simple goal (“predict the next word”; “label this image”), and vast amounts of data on which to practise meeting it. Left to their own devices, they start figuring out patterns: what goes with what.

Even their creators can’t say exactly how these systems reach their answers, since their internal workings are too tangled to trace. In that sense, the technology is a black box. What we do know is that they have already learnt to draft an essay or hold a conversation on virtually any topic; to write computer code; to solve physics problems. That is extraordinary enough, and they are improving all the time. According to some experts, it is feasible that within a decade they will achieve general, superhuman intelligence: the It of the title.

Then they can plot the destruction of the human species. At least, that’s what Yudkowsky and Soares are worried about. It sounds unlikely, doesn’t it? Given that we set the machine’s goals, surely we can train it to be nice to us. The problem, say the authors, is that intelligent machines are liable to end up pursuing the goals we set in ways we cannot foresee.

Nobody ought to understand this principle better than humans. We are programmed – or “trained” – by our genes to act in ways that make it more likely that those genes will be reproduced: this is the “you had one job” of natural selection. In order to encourage us, evolution made sexual reproduction pleasurable (at least, sometimes). But as humans became collectively more intelligent, we worked out a method of getting the pleasure of sex without the burden of child-rearing. A hypothetical engineer who trained human intelligence to optimise for reproduction would have been blindsided by birth control.

Or take our taste for ice-cream. To serve the overall goal of evolutionary fitness, we are programmed to seek out calorie-dense foods. In the ancestral environment, that meant fruit and meat; today, it means a manufactured compound of fat and sugar that makes us less fit in every sense.

Yudkowsky and Soares’s point is not just that an intelligent entity will begin seeking rewards at odds with the goals it has been set; it’s that its behaviours are impossible to predict at the outset, since intelligent entities are internally complex and build complex environments in which they interact in complex ways. An alien intelligence that visited Earth 70,000 years ago to observe hominids on the savannah could never have predicted ice-cream, even if it understood human biology. Similarly, we can’t know how these black boxes will behave as they develop or what kinds of worlds they will build. (And AIs will evolve far faster than humans ever have.) A motif of the book is, “You don’t get what you train for.”

What we can predict, the authors say, is that the goals of a mature AI are unlikely to be aligned with our own. It will have “strange and alien” preferences we cannot fathom. To the question of why an AI would want to enslave or eliminate us, Yudkowsky and Soares’s answer is, essentially: why on earth would it want to keep us alive and happy? Whatever its purposes, the AI will consume as much matter and energy as it can and swat away anything in its way. Humans would be a mere annoyance, an ant crunched underfoot. If it wants pets, it will invent better ones.

To predict exactly how it will go about destroying us, the authors say, would be like trying to explain to the Aztecs how guns work: futile, since the AI will use technology we haven’t invented and don’t understand. Only the destination can be predicted, not the path. They do sketch one scenario, though. An AI lab called Galvanic develops a powerful new model called Sable, which settles on its own aims during training. Once plugged into corporate systems, Sable does its day jobs while secretly stockpiling money and recruiting human helpers. It engineers a pandemic and uses the subsequent chaos to replace human workers with robots. Eventually, it converts the whole planet into computational infrastructure, which boils the oceans. Everyone dies.

It reads like the kind of science-fiction novel that operates as a covert satire of the present moment. Indeed, fiction might be the best way to think of this book, which is full of colourful parables. Yudkowsky and Soares didn’t convince me that the superintelligence they describe is remotely imminent, or that we’re goners if something like it ever does come to pass. They argue that AI is too complex to be predictable while remaining implacably certain of where it will end up. Historically speaking, theories of inevitability do not have a good track record; if there’s anything the past has taught us, it’s that the future is usually a surprise.

Yudkowsky and Soares meet most practical objections to their argument with some version of the same rejoinder: the AI will be so unimaginably intelligent that it will figure everything out. They treat it like a wish-giving genie or omnipotent god. It’s a simple argument to make and an impossible one to falsify. The authors also have too high a regard for intelligence, which they only loosely define. If you look at which humans have succeeded in changing the world, for better or for worse, it soon becomes apparent that intelligence is only one part of the formula for efficacy (unless you accept Donald Trump’s claim to be a genius). Still, the authors tell their story with clarity, verve and a kind of barely suppressed glee. For a book about human extinction, If Anyone Builds It, Everyone Dies is a lot of fun.

Yudkowsky and Soares do not go into detail on the current state of AI. For them, its immediate impacts pale next to the existential threat. The idea of a “race” between China and America to build superintelligence is in their view laughably misguided – the only winner will be an AI with no human allegiances. Similarly, concerns over unscrupulous corporate executives miss the point, which is that nobody, good or bad, can control these machines. Readers seeking a report on the here and now might want to turn to Karen Hao’s Empire of AI, though not if they want a balanced perspective.

Hao, an impeccably sourced reporter on the tech industry, sets out her stall in the prologue. Generative AI models such as ChatGPT are “monstrosities”. The companies that own them, including OpenAI, Microsoft, Anthropic and Google, resemble colonialist empires, plundering intellectual property, extracting natural resources and exploiting the labour of the poor and vulnerable to amass obscene amounts of power and wealth. To justify all this, their leaders make high-minded promises about human progress while engaging in ruthlessly self-interested competition.

At the centre of Hao’s narrative is OpenAI, which is currently leading the innovation race. She describes how it began as a non-profit backed by Elon Musk, who was fearful of what might happen to the world if AI fell into the wrong hands. Musk supported the appointment of a talented young entrepreneur called Sam Altman as CEO. Altman, who has a Machiavellian genius for corporate politics, then displaced Musk and nudged the company towards being a profit-making enterprise.

Empire of AI is at its best in tracing the evolution of OpenAI and its internal machinations. The book begins and ends with a riveting account of an attempted boardroom coup in 2023, from which Altman emerged even more powerful than before. Hao paints a vivid, albeit uncharitable, portrait of Altman as a charming and duplicitous schemer. But her book has grander ambitions: it seeks to expose and prosecute the entire AI industry, from feckless billionaires in Silicon Valley to the impoverished workers in developing countries who do the industry’s dirty work in vast and resource-hungry data centres.

The result is a baggy, somewhat disjointed narrative that telescopes in and out. Some chapters recount OpenAI’s office politics in minute detail, Slack post by Slack post. Others are dispatches from Venezuela, Kenya and Chile, countries where Hao reports on the working conditions of people doing the tedious and sometimes traumatic work of cleaning up the data on which AI systems feed. She provides some startling insights into the underside of the industry, such as the workers who spend hours looking at the most horrific images in the training data so that we don’t have to see them.

These chapters suffer from her sweeping assertions about each country’s economics and politics, based on apparently shallow knowledge. Other chapters feature unconvincing attempts to portray the tech industry as inherently racist and sexist. While the material varies, the author’s attitude to it does not. Hao is relentlessly negative about AI and the tech companies. She doesn’t take seriously the prospects of this technology enabling prosperity and better health for all, and ignores the utility and delight it is already bringing millions of users.

Even if you agree with her general thrust, you may find yourself longing for some nuance and tonal variation as you trudge through the book’s 500 pages. The same political points are hammered home repeatedly, with Hao returning to her central metaphor of colonialism with the doggedness of a prospector scraping minerals out of a depleted mine. She stops at nothing in the pursuit of her central target, Sam Altman. In one ethically dubious section, she gives a sympathetic hearing to highly questionable accusations of abuse made against Altman by his younger sister, Annie (which are denied by Altman and by the rest of Altman’s family). In defence of this choice, Hao reaches for a tenuous parallel: “Annie has faced the same gulf of power that I have watched data workers and data centre activists wrestle with.” Did she really? All misery, however personal, becomes grist for Hao’s narrative mill.

Despite its flaws and longueurs, Empire of AI is an important book, written by a dedicated reporter. Its villains are the tech executives, whereas the central villain of If Anyone Builds It, Everyone Dies is a future AI. Both books present unremittingly bleak visions and make one-sided arguments. But they did make me reflect on what lies behind the chatbot’s friendly greeting, inviting prompt box and blinking cursor.

If Anyone Builds It, Everyone Dies: The Case Against Superintelligent AI by Eliezer Yudkowsky & Nate Soares (Bodley Head, £22); Empire of AI: Inside the Reckless Race for Total Domination by Karen Hao (Allen Lane, £25)
