Why misaligned AGI won’t lead to mass killings (and what actually matters instead)
The real threats come from human misuse, not AI itself—so should we focus on alignment, or on building AI that benefits humanity?
This was initially published as a follow-up comment to this comment on Less Wrong.
I still don't understand the concern about misaligned AGI regarding mass killings.
Even if AGI wanted, for whatever reason, to kill people: as soon as that happens, the physical force of governments comes into play. For example, the US military will NEVER accept any force becoming stronger than itself.
So essentially there are three ways such a misaligned, autonomous AI with the intention to kill could act, i.e. three strategies it could pursue:
"making humans kill each other": The AI could operate through something like a cult (similar to contemporary extremist religions, which invent their own justifications for killing humans; there are enough blueprints for that). But every human following such "commands to kill" given by the AI would simply be part of an organization deemed terrorist by the world's governments, and those governments would use all their powers to exterminate the followers.
"making humans kill themselves": Here the AI would apply intense, large-scale psychological torture to every aspect of life, pushing the majority of humanity into a state of mind (either deeply depressed or wildly euphoric) in which people believe they actually want to commit suicide. So, essentially a suicide cult. Protecting against this means building psychological resilience, but that is an educational matter (i.e. highly political), related to personal development and not technical at all.
"killing humans through machines": One example would be the AI building its own underground concentration camps or other mass-killing facilities, or building robots to do the mass killing. But even if it were able to build an underground robot army or underground killing chambers, the logistics alone would raise suspicion: even if an AI-run concentration camp could be built at all, the population would still need to be deported to these facilities, and as far as I know, most people don't appreciate their loved ones being industrially murdered in gas chambers. The AI simply won't be physically able to assemble enough resources to gain more physical power than the US military or, for that matter, most other militaries in the world.
I don't see any other ways. Humans have been pretty damn creative in how they commit genocides, and if any computer started giving commands to kill, the AI would never have more tanks, guns, poisons, or capabilities to hack and destroy infrastructure than Russia, China, or the US itself.
The only genuine concern I see is that AI should never make political decisions autonomously, i.e. a hypothetical AGI "shouldn't" aim to take complete control of an existing country's military. But even if it did, that would just be another totalitarian government, which is unfortunate, but also not unheard of in world history. From a practical standpoint, i.e. in terms of lived human experience, it doesn't really matter whether it's a misaligned AGI or Kim Jong-Un torturing the population.
In the end, psychologically it's a mindset thing: either we take the approach of "let's build AI that doesn't kill us", or, from the start, we take the approach of "let's build AI that actually benefits us" (like all the "AI for Humanity" initiatives). It's not as if we first need to solve the killing problem, and only once we've fixed that once and for all can we make AI good for humanity as an afterthought. That would be the same fallacy the entire field of psychology fell into: for many decades it was pathology-focused (i.e. merely intending to fix issues) instead of empowering (i.e. building a mindset so that the issues don't happen in the first place), and only positive psychology is finally changing that. So it very much is about optimism instead of pessimism.
I do think that talking about these "alignment" questions is not completely pointless. Not because it will change anything about the AI, but because it might lead the software engineers behind it to finally adopt some morality themselves (i.e. decide whom they want to work for). Before there's any AGI that wants to kill at scale, your evil government of choice will do so by itself, and such governments will make all of us software engineers pretty good job offers. The banality of evil is a thing.
Every misaligned AI will initially need to be built and programmed by a human just to kick off the mass killing. And that evil human won't give a single damn about all the thoughts, ideas, strategies, and rules that the AI alignment folks are establishing. So if AI alignment work will evidently have no effect on anything whatsoever, why bother with it instead of working on ways AI can add value for humanity?
Or am I completely wrong, and fundamentally missing a way in which AGI will bring true innovation, and even total monopolization, to the global industry of genocide? Tell me, from a very pragmatic, practical point of view: how would it outsmart humans when it comes to genocide?
For more on my opinions, especially why I believe that “safe AI” is a fallacy anyway, see the final sections of my post Debunking the myth of safe AI.