SUNDAY, 6 JULY 2025
AI is undeniably the topic du jour. The newspapers are full of it, videos are popping up like a plague of rabbits all over YouTube, and many of my online students in the arts and humanities seem desperate to discuss its ramifications. I’m wary about jumping on this bandwagon because, as Manuel famously said, ‘I know nothing’, but, hey, that’s never stopped me before, so here goes.
Sometimes it feels as if all we can do as non-experts is watch from the sidelines as the heavyweights in the field slug it out. In one corner we have Kokotajlo, who argues that AI is an existential threat to human beings and may have wiped us out entirely by 2030, and in the other we have Narayanan, who points to the limitations that reality always imposes on any new technology, slowing its uptake and blunting its eventual influence on human society. There are experts like Searle and Larson who argue that machine intelligence is fundamentally different from human intelligence, and that things like HAL’s takeover of the spaceship in 2001: A Space Odyssey will therefore always remain the stuff of science fiction, while others like Kurzweil argue that we are heading for the ‘singularity’, when AI can perform all mental tasks, including ones that we see as uniquely human, better than we do.
A key reason Kokotajlo advances for his forebodings about AI is the claim that it has already started to lie to us. At the risk of asking a technologically naive question, how can we be sure of this? In order for anyone or anything to lie, it must first have a concept of truth and make a distinction between what is true and what is not. Then it must decide to choose the lie. And a deliberate lie always involves a reason to lie, a motivation.
Nor is it ever accidental: there is a clear difference between lying and misinforming. If I go into a chatroom and claim that I’m a hunky guy of 36 when in reality I’m 72 and decrepit, clearly I am lying, because what I’m saying doesn’t correspond with what I know to be true. But if I tell you that Rio is the capital of Brazil in the honest belief that this is the case, I’m not. I may be misleading you, but I’m not lying to you. I have learned a fact that does not correspond to reality – a false belief – and am propagating it, but my error is an innocent one. In contrast, there is necessarily an ethical dimension to lying, because there is always some intention to benefit from the lie, or at the very least to satisfy some psychological need.
This means that when we think that AI is lying, we need to be absolutely sure that what is happening is not a genuine error. And lies can be hard to prove. Even when another human being is lying to us, it is often impossible to be sure, and in most situations we have to rely on other evidence, such as the facial expressions and body language of the speaker, and our own intuition, to make our decision about their honesty or mendacity. This difficulty is hugely compounded when communication is taking place via text on a screen with an AI that has no bodily reality, so that we have no non-textual clues to guide us. We can’t begin to glimpse into the black box of its ‘mind’, so how can we judge it to be honest or dishonest?
It’s true that in some cases we can know for sure that a lie is taking place. In my example of the chatroom, a friend who is sitting beside me and watching the screen as I type will know that I am lying (although he is not the target of my lie, of course, which is the salient fact: I am lying to the person in the chatroom, not to him). Are Kokotajlo and the AI experts in a similar position to my friend, who can clearly see that a lie is being told? If so, can they demonstrate, with simple examples that a non-expert can understand, why they are so sure that deception is taking place? For instance, maybe they feel confident that the size of the database ensures a consistency of data which guarantees that mistakes of this kind cannot happen, since AI will have millions of references to Brasilia as the capital of Brazil in its database and so in theory should never tell me it’s Rio. But even then, how can we be sure that a response that looks like a lie wasn’t a blip, a gremlin in the machine? How can we be sure that it was motivated?
There is a tendency for us to anthropomorphise non-human animals. When we see the open mouth of a dolphin, it’s very hard not to interpret it as a smile. When we see a mother bird pretend to have a broken wing to lure a predator away from her chicks, we impute human reasoning and emotion to the action: ‘If I trick it into thinking that I am wounded, I can save my babies’. But surely what is happening in this case is not the kind of conscious, rational thinking a human being would engage in; hard as it might be at times to imagine how this behaviour can result simply from natural selection, it must be a kind of instinct built into the genes. And this tendency to see non-human animal behaviour in human terms is extended even to machines: ‘My computer is in a foul mood this morning’. Ditto the pathetic fallacy: skies that weep because my love has left me and sunlight that skips merrily on the hillside to celebrate his return. Could our propensity to weave human feelings and thoughts into non-human objects be colouring our attitude to AI and imbuing it with an agency it doesn’t possess?
So far I’ve focused on basic questions to which there is an answer which is unequivocally correct, like the capital of Brazil. What happens if I ask more difficult questions where there are no undisputed answers, such as: ‘Was Picasso or Matisse the greater artist?’ If you asked me that as a person, my reply would necessarily be complex, involving a host of factors such as my personal tastes and temperament, my history and knowledge of their work, my views about the purpose of Art, my thoughts on how we measure greatness, and so on. At heart, no matter how extensive my familiarity with the two artists and how sincere my desire to be objective, my final conclusion would still be deeply personal. AI’s response, in contrast, would more likely be an ‘on-the-one-hand, on-the-other-hand’ piece of writing created by a trawl through the database, which would probably be adequate for an essay written by an undergraduate in art history, but I doubt there would be much sign of a truly personal response gradually built up over years of exposure to these two artists. An instruction could be installed in the AI program that its response must include some kind of final evaluation when asked this type of question, to avoid its producing what would be little more than an overview, but that would still not show the deep involvement of a human being who cared about Art: basically it wouldn’t get to the heart of the question, because the answer will always be personal. (Overall I’d plump for Matisse, by the way.)
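For what it’s worth, my understanding is that such an ‘instruction’ would most likely take the form of a standing ‘system prompt’ that gets packaged up with the user’s question before it ever reaches the model. Here is a minimal, purely illustrative sketch in Python; the ask_model() call and the wording of the prompt are my own hypothetical inventions, not any particular vendor’s interface:

```python
# A minimal, illustrative sketch of the kind of 'instruction' described above.
# ask_model() is a hypothetical placeholder, not a real API call.

SYSTEM_PROMPT = (
    "You are an art critic. When asked to compare two artists, "
    "survey the arguments on both sides, but you MUST end your answer "
    "with a one-sentence personal verdict naming a single artist."
)

def build_messages(question: str) -> list[dict]:
    """Package the standing instruction and the user's question together."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": question},
    ]

messages = build_messages("Was Picasso or Matisse the greater artist?")
# reply = ask_model(messages)   # hypothetical call to some chat model
```

Even so, a verdict produced to order is not the same thing as a judgement arrived at over a lifetime of looking.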
Unless we are like the protagonist in Cockcroft’s The Dice Man, a decision to lie is never random: we never lie for no reason. Therefore, if Kokotajlo is correct and AI is proved to be lying, this suggests that it has already reached a level of consciousness, or at least of emotional preference, that only carbon-based life forms have appeared to exhibit until now. This seems a huge assumption. Many philosophers of mind argue that thought without emotion is impossible: that the kind of logic Spock portrays in Star Trek, with its superior clarity based on detachment from all feeling, is a chimera. I must admit that the relationship between thinking and feeling is something which I struggle to get my head around, like the related question of whether there can be thought without language, but at the level of intuition it feels right to me that one phenomenon (thinking) requires the other (feeling). It seems to me that thought – as opposed to data – must take place from a personal perspective, and for us as human beings that essentially means within the brain/mind and the body. Indeed, the move towards the search for Artificial General Intelligence (AGI) and attempts to go beyond LLMs seem to reflect a recent awareness of the limitations of an AI based purely on disembodied logic.
With regard to the ‘hallucination’ that sometimes happens in AI, is it possible that this is genuine error, perhaps the result of a database so large that it ceases to function reliably? If I understand correctly, one of the historical brakes on the advance of AI has been limits to computing power, but specialised hardware such as GPUs has gone a long way towards removing that brake, and today’s state-of-the-art systems can churn through billions of parameters at unimaginable speed. But is access to a memory base the only thing that is essential for intelligence to occur? Cognitive psychologists and neuroscientists highlight the role of forgetting in mental function in human beings, and the problems that can result when information piles up in the brain and none gets discarded. Could the database of a machine become similarly clogged up? And could a ‘rogue’ piece of information – e.g. Rio as the capital of Brazil – somehow become the one that AI picks up from the mass of data on occasion?
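One small thing that helped me with this question, for what a layperson’s gloss is worth: as I understand it, these systems assign probabilities to possible next words and then sample from them, so an unlikely answer – Rio instead of Brasilia – can occasionally be drawn purely by chance, with no ‘motivation’ behind it. The numbers in the little sketch below are invented for illustration, not taken from any real model:

```python
import random

# Invented, illustrative probabilities for the answer to
# "What is the capital of Brazil?" -- not real model outputs.
candidates = {"Brasilia": 0.97, "Rio de Janeiro": 0.02, "Sao Paulo": 0.01}

def sample_answer() -> str:
    """Draw one answer at random, weighted by its probability."""
    answers = list(candidates.keys())
    weights = list(candidates.values())
    return random.choices(answers, weights=weights, k=1)[0]

# Ask the question many times: the wrong answer surfaces now and then,
# with no intention behind it -- just the luck of the draw.
draws = [sample_answer() for _ in range(10_000)]
print(draws.count("Rio de Janeiro"), "out of 10,000 answers were 'Rio'")
```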
I’m sure that experts in the field will find my questions simplistic or naive, because they must have asked themselves the same ones a thousand times already and feel confident that they have laid these doubts to rest. But I think it’s important that laypeople like me try to understand what’s happening in AI even if we can never become an informed voice in the debate. I know from my limited experience of helping people to write English-language theses on AI that many workers in the field seem to take a rather laissez-faire attitude to its moral and social consequences, seeing it as a series of technical problems to be solved rather than a technology that they are unleashing on the world for better or worse. Scientific objectivity comes at a price.
I also realise that a lot of the public disquiet surrounding AI is a reprise of similar panics that have followed most new technologies; for example, the purported fears of Victorians (possibly an urban myth) that the human body could not travel at the speed of a train without incurring serious injury. I’m currently reading a book about the history of apocalyptic thinking which is making me more aware of the seductive lure that catastrophe has for the human mind, and as a result I am becoming more sceptical of Kokotajlo’s dire warnings. On the other hand, just because the apocalypse has never arrived in the past doesn’t prove it won’t smack us in the face this time. Maybe this time we’re Russell’s chicken and it’s Christmas morning.
Even if Kokotajlo is right and we are doomed, there seems to be nothing we can do about it, because nations are never going to voluntarily pause research into AI in view of its huge potential for economic and military benefits. So I suggest that maybe the best approach laypeople can take to AI is simply to hope that Kokotajlo’s nightmare scenario never occurs and to focus on other issues instead, such as AI’s psychological, social and economic effects, the areas thinkers like Harari are exploring in the hope of attenuating the damage it might cause (although even he seems to be edging towards the dark side these days). Perhaps we laypeople have more to contribute in these societal areas which some AI scientists seem ready to abandon, because we don’t need to be experts in computing, neuroscience or philosophy of mind in order to take part in the discussion. But whether we can make any difference and put the brakes on the juggernaut, only time will tell.