Surprise in the age of AI

As intelligence becomes modeled, surprise becomes endangered

By Isaac Hodes


Because of AI, my children will grow up in a world materially different from the one I grew up in. Beyond the self-driving cars, AI slop, and perfectly tailored entertainment, something more fundamental to the nature of our understanding of being is changing. When so much of our language can be convincingly modeled by several unimaginably enormous matrices multiplied and added together, what does it mean to be? 

A large language model is a statistical model that takes in a sequence of words (context) and outputs the word most likely to come next. For example, you might provide the model with the context “The capital of France is”, and it might respond with “Paris”. In practice, we apply the model recursively to an ever-growing sequence of words until the model outputs a magic word that lets us know that it’s done for now. In a chat interface, that’s when we get to respond before the loop continues.
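The loop described above can be sketched in a few lines. This is a toy, assuming a hypothetical lookup table in place of a real model’s forward pass, with "<eos>" standing in for the magic stop word:

```python
# A toy sketch of the decoding loop. NEXT_WORD is a stand-in for a
# real model: it maps a context to its most likely next word.
NEXT_WORD = {
    ("The", "capital", "of", "France", "is"): "Paris",
    ("The", "capital", "of", "France", "is", "Paris"): "<eos>",
}

def next_word(context):
    """Stand-in for the model's forward pass."""
    return NEXT_WORD.get(tuple(context), "<eos>")

def generate(context):
    """Recursively extend the context until the stop word appears."""
    words = list(context)
    while True:
        word = next_word(words)
        if word == "<eos>":
            return words
        words.append(word)

print(" ".join(generate(["The", "capital", "of", "France", "is"])))
# → The capital of France is Paris
```

The same loop, with a trillion-parameter network behind `next_word` instead of a two-entry dictionary, is what runs behind a chat interface.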

How do these models learn to do this? A model might start as a set of one trillion random numbers. It laps up mountains of data, first learning to respond to context with the most likely next word, using a corpus of trillions of words from the internet as its measure of likelihood. Models that stop training at this point are feral: they don’t follow instructions, they parrot the more vile corners of the internet, and they are generally unhinged. Once their initial training (pretraining) completes, they begin to be domesticated. They are fed a diet of much smaller sets of carefully crafted, curated, and tweaked examples of good and bad sentences. Through this method, they are taught to follow instructions (instruction tuned) and to be polite and honest (aligned). Finally, with the power of reinforcement learning, many of their skills are sharpened through a gauntlet of graded exercises in generally more objective areas such as math and coding. Now we have a scholar. After all of this learning, which requires some 10^26 arithmetic operations and costs on the order of $50,000,000, we’re left with an adjusted set of one trillion numbers that encode much of what humans are capable of communicating.
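Stripped of its trillion parameters, the heart of pretraining is learning which word tends to follow which. A minimal sketch of that idea, assuming a twelve-word “corpus” in place of trillions, and a simple bigram count in place of a neural network:

```python
from collections import Counter, defaultdict

# The simplest possible version of "learn the most likely next word
# from a corpus": count which word follows which. Real pretraining
# does this with a neural network over trillions of words; the
# principle being illustrated is the same.
corpus = "the capital of france is paris and the capital of italy is rome".split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def most_likely_next(word):
    """Return the word most often observed after `word` in the corpus."""
    return counts[word].most_common(1)[0][0]

print(most_likely_next("capital"))  # → of
```

A neural network generalizes where this table can only memorize, but both are fit to the same objective: make the corpus look as unsurprising as possible.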

If you and this model are given the same context, and you both consistently respond in a similar way, you’ve been well-modeled. If you consistently prefer the model’s output to your own, you’ve also been well-modeled. Being relatively normal means you will be modeled: your possibilities and preferences can be found in the expected output of a large model. If you’re well-modeled, you’re predictable, and can thus be commoditized in the datacenters of our largest technology corporations. If this is not appealing, you may instead be well-served by being surprising.

The concept of minimizing surprise is central to artificial intelligence. With some creative liberties taken, Kullback–Leibler (KL) divergence is a measure of just how surprised you’d be, on average, when you compare reality against the output of a statistical model you thought was correct. Training large language models is entirely about minimizing this divergence from reality, as represented by trillions of words of text from billions of people over hundreds of years. If you are surprising when compared to an LLM trained in this way, there’s a good chance you’re interesting. Or crazy.

Being in the tails of the distribution of humanity is not at all an unvarnished good. Much legal, religious, and social pressure has been expended on reducing the width of those tails: don’t steal (imprisonment), don’t worship false idols (stoning), and don’t be a dweeb (swirlies). Surprise alone should not be our goal, and minimizing divergence can serve a valuable purpose in society at large. At the same time, if I were to be well-modeled by an LLM, I would be distraught. At going frontier model inference rates, it would mean that $10 could purchase 1,000 pages of writing that’s not meaningfully different from what I would’ve written given the same context, only faster. Further, as we are shaped by the content we consume, and as we increasingly consume generated content that represents the least-surprising middle ground of humankind, we will ourselves be gathered further into the body of the distribution. This is not a novel phenomenon: from the colocation enabled by agriculture, to religion, the printing press, and mass media, we’ve continued to converge. The tails will continue to recede, and we will progressively cease to surprise.
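The $10 figure is a back-of-envelope estimate, and can be checked as one. Every number below is an assumption, not any provider’s actual price sheet: roughly 500 words per page, about 1.3 tokens per word, and around $15 per million output tokens, a plausible frontier-model rate at the time of writing:

```python
# Back-of-envelope check of the "$10 buys 1,000 pages" figure.
# All inputs are assumptions for illustration.
pages = 1_000
words_per_page = 500            # assumed typical page length
tokens_per_word = 1.3           # assumed English tokenization rate
price_per_million_tokens = 15.0  # assumed output-token price, USD

tokens = pages * words_per_page * tokens_per_word
cost = tokens / 1_000_000 * price_per_million_tokens
print(f"${cost:.2f}")  # → $9.75
```

Under these assumptions the claim holds: about 650,000 tokens of output for under ten dollars.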

I want my children to be able to offer more to the world than what the tensor processing units inside of Google’s datacenters are spitting out in regular silent staccato at a rate of trillions of words per second. I want my children to offer surprises to the world. 

In the end, is being surprising good? I think reasonable people might disagree, and plenty of good people may not be terribly divergent. It’s clear that a huge amount of economically valuable work is unsurprising. It’s clear that many wonderful people who have intrinsic value are not terribly surprising. At the same time, those who are noteworthy in their kindness, intelligence, humor, or talent are noteworthy because they’re surprising in their own way. Because they diverge from the expected. Delight stems from surprise, and if nothing else, I want my children to continue to experience delight well beyond their childhood.

In a world where the expected is a moment behind a blinking cursor connected by a thin glass thread to one trillion numbers selected by closely studying everything that’s been said before, focusing on the least surprising things in particular, and then filed down to upset the least, we will have statistically modeled comforting and excellent normalcy and distributed it to everyone. In that world, surprise will be endangered. Ferlinghetti instructs us to “create works capable of answering the challenge of apocalyptic times”. As surprise recedes from the world, his admonishment will become increasingly important to heed.

In many ways, the process of training an LLM and the process of raising a child are similar. There’s a similar progression from feral toddler to scholar, and from a broad base of knowledge and socialization to specialization and precision. This child-rearing and education pipeline produces both adults who are interesting and adults who are not. And as with model training, we often try to shield our children from examples from the tails. In model training, this is another place KL divergence comes into play: we use it to reduce the impact a truly surprising example can have on the model’s learning. Trainers of large language models do this because they’ve seen that when they don’t shelter the models from this kind of surprise, the models diverge too much and can even lose coherence, in something known as “model collapse”. We often do the same with children, sheltering them from malinfluence. I’m of the opinion that this can be an important thing to do.
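The way KL divergence tempers learning can be sketched as a penalty on reward. This is a simplification of the RLHF-style objective, assuming the common per-token estimate of KL from log-probabilities; the function and the numbers are illustrative, not any lab’s actual recipe:

```python
import math

def penalized_reward(reward, policy_logprob, reference_logprob, beta=0.1):
    """Reward minus a KL-flavored penalty: the per-token estimate
    log pi(x) - log ref(x) grows as the tuned model drifts from its
    reference, so surprising moves are discouraged."""
    kl_estimate = policy_logprob - reference_logprob
    return reward - beta * kl_estimate

# A completion the reference model also finds likely: tiny penalty.
print(penalized_reward(1.0, math.log(0.20), math.log(0.18)))

# A "surprising" completion the reference finds implausible: the
# penalty eats into the reward, pulling learning back toward the
# familiar.
print(penalized_reward(1.0, math.log(0.20), math.log(0.001)))
```

The coefficient `beta` sets how tightly the student is tethered to its upbringing: zero lets it wander the tails, too large and it never leaves home.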

It’s not obvious what will cultivate healthy divergence, but it’s instructive to examine what doesn’t lead to it. The contenders for unhealthy divergence include abuse, substance abuse, and similarly extreme environments. The contenders for convergence include a steady consumption of unchallenging material and exposure to social pressures. Surprising and healthy people can come from both groups, but this is not common. 

If I consider some examples of healthfully divergent people, I begin to guess at what ties them together, and what I might do to guide my children to experience and generate surprise. There are many more anecdotes that deserve a careful analysis, but many of the people you might point to as divergent emerged strengthened from some hardship (Ruth Bader Ginsburg’s mother died when she was in high school), were instilled with confidence (Nelson Mandela was royalty), had their curiosity nourished (Bertrand Russell’s tutor was a practicing biologist; his godfather a renowned philosopher), and were given time and space to practice.

At this point it’s worth noting an interesting difference between the training of language models and of humans. Large language models consume tens to hundreds of trillions of words in the course of their training, and, as the past six years (now known as the Scaling Era) have shown, adding more data and computational power to the training of models continues to improve them on all the structured grading scales (benchmarks) we care about. And yet a person might consume one hundred-thousandth of that over the course of their life. People can learn so much, and be so talented in their unusual and unique ways, even with such a relative paucity of data. This strength isn’t necessarily sustainable (I won’t pretend to know the direction of future AI research), but in it I find some comfort, and a warning. What we consume, who we spend time with, and what we practice shape us. We can cultivate surprise, even while divergence is compressed as normalcy is distributed.
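The hundred-thousandth figure is straightforward arithmetic, worth making explicit. Both quantities are rough assumptions: on the order of 100 trillion training words for the model, and on the order of a billion words heard and read across a human lifetime:

```python
# The "one hundred-thousandth" comparison, made explicit.
# Both figures are order-of-magnitude assumptions.
model_training_words = 100e12   # ~100 trillion words of training data
human_lifetime_words = 1e9      # ~1 billion words heard and read

ratio = human_lifetime_words / model_training_words
print(ratio)  # → 1e-05, i.e. one hundred-thousandth
```

Shift either estimate by an order of magnitude and the conclusion survives: human learning runs on a vanishingly small fraction of the data.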