AI FAQ: Why Does AI Hallucinate?
We all know that AI can "hallucinate." But why does it make up false facts and present them so confidently?
This is the first in what I plan to be a series of articles on frequently asked questions about AI. They come out of conversations I've had with clients, colleagues, and others who are simply curious about AI. I hope they're helpful. And for the topics of future posts, I'm taking requests!
By now, you've probably heard of the lawyer who did some of his pre-trial prep on ChatGPT and presented the judge with a brief citing court cases that ChatGPT had completely fabricated. And if you've asked ChatGPT whether it knows who you are, and to provide a few details about your life, you've probably been surprised by what it "knows" about you! (If you haven't done that yet, give it a try!)
We all know that AI can "hallucinate." But why does it make up false facts and present them so confidently? And does that mean that AI is just a waste of time?
What is AI "hallucination"?
First of all, what exactly is this "hallucination"?
When an AI application confidently answers a question with false, unsubstantiated, or unverifiable information, it is said to "hallucinate." Hallucinations can be simple, straightforward answers to questions that are factually false. They can also include fabricated citations. And because generative AI is designed to generate content, it can also produce factually false yet highly believable "evidence" in support of its hallucinations.
So why does it happen?
Why does AI "hallucinate"?
Generative AI applications like ChatGPT are designed for tasks such as text completion: that is, generating text that appropriately follows what came before it. What counts as "appropriate" is based on the statistical patterns in the data the model was "trained" on (in ChatGPT's case, a large swath of the internet). When the preceding text is a question, the highest-probability text to follow will usually take the form of an answer. And when the correct answer is widely known, and therefore well represented in the training data, the highest-probability answer will also be the correct answer.
However, when the correct answer is unknown, in dispute, or overwhelmed by incorrect answers in the training data, the highest-probability answer in the training data may actually be an incorrect answer. In other words, the answer is factually inaccurate, but it is statistically correct in that it represents the plurality of occurrences in the training data. And because the model generates responses based on statistics, rather than some embedded understanding of truth, it is highly sensitive to such factual inaccuracies in the training data.
For example, if you ask the model "What is the capital of Illinois?", the highest-probability response is likely also the correct answer: "Springfield is the capital of Illinois." But the training data also contains many lower-probability responses, such as "Chicago is the capital of Illinois." Some of these incorrect but believable (to those who don't know any better) answers may appear in the model's output from time to time, but only occasionally. If, however, the training data were dominated by content written by people with little education or no close connection to the Midwestern United States, "Chicago is the capital of Illinois" might actually be the highest-probability response in the training data, and therefore the answer the model is trained to "think" is "correct." And if that's the case, it will be the answer the model returns most of the time.
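To make the idea of "statistically correct but factually wrong" concrete, here's a toy sketch. This is not how a real model is implemented, and the probabilities are invented for illustration; it just shows how an answer chosen by probability depends entirely on what dominated the training data:

```python
import random

# Invented probabilities for completions of "The capital of Illinois is ___",
# standing in for patterns a model might learn from its training data.
completions = {
    "Springfield": 0.80,   # correct, and most common in this hypothetical data
    "Chicago": 0.15,       # wrong, but common enough to surface occasionally
    "Peoria": 0.05,        # wrong and rare
}

# With no "creativity", the model effectively returns the single
# highest-probability completion every time.
print(max(completions, key=completions.get))  # Springfield

# If wrong answers dominated the training data, the same logic would
# confidently return a wrong answer instead.
skewed = {"Springfield": 0.30, "Chicago": 0.60, "Peoria": 0.10}
print(max(skewed, key=skewed.get))  # Chicago

# Sampling in proportion to the probabilities occasionally surfaces
# the lower-probability (and here, incorrect) completions.
print(random.choices(list(completions), weights=list(completions.values()))[0])
```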
There's another factor to be aware of when it comes to tools like ChatGPT, however. OpenAI offers a parameter for its GPT models called temperature. Temperature controls how "creative" the model is in its output: that is, how likely it is to return content that is not the highest-probability continuation. Most AI apps built on OpenAI's API use a temperature setting of zero (or close to it). This restricts the model to producing very high-probability content, which is therefore more likely to be factual. The main ChatGPT interface at chat.openai.com, however, uses a relatively high temperature setting.
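As a rough illustration, here's what setting that parameter looks like with OpenAI's Python SDK. This is just a sketch: the model name is an example, and you'd need an API key configured for it to run.

```python
from openai import OpenAI

client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-4o-mini",  # example model name; use whichever model you have access to
    messages=[{"role": "user", "content": "What is the capital of Illinois?"}],
    temperature=0,  # 0 = stick closely to the highest-probability output;
                    # higher values (up to 2) allow more varied, "creative" output
)

print(response.choices[0].message.content)
```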
One reason OpenAI may have chosen a higher temperature for ChatGPT is to generate some wrong answers on purpose, allowing them to gather feedback from users. Human feedback is incredibly helpful in developing and refining a model. For example, knowledgeable Midwesterners can click the thumbs-down on "Chicago is the capital of Illinois" to help improve the model's accuracy in areas where the training data is unreliable. This is especially true for unsupervised models: machine-learning models trained on large bodies of text without any labels denoting "right" and "wrong" information.
That's likely not the only reason OpenAI chose to make ChatGPT more "creative." When the temperature is set to zero (more factual), the prose is often less interesting, because the model is more constrained in what it can produce. A higher temperature produces more readable, more interesting prose, which is desirable for tasks such as creative writing or brainstorming.
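In language models generally, temperature works by rescaling the model's raw scores before they're turned into probabilities; this is a common way sampling is implemented, not something specific to OpenAI. Here's a small sketch with invented scores, showing how a higher temperature flattens the distribution and gives lower-probability continuations more of a chance:

```python
import math

def softmax_with_temperature(scores, temperature):
    """Convert raw scores into probabilities, scaled by temperature."""
    scaled = [s / temperature for s in scores]
    exps = [math.exp(s) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Invented scores for three possible continuations.
scores = [5.0, 3.0, 1.0]

print(softmax_with_temperature(scores, 0.5))  # ~[0.982, 0.018, 0.000]  (near-deterministic)
print(softmax_with_temperature(scores, 1.0))  # ~[0.867, 0.117, 0.016]
print(softmax_with_temperature(scores, 2.0))  # ~[0.665, 0.245, 0.090]  (flatter, more variety)

# In practice, a temperature of 0 is treated as "always pick the top-scoring
# option," since dividing by zero isn't defined.
```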
We're still learning all of the implications of generative AI as people put these new tools to a wide variety of uses, and every decision made in developing these tools comes with trade-offs and unintended consequences.
So when it comes to AI chatbots like ChatGPT and the large language models behind them, it's important to keep in mind that 1) AI chatbots will regularly "hallucinate," 2) most apps built on GPT or another large language model will hallucinate less often than ChatGPT, and 3) they will still hallucinate sometimes, particularly when dealing with new or complex topics.
I hope this short post helps clear things up, at least a bit. Look for more AI FAQ posts to come. And in the meantime, if there's a question you'd like me to address in a future post, drop me a line!