DeepSeek Cracks the Context Code
DeepThink Mode is a breakthrough in communication, not just reasoning
The release of DeepSeek’s newest model, R1, is sending earthquake-sized tremors through the AI world. Free to use (for now), it matches the performance of OpenAI’s o1 and other paid models. Upending assumptions about how much it costs to train a model and to run one, DeepSeek is a threat to many AI companies but a gift to the user.
The model is sleek and easy to use, and one of its major innovations is that, in DeepThink mode, it exposes its chain of thought. In output thoughtfully structured to make the reasoning clear, the model explains how it interpreted a request and what concepts, ideas, or information it explored to generate an answer.
The step-by-step reasoning is fascinating and incredibly useful. Unlike with other current models, when an answer goes awry there is no need to guess where in the chain from request to interpretation to output the communication broke down. DeepThink signposts the interpretation, allowing you to diagnose what may have gone wrong.
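(If you would rather inspect the trace outside the chat interface, the same chain of thought can be pulled programmatically. The sketch below assumes DeepSeek’s OpenAI-compatible API, the deepseek-reasoner model name, and a reasoning_content field on the reply, as documented at the time of writing; treat those details as assumptions that may change.)

```python
# Minimal sketch: retrieving the exposed chain of thought from DeepSeek's
# OpenAI-compatible API. The model name and the `reasoning_content` field
# reflect current documentation and may change.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",          # placeholder credential
    base_url="https://api.deepseek.com",
)

response = client.chat.completions.create(
    model="deepseek-reasoner",                # the R1 reasoning model
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)

message = response.choices[0].message
print("--- chain of thought ---")
print(message.reasoning_content)              # how the model interpreted the request
print("--- final answer ---")
print(message.content)
```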
This is obviously being hailed as a valuable innovation for understanding the reasoning of an LLM, but I think that description misses the boat. What DeepSeek is really solving is a serious issue with communicating with AI-powered chatbots: the lack of pragmatic context.
Pragmatic context is the water that words swim in. We, speakers and interpreters of speech, use shared cultural norms, background knowledge, speaker intentions, and the very environment itself to go beyond the literal meaning of what is said to understand what the speaker actually intended to communicate. Imagine how stilted and limited language would be without this skill.
Examples of how speakers use context to decode a message are endless. Take this exchange from Barbara Kingsolver’s (excellent) book Demon Copperhead:
“I don’t have any money,” I said…
“Sorry,” she said, and for once I didn’t mind that word. It looked good on Angus. I’d been waiting for it.
“Forget it,” I said. “Can we just get out of here?”
“No. I’m saying, sorry for not getting straight with you. My bad.” She whipped a slice of silver out of her pocket and tilted it up and down in the light, like a mirror flashing code. “Meet the Master”...
Without the vivid description of the credit card, “Meet the Master” would be impossible to interpret. Note that it is not merely the description that allows you to understand—on its own, it would still not be enough—but rather your knowledge that MasterCard is a brand of credit card, making it clear this must be a reference to that company.
Up until now, communicating with AI has been a bit maddening. You launch a conversation, it veers off course, you prompt and prompt, trying to coax a response that better fits your needs, until out it pops, or you give up, never really sure why the exchange failed. You lack the context needed to move the conversation back on track. No longer. The ability to read through how the LLM has interpreted an input provides crucial information about what background knowledge is being used, as well as something that is not far off from “speaker intention.”
Let’s begin with background knowledge. As you read through the chain-of-thought reasoning, you will see what information is referenced in constructing the response. This is clearly useful for understanding the reasoning of the machine, but better yet, it allows you to identify what facts are being used to interpret your prompt. Perhaps they are irrelevant or incomplete; either way, you now have the necessary context to interpret the AI’s response.
Speaker intention is more interesting. Words mean something, but speakers use those words to communicate whatever they want. Speakers have a goal, purpose, or some underlying intention when they communicate, and it is up to you, as the listener, to discern it. AI does not. It has no goal, other than to output the next high-probability token in a string. Generative AI does not use language to do anything; it lacks speaker intentions.
This lack of intention is confusing because AI outputs often mimic how people use language to convey more than just the literal meaning of their words. AI tells jokes, lies, insults, insinuates, covertly suggests, flirts… that is what makes it such a convincing communicator. Nevertheless, none of that is evidence of intent. It is evidence that the ways we use language are deeply embedded in patterns that can be detected by the LLM’s algorithms.
But… by providing insight into how it interpreted your prompt, DeepSeek offers a glimpse into what particular speaker intention it may be mimicking. This is important! Take, for example, the simple problem of humor. Many relationships have been ruined by the failure to recognize a comment as dry wit rather than an insult. Because the chain of thought that led to the reply is shared, you can see whether the chatbot missed your attempt to be funny. You now understand its intent: to generate the next probable token based on a literal interpretation of your words, not their intended humor. It’s not speaker’s intent, but it’s not bad.
This is a step forward in communication! You have background knowledge and a proxy for speaker intent. This is not all there is to pragmatic context (there is no shared situation on which to depend, no actual speaker’s intent, and, with so little known about how LLMs work, arguably an incomplete understanding of cultural norms), but it is a start. With access to the chain-of-thought reasoning, DeepSeek is showing the way toward solving the challenge of context.
