Reasoning tokens expose the internal thought process of advanced models like OpenAI’s GPT-5 and Anthropic’s Claude with extended thinking. These models produce structured content blocks that separate reasoning from the final answer, letting you build UIs that show how the model arrived at its response.
What are reasoning tokens?
When models with reasoning capabilities process a prompt, they generate two distinct types of content:
- Reasoning blocks: the model’s internal chain-of-thought, problem decomposition, and step-by-step analysis
- Text blocks: the final, polished response presented to the user
Both block types are part of the AIMessage and are accessible via its contentBlocks property.
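As a rough illustration of what such a message might contain, the sketch below models the two block types with local stand-in types. The type names and the `reasoning` and `text` field names are assumptions made for this example, not the library's own definitions, so check them against the version you use.

```typescript
// Hypothetical stand-ins for the two block types described above.
// The field names ("reasoning", "text") are assumptions for illustration.
type ReasoningBlock = { type: "reasoning"; reasoning: string };
type TextBlock = { type: "text"; text: string };
type ContentBlock = ReasoningBlock | TextBlock;

// A made-up assistant message: one reasoning block, then the final answer.
const exampleMessage: { contentBlocks: ContentBlock[] } = {
  contentBlocks: [
    {
      type: "reasoning",
      reasoning: "17 is not divisible by 2 or 3, and 5 squared exceeds 17.",
    },
    { type: "text", text: "Yes, 17 is prime." },
  ],
};

// List the block types in the order they were generated.
const blockTypes = exampleMessage.contentBlocks.map((block) => block.type);
console.log(blockTypes);
```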
Not all models produce reasoning tokens. This pattern applies specifically to models that support extended thinking or chain-of-thought output. Standard chat models return only text blocks.
Use cases
- Transparency: show users the model’s reasoning process to build trust in its answers
- Debugging: inspect the model’s thought process to identify where it goes wrong
- Educational tools: teach students problem-solving by revealing how an AI approaches questions
- Decision support: let domain experts validate the reasoning behind recommendations
- Quality assurance: audit reasoning chains for compliance in regulated industries
Extracting reasoning and text blocks
The contentBlocks array on an AIMessage contains all blocks in the order they were generated. Filter them by type to separate reasoning from text.
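One way to do this filtering is sketched below. The loose `Block` interface and the `splitBlocks` helper are hypothetical names introduced for this example, and the separators used to join the parts are a presentation choice rather than anything the library prescribes.

```typescript
// Loose, hypothetical block shape: only the fields this sketch reads.
interface Block {
  type: string;
  reasoning?: string;
  text?: string;
}

// Separate reasoning from text, ignoring any other block types.
function splitBlocks(blocks: Block[]): { reasoning: string; text: string } {
  const reasoningParts: string[] = [];
  const textParts: string[] = [];
  for (const block of blocks) {
    if (block.type === "reasoning" && typeof block.reasoning === "string") {
      reasoningParts.push(block.reasoning);
    } else if (block.type === "text" && typeof block.text === "string") {
      textParts.push(block.text);
    }
  }
  return {
    // Blank line between reasoning steps; text parts are concatenated as-is.
    reasoning: reasoningParts.join("\n\n"),
    text: textParts.join(""),
  };
}
```

Grouping by type like this discards the original ordering, which is fine when the UI shows one thinking area above one answer.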
Accessing messages from useStream
Connect useStream to your reasoning-capable agent and iterate
stream.messages in your chat UI. Branch on HumanMessage.isInstance and
AIMessage.isInstance, then pass each assistant message to a component that
reads contentBlocks and separates reasoning from text. Set isStreaming on
the last message while stream.isLoading is true so thinking blocks update as
tokens arrive.
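The streaming flag described above can be computed with a small pure function. This is a minimal sketch using hypothetical local types: the `role` field stands in for the HumanMessage.isInstance and AIMessage.isInstance checks, and `isLoading` stands in for stream.isLoading. Restricting the flag to assistant messages is an extra guard added here, not something the text above requires.

```typescript
// Hypothetical minimal shapes; real stream and message objects carry more fields.
interface ChatMessage {
  role: "human" | "ai";
  content: string;
}

interface RenderItem {
  message: ChatMessage;
  isStreaming: boolean;
}

// Pair each message with a flag telling the UI whether it is still receiving tokens.
function toRenderItems(messages: ChatMessage[], isLoading: boolean): RenderItem[] {
  const lastIndex = messages.length - 1;
  return messages.map((message, index) => ({
    message,
    // Only the last message can still be streaming, and only while loading.
    // The role check is an added guard for the moment before the reply starts.
    isStreaming: isLoading && index === lastIndex && message.role === "ai",
  }));
}
```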
The code examples use useStream<typeof myAgent> for type-safe stream state. See Type inference for Python or JavaScript backends.
Building a ThinkingBubble component
The ThinkingBubble presents reasoning tokens in a visually distinct, collapsible container. Users can expand it to see the full thought process or collapse it to focus on the final answer.
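The markup for such a container could look roughly like the sketch below. It is written as a framework-agnostic function that returns an HTML string so the structure is easy to see; in a React app the same structure would be JSX with the expanded flag held in component state. The function names, class name, and button labels are all hypothetical.

```typescript
// Escape text so reasoning content cannot inject markup.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render a collapsible container: a toggle button plus a panel it controls.
function renderThinkingBubble(reasoning: string, expanded: boolean, id: string): string {
  const panelId = `${escapeHtml(id)}-panel`;
  return [
    `<div class="thinking-bubble">`,
    `<button type="button" aria-expanded="${expanded}" aria-controls="${panelId}">`,
    expanded ? "Hide thinking" : "Show thinking",
    `</button>`,
    // The panel stays in the DOM but is hidden while collapsed.
    `<div id="${panelId}"${expanded ? "" : " hidden"}>${escapeHtml(reasoning)}</div>`,
    `</div>`,
  ].join("");
}
```

Passing `expanded` in as an argument keeps the function pure, so the caller decides where the open or closed state lives.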
Rendering the complete AI response
Combine the ThinkingBubble and a standard text bubble into a single AIResponse component.
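The composition logic can be sketched independently of any rendering framework. The `planAIResponse` helper below is a hypothetical name: it decides which sub-components an AIResponse would render and with what content, and it leaves out the ThinkingBubble entirely when a message carries no usable reasoning. The block field names are assumptions, as elsewhere on this page.

```typescript
// Loose, hypothetical block shape: only the fields this sketch reads.
interface Block {
  type: string;
  reasoning?: string;
  text?: string;
}

// What the AIResponse should render, in display order.
type ResponsePart =
  | { component: "ThinkingBubble"; reasoning: string }
  | { component: "TextBubble"; text: string };

function planAIResponse(blocks: Block[]): ResponsePart[] {
  const reasoning = blocks
    .filter((block) => block.type === "reasoning")
    .map((block) => block.reasoning ?? "")
    .filter((part) => part.trim().length > 0)
    .join("\n\n");
  const text = blocks
    .filter((block) => block.type === "text")
    .map((block) => block.text ?? "")
    .join("");

  const parts: ResponsePart[] = [];
  // Thinking goes first, and only when there is something to show.
  if (reasoning.length > 0) parts.push({ component: "ThinkingBubble", reasoning });
  if (text.length > 0) parts.push({ component: "TextBubble", text });
  return parts;
}
```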
Handling edge cases
Messages without reasoning
Not every AI message will contain reasoning blocks. When contentBlocks has only text blocks, render a standard message bubble without the ThinkingBubble.
Empty reasoning blocks
Some models produce empty reasoning blocks as placeholders. Filter these out before rendering.
Multiple reasoning-text cycles
A single message can alternate between reasoning and text blocks. If you need to preserve this interleaving, iterate contentBlocks in order rather than grouping by type.
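An order-preserving pass might look like the following sketch. It also drops the empty placeholder reasoning blocks mentioned under "Empty reasoning blocks". The `Segment` type and `toOrderedSegments` helper are hypothetical names, and the block field names are assumptions.

```typescript
// Loose, hypothetical block shape: only the fields this sketch reads.
interface Block {
  type: string;
  reasoning?: string;
  text?: string;
}

// One renderable piece of the message, kept in generation order.
type Segment = { kind: "reasoning" | "text"; content: string };

function toOrderedSegments(blocks: Block[]): Segment[] {
  const segments: Segment[] = [];
  for (const block of blocks) {
    if (block.type === "reasoning") {
      const content = block.reasoning ?? "";
      // Skip placeholder blocks that contain nothing but whitespace.
      if (content.trim().length === 0) continue;
      segments.push({ kind: "reasoning", content });
    } else if (block.type === "text" && block.text) {
      segments.push({ kind: "text", content: block.text });
    }
    // Any other block type is ignored in this sketch.
  }
  return segments;
}
```

Each segment can then be mapped to a ThinkingBubble or a text bubble in sequence, so the UI mirrors the order in which the model produced them.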
Best practices
- Default to collapsed: show reasoning on demand, not by default
- Show character count: gives users a quick sense of how much thinking went into the response
- Differentiate visually: use distinct colors, borders, or backgrounds so reasoning is never confused with the actual answer
- Animate transitions: smooth expand/collapse animations improve perceived quality
- Consider accessibility: use proper ARIA attributes (aria-expanded, aria-controls) on the toggle button
- Truncate in previews: show a short preview of the reasoning when collapsed so users can decide whether to expand
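The character count and collapsed preview from the list above can both come from one small helper. This is a minimal sketch: `summarizeReasoning` and its default preview length of 80 are hypothetical choices, and the count is in UTF-16 code units, which is what a JavaScript string length reports.

```typescript
// Compute the label data a collapsed bubble might show.
function summarizeReasoning(
  reasoning: string,
  maxPreviewChars = 80,
): { charCount: number; preview: string } {
  // Collapse runs of whitespace so the preview fits on one line.
  const collapsed = reasoning.replace(/\s+/g, " ").trim();
  const preview =
    collapsed.length > maxPreviewChars
      ? collapsed.slice(0, maxPreviewChars).trimEnd() + "…"
      : collapsed;
  // charCount reflects the full, untruncated reasoning.
  return { charCount: reasoning.length, preview };
}
```

A collapsed bubble could then show something like the preview followed by the count, leaving the full text for the expanded state.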

