Thinking Translation Model Alignment Summary
Last Updated: 2025-11-25 | Status: Verification Complete ✅
Overview
We have implemented a Thinking Translation Model that maps Claude Code's native `thinking.budget_tokens` parameter onto the reasoning controls of six AI providers. When a user requests a specific thinking budget (e.g., "Think for 16k tokens"), the budget is translated into the native control mechanism of the target model, or dropped where the provider exposes no such control.
Provider Alignment Matrix
| Provider | Model | Claude Parameter | Translated Parameter | Logic |
|---|---|---|---|---|
| OpenAI | o1, o3 | `budget_tokens` | `reasoning_effort` | < 4k: `minimal`; 4k-16k: `low`; 16k-32k: `medium`; > 32k: `high` |
| Google | Gemini 3.0 | `budget_tokens` | `thinking_level` | < 16k: `low`; >= 16k: `high` |
| Google | Gemini 2.5/2.0 | `budget_tokens` | `thinking_config.thinking_budget` | Passes exact budget (capped at 24,576) |
| xAI | Grok 3 Mini | `budget_tokens` | `reasoning_effort` | < 20k: `low`; >= 20k: `high` |
| Qwen | Qwen 2.5/3 | `budget_tokens` | `enable_thinking`, `thinking_budget` | `enable_thinking: true`; `thinking_budget`: exact value |
| MiniMax | M2 | `thinking` | `reasoning_split` | `reasoning_split: true` |
| DeepSeek | R1 | `thinking` | (Stripped) | Parameter removed to prevent API error (400) |
Implementation Details
1. OpenAI Adapter (OpenAIAdapter)
- File: `src/adapters/openai-adapter.ts`
- Behavior: Maps the continuous token budget onto discrete effort levels.
- New Feature: Added support for `minimal` effort (typically < 4000 tokens) for faster, lighter reasoning tasks.
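The bucketing described above can be sketched roughly as follows. This is an illustration, not the code in `openai-adapter.ts`: the function name `budgetToReasoningEffort` is hypothetical, "k" is assumed to mean 1,000 tokens, and the handling of the exact boundary values (16,000 and 32,000) is an assumption, since the matrix does not say which band they belong to.

```typescript
type ReasoningEffort = "minimal" | "low" | "medium" | "high";

// Hypothetical helper: maps a continuous token budget to a discrete effort level.
// Assumed boundaries: 16,000 falls into "medium", 32,000 stays in "medium".
function budgetToReasoningEffort(budgetTokens: number): ReasoningEffort {
  if (budgetTokens < 4000) return "minimal";
  if (budgetTokens < 16000) return "low";
  if (budgetTokens <= 32000) return "medium";
  return "high";
}
```

Under these assumptions, a request for "Think for 16k tokens" would be sent with `reasoning_effort: "medium"`.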
2. Gemini Adapter (GeminiAdapter)
- File: `src/adapters/gemini-adapter.ts`
- Behavior:
  - Gemini 3 detection: Checks `modelId` for "gemini-3" and uses `thinking_level`.
  - Backward Compatibility: Defaults to `thinking_config` for Gemini 2.0/2.5.
  - Safety: Caps the budget at 24,576 tokens (24k) to maintain stability.
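A minimal sketch of the branching described for the Gemini adapter, assuming a substring check on the model ID and the thresholds from the alignment matrix. The function name `translateGeminiThinking`, the interface name, and the model IDs in the usage note are hypothetical.

```typescript
interface GeminiThinkingFields {
  thinking_level?: "low" | "high";
  thinking_config?: { thinking_budget: number };
}

// Cap taken from the alignment matrix (24,576 tokens).
const GEMINI_MAX_THINKING_BUDGET = 24576;

// Hypothetical helper: picks the Gemini 3 or Gemini 2.x control based on the model ID.
function translateGeminiThinking(modelId: string, budgetTokens: number): GeminiThinkingFields {
  if (modelId.includes("gemini-3")) {
    // Gemini 3: two discrete levels, split at 16k (assumed to mean 16,000).
    return { thinking_level: budgetTokens < 16000 ? "low" : "high" };
  }
  // Gemini 2.0/2.5: pass the budget through, clamped to the cap.
  return {
    thinking_config: {
      thinking_budget: Math.min(budgetTokens, GEMINI_MAX_THINKING_BUDGET),
    },
  };
}
```

For example, a 32,000-token budget on a hypothetical `gemini-2.5-pro` model ID would be clamped to `thinking_budget: 24576`, while the same budget on a `gemini-3` model ID would become `thinking_level: "high"`.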
3. Grok Adapter (GrokAdapter)
- File: `src/adapters/grok-adapter.ts`
- Behavior:
  - Validation: Explicitly checks for "mini" models (Grok 3 Mini).
  - Stripping: Removes thinking parameters for standard Grok 3 models, which do not support API-controlled reasoning (prevents errors).
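The validate-or-strip behavior could look roughly like this. It is a sketch under assumptions: the name `translateGrokThinking` is hypothetical, the "mini" check is assumed to be a case-insensitive substring match, and 20k is taken to mean 20,000 tokens.

```typescript
interface GrokThinkingFields {
  reasoning_effort?: "low" | "high";
}

// Hypothetical helper: only "mini" models receive a reasoning_effort field.
function translateGrokThinking(modelId: string, budgetTokens: number): GrokThinkingFields {
  if (!modelId.toLowerCase().includes("mini")) {
    // Standard Grok 3: no API-controlled reasoning, so emit nothing.
    return {};
  }
  return { reasoning_effort: budgetTokens < 20000 ? "low" : "high" };
}
```

Returning an empty object for non-mini models means the caller can merge the result into the outgoing payload unconditionally.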
4. Qwen Adapter (QwenAdapter)
- File: `src/adapters/qwen-adapter.ts`
- Behavior:
  - Enables the specific `enable_thinking` flag required by Alibaba Cloud / OpenRouter.
  - Passes the budget through directly.
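As a sketch, the Qwen translation amounts to a direct pass-through plus the flag. The function and interface names here are hypothetical; only the two field names come from the alignment matrix.

```typescript
interface QwenThinkingFields {
  enable_thinking: true;
  thinking_budget: number;
}

// Hypothetical helper: no bucketing or capping, the budget is forwarded as-is.
function translateQwenThinking(budgetTokens: number): QwenThinkingFields {
  return { enable_thinking: true, thinking_budget: budgetTokens };
}
```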
5. MiniMax Adapter (MiniMaxAdapter)
- File: `src/adapters/minimax-adapter.ts`
- Behavior:
  - Sets `reasoning_split: true`.
  - Does not support budget control, but correctly enables the interleaved reasoning feature.
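A possible shape for this on/off translation is shown below. It assumes the flag is set only when the incoming request carries a `thinking` object, which the description does not state explicitly; the function name is hypothetical.

```typescript
interface MiniMaxThinkingFields {
  reasoning_split?: true;
}

// Hypothetical helper: the budget value is ignored; only the presence of
// the thinking object matters (an assumption, see the note above).
function translateMiniMaxThinking(
  thinking: { budget_tokens: number } | undefined
): MiniMaxThinkingFields {
  return thinking === undefined ? {} : { reasoning_split: true };
}
```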
6. DeepSeek Adapter (DeepSeekAdapter)
- File: `src/adapters/deepseek-adapter.ts`
- Behavior:
  - Defensive: Detects DeepSeek models and removes the `thinking` object.
  - Rationale: Reasoning happens automatically (R1) or not at all; sending the parameter causes the API to reject the request (400).
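The defensive stripping can be sketched as below. The payload shape, the function name `stripThinkingForDeepSeek`, and the case-insensitive "deepseek" substring check are all assumptions made for illustration.

```typescript
// Hypothetical outgoing payload shape; real payloads carry many more fields.
interface OutgoingPayload {
  model: string;
  thinking?: { budget_tokens: number };
  [field: string]: unknown;
}

// Returns a copy without the thinking object for DeepSeek models,
// and the payload unchanged for everything else.
function stripThinkingForDeepSeek(payload: OutgoingPayload): OutgoingPayload {
  if (!payload.model.toLowerCase().includes("deepseek")) {
    return payload;
  }
  const cleaned: OutgoingPayload = { ...payload };
  delete cleaned.thinking;
  return cleaned;
}
```

Copying before deleting keeps the caller's object intact, which makes the step safe to run more than once.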
Protocol Integration
The translation happens during the `prepareRequest` phase of the `BaseModelAdapter`.
- Intercept: The adapter intercepts the `ClaudeRequest`.
- Translate: It reads `thinking.budget_tokens`.
- Mutate: It modifies the `OpenRouterPayload` to add provider-specific fields.
- Clean: It deletes the original `thinking` object to prevent OpenRouter from receiving conflicting or unrecognized parameters.
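The four steps can be sketched as a template method. The type shapes, the class names `BaseModelAdapterSketch` and `ExampleEffortAdapter`, and the `translateThinking` hook are hypothetical stand-ins; the real `BaseModelAdapter` may be structured differently.

```typescript
// Hypothetical, simplified request and payload shapes.
interface ClaudeRequest {
  model: string;
  thinking?: { budget_tokens: number };
}

interface OpenRouterPayload {
  model: string;
  thinking?: { budget_tokens: number };
  [field: string]: unknown;
}

abstract class BaseModelAdapterSketch {
  // Each provider adapter supplies its own translation (hypothetical hook).
  protected abstract translateThinking(budgetTokens: number): Record<string, unknown>;

  prepareRequest(request: ClaudeRequest): OpenRouterPayload {
    // Intercept: start from a copy of the incoming request.
    const payload: OpenRouterPayload = { ...request };
    // Translate: read the requested budget, if any.
    const budget = request.thinking?.budget_tokens;
    if (budget !== undefined) {
      // Mutate: add the provider-specific fields.
      Object.assign(payload, this.translateThinking(budget));
    }
    // Clean: never forward the original thinking object.
    delete payload.thinking;
    return payload;
  }
}

// Example provider using a two-level effort scale split at 20,000 tokens.
class ExampleEffortAdapter extends BaseModelAdapterSketch {
  protected translateThinking(budgetTokens: number): Record<string, unknown> {
    return { reasoning_effort: budgetTokens < 20000 ? "low" : "high" };
  }
}
```

With this layout, an adapter that must strip the parameter entirely only needs `translateThinking` to return an empty object, because the clean step always runs.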