Qwen3-Thinking API
Open-source reasoning at scale — MoE efficiency, 256K context, explicit thinking mode
Qwen3-235B-A22B-Thinking-2507 is a purpose-built reasoning model. With a Mixture-of-Experts architecture (235B total / ~22B active parameters) and a native **256K-token** context window, it excels at multi-step logic, math proofs, code reasoning and long-document synthesis — while remaining Apache-2.0 open source and production-ready.
To use the API for inference, first register an account. You can view and manage your API token in the API Token dashboard.
All requests to the inference API require authentication via an API token. The token uniquely identifies your account and grants secure access to the service.
When calling the API, set the `Authorization` header to your API token, configure the request parameters as shown below, and send the request.
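As a minimal sketch of the steps above, assuming an OpenAI-compatible `/chat/completions` endpoint (the base URL, environment variable name, and helper names below are placeholders, not part of the official client):

```python
import json
import os
import urllib.request

# Placeholder base URL; substitute your provider's OpenAI-compatible endpoint.
BASE_URL = "https://api.example.com/v1"

def build_chat_request(prompt: str, token: str) -> tuple[dict, dict]:
    """Construct the headers and JSON body for a chat completion call."""
    headers = {
        "Authorization": f"Bearer {token}",  # API token in the Authorization header
        "Content-Type": "application/json",
    }
    body = {
        "model": "Qwen/Qwen3-235B-A22B-Thinking-2507",
        "messages": [{"role": "user", "content": prompt}],
    }
    return headers, body

def send(prompt: str, token: str) -> str:
    """POST the request and return the assistant's reply text."""
    headers, body = build_chat_request(prompt, token)
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]

if __name__ == "__main__":
    token = os.environ.get("API_TOKEN", "")
    if token:  # only call out when a real token is configured
        print(send("Prove that the square root of 2 is irrational.", token))
```

Because the endpoint follows the OpenAI wire format, official OpenAI SDKs can usually be pointed at it by overriding the base URL.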
Why Qwen3-Thinking API?
- 235B total / ~22B active MoE — deep reasoning with efficient expert routing.
- Native 256K context — load long contracts, papers or multi-file codebases in one call.
- Explicit thinking mode — `<think>…</think>` tags make multi-step reasoning transparent.
- Open-source Apache-2.0 — audit, self-host, fine-tune; no vendor lock-in.
- Developer-first — OpenAI-compatible endpoints, streaming, retries, observability.
- Designed for hard tasks — math & logic, code comprehension, research synthesis.
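Responses from the thinking mode interleave a `<think>…</think>` trace with the final answer. A small sketch of how a client might separate the two (the helper name is illustrative, not an official SDK function):

```python
import re

def split_thinking(text: str) -> tuple[str, str]:
    """Separate the <think>...</think> reasoning trace from the final answer."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()  # no explicit trace present
    reasoning = match.group(1).strip()
    answer = (text[: match.start()] + text[match.end() :]).strip()
    return reasoning, answer

reasoning, answer = split_thinking(
    "<think>2 + 2 = 4, which is even.</think>The sum is even."
)
```

This lets you log or display the reasoning trace separately from the user-facing answer.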
| Model | Architecture | Native Context | Open Source | Best For |
|---|---|---|---|---|
| Qwen3-235B-A22B-Thinking-2507 | MoE (235B / ~22B active), explicit thinking | 262K | Yes (Apache-2.0) | Multi-step reasoning, long documents, math & code |
| OpenAI o4-mini | Proprietary dense + thinking | 200K | No | General reasoning |
| Claude 3.5 Sonnet | Proprietary | 200K | No | Long-context assistance |
| Gemini 2.5 Pro | Proprietary | 128K | No | Multimodal tasks |
Popular Use Cases of Qwen3-Thinking API
- Math & logic — Solve competition-level problems with step-by-step, verifiable chains of thought.
- Long-document analysis — Ingest 100–200+ page specs, contracts or literature and synthesize precise, cited summaries.
- Code comprehension — Explain unfamiliar code paths, propose fixes, and generate tests across multi-file repos.
- Research synthesis — Plan, decompose and reason through complex topics (GPQA-style) with transparent intermediate steps.
Benchmark Highlights (July 2025)
- AIME'25: strong competition-level math performance.
- LiveCodeBench v6: robust real-world code reasoning.
- GPQA / SuperGPQA: high-level graduate-style QA.
- Arena-Hard v2: competitive head-to-head win rate.
Frequently Asked Questions about the Qwen3-Thinking API
**What does "A22B" mean?**
In the Qwen3-235B-A22B-Thinking-2507 API, "A22B" indicates a Mixture-of-Experts (MoE) LLM with ~22B active parameters per token (out of ~235B total). This design gives the Qwen3 reasoning model API deep multi-step reasoning while keeping inference efficient.
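A quick back-of-envelope calculation shows why this matters for inference cost:

```python
# Fraction of parameters active per token in the MoE design (~22B of ~235B).
total_params = 235e9   # ~235B total parameters
active_params = 22e9   # ~22B routed/active per token

active_fraction = active_params / total_params
print(f"{active_fraction:.1%} of weights are active per token")
```

Only about 9% of the weights participate in each forward pass, which is why per-token compute is closer to a ~22B dense model than a 235B one.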
**How does the explicit thinking mode work?**
The Qwen3-Thinking API uses an explicit `<think>` phase (often called chain-of-thought) before the final answer. With Qwen/Qwen3-235B-A22B-Thinking-2507, you can request transparent, step-wise reasoning for logic, math and code tasks — ideal for users searching for a chain-of-thought API or an open-source reasoning LLM.
**Can it handle very long documents?**
Yes. The native 256K-token context lets this long-context LLM API process large contracts, research papers, and multi-file repos in a single call. For discovery, users often search for an LLM with a 256K context window or an AI model for long documents. Best practices: chunking, citations, and retrieval.
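Even with a 256K window, chunking very large corpora is a recommended practice. A minimal sketch of overlapping character-based chunking (the function name and default sizes are illustrative; in production you would chunk by tokens, not characters):

```python
def chunk_text(text: str, max_chars: int = 8000, overlap: int = 200) -> list[str]:
    """Split a long document into overlapping character chunks.

    max_chars and overlap are illustrative defaults; tune them to your
    model's token budget. Overlap preserves context across boundaries.
    """
    if max_chars <= overlap:
        raise ValueError("max_chars must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start : start + max_chars])
        start += max_chars - overlap
    return chunks

chunks = chunk_text("a" * 20_000, max_chars=8000, overlap=200)
```

Each chunk can then be summarized separately and the summaries synthesized in a final call, with citations back to the source chunks.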
**Is it really open source?**
Yes. Qwen3-Thinking is released under Apache-2.0, making it a leading open-source GPT-4 alternative for reasoning. You can self-host, fine-tune, and integrate it with your stack.
**How does it compare with proprietary reasoning models?**
The Qwen3 reasoning model API is competitive on public benchmarks (e.g., AIME, GPQA, LiveCodeBench) while remaining open and self-hostable. Many users discover it via searches like best open-source reasoning model or Qwen3 vs GPT-4 reasoning.
**How is the API priced?**
We offer a free tier for evaluation and usage-based paid plans. Popular queries include Qwen3 API pricing and Qwen3-235B-A22B-Thinking-2507 API cost.