GPT‐5 - chunhualiao/public-docs GitHub Wiki

| Feature | GPT-5 (Standard) | GPT-5 Thinking | GPT-5 Pro |
| --- | --- | --- | --- |
| Purpose | General-purpose AI with automatic mode switching | Deep reasoning mode for complex tasks | Maximum performance for advanced reasoning |
| Availability | All users (Free, Plus, Pro, Team) | Auto-activated or manual (Plus, Pro, Team) | Pro/Team subscribers only |
| Speed | Very fast (optimized for quick responses) | Slower (prioritizes depth over speed) | Slowest (uses extended compute for rigor) |
| Reasoning Depth | Low to medium (auto-routes to Thinking as needed) | High (multi-step, chain-of-thought reasoning) | Maximal (parallel test time compute) |
| Accuracy | ~45% fewer factual errors vs. GPT-4o | ~80% fewer factual errors vs. o3 | >90% fewer major errors vs. o3 |
| Coding Performance | 52.8% (SWE-bench, without Thinking) | 74.9% (SWE-bench, with Thinking) | Higher accuracy, fewer logic errors |
| Math Performance | 93.3% (HMMT, no tools) | 96.7% (HMMT, with Python) | 100% (HMMT, with Python) |
| Science Performance | 77.8% (GPQA Diamond, no tools) | 85.7% (GPQA Diamond, with Thinking) | 89.4% (GPQA Diamond, with Python) |
| Use Cases | Everyday Q&A, writing, summaries | Complex coding, technical analysis, strategies | Research-grade tasks, large datasets, critical work |
| Context Window | Up to 400K tokens | Up to 400K tokens | Up to 400K tokens |
| Output Window | Up to 128K tokens | Up to 128K tokens | Up to 128K tokens |
| API Pricing | $1.25/$10 per 1M tokens (input/output) | Same as GPT-5 | Same as GPT-5 (Pro plan required) |
| Hallucination Rate | Moderate reduction vs. GPT-4o | Significant reduction (~4.8% error rate) | Lowest (~1.6% on HealthBench) |
| Tool Integration | Basic (auto-routed as needed) | Strong (sequential/parallel tool calls) | Advanced (Deep Research, custom connectors) |

Notes:

  • GPT-5 (Standard): The default model in ChatGPT. It uses a real-time router to switch between fast responses and deeper reasoning based on query complexity.
  • GPT-5 Thinking: Automatically or manually engaged for tasks requiring multi-step reasoning, such as coding or strategic planning.
  • GPT-5 Pro: Offers the highest accuracy and extended reasoning, ideal for mission-critical tasks like scientific research or large-scale codebases.
  • Data sourced from various benchmarks and OpenAI's system card.
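
The API Pricing, Context Window, and Output Window rows can be combined into a rough per-request estimate. The sketch below is illustrative only: the constants are copied from the table rather than from any official price list, the function names `estimate_cost_usd` and `fits_limits` are hypothetical, and it assumes that the 400K context window covers input and output tokens combined, which the table does not state.

```python
# Figures copied from the comparison table above; treat them as the
# table's claims, not as authoritative pricing or limits.
INPUT_USD_PER_MILLION = 1.25
OUTPUT_USD_PER_MILLION = 10.00
CONTEXT_WINDOW_TOKENS = 400_000
MAX_OUTPUT_TOKENS = 128_000


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in US dollars."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


def fits_limits(input_tokens: int, output_tokens: int) -> bool:
    """Check a request against the table's token limits.

    Assumption: the context window is shared by input and output.
    """
    return (output_tokens <= MAX_OUTPUT_TOKENS
            and input_tokens + output_tokens <= CONTEXT_WINDOW_TOKENS)


if __name__ == "__main__":
    # 200K input tokens and 20K output tokens.
    cost = estimate_cost_usd(200_000, 20_000)
    print(f"estimated cost: ${cost:.2f}")
    print("within limits:", fits_limits(200_000, 20_000))
```

Under these figures, output tokens cost eight times as much as input tokens, so long generated responses dominate the bill well before long prompts do.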