Model Selection

Simply select ‘auto’ from the OpenAI model selection to enable CompanyGPT’s dynamic routing. The system analyses your prompt and automatically selects the most efficient OpenAI model: fast, smaller models for standard queries and high-end models for complex analyses. This saves you time and token costs without any manual effort.

  • Fast & inexpensive → Mini / Flash / Nano / Haiku
  • Standard & reliable → Large all-round models
  • Complex & critical → Most powerful models
  • EU / internal / data protection → Stackit models
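
If your CompanyGPT workspace also exposes an OpenAI-compatible API, the same routing can be triggered programmatically by passing ‘auto’ as the model id. The following is a minimal sketch under that assumption; the base URL is a placeholder, and only the ‘auto’ selection itself comes from the description above.

```python
# Minimal sketch: assumes an OpenAI-compatible endpoint for CompanyGPT.
# The base_url is a placeholder; "auto" is the router selection described above.
from openai import OpenAI

client = OpenAI(
    base_url="https://companygpt.example/v1",  # placeholder, not a documented URL
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="auto",  # let the router pick the most efficient model for the prompt
    messages=[{"role": "user", "content": "Summarise this meeting note in 3 bullets."}],
)
print(response.choices[0].message.content)
```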

OpenAI models

  • For: logic, maths, coding and complex problem-solving
  • When: when the AI needs to ‘think’ (reason) internally before responding
  • Why: combines extremely fast output speed with in-depth logical precision

  • For: pure reasoning
  • When: when deep thinking takes precedence over pure speed
  • Why: specialised reasoning focus for complex logic (slightly slower)

  • For: short questions, simple texts, quick answers
  • When: everyday use, chat, summarising, brainstorming
  • Why: very fast and inexpensive, sufficient for ~80% of cases

  • For: standard all-round tasks
  • When: when quality is more important than pure speed
  • Why: strong in text comprehension, structure, and logic

  • For: complex analyses, clear arguments
  • When: strategy, concepts, in-depth explanations
  • Why: more precise and stable than gpt-4o

  • For: better quality while maintaining high speed
  • When: when gpt-4o mini is too superficial
  • Why: good balance between quality and performance

  • For: demanding tasks with efficiency
  • When: coding, structured outputs, longer texts
  • Why: more modern and robust than the GPT-4 series

  • For: extremely simple, fast tasks
  • When: auto-complete, short answers, mass tasks
  • Why: extremely fast and inexpensive, but limited depth

  • For: interactive dialogues and agent workflows
  • When: support, fluid conversations, strict adherence to instructions
  • Why: optimised for natural language and maintaining context in chat

  • For: more demanding dialogues with high factual accuracy
  • When: complex advice, brainstorming, detailed explanations in chat
  • Why: improved logic and significantly reduced hallucinations in direct conversation

  • For: the most demanding analytical all-round tasks
  • When: large volumes of documents, complex data analysis, strategic planning
  • Why: the flagship of the 5.2 generation; pure quality takes precedence over speed (will be replaced by GPT-5.4)

  • For: tasks requiring higher precision
  • When: when the basic version of 5.2 was not precise enough
  • Why: deprecated, now replaced by GPT-5.4 Pro

  • For: pure software development and system architecture
  • When: writing code, refactoring, debugging, planning architecture
  • Why: a specialised model that outperforms all-round models in programming tasks

  • For: highly dynamic real-time interactions
  • When: complex multi-turn dialogues, subtle nuances in language
  • Why: extremely low latency and perfect understanding of the conversation flow

  • For: autonomous software development
  • When: agentic coding and code optimisation
  • Why: self-optimising special model, 25% faster than gpt-5.2

  • For: the absolute maximum in performance (Recommended)
  • When: when cost is a secondary consideration and the best results are required for extremely complex problems
  • Why: new flagship (1M context), 33% fewer hallucinations, native computer use

  • For: maximum precision for the most complex tasks
  • When: when deepest reasoning is required and higher costs/latency are accepted
  • Why: offers the deepest reasoning of all OpenAI models

  • For: state-of-the-art quality for high-volume tasks
  • When: processing large datasets, routine tasks at a very high level
  • Why: 2x faster than its predecessor, ideal for quick code edits and classification

  • For: repetitive tasks and sub-agents
  • When: when the lowest latency is absolutely necessary
  • Why: extremely fast, but with a limited feature set

  • For: image generation
  • When: when images need to be generated
  • Why: OpenAI’s image generation model

Google models

  • For: complex reasoning and massive contexts (Recommended)
  • When: processing huge documents (2M tokens) and multimodal tasks
  • Why: Google’s current, most powerful all-round flagship

  • For: highest speed with large contexts
  • When: fast processing of up to 1M tokens
  • Why: fastest model of the 3.x series, extremely strong price-performance ratio

  • For: complex STEM tasks
  • When: mathematical or logical problems that require an internal thought process
  • Why: extended reasoning (takes time to think, hence higher latency)

  • For: long analysis tasks and multi-hop research
  • When: in-depth internet research across multiple sources
  • Why: specialised model for complex information gathering (not universally applicable)

  • For: reasoning-first approaches and multimodal tasks
  • When: testing reasoning capabilities
  • Why: deprecated preview predecessor (replaced by 3.1 Pro GA)

  • For: fast multimodal tasks
  • When: early tests of flash speed
  • Why: deprecated preview predecessor (replaced by 3.1 Flash GA)

  • For: maximum speed
  • When: quick queries, brainstorming, iterations
  • Why: very fast, good at overview and context

  • For: deep thinking and large contexts
  • When: complex documents, comparisons, analyses
  • Why: Google’s formerly strongest model with high structural quality (deprecated)

  • For: simple, extremely cheap tasks
  • When: when absolute cost efficiency is what counts
  • Why: deprecated but proven entry-level model

  • For: image analysis, image generation, image editing
  • When: text-to-image generation, prompt-based image editing (image + text), and composing multiple images
  • Why: Google’s image models, which are integrated into CompanyGPT

Anthropic models

  • For: maximum complexity and in-depth analysis
  • When: strategic planning, extremely long contexts (1M), the most challenging logic tasks
  • Why: Anthropic’s most powerful model for agent teams and parallel workflows (highest costs)

  • For: programming, complex text processing and demanding all-round tasks (Recommended)
  • When: software development, code refactoring, deep text comprehension, structured data extraction
  • Why: the sweet spot of the range: Opus-class performance at the Sonnet price (1M context)

  • For: very fast processing with high logical precision
  • When: filtering large amounts of data, UI-based chatbots, simple to medium tasks in bulk
  • Why: very fast and cost-efficient (less reasoning than Sonnet/Opus)

  • For: good balance between performance and latency
  • When: tasks that require up to 200k token context
  • Why: deprecated model, outperformed by 4.6

  • For: premium quality and high reasoning performance of the previous generation
  • When: complex tasks using an older model stack
  • Why: deprecated flagship, replaced by Opus 4.6

Research & fact checking

  • For: looking things up, overview, fact checking, research
  • When: questions with facts and sources
  • Why: answers with sources

Quick guide

  • “I just want a good answer” → gpt-5 mini / Claude 4.6 Sonnet
  • “I need it fast” → gpt-5 nano, gpt-5.4 mini, or Gemini 3.1 Flash
  • “I want to program / write code” → Claude 4.6 Sonnet / GPT-5.3 Codex
  • “It’s complicated or extremely important” → gpt-5.4, gpt-5.4 Pro, or Gemini 3.1 Pro
  • “Search documents” → Gemini 3.1 Flash / Claude 4.6 Sonnet
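
If you pin models in scripts instead of using the router, these rules of thumb can be written down as a small lookup. This is an illustrative sketch only: the pick_model helper and the use-case keys are invented here, while the model names come from the list above.

```python
# Illustrative sketch: pick_model and the use-case keys are invented for this
# example; the model names are taken from the rules of thumb above.
RULES_OF_THUMB = {
    "good_answer": ["gpt-5 mini", "Claude 4.6 Sonnet"],
    "fast": ["gpt-5 nano", "gpt-5.4 mini", "Gemini 3.1 Flash"],
    "coding": ["Claude 4.6 Sonnet", "GPT-5.3 Codex"],
    "complex_or_critical": ["gpt-5.4", "gpt-5.4 Pro", "Gemini 3.1 Pro"],
    "document_search": ["Gemini 3.1 Flash", "Claude 4.6 Sonnet"],
}

def pick_model(use_case: str) -> str:
    """Return the first recommended model for a use case; fall back to 'auto'."""
    return RULES_OF_THUMB.get(use_case, ["auto"])[0]

print(pick_model("coding"))  # Claude 4.6 Sonnet
```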