Model Selection
Dynamic Model Routing
Simply select ‘auto’ from the OpenAI model selection to enable CompanyGPT’s dynamic routing. The system analyses your prompt and automatically selects the most efficient OpenAI model: fast, smaller models for standard queries and high-end models for complex analyses. This saves you time and token costs without any manual effort.
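If you reach CompanyGPT programmatically rather than through the UI, the same routing can typically be requested by passing auto as the model name. The snippet below is only a minimal sketch, assuming an OpenAI-compatible Chat Completions endpoint; the base URL and API key are placeholders, and how your own deployment exposes ‘auto’ should be verified first.

```python
# Minimal sketch, assuming CompanyGPT exposes an OpenAI-compatible API.
# The base_url and api_key are placeholders; "auto" mirrors the 'auto'
# entry in the model selection described above.
from openai import OpenAI

client = OpenAI(
    base_url="https://companygpt.example.com/v1",  # hypothetical endpoint
    api_key="YOUR_COMPANYGPT_API_KEY",             # placeholder credential
)

response = client.chat.completions.create(
    model="auto",  # let dynamic routing choose the most efficient model
    messages=[{"role": "user", "content": "Summarise the attached meeting notes."}],
)

print(response.choices[0].message.content)
```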
Rule of thumb:
- Fast & inexpensive → Mini / Flash / Nano / Haiku
- Standard & reliable → Large all-round models
- Complex & critical → Most powerful models
- EU / internal / data protection → Stackit models
OpenAI
o3-mini
- For: logic, maths, coding and complex problem-solving
- When: when the AI needs to ‘think’ (reason) internally before responding
- Why: combines extremely fast output speed with in-depth logical precision
- For: pure reasoning
- When: when deep thinking takes precedence over pure speed
- Why: specialized reasoning focus for complex logic (slightly slower)
gpt-4o mini
- For: short questions, simple texts, quick answers
- When: everyday use, chat, summarizing, brainstorming
- Why: very fast and inexpensive, sufficient for ~80% of cases
gpt-4o
- For: standard all-round tasks
- When: when quality is more important than pure speed
- Why: strong in text comprehension, structure, and logic
gpt-4.1
- For: complex analyses, clear arguments
- When: strategy, concepts, in-depth explanations
- Why: more precise and stable than gpt-4o
gpt-4.1 mini
- For: better quality while maintaining high speed
- When: when gpt-4o mini is too superficial
- Why: good balance between quality and performance
gpt-5 mini
- For: demanding tasks with efficiency
- When: coding, structured outputs, longer texts
- Why: more modern and robust than the GPT-4 series
gpt-5 nano
- For: extremely simple, fast tasks
- When: auto-complete, short answers, mass tasks
- Why: extremely fast and inexpensive, but limited depth
gpt-5.1-chat
- For: interactive dialogues and agent workflows
- When: support, fluid conversations, strict adherence to instructions
- Why: optimised for natural language and maintaining context in chat
gpt-5.2-chat
- For: more demanding dialogues with high factual accuracy
- When: complex advice, brainstorming, detailed explanations in chat
- Why: improved logic and significantly reduced hallucinations in direct conversation
gpt-5.2
- For: the most demanding analytical all-round tasks
- When: large volumes of documents, complex data analysis, strategic planning
- Why: the flagship of the 5.2 generation; pure quality takes precedence over speed (will be replaced by GPT-5.4)
gpt-5.2 pro
- For: tasks requiring higher precision
- When: when the basic version of 5.2 was not precise enough
- Why: Deprecated, now replaced by GPT-5.4 Pro
gpt-5.2-codex
- For: pure software development and system architecture
- When: writing code, refactoring, debugging, planning architecture
- Why: a specialised model that outperforms all-round models in programming tasks
gpt-5.3-chat
- For: highly dynamic real-time interactions
- When: complex multi-turn dialogues, subtle nuances in language
- Why: extremely low latency and perfect understanding of the conversation flow
gpt-5.3 Codex
- For: autonomous software development
- When: agentic coding and code optimization
- Why: self-optimizing special model, 25% faster than gpt-5.2
gpt-5.4
- For: the absolute maximum in performance (Recommended)
- When: when cost is a secondary consideration and the best results are required for extremely complex problems
- Why: New flagship (1M context), 33% fewer hallucinations, native computer use
gpt-5.4 Pro
- For: maximum precision for the most complex tasks
- When: when deepest reasoning is required and higher costs/latency are accepted
- Why: offers the deepest reasoning of all OpenAI models
gpt-5.4-mini
- For: state-of-the-art quality for high-volume tasks
- When: processing large datasets, routine tasks at a very high level
- Why: 2x faster than its predecessor, ideal for quick code edits and classification
gpt-5.4 nano
- For: repetitive tasks and sub-agents
- When: when the lowest latency is absolutely necessary
- Why: extremely fast, but with a limited feature set
gpt-Image 1.5
- For: image generation (see the example below)
- When: when images need to be generated
- Why: OpenAI’s image generation model
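As a rough illustration of how such an image model might be called programmatically, here is a sketch using the standard OpenAI Images API. The model identifier is copied from this list and may not match the id your deployment exposes; treat it and the client configuration as assumptions to verify.

```python
# Hedged sketch: assumes gpt-Image 1.5 is reachable through the standard
# OpenAI Images API and that "gpt-image-1.5" is its identifier (illustrative).
import base64
from openai import OpenAI

client = OpenAI()  # or point base_url/api_key at your CompanyGPT deployment

result = client.images.generate(
    model="gpt-image-1.5",  # illustrative id, check your tenant's model list
    prompt="A flat-design illustration of a model-routing decision tree",
    size="1024x1024",
)

image = result.data[0]
if getattr(image, "b64_json", None):
    # Some image models return base64-encoded image data ...
    with open("decision-tree.png", "wb") as f:
        f.write(base64.b64decode(image.b64_json))
else:
    # ... others return a temporary download URL.
    print("Download the image from:", image.url)
```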
Google
Gemini 3.1 Pro (GA)
- For: complex reasoning and massive contexts (Recommended)
- When: processing huge documents (2M tokens) and multimodal tasks
- Why: Google’s current and most powerful all-round flagship
Gemini 3.1 Flash (GA)
- For: highest speed with large contexts
- When: fast processing of up to 1M tokens
- Why: fastest model of the 3.x series, extremely strong price-performance ratio
Gemini 3.1 Flash Thinking
- For: complex STEM tasks
- When: mathematical or logical problems that require an internal thought process
- Why: extended reasoning (takes time to think, hence higher latency)
Gemini 3.1 Pro Deep Research
- For: long analysis tasks and multi-hop research
- When: in-depth internet research across multiple sources
- Why: specialized model for complex information gathering (not universally applicable)
Gemini 3 Pro (Preview)
- For: reasoning-first approaches and multimodal tasks
- When: testing reasoning capabilities
- Why: deprecated preview predecessor (replaced by 3.1 Pro GA)
Gemini 3 Flash (Preview)
- For: fast multimodal tasks
- When: early tests of flash speed
- Why: deprecated preview predecessor (replaced by 3.1 Flash GA)
Gemini 2.5 Flash
- For: maximum speed
- When: quick queries, brainstorming, iterations
- Why: very fast, good at overview and context
Gemini 2.5 Pro
- For: deep thinking and large contexts
- When: complex documents, comparisons, analyses
- Why: Google’s formerly strongest model with high structural quality (Deprecated)
Gemini 2.0 Flash
- For: simple, extremely cheap tasks
- When: when absolute cost efficiency is what counts
- Why: deprecated but proven entry-level model
Gemini Image Tools / Nano Banana
- For: image analysis, image generation, image editing
- When: text-to-image generation, prompt-based image editing (image + text), and composing multiple images
- Why: Google’s image models, integrated into CompanyGPT
Anthropic
Claude 4.6 Opus
- For: maximum complexity and in-depth analysis
- When: strategic planning, extremely long contexts (1M), the most challenging logic tasks
- Why: Anthropic’s most powerful model for agent teams and parallel workflows (highest costs)
Claude 4.6 Sonnet
- For: programming, complex text processing and demanding all-round tasks (Recommended)
- When: software development, code refactoring, deep text comprehension, structured data extraction
- Why: The sweet spot of the range. Opus-class performance at the Sonnet price (1M context)
Claude 4.5 Haiku
- For: very fast processing with high logical precision
- When: filtering large amounts of data, UI-based chatbots, simple to medium tasks in bulk
- Why: very fast and cost-efficient (less reasoning than Sonnet/Opus)
Claude 4.5 Sonnet
- For: good balance between performance and latency
- When: tasks that require up to 200k token context
- Why: deprecated model, outperformed by 4.6
Claude 4.5 Opus
- For: premium quality and high reasoning performance of the previous generation
- When: complex tasks using an older model stack
- Why: deprecated flagship, replaced by Opus 4.6
Perplexity AI
- For: looking things up, overview, fact checking, research
- When: questions with facts and sources
- Why: answers with sources
Quick decision-making aid
- “I just want a good answer” → gpt-5 mini / Claude 4.6 Sonnet
- “I need it fast” → gpt-5 nano, gpt-5.4 mini, or Gemini 3.1 Flash
- “I want to program / write code” → Claude 4.6 Sonnet / GPT-5.3 Codex
- “It’s complicated or extremely important” → gpt-5.4, gpt-5.4 Pro, or Gemini 3.1 Pro
- “Search documents” → Gemini 3.1 Flash / Claude 4.6 Sonnet
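For programmatic use, this decision aid can be reduced to a small lookup table with ‘auto’ as the fallback. The sketch below is illustrative only: the model names are copied verbatim from this guide and may differ from the identifiers your CompanyGPT tenant actually accepts.

```python
# Illustrative routing helper based on the quick decision-making aid above.
# Model names are taken from this guide; adjust them to the ids exposed by
# your CompanyGPT tenant before relying on this in practice.
TASK_TO_MODEL = {
    "good_answer":     "gpt-5 mini",         # alternative: Claude 4.6 Sonnet
    "fast":            "gpt-5 nano",         # alternatives: gpt-5.4-mini, Gemini 3.1 Flash
    "coding":          "Claude 4.6 Sonnet",  # alternative: gpt-5.3 Codex
    "complex":         "gpt-5.4",            # alternatives: gpt-5.4 Pro, Gemini 3.1 Pro
    "document_search": "Gemini 3.1 Flash",   # alternative: Claude 4.6 Sonnet
}

def pick_model(task_type: str) -> str:
    """Return a default model for a coarse task category, or fall back to dynamic routing."""
    return TASK_TO_MODEL.get(task_type, "auto")

print(pick_model("coding"))       # -> Claude 4.6 Sonnet
print(pick_model("translation"))  # -> auto (unknown category)
```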