Skip to content

Model Selection

Simply select “auto” in the OpenAI model selection to activate CompanyGPT’s dynamic routing. The system analyzes your prompt and automatically chooses the most efficient OpenAI model: fast, smaller models for standard questions and high-end models for complex analyses. This saves time and token costs without any manual effort.

  • Fast & inexpensive → Mini / Flash / Nano / Haiku
  • Standard & reliable → large all-round models
  • Complex & critical → most powerful models
  • EU / internal / data protection → STACKIT models

  • For: coding, professional knowledge work, and the most complex tasks
  • When: software development, strategic decisions, critical analyses, high-stakes outputs
  • Why: latest flagship (1M context) with the highest overall performance across all categories
  • For: coding and demanding analyses with very long context
  • When: large document volumes, extensive reports, long-context workflows
  • Why: more affordable than GPT 5.5 while still delivering strong quality (1M context)
  • For: high quality with high throughput, coding, and autonomous agents
  • When: batch processing, scaled content workflows, computer use, subagents
  • Why: strongest mini model (400K context) with support for computer use and subagents
  • For: dynamic real-time interaction
  • When: assistant dialogues, iterative coordination, quick follow-up questions in chat
  • Why: chat-optimized for smooth conversations and strong context retention
  • For: software development and technical implementation
  • When: writing code, refactoring, debugging, creating tests
  • Why: specialized for development tasks with high code quality
  • For: challenging analysis and structuring tasks
  • When: complex evaluations, difficult trade-offs, precise reasoning
  • Why: strong reasoning model for demanding general workloads
  • For: demanding dialogues with high reliability
  • When: consulting-like conversations, long interactions, precise follow-ups
  • Why: optimized for chat quality and more robust against hallucinations
  • For: interactive dialogues and agent workflows
  • When: support, fluent conversations, strict adherence to instructions
  • Why: chat-optimized for natural language and stable context management
  • For: versatile general-purpose tasks
  • When: structured texts, analyses, automations without a primary chat focus
  • Why: balanced quality, speed, and reliability for everyday use
  • For: demanding tasks with efficiency
  • When: coding, structured outputs, longer texts
  • Why: more modern and robust than the GPT-4 series
  • For: extremely simple, fast tasks
  • When: auto-completion, short answers, mass tasks
  • Why: extremely fast and inexpensive, but limited depth
  • For: logic, mathematics, coding, and step-by-step reasoning
  • When: when clean derivation is more important than style or creativity
  • Why: strongly optimized for precise problem-solving and rational derivations
  • For: complex analyses, clean argumentation
  • When: strategy, concepts, in-depth explanations
  • Why: more precise and stable than GPT 4o
  • For: better quality while still maintaining high speed
  • When: when GPT 4o mini is too superficial
  • Why: a good middle ground between quality and performance
  • For: image generation
  • When: when images need to be generated
  • Why: OpenAI’s image generation model

  • For: maximum speed
  • When: quick follow-up questions, brainstorming, iterations
  • Why: very fast, good for overview and context
  • For: deep thinking and large contexts
  • When: complex documents, comparisons, analyses
  • Why: Google’s strong model for high quality on complex tasks
  • For: image analysis, image generation, image editing
  • When: text-to-image generation, image editing with prompts (image + text), and composition of multiple images
  • Why: Google’s image models integrated into CompanyGPT

  • For: highest complexity and in-depth analyses
  • When: strategic planning, extremely long contexts (1M), most difficult logic tasks
  • Why: strongest Anthropic model available in CompanyGPT for agent teams and parallel workflows (highest costs)
  • For: programming, complex text processing, and demanding all-round tasks (Recommended)
  • When: software development, code refactoring, deep text comprehension
  • Why: the sweet spot of the series. Opus-class performance at Sonnet pricing (1M context)
  • For: very fast processing with high logical precision
  • When: filtering large data volumes, UI-based chatbots, simple to medium tasks at scale
  • Why: very fast and cost-efficient (less reasoning than Sonnet/Opus)

These open-source models run in the STACKIT Cloud (EU/Germany) and are particularly suited for workloads with high requirements for data protection, data sovereignty, and internal compliance.

  • For: fast, general instruct tasks
  • When: internal assistants, standard QA, summaries with EU hosting
  • Why: efficient all-rounder with a good balance of quality, latency, and cost
  • For: multimodal embeddings (text/image) for search and retrieval
  • When: semantic search, RAG indexing, similarity search across mixed data
  • Why: specialized in vector representations rather than classic chat responses
  • For: multimodal high-end analysis with image and text understanding
  • When: visual document analysis, complex image-text tasks, demanding inference
  • Why: very powerful VL model for deeper understanding of multimodal content
  • For: lightweight open-source text tasks
  • When: cost-sensitive internal workflows with controllable infrastructure
  • Why: compact OSS approach for solid quality with lower resource requirements
  • For: fast inference for standard instruct use cases
  • When: chatbots, classification, extraction, and simple automations
  • Why: small model size with high efficiency and solid production maturity
  • For: more demanding generation and reasoning in the EU stack
  • When: more complex enterprise questions, longer responses, better level of detail
  • Why: 70B class provides significantly more quality than small models while retaining OSS flexibility
  • For: highest OSS quality for complex text tasks
  • When: when internally hosted top performance is required instead of maximum speed
  • Why: large open-source model for strong analytical and linguistic results
  • For: high-quality text embeddings for retrieval and ranking
  • When: vector search, document retrieval, relevance ranking in RAG pipelines
  • Why: proven embedding model for precise semantic search applications
  • For: versatile instruct tasks with good efficiency
  • When: internal assistants, knowledge work, structured text production
  • Why: strong mid-range between small fast and large expensive OSS models

  • “I just want a very good answer” → GPT 5.1 / GPT 5 Mini / Claude Sonnet 4.6
  • “It should be as fast and inexpensive as possible” → GPT 5 Nano / GPT 5.4 Mini / Gemini 2.5 Flash / Claude Haiku 4.5
  • “I want to program / write code” → GPT 5.2 Codex / Claude Sonnet 4.6
  • “It’s complicated or extremely important” → GPT 5.5 / GPT 5.4 / Claude Opus 4.6 / Gemini 2.5 Pro
  • “Data protection (EU/Germany) is mandatory” → STACKIT models (e.g., Llama 3.3 70B Instruct)
  • “I work in chat (dialogue)” → GPT 5.3 Chat / GPT 5.2 Chat / GPT 5.1 Chat
  • “I need images” → GPT Image 1.5 / Gemini Image Tools / Nano Banana