CompanyRAG
With the CompanyRAG add-on, you can make large quantities of documents available to your agents via the MCP interface. Documents can be indexed from different sources and are integrated using Retrieval-Augmented Generation (RAG). There are no limits on the length of individual documents or on the total number of documents.
Indexing can run once or on a recurring basis, depending on your use case.
CompanyRAG User Interface
The user interface lets you add individual files, multiple files, or entire data sources for indexing. The interface is divided into:
An overview of all files that have been added to the service. The overview includes:
- Name: Name of the document (may be abbreviated; hover over it to show the full name)
- Collection: The collection that the file has been assigned to
- Size: File size
- Status: Status of the associated job
  - Completed: Document has been successfully indexed
  - Pending: Indexing job is still pending
  - Failed: Indexing was not successful
- Last indexed: Date and time of the last completed indexing job
- Actions:
  - Re-index: Creates a new indexing job
  - Delete: Deletes the file from the service, including its associated jobs and its indexed form
Collections
Collections are storage locations that let you organize documents and permissions.
Create Collections
In addition to name and description, visibility can also be set:
- Private: Only you can access this collection and its documents. You can still add shares later.
- Public: Everyone can see the collection and view its files.
All collections you own appear under the My tab. Collections specifically shared with you (Admin or Viewer role) are shown under Shared with me. Under Public, all publicly visible collections are displayed.
Collection Actions
Available actions: Share, Edit, Delete
- Share:
  - Type: Share with individual users, an Entra group, or the entire organization.
  - Role: Viewer (collection and associated documents can be viewed) or Admin (collection and associated documents can be edited)
  After you confirm with Add Share, the share is granted and added to the Current Shares list.
- Edit: Change the name and description of the collection.
- Delete: Delete the collection.
Sources
Sources are used to synchronize data into CompanyRAG automatically.
Webcrawl
Individual web pages, lists of pages (as CSV files), and entire website sitemaps can be crawled and indexed. You can set how frequently the crawl is repeated. For individual pages, the entire page is always re-indexed. For sitemap crawls, only the pages that have changed since the last crawl, judged by their modification date, are taken into account.
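The incremental behavior of sitemap crawls can be pictured with a minimal sketch. It is not CompanyRAG's implementation: it assumes a standard sitemaps.org XML file with `<lastmod>` entries, and the function names `parse_lastmod` and `changed_urls` are hypothetical. Re-crawling pages that have no modification date is also an assumption.

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_lastmod(value: str) -> datetime:
    """Parse a <lastmod> value; dates without a time zone are treated as UTC."""
    dt = datetime.fromisoformat(value.strip().replace("Z", "+00:00"))
    return dt if dt.tzinfo else dt.replace(tzinfo=timezone.utc)


def changed_urls(sitemap_xml: str, last_crawl: datetime) -> list[str]:
    """Return the URLs whose modification date is later than the last crawl."""
    root = ET.fromstring(sitemap_xml)
    selected = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", namespaces=SITEMAP_NS)
        lastmod = url.findtext("sm:lastmod", namespaces=SITEMAP_NS)
        if loc is None:
            continue
        # Assumption: pages without a modification date are crawled again
        if lastmod is None or parse_lastmod(lastmod) > last_crawl:
            selected.append(loc.strip())
    return selected
```

Pages whose `<lastmod>` is older than the previous crawl are skipped, which keeps recurring crawls of large sitemaps short.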

For sitemap crawls, the Path Prefix determines which pages are indexed. For example, /news/ would index only pages whose path begins with /news/.
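As an illustration of the Path Prefix, the following sketch filters URLs by the beginning of their path. It assumes plain prefix matching on the URL path; the exact matching rules of the service are not documented here, and `matches_path_prefix` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def matches_path_prefix(url: str, prefix: str) -> bool:
    """Return True if the path of the URL starts with the given prefix."""
    path = urlsplit(url).path or "/"
    return path.startswith(prefix)


urls = [
    "https://example.com/news/2026/launch",
    "https://example.com/newsletter",
    "https://example.com/about/news/team",
]

# Keep only pages below /news/
selected = [u for u in urls if matches_path_prefix(u, "/news/")]
```

Under this assumption, `/newsletter` and `/about/news/team` are not selected, because the prefix has to match from the start of the path.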

Connect SharePoint
Click the “+ Connect SharePoint” button to start the selection.

- Select website or team: Select the SharePoint website or team
- Select library: Select a library of the chosen website or team
- Browse folder: Select the folder and the file types to synchronize, and choose the collection into which the files are synchronized.
After the connection is created, the folder appears under “All Sources” as Active. Synchronization must be started once via the “Synchronize now” button. The connected documents are then added as synchronization jobs, and future contents of the folder are synchronized automatically.
Source Actions
- Synchronize now: Start the initial or a manual synchronization
- Pause/Resume: Deactivate or reactivate selected sources
- Delete: Remove the data source. Files that are already synchronized remain in the collection.
Display indexing jobs and status
Status:
- Pending: Document will be indexed soon
- Running: Document is currently being indexed
- Completed: Document has been indexed
- Failed: Document could not be indexed. Further information can be found in the “Error” column.
Actions:
- Delete: Deletes the job from the queue or history. Status Completed → Indexed file remains. Status Pending → File will not be indexed. Running processes cannot be deleted.
- Retry
Upload
Upload individual or multiple files manually for indexing.
Supported formats: PDF, DOCX, DOC, TXT, MD, RTF, HTML, HTM, XML, CSV, JSON, EML, XLSX, XLS, PPTX, PPT
CompanyRAG in CompanyGPT
Via the MCP server “ai-search”, the RAG service can be connected to CompanyGPT to search indexed documents across all collections available to the user (see Similarity Search).
The following specialized search tools are available for the RAG collections, ranging from semantic search to document retrieval to metadata filtering:
- search_content: Semantic similarity search for general queries. Default choice for most user questions. Required parameters: query (search text), source (technical name of the collection). Optional: topK (number of results; default 5, max 20).
- find_content_by_source: Retrieve all content from a specific document. Use for queries about individual documents (e.g., “What’s in documentation.md?”). Required parameters: source (document name), collection (technical name of the collection).
- find_content_by_metadata: Filter content by metadata attributes. Use for filtered results (e.g., “All urgent tasks from 2026”). Required parameters: filter (JSON object with operators $and, $or, $not), collection (technical name of the collection).
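To make the parameters above concrete, here is a minimal sketch that builds the arguments for the three tools and wraps them in an MCP `tools/call` request (JSON-RPC 2.0). The helper functions, the collection name `project-tasks`, and the metadata fields `priority` and `year` are hypothetical. The exact shape of field conditions inside `$and` is also an assumption, since only the operators are documented.

```python
import json

DEFAULT_TOP_K = 5  # default stated in the tool description
MAX_TOP_K = 20     # maximum stated in the tool description


def search_content_args(query: str, source: str, top_k: int = DEFAULT_TOP_K) -> dict:
    """Arguments for search_content; `source` is the collection's technical name."""
    # Assumption: a lower bound of 1 is not documented, only the maximum of 20
    if not 1 <= top_k <= MAX_TOP_K:
        raise ValueError(f"topK must be between 1 and {MAX_TOP_K}")
    return {"query": query, "source": source, "topK": top_k}


def find_content_by_source_args(source: str, collection: str) -> dict:
    """Arguments for find_content_by_source; here `source` is the document name."""
    return {"source": source, "collection": collection}


def find_content_by_metadata_args(metadata_filter: dict, collection: str) -> dict:
    """Arguments for find_content_by_metadata; the filter is a JSON object."""
    return {"filter": metadata_filter, "collection": collection}


def tools_call(request_id: int, tool: str, arguments: dict) -> dict:
    """Wrap tool arguments in an MCP JSON-RPC 2.0 `tools/call` request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# Hypothetical filter for "All urgent tasks from 2026"
urgent_2026 = {"$and": [{"priority": "urgent"}, {"year": 2026}]}

request = tools_call(
    1,
    "find_content_by_metadata",
    find_content_by_metadata_args(urgent_2026, "project-tasks"),
)
print(json.dumps(request, indent=2))
```

In practice an agent or MCP client generates these requests itself. The sketch only shows which values end up in which parameter, including the different meaning of `source` in `search_content` and `find_content_by_source`.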
The MCP server can be added to an agent for easier use.
A step-by-step guide for setting up a search agent with a suitable instruction can be found in the tutorial Using CompanyRAG in CompanyGPT.