Internal AI Knowledge Platform
Built an internal AI platform for querying company knowledge across documents, SharePoint, and internal systems using retrieval-augmented generation. Significantly reduced information retrieval time.
Problem
Information was spread across SharePoint, emails, documents, and internal systems. Staff wasted large amounts of time searching for answers that existed somewhere in the organisation but were effectively inaccessible.
The technical support and operations teams were hit hardest. They needed fast, accurate answers from a knowledge base that spanned years of accumulated documentation.
Approach
Built a retrieval-augmented generation platform. The system ingests documents, creates embeddings, stores them in a vector database, retrieves relevant context at query time, and generates answers with source attribution.
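The query path can be sketched as follows. This is a minimal, self-contained illustration: the in-memory list stands in for the pgvector table, and the embeddings would come from a sentence-transformer model rather than being supplied directly.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query_vec, store, top_k=3):
    """Rank stored chunks by similarity to the query embedding.

    `store` is a list of (chunk_text, embedding) pairs. In the real
    system this is a pgvector similarity query, not a linear scan.
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]

def build_prompt(question, context_chunks):
    """Assemble the LLM prompt: retrieved context plus the user question."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer using only the context below. Cite sources.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

The generated answer then carries citations back to whichever chunks made it into the prompt, which is what enables the source attribution described below.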
Pipeline
- Ingestion. Python workers pull documents from SharePoint, file shares, and internal systems on a schedule
- Processing. Documents are chunked, cleaned, and embedded using sentence transformers
- Storage. Embeddings stored in PostgreSQL with pgvector for efficient similarity search
- Retrieval. User queries are embedded and matched against the knowledge base using hybrid search (semantic + keyword)
- Generation. LLM synthesises an answer from retrieved context, with citations back to source documents
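The processing step above can be sketched as a simple overlapping word-window chunker. The window and overlap sizes here are illustrative defaults, not the production values; in the real pipeline each chunk is then embedded with a sentence-transformer model.

```python
def chunk_text(text, chunk_words=200, overlap=20):
    """Split a document into overlapping word-window chunks.

    The overlap keeps sentences that straddle a chunk boundary
    retrievable from both neighbouring chunks.
    """
    words = text.split()
    if not words:
        return []
    step = chunk_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_words]))
        if start + chunk_words >= len(words):
            break
    return chunks
```

Chunking by words rather than characters keeps boundaries from splitting tokens, at the cost of slightly variable chunk byte sizes.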
Architecture
Ingestion layer: Python workers with connectors for SharePoint, file systems, and internal APIs
Vector storage: PostgreSQL + pgvector. Chosen for data residency requirements and existing infrastructure
LLM reasoning layer: generates answers from retrieved context with source attribution
Deployment: secure internal deployment behind existing authentication
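Hybrid search in the retrieval step combines a semantic ranking with a keyword ranking. One common way to fuse the two lists (shown here as an illustrative choice, not necessarily the exact method the platform uses) is reciprocal rank fusion:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over every list it
    appears in; k=60 is the conventional damping constant. Documents
    ranked highly by both semantic and keyword search float to the top.
    """
    scores = {}
    for ranked_ids in rankings:
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Rank-based fusion sidesteps the problem that cosine similarities and keyword relevance scores live on incompatible scales.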
Results
- Significantly reduced information retrieval time across support and operations
- Improved operational efficiency. Staff get answers in seconds rather than searching for minutes
- Natural language querying. Non-technical users can search internal data without knowing where to look
- Source attribution. Every answer links back to the original document, building trust in the system
Lessons
The biggest win wasn’t the AI. It was making information findable. Most of the knowledge already existed. It was just buried in places nobody could search effectively.
pgvector was the right choice for our constraints. Data residency meant we couldn’t use managed vector databases, and keeping everything in PostgreSQL simplified the operational burden significantly.