9series
OmniHub
Enterprise Knowledge Infrastructure

Turn institutional knowledge
into operational leverage

OmniHub ingests your documents, policies, and SOPs — then serves citation-backed answers to your teams and customers through any channel. No curation required. First answers within hours of deployment.

Built for organizations where knowledge accuracy, access control, and response time directly impact operations


The knowledge problem is an operations problem.

20–30%
of support time spent locating — not solving — answers
5+
disconnected systems where policy and process docs live
47%
of institutional knowledge lost within 12 months of attrition
< 5%
adoption rate for tools that require manual tagging

From document ingestion to cited, auditable answers

A 6-stage pipeline handles ingestion, parsing, chunking, embedding, retrieval, and generation — so your team doesn't manage any of it. Every answer traces back to a source document.

01
Ingest
Drive · Crawl · URL
02
Parse
Extract text
03
Chunk
Structure
04
Embed
Vectorize
05
Retrieve
RRF + citations
06
Answer
Cited response
API Wrapper
  • Thin layer over a single LLM
  • Prompt in → text out
  • No retrieval logic
  • Model lock-in
  • Prompt-level 'safety'
OmniHub Harness
  • Full orchestration pipeline
  • Intent → Retrieve → Rank → Generate → Cite
  • 6-stage retrieval engine
  • Swap models per slot
  • Structural security at every layer

Platform capabilities. Not feature theater.

Nine capabilities built for production: hierarchy, ingestion, citations, retrieval security, sync, observability, embeddable UI, APIs, and model choice — not slide filler.

01

Org-Aware Department Hierarchy

Model your org chart with unlimited nesting depth. Permissions cascade downward. When departments restructure, access rights follow — no manual cleanup.

02

Format-Agnostic Ingestion

PDF, DOCX, XLSX, HTML, Markdown, images with OCR, URLs, Google Drive folders. Drop files in — parsing, chunking, and indexing happen automatically. Zero manual tagging.

03

Citation-Backed Conversational AI

Every answer includes traceable source references. Users see exactly which document, which section, which paragraph. Hallucination is structurally constrained, not prompt-managed.

04

Structural Access Control

Permissions enforced at the retrieval layer, not the UI layer. Two users asking the same question get different answers based on their department access — guaranteed by architecture.
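The idea of retrieval-layer enforcement can be sketched in a few lines. This is an illustrative model, not OmniHub's actual API: permissions cascade down a department tree, and chunks are filtered before ranking, so restricted content never reaches the model. The `Chunk` shape, department names, and scores below are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    department: str
    score: float

def accessible_departments(hierarchy: dict, user_dept: str) -> set:
    """Permissions cascade downward: a user sees their own department
    plus every department nested beneath it."""
    allowed = {user_dept}
    for child in hierarchy.get(user_dept, []):
        allowed |= accessible_departments(hierarchy, child)
    return allowed

def retrieve(chunks: list, hierarchy: dict, user_dept: str, top_k: int = 3) -> list:
    """Filter BEFORE ranking, so out-of-scope content is never a candidate."""
    allowed = accessible_departments(hierarchy, user_dept)
    visible = [c for c in chunks if c.department in allowed]
    return sorted(visible, key=lambda c: c.score, reverse=True)[:top_k]
```

Because the filter runs before ranking, two users with different department access asking the same question genuinely retrieve from different candidate sets.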

05

Live Source Sync

Native Google Drive integration with change detection. When source docs update, the knowledge layer re-indexes automatically. SharePoint, Notion, Confluence on the roadmap.

06

Retrieval Quality Observability

Synthetic benchmarks run nightly. Live telemetry tracks citation engagement, no-match rates, and user feedback. You see degradation before your users report it.

07

Embeddable Widget — 1 Script Tag

Drop a script tag into any internal portal, client dashboard, or help center. Floating or inline mode with full brand customization. Under 150KB total payload.

08

Versioned REST API

Every platform capability exposed programmatically. Webhooks for pipeline events. Compatible with Zapier, n8n, and custom workflow automation.

09

Swap Models Without Rearchitecting

Generation, classification, embedding, and reranking are independently configurable per org. Move from GPT to Claude to open-source without touching your integration.

What happens between the question and the answer. Every time.

Most RAG systems do a single embedding lookup and hope for the best. OmniHub runs a 6-stage pipeline on every query — each stage exists because skipping it produces a measurably worse answer.

Tracing query: "What's our return policy for international orders?"
< 1ms

Intent Classification

Classifies query type and selects the retrieval strategy before any search runs.

OmniHub Pipeline Output
intent: "policy_lookup"
filters: [
"geographic_scope: international",
"document_type: policy"
]
strategy: "high_precision, multi_document_synthesis"
Naive RAG (Single-Pass)

Skipped — query sent directly to embedding model as-is.

Why this stage matters:

Without intent classification, the retrieval engine doesn't know whether to search broadly or precisely. A policy lookup needs exact matches; a troubleshooting query needs fuzzy search. Treating them identically degrades both.
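The strategy-selection behavior above can be illustrated with a deliberately simple rule-based stand-in. A production classifier would presumably be model-based; the intent labels and strategy names here are assumptions for the sketch only.

```python
# Hypothetical stand-in for the intent stage: map a query to a retrieval
# strategy before any search runs. Labels and rules are illustrative.
def classify_intent(query: str) -> dict:
    q = query.lower()
    if any(w in q for w in ("policy", "terms", "eligibility")):
        # Exact-match territory: search precisely.
        return {"intent": "policy_lookup", "strategy": "high_precision"}
    if any(w in q for w in ("error", "fix", "troubleshoot", "not working")):
        # Symptom descriptions vary: search broadly and fuzzily.
        return {"intent": "troubleshooting", "strategy": "fuzzy_broad"}
    return {"intent": "general", "strategy": "balanced"}
```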

~80ms

Query Expansion

Generates multiple search perspectives from the original question — synonyms, domain terms, and hypothetical answer patterns.

OmniHub Pipeline Output

4 search variants generated:

ORIG return policy international orders
VAR1 refund process overseas shipping
VAR2 cross-border exchange conditions
VAR3 international order money back guarantee timeline
Naive RAG (Single-Pass)

Single embedding of original question. Misses 'refund', 'exchange', 'overseas', 'cross-border' — all present in source docs but not in the user's words.

Why this stage matters:

Users rarely phrase questions the way documentation is written. A single embedding misses synonyms and domain terminology. Expansion bridges that vocabulary gap.
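A minimal sketch of that vocabulary bridging, using a hand-written synonym map. In practice this step would typically be LLM- or thesaurus-driven; the map and variant count below are illustrative, not OmniHub's implementation.

```python
# Illustrative synonym map; a real system would generate variants dynamically.
SYNONYMS = {
    "return": ["refund", "exchange"],
    "international": ["overseas", "cross-border"],
}

def expand_query(query: str) -> list:
    """Produce the original query plus one variant per synonym substitution."""
    variants = [query]
    for term, alts in SYNONYMS.items():
        if term in query.lower():
            for alt in alts:
                variants.append(query.lower().replace(term, alt))
    return variants
```

Each variant is embedded and searched separately, so documents that say "refund" or "cross-border" are found even when the user said neither.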

~120ms

Dual-Path Retrieval

Semantic similarity and keyword precision run in parallel. Results fused using Reciprocal Rank Fusion.

OmniHub Pipeline Output
Semantic path (3 results):
0.91 International returns must be initiated within 45 days… Returns Policy v3.2
0.87 Customers outside the domestic market are eligible for… Global Shipping Guide
0.84 Refund processing for cross-border orders takes 10-15… Finance SOP
Keyword path (2 results):
0.94 Return policy: international orders are subject to… Returns Policy v3.2
0.82 Customs duties on returned international shipments… Customs & Duties FAQ
→ Fused via Reciprocal Rank Fusion into unified ranked list
Naive RAG (Single-Pass)

Single vector search returns top-5. Three of the five are from the same document section — the customs FAQ — because it happens to have the densest keyword overlap. Misses the actual return policy document entirely.

Why this stage matters:

Semantic search understands meaning but misses exact terms. Keyword search catches exact matches but misses paraphrases. Running both and fusing results catches what either alone would miss.
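Reciprocal Rank Fusion itself is compact: each result list contributes 1 / (k + rank) per document, and the sums are re-sorted. The constant k = 60 is the conventional choice; the document IDs below are illustrative.

```python
def rrf(ranked_lists: list, k: int = 60) -> list:
    """Fuse multiple ranked lists of document IDs via Reciprocal Rank Fusion.
    A document appearing high in several lists accumulates the most score."""
    scores = {}
    for results in ranked_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Because scores come from ranks rather than raw similarity values, RRF needs no calibration between the semantic and keyword paths, which score on incompatible scales.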

~60ms

Diversity Reranking

Removes near-duplicate passages and ensures the final set covers distinct information facets.

OmniHub Pipeline Output
Before reranking (clustered):
DUPLICATE Returns Policy v3.2 §4.1
DUPLICATE Returns Policy v3.2 §4.2
DUPLICATE Returns Policy v3.2 §4.3
Global Shipping Guide §7
Customs FAQ #12
After reranking (diverse):
UNIQUE Returns Policy v3.2 §4.1 — return window
UNIQUE Global Shipping Guide §7 — shipping costs
UNIQUE Customs FAQ #12 — duty reclaim
UNIQUE Finance SOP §3 — refund timeline
Naive RAG (Single-Pass)

No deduplication. Three passages from §4.1-4.3 all say essentially the same thing. The LLM gets redundant context and misses the duty reclaim and refund timeline information entirely.

Why this stage matters:

Without diversity reranking, the top-k results cluster around whichever document section has the highest raw relevance — even if those passages are near-identical. The answer becomes deep on one subtopic and blind to others.
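One way to express the deduplication principle is a greedy pass that admits a passage only if it is not a near-duplicate of anything already selected, in the spirit of maximal marginal relevance. OmniHub's actual reranker is unspecified; the token-set Jaccard similarity and threshold here are assumptions kept dependency-free for the sketch.

```python
def jaccard(a: str, b: str) -> float:
    """Token-set overlap: a cheap proxy for passage similarity."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def diversify(passages: list, threshold: float = 0.6) -> list:
    """Greedy diversity pass over passages already sorted by relevance:
    keep a passage only if it adds information the selection lacks."""
    selected = []
    for p in passages:
        if all(jaccard(p, s) < threshold for s in selected):
            selected.append(p)
    return selected
```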

~5ms

Context Assembly

Orders passages based on how LLMs actually allocate attention — critical evidence placed in high-attention positions.

OmniHub Pipeline Output
Position 1: Return window & eligibility — Returns Policy v3.2 High attention
Position 2: Refund timeline — Finance SOP §3
Position 3: Shipping cost responsibility — Global Shipping Guide
Position 4: Customs duty reclaim process — Customs FAQ High attention
Position 3 is the "lost in the middle" zone — LLMs pay least attention here. Non-critical info goes here intentionally.
Naive RAG (Single-Pass)

Passages dumped in retrieval-score order. Critical refund timeline information ends up in position 3 — the 'lost in the middle' zone where LLMs demonstrably pay less attention. The final answer omits refund timelines.

Why this stage matters:

Research shows LLMs pay most attention to the first and last items in context. If your most important passage lands in the middle, the model is likely to ignore it — even though it was retrieved correctly.
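The reordering can be sketched as follows: given passages sorted best-first, alternate them between the front and back of the context window so the strongest evidence lands in the high-attention end positions and the weakest sinks to the middle. The exact placement policy is an assumption for illustration.

```python
def assemble_context(passages_by_relevance: list) -> list:
    """Place the most relevant passages at the ends of the context window,
    pushing the least relevant into the low-attention middle."""
    front, back = [], []
    for i, p in enumerate(passages_by_relevance):
        (front if i % 2 == 0 else back).append(p)
    return front + back[::-1]
```

With four passages A > B > C > D by relevance, the assembled order is A, C, D, B: the top two occupy the first and last slots, and C and D absorb the "lost in the middle" zone.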

~10ms

Citation Mapping

Every factual claim in the response is traced back to a specific source passage, section, and document.

OmniHub Pipeline Output
"International orders can be returned within 45 days [1] of delivery. The customer bears return shipping costs [2], but customs duties are reclaimable through the carrier [3]. Refunds are processed within 10–15 business days [4] of warehouse receipt."
Source trace:
  • [1] Returns Policy v3.2 — §4.1, line 12
  • [2] Global Shipping Guide — §7, paragraph 3
  • [3] Customs & Duties FAQ — Q12
  • [4] Finance SOP — §3.2, table row 4
Naive RAG (Single-Pass)

No citations. The LLM outputs '30-day return window' — which is the domestic policy, not international. No way for the user to verify. No way for the team to catch it.

Why this stage matters:

Without citation mapping, you can't distinguish a correct answer from a confident hallucination. Citations make every response auditable — and make errors immediately identifiable.
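The verification half of this is mechanical and can be sketched directly: parse the numbered markers out of a generated answer and confirm each one resolves to a known source passage. The `[n]` marker syntax and the source-map shape are assumptions, not OmniHub's wire format.

```python
import re

def citation_trace(answer: str, sources: dict) -> list:
    """Map every [n] citation marker in an answer to its source record.
    Raise if any marker fails to resolve, so unverifiable claims are caught."""
    markers = [int(m) for m in re.findall(r"\[(\d+)\]", answer)]
    missing = [m for m in markers if m not in sources]
    if missing:
        raise ValueError(f"unresolved citations: {missing}")
    return [(m, sources[m]) for m in markers]
```

A check like this is what turns a citation from decoration into an audit trail: an answer that cites a passage the retriever never returned is rejected rather than shipped.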

Total pipeline latency: ~280ms
6 stages, 4 source documents, 4 traceable citations — before the LLM generates a single token.
6
Processing stages
4
Query variants
5
Retrieval paths
4
Unique sources

Operational visibility for the people responsible.

Pipeline status, ingestion health, document coverage, model spend — your ops team sees what is working and what needs attention without asking engineering.

Admin console: Departments · Users · Content Pipeline · Integrations · Analytics · AI Config · Spending · Settings · Notifications

Content Pipeline (+ Upload)
  • brand-guidelines.pdf · Indexed
  • onboarding-handbook.docx · Chunking…
  • api-reference.md · Parsing…
  • sales-playbook.pdf · Queued

No model lock-in. No rearchitecting.

Four independent AI slots — generation, fast processing, embedding, reranking — each configurable per org.

Generation
Claude · GPT-4.1 · Opus
Fast Processing
Haiku · GPT-4o Mini
Embedding
OpenAI · Cohere Multilingual
Reranking
Cohere Rerank · Optional

Five concentric isolation layers.

Security enforced at the architecture level — not policy, not prompting, not access control lists alone.

Authentication: SAML 2.0 / OIDC SSO, MFA enforcement, short-lived JWT tokens, automatic credential rotation · Enforced
Tenant Isolation: separate vector stores, object storage, and encryption keys per organization — physical isolation, not namespace separation · Enforced
Permission-Scoped Retrieval: access control enforced at the retrieval layer — two users, same query, different results, guaranteed by the pipeline, not the frontend · Enforced
Credential Vault: third-party API keys AES-256 encrypted with per-tenant derived keys, never returned after storage, rotatable without downtime · Enforced
Prompt Injection Defense: source content structurally separated from instructions — retrieved passages cannot alter system behavior regardless of content · Enforced

You see degradation before your users report it.

  • Retrieval Precision · Tracked
  • Citation Accuracy · Verified
  • User Thumbs-Up Rate · Measured
  • No-Match Rate · Monitored
  • Nightly Benchmarks · Automated

Deploy conversational AI without a frontend rewrite.

Single script tag. Works inside any existing portal, dashboard, or help center. Inherits your branding. Under 150KB total payload. No iframe, no performance penalty.

<!-- Add OmniHub to any page -->
<script
  src="https://cdn.omnihub.io/widget.js"
  data-org-key="wk_abc123"
  async defer></script>
Demo: a ShopBrand help center ("Find answers about orders, returns, and account management") with the OmniHub widget embedded.

ShopBrand Support · Powered by OmniHub
User: How do I return an item?
Assistant: You can initiate a return within 30 days of delivery. Go to My Orders → "Start Return." [1]

Where this runs in production

Utilities, telecom, internal ops, and CX teams run OmniHub where cited answers, access control, and channel coverage matter.

Utilities & Government

Citizen-Facing Portal Automation

Deflect billing, outage, and policy inquiries from call centers. Agents handle escalations with full AI-gathered context — not cold transfers. Designed for 24/7 uptime across web and messaging.

Telecom & Financial Services

Tier-1 Support Deflection

Resolve plan, billing, and account questions before they reach a human. Sentiment-aware routing escalates frustrated customers immediately. Operational across web, WhatsApp, and email from day one.

HR & Internal Operations

Policy and Process Self-Service

Surface answers from HR manuals, IT runbooks, and compliance docs without filing a ticket. Multilingual. Department-scoped — engineering sees engineering docs, finance sees finance.

Customer Experience Teams

Omnichannel Knowledge Layer

Single knowledge backend powering every customer touchpoint. When agents do engage, they see the same sources the AI used — ensuring consistency between automated and human responses.

Connects to your support stack

WhatsApp · Web Chat · Zendesk · Freshdesk · REST API

What changes operationally

Same teams and channels — different outcomes when answers are indexed, cited, and permission-aware end to end.

Dimension | Without OmniHub | With OmniHub
Response source | Agent memory, tribal knowledge | Indexed docs with traceable citations
Time to answer | Minutes to hours (agent dependent) | Sub-second first token, < 4s full answer
Channels | Single channel, one language | Web, WhatsApp, email — 50+ languages
Context on handover | Lost — customer repeats everything | Full transcript + retrieved sources forwarded
Scales with volume | Linear headcount increase | Marginal cost per query, not per agent
Knowledge freshness | Whenever someone updates the wiki | Auto-reindex on source doc change
Department Hierarchy
Acme Corp · 148 users
  • Engineering · 42 users
      • Frontend · 12 users
      • Backend · 18 users
  • Marketing · 28 users
  • Finance · 14 users

Same question. Different clearance. Different answer. By design.

Access cascades through your org hierarchy at the retrieval layer. Finance data never surfaces for engineering queries.

Your infrastructure. Your rules.

Multi-tenant SaaS, dedicated cloud, or on-prem — same pipeline and APIs; you choose where data lives and who operates the boundary.

Single-Tenant Cloud

Dedicated instance in your VPC or ours. Full network control.

  • Data residency compliance
  • Custom network and firewall policies
  • Managed upgrades by 9series

On-Premise / Air-Gapped

Runs entirely inside your data center. Nothing leaves your perimeter.

  • Air-gapped deployment supported
  • Regulated industry ready (BFSI, Govt)
  • Custom SLA and support terms

Frequently Asked Questions

Learn more about OmniHub enterprise knowledge infrastructure.

How quickly can we be up and running?
Most deployments reach first working answers within a day. Connect document sources, configure department hierarchy and permissions, and the pipeline handles ingestion automatically. No data science team required on your side.

Who owns the data, and is it used for training?
You own your data — full stop. Documents are processed and stored in isolated tenant environments. We don't use customer data for model training. On-premise deployment is available if data cannot leave your infrastructure.

How do you prevent hallucinations?
Structurally, not through prompting. Answers are constrained to retrieved source passages. Every claim includes a citation reference. When confidence is low, the system says so explicitly and offers human handover — it doesn't guess.

Can we choose or change the underlying AI models?
Yes. Generation, embedding, classification, and reranking are independently configurable. Move between Claude, GPT, or open-source models per slot without changing your integration. No vendor lock-in at the model layer.

What about security and compliance?
SOC 2 Type II in progress. Infrastructure supports GDPR data residency requirements. Tenant isolation is enforced at the storage, retrieval, and API layer independently. We support SSO (SAML/OIDC), MFA, and role-based access controls.

How does it integrate with our support stack?
Pre-built connectors for Zendesk and Freshdesk. Webhook-based integration for any ticketing system. Conversations and context transfer via API. The widget embeds into any web interface with a single script tag.

How much ongoing maintenance is required?
Minimal. Source docs re-index automatically when they change. Quality benchmarks run nightly and surface issues proactively. Your team manages what to ingest and who has access — the pipeline handles everything else.

< 1s
First token latency
50+
Languages supported
12+
Ingestible file formats
0
Cross-tenant data leakage

The questions your team will ask.
Answered before the call.

Exit strategy, engineering lift, scale, model outages, security review, and TCO — laid out for platform and security stakeholders.

What's the exit strategy?
Your documents are always exportable. Embeddings and index metadata can be bulk-exported via API. If you leave, your data leaves with you — we retain nothing post-termination.
What engineering time does this require from us?
Integration is a single API key or script tag. No ML engineers, no data pipeline work, no model hosting. Ongoing maintenance is document uploads and permission changes — both handled in the admin console.
How does this handle scale?
Retrieval is horizontally scalable — concurrent queries do not degrade latency. Document ingestion is async and queued. We've designed for organizations processing thousands of documents across dozens of departments.
What breaks when the AI model has an outage?
Model providers are abstracted behind a routing layer. If your primary generation model goes down, failover to a secondary is configurable. Retrieval and citation continue to function regardless of model availability.
Can our security team audit this?
Architecture documentation available under NDA. We support penetration testing by your team against your tenant. SOC 2 Type II in progress. On-premise deployment gives your team full infrastructure visibility.
What's the total cost of ownership beyond licensing?
No ML team, no model hosting, no vector database management, no curation staff. Your cost is the platform subscription plus underlying model API usage — which you can see and control per-slot in the admin console.

Start with your hardest knowledge problem

Bring your messiest document set. We will run a working pilot against your actual content — so you evaluate real retrieval quality, not a demo dataset.

  • Pilot runs on your data, not ours
  • Working answers within 48 hours
  • Full API access from day one
  • No long-term commitment required
Request a Technical Walkthrough