LLMosaic Platform Overview

LLMosaic is a cloud-native AI platform providing comprehensive, API-driven services optimized for building Retrieval-Augmented Generation (RAG) applications. The platform combines Large Language Models (LLMs), embedding models, and a scalable PostgreSQL-based database layer, spanning vector and relational database services, into a unified, pay-as-you-go offering.

Platform Architecture Diagram

%% LLMosaic Platform Architecture Diagram (Vertical Layout, Larger Fonts)
%%{init: {"themeVariables": {"fontSize": "32px"}}}%%
graph TD

%% Client Layer
subgraph Client_Layer ["Client Layer"]
  CLIENT["Client Applications"]
end

%% API Layer
subgraph API_Layer ["LLMosaic API Layer"]
  LLM["LLM API (OpenAI Compatible)"]
  EMBED["Embeddings API"]
  DBAPI["Database API (PostgREST-inspired)"]
  ETL["Integrated ETL Pipelines"]
end

%% Database Layer
subgraph Database_Layer ["PostgreSQL Database Cluster"]
  VECDB["Vector Schemas (pgvector/vectorchord)"]
  RELDB["Relational & JSON Schemas"]
  CITUS["Citus Horizontal Scaling"]
  PATRONI["Patroni High Availability"]
end

%% Infrastructure Layer
subgraph Infra_Layer ["Infrastructure Layer"]
  GPU["NVIDIA GPUs (H100, A100)"]
  CPU["CPU Resources"]
  STORAGE["Storage Resources"]
end

%% External Sources
EXT_RDB["Enterprise Relational DB"]
EXT_JSON["Enterprise NoSQL JSON"]

%% Connections (Vertical alignment)
CLIENT --> LLM
CLIENT --> EMBED
CLIENT --> DBAPI
CLIENT --> ETL

EXT_RDB --> ETL
EXT_JSON --> ETL

ETL --> DBAPI
DBAPI --> VECDB
DBAPI --> RELDB

RELDB --> CITUS
VECDB --> CITUS
CITUS --> PATRONI

%% Infrastructure connections
LLM --> GPU
EMBED --> GPU
DBAPI --> CPU
ETL --> CPU
CITUS --> CPU
PATRONI --> CPU
VECDB --> STORAGE
RELDB --> STORAGE

%% Optimized Internal Data Pathways
RELDB -->|Query Results| EMBED
EMBED -->|Embeddings Storage| VECDB
VECDB -->|Vector Retrieval| LLM

%% Data Sovereignty Compliance
CITUS -->|"Geographic Data Restrictions"| STORAGE

Core Features

OpenAI-Compatible LLM and Embeddings APIs

LLMosaic offers OpenAI-compatible REST API endpoints, enabling seamless integration of popular LLMs and embedding models into RAG workflows.
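Because the endpoints follow the OpenAI schema, a request body looks the same as one sent to OpenAI itself. The sketch below assembles such a request in Python; the base URL, model name, and API key are hypothetical placeholders, not documented LLMosaic values.

```python
# Build an OpenAI-style chat completion request for LLMosaic's
# compatible endpoint. BASE_URL, API_KEY, and the model name are
# illustrative placeholders.
import json

BASE_URL = "https://api.llmosaic.example/v1"  # hypothetical endpoint
API_KEY = "YOUR_API_KEY"

def build_chat_request(prompt: str, model: str = "llmosaic-chat") -> dict:
    """Assemble an OpenAI-compatible chat completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}
body = build_chat_request("Summarize our Q3 sales figures.")
print(json.dumps(body, indent=2))
```

In practice, any OpenAI SDK can target such an endpoint simply by overriding its base URL, so existing RAG code needs no structural changes.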

Vector Database Integration (pgvector & vectorchord)

The LLMosaic vector database utilizes PostgreSQL extended with the pgvector and vectorchord extensions, providing efficient semantic search, storage, indexing, and querying of vector embeddings. Direct integration with embedding model outputs significantly reduces latency and enhances retrieval performance.
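A typical pgvector search ranks rows by cosine distance with the `<=>` operator (for example, `ORDER BY embedding <=> $1 LIMIT 5`). For intuition, here is the same distance computation in plain Python; the toy vectors are illustrative only.

```python
# Cosine distance, as computed by pgvector's `<=>` operator:
# 1 - (a . b) / (|a| * |b|). Toy 3-dimensional vectors for illustration.
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

query = [0.1, 0.9, 0.0]
docs = {"doc_a": [0.1, 0.8, 0.1], "doc_b": [0.9, 0.1, 0.0]}

# Nearest neighbour = smallest cosine distance to the query vector.
nearest = min(docs, key=lambda k: cosine_distance(query, docs[k]))
print(nearest)  # doc_a points in nearly the same direction as the query
```

In production these comparisons run inside PostgreSQL against indexed vector columns, so applications only issue SQL rather than computing distances client-side.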

Relational and JSON Database Services (PostgreSQL-based)

Powered by PostgreSQL, the relational and JSON database APIs support structured data and document storage. These services accelerate the development of ETL pipelines, empowering rapid integration of enterprise data into advanced RAG scenarios.

Secure Multi-Tenant Database API (Inspired by PostgREST)

The database REST API closely aligns with the design philosophy of PostgREST, providing a secure and scalable multi-tenant environment. Tenant isolation is enforced via PostgreSQL schemas, with schema selection controlled through HTTP request headers and JWT-based authentication. Each tenant can define multiple schemas, enabling encapsulation of separate applications within a single tenant namespace.
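A tenant-scoped request might therefore carry a JWT plus a schema-selection header. The sketch below is modeled on PostgREST's conventions (its `Accept-Profile` header selects the schema for reads); the header names, URL, and schema name are assumptions, so consult the actual API reference for the exact contract.

```python
# Sketch of a tenant-scoped request to the database API, modeled on
# PostgREST conventions. Header names and endpoint are assumptions.

def tenant_headers(jwt: str, schema: str) -> dict:
    """Build headers that authenticate a tenant and select one of its schemas."""
    return {
        "Authorization": f"Bearer {jwt}",  # JWT identifies and authorizes the tenant
        "Accept-Profile": schema,          # PostgREST-style schema selection for reads
        "Accept": "application/json",
    }

headers = tenant_headers("YOUR_JWT", "tenant_acme_app1")
# e.g. GET https://db.llmosaic.example/documents?select=id,title  (hypothetical URL)
print(headers["Accept-Profile"])
```

Because each application lives in its own schema, switching the header value is enough to address a different application within the same tenant namespace.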

High Availability & Horizontal Scalability (Citus & Patroni)

LLMosaic’s database infrastructure leverages Citus for horizontal scaling and Patroni for high availability (HA). Citus workers distribute database workloads across servers, data centers, and geographic regions, enabling seamless scalability. Patroni ensures robust, fault-tolerant operations and automatic failover capabilities.
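Sharding in Citus is declared per table. The sketch below shows the SQL involved, using Citus's real `create_distributed_table()` function; the table and column names are illustrative, as is the choice of tenant ID as the distribution key.

```python
# Illustrative SQL for distributing a table across Citus workers.
# create_distributed_table() is Citus's actual API; names are examples.

CREATE_TABLE = """
CREATE TABLE documents (
    tenant_id bigint NOT NULL,
    doc_id    bigint NOT NULL,
    body      text,
    PRIMARY KEY (tenant_id, doc_id)
);
"""

# Shard by tenant_id so each tenant's rows are co-located on one worker,
# keeping per-tenant queries on a single node.
DISTRIBUTE = "SELECT create_distributed_table('documents', 'tenant_id');"

print(DISTRIBUTE)
```

Choosing the tenant ID as the distribution column is a common pattern for multi-tenant workloads, since it lets joins and filters within one tenant avoid cross-node traffic.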

Data Sovereignty Compliance

Advanced features allow the platform to restrict sensitive data geographically, assisting enterprises in meeting regional data sovereignty and compliance regulations.

Optimized Infrastructure with Colocated Resources (OneSource Cloud)

In partnership with OneSource Cloud, LLMosaic colocates state-of-the-art GPU (including NVIDIA H100 and A100), CPU, and storage resources within unified data center environments, significantly improving RAG application performance. Eliminating unnecessary network hops reduces response times and improves security.

Enhanced RAG Workflow Efficiency

Integrated ETL for Enterprise Data

Existing enterprise data from relational databases and JSON data from NoSQL sources can be quickly loaded into the LLMosaic PostgreSQL cluster via integrated ETL pipelines. This capability allows enterprises to easily leverage existing data assets in their RAG workflows.

Optimized Embedding Pipelines

Database query results can be sent directly from the PostgreSQL database cluster to the embeddings API via optimized internal pathways within the data center. Generated embeddings are then stored back into the PostgreSQL cluster and indexed using pgvector and vectorchord, improving RAG application performance by minimizing latency.
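The loop this internal pathway automates can be sketched as: rows out of PostgreSQL, through the embeddings API, and back into a vector column. In the sketch below, `embed()` is a stand-in for the real embeddings API call, and the row shape is illustrative.

```python
# Minimal sketch of the embed-and-store loop. embed() is a stand-in
# for the embeddings API; real vectors come from the model.

def embed(text: str) -> list[float]:
    # Stand-in: returns a trivial 2-d "vector" derived from the text.
    return [float(len(text)), float(text.count(" "))]

def embed_rows(rows: list[dict]) -> list[tuple]:
    """Turn query results into (id, vector) pairs ready for storage."""
    return [(row["id"], embed(row["body"])) for row in rows]

rows = [
    {"id": 1, "body": "quarterly revenue report"},
    {"id": 2, "body": "employee handbook"},
]
vectors = embed_rows(rows)
# Each pair then maps to e.g.: UPDATE docs SET embedding = %s WHERE id = %s
print(vectors)
```

On the platform, both hops stay inside the data center, which is what removes the usual round trips to an external embeddings provider.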

Configurable Optimized Data Pathways

LLMosaic's optimized data pathways can be flexibly configured via its REST API. Additionally, the LLM, vector database, and relational database APIs function independently, enabling developers to rapidly adapt existing RAG applications to leverage LLMosaic's infrastructure. Upcoming integrations with popular frameworks such as LlamaIndex and LangChain will make onboarding even easier.

Direct Embedding Vector Storage

Embedding model outputs are stored directly within the vector database, eliminating external data round trips and providing reduced latency and enhanced data security.

Efficient Embeddings Queries

Embedding queries executed against the vector database automatically forward results to the LLM API, streamlining retrieval-generation cycles and minimizing overhead.
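The retrieval-to-generation hand-off amounts to folding vector-search hits into the prompt sent to the LLM API. A minimal sketch, with illustrative function names and toy retrieved chunks:

```python
# Sketch of assembling an LLM prompt from vector-search results.
# Function name and chunk contents are illustrative.

def build_rag_prompt(question: str, retrieved: list[str]) -> str:
    """Fold retrieved context chunks into a grounded prompt."""
    context = "\n\n".join(f"- {chunk}" for chunk in retrieved)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

hits = ["Revenue grew 12% in Q3.", "Churn fell to 2.1%."]
prompt = build_rag_prompt("How did Q3 go?", hits)
print(prompt)
```

On LLMosaic this forwarding happens platform-side, so the application never has to shuttle retrieved chunks through its own process between the two API calls.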

Comprehensive ETL Capabilities

Comprehensive database APIs facilitate rapid extraction, transformation, and loading (ETL) of enterprise data into embedding and relational stores, accelerating RAG application deployment.

Flexible and Cost-Effective Deployment

LLMosaic’s pay-as-you-go model provides cost-effective and massively scalable database and API services. Optional private database cluster provisioning is available via subscription, offering dedicated resources tailored to specific enterprise requirements.