LLM Intelligence & Governance Platform

Stop Watching Your
AI Costs Climb.

Oolyx is a 100% on-premises reverse proxy that actively reduces your LLM spend — while giving you complete visibility, quota enforcement, and compliance across every provider and IDE.

See all your developer, pipeline, and chat usage in one place. Every LLM call — from every source — centralized in a single dashboard.
Change nothing — we optimize for you. Smart routing, caching, prompt engineering, and token reduction under the hood.
Powerful governance built in. Free-form tagging, customized quotas, PII scrubbing, and guardrails.
100% on-prem. Nothing leaves without your say. Your infrastructure, your data, your rules.

No SaaS dependency · No telemetry · Runs in Docker · Deploys in minutes

Built for regulated enterprises
Banking Healthcare Financial Services
The Problem

Your AI spend is invisible — and your regulators are watching

No centralized monitoring across chat UIs, IDEs, and production workflows. You get visibility into maybe one or two channels — never all of them, and certainly not IDE spending.

83%
organizations lack automated AI security controls

Shadow AI flows through GitHub Copilot, Cursor, and ChatGPT with zero oversight. Every unmonitored prompt is a potential compliance violation.

Kiteworks, "2025 AI Data Security and Compliance Risk Study"
27.4%
of corporate data sent to AI tools is sensitive

SSNs, account numbers, PHI — flowing to external APIs with no DLP controls. Only 17% of organizations have automated protections in place.

Cyberhaven via DNSFilter, 2025
+56%
AI privacy incident growth in 2024

EU AI Act high-risk rules enforceable August 2026. SR 11-7 examiners reviewing LLM usage now. Colorado AI Act enforcement begins June 2026.

OECD AI Incident Tracker via Thunderbit, 2024
Near-zero
IDE cost visibility across the industry

Copilot and Cursor make thousands of LLM calls daily across your developer fleet — none of it visible in your AI cost dashboards.

ROI Calculator

See exactly what you'd save.

Adjust the sliders below to model your actual savings potential.

Your primary model:
Your monthly spend
$1k – $500k
$50,000
/month
TOKEN REDUCTION
10%
-$5,000
/month
UNNECESSARY API CALL PREVENTION
10%
-$4,500
/month
SMART MODEL SWITCHING
Route a portion of calls to
20%
-$3,240
/month
New monthly spend
$37,260
You save
$12,740/mo
New annual spend
$447,120
You save
$152,880/yr
25.5%
total saved

Oolyx does this. Automatically. On your infrastructure.

See How It Works
How It Works

Everything flows through one place.

Your apps, your IDEs, your agents — all governed, all optimized, all in one dashboard.

YOUR APPS
API calls
Backend services
DEVELOPER IDES
Copilot
Cursor
Claude Code
Codex
AI AGENTS
Tool calls
MCP
Sessions
CHAT UI USERS
Internal chat interface
Oolyx
Oolyx Proxy
Quotas PII Scrub Token Trim Caching Analytics Playbooks
On your infrastructure · Nothing leaves
PROVIDERS
OpenAI
Anthropic
Gemini
AWS Bedrock
Converse API
IDE PROVIDERS
GitHub Copilot
Cursor
Claude Code
Codex
01

Run Docker

$ docker-compose up -d
✓ Proxy on :8787
✓ Connected to PostgreSQL
✓ Dashboard ready
02

Swap one URL

base_url="http://Oolyx:8787/v1"
  
03

Savings start

Month 1: Baseline (free)
Month 2: Optimize + enforce

→ Savings tracked automatically
→ Every dollar attributed
Capabilities

Everything your organization needs to govern AI.
Built in from day one.

LIVE

Workspaces & Access Control

LIVE

Quota Enforcement

LIVE

PII / PHI Protection

LIVE

Optimization Strategies

LIVE

Gauntlet + Savings Report

LIVE

Analytics Dashboard

COMING SOON

Playbooks

Custom Tagging

Fully customizable tagging system with no predefined schema. Tag requests by manager, organization, team, contract, environment, module — whatever hierarchy your business needs.

Workspaces

Organize LLM connectivity by tagset. Each workspace gets its own view of cost, usage, and governance — isolated but centrally managed. Multi-tenant with local admin control, roles (admin, member, observer), full audit trail.

Compliance & Security

Built for regulated industries.

Your data never leaves.

On-premises deployment satisfies the strictest data residency requirements. Full audit trails ready for your examiners — no cloud dependency, no telemetry.

On-Prem Infrastructure

Pure Python + PostgreSQL
Docker / K8s / bare metal
Connects to your RDS
No SaaS, no telemetry

Audit Trail

Every quota change logged
Every blocked request recorded
Full request history with tags
User identity on every action

Security

Server-side sessions (not JWT)
bcrypt password hashing
PII/PHI scrubbing
Role-based access control
Architecture

One proxy. Every provider. Zero code changes.

Oolyx sits between your applications and LLM providers as an on-premises reverse proxy. All traffic flows through one place — governed, optimized, and audited.

Client sources Apps & APIs Backend services Developer IDEs Copilot · Cursor · Claude Code AI Agents MCP · Pipelines · Sessions Chat Users Internal interface Oolyx Proxy On-premises · port 8787 Quota enforcement PII / PHI scrubbing Smart model routing Semantic caching Audit log → PostgreSQL LLM providers OpenAI Anthropic Gemini AWS Bedrock Azure OpenAI Cohere Local / Private (Ollama · vLLM) Self-hosted models All traffic flows through Oolyx · nothing bypasses governance · 100% on-premises
DEPLOYMENT
Docker / Podman / Kubernetes
docker-compose up -d
4 containers: proxy, dashboard, postgres, redis
Runs on AWS, GCP, Azure, or bare metal
INTEGRATION
Swap one URL — base_url → :8787/v1
OpenAI-compatible API endpoint
IDE traffic interception (transparent)
No code changes to your applications
DATA & SECURITY
No telemetry, no SaaS dependency
Your PostgreSQL — full audit trail
Air-gappable control plane
HTTPS-only outbound to LLM providers
View detailed architecture diagrams →
Why Oolyx

The only platform that actively reduces costs, runs on-prem, and proves every dollar saved.

Provider-agnostic Tag anything Zero code changes Lightweight Self-hosted Zero lock-in
Only Oolyx

IDE traffic, governed and optimized.

Oolyx intercepts Copilot, Cursor, and Claude Code traffic — enforcing quotas, scrubbing PII, and actively reducing hidden token bloat under the hood, without touching your code or workflows.

Only Oolyx

Stop expensive calls before they start.

Hard enforcement mode guarantees limits are respected. Intelligent quota estimation techniques predict bloated requests before they occur and blocks them.

Architecture

100% On-Prem, Radically Simple

Python + PostgreSQL. No SaaS dependency. No telemetry. No vendor audit surface. Zero lock-in — swap one URL to start, swap it back to stop.

Governance

Layered Quota Governance

Comprehensive, intuitive quota system that works with your custom tagging system. Full audit trail of every request, block, and admin action in your PostgreSQL.

Getting Started

From zero to governed in a day

Deploy once, see everything immediately. Governance follows at your pace.

Day 1

Deploy & Connect

Deploy Oolyx, swap URLs, see your first requests flowing through the dashboard immediately. Works on AWS, GCP, or Azure — private subnets, no public exposure.

Week 1–2

Organize & Govern

Tag your applications, set up workspaces, configure quotas and PII rules, enable IDE interception. Compliance layer goes live for your first regulated workflows.

Month 1

Optimize

Run model benchmarks, review prompt intelligence, identify waste patterns. Quantify LLM spend reduction with hard numbers from your own traffic data.

Start with a free 2-month pilot.

MONTH 1 — OBSERVE
Observes Your Spending
Deploy on your infrastructure
Log all LLM + IDE traffic
Full dashboard access
Build your cost baseline
MONTH 2 — OPTIMIZE
Reduces Your Spending
Run Savings Report on your data
Turn on quotas, caps, trimming
Every saved dollar tracked + attributed
PDF report for your leadership

After the pilot, we can negotiate a rate that works for you.

FAQ

Common questions

How is this different from competitors?+
How do I know the savings are real?+
Does it add latency?+
Is there really no SaaS component?+
What providers do you support?+

Ready to optimize your AI spend?

Optimize. Illuminate. Save.