Quota checks run in parallel and complete as quickly as possible with minimal impact. Streaming pass-through adds no visible latency. The entire proxy is async end-to-end.

On-prem AI governance · Built for regulated enterprises

Every AI call, governed.
Every dollar, proven.

Oolyx is the neutral, on-premises control plane for AI. It governs every model and IDE your teams already use, scrubs PII before it ever leaves your network, and cuts LLM spend 25–40% — with every saved dollar traced to an auditable action.

→

Neutral by design. No model to sell, no cloud to fill — the one layer with no reason to inflate your bill.

→

Govern every model and IDE. One view of every LLM call — quotas, PII scrubbing, and tagging across providers, pipelines, chat, and Copilot/Cursor.

→

Prove every dollar. The Gauntlet backtests savings on your real traffic; the ledger traces each cut dollar to the action that made it.

→

Nothing leaves your network. 100% on-prem, no telemetry, air-gappable. Your infrastructure, your data, your rules.

Book a demo See how we prove savings

No SaaS · No telemetry · Runs in your VPC · SR 11-7 · HIPAA · EU AI Act

Oolyx — unified governance & spend, all sources

Total Spend

$42,380

this month

Governed

100%

of AI traffic

Requests

1.2M

across 4 providers

PII Blocked

847

SSN, CC, PHI

Spend by Provider — Last 30 Days

OpenAI

Anthropic

Gemini

Bedrock

IDE

The Problem

Your AI spend is invisible — and your regulators are watching

No centralized monitoring across chat UIs, IDEs, and production workflows. You get visibility into maybe one or two channels — never all of them, and certainly not IDE spending.

83%

organizations lack automated AI security controls

Shadow AI flows through GitHub Copilot, Cursor, and ChatGPT with zero oversight. Every unmonitored prompt is a potential compliance violation.

Kiteworks, "2025 AI Data Security and Compliance Risk Study"

27.4%

of corporate data sent to AI tools is sensitive

SSNs, account numbers, PHI — flowing to external APIs with no DLP controls. Only 17% of organizations have automated protections in place.

Cyberhaven via DNSFilter, 2025

+56%

AI privacy incident growth in 2024

SR 11-7 examiners are reviewing LLM usage now. The EU AI Act's high-risk regime phases in through 2027, and US state AI laws are following. Every unmonitored prompt is a growing audit exposure.

OECD AI Incident Tracker via Thunderbit, 2024

Near-zero

IDE cost visibility across the industry

Copilot and Cursor make thousands of LLM calls daily across your developer fleet — none of it visible in your AI cost dashboards.

Why now

The labs now own your AI tools.
You need a layer they don't control.

Every major AI coding tool now sits inside a company that also sells you the model behind it. Each one has a reason to lock you to its own model, keep token usage opaque, and route your traffic through its cloud. Oolyx has no model to sell — which is exactly why it can sit neutral above all of them.

Cursor

→ SpaceX / xAI

Windsurf

→ Google / Cognition

Copilot

→ Microsoft

Antigravity

→ Google

Oolyx is the one layer with no incentive to inflate your bill. Provider-agnostic, IDE-agnostic, on your own infrastructure — the neutral referee for every model and every tool your teams already use.

The proof

A savings report you can audit, line by line.

The calculator above is an estimate. This is what it looks like once it's real — Oolyx earns the 25–40% on your own traffic, then proves it. The Gauntlet backtests every optimization against your historical calls before a single one goes live — model switches, output trims, quota caps — so you see the savings before you trust them.

Then every dollar we cut is posted to a ledger and traced to the exact action that produced it. It's not an estimate. It's a record your finance team and your examiners can both reconcile.

Backtested on real traffic Every dollar attributed Examiner-ready export

SAVINGS LEDGER

May 2026 · verified

Output trim · summarization route

412k calls · -38% tokens

+$4,910

Model switch · Opus → Sonnet

quality held at 99.1%

+$3,240

Cache hit · repeated retrieval

28,904 redundant calls

+$2,118

Quota block · runaway IDE agent

capped before spend

+$1,472

PII redaction · pre-send scrub

847 entities blocked

risk averted

TOTAL SAVED

$11,740

How It Works

Everything flows through one place.

Your apps, your IDEs, your agents — all governed, all optimized, all in one dashboard.

YOUR APPS

●API calls

●Backend services

DEVELOPER IDES

●Copilot

●Cursor

●Claude Code

●Codex

AI AGENTS

●Tool calls

●MCP

●Sessions

CHAT UI USERS

●Internal chat interface

→

Oolyx Proxy

Quotas PII Scrub Token Trim Caching Analytics Playbooks

On your infrastructure · Nothing leaves

→

PROVIDERS

OpenAI

Anthropic

Gemini

AWS Bedrock

Converse API

IDE PROVIDERS

GitHub Copilot

Cursor

Claude Code

Codex

Run Docker

$ docker-compose up -d
✓ Proxy on :8787
✓ Connected to PostgreSQL
✓ Dashboard ready

Swap one URL

base_url="http://Oolyx:8787/v1"

Savings start

Month 1: Baseline (free)
Month 2: Optimize + enforce

→ Savings tracked automatically
→ Every dollar attributed

Capabilities

Everything your organization needs to govern AI.
Built in from day one.

LIVE

Workspaces & Access Control

LIVE

Quota Enforcement

LIVE

PII / PHI Protection

LIVE

Optimization Strategies

LIVE

Gauntlet + Savings Report

LIVE

Analytics Dashboard

COMING SOON

Playbooks

Custom Tagging

Fully customizable tagging system with no predefined schema. Tag requests by manager, organization, team, contract, environment, module — whatever hierarchy your business needs.

Workspaces

Organize LLM connectivity by tagset. Each workspace gets its own view of cost, usage, and governance — isolated but centrally managed. Multi-tenant with local admin control, roles (admin, member, observer), full audit trail.

Architecture

One proxy. Every provider. Zero code changes.

Oolyx sits between your applications and LLM providers as an on-premises reverse proxy. All traffic flows through one place — governed, optimized, and audited.

DEPLOYMENT

✓Docker / Podman / Kubernetes

✓docker-compose up -d

✓4 containers: proxy, dashboard, postgres, redis

✓Runs on AWS, GCP, Azure, or bare metal

INTEGRATION

✓Swap one URL — base_url → :8787/v1

✓OpenAI-compatible API endpoint

✓IDE traffic interception (transparent)

✓No code changes to your applications

DATA & SECURITY

✓No telemetry, no SaaS dependency

✓Your PostgreSQL — full audit trail

✓Air-gappable control plane

✓HTTPS-only outbound to LLM providers

View detailed architecture diagrams →

Why Oolyx

The neutral layer that governs AI, proves the savings, and never leaves your network.

Provider-agnostic No model agenda Zero code changes Air-gappable No telemetry Zero lock-in

Only Oolyx

IDE traffic, governed at last.

Copilot, Cursor, and Claude Code make thousands of calls a day that never hit your dashboards. Oolyx governs that shadow spend — quotas, PII scrubbing, attribution — without touching a developer's workflow.

Only Oolyx

Stop expensive calls before they start.

Hard enforcement guarantees limits are respected. Quota estimation predicts bloated requests before they're sent and blocks them — so a runaway agent can't run up a bill overnight.

Proof

Every dollar reconciled.

The Gauntlet backtests savings on your real traffic; the ledger attributes each one to an action. Finance and compliance read from the same record — no estimates, no hand-waving.

Stickiness

Playbooks — governed config-as-code.

Encode your coding conventions, model policies, and agent guardrails once, version them, and inject them into every IDE, agent, and API call. The layer your whole org standardizes on — and the one that's painful to remove.

Getting Started

From zero to governed in a day

Deploy once, see everything immediately. Governance follows at your pace.

Day 1

Deploy & Connect

Deploy Oolyx, swap URLs, see your first requests flowing through the dashboard immediately. Works on AWS, GCP, or Azure — private subnets, no public exposure.

Week 1–2

Organize & Govern

Tag your applications, set up workspaces, configure quotas and PII rules, enable IDE interception. Compliance layer goes live for your first regulated workflows.

Month 1

Optimize

Run model benchmarks, review prompt intelligence, identify waste patterns. Quantify LLM spend reduction with hard numbers from your own traffic data.

Pricing

Priced against what we save you — not what we meter.

Most LLM tools bill you per request or per log — so the more you use AI, the more you pay just to watch it. Oolyx is a flat on-prem license, sized to stay smaller than the savings it proves. There's no per-token fee. We don't meter your traffic — we can't even see it.

Annual license · Unlimited seats & requests · No telemetry · Cancel by swapping one URL back

Pilot

2 months, no commitment, no card

Start free pilot

Prove it on your own data:

✓Deployed on your infrastructure

✓Full analytics dashboard

✓Month 1 observe, Month 2 optimize

✓Gauntlet savings report

✓Keep the analytics if you walk away

Common questions

How is this different from competitors?+

How do I know the savings are real?+

Does it add latency?+

Is there really no SaaS component?+

What providers do you support?+

What does the free pilot include?+

How does pricing work?+

How long does deployment take?+

Most organizations are up and running the same day. It's a Docker container — docker-compose up -d gets the proxy, dashboard, PostgreSQL, and Redis running. Point your applications to the proxy URL and traffic starts flowing immediately.

Does it affect developer productivity?+

What if I want to stop using Oolyx?+

Have a question not listed here?

Book a demo

Ready to stop overpaying for AI?

Deploy on your infrastructure. See savings in the first month.

Start with a free 2-month pilot — no commitment, no credit card.

Book a demo Contact Us

hello@oolyx.ai

Book a demo — free 2-month pilot

Every AI call, governed.Every dollar, proven.

Your AI spend is invisible — and your regulators are watching

The labs now own your AI tools.You need a layer they don't control.

See exactly what you'd save.

A savings report you can audit, line by line.

Measurable results from day one

Everything flows through one place.

Run Docker

Swap one URL

Savings start

Everything your organization needs to govern AI.Built in from day one.

Workspaces & Access Control

Quota Enforcement

PII / PHI Protection

Optimization Strategies

Gauntlet + Savings Report

Analytics Dashboard

Playbooks

Custom Tagging

Workspaces

Quota Enforcement

PII / PHI Protection

Optimization Strategies

Gauntlet + Automated Savings Report

Analytics Dashboard

Playbooks

Built for regulated industries.

Your data never leaves.

On-Prem Infrastructure

Audit Trail

Security

One proxy. Every provider. Zero code changes.

The neutral layer that governs AI, proves the savings, and never leaves your network.

IDE traffic, governed at last.

Stop expensive calls before they start.

Every dollar reconciled.

Playbooks — governed config-as-code.

From zero to governed in a day

Deploy & Connect

Organize & Govern

Optimize

Priced against what we save you — not what we meter.

Common questions

Ready to stop overpaying for AI?

Every AI call, governed.
Every dollar, proven.

The labs now own your AI tools.
You need a layer they don't control.

Everything your organization needs to govern AI.
Built in from day one.