> ## Documentation Index
> Fetch the complete documentation index at: https://opencompress.mintlify.site/llms.txt
> Use this file to discover all available pages before exploring further.

# Introduction

> The compression layer between your app and any LLM. Save 30-40% on every API call.

## What is OpenCompress?

OpenCompress is a **drop-in middleware** that sits between your application and any LLM provider. It compresses your prompts before they reach the model, reducing token usage by 30-40% while preserving output quality.

<CardGroup cols={2}>
  <Card title="Quick Start" icon="bolt" href="/quickstart">
    Get running in under 2 minutes. Change two lines of code.
  </Card>

  <Card title="How It Works" icon="gear" href="/how-it-works">
    Understand the five-layer compression pipeline.
  </Card>

  <Card title="API Reference" icon="code" href="/api-reference/overview">
    Full OpenAI-compatible endpoint documentation.
  </Card>

  <Card title="Pricing" icon="coins" href="/billing/pricing">
    Pay-for-savings model. No savings = no charge.
  </Card>
</CardGroup>

## Why OpenCompress?

Every LLM call you make contains **token waste** — filler words, redundant context, verbose formatting that models don't need. OpenCompress removes this waste before the request hits your provider.

<Steps>
  <Step title="Same API format">
    Fully compatible with OpenAI's Chat Completions API. Works with any SDK.
  </Step>

  <Step title="Any model, any provider">
    GPT-4o, Claude, Gemini, Llama, DeepSeek — we compress for all of them.
  </Step>

  <Step title="You keep 80%">
    We charge 20% of what we save you. If we don't save you money, you pay nothing extra.
  </Step>

  <Step title="Two lines to integrate">
    Change `base_url` and `api_key`. Everything else stays the same.
  </Step>
</Steps>

## How much can you save?

| Use Case                             | Typical Compression    | Monthly Savings (at \$10K spend) |
| ------------------------------------ | ---------------------- | -------------------------------- |
| RAG / retrieval-augmented generation | 40-55% input reduction | $2,400 - $3,300                  |
| Agent tool calls                     | 30-45% input reduction | $1,800 - $2,700                  |
| Chat with long context               | 35-50% input reduction | $2,100 - $3,000                  |
| Code generation                      | 25-35% input reduction | $1,500 - $2,100                  |

<Tip>
  Savings vary by prompt structure. Prompts with more natural language and repeated patterns compress best. Try it in the [Playground](https://www.opencompress.ai/playground) with your actual prompts.
</Tip>
