> ## Documentation Index > Fetch the complete documentation index at: https://opencompress.mintlify.site/llms.txt > Use this file to discover all available pages before exploring further. # Introduction > The compression layer between your app and any LLM. Save 30-40% on every API call. ## What is OpenCompress? OpenCompress is a **drop-in middleware** that sits between your application and any LLM provider. It compresses your prompts before they reach the model, reducing token usage by 30-40% while preserving output quality. Get running in under 2 minutes. Change two lines of code. Understand the five-layer compression pipeline. Full OpenAI-compatible endpoint documentation. Pay-for-savings model. No savings = no charge. ## Why OpenCompress? Every LLM call you make contains **token waste** — filler words, redundant context, verbose formatting that models don't need. OpenCompress removes this waste before the request hits your provider. Fully compatible with OpenAI's Chat Completions API. Works with any SDK. GPT-4o, Claude, Gemini, Llama, DeepSeek — we compress for all of them. We charge 20% of what we save you. If we don't save you money, you pay nothing extra. Change `base_url` and `api_key`. Everything else stays the same. ## How much can you save? | Use Case | Typical Compression | Monthly Savings (at \$10K spend) | | ------------------------------------ | ---------------------- | -------------------------------- | | RAG / retrieval-augmented generation | 40-55% input reduction | $2,400 - $3,300 | | Agent tool calls | 30-45% input reduction | $1,800 - $2,700 | | Chat with long context | 35-50% input reduction | $2,100 - $3,000 | | Code generation | 25-35% input reduction | $1,500 - $2,100 | Savings vary by prompt structure. Prompts with more natural language and repeated patterns compress best. Try it in the [Playground](https://www.opencompress.ai/playground) with your actual prompts.