How streaming works
OpenCompress supports streaming via server-sent events (SSE), identical to the OpenAI API. Set"stream": true in your request body.
Important: Compression happens before the stream starts. The compression step adds a small fixed latency (~100-300ms), but once streaming begins, tokens arrive at the same speed as a direct API call.
Request
Response format
Each SSE event contains a JSON chunk:usage data with token counts. Billing is calculated after the stream completes.