Local Service

The local service keeps one warm ContextCrumb model process running behind a localhost HTTP API. Use it when an application, agent, editor, MCP server, or batch job makes repeated compression calls and would otherwise pay the process-startup or model-loading cost on each call.

For a single script that already stays alive, use ContextCompressor directly instead.

The service accepts text directly, so it works for prompts, conversation history chunks, subagent reports, and tool-output fields as well as files.

Install the service dependencies:

pip install "contextcrumb[serve]"

Start

contextcrumb service start

By default the service listens on:

http://127.0.0.1:8765

Start lazily, so the HTTP server starts before the model is loaded:

contextcrumb service start --lazy-load

Allow local file reads under a specific root:

contextcrumb service start --allow-root docs

Disable file reads entirely:

contextcrumb service start --disable-file-reads

File reads are limited to allowed roots. This matters when an agent can call /compress_file; only expose directories that the agent should read.
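The root restriction can be pictured as a containment check on resolved paths. The sketch below is illustrative only and is not ContextCrumb's implementation; `is_under_allowed_root` is a hypothetical helper you might use in your own agent code to pre-check a path before calling /compress_file.

```python
from pathlib import Path


def is_under_allowed_root(path: str, allowed_roots: list[str]) -> bool:
    """Return True if `path` resolves to a location inside one of the roots.

    Hypothetical helper for illustration; not part of ContextCrumb.
    """
    # resolve() collapses ".." segments and follows symlinks, so a path such
    # as "docs/../secrets.txt" is checked at its real location.
    candidate = Path(path).resolve()
    for root in allowed_roots:
        # is_relative_to compares whole path components, so "docs2" is not
        # treated as being inside "docs". Requires Python 3.9+.
        if candidate.is_relative_to(Path(root).resolve()):
            return True
    return False
```

A check like this only mirrors the idea on the client side; the service's own validation is what actually enforces the limit.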

Use From The CLI

contextcrumb load notes.md --use-service
contextcrumb batch docs --glob "*.md" --out compressed-docs --use-service

Set a default service URL with the CONTEXTCRUMB_SERVICE_URL environment variable:

CONTEXTCRUMB_SERVICE_URL=http://127.0.0.1:8765 contextcrumb load notes.md --use-service
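A client script of your own can honor the same variable. This is a minimal sketch; `resolve_service_url` and `DEFAULT_SERVICE_URL` are hypothetical names, and the exact precedence rules the CLI applies are not described here.

```python
import os

# Default address from this page; used when the variable is unset or empty.
DEFAULT_SERVICE_URL = "http://127.0.0.1:8765"


def resolve_service_url(env=None) -> str:
    """Pick the service URL from CONTEXTCRUMB_SERVICE_URL, else the default.

    Hypothetical helper for illustration; not part of ContextCrumb.
    """
    env = os.environ if env is None else env
    url = env.get("CONTEXTCRUMB_SERVICE_URL", "").strip()
    # Drop a trailing slash so endpoint paths can be appended safely.
    return (url or DEFAULT_SERVICE_URL).rstrip("/")
```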

Status And Stop

contextcrumb service status
contextcrumb service stop

HTTP Endpoints

GET /health

Returns service state:

{
  "ok": true,
  "model_loaded": true,
  "backend": "onnx",
  "file_reads_enabled": true,
  "allowed_file_roots": ["C:/project/docs"]
}
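A caller can poll this endpoint before sending work, which may be useful after a lazy start. The sketch below uses only the `ok` and `model_loaded` fields shown above. The helper names are hypothetical, and whether a lazily started service loads the model in the background or on the first request is not stated here, so the loop gives up after a bounded number of attempts.

```python
import json
import time
import urllib.request


def is_ready(health: dict) -> bool:
    """True when the payload reports a healthy service with the model loaded."""
    return bool(health.get("ok")) and bool(health.get("model_loaded"))


def fetch_health(base_url: str = "http://127.0.0.1:8765", timeout: float = 2.0) -> dict:
    """GET /health and parse the JSON body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return json.load(resp)


def wait_until_ready(base_url: str = "http://127.0.0.1:8765", attempts: int = 30,
                     delay: float = 1.0, fetch=fetch_health) -> bool:
    """Poll /health until the model is loaded or the attempts run out."""
    for _ in range(attempts):
        try:
            if is_ready(fetch(base_url)):
                return True
        except OSError:
            # Covers connection refused and urllib's URLError while the
            # server is still starting.
            pass
        time.sleep(delay)
    return False
```

The `fetch` parameter exists so the polling logic can be exercised without a running service.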

POST /compress

curl -X POST http://127.0.0.1:8765/compress \
-H "Content-Type: application/json" \
-d '{"text":"Long prose-heavy text.","target_keep_ratio":0.5}'
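The same request can be sent from Python with the standard library. This is a sketch rather than an official client: `build_compress_request` and `compress` are hypothetical names, and the response is assumed to be a JSON object whose fields are not documented on this page, so it is returned unparsed beyond JSON decoding.

```python
import json
import urllib.request


def build_compress_request(text: str, target_keep_ratio: float = 0.5,
                           base_url: str = "http://127.0.0.1:8765") -> urllib.request.Request:
    """Build the POST /compress request shown in the curl example."""
    body = json.dumps({"text": text, "target_keep_ratio": target_keep_ratio}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/compress",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def compress(text: str, target_keep_ratio: float = 0.5,
             base_url: str = "http://127.0.0.1:8765", timeout: float = 60.0) -> dict:
    """Send the request and return the decoded JSON response (shape assumed)."""
    req = build_compress_request(text, target_keep_ratio, base_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

With the service running, `compress("Long prose-heavy text.")` would issue the same call as the curl command above.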

POST /compress_file

curl -X POST http://127.0.0.1:8765/compress_file \
-H "Content-Type: application/json" \
-d '{"path":"docs/notes.md","target_keep_ratio":0.5}'

The file path must be under an allowed root, and file reads must be enabled. The endpoint uses the configured content_mode by default. In auto mode, supported source files keep executable code exactly as written, and only comments and docstrings are compressed. To request comment-only compression explicitly, set content_mode in the request:

curl -X POST http://127.0.0.1:8765/compress_file \
-H "Content-Type: application/json" \
-d '{"path":"src/app.py","content_mode":"code-comments"}'

Unsupported syntax-sensitive file types are refused by default. Use "force": true only for exploratory compression, and read the raw source before relying on it for exact edits, quotes, commands, or schema details.
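The optional fields can be assembled so that anything left unset falls back to the service's configured defaults. The sketch below uses only the request fields that appear in the examples above (`path`, `target_keep_ratio`, `content_mode`, `force`). The function names are hypothetical, and it assumes a refusal arrives as a non-2xx HTTP status, which this page does not confirm.

```python
import json
import urllib.error
import urllib.request


def build_compress_file_request(path: str, target_keep_ratio=None, content_mode=None,
                                force: bool = False,
                                base_url: str = "http://127.0.0.1:8765") -> urllib.request.Request:
    """Build a POST /compress_file request, omitting fields that are unset."""
    payload = {"path": path}
    if target_keep_ratio is not None:
        payload["target_keep_ratio"] = target_keep_ratio
    if content_mode is not None:
        payload["content_mode"] = content_mode
    if force:
        # Only for exploratory compression of unsupported file types.
        payload["force"] = True
    return urllib.request.Request(
        f"{base_url}/compress_file",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def compress_file(path: str, timeout: float = 60.0, **options) -> dict:
    """Send the request; surface the server's message if it is rejected."""
    req = build_compress_file_request(path, **options)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        # Assumption: refusals (path outside roots, unsupported type) come
        # back as an HTTP error status with an explanatory body.
        detail = err.read().decode("utf-8", errors="replace")
        raise RuntimeError(f"compress_file rejected ({err.code}): {detail}") from err
```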

POST /shutdown

curl -X POST http://127.0.0.1:8765/shutdown

Direct Serve Mode

Use serve when you want a foreground process instead of the service manager:

contextcrumb serve --host 127.0.0.1 --port 8765 --allow-root docs