Local Service
The local service keeps one warm ContextCrumb model process running behind a localhost HTTP API. Use it when an application, agent, editor, MCP server, or batch job makes repeated compression calls and would otherwise pay for process startup or model loading on each call.
For a single script that already stays alive, prefer ContextCompressor directly.
The service accepts text directly, so it works for prompts, conversation history chunks, subagent reports, and tool-output fields as well as files.
Install the service dependencies:
pip install "contextcrumb[serve]"
Start
contextcrumb service start
By default the service listens on:
http://127.0.0.1:8765
Start lazily, so the HTTP server starts before the model is loaded:
contextcrumb service start --lazy-load
Allow local file reads under a specific root:
contextcrumb service start --allow-root docs
Disable file reads entirely:
contextcrumb service start --disable-file-reads
File reads are limited to allowed roots. This matters when an agent can call /compress_file; only expose directories that the agent should read.
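The allowed-root rule can be illustrated with a short containment check. This is a minimal sketch of the idea, not the service's actual implementation; the helper name is_under_root is hypothetical, and the service may apply stricter or different checks.

```python
from pathlib import Path


def is_under_root(candidate: str, allowed_root: str) -> bool:
    """Return True if candidate resolves to allowed_root or a location inside it.

    Hypothetical helper for illustration; not part of ContextCrumb.
    """
    root = Path(allowed_root).resolve()
    target = Path(candidate).resolve()
    # Resolving first collapses ".." segments and follows symlinks, so a path
    # such as "docs/../secrets.txt" is judged by where it really points.
    return target == root or root in target.parents


if __name__ == "__main__":
    print(is_under_root("docs/notes.md", "docs"))
    print(is_under_root("docs/../secrets.txt", "docs"))
```

A caller such as an agent wrapper could run a check like this before sending a path to /compress_file, so that obviously out-of-scope paths are rejected before a request is made.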
Use From The CLI
contextcrumb load notes.md --use-service
contextcrumb batch docs --glob "*.md" --out compressed-docs --use-service
Set a default service URL with the CONTEXTCRUMB_SERVICE_URL environment variable:
CONTEXTCRUMB_SERVICE_URL=http://127.0.0.1:8765 contextcrumb load notes.md --use-service
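A custom client can honor the same environment variable. The sketch below assumes the documented default address as a fallback; the function name resolve_service_url is hypothetical, and the CLI's own precedence rules are not described here, so treat this as one plausible client-side convention.

```python
import os

# Default address documented for `contextcrumb service start`.
DEFAULT_SERVICE_URL = "http://127.0.0.1:8765"


def resolve_service_url(environ=None) -> str:
    """Pick the service URL from CONTEXTCRUMB_SERVICE_URL, else the default.

    Hypothetical helper for illustration; not part of ContextCrumb.
    """
    env = os.environ if environ is None else environ
    url = env.get("CONTEXTCRUMB_SERVICE_URL", "").strip()
    # Drop a trailing slash so endpoint paths can be appended uniformly.
    return (url or DEFAULT_SERVICE_URL).rstrip("/")


if __name__ == "__main__":
    print(resolve_service_url())
```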
Status And Stop
contextcrumb service status
contextcrumb service stop
HTTP Endpoints
GET /health
Returns service state:
{
  "ok": true,
  "model_loaded": true,
  "backend": "onnx",
  "file_reads_enabled": true,
  "allowed_file_roots": ["C:/project/docs"]
}
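A client may want to wait until the model is loaded before sending work, for example after a lazy start. The sketch below polls GET /health and reads only the ok and model_loaded fields shown above. The function names are hypothetical. Whether a lazily started service loads the model on its own or only on the first compression request is not stated here, so the loop gives up after a fixed number of attempts.

```python
import json
import time
import urllib.request


def fetch_health(base_url: str = "http://127.0.0.1:8765", timeout: float = 5.0) -> dict:
    """GET /health and return the parsed JSON body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_until_loaded(fetch=fetch_health, attempts: int = 30, delay: float = 1.0) -> bool:
    """Poll until the service reports ok and model_loaded, or attempts run out.

    Hypothetical helper; `fetch` is injectable so the loop can be tested
    without a running service.
    """
    for attempt in range(attempts):
        try:
            health = fetch()
        except OSError:
            # Connection refused or timed out: the server may still be starting.
            health = {}
        if health.get("ok") and health.get("model_loaded"):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling wait_until_loaded() with no arguments polls the default address about once per second for up to 30 attempts.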
POST /compress
curl -X POST http://127.0.0.1:8765/compress \
-H "Content-Type: application/json" \
-d '{"text":"Long prose-heavy text.","target_keep_ratio":0.5}'
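The same request can be sent from Python with only the standard library. This is a minimal sketch that mirrors the curl call above; the function names are hypothetical, and because the response schema is not documented in this section, the sketch assumes a JSON body and returns it unmodified.

```python
import json
import urllib.request


def build_compress_request(text: str, target_keep_ratio: float = 0.5,
                           base_url: str = "http://127.0.0.1:8765") -> urllib.request.Request:
    """Build a POST /compress request with the same JSON body as the curl example."""
    body = json.dumps({"text": text, "target_keep_ratio": target_keep_ratio}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/compress",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def compress_text(text: str, target_keep_ratio: float = 0.5,
                  base_url: str = "http://127.0.0.1:8765", timeout: float = 60.0) -> dict:
    """Send the request and return the parsed response (assumed to be JSON)."""
    request = build_compress_request(text, target_keep_ratio, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the service running, compress_text("Long prose-heavy text.") sends the same payload as the curl command.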
POST /compress_file
curl -X POST http://127.0.0.1:8765/compress_file \
-H "Content-Type: application/json" \
-d '{"path":"docs/notes.md","target_keep_ratio":0.5}'
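The file path must be under an allowed root. If file reads are disabled, the
endpoint cannot read any file. The endpoint uses the configured content_mode by
default. In auto mode, supported source files keep executable code exactly as
written, and only comments and docstrings are compressed. To set the mode for a
single request, pass content_mode in the body: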
curl -X POST http://127.0.0.1:8765/compress_file \
-H "Content-Type: application/json" \
-d '{"path":"src/app.py","content_mode":"code-comments"}'
Unsupported syntax-sensitive file types are refused by default. Use
"force": true only for exploratory compression, and read the raw source before
making exact edits or relying on exact quotes, commands, or schema details.
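The request bodies for /compress_file can be assembled so that optional fields appear only when set. The sketch below uses just the fields shown in this section (path, target_keep_ratio, content_mode, force); the function name is hypothetical, and the assumption that omitted fields fall back to the service's configured defaults follows from the description above and should be verified against the service.

```python
import json


def build_compress_file_payload(path, target_keep_ratio=None,
                                content_mode=None, force=False) -> str:
    """Return a JSON body for POST /compress_file.

    Hypothetical helper: optional fields are left out unless set, on the
    assumption that the service then applies its configured defaults.
    """
    payload = {"path": path}
    if target_keep_ratio is not None:
        payload["target_keep_ratio"] = target_keep_ratio
    if content_mode is not None:
        payload["content_mode"] = content_mode
    if force:
        # Only for exploratory compression of unsupported file types.
        payload["force"] = True
    return json.dumps(payload)


if __name__ == "__main__":
    print(build_compress_file_payload("docs/notes.md", target_keep_ratio=0.5))
    print(build_compress_file_payload("src/app.py", content_mode="code-comments"))
```

Keeping force off by default matches the guidance above: a caller has to opt in explicitly before an unsupported file type is compressed.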
POST /shutdown
curl -X POST http://127.0.0.1:8765/shutdown
Direct Serve Mode
Use serve when you want a foreground process instead of the service manager:
contextcrumb serve --host 127.0.0.1 --port 8765 --allow-root docs