Getting Started
Install ContextCrumb with Python 3.10 or newer:
pip install contextcrumb
The default backend is ONNX. You do not need PyTorch or Transformers for the default install.
Pick An Integration
Start with the smallest path that matches what you are doing:
| Goal | Command or API |
|---|---|
| Compress one file from a terminal | contextcrumb load notes.md |
| Compress supported code comments/docstrings | contextcrumb load script.py |
| Add compression before an LLM API call | ContextCompressor().compress(text) |
| Compress a prompt, history, or subagent report | compressor.compress(text) |
| Compress prose fields in tool output | Walk the structure and compress only text values |
| Compress many files | contextcrumb batch docs --glob "*.md" --out compressed-docs |
| Share one warm model across tools | contextcrumb service start |
| Let an agent call compression tools | contextcrumb-mcp |
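For the "prose fields in tool output" row, the walk can be sketched as a small recursive helper. This is a minimal sketch, not part of ContextCrumb: `compress_text_values` and the `min_chars` cutoff are hypothetical names chosen for illustration, and the compression step is passed in as a plain callable so the helper does not depend on any particular API.

```python
from typing import Any, Callable


def compress_text_values(value: Any, compress: Callable[[str], str], min_chars: int = 80) -> Any:
    """Return a copy of `value` with long string values passed through `compress`.

    Hypothetical helper: `min_chars` is an assumed cutoff, not a ContextCrumb setting.
    """
    if isinstance(value, str):
        # Short strings are usually IDs, paths, or enum values; leave them alone.
        return compress(value) if len(value) >= min_chars else value
    if isinstance(value, dict):
        # Keys are never compressed, only values.
        return {key: compress_text_values(item, compress, min_chars) for key, item in value.items()}
    if isinstance(value, list):
        return [compress_text_values(item, compress, min_chars) for item in value]
    # Numbers, booleans, None, and other types pass through unchanged.
    return value
```

With ContextCrumb installed, the callable could be something like `lambda text: compressor.compress(text).text`, using the `ContextCompressor` API shown in this guide.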
First Compression
Create a text file named notes.txt:
Agents spend context on notes, logs, tickets, docs, and tool descriptions. Those files contain useful facts, but they also carry filler phrases and repeated wording.
Compress it:
contextcrumb load notes.txt
The load command prints only the compressed text, so agents and shell scripts can capture it directly.
For application code, use the Python API:
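from contextcrumb import ContextCompressor

# A default-constructed compressor uses the default ONNX backend.
compressor = ContextCompressor()
result = compressor.compress("Long prose-heavy context about project decisions and constraints.")
# result.text holds the compressed string.
print(result.text)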
Inspect Before Trusting
Use inspect to see what happened without dumping the whole compressed file:
contextcrumb inspect notes.txt
Use diff to see deleted tokens inline:
contextcrumb diff notes.txt
Deleted tokens are marked like this:
kept words [-deleted words-] kept words
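If a script needs to review what was removed, the marker format above is simple enough to parse. The following is a rough sketch that assumes the diff output uses exactly the `[-...-]` markers shown and that the source text does not itself contain those sequences; `deleted_spans` and `kept_text` are hypothetical helper names.

```python
import re

# Matches one deleted span, e.g. "[-deleted words-]". Non-greedy so that
# adjacent spans are not merged into one match.
DELETED = re.compile(r"\[-(.*?)-\]", re.DOTALL)


def deleted_spans(diff_text: str) -> list[str]:
    """Return the text inside every [-...-] marker, in order."""
    return DELETED.findall(diff_text)


def kept_text(diff_text: str) -> str:
    """Return the diff with deleted spans removed and whitespace collapsed."""
    without_deleted = DELETED.sub("", diff_text)
    return " ".join(without_deleted.split())


sample = "kept words [-deleted words-] kept words"
print(deleted_spans(sample))  # ['deleted words']
print(kept_text(sample))      # kept words kept words
```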
Tune Compression
By default, ContextCrumb keeps tokens whose aggregated KEEP probability is at or above 0.5, the binary classifier decision boundary.
Use a fixed budget when you need predictable output size:
contextcrumb load notes.txt --target-keep-ratio 0.5
contextcrumb load notes.txt --target-keep-ratio 0.75
Use a custom threshold when you want stricter or looser direct probability control:
contextcrumb load notes.txt --threshold 0.6
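The difference between the two controls is easier to see on toy numbers. The sketch below is illustrative only: the probabilities are made up, and selecting the highest-probability tokens with a rounded-up budget is an assumption about how a keep ratio could be applied, not a description of ContextCrumb's internals.

```python
import math


def keep_by_threshold(probs: list[float], threshold: float = 0.5) -> list[bool]:
    """Keep every token whose KEEP probability is at or above the threshold."""
    return [p >= threshold for p in probs]


def keep_by_ratio(probs: list[float], target_keep_ratio: float) -> list[bool]:
    """Keep a fixed share of tokens, highest probability first (assumed strategy)."""
    budget = math.ceil(len(probs) * target_keep_ratio)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = set(ranked[:budget])
    return [i in chosen for i in range(len(probs))]


# Made-up per-token KEEP probabilities for an eight-token input.
probs = [0.9, 0.2, 0.55, 0.7, 0.4, 0.95, 0.1, 0.6]

print(sum(keep_by_threshold(probs)))        # 5 tokens kept at the default 0.5
print(sum(keep_by_threshold(probs, 0.6)))   # 4 tokens kept at a stricter 0.6
print(sum(keep_by_ratio(probs, 0.75)))      # 6 tokens kept, regardless of scores
```

With a threshold, the output size depends on how confident the model is about each token. With a ratio, the size is fixed and the scores only decide which tokens fill the budget.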
JSON Output
Use JSON when another tool needs stats or token counts:
contextcrumb load notes.txt --json
The compressed text is in the top-level text field. Useful stats live under the stats object, including token counts, keep ratios, mode, backend, and model window count.
{
  "text": "compressed output",
  "stats": {
    "input_tokens": 100,
    "kept_tokens": 58,
    "deleted_tokens": 42,
    "token_keep_ratio": 0.58,
    "mode": "threshold"
  }
}
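A calling tool might consume that JSON as follows. This is a minimal sketch that relies only on the fields shown in the example above; `summarize` and `load_json` are hypothetical helper names, and `load_json` simply shells out to the documented `contextcrumb load ... --json` command.

```python
import json
import subprocess


def summarize(payload: str) -> dict:
    """Parse `--json` output and derive a few numbers from the stats shown above."""
    data = json.loads(payload)
    stats = data["stats"]
    input_tokens = stats["input_tokens"]
    kept_tokens = stats["kept_tokens"]
    return {
        "text": data["text"],
        "saved_tokens": input_tokens - kept_tokens,
        # Guard against empty input so the ratio never divides by zero.
        "keep_ratio": kept_tokens / input_tokens if input_tokens else 0.0,
        "mode": stats["mode"],
    }


def load_json(path: str) -> dict:
    """Run the CLI on one file and summarize its JSON output (requires contextcrumb on PATH)."""
    proc = subprocess.run(
        ["contextcrumb", "load", path, "--json"],
        capture_output=True, text=True, check=True,
    )
    return summarize(proc.stdout)
```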
Use --receipt when a shell script or agent should keep compressed text clean on
stdout but still show what was saved:
contextcrumb load notes.txt --receipt
Plain output writes the receipt to stderr. JSON output includes it as a top-level
receipt field.
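Because plain output sends the receipt to stderr, a wrapper can capture the two streams separately. The helper below is a small sketch; `run_split` is a hypothetical name, and the exact receipt format is not assumed here, so it is returned as an opaque string.

```python
import subprocess


def run_split(cmd: list[str]) -> tuple[str, str]:
    """Run a command and return (stdout, stderr) as separate strings.

    Raises subprocess.CalledProcessError if the command exits non-zero.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return proc.stdout, proc.stderr


# Example call (requires contextcrumb on PATH):
# compressed, receipt = run_split(["contextcrumb", "load", "notes.txt", "--receipt"])
# `compressed` can go straight into a prompt; `receipt` can be logged.
```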
File Safety
ContextCrumb uses compression.content_mode = "auto" by default. Prose files are
compressed normally. Supported code files use code-aware compression: executable
source is preserved exactly, and only comments/docstrings are compressed.
Initial code-aware languages are Python, JavaScript, TypeScript, JSX, TSX, Go,
and Rust. Other syntax-sensitive file types are refused by default, including
diffs, structured configs, lockfiles, SQL, and .env files. Those files should
usually be read raw because exact tokens, structure, or commands may matter.
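A script that feeds many files to ContextCrumb may want to decide up front which ones to send and which to read raw. The sketch below mirrors the categories described above, but the extension lists are assumptions chosen for illustration; ContextCrumb's own file detection is not documented here and may differ. `plan_for` is a hypothetical helper.

```python
from pathlib import Path

# Assumed extensions for the code-aware languages listed above.
CODE_AWARE = {".py", ".js", ".ts", ".jsx", ".tsx", ".go", ".rs"}
# Assumed extensions for diffs, structured configs, lockfiles, SQL, and env files.
READ_RAW = {".diff", ".patch", ".json", ".yaml", ".yml", ".toml", ".lock", ".sql", ".env"}


def plan_for(path: str) -> str:
    """Return 'code-aware', 'read-raw', or 'prose' for a file path."""
    p = Path(path)
    name = p.name.lower()
    # Dotfiles such as ".env" have no suffix, so check the name directly.
    if name == ".env" or name.startswith(".env."):
        return "read-raw"
    suffix = p.suffix.lower()
    if suffix in CODE_AWARE:
        return "code-aware"
    if suffix in READ_RAW:
        return "read-raw"
    return "prose"


for candidate in ["notes.md", "script.py", "Cargo.lock", ".env"]:
    print(candidate, "->", plan_for(candidate))
```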
Set or inspect persistent defaults:
contextcrumb config show
contextcrumb config set compression.content_mode auto
contextcrumb config set code.comment_target_keep_ratio 0.55
Use --force only for exploratory compression:
contextcrumb load notes.py --force
If you force compression, read the raw source before editing, quoting, copying commands, or relying on exact formatting.
Optional Extras
Install extras only when you need them:
pip install "contextcrumb[mcp]"
pip install "contextcrumb[serve]"
pip install "contextcrumb[torch]"
Use [mcp] for the MCP server, [serve] for the local HTTP service, and [torch] if you want the Torch backend.