
Getting Started

Install ContextCrumb with Python 3.10 or newer:

pip install contextcrumb

The default backend is ONNX. You do not need PyTorch or Transformers for the default install.

Pick An Integration

Start with the smallest path that matches what you are doing:

| Goal | Command or API |
| --- | --- |
| Compress one file from a terminal | contextcrumb load notes.md |
| Compress supported code comments/docstrings | contextcrumb load script.py |
| Add compression before an LLM API call | ContextCompressor().compress(text) |
| Compress a prompt, history, or subagent report | compressor.compress(text) |
| Compress prose fields in tool output | Walk the structure and compress only text values |
| Compress many files | contextcrumb batch docs --glob "*.md" --out compressed-docs |
| Share one warm model across tools | contextcrumb service start |
| Let an agent call compression tools | contextcrumb-mcp |
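
The "walk the structure" approach for tool output can be sketched as a small recursive helper. This is a minimal sketch, not part of ContextCrumb: compress_text_values and min_chars are hypothetical names, and the compressor is passed in as a plain callable so the helper assumes nothing about the library beyond the compress(text) call and .text result shown in this guide.

```python
from typing import Any, Callable


def compress_text_values(value: Any, compress: Callable[[str], str], min_chars: int = 80) -> Any:
    """Return a copy of `value` with long string values passed through `compress`.

    Dict keys, numbers, booleans, None, and short strings are returned unchanged.
    `min_chars` is a hypothetical cutoff so IDs and short labels stay exact.
    """
    if isinstance(value, str):
        return compress(value) if len(value) >= min_chars else value
    if isinstance(value, dict):
        return {key: compress_text_values(item, compress, min_chars) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return type(value)(compress_text_values(item, compress, min_chars) for item in value)
    return value
```

With the Python API from this guide, the callable would be something like lambda s: compressor.compress(s).text. Skipping short strings is a design choice: identifiers, paths, and status codes tend to be short and need to stay exact.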

First Compression

Create a text file:

notes.txt
Agents spend context on notes, logs, tickets, docs, and tool descriptions. Those files contain useful facts, but they also carry filler phrases and repeated wording.

Compress it:

contextcrumb load notes.txt

The load command prints only the compressed text, so agents and shell scripts can capture its output directly.

For application code, use the Python API:

from contextcrumb import ContextCompressor

compressor = ContextCompressor()
result = compressor.compress("Long prose-heavy context about project decisions and constraints.")

print(result.text)

Inspect Before Trusting

Use inspect to see what happened without dumping the whole compressed file:

contextcrumb inspect notes.txt

Use diff to see deleted tokens inline:

contextcrumb diff notes.txt

Deleted tokens are marked like this:

kept words [-deleted words-] kept words
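
If a script needs the kept text and the deleted spans separately, the markers can be split with a regular expression. This is a rough sketch that assumes deletions always look exactly like [-deleted words-] and never nest; split_diff is a hypothetical helper, not a ContextCrumb function.

```python
import re

# Assumption: deletions look exactly like "[-deleted words-]" and do not nest.
DELETED = re.compile(r"\[-(.*?)-\]", re.DOTALL)


def split_diff(marked: str) -> tuple[str, list[str]]:
    """Return (kept_text, deleted_spans) from diff-style output."""
    deleted = DELETED.findall(marked)
    kept = DELETED.sub("", marked)
    # Collapse the doubled spaces left behind where a span was removed.
    kept = re.sub(r"[ \t]{2,}", " ", kept).strip()
    return kept, deleted


kept, deleted = split_diff("kept words [-deleted words-] kept words")
print(kept)     # kept words kept words
print(deleted)  # ['deleted words']
```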

Tune Compression

By default, ContextCrumb keeps tokens whose aggregated KEEP probability is at or above 0.5, the binary classifier decision boundary.

Use a fixed budget when you need predictable output size:

contextcrumb load notes.txt --target-keep-ratio 0.5
contextcrumb load notes.txt --target-keep-ratio 0.75

Use a custom threshold when you want stricter or looser direct probability control:

contextcrumb load notes.txt --threshold 0.6
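
The difference between the two modes can be illustrated with a toy selection function. This is not ContextCrumb's implementation: the per-token probabilities are made up, select_tokens is a hypothetical helper, and rounding the budget up is an assumption. It only shows why a threshold yields a variable output size while a keep ratio yields a fixed one.

```python
import math


def select_tokens(tokens, keep_probs, threshold=0.5, target_keep_ratio=None):
    """Toy illustration of threshold mode versus budget mode."""
    if len(tokens) != len(keep_probs):
        raise ValueError("tokens and keep_probs must have the same length")
    if target_keep_ratio is None:
        # Threshold mode: keep every token at or above the cutoff.
        keep = [p >= threshold for p in keep_probs]
    else:
        # Budget mode: keep the highest-scoring tokens until the budget is used.
        # Rounding up is an assumption made for this sketch.
        budget = math.ceil(len(tokens) * target_keep_ratio)
        ranked = sorted(range(len(tokens)), key=lambda i: keep_probs[i], reverse=True)
        chosen = set(ranked[:budget])
        keep = [i in chosen for i in range(len(tokens))]
    # Kept tokens stay in their original order in both modes.
    return [tok for tok, flag in zip(tokens, keep) if flag]


tokens = ["the", "deploy", "really", "failed", "on", "Friday"]
probs = [0.3, 0.9, 0.2, 0.95, 0.55, 0.8]
print(select_tokens(tokens, probs))                         # ['deploy', 'failed', 'on', 'Friday']
print(select_tokens(tokens, probs, threshold=0.6))          # ['deploy', 'failed', 'Friday']
print(select_tokens(tokens, probs, target_keep_ratio=0.5))  # ['deploy', 'failed', 'Friday']
```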

JSON Output

Use JSON when another tool needs stats or token counts:

contextcrumb load notes.txt --json

The compressed text is in the text field. Useful stats live under the stats object, including token counts, keep ratios, mode, backend, and model window count.

{
"text": "compressed output",
"stats": {
"input_tokens": 100,
"kept_tokens": 58,
"deleted_tokens": 42,
"token_keep_ratio": 0.58,
"mode": "threshold"
}
}
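
A consuming tool can read the payload with the standard json module. The sketch below uses only the field names from the sample payload above; summarize is a hypothetical helper, and real output may carry additional fields.

```python
import json

# Sample payload copied from the example above.
payload = json.loads("""
{
  "text": "compressed output",
  "stats": {
    "input_tokens": 100,
    "kept_tokens": 58,
    "deleted_tokens": 42,
    "token_keep_ratio": 0.58,
    "mode": "threshold"
  }
}
""")


def summarize(result: dict) -> str:
    """Build a one-line summary from the fields shown in the sample payload."""
    stats = result["stats"]
    saved = stats["input_tokens"] - stats["kept_tokens"]
    return f"kept {stats['kept_tokens']}/{stats['input_tokens']} tokens, saved {saved} ({stats['mode']} mode)"


print(summarize(payload))  # kept 58/100 tokens, saved 42 (threshold mode)
```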

Use --receipt when a shell script or agent should keep compressed text clean on stdout but still show what was saved:

contextcrumb load notes.txt --receipt

Plain output writes the receipt to stderr. JSON output includes it as a top-level receipt field.
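
From Python, the two streams can be captured separately with subprocess. This is a minimal sketch: run_split is a hypothetical helper, and the receipt is treated as opaque text because its format is not described here.

```python
import subprocess


def run_split(command: list[str]) -> tuple[str, str]:
    """Run a command and return (stdout, stderr) as separate strings."""
    # check=True raises CalledProcessError if the command exits non-zero.
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    return completed.stdout, completed.stderr


# Example call, assuming contextcrumb is installed and notes.txt exists:
# compressed, receipt = run_split(["contextcrumb", "load", "notes.txt", "--receipt"])
```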

File Safety

ContextCrumb uses compression.content_mode = "auto" by default. Prose files are compressed normally. Supported code files use code-aware compression: executable source is preserved exactly, and only comments/docstrings are compressed.

Initial code-aware languages are Python, JavaScript, TypeScript, JSX, TSX, Go, and Rust. Other syntax-sensitive file types are refused by default, including diffs, structured configs, lockfiles, SQL, and .env files. Those files should usually be read raw because exact tokens, structure, or commands may matter.
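
A caller that wants to decide up front whether to compress a file or read it raw could apply a similar rule before invoking ContextCrumb. The sketch below is only an illustration: the extension sets are guesses based on the languages and file types named above, not ContextCrumb's actual tables, and plan_for is a hypothetical helper.

```python
from pathlib import Path

# Assumption: these extension lists are illustrative guesses based on the
# languages and file types named above, not ContextCrumb's real tables.
CODE_AWARE = {".py", ".js", ".ts", ".jsx", ".tsx", ".go", ".rs"}
READ_RAW = {".diff", ".patch", ".json", ".yaml", ".yml", ".toml", ".lock", ".sql", ".env"}


def plan_for(path: str) -> str:
    """Return 'code-aware', 'read-raw', or 'prose' for a file path."""
    p = Path(path)
    # A bare ".env" file has no suffix, so fall back to the file name.
    suffix = p.suffix.lower() or p.name.lower()
    if suffix in CODE_AWARE:
        return "code-aware"
    if suffix in READ_RAW:
        return "read-raw"
    return "prose"
```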

Set or inspect persistent defaults:

contextcrumb config show
contextcrumb config set compression.content_mode auto
contextcrumb config set code.comment_target_keep_ratio 0.55

Use --force only for exploratory compression:

contextcrumb load notes.py --force

If you force compression, read the raw source before editing, quoting, copying commands, or relying on exact formatting.

Optional Extras

Install extras only when you need them:

pip install "contextcrumb[mcp]"
pip install "contextcrumb[serve]"
pip install "contextcrumb[torch]"

Use [mcp] for the MCP server, [serve] for the local HTTP service, and [torch] if you want the Torch backend.