Overview
ContextCrumb compresses prose-heavy context before it reaches an LLM or agent. It scores the input token by token, deletes low-value tokens, and keeps the surviving text in the original order. For supported source files, it can preserve executable code exactly while compressing only comments/docstrings.
It is useful when natural-language context is too verbose for a model context window: docs, notes, issue threads, transcripts, logs, research dumps, prompts, conversation history, subagent output, or natural-language tool results.
Choose Your Path
ContextCrumb serves two common workflows:
| If you are... | Use ContextCrumb to... | Start with |
|---|---|---|
| Building an app, pipeline, evaluation, or internal tool | Add local context compression as middleware before an LLM call, agent handoff, or prompt assembly step | Python API and Local Service |
| Working with coding agents or assistant tools | Compress prose-heavy context before it enters the agent context | Getting Started and Agent Skills / MCP |
Both paths use the same model and output shape. The difference is integration style: application developers usually call Python or HTTP APIs; assistant workflows usually call the CLI, a skill, or MCP.
What Makes It Different
ContextCrumb is not a summarizer. It does not rewrite the source into a new paragraph. It keeps the source sequence and removes padding.
That makes it a good fit for agent workflows where the model still needs names, steps, constraints, and ordering, but does not need every filler phrase.
Best Fit
- Long Markdown files and documentation
- Meeting notes and research dumps
- Issue threads and long discussions
- Logs with narrative text
- User prompts and prompt templates
- Conversation history before replaying it into another model call
- Subagent reports, planner output, and research-agent notes
- Natural-language tool results
- Tool, resource, and prompt descriptions in MCP catalogs
- Agent skills that load large local files
- Supported source files when comment/docstring compression is useful
Context Surfaces
ContextCrumb can sit anywhere a system turns natural language into model input:
| Context surface | Example |
|---|---|
| Retrieved documents | Compress long docs before adding them to a RAG prompt |
| User prompt | Compress an oversized free-form request before routing or planning |
| Conversation history | Compress older turns before replaying them into another model call |
| Subagent output | Compress research or planning reports before handing them to a main agent |
| Tool output | Compress prose fields in search results, issue threads, or logs |
| Tool catalogs | Compress verbose MCP descriptions before an agent sees the catalog |
Avoid Or Use Carefully
Do not use compressed output as the only source of truth when exact text matters:
- Unsupported source code, or source code when exact comments/docstrings matter
- Config files
- Diffs and patches
- JSON, YAML, TOML, XML, or schemas
- Shell commands people may copy exactly
- Legal, compliance, policy, or contract text
For supported Python, JavaScript, TypeScript, JSX, TSX, Go, and Rust files, auto mode preserves executable source exactly and compresses only comments/docstrings. For structured data and unsupported code, load the raw file or use raw/refuse file modes.
For structured tool output, preserve the structure and compress only natural-language values. Do not compress raw JSON as one string if keys, ids, numbers, or schema shape matter.
Main Workflows
| Workflow | Start with |
|---|---|
| Try it from the terminal | Getting Started |
| Use it as middleware in an app | Python API |
| Compress many files | Batch API |
| Compress prompts, history, or tool results | App Integration Examples |
| Add it to agent-assisted workflows | Agent Skills / MCP |
| Expose MCP tools | MCP Server |
| Shrink verbose MCP catalogs | MCP Shrink Proxy |
| Keep one warm model process | Local Service |