Overview

ContextCrumb compresses prose-heavy context before it reaches an LLM or agent. It scores the input token by token, deletes low-value tokens, and keeps the surviving text in the original order. For supported source files, it can preserve executable code exactly while compressing only comments/docstrings.
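To make the score-delete-keep-order idea concrete, here is a toy sketch. It is not ContextCrumb's scorer: the `FILLER` word list, `score`, and `prune` are hypothetical stand-ins for a learned model, and splitting on whitespace is a simplification of real tokenization.

```python
# Toy illustration of extractive pruning: score each token, drop the
# low-scoring ones, and emit the survivors in their original order.
# FILLER, score, and prune are hypothetical names for this sketch only.

FILLER = {"basically", "just", "really", "very", "actually", "simply"}


def score(token: str) -> float:
    """Return 0.0 for known filler words and 1.0 for everything else."""
    return 0.0 if token.lower().strip(".,;:!?") in FILLER else 1.0


def prune(text: str, threshold: float = 0.5) -> str:
    """Keep tokens scoring at or above the threshold, preserving order."""
    tokens = text.split()
    kept = [tok for tok in tokens if score(tok) >= threshold]
    return " ".join(kept)


if __name__ == "__main__":
    print(prune("You basically just need to restart the worker before the deploy."))
```

Nothing is reworded or reordered here; the output is a subsequence of the input tokens, which is the property the paragraph above describes.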

It is useful when natural-language context is too verbose for a model's context window, for example docs, notes, issue threads, transcripts, logs, research dumps, prompts, conversation history, subagent output, or natural-language tool results.

Choose Your Path

ContextCrumb serves two common workflows:

| If you are... | Use ContextCrumb to... | Start with |
| --- | --- | --- |
| Building an app, pipeline, evaluation, or internal tool | Add local context compression as middleware before an LLM call, agent handoff, or prompt assembly step | Python API and Local Service |
| Working with coding agents or assistant tools | Compress prose-heavy context before it enters the agent context | Getting Started and Agent Skills / MCP |

Both paths use the same model and output shape. The difference is integration style: application developers usually call Python or HTTP APIs; assistant workflows usually call the CLI, a skill, or MCP.

What Makes It Different

ContextCrumb is not a summarizer. It does not rewrite the source into a new paragraph. It keeps the source sequence and removes padding.

That makes it a good fit for agent workflows where the model still needs names, steps, constraints, and ordering, but does not need every filler phrase.

Best Fit

  • Long Markdown files and documentation
  • Meeting notes and research dumps
  • Issue threads and long discussions
  • Logs with narrative text
  • User prompts and prompt templates
  • Conversation history before replaying it into another model call
  • Subagent reports, planner output, and research-agent notes
  • Natural-language tool results
  • Tool, resource, and prompt descriptions in MCP catalogs
  • Agent skills that load large local files
  • Supported source files when comment/docstring compression is useful

Context Surfaces

ContextCrumb can sit anywhere a system turns natural language into model input:

| Context surface | Example |
| --- | --- |
| Retrieved documents | Compress long docs before adding them to a RAG prompt |
| User prompt | Compress an oversized free-form request before routing or planning |
| Conversation history | Compress older turns before replaying them into another model call |
| Subagent output | Compress research or planning reports before handing them to a main agent |
| Tool output | Compress prose fields in search results, issue threads, or logs |
| Tool catalogs | Compress verbose MCP descriptions before an agent sees the catalog |
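As one example of these surfaces, the conversation-history case might be wired up roughly as follows. This is a sketch under assumptions: the `{"role", "content"}` message shape, the `compress_history` helper, and the `compress` callable are all hypothetical, and `compress` stands in for whatever compression call your integration actually uses.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]  # assumed shape: {"role": ..., "content": ...}


def compress_history(
    messages: List[Message],
    compress: Callable[[str], str],
    keep_recent: int = 2,
) -> List[Message]:
    """Compress older turns and leave the most recent ones untouched.

    System messages are also left as-is, on the assumption that
    instructions should reach the model verbatim.
    """
    cutoff = max(len(messages) - keep_recent, 0)
    result: List[Message] = []
    for index, message in enumerate(messages):
        if index < cutoff and message["role"] != "system":
            # Older, non-system turn: replace its content with the compressed text.
            result.append({**message, "content": compress(message["content"])})
        else:
            # Recent or system turn: copy unchanged.
            result.append(dict(message))
    return result
```

Keeping the latest turns raw is a design choice for this sketch: the most recent exchange usually carries the details the next model call depends on most.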

Avoid Or Use Carefully

Do not use compressed output as the only source of truth when exact text matters:

  • Unsupported source code, or source code when exact comments/docstrings matter
  • Config files
  • Diffs and patches
  • JSON, YAML, TOML, XML, or schemas
  • Shell commands people may copy exactly
  • Legal, compliance, policy, or contract text

For supported Python, JavaScript, TypeScript, JSX, TSX, Go, and Rust files, auto mode preserves executable source exactly and compresses only comments/docstrings. For structured data and unsupported code, load the raw file or use raw/refuse file modes.
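To illustrate what "preserve executable source exactly and compress only comments" can mean, here is a minimal sketch for Python files using the standard-library `tokenize` module. It is an illustration of the idea, not ContextCrumb's implementation: `compress_comments` and the `compress` callable are hypothetical, and docstrings are not handled.

```python
import io
import tokenize
from typing import Callable


def compress_comments(source: str, compress: Callable[[str], str]) -> str:
    """Rewrite only '#' comments in Python source; all other text is untouched."""
    lines = source.splitlines(keepends=True)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.COMMENT or tok.string.startswith("#!"):
            continue  # skip non-comments and shebang lines
        row, col = tok.start
        line = lines[row - 1]
        body = line.rstrip("\r\n")
        ending = line[len(body):]  # keep the original line ending
        text = tok.string.lstrip("#").strip()
        new_comment = ("# " + compress(text)).rstrip()
        # A comment always runs to the end of its line, so everything
        # before its start column is code (or indentation) and is kept as-is.
        lines[row - 1] = body[:col] + new_comment + ending
    return "".join(lines)
```

Because the tokenizer distinguishes real comments from `#` characters inside string literals, the code portion of each line is copied byte for byte. Directive comments such as `# noqa` or `# type:` would need the same exemption as the shebang line.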

For structured tool output, preserve the structure and compress only natural-language values. Do not compress raw JSON as one string if keys, ids, numbers, or schema shape matter.
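A rough sketch of that approach for parsed JSON follows. The helper name `compress_values`, the `compress` callable, and the word-count heuristic for deciding what counts as prose are assumptions made for illustration; a real integration might instead name the fields to compress explicitly.

```python
from typing import Any, Callable


def compress_values(
    node: Any,
    compress: Callable[[str], str],
    min_words: int = 8,
) -> Any:
    """Walk parsed JSON and compress only long string values.

    Keys, numbers, booleans, nulls, and short strings (ids, labels)
    are returned unchanged, so the shape of the data is preserved.
    """
    if isinstance(node, dict):
        return {key: compress_values(value, compress, min_words) for key, value in node.items()}
    if isinstance(node, list):
        return [compress_values(item, compress, min_words) for item in node]
    if isinstance(node, str) and len(node.split()) >= min_words:
        return compress(node)  # treated as prose under this heuristic
    return node
```

The result can be serialized again with `json.dumps`, so downstream code still sees the same keys, ids, and numeric fields.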

Main Workflows

| Workflow | Start with |
| --- | --- |
| Try it from the terminal | Getting Started |
| Use it as middleware in an app | Python API |
| Compress many files | Batch API |
| Compress prompts, history, or tool results | App Integration Examples |
| Add it to agent-assisted workflows | Agent Skills / MCP |
| Expose MCP tools | MCP Server |
| Shrink verbose MCP catalogs | MCP Shrink Proxy |
| Keep one warm model process | Local Service |