Thoughtbox

A reasoning ledger for AI agents. Thoughtbox is an MCP server that provides structured reasoning tools, enabling agents to think step-by-step, branch into alternative explorations, revise earlier conclusions, and maintain a persistent record of their cognitive process.

Unlike ephemeral chain-of-thought prompting, Thoughtbox creates a durable reasoning chain — a ledger of thoughts that can be visualized, exported, and analyzed. Each thought is a node in a graph structure supporting forward thinking, backward planning, branching explorations, mid-course revisions, and autonomous critique via MCP sampling.

Local-First: Thoughtbox runs entirely on your machine. All reasoning data is stored locally at ~/.thoughtbox/ by default — nothing is sent to external servers. Your thought processes remain private and under your control.

Client Compatibility

Note: Thoughtbox is currently optimized for use with Claude Code. We are actively working on supporting additional MCP clients, but due to the wide variation in capability support across the Model Context Protocol ecosystem — server features (prompts, resources, tools), client features (roots, sampling, elicitation), and behaviors like listChanged notifications — we're having to implement custom adaptations for many clients. See the gateway tool for one such adaptation.

If you're using a client other than Claude Code and encounter issues, please open an issue describing your client and the problem — this helps us prioritize compatibility work.

Progressive Disclosure

Thoughtbox uses a staged tool disclosure system to guide agents through proper initialization:

Stage	Tools Available	Trigger
Stage 0	`init`, `thoughtbox_gateway`	Connection start
Stage 1	+ `thoughtbox_cipher`, `session`	`init(start_new)` or `init(load_context)`
Stage 2	+ `thoughtbox`, `notebook`	`thoughtbox_cipher` call
Stage 3	+ `mental_models`, `export_reasoning_chain`	Domain activation

This ensures agents establish proper session context before accessing advanced reasoning tools.

Gateway Tool (Always Available)

The thoughtbox_gateway tool is always enabled and provides a routing mechanism for clients that don't refresh tool lists mid-turn (e.g., Claude Code over streaming HTTP). It routes to all handlers internally while enforcing stage requirements:

// Call operations through gateway without waiting for tool list refresh
{ operation: "get_state" }                    // → init handler
{ operation: "start_new", args: {...} }       // → init handler, advances to Stage 1
{ operation: "cipher" }                       // → cipher content, advances to Stage 2
{ operation: "thought", args: {...} }         // → thoughtbox handler (requires Stage 2)

The gateway returns clear error messages when operations are called at the wrong stage.

Observatory UI showing a reasoning session with 14 thoughts and a branch exploration (purple nodes 13-14) forking from thought 5.

Core Concepts

The Reasoning Ledger

Thoughtbox treats reasoning as data, not just process. Every thought is:

Numbered: Logical position in the reasoning chain (supports non-sequential addition)
Timestamped: When the thought was recorded
Linked: Connected to previous thoughts, branch origins, or revised thoughts
Persistent: Stored in sessions that survive across conversations
Exportable: Full reasoning chains can be exported as JSON or Markdown

This creates an auditable trail of how conclusions were reached — invaluable for debugging agent behavior, understanding decision-making, and improving reasoning strategies.

Thinking Patterns

Thoughtbox supports multiple reasoning strategies out of the box:

Pattern	Description	Use Case
Forward Thinking	Sequential 1→2→3→N progression	Exploration, discovery, open-ended analysis
Backward Thinking	Start at goal (N), work back to start (1)	Planning, system design, working from known goals
Branching	Fork into parallel explorations (A, B, C...)	Comparing alternatives, A/B scenarios
Revision	Update earlier thoughts with new information	Error correction, refined understanding
Interleaved	Alternate reasoning with tool actions	Tool-oriented tasks, adaptive execution

See the Patterns Cookbook for comprehensive examples.

Features

1. Init Tool — Session Management (Stage 0)

The entry point for establishing session context.

Operations:

get_state — Check current session state
list_sessions — Browse available sessions (filter by project, task, aspect)
start_new — Begin a new work session with project/task/aspect scoping
load_context — Resume an existing session
navigate — Move between project/task/aspect levels
list_roots — Query MCP roots from client
bind_root — Bind a root directory as project scope

2. Thoughtbox Cipher — Deep Thinking Primer (Stage 1)

A priming tool that prepares agents for structured reasoning. Calling this tool unlocks Stage 2 tools.

3. Session Tool — Persistence & Export (Stage 1)

Manage reasoning sessions with full persistence.

Operations:

list — List all sessions
get — Retrieve session details
export — Export session as JSON or Markdown
analyze — Get session statistics and insights

4. Thoughtbox Tool — Structured Reasoning (Stage 2)

The core tool for step-by-step thinking with full graph capabilities.

// Simple forward thinking
{ thought: "First, let's identify the problem...", thoughtNumber: 1, totalThoughts: 5, nextThoughtNeeded: true }

// Branching to explore alternatives
{ thought: "Option A: Use PostgreSQL...", thoughtNumber: 3, branchFromThought: 2, branchId: "sql-path", ... }

// Revising an earlier conclusion
{ thought: "Actually, the root cause is...", thoughtNumber: 7, isRevision: true, revisesThought: 3, ... }

// Request autonomous critique via MCP sampling
{ thought: "This approach seems optimal...", thoughtNumber: 5, critique: true, ... }

Parameters:

thought (string): The current thinking step
thoughtNumber (integer): Logical position in the chain
totalThoughts (integer): Estimated total (adjustable)
nextThoughtNeeded (boolean): Continue thinking?
branchFromThought (integer): Fork point for branching
branchId (string): Branch identifier
isRevision (boolean): Marks revision of earlier thought
revisesThought (integer): Which thought is being revised
critique (boolean): Request autonomous LLM critique via MCP sampling API

Autonomous Critique: When critique: true, the server uses the MCP sampling/createMessage API to request an external LLM analysis of the current thought. The critique is returned in the response and persisted with the thought.

5. Observatory — Real-Time Visualization

A built-in web UI for watching reasoning unfold in real-time.

Features:

Live Graph: Watch thoughts appear as they're added
Snake Layout: Compact left-to-right flow with row wrapping
Hierarchical Branches: Branches collapse into clickable stub nodes (A, B, C...)
Navigation: Click stubs to explore branches, back button to return
Detail Panel: Click any node to view full thought content
Multi-Session: Switch between active reasoning sessions

Access: The Observatory is available at http://localhost:1729 when the server is running.

6. Mental Models — Reasoning Frameworks (Stage 2)

15 structured mental models that provide process scaffolds for different problem types.

Available Models:

rubber-duck — Explain to clarify thinking
five-whys — Root cause analysis
pre-mortem — Anticipate failures
steelmanning — Strengthen opposing arguments
fermi-estimation — Order-of-magnitude reasoning
trade-off-matrix — Multi-criteria decisions
decomposition — Break down complexity
inversion — Solve by avoiding failure
And 7 more...

Operations:

get_model — Retrieve a specific model's prompt
list_models — List all models (filter by tag)
list_tags — Available categories (debugging, planning, decision-making, etc.)

7. Notebook — Literate Programming (Stage 2)

Interactive notebooks combining documentation with executable JavaScript/TypeScript.

Features:

Isolated execution environments per notebook
Full package.json support with dependency installation
Sequential Feynman template for deep learning workflows
Export to .src.md format

Operations: create, add_cell, run_cell, export, list, load, update_cell

Installation

Quick Start

npx -y @kastalien-research/thoughtbox

MCP Client Configuration

Claude Code

Add to your ~/.claude/settings.json or project .claude/settings.json:

{
  "mcpServers": {
    "thoughtbox": {
      "command": "npx",
      "args": ["-y", "@kastalien-research/thoughtbox"]
    }
  }
}

Cline / VS Code

Add to your MCP settings or .vscode/mcp.json:

{
  "servers": {
    "thoughtbox": {
      "command": "npx",
      "args": ["-y", "@kastalien-research/thoughtbox"]
    }
  }
}

Usage Examples

Forward Thinking — Problem Analysis

Thought 1: "Users report slow checkout. Let's analyze..."
Thought 2: "Data shows 45s average, target is 10s..."
Thought 3: "Root causes: 3 API calls, no caching..."
Thought 4: "Options: Redis cache, query optimization, parallel calls..."
Thought 5: "Recommendation: Implement Redis cache for product data"

Backward Thinking — System Design

Thought 8: [GOAL] "System handles 10k req/s with <100ms latency"
Thought 7: "Before that: monitoring and alerting operational"
Thought 6: "Before that: resilience patterns implemented"
Thought 5: "Before that: caching layer with invalidation"
...
Thought 1: [START] "Current state: 1k req/s, 500ms latency"

Branching — Comparing Alternatives

Thought 4: "Need to choose database architecture..."

Branch A (thought 5): branchId="sql-path"
  "PostgreSQL: ACID compliance, mature tooling, relational integrity"

Branch B (thought 5): branchId="nosql-path"
  "MongoDB: Flexible schema, horizontal scaling, document model"

Thought 6: [SYNTHESIS] "Use PostgreSQL for transactions, MongoDB for analytics"

Environment Variables

Variable	Description	Default
`DISABLE_THOUGHT_LOGGING`	Suppress thought logging to stderr	`false`
`THOUGHTBOX_DATA_DIR`	Base directory for persistent storage	`~/.thoughtbox`
`THOUGHTBOX_PROJECT`	Project scope for session isolation	`_default`
`THOUGHTBOX_TRANSPORT`	Transport type (`stdio` or `http`)	`http`

Development

# Install dependencies
npm install

# Build
npm run build

# Development with hot reload
npm run dev

# Run the server
npm start

Docker Compose

A docker-compose.yml is included to run the HTTP MCP server and the Observatory UI together.

docker compose up --build

HTTP MCP + health: http://localhost:1731/health
Observatory UI/WebSocket: http://localhost:1729/
OpenTelemetry Collector: ports 4317 (gRPC) and 4318 (HTTP) for Claude Code telemetry
Persistent data: stored in named volume thoughtbox-data at /data/thoughtbox (override with env vars like THOUGHTBOX_PROJECT or THOUGHTBOX_DATA_DIR).

Architecture

src/
├── index.ts              # Entry point (stdio/HTTP transport selection)
├── server-factory.ts     # MCP server factory with tool registration
├── tool-registry.ts      # Progressive disclosure (stage-based tool enabling)
├── tool-descriptions.ts  # Stage-specific tool descriptions
├── thought-handler.ts    # Thoughtbox tool logic with critique support
├── gateway/              # Always-on routing tool
│   ├── gateway-handler.ts  # Routes to handlers with stage enforcement
│   └── index.ts          # Module exports
├── init/                 # Init workflow and state management
│   ├── tool-handler.ts   # Init tool operations
│   └── state-manager.ts  # Session state persistence
├── sessions/             # Session tool handler
├── sampling/             # Autonomous critique via MCP sampling
│   └── handler.ts        # SamplingHandler for LLM critique requests
├── persistence/          # Storage layer
│   ├── storage.ts        # InMemoryStorage with LinkedThoughtStore
│   └── filesystem-storage.ts  # FileSystemStorage with atomic writes
├── observatory/          # Real-time visualization
│   ├── ui/               # Self-contained HTML/CSS/JS
│   ├── ws-server.ts      # WebSocket server for live updates
│   └── emitter.ts        # Event emission for thought changes
├── mental-models/        # 15 reasoning frameworks
├── notebook/             # Literate programming engine
└── resources/            # Documentation and patterns cookbook

Storage

Thoughtbox supports two storage backends:

InMemoryStorage: Default for development, uses LinkedThoughtStore for O(1) thought lookups
FileSystemStorage: Persistent storage with atomic writes and project isolation

Data is stored at ~/.thoughtbox/ by default:

~/.thoughtbox/
├── config.json           # Global configuration
└── projects/
    └── {project}/
        └── sessions/
            └── {date}/
                └── {session-id}/
                    ├── manifest.json
                    └── {thought-number}.json

Contributing

We welcome contributions! See CONTRIBUTING.md for:

Development setup
Commit conventions (optimized for thick_read code comprehension)
Testing with agentic scripts
Pull request process

License

MIT License — free to use, modify, and distribute.

Name		Name	Last commit message	Last commit date
Latest commit History 314 Commits
.github		.github
.husky		.husky
benchmarks		benchmarks
dgm-specs		dgm-specs
docs		docs
observability		observability
otel-collector		otel-collector
public		public
scripts		scripts
self-improvement		self-improvement
src		src
templates		templates
tests		tests
.cursorignore		.cursorignore
.dockerignore		.dockerignore
.env.example		.env.example
.gitignore		.gitignore
.npmignore		.npmignore
AGENTS.md		AGENTS.md
CHANGELOG.md		CHANGELOG.md
CLAUDE.md		CLAUDE.md
CONTRIBUTING.md		CONTRIBUTING.md
Dockerfile		Dockerfile
GOVERNANCE.md		GOVERNANCE.md
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md
docker-compose.yml		docker-compose.yml
llms-install.md		llms-install.md
observatory-test.html		observatory-test.html
package-lock.json		package-lock.json
package.json		package.json
server.json		server.json
tsconfig.json		tsconfig.json

License

Kastalien-Research/thoughtbox

Folders and files

Latest commit

History

Repository files navigation

Thoughtbox

Client Compatibility

Progressive Disclosure

Gateway Tool (Always Available)

Core Concepts

The Reasoning Ledger

Thinking Patterns

Features

1. Init Tool — Session Management (Stage 0)

2. Thoughtbox Cipher — Deep Thinking Primer (Stage 1)

3. Session Tool — Persistence & Export (Stage 1)

4. Thoughtbox Tool — Structured Reasoning (Stage 2)

5. Observatory — Real-Time Visualization

6. Mental Models — Reasoning Frameworks (Stage 2)

7. Notebook — Literate Programming (Stage 2)

Installation

Quick Start

MCP Client Configuration

Claude Code

Cline / VS Code

Usage Examples

Forward Thinking — Problem Analysis

Backward Thinking — System Design

Branching — Comparing Alternatives

Environment Variables

Development

Docker Compose

Architecture

Storage

Contributing

License

About

Topics

Resources

License

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 6

Uh oh!

Languages

Packages