239 changes: 101 additions & 138 deletions README.md
# BaseAgent - SDK 3.0

High-performance autonomous agent for [Term Challenge](https://term.challenge). Supports multiple LLM providers with **Chutes API** (Kimi K2.5-TEE) as the default.

## Quick Start

```bash
# 1. Install dependencies
pip install -r requirements.txt

# 2. Configure Chutes API (default provider)
export CHUTES_API_TOKEN="your-token-from-chutes.ai"

# 3. Run the agent
python3 agent.py --instruction "Your task description here..."
```

The agent receives the instruction via `--instruction` and executes the task autonomously.
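A minimal sketch of how such an entry point can be wired up (illustrative only — `run_agent` is a hypothetical stand-in for the real agent loop, not the actual `agent.py`):

```python
# Hypothetical sketch of an --instruction entry point; the real agent.py
# hands the parsed instruction to the full autonomous loop instead.
import argparse


def run_agent(instruction: str) -> str:
    # Stand-in for the autonomous loop; the real agent calls the LLM here.
    return f"received: {instruction}"


def main(argv=None) -> str:
    parser = argparse.ArgumentParser(description="BaseAgent entry point")
    parser.add_argument("--instruction", required=True,
                        help="Task description the agent should execute")
    args = parser.parse_args(argv)
    return run_agent(args.instruction)


if __name__ == "__main__":
    print(main())
```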

### Alternative: OpenRouter

```bash
export LLM_PROVIDER="openrouter"
export OPENROUTER_API_KEY="your-openrouter-key"
python3 agent.py --instruction "Your task description here..."
```

## Documentation

📚 **Full documentation available in [docs/](docs/)**

### Getting Started
- [Overview](docs/overview.md) - What is BaseAgent
- [Installation](docs/installation.md) - Setup instructions
- [Quick Start](docs/quickstart.md) - First task in 5 minutes

### Core Concepts
- [Architecture](docs/architecture.md) - Technical deep-dive with diagrams
- [Configuration](docs/configuration.md) - All settings explained
- [Usage Guide](docs/usage.md) - CLI commands and examples

### Reference
- [Tools Reference](docs/tools.md) - Available tools
- [Context Management](docs/context-management.md) - Token optimization
- [Best Practices](docs/best-practices.md) - Performance tips

### LLM Providers
- [Chutes Integration](docs/chutes-integration.md) - **Default provider setup**

## Architecture Overview

```mermaid
graph TB
subgraph User
CLI["python3 agent.py --instruction"]
end

subgraph Core
Loop["Agent Loop"]
Context["Context Manager"]
end

subgraph LLM
Chutes["Chutes API (Kimi K2.5)"]
OpenRouter["OpenRouter (fallback)"]
end

subgraph Tools
Shell["shell_command"]
Files["read/write_file"]
Search["grep_files"]
end

CLI --> Loop
Loop --> Context
Loop -->|default| Chutes
Loop -->|fallback| OpenRouter
Loop --> Tools
```

## Key Features

| Feature | Description |
|---------|-------------|
| **Fully Autonomous** | No user confirmation needed |
| **LLM-Driven** | All decisions made by the language model |
| **Chutes API** | Default: Kimi K2.5-TEE (256K context, thinking mode) |
| **Prompt Caching** | 90%+ cache hit rate |
| **Context Management** | Intelligent pruning and compaction |
| **Self-Verification** | Automatic validation before completion |
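The pruning-then-compaction strategy can be sketched roughly as follows (a minimal illustration; the token accounting, protection window, and function names are assumptions, not the actual implementation):

```python
# Two-step context management sketch: prune old tool outputs first, and
# only ask the LLM for a summary if the history is still over budget.
AUTO_COMPACT_THRESHOLD = 0.85  # fraction of the context window


def count_tokens(messages):
    # Crude stand-in: roughly 4 characters per token.
    return sum(len(m["content"]) // 4 for m in messages)


def compact(messages, context_window, summarize=None):
    budget = int(context_window * AUTO_COMPACT_THRESHOLD)
    if count_tokens(messages) <= budget:
        return messages
    # Step 1: prune old tool outputs, protecting the most recent messages.
    pruned = []
    for i, m in enumerate(messages):
        if m["role"] == "tool" and i < len(messages) - 2:
            m = {**m, "content": "[output pruned]"}
        pruned.append(m)
    if count_tokens(pruned) <= budget or summarize is None:
        return pruned
    # Step 2: AI compaction — replace the middle with an LLM-written summary.
    summary = {"role": "user", "content": summarize(pruned[1:-2])}
    return pruned[:1] + [summary] + pruned[-2:]
```

Pruning is cheap and often sufficient; the LLM-written summary is only requested when pruning alone cannot bring the history under budget.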

## Environment Variables

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `CHUTES_API_TOKEN` | Yes* | - | Chutes API token |
| `LLM_PROVIDER` | No | `chutes` | `chutes` or `openrouter` |
| `LLM_MODEL` | No | `moonshotai/Kimi-K2.5-TEE` | Model identifier |
| `LLM_COST_LIMIT` | No | `10.0` | Max cost in USD |
| `OPENROUTER_API_KEY` | For OpenRouter | - | OpenRouter API key |

*\*Required for the default Chutes provider*

## Project Structure

```
baseagent/
├── agent.py                  # Entry point
├── src/
│   ├── core/
│   │   ├── loop.py           # Main agent loop
│   │   └── compaction.py     # Context management
│   ├── llm/
│   │   └── client.py         # LLM client
│   ├── config/
│   │   └── defaults.py       # Configuration
│   ├── tools/                # Tool implementations
│   └── prompts/              # System prompt
├── docs/                     # 📚 Full documentation
├── rules/                    # Development guidelines
└── astuces/                  # Implementation techniques
```
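As an illustration, resolving the provider settings from the environment variables documented above might look like this (a hedged sketch — `resolve_config`, its defaults dictionary, and the error messages are hypothetical, not the project's actual config code):

```python
# Hypothetical sketch of env-driven provider resolution; defaults mirror
# the Environment Variables table, everything else is illustrative.
import os

DEFAULTS = {
    "LLM_PROVIDER": "chutes",
    "LLM_MODEL": "moonshotai/Kimi-K2.5-TEE",
    "LLM_COST_LIMIT": "10.0",
}


def resolve_config(env=None):
    # Overlay the process environment (or a test dict) on the defaults.
    env = {**DEFAULTS, **(env if env is not None else dict(os.environ))}
    provider = env["LLM_PROVIDER"]
    if provider == "chutes" and not env.get("CHUTES_API_TOKEN"):
        raise RuntimeError("CHUTES_API_TOKEN is required for the Chutes provider")
    if provider == "openrouter" and not env.get("OPENROUTER_API_KEY"):
        raise RuntimeError("OPENROUTER_API_KEY is required for OpenRouter")
    return {
        "provider": provider,
        "model": env["LLM_MODEL"],
        "cost_limit": float(env["LLM_COST_LIMIT"]),
    }
```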

## Development Guidelines

For agent developers, see:

- [rules/](rules/) - Architecture patterns, best practices, anti-patterns
- [astuces/](astuces/) - Practical techniques (caching, verification, etc.)
- [AGENTS.md](AGENTS.md) - Comprehensive building guide

## License

MIT License - See [LICENSE](LICENSE) for details.
125 changes: 125 additions & 0 deletions docs/README.md
# BaseAgent Documentation

> **Professional documentation for the BaseAgent autonomous coding assistant**

BaseAgent is a high-performance autonomous agent designed for the [Term Challenge](https://term.challenge). It leverages LLM-driven decision making with advanced context management and cost optimization techniques.

---

## Table of Contents

### Getting Started
- [Overview](./overview.md) - What is BaseAgent and core design principles
- [Installation](./installation.md) - Prerequisites and setup instructions
- [Quick Start](./quickstart.md) - Your first task in 5 minutes

### Core Concepts
- [Architecture](./architecture.md) - Technical architecture and system design
- [Configuration](./configuration.md) - All configuration options explained
- [Usage Guide](./usage.md) - Command-line interface and options

### Reference
- [Tools Reference](./tools.md) - Available tools and their parameters
- [Context Management](./context-management.md) - Token management and compaction
- [Best Practices](./best-practices.md) - Optimal usage patterns

### LLM Providers
- [Chutes API Integration](./chutes-integration.md) - Using Chutes as your LLM provider

---

## Quick Navigation

| Document | Description |
|----------|-------------|
| [Overview](./overview.md) | High-level introduction and design principles |
| [Installation](./installation.md) | Step-by-step setup guide |
| [Quick Start](./quickstart.md) | Get running in minutes |
| [Architecture](./architecture.md) | Technical deep-dive with diagrams |
| [Configuration](./configuration.md) | Environment variables and settings |
| [Usage](./usage.md) | CLI commands and examples |
| [Tools](./tools.md) | Complete tools reference |
| [Context Management](./context-management.md) | Memory and token optimization |
| [Best Practices](./best-practices.md) | Tips for optimal performance |
| [Chutes Integration](./chutes-integration.md) | Chutes API setup and usage |

---

## Architecture at a Glance

```mermaid
graph TB
subgraph User["User Interface"]
CLI["CLI (agent.py)"]
end

subgraph Core["Core Engine"]
Loop["Agent Loop"]
Context["Context Manager"]
Cache["Prompt Cache"]
end

subgraph LLM["LLM Layer"]
Client["LiteLLM Client"]
Provider["Provider (Chutes/OpenRouter)"]
end

subgraph Tools["Tool System"]
Registry["Tool Registry"]
Shell["shell_command"]
Files["read_file / write_file"]
Search["grep_files / list_dir"]
end

CLI --> Loop
Loop --> Context
Loop --> Cache
Loop --> Client
Client --> Provider
Loop --> Registry
Registry --> Shell
Registry --> Files
Registry --> Search
```
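The Tool Registry in the diagram above can be approximated by a simple name-to-function mapping (an illustrative sketch — the registry shape, decorator, and return values are assumptions, not the project's actual tool system):

```python
# Minimal tool-registry sketch matching the diagram: the loop looks up a
# tool by name and dispatches the model's arguments to it.
TOOLS = {}


def tool(name):
    # Decorator that registers a function under its tool name.
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register


@tool("shell_command")
def shell_command(command: str) -> str:
    # A real implementation would run the command in a subprocess.
    return f"$ {command}"


@tool("list_dir")
def list_dir(path: str) -> str:
    return f"listing {path}"


def dispatch(name, arguments):
    # Unknown tool names are reported back to the model, not raised.
    if name not in TOOLS:
        return f"unknown tool: {name}"
    return TOOLS[name](**arguments)
```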

---

## Key Features

- **Fully Autonomous** - No user confirmation required; makes decisions independently
- **LLM-Driven** - All decisions made by the language model, not hardcoded logic
- **Prompt Caching** - 90%+ cache hit rate for significant cost reduction
- **Context Management** - Intelligent pruning and compaction for long tasks
- **Self-Verification** - Automatic validation before task completion
- **Multi-Provider** - Supports Chutes AI, OpenRouter, and litellm-compatible providers
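As an example of the pruning involved in context management, oversized tool outputs can be shortened with middle-out truncation, keeping the head and tail of the text (a sketch with an illustrative limit, not the agent's exact logic):

```python
# Middle-out truncation sketch: keep the start and end of a long output
# and elide the middle, so both context halves survive. Limit is illustrative.
def truncate_middle(text: str, limit: int = 2000) -> str:
    if len(text) <= limit:
        return text
    marker = "\n[... truncated ...]\n"
    keep = (limit - len(marker)) // 2  # characters kept on each side
    return text[:keep] + marker + text[-keep:]
```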

---

## Project Structure

```
baseagent/
├── agent.py                  # Entry point
├── src/
│   ├── core/
│   │   ├── loop.py           # Main agent loop
│   │   └── compaction.py     # Context management
│   ├── llm/
│   │   └── client.py         # LLM client (litellm)
│   ├── config/
│   │   └── defaults.py       # Configuration
│   ├── tools/                # Tool implementations
│   ├── prompts/
│   │   └── system.py         # System prompt
│   └── output/
│       └── jsonl.py          # JSONL event emission
├── rules/                    # Development guidelines
├── astuces/                  # Implementation techniques
└── docs/                     # This documentation
```

---

## License

MIT License - See [LICENSE](../LICENSE) for details.