diff --git a/agentic-rag-authorization/.env.example b/agentic-rag-authorization/.env.example new file mode 100644 index 0000000..c2b3fe4 --- /dev/null +++ b/agentic-rag-authorization/.env.example @@ -0,0 +1,17 @@ +# Weaviate Configuration +WEAVIATE_URL=http://localhost:8080 +WEAVIATE_API_KEY= + +# SpiceDB Configuration +SPICEDB_ENDPOINT=localhost:50051 +SPICEDB_TOKEN=devtoken + +# OpenAI Configuration +# Get your API key from https://platform.openai.com/api-keys +OPENAI_API_KEY=your-api-key-here + +# Agent Behavior +MAX_RETRIEVAL_ATTEMPTS=1 + +# Logging +LOG_LEVEL=INFO diff --git a/agentic-rag-authorization/.gitignore b/agentic-rag-authorization/.gitignore new file mode 100644 index 0000000..ae5732a --- /dev/null +++ b/agentic-rag-authorization/.gitignore @@ -0,0 +1,59 @@ +# Python +__pycache__/ +*.py[cod] +*$py.class +*.so +.Python +build/ +develop-eggs/ +dist/ +downloads/ +eggs/ +.eggs/ +lib/ +lib64/ +parts/ +sdist/ +var/ +wheels/ +pip-wheel-metadata/ +share/python-wheels/ +*.egg-info/ +.installed.cfg +*.egg +MANIFEST + +# Virtual environments +venv/ +env/ +ENV/ +.venv + +# IDEs +.vscode/ +.idea/ +*.swp +*.swo +*~ + +# Environment +.env +.env.local + +# Testing +.pytest_cache/ +.coverage +htmlcov/ +.tox/ + +# OS +.DS_Store +Thumbs.db + +# Logs +*.log + +# Docker volumes +docker-volumes/ + +docs/ \ No newline at end of file diff --git a/agentic-rag-authorization/ARCHITECTURE.md b/agentic-rag-authorization/ARCHITECTURE.md new file mode 100644 index 0000000..1b1a20a --- /dev/null +++ b/agentic-rag-authorization/ARCHITECTURE.md @@ -0,0 +1,1566 @@ +# Architecture Deep Dive + +Technical details for those implementing similar systems or extending this one. 
+ +## System Design + +### Core Components + +``` +┌─────────────────────────────────────────────────────────┐ +│ Agentic RAG System Architecture │ +│ │ +│ ┌──────────────┐ ┌──────────────┐ │ +│ │ CLI Tool │ │ Web UI │ │ +│ │ (Python) │ │ (Browser) │ │ +│ └──────┬───────┘ └──────┬───────┘ │ +│ │ │ │ +│ │ run_agentic_rag() │ HTTP API │ +│ │ (sync) │ (async) │ +│ │ │ │ +│ └─────────────┬───────────────┘ │ +│ ▼ │ +│ ┌────────────────────────┐ │ +│ │ FastAPI Backend │ │ +│ │ - POST /api/query │ │ +│ │ - GET /api/users │ │ +│ │ - GET /api/health │ │ +│ │ - Pydantic validation │ │ +│ └──────────┬─────────────┘ │ +│ ▼ │ +│ ┌────────────────────────┐ │ +│ │ LangGraph Engine │ │ +│ │ (State Machine) │ │ +│ └──────────┬─────────────┘ │ +│ │ │ +│ ┌──────────┼──────────┐ │ +│ ▼ ▼ ▼ │ +│ Weaviate SpiceDB OpenAI │ +│ (Search) (AuthZ) (LLM) │ +└─────────────────────────────────────────────────────────┘ +``` + +## System Interfaces + +This system provides two interfaces for interacting with the agentic RAG pipeline: + +### 1. Command-Line Interface (CLI) + +**Purpose:** Direct programmatic access, scripting, testing, and development. + +**Entry Point:** `agentic_rag.graph.run_agentic_rag()` + +**Usage:** +```python +from agentic_rag.graph import run_agentic_rag + +result = run_agentic_rag( + query="What are our engineering practices?", + subject_id="alice", + max_attempts=1 +) +``` + +**Characteristics:** +- Synchronous execution +- Returns full state dictionary +- Direct access to all state fields (messages, reasoning, documents, etc.) +- Used in tests and examples +- Ideal for automation and scripting + +**Return Value:** +```python +{ + "query": str, + "subject_id": str, + "answer": str, + "authorized_documents": List[Document], + "retrieved_documents": List[Document], + "denied_count": int, + "messages": List[BaseMessage], + "reasoning": List[str], + "retrieval_attempt": int, + "authorization_passed": bool, +} +``` + +### 2. 
Web Interface + +**Purpose:** User-friendly demonstration and interactive exploration of authorization behavior. + +**Architecture:** +- **Frontend:** Single-page HTML/CSS/JS application (`ui/index.html`, 544 lines) +- **Backend:** FastAPI REST API (`api/` directory) +- **Launcher:** Python script with pre-flight checks (`run_ui.py`) + +**API Endpoints:** +- `POST /api/query` - Execute RAG query with authorization +- `GET /api/users` - List available demo users +- `GET /api/health` - Check backend service health + +**Entry Point:** `api.main.app` (FastAPI application) + +**Launch Methods:** + +*Automated (recommended):* +```bash +python3 run_ui.py +``` +This launcher performs pre-flight checks: +- Verifies Weaviate connectivity +- Verifies SpiceDB connectivity +- Checks OpenAI API key configuration +- Validates documents are loaded +- Auto-opens browser to http://localhost:8000 + +*Manual:* +```bash +uvicorn api.main:app --host 0.0.0.0 --port 8000 --reload +``` + +**Request/Response Flow:** +``` +1. User selects identity (alice, bob, hr_manager, etc.) in browser +2. User enters query in textarea +3. Frontend JavaScript sends POST to /api/query: + { + "query": "What are our practices?", + "subject_id": "alice", + "max_attempts": 1 + } +4. FastAPI validates request (Pydantic models) +5. Backend calls run_agentic_rag_async() +6. LangGraph executes state machine (retrieve → authorize → generate) +7. API formats response: + - Authorized documents (doc_id, title, content preview) + - Denied documents (doc_id, title, reason) + - Query statistics (counts, execution time) +8. Frontend renders results with visual indicators: + - Green cards for authorized documents + - Red cards for denied documents with explanations + - Statistics panel showing retrieval/authorization metrics +``` + +**Key Difference:** The web interface uses `run_agentic_rag_async()` for non-blocking execution, while the CLI uses the synchronous `run_agentic_rag()`. 
Both execute the same LangGraph state machine but differ in their execution model to suit their respective environments. + +### Interface Comparison + +| Feature | CLI | Web UI | +|---------|-----|--------| +| Execution | Synchronous | Asynchronous | +| Entry Point | `run_agentic_rag()` | `run_agentic_rag_async()` | +| Use Case | Scripting, testing, automation | Interactive demos, exploration | +| Output Format | Python dict | JSON (HTTP response) | +| User Experience | Code-based | Visual, browser-based | +| Pre-flight Checks | Manual | Automated (via run_ui.py) | +| Observability | Full state access | Formatted summaries + stats | + +## API Layer Architecture + +The web interface is built on a FastAPI backend that wraps the core LangGraph engine with HTTP endpoints, validation, and response formatting. + +### Components + +**1. FastAPI Application (`api/main.py`)** + +Entry point for the web API with: +- CORS middleware for cross-origin requests (localhost only) +- Static file serving for the UI (mounts `ui/` directory) +- Route registration (via `api/routes.py`) +- Root endpoint serving `index.html` + +```python +from fastapi import FastAPI +from fastapi.middleware.cors import CORSMiddleware + +app = FastAPI( + title="Agentic RAG Authorization API", + version="1.0.0", +) + +# CORS for local development +app.add_middleware( + CORSMiddleware, + allow_origins=["http://localhost:8000", "http://127.0.0.1:8000"], + allow_credentials=True, + allow_methods=["*"], + allow_headers=["*"], +) + +# API routes under /api prefix +app.include_router(router, prefix="/api") +``` + +**2. Route Handlers (`api/routes.py`)** + +Implements three core endpoints: + +*POST /api/query:* +```python +@router.post("/query") +async def execute_query(request: QueryRequest) -> APIResponse: + # 1. Validate request (Pydantic) + # 2. Start execution timer + # 3. Call run_agentic_rag_async() + # 4. Format authorized documents + # 5. Extract denied documents (retrieved but not authorized) + # 6. 
Calculate statistics + # 7. Return APIResponse with data or error +``` + +*GET /api/users:* +```python +@router.get("/users") +async def get_users(): + # Returns list of demo users: + # - alice (Engineering) + # - bob (Sales) + # - hr_manager (HR) + # - finance_manager (Finance) +``` + +*GET /api/health:* +```python +@router.get("/health") +async def health_check(): + # Returns service status + # TODO: Actually check Weaviate/SpiceDB connectivity +``` + +**3. Pydantic Models (`api/models.py`)** + +Type-safe request/response contracts: + +```python +class QueryRequest(BaseModel): + query: str = Field(..., min_length=1, max_length=1000) + subject_id: str + max_attempts: int = Field(default=1, ge=1, le=5) + +class DocumentSummary(BaseModel): + doc_id: str + title: str + content_preview: str # First 200 chars + +class DeniedDocumentSummary(BaseModel): + doc_id: str + title: str + reason: str # "User 'bob' does not have permission..." + +class QueryStats(BaseModel): + retrieved_count: int + authorized_count: int + denied_count: int + retrieval_attempts: int + execution_time_ms: int + +class QueryResponseData(BaseModel): + query: str + subject_id: str + answer: str + authorized_documents: List[DocumentSummary] + denied_documents: List[DeniedDocumentSummary] + stats: QueryStats + +class APIResponse(BaseModel): + success: bool + data: Optional[QueryResponseData] = None + error: Optional[dict] = None +``` + +**4. 
Configuration (`api/config.py`)** + +API-level configuration (separate from core `agentic_rag.config`): + +```python +class APIConfig(BaseModel): + cors_origins: List[str] = [ + "http://localhost:8000", + "http://127.0.0.1:8000", + ] + api_prefix: str = "/api" +``` + +### Request Processing Flow + +Detailed flow through the API layer: + +``` +Browser + │ + ▼ HTTP POST /api/query + │ Content-Type: application/json + │ Body: { + │ "query": "What are our engineering practices?", + │ "subject_id": "alice", + │ "max_attempts": 1 + │ } + │ +FastAPI Router (routes.py) + │ + ▼ Pydantic validation + │ - query: 1-1000 chars ✓ + │ - subject_id: present ✓ + │ - max_attempts: 1-5 ✓ + │ + ▼ Start timer (time.time()) + │ +run_agentic_rag_async() + │ + ▼ Execute LangGraph + │ 1. Retrieval Node → Weaviate BM25 search + │ 2. Authorization Node → SpiceDB permission checks + │ 3. Generation Node → LLM answer generation + │ + ▼ Returns state dict: + │ { + │ "query": "...", + │ "subject_id": "alice", + │ "answer": "Based on...", + │ "retrieved_documents": [doc1, doc2, doc3], + │ "authorized_documents": [doc1, doc2], + │ "denied_count": 1, + │ "retrieval_attempt": 1, + │ ... + │ } + │ +Route Handler Processing + │ + ▼ Calculate execution time + │ execution_time_ms = int((time.time() - start_time) * 1000) + │ + ▼ Format authorized documents + │ For each in authorized_documents: + │ DocumentSummary( + │ doc_id=doc.metadata["doc_id"], + │ title=doc.metadata["title"], + │ content_preview=doc.page_content[:200] + │ ) + │ + ▼ Extract denied documents + │ Compare retrieved_documents vs authorized_documents by doc_id + │ For each denied: + │ DeniedDocumentSummary( + │ doc_id=..., + │ title=..., + │ reason="User 'alice' does not have permission..." 
+ │ ) + │ + ▼ Build statistics + │ QueryStats( + │ retrieved_count=3, + │ authorized_count=2, + │ denied_count=1, + │ retrieval_attempts=1, + │ execution_time_ms=3420 + │ ) + │ + ▼ Wrap in APIResponse + │ APIResponse( + │ success=True, + │ data=QueryResponseData(...) + │ ) + │ + ▼ JSON serialization (FastAPI automatic) + │ +Browser + │ + ▼ Frontend JavaScript receives JSON + │ + ▼ Render answer + │ Display in answer card + │ + ▼ Render authorized documents + │ Green cards with doc_id, title, preview + │ + ▼ Render denied documents + │ Red cards with doc_id, title, reason + │ + ▼ Display statistics + │ "Retrieved: 3 | Authorized: 2 | Denied: 1" + │ "Execution time: 3.42s" +``` + +### Frontend Architecture + +**Single-Page Application (`ui/index.html`)** + +The frontend is a self-contained HTML file (544 lines) with embedded CSS and JavaScript: + +**Structure:** +- Header with title and description +- User selection dropdown (populated via GET /api/users) +- Query textarea input +- Submit button with loading state +- Results section (hidden until query executes) + - Answer card + - Authorized documents section (green styling) + - Denied documents section (red styling) + - Statistics panel + +**Key JavaScript Functions:** +```javascript +// Load available users on page load +async function loadUsers() { + const response = await fetch('/api/users'); + // Populate dropdown +} + +// Execute query +async function executeQuery() { + const response = await fetch('/api/query', { + method: 'POST', + headers: {'Content-Type': 'application/json'}, + body: JSON.stringify({ + query: queryText, + subject_id: selectedUserId, + max_attempts: 1 + }) + }); + const data = await response.json(); + displayResults(data); +} + +// Render results with authorization transparency +function displayResults(data) { + // Show answer + // Render authorized docs (green cards) + // Render denied docs (red cards with reasons) + // Display statistics +} +``` + +**Design Philosophy:** +- Zero build 
tools (vanilla HTML/CSS/JS) +- Responsive design (works on mobile) +- Clear visual distinction between authorized and denied content +- Transparency: always show what was denied and why + +### Security Considerations + +**Current State (Demo-Focused):** + +The API layer is designed for **demonstration and education**, not production use. Current security characteristics: + +1. **No API Authentication** + - Endpoints are publicly accessible + - No JWT, API keys, or session management + - Anyone can query as any user + +2. **Client-Side User Selection** + - User identity selected in browser dropdown + - No server-side identity verification + - Trivial to impersonate any user + +3. **CORS Limited to Localhost** + ```python + allow_origins=["http://localhost:8000", "http://127.0.0.1:8000"] + ``` + - Restricts browser-based access to local development + - Does not protect against direct HTTP requests + +4. **No Rate Limiting** + - No protection against abuse or DoS + - OpenAI API costs could accumulate + +**Why This Is Acceptable for Demo:** +- System demonstrates authorization concepts (SpiceDB) +- Not intended for production deployment +- Educational value outweighs security limitations +- Clear documentation of what NOT to do in production + +**Production Recommendations:** + +To deploy this system in production, implement: + +1. **API Authentication** + ```python + from fastapi import Depends, HTTPException + from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer + + security = HTTPBearer() + + @router.post("/query") + async def execute_query( + request: QueryRequest, + credentials: HTTPAuthorizationCredentials = Depends(security) + ): + # Validate JWT or API key + user = validate_token(credentials.credentials) + + # Use authenticated user identity + result = await run_agentic_rag_async( + query=request.query, + subject_id=user.id, # From token, not client + max_attempts=request.max_attempts + ) + ``` + +2. 
**Server-Side Identity Verification** + - Extract user identity from authenticated session + - Never trust client-provided subject_id + - Validate user exists in authorization system + +3. **Rate Limiting** + ```python + from fastapi import Request + from slowapi import Limiter + from slowapi.util import get_remote_address + + limiter = Limiter(key_func=get_remote_address) + + @router.post("/query") + @limiter.limit("10/minute") + async def execute_query(request: Request, payload: QueryRequest): + # slowapi reads the client address from the `request: Request` argument + # Process request + ``` + +4. **Request Logging and Audit Trails** + ```python + from datetime import datetime, timezone + + import structlog + + logger = structlog.get_logger() + + @router.post("/query") + async def execute_query(request: QueryRequest): + logger.info( + "query_request", + subject_id=request.subject_id, + query=request.query, + timestamp=datetime.now(timezone.utc).isoformat(), + ) + # Execute query + ``` + +5. **HTTPS in Production** + - Terminate TLS at load balancer or reverse proxy + - Never send credentials over HTTP + +6. **Input Sanitization** + - Already done: Pydantic validates query length (1-1000 chars) + - Consider additional sanitization for prompt injection + +7. 
**CORS Restrictions** + ```python + allow_origins=[ + "https://your-production-domain.com", + "https://app.your-domain.com", + ] + ``` + +**Security Boundary Remains Intact:** + +Importantly, API-level security issues do NOT compromise the core authorization model: +- SpiceDB still enforces document-level permissions +- Authorization node still cannot be bypassed +- Even if an attacker queries as any user, they only see documents that user can access +- The demonstration successfully shows how authorization works, even without API auth + +### Async Execution Model + +**Why Async for Web Interface:** + +The web interface uses `run_agentic_rag_async()` instead of the synchronous `run_agentic_rag()`: + +**Synchronous (CLI):** +```python +def run_agentic_rag(query: str, subject_id: str, max_attempts: int) -> dict: + # Blocks until complete + result = graph.invoke(initial_state) + return result +``` + +**Asynchronous (Web API):** +```python +async def run_agentic_rag_async(query: str, subject_id: str, max_attempts: int) -> dict: + # Non-blocking, allows concurrent requests + result = await graph.ainvoke(initial_state) + return result +``` + +**Benefits of Async:** + +1. **Concurrency** + - Server can handle multiple queries simultaneously + - Other requests aren't blocked while one query waits for OpenAI + +2. **Resource Efficiency** + - Async I/O doesn't waste threads on waiting + - Better scalability under load + +3. **FastAPI Integration** + - FastAPI is async-first framework + - Async route handlers are more efficient + +4. **Consistent Performance** + - Queries don't queue behind each other + - Response time remains consistent under load + +**Implementation:** + +Both execution paths use the same LangGraph state machine, just different invocation methods: + +```python +# graph.py +from langgraph.graph import StateGraph + +workflow = StateGraph(AgenticRAGState) +# ... add nodes, edges ... 
+graph = workflow.compile() + +# Synchronous wrapper (CLI) +def run_agentic_rag(query, subject_id, max_attempts): + return graph.invoke({ + "query": query, + "subject_id": subject_id, + "max_attempts": max_attempts, + # ... other initial state + }) + +# Asynchronous wrapper (Web API) +async def run_agentic_rag_async(query, subject_id, max_attempts): + return await graph.ainvoke({ + "query": query, + "subject_id": subject_id, + "max_attempts": max_attempts, + # ... other initial state + }) +``` + +**Node Compatibility:** + +All nodes work with both sync and async execution: +- LangChain components support async (Weaviate client, OpenAI) +- SpiceDB gRPC client is synchronous but fast (~40-50ms) +- No code duplication required + +### LangGraph State Machine + +**Default Flow (max_attempts=1):** +``` +START + ↓ +Retrieval Node (Weaviate BM25) + ↓ +Authorization Node (SpiceDB) ◄── Security Boundary (deterministic) + ↓ +Generation Node (LLM with context + explanations) + ↓ +END +``` + +### State Schema + +```python +AgenticRAGState = TypedDict("AgenticRAGState", { + # Input + "query": str, + "subject_id": str, + "max_attempts": int, + + # Tracking + "messages": List[BaseMessage], + "reasoning": List[str], + "retrieval_attempt": int, + + # Documents + "retrieved_documents": List[Document], + "authorized_documents": List[Document], + "denied_count": int, + + # Results + "authorization_passed": bool, + "answer": str, +}) +``` + +## Node Responsibilities + +### Retrieval Node (Deterministic) + +**Purpose**: Execute semantic/keyword search in Weaviate. + +**Input**: `query` from state + +**Operation**: +- Weaviate BM25 keyword search (default) +- Returns top-k documents (typically 5) +- No authorization filtering at this stage +- Direct execution without planning overhead + +**Output**: Updates `retrieved_documents`, `retrieval_attempt` + +Note: This node runs immediately on query input. There is no planning phase before retrieval. 
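The retrieval node's contract (read `query`, write `retrieved_documents` and `retrieval_attempt`, apply no permission filtering) can be sketched as a plain function over the state. This is a minimal illustration, not the repository's implementation: the in-memory `CORPUS`, the `Doc` type, and the `keyword_search` helper are hypothetical stand-ins for the Weaviate collection and its BM25 query, and the ranking is a naive term overlap rather than real BM25.

```python
from typing import Dict, List, TypedDict


class Doc(TypedDict):
    doc_id: str
    title: str
    content: str


# Hypothetical in-memory corpus standing in for the Weaviate collection.
CORPUS: List[Doc] = [
    {"doc_id": "eng-001", "title": "Engineering Practices",
     "content": "code review and testing practices"},
    {"doc_id": "eng-002", "title": "Deployment Guide",
     "content": "engineering deployment practices"},
    {"doc_id": "hr-001", "title": "Salary Bands",
     "content": "compensation and salary information"},
]


def keyword_search(query: str, top_k: int = 5) -> List[Doc]:
    """Rank documents by naive term overlap (a stand-in for BM25, not BM25)."""
    terms = set(query.lower().split())
    scored = []
    for doc in CORPUS:
        overlap = len(terms & set(doc["content"].lower().split()))
        if overlap > 0:
            scored.append((overlap, doc))
    # Highest overlap first; the sort is stable, so ties keep corpus order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def retrieval_node(state: Dict) -> Dict:
    """Return only the fields this node updates; no authorization happens here."""
    docs = keyword_search(state["query"])
    return {
        "retrieved_documents": docs,
        "retrieval_attempt": state.get("retrieval_attempt", 0) + 1,
    }
```

Because the node returns a partial update instead of mutating the state, the same function could be registered with the graph for both `invoke` and `ainvoke`, and the authorization node downstream sees every retrieved document, including ones the user may not view.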
+ +### Authorization Node (Deterministic - Security Boundary) + +**Purpose**: Filter documents by permissions using SpiceDB. + +**Critical property**: This node ALWAYS runs and cannot be bypassed by the agent. + +**Operation**: +```python +authorized = [] +denied_count = 0 + +for doc in retrieved_documents: + response = spicedb_client.CheckPermission( + resource=f"document:{doc.id}", + permission="view", + subject=f"user:{subject_id}" + ) + + if response.permissionship == HAS_PERMISSION: + authorized.append(doc) + else: + denied_count += 1 + +return { + "authorized_documents": authorized, + "denied_count": denied_count, + "authorization_passed": len(authorized) > 0 +} +``` + +**Output**: Updates `authorized_documents`, `denied_count`, `authorization_passed` + +### Generation Node (LLM-based) + +**Purpose**: Generate final answer incorporating authorization context. + +**Input**: `query`, `authorized_documents`, `denied_count`, `reasoning` + +**Behavior:** +- Uses authorized documents as context for answer +- Mentions if documents were denied (transparency) +- Explains access limitations when applicable +- Provides helpful answer within authorization constraints +- Always runs (even if no authorized documents) + +**Output**: Updates `answer` + +Note: This is the only node that always uses the LLM in default mode (max_attempts=1). It handles both successful retrievals and authorization failures with appropriate explanations. + +## Authorization Model (SpiceDB) + +### Schema Definition + +```zed +definition user {} + +definition department { + relation member: user +} + +definition document { + relation viewer: user | department#member + permission view = viewer +} +``` + +### Permission Check Flow + +``` +1. User makes query + subject_id: "alice" + +2. Weaviate retrieves documents + [eng-001, eng-002, hr-001] + +3. For each document, SpiceDB checks: + + eng-001: + └─ viewer = engineering#member + └─ alice is engineering#member? 
+ └─ alice → engineering → member ✅ + Result: ALLOWED + + hr-001: + └─ viewer = hr_manager + └─ alice is hr_manager? + └─ alice ≠ hr_manager ❌ + Result: DENIED +``` + +### Relationship Graph Example + +``` +alice (user) ──member──> engineering (department) + +eng-001 (document) ──viewer──> engineering#member + (allows all engineering members) + +hr-001 (document) ──viewer──> hr_manager (user) + (allows only hr_manager) +``` + +## Security Architecture + +### Trust Boundaries + +**With Web Interface:** + +``` +┌───────────────────────────────────┐ +│ Untrusted Zone │ +│ Browser, User Input │ +│ - User selects identity │ +│ - User enters query │ +└─────────────┬─────────────────────┘ + ▼ +┌───────────────────────────────────┐ +│ API Layer (Demo: No Auth) │ +│ FastAPI Backend │ +│ - Request validation │ +│ - CORS protection │ +│ - Response formatting │ +│ ⚠️ Production needs auth here │ +└─────────────┬─────────────────────┘ + ▼ +┌───────────────────────────────────┐ +│ Semi-Trusted Zone │ +│ LangGraph Agent │ +│ - Can plan strategies │ +│ - Can check permissions │ +│ - Cannot bypass auth │ +└─────────────┬─────────────────────┘ + ▼ +┌───────────────────────────────────┐ +│ SECURITY BOUNDARY │ +│ Authorization Node │ +│ - Deterministic │ +│ - Always runs │ +│ - No LLM involvement │ +│ - SpiceDB permission checks │ +└─────────────┬─────────────────────┘ + ▼ +┌───────────────────────────────────┐ +│ Trusted Zone │ +│ SpiceDB + Weaviate │ +│ Authorized data │ +└───────────────────────────────────┘ +``` + +**CLI Interface (Direct Access):** + +``` +┌───────────────────────────┐ +│ Untrusted Zone │ +│ User Input, Query │ +└──────────┬────────────────┘ + ▼ +┌───────────────────────────┐ +│ Semi-Trusted Zone │ +│ Agent (LLM) + Tools │ +│ - Can plan strategies │ +│ - Can check permissions │ +│ - Cannot bypass auth │ +└──────────┬────────────────┘ + ▼ +┌───────────────────────────┐ +│ SECURITY BOUNDARY │ +│ Authorization Node │ +│ - Deterministic │ +│ - Always runs │ +│ - No 
LLM involvement │ +└──────────┬────────────────┘ + ▼ +┌───────────────────────────┐ +│ Trusted Zone │ +│ SpiceDB + Weaviate │ +│ Authorized data │ +└───────────────────────────┘ +``` + +### Security Guarantees + +1. **Authorization cannot be bypassed** + ```python + # Hardcoded in graph.py + workflow.add_edge("retrieve", "authorize") + # Agent cannot skip this edge + ``` + +2. **Deterministic permission checks** + ```python + # Not LLM-based, uses SpiceDB directly + response = spicedb_client.CheckPermission(...) + if response.permissionship == HAS_PERMISSION: + allow() + ``` + +3. **Agent observes, doesn't control** + ```python + # Authorization happens first + authorize() → reason() + # Agent sees results, but doesn't make decisions + ``` + +4. **Fail closed by default** + ```python + # Explicit permission required + if not explicitly_allowed: + deny() + ``` + +## Conditional Logic + +### After Authorization: Generate or Reason? + +```python +def should_reason_or_generate(state: AgenticRAGState) -> str: + if state["authorization_passed"]: + return "generate" # Have docs, answer the query + else: + return "reason" # No docs, agent decides what to do +``` + +### After Reasoning: Retry or Generate? + +```python +def should_retry_or_generate(state: AgenticRAGState) -> str: + if (state["retrieval_attempt"] < state["max_attempts"] + and len(state["authorized_documents"]) == 0): + return "plan" # Try again with different strategy + else: + return "generate" # Give best answer we can +``` + +## Design Decisions + +### Why Post-Filter Authorization? + +**Alternative 1: Pre-filter (embed permissions in metadata)** +```python +# Query with permission filter +query.where({"department": user_department}) +``` +Problems: +- Limits search space (worse semantic results) +- Stale permissions (metadata not always current) +- Doesn't work with computed permissions + +**Alternative 2: Post-filter (this approach)** +```python +# 1. 
Search without constraints (best semantic results) +docs = search(query) + +# 2. Filter by up-to-date permissions +authorized = [d for d in docs if check_permission(d)] +``` +Benefits: +- Best semantic search results +- Always current permissions +- Works with complex authorization logic + +### Why LangGraph State Machine? + +**Alternative: Pure ReAct loop** +```python +while not done: + action = agent.choose_action() + result = execute(action) +``` +Problems: +- Agent controls flow (can skip steps) +- Harder to enforce security boundary +- Less observable + +**LangGraph approach:** +```python +# Explicit state machine +workflow.add_edge("retrieve", "authorize") # Always runs +``` +Benefits: +- Enforces authorization node +- Observable state transitions +- Easier to debug/audit + +### Why Deterministic Authorization Node? + +**Not this:** +```python +def authorize(state): + # Ask LLM to decide + decision = llm("Should user access this doc?") + return decision # ❌ Non-deterministic +``` + +**This:** +```python +def authorize(state): + # Direct SpiceDB check + response = spicedb.CheckPermission(...) + return response.permissionship == HAS_PERMISSION # ✅ Deterministic +``` + +**Reason:** Security decisions must be deterministic, auditable, and policy-based. 
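The post-filter and fail-closed properties described above can be sketched without a running SpiceDB instance. This is an illustration under stated assumptions, not the project's code: `MEMBERS`, `VIEWERS`, `check_view`, and `authorize` are hypothetical names, and the two dictionaries only mimic the `member` and `viewer` relations from the schema. A real deployment would replace `check_view` with a SpiceDB permission check.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical relationship data mirroring the schema's member/viewer relations.
MEMBERS: Dict[str, Set[str]] = {"engineering": {"alice"}}
VIEWERS: Dict[str, Set[str]] = {
    "eng-001": {"department:engineering"},
    "hr-001": {"user:hr_manager"},
}


def check_view(doc_id: str, subject_id: str) -> bool:
    """Deterministic stand-in for a `view` permission check."""
    for viewer in VIEWERS.get(doc_id, set()):
        kind, _, name = viewer.partition(":")
        if kind == "user" and name == subject_id:
            return True
        if kind == "department" and subject_id in MEMBERS.get(name, set()):
            return True
    return False  # No matching relationship: deny.


def authorize(
    doc_ids: List[str],
    subject_id: str,
    check: Callable[[str, str], bool] = check_view,
) -> Tuple[List[str], List[str]]:
    """Split retrieved document IDs into (authorized, denied), failing closed."""
    authorized: List[str] = []
    denied: List[str] = []
    for doc_id in doc_ids:
        try:
            allowed = check(doc_id, subject_id)
        except Exception:
            allowed = False  # A failed check is treated as a denial.
        (authorized if allowed else denied).append(doc_id)
    return authorized, denied
```

Calling `authorize(["eng-001", "hr-001"], "alice")` yields `(["eng-001"], ["hr-001"])`. Documents with no recorded relationships, and checks that raise, both end up in the denied list, which is the behavior the "fail closed by default" guarantee asks for.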
+ +## Modes of Operation + +### Default Mode (max_attempts=1) + +``` +Query + ↓ +Retrieve (BM25 search) + ↓ +Authorize (filter) + ↓ +Generate (with explanations) + ↓ +Answer +``` + +**Characteristics:** +- Simple, predictable (3 nodes) +- Fast (~3-4s total) +- No retry logic +- Transparent explanations of authorization +- Single LLM call (generation only) +- Deterministic retrieval strategy + +### Adaptive Mode (max_attempts > 1) + +``` +Query + ↓ +Retrieve + ↓ +Authorize ← Security boundary + ↓ +[Reason if needed] ← LLM decides retry + ↓ +Generate or Retry + ↓ +Answer + Reasoning Trace +``` + +**Characteristics:** +- Can adapt to failures (4 nodes) +- Slower (~5-8s with retries) +- Retry logic when authorization fails +- Rich reasoning traces +- Multiple LLM calls (reasoning + generation) +- Can try different retrieval approaches + +Note: Default mode is intentionally simple and deterministic, not highly agentic. Enable adaptive mode only when you need retry logic. + +## Extension Points + +### Adding New Nodes + +```python +def custom_node(state: AgenticRAGState) -> dict: + # Custom logic + result = process(state["query"]) + return {"custom_field": result} + +# Add to graph +workflow.add_node("custom", custom_node) +workflow.add_edge("authorize", "custom") +workflow.add_edge("custom", "reason") +``` + +### Adding New Tools + +```python +from langchain.tools import BaseTool + +class CustomTool(BaseTool): + name = "custom_tool" + description = "What this tool does" + + def _run(self, query: str) -> str: + # Implementation + return result + +# Agent will have access in planning node +``` + +### Modifying Authorization Logic + +```python +def authorization_node(state: AgenticRAGState): + # Add custom checks + if special_case(state["subject_id"]): + return special_authorization(state) + + # Default SpiceDB logic + return spicedb_authorization(state) +``` + +### Extending the Web Interface + +**Adding New API Endpoints:** + +```python +# In api/routes.py +from fastapi 
import APIRouter + +@router.get("/documents") +async def list_documents(): + """List all documents with metadata.""" + # Query Weaviate for all documents + # Return structured list + return {"documents": [...]} + +@router.get("/permissions/{subject_id}") +async def get_user_permissions(subject_id: str): + """Get all documents accessible to a user.""" + # Query SpiceDB for user's accessible documents + # Useful for authorization debugging + return {"accessible_documents": [...]} +``` + +**Adding Frontend Features:** + +The UI is a single HTML file with embedded CSS and JavaScript. To extend: + +1. **Add UI Section** (HTML): +```html +
Demonstrating fine-grained authorization with SpiceDB
+