Enterprise NL2SQL Engine

A Production-Grade Natural Language to SQL Engine built on the principles of Zero Trust and Deterministic Execution.

This platform treats "Text-to-SQL" not as a prompt engineering problem, but as a Distributed Systems problem. It replaces fragile one-shot generation with a robust, compiled pipeline that bridges the gap between Unstructured Intention (User Language) and Structured Execution (SQL Databases).

🏗️ System Topology

The architecture is composed of five distinct planes, ensuring separation of concerns and failure isolation.

1. The Control Plane (The Graph)

Responsibility: Reasoning, Planning, and Orchestration.

Agentic Graph: Implemented as a Directed Cyclic Graph (LangGraph) to enable "Refinement Loops". If a plan fails validation, the system self-corrects.
State Management: Deterministic state transitions ensure auditability and reproducibility of every decision.
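
A refinement loop of this kind can be sketched in plain Python, without LangGraph. This is a minimal illustration and not the engine's actual graph: `GraphState`, `run_refinement_loop`, `toy_planner`, and `toy_validator` are hypothetical names, and the planner is a stub standing in for an LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class GraphState:
    """State handed from node to node; every transition is appended to `trace`."""
    question: str
    plan: Optional[str] = None
    errors: List[str] = field(default_factory=list)
    attempts: int = 0
    trace: List[str] = field(default_factory=list)


def run_refinement_loop(
    state: GraphState,
    plan_node: Callable[[GraphState], str],
    validate_node: Callable[[str], List[str]],
    max_attempts: int = 3,
) -> GraphState:
    """Cycle planner -> validator until the plan is valid or attempts run out."""
    while state.attempts < max_attempts:
        state.attempts += 1
        state.plan = plan_node(state)
        state.trace.append(f"plan#{state.attempts}")
        state.errors = validate_node(state.plan)
        state.trace.append("valid" if not state.errors else "invalid")
        if not state.errors:
            return state
    state.plan = None  # never hand an unvalidated plan downstream
    return state


# Hypothetical stand-ins for the LLM planner and the validator node.
def toy_planner(state: GraphState) -> str:
    # The first attempt names a wrong table; validator feedback corrects it.
    return "SELECT name FROM users" if state.errors else "SELECT name FROM user"


def toy_validator(plan: str) -> List[str]:
    return [] if "FROM users" in plan else ["unknown table: user"]


result = run_refinement_loop(GraphState("list user names"), toy_planner, toy_validator)
print(result.plan, result.attempts, result.trace)
```

Bounding the cycle with `max_attempts` keeps the loop from running forever, and the `trace` list gives a replayable record of each transition.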

2. The Security Plane (The Firewall)

Responsibility: Invariants Enforcement.

Valid-by-Construction: The LLM never executes SQL directly. It generates an Abstract Syntax Tree (AST).
Static Analysis: The Validator Node enforces Row-Level Security (RLS) and type safety on the AST before compilation.
Intent Classification: Upstream detection of adversarial prompts (Jailbreaks/Injections).
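
The valid-by-construction idea can be illustrated with a small sketch. It assumes a toy plan type built from standard-library dataclasses; `SelectPlan`, `Filter`, `SCHEMA`, `RLS_COLUMN`, and `compile_plan` are hypothetical names, and the engine's real AST and validators are likely richer than this.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Filter:
    column: str
    op: str
    value: object


@dataclass(frozen=True)
class SelectPlan:
    """The only thing the LLM may produce: a structured plan, never raw SQL."""
    table: str
    columns: Tuple[str, ...]
    filters: Tuple[Filter, ...] = ()


# Hypothetical policy: columns each table exposes, and the column used for RLS.
SCHEMA = {"orders": {"id", "total", "tenant_id"}}
RLS_COLUMN = "tenant_id"
ALLOWED_OPS = {"=", "<", ">", "<=", ">="}


def validate(plan: SelectPlan) -> List[str]:
    """Static checks on the plan; returns a list of violations (empty if valid)."""
    allowed = SCHEMA.get(plan.table)
    if allowed is None:
        return [f"unknown table: {plan.table}"]
    errors = [f"unknown column: {c}" for c in plan.columns if c not in allowed]
    for f in plan.filters:
        if f.column not in allowed:
            errors.append(f"unknown filter column: {f.column}")
        if f.op not in ALLOWED_OPS:
            errors.append(f"operator not allowed: {f.op}")
    return errors


def compile_plan(plan: SelectPlan, tenant_id: str) -> Tuple[str, list]:
    """Compile a validated plan to parameterized SQL with the RLS filter added."""
    errors = validate(plan)
    if errors:
        raise ValueError("; ".join(errors))
    # The tenant filter is appended by the compiler, so the LLM cannot omit it.
    filters = plan.filters + (Filter(RLS_COLUMN, "=", tenant_id),)
    where = " AND ".join(f"{f.column} {f.op} ?" for f in filters)
    sql = f"SELECT {', '.join(plan.columns)} FROM {plan.table} WHERE {where}"
    return sql, [f.value for f in filters]


plan = SelectPlan("orders", ("id", "total"), (Filter("total", ">", 100),))
sql, params = compile_plan(plan, tenant_id="acme")
print(sql, params)
```

Identifiers reach the SQL string only after matching the allow-list, and values travel as bound parameters, so text from the model is never interpolated unchecked.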

3. The Data Plane (The Sandbox)

Responsibility: Semantic Search and Execution.

Blast Radius Isolation: SQL Drivers (ODBC/C-Ext) run in a dedicated Sandboxed Process Pool. A segfault in a driver kills a disposable worker, not the Agent.
Partitioned Retrieval: The Orchestrator uses Partitioned MMR to inject only relevant schema context, preventing context window overflow.
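
For readers unfamiliar with MMR (Maximal Marginal Relevance), the sketch below shows the plain, unpartitioned algorithm over toy two-dimensional embeddings. How the engine partitions candidates is not described here, so that step is left out; `mmr_select`, the table names, and the vectors are all illustrative assumptions.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(
    query: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    k: int,
    lam: float = 0.5,
) -> List[str]:
    """Greedy MMR: trade relevance to the query against redundancy with picks so far.

    lam=1.0 ranks purely by relevance; lower values favor diversity.
    """
    selected: List[str] = []
    remaining = dict(candidates)
    while remaining and len(selected) < k:
        best_name, best_score = None, -math.inf
        for name, vec in remaining.items():
            relevance = cosine(query, vec)
            redundancy = max(
                (cosine(vec, candidates[s]) for s in selected), default=0.0
            )
            score = lam * relevance - (1.0 - lam) * redundancy
            if score > best_score:
                best_name, best_score = name, score
        selected.append(best_name)
        del remaining[best_name]
    return selected


# Toy embeddings: two near-duplicate tables and one unrelated table.
tables = {
    "orders": [0.9, 0.1],
    "orders_archive": [0.88, 0.12],
    "customers": [0.2, 0.9],
}
question = [1.0, 0.1]
print(mmr_select(question, tables, k=2, lam=0.3))
print(mmr_select(question, tables, k=2, lam=1.0))
```

With a low `lam` the near-duplicate table is skipped in favor of a different one, which is how MMR keeps redundant schema fragments from crowding the context window.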

4. The Reliability Plane (The Guard)

Responsibility: Fault Tolerance and Stability.

Layered Defense: A combination of Retries, Circuit Breakers, and Sandboxing keeps the engine responsive when an LLM provider or a database becomes unavailable.
Fail-Fast: We stop processing immediately if a dependency is unresponsive, preserving resources.
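
The fail-fast behavior of a circuit breaker can be sketched as follows. This is a generic, single-threaded illustration rather than the engine's implementation; `CircuitBreaker`, `CircuitOpenError`, and the threshold and cool-down parameters are hypothetical.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected without touching the dependency."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._failure_threshold = failure_threshold
        self._reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_after:
                # Open: reject immediately instead of waiting on a dead dependency.
                raise CircuitOpenError("dependency marked unavailable; failing fast")
            # Cool-down elapsed: half-open, so one more failure reopens the circuit.
            self._opened_at = None
            self._failures = self._failure_threshold - 1
        try:
            result = fn()
        except Exception:
            self._failures += 1
            if self._failures >= self._failure_threshold:
                self._opened_at = self._clock()
            raise
        self._failures = 0  # a success closes the circuit fully
        return result
```

A caller would wrap each LLM or database call, for example `breaker.call(lambda: run_query(sql))` where `run_query` is whatever function performs the work. Injecting `clock` makes the cool-down testable without sleeping.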

5. The Observability Plane (The Watchtower)

Responsibility: Visibility, Forensics, and Compliance.

Full-Stack Telemetry: Native OpenTelemetry integration provides distributed tracing (Jaeger) and metrics (Prometheus) for every node execution.
Forensic Audit Logs: A tamper-evident, persistent Audit Log records every AI decision (Prompt/Response/Reasoning) for compliance and debugging.
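
One common way to make a log tamper-evident is a hash chain, where each entry commits to the hash of the previous one. The sketch below assumes that technique purely for illustration, since the mechanism the engine uses is not stated here; `append_entry`, `verify`, and the record fields are hypothetical, and the log is kept in memory rather than persisted.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, record: Dict) -> str:
    # Canonical JSON so the same record always hashes to the same value.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: List[Dict], record: Dict) -> None:
    """Append a record whose hash covers both its content and the prior entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev": prev_hash, "hash": _digest(prev_hash, record)})


def verify(log: List[Dict]) -> bool:
    """Recompute the chain; any edited, removed, or reordered entry breaks it."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


audit_log: List[Dict] = []
append_entry(audit_log, {"node": "planner", "prompt": "list users", "response": "plan-1"})
append_entry(audit_log, {"node": "validator", "prompt": "plan-1", "response": "ok"})
print(verify(audit_log))

audit_log[0]["record"]["response"] = "plan-2"  # simulate tampering
print(verify(audit_log))
```

A chain like this detects modification but does not prevent it; an attacker who can rewrite the whole log can also recompute every hash, so the latest hash would need to be anchored somewhere the attacker cannot reach.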

📐 Architectural Invariants

Invariant	Rationale	Mechanism
No Unvalidated SQL	Prevent Hallucinations & Data Leaks	All plans pass through `LogicalValidator` (AST) + `PhysicalValidator` (Dry Run) before execution.
Zero Shared State	Crash Safety	Execution happens in isolated processes; no shared memory with the Control Plane.
Fail-Fast	Reliability	Circuit Breakers and Strict Timeouts prevent cascading failures (Retry Storms).
Determinism	Debuggability	Temperature-0 generation + Strict Typing (Pydantic) for all LLM outputs.
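
The "Zero Shared State" invariant can be illustrated with a disposable child process. The sketch below launches a one-shot worker per query instead of a pool, and uses an in-memory SQLite table as a stand-in for a real driver; `run_in_worker` and `WORKER_SOURCE` are hypothetical names, not part of the engine.

```python
import json
import subprocess
import sys

# Source of the disposable worker. It shares no memory with the parent:
# the request arrives on stdin and the rows leave on stdout, both as JSON.
WORKER_SOURCE = r"""
import json, sqlite3, sys
request = json.load(sys.stdin)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])
rows = conn.execute(request["sql"], request["params"]).fetchall()
json.dump(rows, sys.stdout)
"""


def run_in_worker(sql: str, params: list, timeout: float = 5.0) -> list:
    """Run one query in a child process; a crash there cannot corrupt the caller."""
    proc = subprocess.run(
        [sys.executable, "-c", WORKER_SOURCE],
        input=json.dumps({"sql": sql, "params": params}),
        capture_output=True,
        text=True,
        timeout=timeout,  # strict timeout: a hung worker raises TimeoutExpired
    )
    if proc.returncode != 0:
        raise RuntimeError(f"worker died with exit code {proc.returncode}")
    return json.loads(proc.stdout)


if __name__ == "__main__":
    print(run_in_worker("SELECT x FROM t WHERE x > ? ORDER BY x", [1]))
```

Rows come back as JSON lists rather than tuples because they cross a process boundary. A failing query surfaces in the parent as an ordinary `RuntimeError` while the parent itself keeps running.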

🚀 Quick Start

Prerequisites

Python 3.10+
Docker (Optional, for full integration environment)

1. Installation

git clone https://github.com/nadeem4/nl2sql.git
cd nl2sql

# Set up environment
python -m venv venv
source venv/bin/activate

# Install Core Engine, CLI & Adapter SDK
pip install -e packages/core
pip install -e packages/cli
pip install -e packages/adapter-sdk

2. Run Demo (Lite Mode)

Boot the engine with an in-memory SQLite database (No Docker required).

nl2sql setup --demo

📚 Technical Documentation

System Architecture: Deep dive into the Control, Security, and Data planes.
Component Reference: Detailed specs for Planner, Validator, Executor, etc.
Security Model: Defense-in-depth strategy against prompt injection and unauthorized access.
Reliability & Fault Tolerance: Guide to Circuit Breakers, Sandbox isolation, and Recovery strategies.
Observability & Operations: Configuring OpenTelemetry, Logging, and Audit Trails.

📦 Repository Structure

packages/
├── core/               # The Engine (Graph, State, Logic)
├── cli/                # Terminal Interface & Ops Tools
├── adapter-sdk/        # Interface Contract for new Databases
└── adapters/           # Official Dialects (Postgres, MSSQL, MySQL)
configs/                # Runtime Configuration (Policies, Prompts)
docs/                   # Architecture & Operations Manual
