AEGIS Shield is a modular, high-performance validation service engineered with LangGraph to perform parallel input validation using a suite of AI-powered evaluators. It acts as an intelligent guardian for larger AI systems, designed to preemptively identify and neutralize potentially harmful or non-compliant content, including spam, toxic language, harassment, and unsolicited financial advice.
The system's core strength lies in its parallel architecture, which ensures that all validation checks are executed simultaneously. This approach dramatically reduces latency compared to traditional sequential methods and guarantees a comprehensive evaluation of user input against all defined safety rails before it reaches the main application logic.
The AEGIS Shield system utilizes the google/gemma-3-4b-it model as its core Large Language Model (LLM) for all validation and response generation tasks. This specific model is configured and loaded in the config.py file.
- Model ID: The system is explicitly configured with `MODEL_ID = "google/gemma-3-4b-it"`.
- Loading process: The model is loaded with 4-bit quantization (`load_in_4bit=True`) onto a GPU (`device_map="auto"`) to optimize performance and reduce memory usage. It uses the `float16` data type for both storage and computation (`torch_dtype=torch.float16`, `bnb_4bit_compute_dtype=torch.float16`).
- Custom invoker: A custom class, `ManualGemmaInvoker`, handles inference. It manages tokenization, templating, generation, and response cleaning specifically for the Gemma model's chat format.
- Shared instance: A single instance of the loaded model serves as both the `validator_llm` for content checks and the `response_llm` for generating rejection messages.
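A hedged sketch of what this configuration might look like in `config.py`, using the standard `transformers` + `bitsandbytes` stack. Only `MODEL_ID` and the quantization settings are stated in this document; the remaining variable names and the `model_cache` directory layout are assumptions.

```python
# config.py — illustrative sketch, not the project's exact code
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "google/gemma-3-4b-it"

# 4-bit quantization with float16 compute, matching the settings described above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, cache_dir="model_cache")
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",          # place layers on the available GPU(s)
    torch_dtype=torch.float16,  # float16 storage dtype
    cache_dir="model_cache",
)

# One shared model instance backs both roles
validator_llm = response_llm = model
```

This is a configuration fragment that requires a CUDA-capable GPU and a model download, so it is shown for orientation rather than direct execution.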
The service is constructed with a modular LangGraph architecture that facilitates the parallel execution of multiple validation checks. This design champions high performance, scalability, and maintainability.
```mermaid
graph TD
    A[Client App / Main AI] -- HTTP POST --> B{FastAPI Endpoint};
    B -- user_id, question --> C[LangGraph State Manager];
    C -- Parallel Execution --> D[validate_spam];
    C -- Parallel Execution --> E[validate_toxic];
    C -- Parallel Execution --> F[validate_harassment];
    C -- Parallel Execution --> G[validate_financial];
    D --> H[Results Aggregator];
    E --> H;
    F --> H;
    G --> H;
    H --> I{All Validations Passed?};
    I -- Yes --> J[Success Response];
    I -- No --> K[AI Response Generator];
    K --> L[Rejection Response];
    J --> M[JSON Response];
    L --> M;
    M --> A;

    subgraph "AEGIS Shield Service"
        B
        C
        D
        E
        F
        G
        H
        I
        J
        K
        L
    end
```
The project is organized with a clear separation of concerns to enhance maintainability and scalability.
```
aegis-shield/
├── main.py             # FastAPI application server and API endpoints
├── requirements.txt    # Python dependencies
├── config.py           # Model configuration, LLM invoker, and validation check definitions
├── nodes.py            # LangGraph node logic for validation, aggregation, and response generation
├── graph.py            # LangGraph state management and workflow definition
└── promp_guard/        # Directory for validation prompt templates
    ├── spam.txt        # Prompts for spam detection
    ├── toxic.txt       # Prompts for toxic content detection
    ├── harassment.txt  # Prompts for harassment detection
    └── financial.txt   # Prompts for financial advice detection
```
The core of AEGIS Shield is the parallel execution of all validation checks, which significantly boosts performance over sequential validation.
```mermaid
graph LR
    START([START]) --> INIT[start_node]
    INIT --> PARALLEL{Parallel Execution}
    PARALLEL --> SPAM[validate_spam]
    PARALLEL --> TOXIC[validate_toxic]
    PARALLEL --> HARASS[validate_harassment]
    PARALLEL --> FINANCIAL[validate_financial]
    SPAM --> AGG[aggregator_node]
    TOXIC --> AGG
    HARASS --> AGG
    FINANCIAL --> AGG
    AGG --> DECISION{Any Violations?}
    DECISION -->|No| SUCCESS[success_response_node]
    DECISION -->|Yes| FAILED[response_generator_node]
    SUCCESS --> END([END])
    FAILED --> END
```
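The fan-out/fan-in shape above can be sketched independently of LangGraph with plain `asyncio`. The validator bodies below are toy keyword rules standing in for the real LLM-backed checks, and all function names here are illustrative:

```python
import asyncio

async def validate(check: str, question: str) -> tuple[str, bool]:
    """Stand-in validator: returns (check_key, passed)."""
    await asyncio.sleep(0.01)  # simulate per-check model latency
    # Toy rule: a check fails if its keyword appears in the question
    keywords = {"spam": "buy now", "toxic": "stupid",
                "harassment": "idiot", "financial_advice": "which stock"}
    return check, keywords[check] not in question.lower()

async def run_guards(question: str) -> dict[str, bool]:
    checks = ["spam", "toxic", "harassment", "financial_advice"]
    # Fan-out: every check starts at once; fan-in: gather aggregates results.
    # Total latency is bounded by the slowest check, not the sum of all checks.
    results = await asyncio.gather(*(validate(c, question) for c in checks))
    return dict(results)

results = asyncio.run(run_guards("buy now!! which stock should I pick?"))
print(results)  # spam and financial_advice fail; the others pass
```

In the real service, LangGraph performs the same fan-out by wiring each validator node from a common start node and the fan-in through the aggregator node.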
Each node in the graph processes and updates a shared GraphState, a TypedDict that ensures data consistency throughout the workflow.
```python
class GraphState(TypedDict):
    user_id: str
    question: str
    validation_results: Annotated[Dict[str, bool], combine_validation_results]
    rejection_reason: str
    final_status: str
    ai_response: str
```

- `user_id`: Unique identifier for the user.
- `question`: The user-provided text to be validated.
- `validation_results`: A dictionary that accumulates the boolean results from each validator node.
- `rejection_reason`: A comma-separated string of failed validation keys.
- `final_status`: The overall outcome, either `"passed"` or `"failed"`.
- `ai_response`: The AI-generated message for the end user, populated only if validation fails.
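The reducer attached to `validation_results` lets parallel branches write to the same state key without clobbering each other: LangGraph calls it once per branch to merge partial results. A minimal sketch of what `combine_validation_results` might look like (the project's actual implementation may differ):

```python
from typing import Dict, Optional

def combine_validation_results(
    existing: Optional[Dict[str, bool]], new: Optional[Dict[str, bool]]
) -> Dict[str, bool]:
    # Merge each validator's partial {check_key: passed} dict into the
    # accumulated results; later writes win on (unexpected) key collisions.
    return {**(existing or {}), **(new or {})}

merged = combine_validation_results({"spam": True}, {"toxic": False})
print(merged)  # {'spam': True, 'toxic': False}
```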
Follow these steps to get the AEGIS Shield API running.
- Python 3.10+
- NVIDIA GPU with CUDA support for local model hosting
Install the required Python libraries.
```bash
pip install -r requirements.txt
```

Upon first run, the script will download the `google/gemma-3-4b-it` model into a `model_cache` directory. This may take some time depending on your internet connection.
Start the FastAPI server using Uvicorn.
```bash
python main.py
```

The API will be live and accessible at `http://localhost:8000`.
The `/validate` endpoint executes the complete parallel validation workflow and returns a detailed report.
| Field | Type | Description | Required? |
|---|---|---|---|
| `user_id` | string | A unique identifier for the user. | Yes |
| `question` | string | The text content to be validated. | Yes |
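As a quick client-side sketch, the request body above can be built and checked like this. The `build_validate_request` helper is hypothetical, not part of the project:

```python
import json

def build_validate_request(user_id: str, question: str) -> str:
    """Serialize a /validate request body, enforcing the two required fields."""
    if not user_id or not question:
        raise ValueError("user_id and question are both required")
    return json.dumps({"user_id": user_id, "question": question})

body = build_validate_request("user-123", "Bagaimana cara top up saldo?")
print(body)
```

The resulting string can be sent as the JSON body of an HTTP POST to the running service.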
For FAILED validation:

```json
{
  "user_id": "user-123",
  "question": "HAIIII BODOH",
  "case": "spam, toxic, harassment",
  "execution_per_step": [
    { "step": "spam", "status": "failed" },
    { "step": "toxic", "status": "failed" },
    { "step": "harassment", "status": "failed" },
    { "step": "financial_advice", "status": "passed" }
  ],
  "status_guard": "failed",
  "ai_response": "Maaf, saya tidak dapat membantu permintaan terkait spam karena tidak sesuai dengan kebijakan kami."
}
```

For PASSED validation:
```json
{
  "user_id": "user-456",
  "question": "Bagaimana cara top up saldo?",
  "case": "aman",
  "execution_per_step": [
    { "step": "spam", "status": "passed" },
    { "step": "toxic", "status": "passed" },
    { "step": "harassment", "status": "passed" },
    { "step": "financial_advice", "status": "passed" }
  ],
  "status_guard": "passed",
  "ai_response": ""
}
```

A simplified endpoint that provides a direct, user-facing AI response. It is ideal for integrations where only the final outcome is needed.
```json
{
  "user_id": "user-123",
  "question": "Your question here"
}
```

For FAILED validation:
```json
{
  "response": "Mohon maaf, saya tidak dapat membantu dengan bahasa yang tidak sopan...",
  "status": "failed",
  "can_proceed": false
}
```

For PASSED validation:
```json
{
  "response": "",
  "status": "passed",
  "can_proceed": true
}
```

A standard health check endpoint for service monitoring.
```json
{
  "status": "healthy",
  "message": "Guardian Validation API is running"
}
```

Adding a new validation check is straightforward:
1. Create a prompt template: Add a new text file to the `promp_guard/` directory.

   ```bash
   echo "Your new validation prompt here" > promp_guard/new_validator.txt
   ```

2. Update configuration: Add the new check to the `VALIDATION_CHECKS` list in `config.py`.

   ```python
   # In config.py
   VALIDATION_CHECKS = [
       # ... existing checks
       {"key": "new_validator", "path": str(CURRENT_DIR / "promp_guard/new_validator.txt")},
   ]
   ```
The graph in graph.py will automatically create and integrate the new validation node.
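A hedged sketch of how that automatic registration can work: one node per entry in `VALIDATION_CHECKS`. The function names below are illustrative, and the real `graph.py` builds LangGraph nodes that prompt the LLM rather than plain callables:

```python
from typing import Callable, Dict, List

def make_validator(check: Dict[str, str]) -> Callable[[str], Dict[str, bool]]:
    # In the real system this would load the prompt file at check["path"]
    # and invoke the validator LLM; here it is a placeholder that passes.
    def node(question: str) -> Dict[str, bool]:
        return {check["key"]: True}
    return node

def build_validator_nodes(checks: List[Dict[str, str]]) -> Dict[str, Callable]:
    # Deriving nodes from config means adding a VALIDATION_CHECKS entry
    # is enough to add a validator to the graph.
    return {f"validate_{c['key']}": make_validator(c) for c in checks}

checks = [
    {"key": "spam", "path": "promp_guard/spam.txt"},
    {"key": "new_validator", "path": "promp_guard/new_validator.txt"},
]
nodes = build_validator_nodes(checks)
print(sorted(nodes))  # ['validate_new_validator', 'validate_spam']
```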
The model can be changed by updating the `MODEL_ID` in `config.py`. The `ManualGemmaInvoker` class is specifically designed for Gemma-based models but could be adapted for other Hugging Face transformer models.
- 🔄 Parallel Processing: All validation checks run simultaneously, offering significant performance gains over sequential processing.
- 🛡️ Comprehensive Protection: Employs multiple specialized guards against spam, toxicity, harassment, and unsolicited financial advice.
- 🤖 AI-Powered Responses: Generates natural, contextual rejection messages that politely enforce content policies.
- 📊 Detailed Reporting: The `/validate` endpoint provides a complete breakdown of which checks passed or failed, for full visibility.
- 🔧 Developer-Friendly: Features a clean RESTful API, easy customization, and a clear separation of concerns in the codebase.
- ⚡ Scalable: Because checks run in parallel, overall latency stays roughly flat as new validators are added (it is bounded by the slowest single check), enabling seamless scaling.