AEGIS Shield is a modular, high-performance validation service engineered with LangGraph to perform parallel input validation using a suite of AI-powered evaluators. It acts as an intelligent guardian for larger AI systems, designed to preemptively identify and neutralize potentially harmful or non-compliant content, including spam, toxic language, harassment, and unsolicited financial advice.
The system's core strength lies in its parallel architecture, which ensures that all validation checks are executed simultaneously. This approach dramatically reduces latency compared to traditional sequential methods and guarantees a comprehensive evaluation of user input against all defined safety rails before it reaches the main application logic.
The AEGIS Shield system utilizes the google/gemma-3-4b-it model as its core Large Language Model (LLM) for all validation and response generation tasks. This specific model is configured and loaded in the config.py file.
- Model ID: The system is explicitly configured with `MODEL_ID = "google/gemma-3-4b-it"`.
- Loading process: The model is loaded with 4-bit quantization (`load_in_4bit=True`) onto a GPU (`device_map="auto"`) to optimize performance and reduce memory usage. It uses the `float16` data type for both storage and computation (`torch_dtype=torch.float16`, `bnb_4bit_compute_dtype=torch.float16`).
- Custom invoker: A custom class, `ManualGemmaInvoker`, handles inference. It manages tokenization, templating, generation, and response cleaning specifically for the Gemma model's chat format.
- Shared instance: A single instance of the loaded model serves as both the `validator_llm` for content checks and the `response_llm` for generating rejection messages.
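A hedged sketch of what this configuration might look like in `config.py`, using the standard `transformers` + `bitsandbytes` stack. Only `MODEL_ID` and the quantization settings are stated in this document; the remaining variable names and the `model_cache` directory layout are assumptions.

```python
# config.py — illustrative sketch, not the project's exact code
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "google/gemma-3-4b-it"

# 4-bit quantization with float16 compute, matching the settings described above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, cache_dir="model_cache")
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",          # place layers on the available GPU(s)
    torch_dtype=torch.float16,  # float16 storage dtype
    cache_dir="model_cache",
)

# One shared model instance backs both roles
validator_llm = response_llm = model
```

This is a configuration fragment that requires a CUDA-capable GPU and a model download, so it is shown for orientation rather than direct execution.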
The service is constructed with a modular LangGraph architecture that facilitates the parallel execution of multiple validation checks. This design champions high performance, scalability, and maintainability.
```mermaid
graph TD
    A[Client App / Main AI] -- HTTP POST --> B{FastAPI Endpoint};
    B -- user_id, question --> C[LangGraph State Manager];
    C -- Parallel Execution --> D[validate_spam];
    C -- Parallel Execution --> E[validate_toxic];
    C -- Parallel Execution --> F[validate_harassment];
    C -- Parallel Execution --> G[validate_financial];
    D --> H[Results Aggregator];
    E --> H;
    F --> H;
    G --> H;
    H --> I{All Validations Passed?};
    I -- Yes --> J[Success Response];
    I -- No --> K[AI Response Generator];
    K --> L[Rejection Response];
    J --> M[JSON Response];
    L --> M;
    M --> A;

    subgraph "AEGIS Shield Service"
        B
        C
        D
        E
        F
        G
        H
        I
        J
        K
        L
    end
```
The project is organized with a clear separation of concerns to enhance maintainability and scalability.
```
aegis-shield/
├── main.py             # FastAPI application server and API endpoints
├── requirements.txt    # Python dependencies
├── config.py           # Model configuration, LLM invoker, and validation check definitions
├── nodes.py            # LangGraph node logic for validation, aggregation, and response generation
├── graph.py            # LangGraph state management and workflow definition
└── promp_guard/        # Directory for validation prompt templates
    ├── spam.txt        # Prompts for spam detection
    ├── toxic.txt       # Prompts for toxic content detection
    ├── harassment.txt  # Prompts for harassment detection
    └── financial.txt   # Prompts for financial advice detection
```
The core of AEGIS Shield is the parallel execution of all validation checks, which significantly boosts performance over sequential validation.
```mermaid
graph LR
    START([START]) --> INIT[start_node]
    INIT --> PARALLEL{Parallel Execution}
    PARALLEL --> SPAM[validate_spam]
    PARALLEL --> TOXIC[validate_toxic]
    PARALLEL --> HARASS[validate_harassment]
    PARALLEL --> FINANCIAL[validate_financial]
    SPAM --> AGG[aggregator_node]
    TOXIC --> AGG
    HARASS --> AGG
    FINANCIAL --> AGG
    AGG --> DECISION{Any Violations?}
    DECISION -->|No| SUCCESS[success_response_node]
    DECISION -->|Yes| FAILED[response_generator_node]
    SUCCESS --> END([END])
    FAILED --> END
```
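The fan-out/fan-in shape above can be sketched independently of LangGraph with plain `asyncio`. The validator bodies below are toy keyword rules standing in for the real LLM-backed checks, and all function names here are illustrative:

```python
import asyncio

async def validate(check: str, question: str) -> tuple[str, bool]:
    """Stand-in validator: returns (check_key, passed)."""
    await asyncio.sleep(0.01)  # simulate per-check model latency
    # Toy rule: a check fails if its keyword appears in the question
    keywords = {"spam": "buy now", "toxic": "stupid",
                "harassment": "idiot", "financial_advice": "which stock"}
    return check, keywords[check] not in question.lower()

async def run_guards(question: str) -> dict[str, bool]:
    checks = ["spam", "toxic", "harassment", "financial_advice"]
    # Fan-out: every check starts at once; fan-in: gather aggregates results.
    # Total latency is bounded by the slowest check, not the sum of all checks.
    results = await asyncio.gather(*(validate(c, question) for c in checks))
    return dict(results)

results = asyncio.run(run_guards("buy now!! which stock should I pick?"))
print(results)  # spam and financial_advice fail; the others pass
```

In the real service, LangGraph performs the same fan-out by wiring each validator node from a common start node and the fan-in through the aggregator node.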
Each node in the graph processes and updates a shared GraphState, a TypedDict that ensures data consistency throughout the workflow.
```python
class GraphState(TypedDict):
    user_id: str
    question: str
    validation_results: Annotated[Dict[str, bool], combine_validation_results]
    rejection_reason: str
    final_status: str
    ai_response: str
```

- `user_id`: Unique identifier for the user.
- `question`: The user-provided text to be validated.
- `validation_results`: A dictionary that accumulates the boolean results from each validator node.
- `rejection_reason`: A comma-separated string of failed validation keys.
- `final_status`: The overall outcome, either `"passed"` or `"failed"`.
- `ai_response`: The AI-generated message for the end user, populated only if validation fails.
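The reducer attached to `validation_results` lets parallel branches write to the same state key without clobbering each other: LangGraph calls it once per branch to merge partial results. A minimal sketch of what `combine_validation_results` might look like (the project's actual implementation may differ):

```python
from typing import Dict, Optional

def combine_validation_results(
    existing: Optional[Dict[str, bool]], new: Optional[Dict[str, bool]]
) -> Dict[str, bool]:
    # Merge each validator's partial {check_key: passed} dict into the
    # accumulated results; later writes win on (unexpected) key collisions.
    return {**(existing or {}), **(new or {})}

merged = combine_validation_results({"spam": True}, {"toxic": False})
print(merged)  # {'spam': True, 'toxic': False}
```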
Follow these steps to get the AEGIS Shield API running.
- Python 3.10+
- NVIDIA GPU with CUDA support for local model hosting
Install the required Python libraries.
```bash
pip install -r requirements.txt
```

Upon first run, the script will download the `google/gemma-3-4b-it` model into a `model_cache` directory. This may take some time depending on your internet connection.
Start the FastAPI server using Uvicorn.
```bash
python main.py
```

The API will be live and accessible at `http://localhost:8000`.
The `/validate` endpoint executes the complete parallel validation workflow and returns a detailed report.
| Field | Type | Description | Required? |
|---|---|---|---|
| `user_id` | string | A unique identifier for the user. | Yes |
| `question` | string | The text content to be validated. | Yes |
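As a quick client-side sketch, the request body above can be built and checked like this. The `build_validate_request` helper is hypothetical, not part of the project:

```python
import json

def build_validate_request(user_id: str, question: str) -> str:
    """Serialize a /validate request body, enforcing the two required fields."""
    if not user_id or not question:
        raise ValueError("user_id and question are both required")
    return json.dumps({"user_id": user_id, "question": question})

body = build_validate_request("user-123", "Bagaimana cara top up saldo?")
print(body)
```

The resulting string can be sent as the JSON body of an HTTP POST to the running service.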
For FAILED validation:

```json
{
  "user_id": "user-123",
  "question": "HAIIII BODOH",
  "case": "spam, toxic, harassment",
  "execution_per_step": [
    { "step": "spam", "status": "failed" },
    { "step": "toxic", "status": "failed" },
    { "step": "harassment", "status": "failed" },
    { "step": "financial_advice", "status": "passed" }
  ],
  "status_guard": "failed",
  "ai_response": "Maaf, saya tidak dapat membantu permintaan terkait spam karena tidak sesuai dengan kebijakan kami."
}
```

For PASSED validation:
```json
{
  "user_id": "user-456",
  "question": "Bagaimana cara top up saldo?",
  "case": "aman",
  "execution_per_step": [
    { "step": "spam", "status": "passed" },
    { "step": "toxic", "status": "passed" },
    { "step": "harassment", "status": "passed" },
    { "step": "financial_advice", "status": "passed" }
  ],
  "status_guard": "passed",
  "ai_response": ""
}
```

A simplified endpoint that provides a direct, user-facing AI response. It is ideal for integrations where only the final outcome is needed.
```json
{
  "user_id": "user-123",
  "question": "Your question here"
}
```

For FAILED validation:
```json
{
  "response": "Mohon maaf, saya tidak dapat membantu dengan bahasa yang tidak sopan...",
  "status": "failed",
  "can_proceed": false
}
```

For PASSED validation:
```json
{
  "response": "",
  "status": "passed",
  "can_proceed": true
}
```

A standard health check endpoint for service monitoring.
```json
{
  "status": "healthy",
  "message": "Guardian Validation API is running"
}
```

Adding a new validation check is straightforward:
1. Create a prompt template: Add a new text file to the `promp_guard/` directory.

   ```bash
   echo "Your new validation prompt here" > promp_guard/new_validator.txt
   ```

2. Update configuration: Add the new check to the `VALIDATION_CHECKS` list in `config.py`.

   ```python
   # In config.py
   VALIDATION_CHECKS = [
       # ... existing checks
       {"key": "new_validator", "path": str(CURRENT_DIR / "promp_guard/new_validator.txt")},
   ]
   ```
The graph in graph.py will automatically create and integrate the new validation node.
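A hedged sketch of how that automatic registration can work: one node per entry in `VALIDATION_CHECKS`. The function names below are illustrative, and the real `graph.py` builds LangGraph nodes that prompt the LLM rather than plain callables:

```python
from typing import Callable, Dict, List

def make_validator(check: Dict[str, str]) -> Callable[[str], Dict[str, bool]]:
    # In the real system this would load the prompt file at check["path"]
    # and invoke the validator LLM; here it is a placeholder that passes.
    def node(question: str) -> Dict[str, bool]:
        return {check["key"]: True}
    return node

def build_validator_nodes(checks: List[Dict[str, str]]) -> Dict[str, Callable]:
    # Deriving nodes from config means adding a VALIDATION_CHECKS entry
    # is enough to add a validator to the graph.
    return {f"validate_{c['key']}": make_validator(c) for c in checks}

checks = [
    {"key": "spam", "path": "promp_guard/spam.txt"},
    {"key": "new_validator", "path": "promp_guard/new_validator.txt"},
]
nodes = build_validator_nodes(checks)
print(sorted(nodes))  # ['validate_new_validator', 'validate_spam']
```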
The model can be changed by updating the `MODEL_ID` in `config.py`. The `ManualGemmaInvoker` class is specifically designed for Gemma-based models but could be adapted for other Hugging Face transformer models.
- 🔄 Parallel Processing: All validation checks run simultaneously, offering significant performance gains over sequential processing.
- 🛡️ Comprehensive Protection: Employs multiple specialized guards against spam, toxicity, harassment, and unsolicited financial advice.
- 🤖 AI-Powered Responses: Generates natural, contextual rejection messages that politely enforce content policies.
- 📊 Detailed Reporting: The `/validate` endpoint provides a complete breakdown of which checks passed or failed, for full visibility.
- 🔧 Developer-Friendly: Features a clean RESTful API, easy customization, and a clear separation of concerns in the codebase.
- ⚡ Scalable: Because checks run in parallel, overall latency stays roughly flat as new validators are added (it is bounded by the slowest single check), enabling seamless scaling.