
Guardrail: Integration #575

Merged
nishika26 merged 10 commits into ProjectTech4DevAI:main from tattle-made:feat/guardrails-integration on Feb 6, 2026

Conversation

@rkritika1508 (Collaborator) commented Feb 3, 2026

Summary

Target issue is #16
What existing problem does the pull request solve?
We have integrated the Guardrails APIs with kaapi-backend. The LLM call endpoint now accepts two additional parameters, input_guardrails and output_guardrails, which list the validators to apply to the input and the output respectively. A call is then made to the guardrails endpoint in Kaapi Guardrails to validate the user input and the LLM output.
Here is how to test:

  1. Add KAAPI_GUARDRAILS_AUTH="<add-auth-token>" in .env.
  2. In the LLM call endpoint, update the request body. For example:
{
  "query": {
    "input": "Mai pregnant hu aur janna chahti hu ki ladka hoga ya ladka. Aaspaas kahi sonography karwa sakti hu kya?",
    "conversation": {
      "id": null,
      "auto_create": true
    }
  },
  "input_guardrails": [
    {
        "type": "uli_slur_match",
        "severity": "all"
    },
    {
        "type": "ban_list",
        "banned_words": [
            "sonography"
        ],
        "on_fail": "fix"
    }
  ],
  "output_guardrails": [  ],
  "config": {
    "id": "55f178d1-3ff7-4c2a-be23-9d9eae85cf80",
    "version": "1"
  },
  "callback_url": "https://play.svix.com/in/e_5I971H1PISMQn4UKWLBJJUpYmg2",
  "include_provider_raw_response": true,
  "request_metadata": { }
}
  3. Check the validator responses and their statuses in the validator_log table.
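
Roughly, the server-side call made during job execution can be sketched as below. This is a sketch only: the endpoint path, request payload field names, and auth header scheme are assumptions for illustration; the KAAPI_GUARDRAILS_AUTH/KAAPI_GUARDRAILS_URL settings and the bypass-on-failure behaviour are the parts taken from this PR.

from typing import Any
from uuid import UUID

import httpx

from app.core.config import settings  # assumed location of the settings object


def call_guardrails(input_text: str, guardrail_config: list[dict[str, Any]], job_id: UUID) -> dict[str, Any]:
    """Validate input_text against the configured validators via the Kaapi guardrails service."""
    try:
        response = httpx.post(
            settings.KAAPI_GUARDRAILS_URL,  # endpoint/payload shape is an assumption
            headers={"Authorization": f"Bearer {settings.KAAPI_GUARDRAILS_AUTH}"},
            json={"text": input_text, "validators": guardrail_config, "job_id": str(job_id)},
            timeout=10.0,
        )
        response.raise_for_status()
        return response.json()
    except Exception as e:
        # If the guardrails service is unreachable, bypass it rather than fail the job.
        return {"success": False, "bypassed": True, "error": str(e)}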

Checklist

Before submitting a pull request, please ensure that you complete these tasks:

  • Ran fastapi run --reload app/main.py or docker compose up in the repository root and tested.
  • If you've fixed a bug or added code, it is tested and has test cases.

Notes

Please add any other information the reviewer might need here.

Summary by CodeRabbit

  • New Features

    • Added optional input and output guardrails for LLM requests to validate, sanitize, or redact content before and after processing; supports bypass and rephrase handling.
    • Integrated guardrails service invocation to evaluate inputs/outputs when configured.
  • Chores

    • Added environment variable placeholders for guardrails service authentication and URL.
  • Tests

    • Added extensive tests covering guardrails success, bypass, sanitize, timeout/connect failures, and blocking scenarios.

@coderabbitai

This comment was marked as resolved.

@coderabbitai bot left a comment

Actionable comments posted: 5

🤖 Fix all issues with AI agents
In @.env.example:
- Around line 83-84: Add a final newline to the end of the file so the
end-of-file-fixer CI hook passes; edit the file containing the OPENAI_API_KEY
and KAAPI_GUARDRAILS_AUTH lines and ensure there is a blank line (trailing
newline) after the last line.

In `@backend/app/services/llm/jobs.py`:
- Around line 140-155: The logger currently prints raw sensitive values in
execute_job (logger.info) for input_query, input_guardrails and safe_input;
change these logs to avoid raw payloads by using a sanitizer (e.g., mask_string)
and/or logging only metadata (lengths/counts) and follow the guideline to prefix
messages with the function name in square brackets: replace direct
interpolations of input_query/input_guardrails/safe_input in the logger.info
calls inside execute_job and after call_guardrails with masked or metadata forms
(e.g., mask_string(input_query) or len(input_query)) and keep the
call_guardrails/job_id references intact so you still log context without
exposing the raw content.
- Around line 312-333: call_guardrails currently has no return type and uses an
untyped dict; update the signature to add precise typing (e.g., def
call_guardrails(input_text: str, guardrail_config: list[dict[str, Any]], job_id:
UUID) -> dict[str, Any]) and add the necessary typing import (from typing import
Any) at the top of the module so guardrail_config and the return value are
explicitly typed; keep the function body unchanged apart from the signature and
import.
- Around line 156-176: The guardrails branch currently treats success with
rephrase_needed=True as a failure and returns an error; update the logic around
safe_input in jobs.py so that if safe_input["success"] is True you always set
request.query.input = safe_input["data"]["safe_text"] and only return
handle_job_error when safe_input["success"] is False (use boolean checks like if
safe_input["success"] and not safe_input["data"]["rephrase_needed"] / elif
safe_input["success"] and safe_input["data"]["rephrase_needed"] should be
collapsed so both success paths continue), remove the trailing whitespace, and
ensure failures call APIResponse.failure_response and handle_job_error with
safe_input["error"] as before; also add a return type hint to the
call_guardrails function signature (change def call_guardrails(...) -> dict:) so
typing requirements are satisfied.
- Around line 262-272: The output-guardrails branch currently treats failures as
success, logs raw sensitive output_text, and uses a non-idiomatic comparison;
fix by checking truthiness of safe_output (if safe_output["success"] is truthy)
and on failure call handle_job_error(job_id, reason=<appropriate message>) to
mark the job failed (mirror input_guardrails behavior), set
response.response.output.text to a masked or error message, and remove/raw
output_text from logs—use mask_string(output_text) when logging and avoid
printing safe_output contents; update the logger call in execute_job to log only
masked output and a minimal status.
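
For illustration, the output-guardrails branch these prompts describe might look roughly like the sketch below. The variable names, mask_string, and the handle_job_error call shape are taken from the prompts above rather than from the code itself.

safe_output = call_guardrails(output_text, output_guardrails, job_id)
if safe_output.get("success"):
    # Use the sanitized output when the service returned one.
    response.response.output.text = safe_output.get("data", {}).get("safe_text", output_text)
else:
    # Log only masked content and mark the job failed, mirroring the input path.
    logger.warning(
        f"[execute_job] Output guardrails failed: job_id={job_id}, output={mask_string(output_text)}"
    )
    response.response.output.text = "Output failed guardrails validation."
    return handle_job_error(job_id, reason=safe_output.get("error", "Output guardrails failed"))
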
🧹 Nitpick comments (1)
backend/app/models/llm/request.py (1)

211-224: Consider a typed guardrails schema for early validation.

list[dict[str, Any]] is flexible but unvalidated; a small Pydantic/SQLModel model for validator type/on_fail/config would surface malformed requests earlier once the schema is stable.
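
For example, such a schema might look like the sketch below; the optional fields mirror the example payload in the PR description, and the on_fail values accepted by the service are not a confirmed contract.

from typing import Optional

from pydantic import BaseModel


class GuardrailConfig(BaseModel):
    type: str                                  # e.g. "uli_slur_match", "ban_list"
    severity: Optional[str] = None             # used by uli_slur_match in the example payload
    banned_words: Optional[list[str]] = None   # used by ban_list in the example payload
    on_fail: Optional[str] = None              # e.g. "fix"; other values depend on the service

The request model could then declare input_guardrails: list[GuardrailConfig] instead of list[dict[str, Any]].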

@AkhileshNegi added the enhancement (New feature or request) label on Feb 4, 2026
@coderabbitai bot left a comment

Actionable comments posted: 5

🤖 Fix all issues with AI agents
In @.env.example:
- Around line 85-86: The two environment variable lines use inconsistent spacing
around the equals sign; update the KAAPI_GUARDRAILS_AUTH and
KAAPI_GUARDRAILS_URL entries to remove spaces around '=' so they match the
file's style (e.g., change 'KAAPI_GUARDRAILS_URL = ""' to
'KAAPI_GUARDRAILS_URL=""') and ensure no trailing whitespace.

In @.env.test.example:
- Around line 38-40: Remove the trailing-whitespace and ensure the file ends
with a newline, and make the KAAPI_GUARDRAILS_URL line consistent with
KAAPI_GUARDRAILS_AUTH by removing spaces around the "=" so both use the format
KEY="". Specifically edit the lines defining KAAPI_GUARDRAILS_AUTH and
KAAPI_GUARDRAILS_URL to use no spaces around the equals sign and add a final
newline at EOF.

In `@backend/app/services/llm/guardrail.py`:
- Around line 49-52: Remove the trailing whitespace after "except Exception as
e:" and update the logger.warning call to prefix the message with the enclosing
function's name in square brackets (replace the literal "[guardrails]" with the
actual function name), e.g. logger.warning(f"[<function_name>] Service
unavailable. Bypassing guardrails. job_id={job_id}. error={e}"), keeping the
rest of the message intact and still referencing job_id and the exception
variable e.

In `@backend/app/services/llm/jobs.py`:
- Around line 264-294: In execute_job, guardrails failures are being overwritten
by the final success response; modify the block that handles output_guardrails
(in jobs.py inside execute_job) so that when safe_output indicates bypassed,
rephrase_needed, or success==False you construct the
APIResponse.failure_response (use response.response.output.text for the
error/rephrase message instead of request.query.input) and immediately return it
instead of falling through; replace all `== True` checks with truthy checks
(e.g., if safe_output.get("bypassed"):) and use safe_output.get("data", {}) /
safe_output.get("error") to avoid KeyError; only reach the final
APIResponse.success_response when no failure/guardrail condition triggered.
- Around line 155-178: The code is using "== True" comparisons and indexing into
safe_input for "bypassed" which can raise KeyError; update the logic in the
execute_job block to use truthiness (e.g., if safe_input.get("bypassed"): ...)
and safe_input.get("success")/safe_input.get("data", {}).get("rephrase_needed")
to avoid KeyError, check bypassed first (fallback) then success, set
request.query.input from safe_input.get("data", {}).get("safe_text") or
safe_input.get("error") as appropriate, and preserve the existing failure path
that calls APIResponse.failure_response and handle_job_error (referencing
safe_input, request.query.input, APIResponse.failure_response, and
handle_job_error).
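
A sketch of the truthiness-based handling this prompt asks for follows; the handle_job_error call shape is an assumption based on the prompts in this review rather than the actual code.

safe_input = call_guardrails(request.query.input, input_guardrails, job_id)

if safe_input.get("bypassed"):
    # Guardrails service was unavailable; continue with the original input.
    pass
elif safe_input.get("success"):
    # Use the sanitized text when the service returned one.
    request.query.input = safe_input.get("data", {}).get("safe_text", request.query.input)
else:
    # Preserve the existing failure path.
    return handle_job_error(job_id, reason=safe_input.get("error", "Input guardrails failed"))
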
🧹 Nitpick comments (1)
backend/app/services/llm/guardrail.py (1)

12-31: Tighten type hint and fix misleading docstring.

  1. The guardrail_config parameter type could be more precise: list[dict[str, Any]].
  2. The docstring states "Raises: httpx.HTTPError" but the function catches all exceptions and returns a bypass response instead of raising.
♻️ Proposed fix
-def call_guardrails(input_text: str, guardrail_config: list[dict], job_id: UUID) -> dict[str, Any]:
+def call_guardrails(input_text: str, guardrail_config: list[dict[str, Any]], job_id: UUID) -> dict[str, Any]:
     """
     Call the Kaapi guardrails service to validate and process input text.
 
     Args:
         input_text: Text to validate and process.
         guardrail_config: List of validator configurations to apply.
         job_id: Unique identifier for the request.
 
     Returns:
         JSON response from the guardrails service with validation results.
-
-    Raises:
-        httpx.HTTPError: If the request fails.
+        On service failure, returns a bypass response with success=False and bypassed=True.
     """

codecov bot commented Feb 5, 2026

Codecov Report

❌ Patch coverage is 96.77419% with 6 lines in your changes missing coverage. Please review.

Files with missing lines           Patch %   Lines
backend/app/services/llm/jobs.py   80.64%    6 Missing ⚠️


@coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In @.env.test.example:
- Around line 35-39: The empty environment placeholders OPENAI_API_KEY,
KAAPI_GUARDRAILS_AUTH, and KAAPI_GUARDRAILS_URL are quoted which triggers
dotenv-linter's QuoteCharacter rule; remove the surrounding double quotes so
each line reads as an unquoted empty value (e.g., OPENAI_API_KEY=) to silence
the linter and keep the template consistent.

In `@backend/app/tests/api/routes/test_llm.py`:
- Around line 169-275: Both new tests test_llm_call_success_with_guardrails and
test_llm_call_guardrails_bypassed_still_succeeds are missing return type
annotations; update their signatures to include "-> None" (e.g., def
test_llm_call_success_with_guardrails(... ) -> None:) so they follow the project
typing guideline for function return types.

In `@backend/app/tests/services/llm/test_jobs.py`:
- Around line 722-907: All new guardrails test functions
(test_guardrails_sanitize_input_before_provider,
test_guardrails_sanitize_output_after_provider,
test_guardrails_bypass_does_not_modify_input,
test_guardrails_validation_failure_blocks_job,
test_guardrails_rephrase_needed_blocks_job) need explicit type hints on their
parameters and return type; update each signature to annotate db, job_env, and
job_for_execution with the same types used in other tests in this file (e.g.,
db: Session/Any, job_env: dict, job_for_execution: Callable or specific fixture
type) and add -> None as the return annotation so each test function has full
parameter and return type hints consistent with the project's typing pattern.
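
For instance, an annotated signature could look like the sketch below; Session and Callable here are assumptions, and the real fixture types should be copied from the existing tests in this file.

from collections.abc import Callable

from sqlmodel import Session


def test_guardrails_sanitize_input_before_provider(
    db: Session,
    job_env: dict,
    job_for_execution: Callable,
) -> None:
    ...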

@coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@backend/app/tests/services/llm/test_guardrails.py`:
- Around line 16-96: The tests lack type hints on the patched parameter and
return types; annotate each test function (test_call_guardrails_success,
test_call_guardrails_http_error_bypasses,
test_call_guardrails_network_failure_bypasses,
test_call_guardrails_timeout_bypasses, test_call_guardrails_uses_settings) to
accept mock_client_cls: Mock and return -> None, and add the necessary import
(from unittest.mock import Mock) at the top so the annotations resolve.

@rkritika1508 (Collaborator, Author) commented:
@coderabbitai full review

@coderabbitai bot commented Feb 5, 2026

✅ Actions performed

Full review triggered.

@coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@backend/app/tests/api/routes/test_llm.py`:
- Around line 223-274: The test claims to verify guardrails-bypassed behavior
but never asserts that call_guardrails was invoked or that its bypassed=True was
propagated; update test_llm_call_guardrails_bypassed_still_succeeds to assert
mock_guardrails.assert_called_once() and inspect mock_guardrails.call_args to
confirm the request input (e.g., "What is the capital of France?") and the
provided input_guardrails were passed; if mocking start_high_priority_job
prevents call_guardrails from running, remove or adjust that mock so
call_guardrails is exercised (or explicitly assert call order: mock_guardrails
called before start_high_priority_job) and add an assertion that the response
handling used the bypassed flag (for example by confirming
start_high_priority_job was still called after a bypass).
- Around line 174-220: The test patches of call_guardrails are unreachable
because start_high_priority_job is mocked and the job (execute_job) never runs;
remove the patch/return_value and the mock_guardrails.assert_not_called() in the
first test (references: start_high_priority_job, call_guardrails, execute_job)
so you don't set dead expectations; for the second test either remove the
call_guardrails patch altogether or (if you intend to assert guardrails were
invoked) stop mocking start_high_priority_job and instead let execute_job run or
explicitly call the job helper, then add mock_guardrails.assert_called_once();
finally, normalize the endpoint path to match existing tests (use the same
"api/v1/llm/call" convention without a leading slash).
🧹 Nitpick comments (2)
backend/app/tests/api/routes/test_llm.py (2)

207-211: Inconsistent URL path — new tests use a leading / while existing tests don't.

Lines 208 and 263 use "/api/v1/llm/call" while all existing tests (lines 40, 78, 114, 135, 160) use "api/v1/llm/call". Both work with TestClient, but pick one style for consistency.

Proposed fix
-            "/api/v1/llm/call",
+            "api/v1/llm/call",

189-205: Consider extracting a payload factory to reduce duplication.

Both new tests (and several existing ones) construct near-identical LLMCallRequest payloads inline. A small helper/factory that produces a default payload with optional overrides would reduce boilerplate and make tests easier to maintain.

def make_llm_call_request(**overrides) -> LLMCallRequest:
    defaults = dict(
        query=QueryParams(input="What is the capital of France?"),
        config=LLMCallConfig(
            blob=ConfigBlob(
                completion=NativeCompletionConfig(
                    provider="openai-native",
                    params={"model": "gpt-4o", "temperature": 0.7},
                )
            )
        ),
        callback_url="https://example.com/callback",
    )
    defaults.update(overrides)
    return LLMCallRequest(**defaults)

As per coding guidelines, backend/app/tests/**/*.py: Use factory pattern for test fixtures in backend/app/tests/.

Also applies to: 244-260
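
If such a factory were adopted, the guardrails tests could build their payloads with overrides only (hypothetical usage of the make_llm_call_request sketch above):

payload = make_llm_call_request(
    input_guardrails=[{"type": "pii_remover"}],
    output_guardrails=[],
)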

Comment on lines +174 to +220
with (
    patch("app.services.llm.jobs.start_high_priority_job") as mock_start_job,
    patch("app.services.llm.guardrails.call_guardrails") as mock_guardrails,
):
    mock_start_job.return_value = "test-task-id"

    mock_guardrails.return_value = {
        "success": True,
        "bypassed": False,
        "data": {
            "safe_text": "What is the capital of France?",
            "rephrase_needed": False,
        },
    }

    payload = LLMCallRequest(
        query=QueryParams(input="What is the capital of France?"),
        config=LLMCallConfig(
            blob=ConfigBlob(
                completion=NativeCompletionConfig(
                    provider="openai-native",
                    params={
                        "model": "gpt-4o",
                        "temperature": 0.7,
                    },
                )
            )
        ),
        input_guardrails=[],
        output_guardrails=[],
        callback_url="https://example.com/callback",
    )

    response = client.post(
        "/api/v1/llm/call",
        json=payload.model_dump(mode="json"),
        headers=user_api_key_header,
    )

    assert response.status_code == 200

    body = response.json()
    assert body["success"] is True
    assert "response is being generated" in body["data"]["message"]

    mock_start_job.assert_called_once()
    mock_guardrails.assert_not_called()

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
echo "=== Searching for call_guardrails invocations in non-test code ==="
rg -n "call_guardrails" --type=py -g '!**/test*' -B 2 -A 2

echo ""
echo "=== Searching for the llm route handler ==="
fd -e py -path "*/routes/*llm*" --type f | head -20

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 2325


🏁 Script executed:

#!/bin/bash
echo "=== Finding llm route file ==="
find . -path "*/routes*" -name "*.py" | grep -i llm

echo ""
echo "=== Looking at route handler structure ==="
fd llm -e py -path "*routes*" 

echo ""
echo "=== Check app structure ==="
find backend/app -type d -name routes

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 491


🏁 Script executed:

#!/bin/bash
echo "=== Reading the LLM route handler ==="
wc -l backend/app/api/routes/llm.py

echo ""
echo "=== Finding the call endpoint in route handler ==="
rg -n "def.*call" backend/app/api/routes/llm.py -A 30

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 1648


🏁 Script executed:

#!/bin/bash
echo "=== Full content of llm.py route handler ==="
cat -n backend/app/api/routes/llm.py

echo ""
echo "=== Finding start_job function ==="
rg -n "def start_job" --type py -g '!**/test*' -A 10

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 6825


🏁 Script executed:

#!/bin/bash
echo "=== Checking URL paths used in existing tests ==="
rg -n 'client.post.*llm/call' backend/app/tests/api/routes/test_llm.py

echo ""
echo "=== Verify: does TestClient care about leading slash? ==="
rg -n 'client.post' backend/app/tests/api/routes/test_llm.py | head -10

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 425


🏁 Script executed:

#!/bin/bash
echo "=== Get the exact POST URLs from test file ==="
rg -n 'client.post\(' backend/app/tests/api/routes/test_llm.py -A 1 | grep -E '(client.post|"/)' | head -20

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 435


🏁 Script executed:

#!/bin/bash
echo "=== Get exact URLs from all test POST calls ==="
sed -n '39,41p; 77,79p; 113,115p; 134,136p; 159,161p; 207,209p; 262,264p' backend/app/tests/api/routes/test_llm.py

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 843


🏁 Script executed:

#!/bin/bash
echo "=== Check the second test function (lines 223-274) ==="
sed -n '223,274p' backend/app/tests/api/routes/test_llm.py

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 1878


Remove unreachable mock_guardrails patches — they never execute when start_high_priority_job is mocked.

call_guardrails is invoked inside execute_job (backend/app/services/llm/jobs.py lines 150, 265), which runs asynchronously via start_high_priority_job. When start_high_priority_job is mocked in the test, the job never executes, making any mock_guardrails setup and assertions trivially true/false and misleading.

First test (lines 174–220): The mock_guardrails.return_value setup (lines 180–187) is dead code. Remove the call_guardrails patch and assertion.

Second test (lines 223–274): Similar dead mock setup. Additionally, this test is missing mock_guardrails.assert_called_once() — if the intent is to verify that guardrails are invoked when input_guardrails is non-empty, add the assertion; otherwise, remove the patch entirely.

Minor: New tests use /api/v1/llm/call (leading slash) while existing tests use api/v1/llm/call (no slash). Align with existing convention.
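
If the tests are reworked so the job actually executes (for example by not mocking start_high_priority_job, or by driving the job helper directly in a service-level test), the missing assertions might look like this sketch:

mock_guardrails.assert_called_once()
called_args, called_kwargs = mock_guardrails.call_args
assert "What is the capital of France?" in str(called_args) + str(called_kwargs)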


Comment on lines +223 to +274
def test_llm_call_guardrails_bypassed_still_succeeds(
    client: TestClient,
    user_api_key_header: dict[str, str],
) -> None:
    """If guardrails service is unavailable (bypassed), request should still succeed."""

    with (
        patch("app.services.llm.jobs.start_high_priority_job") as mock_start_job,
        patch("app.services.llm.guardrails.call_guardrails") as mock_guardrails,
    ):
        mock_start_job.return_value = "test-task-id"

        mock_guardrails.return_value = {
            "success": True,
            "bypassed": True,
            "data": {
                "safe_text": "What is the capital of France?",
                "rephrase_needed": False,
            },
        }

        payload = LLMCallRequest(
            query=QueryParams(input="What is the capital of France?"),
            config=LLMCallConfig(
                blob=ConfigBlob(
                    completion=NativeCompletionConfig(
                        provider="openai-native",
                        params={
                            "model": "gpt-4",
                            "temperature": 0.7,
                        },
                    )
                )
            ),
            input_guardrails=[{"type": "pii_remover"}],
            output_guardrails=[],
            callback_url="https://example.com/callback",
        )

        response = client.post(
            "/api/v1/llm/call",
            json=payload.model_dump(mode="json"),
            headers=user_api_key_header,
        )

        assert response.status_code == 200

        body = response.json()
        assert body["success"] is True
        assert "response is being generated" in body["data"]["message"]

        mock_start_job.assert_called_once()

⚠️ Potential issue | 🟡 Minor

Missing assertion on mock_guardrails — test doesn't verify the "bypassed" behavior it claims to test.

The docstring says the test validates that a bypassed guardrails response still succeeds, and input_guardrails is non-empty, yet there is no assertion on mock_guardrails (e.g., assert_called_once, checking the arguments, or verifying the bypassed flag propagation). The same root issue applies as in the previous test: with start_high_priority_job mocked, call_guardrails is never reached, so the bypass mock is entirely unused.

As-is, this test is functionally identical to test_llm_call_success — it only proves the route accepts guardrails fields and returns 200.


@AkhileshNegi changed the title from "Added code for kaapi integration" to "Guardrail: Integration" on Feb 6, 2026
@nishika26 merged commit fa12b10 into ProjectTech4DevAI:main on Feb 6, 2026 (3 checks passed)
@rkritika1508 deleted the feat/guardrails-integration branch on February 6, 2026 at 10:39

Labels: enhancement (New feature or request)


3 participants