    200: {
        "description": "Metrics retrieved successfully",
        "content": {
            "application/json": {
                "examples": {
                    "metrics_response": {
                        "summary": "Current Metrics",
                        "description": (
                            "All current application metrics including authentication counts and error rates"
                        ),
                        "value": {
                            "status": True,
                            "message": "Metrics retrieved successfully",
                            "timestamp": "2025-08-28T15:30:45.123456+05:30",
                            "metrics": {
                                "auth_success_total": 150,
                                "auth_failure_total": 12,
                                "validation_error_total": 8,
                                "pesu_academy_error_total": 5,
                                "unhandled_exception_total": 0,
                                "csrf_token_error_total": 2,
                                "profile_fetch_error_total": 1,
                                "profile_parse_error_total": 0,
                                "csrf_token_refresh_success_total": 45,
                                "csrf_token_refresh_failure_total": 1,
                            },
                        },
                    }
                }
            }
        },
    }
},
Create a model for this. The response model will also need an update.
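As a rough illustration of what such a model could look like, here is a minimal sketch that types the payload from the example above. It uses stdlib dataclasses only so that it runs without dependencies; in the app itself the same fields would presumably be declared on the project's own response model classes. The class names `MetricsCounters` and `MetricsResponse` are hypothetical and not taken from the PR.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class MetricsCounters:
    """Hypothetical typed container for the counters in the example response."""

    auth_success_total: int = 0
    auth_failure_total: int = 0
    validation_error_total: int = 0
    pesu_academy_error_total: int = 0
    unhandled_exception_total: int = 0
    csrf_token_error_total: int = 0
    profile_fetch_error_total: int = 0
    profile_parse_error_total: int = 0
    csrf_token_refresh_success_total: int = 0
    csrf_token_refresh_failure_total: int = 0


@dataclass
class MetricsResponse:
    """Hypothetical envelope mirroring the status/message/timestamp/metrics keys."""

    status: bool
    message: str
    timestamp: str
    metrics: MetricsCounters = field(default_factory=MetricsCounters)


example = MetricsResponse(
    status=True,
    message="Metrics retrieved successfully",
    timestamp="2025-08-28T15:30:45.123456+05:30",
    metrics=MetricsCounters(auth_success_total=150, auth_failure_total=12),
)

# asdict() recurses into nested dataclasses, giving a JSON-serialisable dict.
example_dict = asdict(example)
```

With a typed model, the docs example can be generated from an instance instead of being maintained by hand as a nested dict.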
"auth_success_total": 150,
"auth_failure_total": 12,
We should also track how many auth requests are received, including a split between requests with and without profile data.
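One possible shape for these counters is sketched below. The metric names and the `profile` request field are assumptions made for illustration, and a plain `Counter` stands in for the PR's collector.

```python
from collections import Counter

# Stand-in for the PR's metrics collector (assumption for this sketch).
counters = Counter()


def record_auth_request(payload: dict) -> None:
    """Count an incoming auth request and split on the assumed `profile` flag."""
    counters["auth_requests_total"] += 1
    if payload.get("profile", False):
        counters["auth_requests_with_profile_total"] += 1
    else:
        counters["auth_requests_without_profile_total"] += 1


# Three sample request bodies: one asks for profile data, two do not.
for body in (
    {"username": "a", "password": "x", "profile": True},
    {"username": "b", "password": "y"},
    {"username": "c", "password": "z", "profile": False},
):
    record_auth_request(body)
```

The two split counters always sum to `auth_requests_total`, which gives a cheap consistency check when reading the metrics.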
app/__init__.py
from app.metrics import metrics

__all__ = ["metrics"]
Remove this; we should not initialize a global collector outside the entry point.
exc_type = type(exc).__name__.lower()
if "csrf" in exc_type:
    metrics.inc("csrf_token_error_total")
elif "profilefetch" in exc_type:
    metrics.inc("profile_fetch_error_total")
elif "profileparse" in exc_type:
    metrics.inc("profile_parse_error_total")
elif "authentication" in exc_type:
    metrics.inc("auth_failure_total")
There is a much cleaner solution: look into a middleware layer. Here is some pseudocode to get you started:
    import time

    from fastapi import Request, Response

    # Assumes `app` (the FastAPI instance) and `metrics` (the collector) already exist.
    @app.middleware("http")
    async def metrics_middleware(request: Request, call_next):
        metrics.inc("requests_total")
        start_time = time.time()
        try:
            response: Response = await call_next(request)
            latency = time.time() - start_time
            # Track successes vs. failures
            if 200 <= response.status_code < 300:
                metrics.inc("requests_success")
            else:
                metrics.inc("requests_failed")
                metrics.inc(f"requests_failed_status_{response.status_code}")
            # Latency metrics
            metrics.inc("request_latency_sum", latency)
            # Also add route metrics: route = request.scope.get("route")
            return response
        except Exception as e:
            # Count the failure and its latency, then re-raise for the normal handlers.
            latency = time.time() - start_time
            metrics.inc("requests_failed")
            metrics.inc(f"requests_failed_exception_{type(e).__name__}")
            metrics.inc("request_latency_sum", latency)
            raise

Note: you will also need to increment the other metrics accordingly, such as the counts of requests with and without profile data, by parsing the request.
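The pseudocode calls `metrics.inc` both with and without an amount, from code that may run concurrently. A minimal sketch of a collector that would support both calls is shown below; it is an assumption about the interface, not the implementation in `app/metrics.py`, and the `snapshot` method name is hypothetical.

```python
import threading
from collections import defaultdict


class MetricsCollector:
    """Sketch of a lock-protected counter store (not the PR's actual class)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._values: dict[str, float] = defaultdict(float)

    def inc(self, name: str, amount: float = 1) -> None:
        # The lock makes the read-modify-write on the dict entry atomic.
        with self._lock:
            self._values[name] += amount

    def snapshot(self) -> dict[str, float]:
        # Return a copy so callers never observe later mutations.
        with self._lock:
            return dict(self._values)


collector = MetricsCollector()


def worker() -> None:
    for _ in range(1000):
        collector.inc("requests_total")
        collector.inc("request_latency_sum", 0.5)


# Eight threads incrementing concurrently; no updates should be lost.
threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

totals = collector.snapshot()
```

Creating the collector inside the entry point and passing it to the middleware, instead of importing a module-level instance, would also address the earlier comment about global initialization.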
achyu-dev left a comment:
Please make the requested changes from @aditeyabaral and me.
Do these dependencies need to be added?
…d clean up dependencies
#129 - Add metrics logging with thread-safe collector and /metrics endpoint
📌 Description
This PR implements comprehensive metrics logging for the PESUAuth API to enable monitoring and observability of authentication requests, errors, and system performance.
What is the purpose of this PR?
- /metrics endpoint to expose collected metrics in JSON format
What problem does it solve?
Background:
The API previously had no metrics or monitoring capabilities, making it difficult to assess system health, debug issues, or understand usage patterns. This implementation provides comprehensive tracking without impacting performance.
🧱 Type of Change
🧪 How Has This Been Tested?
- tests/unit/
- tests/functional/
- tests/integration/
Testing Details:
✅ Checklist
- scripts/run_tests.py
- pre-commit run --all-files
- .env vars updated (if applicable) - Not applicable
- scripts/benchmark_auth.py
🛠️ Affected API Behaviour
- app/app.py – Modified /authenticate route logic
New API Endpoint:
- /metrics - New GET endpoint that returns current application metrics in JSON format
🧩 Models
- app/models/response.py – Used existing response model for metrics endpoint formatting
New Files Added:
- app/metrics.py - Core MetricsCollector implementation with thread-safe operations
- app/docs/metrics.py - OpenAPI documentation for the new /metrics endpoint
- tests/unit/test_metrics.py - Comprehensive unit tests for MetricsCollector
- tests/integration/test_metrics_integration.py - Integration tests for metrics collection
🐳 DevOps & Config
- Dockerfile – No changes to build process
- .github/workflows/*.yaml – No CI/CD pipeline changes required
- pyproject.toml / requirements.txt – No new dependencies added
- .pre-commit-config.yaml – No linting or formatting changes
📊 Benchmarks & Analysis
- scripts/benchmark_auth.py – No changes to benchmark scripts
- scripts/analyze_benchmark.py – No changes to analysis tools
- scripts/run_tests.py – No changes to test runner
📸 Screenshots / API Demos
🎯 Metrics Endpoint in Action
Live metrics collection showing authentication success/failure tracking
Metrics Endpoint Response Example
{
  "status": true,
  "message": "Metrics retrieved successfully",
  "timestamp": "2025-08-28T15:30:45.123456+05:30",
  "metrics": {
    "auth_success_total": 150,
    "auth_failure_total": 12,
    "validation_error_total": 8,
    "pesu_academy_error_total": 5,
    "unhandled_exception_total": 0,
    "csrf_token_error_total": 2,
    "profile_fetch_error_total": 1,
    "profile_parse_error_total": 0,
    "csrf_token_refresh_success_total": 45,
    "csrf_token_refresh_failure_total": 1
  }
}
🔧 Testing Results Dashboard
Comprehensive test suite covering unit, integration, and functional scenarios
Updated API Endpoints Table
| Endpoint | Method |
|---|---|
| / | GET |
| /authenticate | POST |
| /health | GET |
| /readme | GET |
| /metrics | GET |
Metrics Tracked:
- auth_success_total - Successful authentication attempts
- auth_failure_total - Failed authentication attempts
- validation_error_total - Request validation failures
- pesu_academy_error_total - PESU Academy service errors
- unhandled_exception_total - Unexpected application errors
- csrf_token_error_total - CSRF token extraction failures
- profile_fetch_error_total - Profile page fetch failures
- profile_parse_error_total - Profile parsing errors
- csrf_token_refresh_success_total - Successful background CSRF refreshes
- csrf_token_refresh_failure_total - Failed background CSRF refreshes
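The endpoint exposes raw counters only, so any rate has to be derived by the consumer. A small sketch of that arithmetic follows, using the two auth counters from the example response. It assumes every authentication attempt increments exactly one of them; the `failure_rate` helper is hypothetical.

```python
# Counter values copied from the example response above.
metrics = {"auth_success_total": 150, "auth_failure_total": 12}


def failure_rate(snapshot: dict) -> float:
    """Share of auth attempts that failed; 0.0 when nothing has been counted yet."""
    attempts = snapshot["auth_success_total"] + snapshot["auth_failure_total"]
    return snapshot["auth_failure_total"] / attempts if attempts else 0.0


rate = failure_rate(metrics)  # 12 failures out of 162 attempts
print(f"auth failure rate: {rate:.1%}")
```

Because the counters are cumulative, a rate over a time window would need the difference between two snapshots.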