Release 0.6.0: Community Dashboard, BIDS, Metrics #156

Merged
neuromechanist merged 29 commits into main from develop
Feb 5, 2026

@neuromechanist

Summary

Release 0.6.0 brings the community dashboard epic, BIDS community support, and metrics infrastructure.

Major Features

  • Community Dashboard (Phases 1-4): Static HTML dashboard at osa-dash.pages.dev with public metrics overview, per-community views, usage charts, admin token/cost views, and sync status
  • Metrics Infrastructure: SQLite-based request logging, middleware for automatic capture, public and admin query APIs with time-bucketed aggregations
  • Per-Community Scoped Auth: Community-specific admin keys (COMMUNITY_ADMIN_KEYS), AuthScope system for role-based access control
  • Quality Metrics: Error rates, tool call counts, p50/p95 latency tracking, LangFuse trace integration
  • Cost/Budget Tracking: Model pricing table, per-community budget config, automated GitHub issue alerting when thresholds exceeded
  • BIDS Community Assistant: Full BIDS community config with system prompt, tools, and knowledge sources
  • Worker Security Hardening: Split auth modes (worker-key vs client-key passthrough), rate limiting on all endpoints, CORS protocol validation

Infrastructure

  • Cloudflare Pages deployment workflow for dashboard
  • Worker CORS auto-sync from community configs
  • Hybrid rate limiting (built-in API + KV)
  • Number-based PR/issue lookup in BIDS system prompt

Files Changed

  • 50 files changed, ~7,300 insertions, ~500 deletions
  • 15 new test files with ~3,000 lines of tests

Test plan

  • All Python tests passing (1222 passed, 11 pre-existing async/integration skips)
  • Worker deployed to dev and all routes verified
  • Dashboard loads and displays metrics at develop.osa-dash.pages.dev
  • Admin endpoints require client auth (security fix verified)
  • Linting clean (ruff check + format)

neuromechanist and others added 26 commits January 28, 2026 17:08
Tests were patching src.knowledge.faq_summarizer.create_openrouter_llm
but the import is now inside the function from src.core.services.litellm_llm.

Fixed both test instances to patch the correct module.
Implement hybrid rate limiting approach:
- Per-minute (bot protection): Built-in Rate Limiting API
- Per-hour (human abuse): Workers KV
- Rate limits: 10/min and 20/hour (prod); 60/min and 100/hour (dev)
- 50% reduction in KV writes
- Auto-deploy on worker file changes

Closes #129
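
The two-tier scheme in this commit message can be sketched as follows. This is a minimal illustration in Python rather than the worker's JavaScript, and it assumes plain in-memory dictionaries in place of the built-in Rate Limiting API and Workers KV. The class name `HybridRateLimiter` and its fields are hypothetical; the default limits are the production values quoted above.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class HybridRateLimiter:
    """Fixed-window limiter with a per-minute and a per-hour budget.

    Hypothetical stand-in: the dicts below play the role of the built-in
    Rate Limiting API (minute window) and Workers KV (hour window).
    """

    per_minute: int = 10
    per_hour: int = 20
    _minute: Dict[Tuple[str, int], int] = field(default_factory=dict)
    _hour: Dict[Tuple[str, int], int] = field(default_factory=dict)

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        # Derive both window keys from a single timestamp so the check
        # and the increment always refer to the same window.
        minute_key = (client_id, int(now // 60))
        hour_key = (client_id, int(now // 3600))

        # Per-minute check first: bursts are rejected before the hourly
        # store is touched.
        if self._minute.get(minute_key, 0) >= self.per_minute:
            return False
        if self._hour.get(hour_key, 0) >= self.per_hour:
            return False

        self._minute[minute_key] = self._minute.get(minute_key, 0) + 1
        self._hour[hour_key] = self._hour.get(hour_key, 0) + 1
        return True
```

Passing `now` explicitly keeps the sketch deterministic in tests; a real worker would read the clock once per request.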
Problem: Git diff fails with 'fatal: bad object' because GitHub Actions
uses shallow checkout (depth=1) and github.event.before doesn't exist.

Solution: Set fetch-depth=0 to fetch full history, allowing git diff
to work properly when detecting worker file changes.

This will trigger deployment on next push.
Remove branch filter from pull_request trigger in tests.yml
so lint and unit tests run on PRs targeting any branch, not
just main/develop. This ensures feature branch PRs to epic
branches still get CI coverage.
Skip test_documentation_urls_accessible; HED docs URL
returns 404 due to upstream repo change. See #139.
* feat: add backend metrics collection and request logging

- Add src/metrics/ package with SQLite storage (WAL mode),
  aggregation queries, and request timing middleware
- Add global /metrics/overview and /metrics/tokens endpoints
- Add per-community /{id}/metrics and /{id}/metrics/usage
- Log token usage, model, key_source, tools for ask/chat
- Streaming handlers log metrics at end of generator
- Middleware captures timing for all requests
- All metrics endpoints require admin auth

Closes #134
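
As a rough illustration of the storage side described above, here is a minimal sketch of SQLite request logging in WAL mode with a per-day aggregation. The table name `request_log`, its columns, and the helper names are assumptions made for this example, not the actual `src/metrics/` schema.

```python
import sqlite3
from typing import List, Optional, Tuple

# Hypothetical schema: one row per handled request.
SCHEMA = """
CREATE TABLE IF NOT EXISTS request_log (
    id INTEGER PRIMARY KEY,
    ts TEXT NOT NULL,
    community_id TEXT,
    endpoint TEXT NOT NULL,
    status_code INTEGER NOT NULL,
    duration_ms REAL NOT NULL
)
"""


def init_db(path: str) -> sqlite3.Connection:
    """Open the database, request WAL journaling, and create the table."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(SCHEMA)
    return conn


def log_request(
    conn: sqlite3.Connection,
    ts: str,
    community_id: Optional[str],
    endpoint: str,
    status_code: int,
    duration_ms: float,
) -> None:
    """Insert one request row; `with conn` commits or rolls back."""
    with conn:
        conn.execute(
            "INSERT INTO request_log "
            "(ts, community_id, endpoint, status_code, duration_ms) "
            "VALUES (?, ?, ?, ?, ?)",
            (ts, community_id, endpoint, status_code, duration_ms),
        )


def requests_per_day(
    conn: sqlite3.Connection, community_id: str
) -> List[Tuple[str, int]]:
    """Count requests per calendar day for one community."""
    rows = conn.execute(
        "SELECT date(ts) AS day, COUNT(*) FROM request_log "
        "WHERE community_id = ? GROUP BY day ORDER BY day",
        (community_id,),
    ).fetchall()
    return [(day, count) for day, count in rows]
```

The timestamps are assumed to be ISO-8601 strings so that SQLite's `date()` can bucket them directly.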

* Address PR review: error handling and type safety fixes

- Wrap middleware dispatch in try/except so metrics never crash requests
- Wrap init_metrics_db() in try/except for graceful degradation
- Always return AssistantWithMetrics (remove return_metrics flag)
- Narrow log_request except to sqlite3.Error
- Add try/except to _log_streaming_metrics
- Log metrics on streaming error paths (400/500)
- Add sqlite3 error handling to all metrics endpoints (503)
- Add logger + warning for malformed JSON in queries.py
- Fix middleware ordering comment
- Remove redundant inline import uuid
- Move get_metrics_connection to top-level import

* CI: run lint and tests on all PRs

Remove branch filter from pull_request trigger in tests.yml
so lint and unit tests run on PRs targeting any branch, not
just main/develop. This ensures feature branch PRs to epic
branches still get CI coverage.

* Disable broken URL test until upstream fix

Skip test_documentation_urls_accessible; HED docs URL
returns 404 due to upstream repo change. See #139.
Knowledge databases live inside Docker containers, not locally.
Added SSH + docker exec examples for listing tables, querying
docstrings, and searching symbols to avoid wasted debugging time.

* Add dashboard frontend with public metrics endpoints

- Add public query functions (no tokens/costs/models exposed)
- Create /metrics/public/* endpoints (no auth required)
- Build /dashboard page with Chart.js, community tabs, admin unlock
- Register new routers in main.py
- Add tests for public endpoints and dashboard page (28 new tests)

* Restructure dashboard as standalone static site

- Move per-community public metrics to community router
  (/{community_id}/metrics/public, /{community_id}/metrics/public/usage)
- Keep only global /metrics/public/overview in metrics_public router
- Remove FastAPI dashboard router
- Add dashboard/ as standalone static site for Cloudflare Pages:
  / = aggregate overview, /{community} = community detail
- Client-side routing with configurable API base URL
- Add _redirects for Cloudflare Pages SPA routing
- Update tests for new route structure

* Add CI workflow for dashboard Cloudflare Pages deploy

Deploys dashboard/ to osa-dash.pages.dev via wrangler.
Same pattern as existing deploy-pages.yml for the demo widget:
- main -> osa-dash.pages.dev (production)
- develop -> develop.osa-dash.pages.dev
- PRs -> {branch}.osa-dash.pages.dev with preview URL comment

* Add dynamic community tab bar to dashboard

Tabs are populated from /metrics/public/overview API so new
communities appear automatically. Navigation uses simple links
(All -> /, community -> /{id}) with active tab highlighting.

* Address PR review findings: XSS, error handling, tests

- Fix XSS: add escapeHtml() helper, sanitize all innerHTML interpolations,
  use encodeURIComponent() for URL path segments
- Move get_metrics_connection() inside try blocks in all metrics endpoints
- Add console.error/warn to all JavaScript catch blocks (no silent failures)
- Improve admin section UX: defer visibility until data loads successfully
- Extract shared helpers in queries.py (_count_tools, _validate_period)
- Add test classes: TestPublicAdminBoundary, TestEmptyDatabase,
  TestCommunityMetricsValues with dynamic community cross-checks
- Fix admin boundary tests to use auth_env fixture with test API key

* Address round-2 review: XSS, error logging, auth tests, simplify

- Fix single-quote XSS in onclick handlers: use encodeURIComponent
  for communityId in changePeriod calls, decode in changePeriod
- Validate health status against known values instead of escapeHtml
- Add console.warn to sync/health .catch() blocks
- Add console.error to loadCommunityView catch block
- Add auth-enabled tests proving public endpoints stay accessible
  when REQUIRE_API_AUTH=true (core security contract)
- Add metrics_connection() context manager in db.py; simplify all
  endpoint handlers from nested try/try/finally to with-statement
- Use tuple unpacking in _count_tools for clarity
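
The `metrics_connection()` refactor mentioned in the last review round could look roughly like this. It is a sketch only: the `path` argument, the `request_log` table, and the `overview_handler` function are assumptions for illustration, and the handler returns a plain `(status, body)` tuple instead of a FastAPI response.

```python
import sqlite3
from contextlib import contextmanager
from typing import Dict, Iterator, Tuple


@contextmanager
def metrics_connection(path: str) -> Iterator[sqlite3.Connection]:
    """Yield a connection and always close it, even if the body raises."""
    conn = sqlite3.connect(path)
    try:
        yield conn
    finally:
        conn.close()


def overview_handler(path: str) -> Tuple[int, Dict[str, object]]:
    """Hypothetical endpoint body: one with-statement, one except clause."""
    try:
        with metrics_connection(path) as conn:
            total = conn.execute("SELECT COUNT(*) FROM request_log").fetchone()[0]
    except sqlite3.Error:
        # Storage problems surface as 503, matching the earlier review fix.
        return 503, {"detail": "Metrics temporarily unavailable"}
    return 200, {"total_requests": total}
```

Because the connection is opened inside the `try`, a failure to connect is handled by the same `except` clause as a failed query.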
* Serve dashboard from /osa/ base path for status.osc.earth

Move dashboard/index.html to dashboard/osa/index.html and add
BASE_PATH constant to strip /osa prefix in client-side router.
Update all internal links (tabs, community cards) to use the
/osa/ prefix. Update _redirects for SPA routing under /osa/.
Update dashboard tests for new file location.

* Handle /osa without trailing slash in _redirects

Add explicit /osa rule alongside /osa/* to ensure the path
without trailing slash also serves the SPA index.
* Add per-community auth, quality metrics, and budget alerting

Phase 4 of the community dashboard: per-community scoped
authentication (AuthScope + community admin keys), LangFuse
observability wiring, quality metrics (error rates, latency
percentiles, tool call tracking), cost estimation with model
pricing table, budget checking with configurable limits, and
automated GitHub issue alerting when spend thresholds are
exceeded. Includes scheduled budget check job (every 15 min)
and sample budget configs for HED and EEGLAB communities.

Tested: 1152 passed, 68% coverage.

* Address PR review: simplify code and fix error handling

- Fix _issue_exists to return True on error (prevent duplicate spam)
- Simplify redundant exception tuples in alerts.py
- Extract _require_community_access helper (4 duplicated blocks)
- Extract parse_admin_keys method on Settings (3 duplicated parsers)
- Strengthen AuthScope with Literal type, frozen, validation
- Make BudgetStatus frozen (immutable snapshot)
- Add BudgetConfig cross-field validation (daily <= monthly)
- Share single DB connection in budget check loop
- Add budget check failure escalation (matching sync pattern)
- Split LangFuse except into ImportError vs Exception
- Improve _migrate_columns to re-raise unexpected errors
- Log warnings for malformed community_admin_keys entries
- Bump unknown model fallback logging from debug to warning
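
A minimal sketch of the two invariants named in this list, frozen scopes and `daily <= monthly`, is shown below. It uses standard-library dataclasses for self-containment, which is an assumption; the field names (`role`, `community_id`, `daily_limit_usd`, `monthly_limit_usd`) and the `can_access` method are hypothetical.

```python
from dataclasses import dataclass
from typing import Literal, Optional


@dataclass(frozen=True)
class AuthScope:
    """Immutable description of what an API key may access."""

    role: Literal["admin", "community"]
    community_id: Optional[str] = None

    def __post_init__(self) -> None:
        # Literal is only a type hint, so enforce it at runtime as well.
        if self.role not in ("admin", "community"):
            raise ValueError(f"unknown role: {self.role!r}")
        if self.role == "community" and not self.community_id:
            raise ValueError("community scope requires a community_id")
        if self.role == "admin" and self.community_id is not None:
            raise ValueError("admin scope must not carry a community_id")

    def can_access(self, community_id: str) -> bool:
        return self.role == "admin" or self.community_id == community_id


@dataclass(frozen=True)
class BudgetConfig:
    """Immutable spend limits with a cross-field check."""

    daily_limit_usd: float
    monthly_limit_usd: float

    def __post_init__(self) -> None:
        if self.daily_limit_usd < 0 or self.monthly_limit_usd < 0:
            raise ValueError("limits must be non-negative")
        if self.daily_limit_usd > self.monthly_limit_usd:
            raise ValueError("daily limit cannot exceed monthly limit")
```

Rejecting invalid combinations at construction time means downstream access checks never see a half-formed scope.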

* Fix streaming metrics fields and add quality endpoint tests

Add tool_call_count and langfuse_trace_id to streaming metrics
logging so streaming requests capture the same quality data as
non-streaming. Add 14 endpoint tests covering community quality,
quality summary, and global quality API routes.

* Address round 2 review: fix alerts, docstrings, tests, simplify

- Fix _issue_exists to return None on failure instead of True,
  with warning-level logging when dedup check fails
- Fix stale pricing date and deduplicate fallback branches
- Extract shared _fetch_latency_percentiles helper in queries
- Make BudgetConfig frozen (immutable after parsing)
- Fix inaccurate docstrings: maintainers usage, _percentile
  method name, _migrate_columns idempotency, regex claims
- Fix get_quality_summary docstring key name mismatch
- Track per-community scheduler failures for critical alerting
- Upgrade malformed config entry log from WARNING to ERROR
- Add tests: AuthScope validation, BudgetConfig daily>monthly,
  community-scoped keys on global endpoints, dedup failure
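
The tri-state dedup result described in the first bullet (exists, does not exist, unknown) can be sketched like this. The `search` and `create` callables are hypothetical stand-ins for the GitHub calls, and catching only `OSError` is an assumption about which failures count as transient.

```python
import logging
from typing import Callable, Iterable, Optional

logger = logging.getLogger(__name__)


def issue_exists(search: Callable[[str], Iterable[str]], title: str) -> Optional[bool]:
    """Return True/False when the lookup works, None when it fails."""
    try:
        return any(found == title for found in search(title))
    except OSError as exc:
        # Only I/O-style failures are swallowed; programming errors such
        # as TypeError still propagate and stay visible.
        logger.warning("Dedup check failed for %r: %s", title, exc)
        return None


def maybe_create_alert(
    search: Callable[[str], Iterable[str]],
    create: Callable[[str], None],
    title: str,
) -> str:
    """Create an alert issue unless one exists or the state is unknown."""
    exists = issue_exists(search, title)
    if exists is None:
        return "skipped"  # unknown state: do not risk a duplicate
    if exists:
        return "duplicate"
    create(title)
    return "created"
```

Keeping `None` distinct from `True` lets the caller log or escalate a failed check instead of silently treating it as "already reported".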
Code fixes:
- Handle HTTPException in streaming generators as SSE error events
  (cannot re-raise after response headers sent)
- Extract _match_wildcard_origin helper, AgentResult dataclass,
  _extract_agent_result/_set_metrics_on_request to deduplicate
  ask/chat endpoints
- Use metrics_connection() context manager in metrics router
- Add failure counting with escalation logging to log_request()
- Refactor check_budget() to accept BudgetConfig instead of 3 floats
- Add __post_init__ validation to BudgetStatus for non-negative spend
- Simplify list_sessions to reuse _evict_expired_sessions
- Move inline imports (re, os) to top-level
- Simplify _get_communities_with_sync to list comprehension

Docstring fixes:
- Clarify verify_api_key handles only global admin keys
- Update scheduler module docstring to mention budget checks

New tests:
- check_budget with today's timestamps (exercises date('now') SQL)
- Budget alert trigger with current-day spend
- BudgetStatus rejects negative spend values
- _percentile edge cases (single element, two elements, empty list)
- _count_tools with malformed JSON
- _extract_community_id documents intentional None for metrics paths
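
For the `_percentile` edge cases listed above, here is a sketch of one simple definition, the nearest-rank method. The PR does not say which method the real helper uses, so this choice is an assumption; it is shown because it never interpolates between samples, unlike numpy's default.

```python
import math
from typing import Optional, Sequence


def percentile(values: Sequence[float], pct: float) -> Optional[float]:
    """Nearest-rank percentile: always returns an observed value.

    Returns None for an empty input so callers can render "no data".
    """
    if not 0 <= pct <= 100:
        raise ValueError("pct must be between 0 and 100")
    if not values:
        return None
    ordered = sorted(values)
    # Rank is 1-based; clamp to 1 so pct=0 maps to the minimum.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]
```

With four latencies `[10, 20, 30, 40]`, this returns 20 for p50, whereas linear interpolation would give 25.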
…-epic-community-dashboard

Community Dashboard: metrics, auth, budget alerting
…local testing guide (#150)

Add community development section to CLAUDE.md with links to the
documentation site registry guides (adding a community, local testing,
schema reference, extensions). Generalize .context/local-testing-guide.md
from EEGLAB-specific to community-agnostic with placeholder COMMUNITY_ID.
* feat: add BIDS community assistant (Phase 1)

Add BIDS (Brain Imaging Data Structure) as a new community with:
- 45 documentation sources (2 preloaded, 43 on-demand) covering
  the specification, all 12 modalities, derivatives, website
  getting-started guides, FAQs, tools, schema docs, and BEP process
- 4 GitHub repos for issue/PR sync (specification, validator,
  website, examples)
- 14 citation DOIs (canonical paper + 12 modality-specific
  extension papers + BIDS Apps)
- System prompt with modality awareness, schema awareness,
  validator guidance, converter recommendations, and explicit
  anti-hallucination instructions for GitHub references
- NeuroStars discourse integration (bids tag)
- Budget and maintainer configuration

All documentation URLs verified (raw GitHub + readthedocs stable).
Config validates correctly and community discovery works (336 tests
pass, no failures).

Closes #149

* fix: address PR review findings for BIDS config

- Fix phenotypic data description (phenotype/ directory, not participants.tsv)
- Fix code description (data preparation scripts, not analysis code)
- Add participants.tsv to data summary files description
- Fix behavioral wording (no neural recordings, not no neuroimaging)
- Fix genetics wording in system prompt (brain imaging data)
- Add missing physiological recordings documentation entry (13 modality docs)
- Add Physiological to system prompt modality list
Allows manual re-triggering when automated runs fail due to
transient issues (e.g. expired tokens, push race conditions).
Replace "What are the required metadata fields?" with
"What are the BIDS Common Principles?" as a more foundational
starting question for new users.
The model was not calling knowledge tools when asked about specific
PR/issue numbers. Add explicit patterns showing how to search by number.
* feat: add number-based lookup to GitHub item search

When query contains a PR/issue number (e.g. "2022", "#500", "PR 2022"),
search now does a direct number lookup first, then falls back to
full-text search for remaining slots. Deduplicates results.

Fixes #153
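
The lookup order described above (direct number match first, then text search for the remaining slots, with deduplication) can be sketched as follows. The item dictionaries, the regular expressions, and the keyword matching that stands in for full-text search are all assumptions made for illustration.

```python
import re
from typing import Dict, List

# A number optionally prefixed by "#", not glued to a word or a version.
_NUMBER_RE = re.compile(r"(?<![\w.])#?(\d+)(?!\w|\.\d)")
_PURE_RE = re.compile(r"\s*(?:(?:pr|issue|pull request)\s*)?#?\d+\s*", re.IGNORECASE)


def extract_numbers(query: str) -> List[int]:
    """Pull candidate PR/issue numbers out of a free-text query."""
    return [int(n) for n in _NUMBER_RE.findall(query)]


def is_pure_number_query(query: str) -> bool:
    """True for queries like "2022", "#500", or "PR 2022"."""
    return _PURE_RE.fullmatch(query) is not None


def search_items(items: List[Dict], query: str, limit: int = 5) -> List[Dict]:
    """Number matches first, then keyword matches, without duplicates."""
    results: List[Dict] = []
    seen = set()

    def add(item: Dict) -> None:
        key = (item["repo"], item["number"])
        if key not in seen and len(results) < limit:
            seen.add(key)
            results.append(item)

    numbers = set(extract_numbers(query))
    for item in items:
        if item["number"] in numbers:
            add(item)

    # Skip the text pass entirely when the query is only a number.
    if not is_pure_number_query(query):
        words = re.findall(r"[a-z]{3,}", query.lower())
        for item in items:
            title = item["title"].lower()
            if any(word in title for word in words):
                add(item)
    return results
```

Deduplicating on `(repo, number)` rather than on the number alone matters because several synced repositories can share the same issue number.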

* feat: switch all communities from Qwen to Claude Haiku 4.5

Qwen was not reliably calling knowledge tools when asked about
specific PRs/issues. Claude Haiku 4.5 via Anthropic provider
is more stable for tool calling and has caching enabled.

* Address PR review: add number index, skip FTS for pure number queries, expand tests

- Add idx_github_items_number index for direct number lookups
- Add _is_pure_number_query() to skip FTS when query is just a number pattern
- Add debug logging when number lookup finds no results
- Add tests for status filter, nonexistent numbers, limits, bug/feature prefixes
- Strengthen assertions on existing number lookup tests

* Fix ruff SIM103: return condition directly in _is_pure_number_query

* Fix ruff formatting
* fix: add dashboard CORS origin and missing worker routes

Worker:
- Add osa-dash.pages.dev to CORS allowlist (dashboard origin)
- Add routes for dashboard endpoints: /metrics/public/overview,
  /metrics/{overview,tokens,quality}, /sync/{status,health},
  /{community}/metrics/public, /{community}/sessions

Dashboard:
- Auto-detect API backend from hostname instead of falling back
  to window.location.origin (which fails for deployed dashboards)
- osa-dash.pages.dev -> api.osc.earth/osa (prod)
- *.osa-dash.pages.dev -> api.osc.earth/osa-dev (dev/preview)

* fix: add osa-dash.pages.dev to backend CORS allowlist

The dashboard (osa-dash.pages.dev) was being rejected by the
backend's CORS middleware. Add both exact and wildcard patterns
for the dashboard Pages project.

* fix: harden worker security and improve dashboard errors

Worker:
- Split proxy into worker-key vs client-key passthrough modes
- Admin/sessions endpoints now forward client key (not worker key)
- Extract RESERVED_PATHS constant, rateLimitOrReject helper
- Extract validateCommunityId with consistent reserved path checks
- Add rate limiting to all endpoints including admin and sync
- Validate https:// protocol on subdomain CORS checks
- Only forward Origin header if CORS-validated
- Replace bare catch blocks with console.warn logging

Dashboard:
- Add status-code-aware error messages (429, 404, 401, 500)
- Replace generic "Failed to load" with actionable user guidance
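
The protocol check on subdomain CORS matching can be illustrated with a short sketch. It is written in Python rather than the worker's JavaScript, and the function name `is_allowed_origin` and its parameters are hypothetical; only the hostnames come from this PR.

```python
from typing import Iterable
from urllib.parse import urlsplit


def is_allowed_origin(
    origin: str,
    exact_hosts: Iterable[str],
    wildcard_suffixes: Iterable[str],
) -> bool:
    """Accept only https origins on an exact host or a true subdomain."""
    try:
        host = urlsplit(origin).hostname
    except ValueError:
        return False
    # Rebuilding the origin rejects http://, ports, paths, and userinfo
    # in one comparison.
    if not host or origin != f"https://{host}":
        return False
    if host in set(exact_hosts):
        return True
    # Require the dot so "evil-osa-dash.pages.dev" does not match.
    return any(host.endswith("." + suffix) for suffix in wildcard_suffixes)
```

For example, `https://develop.osa-dash.pages.dev` passes with the suffix `osa-dash.pages.dev`, while the same host over `http://` is rejected.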

github-actions bot commented Feb 5, 2026

🚀 Preview Deployment

| Name | Link |
| --- | --- |
| Preview URL | https://develop.osa-demo.pages.dev |
| Branch | develop |
| Commit | a81c94b |

This preview will be updated automatically when you push new commits.


github-actions bot commented Feb 5, 2026

Dashboard Preview

| Name | Link |
| --- | --- |
| Preview URL | https://develop.osa-dash.pages.dev |
| Branch | develop |
| Commit | a81c94b |

This preview will be updated automatically when you push new commits.

neuromechanist and others added 2 commits February 5, 2026 05:56
Re-apply changes from d12c98e that were listed in the squash
merge commit message but not included in the actual diff:
- Split proxy into worker-key vs client-key passthrough modes
- Admin/sessions endpoints forward client key (not worker key)
- Extract RESERVED_PATHS, rateLimitOrReject, validateCommunityId
- Rate limiting on all endpoints including admin and sync
- Validate https:// protocol on subdomain CORS checks
- Only forward Origin header if CORS-validated
- Replace bare catch blocks with console.warn logging

github-actions bot commented Feb 5, 2026

⚠️ Worker Deployment Required

This PR modifies community CORS origins. Worker changes detected. After merging to main or develop, the workflow will automatically deploy the worker.

Manual deployment (if needed):

cd workers/osa-worker
wrangler deploy --env dev  # for develop branch
wrangler deploy            # for main branch

@neuromechanist

PR Review Summary (6 agents, parallel)

Already Fixed (during review)

  • Worker CORS subdomain checks missing https:// protocol validation (commit 11de1b6)
  • Worker missing osa-dash.pages.dev in CORS allowlist (commit 11de1b6)

Critical Issues (2)

1. [error-handling] Rate limiter fail-open without alerting
workers/osa-worker/index.js:100-131 -- All three rate limiter catch blocks silently fail open. If KV or the built-in rate limiter goes down, there is no rate limiting at all, and the only signal is a console.error that Workers may not retain. No operator alerting.

2. [error-handling] Turnstile verification silently skipped when secret key missing
workers/osa-worker/index.js:31-33 -- If TURNSTILE_SECRET_KEY is not configured, returns success: true unconditionally in production, completely disabling bot protection. Should fail-closed in production, fail-open only in dev.
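
The fail-closed behaviour suggested in item 2 could be sketched as below. This is Python pseudologic for a JavaScript worker; the `environment` values and the injected `verify` callable (standing in for the call to Turnstile's siteverify endpoint) are assumptions.

```python
from typing import Callable, Dict


def verify_turnstile(
    token: str,
    secret_key: str,
    environment: str,
    verify: Callable[[str, str], bool],
) -> Dict[str, object]:
    """Fail closed in production when the secret key is missing."""
    if not secret_key:
        if environment == "production":
            # A missing secret is a deployment error, not a free pass.
            return {"success": False, "error": "turnstile-not-configured"}
        # Outside production, skipping verification keeps local dev usable.
        return {"success": True, "skipped": True}
    return {"success": bool(verify(token, secret_key))}
```

Returning an explicit error code in the production branch also gives operators something to alert on.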


High Issues (5)

3. [error-handling] Worker top-level catch leaks internal error messages
workers/osa-worker/index.js:481-486 -- Returns error.message directly to client, potentially exposing internal paths or stack traces. Should return generic "Internal server error".

4. [error-handling] Worker handleProtectedEndpoint no JSON parse error handling
workers/osa-worker/index.js:582 -- request.json() can throw on malformed body. Propagates to top-level catch as 500 instead of proper 400 Bad Request.

5. [error-handling] Metrics DB init failure silently degrades system
src/api/main.py:64-71 -- If init_metrics_db() fails at startup, health check still returns "healthy" while metrics, budget tracking, and alerting are all non-functional.

6. [error-handling] Metrics middleware broad exception catch
src/metrics/middleware.py:92-93 -- Bare except Exception catches programming errors (TypeError, AttributeError) alongside infrastructure errors, silently losing metrics data.

7. [error-handling] Streaming metrics not logged on session limit
src/api/routers/community.py:1512-1520 -- When add_assistant_message raises ValueError (session limit), function returns without calling _log_streaming_metrics. LLM cost incurred but not recorded.
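
One way to address item 7 is to log metrics from a `finally` block so that every exit path of the streaming generator records the request. The sketch below assumes injected `save_message` and `log_metrics` callables and a simplified SSE format; it is not the code in `community.py`.

```python
from typing import Callable, Iterable, Iterator, List


def stream_response(
    chunks: Iterable[str],
    save_message: Callable[[str], None],
    log_metrics: Callable[[int], None],
) -> Iterator[str]:
    """Yield SSE events and record metrics on every exit path."""
    sent: List[str] = []
    try:
        for chunk in chunks:
            sent.append(chunk)
            yield f"data: {chunk}\n\n"
        try:
            save_message("".join(sent))
        except ValueError as exc:
            # Headers are already sent, so report the failure in-band.
            yield f"event: error\ndata: {exc}\n\n"
    finally:
        # Runs after normal completion, after the error event, and when
        # the client disconnects and the generator is closed.
        log_metrics(len(sent))
```

The LLM cost has already been incurred by the time the session limit is hit, which is why the metrics call belongs outside the success path.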


Medium Issues (8)

8. [code-reviewer] KV rate limit double-read race condition
workers/osa-worker/index.js:92-126 -- Hourly count read twice (check + increment); hour boundary could cross between reads, incrementing wrong key.

9. [code-reviewer] Dashboard sync info not HTML-escaped
dashboard/osa/index.html:557-582 -- renderSyncInfo puts values in innerHTML without escapeHtml(), inconsistent with rest of dashboard.

10. [error-handling] Scheduler job registration failures silently degraded
src/api/scheduler.py:285-314 -- Invalid cron expression logged but scheduler starts anyway. Budget check job could silently never run.

11. [error-handling] _issue_exists bare Exception catch
src/metrics/alerts.py:54-56 -- Returns None on any error, which suppresses the alert. Programming bug in dedup logic would permanently suppress all budget alerts.

12. [test-analyzer] No tests for _check_community_budgets() scheduler job (criticality 9)
Integration point for budget checking + alerting. Non-trivial logic untested.

13. [test-analyzer] Per-community scoped auth endpoints untested (criticality 7)
_require_community_access(auth) raising 403 for cross-community access is untested via HTTP.

14. [comment-analyzer] Worker README endpoints table incomplete
Only lists /hed/ask and /hed/chat, missing 10+ new routes.

15. [comment-analyzer] Stale pricing date (2025-07) in cost.py
src/metrics/cost.py:4,13 -- Over 6 months old. Cascading impact on budget alerts and dashboard.


Suggestions (from simplifier, type-design, comments)

Types:

  • RequestLogEntry should be frozen=True (only type in metrics pipeline that isn't)
  • AuthScope would benefit from @classmethod named constructors (AuthScope.admin(), AuthScope.community("hed"))
  • MODEL_PRICING should be MappingProxyType to prevent accidental mutation
  • Consider NamedTuple for pricing tuples: TokenRate(input_per_million, output_per_million)
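
The last two type suggestions could be combined roughly as follows. The model names and prices here are placeholders invented for the example, not the contents of `cost.py`, and `estimate_cost` and `FALLBACK_RATE` are hypothetical names.

```python
import logging
from types import MappingProxyType
from typing import Mapping, NamedTuple

logger = logging.getLogger(__name__)


class TokenRate(NamedTuple):
    """USD per one million tokens."""

    input_per_million: float
    output_per_million: float


# Placeholder entries; a read-only view prevents accidental mutation.
MODEL_PRICING: Mapping[str, TokenRate] = MappingProxyType(
    {
        "example/small-model": TokenRate(1.0, 5.0),
        "example/large-model": TokenRate(3.0, 15.0),
    }
)
FALLBACK_RATE = TokenRate(1.0, 5.0)


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost in USD from token counts."""
    rate = MODEL_PRICING.get(model)
    if rate is None:
        logger.warning("No pricing for model %r; using fallback rate", model)
        rate = FALLBACK_RATE
    return (
        input_tokens * rate.input_per_million
        + output_tokens * rate.output_per_million
    ) / 1_000_000
```

Named fields make it harder to swap input and output rates than with bare tuples, and the proxy raises `TypeError` on any attempt to assign into the table.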

Simplification:

  • Streaming generators share ~80% identical code; extract shared event processing helper
  • Metrics endpoint boilerplate (try/except/metrics_connection) repeated across 3 files; extract helper
  • Snippet creation duplicated 4x in search.py; extract _make_snippet()
  • KV timestamp computed twice in checkRateLimit; compute once at top
  • Scheduler config accessors: extract _get_community_config() helper

Comments:

  • RESERVED_PATHS includes 'communities' which isn't an actual route; update comment
  • BYOKHeaders in security.py misaligned with actual BYOK mechanism in community router; add clarification
  • _percentile docstring should note it differs from numpy's default method

Positive Observations

  • Test suite is comprehensive: 15 new test files, ~3000 lines, real SQLite DBs (no mocks except alerts/subprocess)
  • BudgetConfig is excellently designed: frozen, extra="forbid", cross-field validation
  • AuthScope invariants directly prevent security bugs
  • Streaming error handling with separate HTTPException/ValueError/generic paths is thorough
  • SSRF validation with explicit limitation documentation is well done
  • Knowledge search raising on sqlite3.OperationalError (rather than returning empty results) is the correct pattern
  • Security-relevant f-string SQL comment ("Safe to use: fmt from whitelist") prevents false audit findings

- Total now counts only community-scoped requests, not health
  checks, sync, and other infrastructure endpoints
- All registered communities appear in overview even with 0 requests
- Middleware _extract_community_id now recognizes metrics, sessions,
  and config endpoints (not just /ask and /chat)
neuromechanist merged commit d41cf7c into main on Feb 5, 2026
19 checks passed
neuromechanist linked an issue on Feb 5, 2026 that may be closed by this pull request
4 tasks
Development

Successfully merging this pull request may close these issues.

Community dashboard with metrics, usage stats, and sync status