Align telemetry with OpenTelemetry MCP semantic conventions #3399

@JAORMX

Summary

OpenTelemetry officially merged MCP semantic conventions on January 12, 2026 (PR #2083). ToolHive should align its telemetry implementation with these standards for better observability tool compatibility and ecosystem alignment.

Current State

ToolHive has a solid telemetry foundation, but it predates the official conventions:

  • Middleware: pkg/telemetry/middleware.go - spans, attributes, metrics for MCP proxy
  • vMCP: pkg/vmcp/server/telemetry.go - backend and workflow telemetry
  • Parser: pkg/mcp/parser.go - already extracts _meta field (lines 228-233)

Core Implementation Tasks

1. Update Attributes and Span Naming

File: pkg/telemetry/middleware.go

Attribute Renames (for standard compliance):

  • mcp.method → mcp.method.name (line 222)
  • mcp.request.id → jsonrpc.request.id (line 229)
  • mcp.tool.name → gen_ai.tool.name (line 263)
  • mcp.tool.arguments → gen_ai.tool.call.arguments (line 267, opt-in)
  • mcp.prompt.name → gen_ai.prompt.name (line 279)
  • mcp.transport → network.transport with value mapping:
    • stdio → pipe
    • sse, streamable-http → tcp
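
The value mapping for network.transport could be sketched as a small helper. This is only an illustration: the function name networkTransport is hypothetical and does not exist in pkg/telemetry, and how unknown transports should be handled is an assumption.

```go
package main

import "fmt"

// networkTransport (hypothetical helper) maps a ToolHive transport name to
// the value emitted as the network.transport attribute, following the
// mapping listed in this issue. The boolean is false for transports the
// list does not cover, so callers can decide whether to omit the attribute.
func networkTransport(transport string) (string, bool) {
	switch transport {
	case "stdio":
		return "pipe", true
	case "sse", "streamable-http":
		return "tcp", true
	default:
		return "", false
	}
}

func main() {
	for _, t := range []string{"stdio", "sse", "streamable-http"} {
		v, _ := networkTransport(t)
		fmt.Printf("%s -> %s\n", t, v)
	}
}
```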

Add Missing Required Attributes:

  • mcp.protocol.version - MCP spec version (e.g., "2025-11-25")
  • mcp.session.id - Session identifier
  • jsonrpc.protocol.version - When not "2.0"
  • error.type - On failures (JSON-RPC error code or "tool_error")
  • rpc.response.status_code - When response contains error
  • gen_ai.operation.name - "execute_tool" for tool calls
  • network.protocol.name - "http" for SSE/streamable-http
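
One possible way to derive error.type from a response is sketched below. The rpcError type and the errorType function are hypothetical names introduced for illustration, and treating a tool result flagged as an error (the isError field of an MCP tool result) as "tool_error" is an assumption about how failures would be detected.

```go
package main

import (
	"fmt"
	"strconv"
)

// rpcError is a hypothetical, trimmed-down view of a JSON-RPC error object.
type rpcError struct {
	Code int
}

// errorType returns the value for the error.type attribute, or "" when the
// call succeeded and the attribute should be omitted. A JSON-RPC error takes
// precedence; otherwise a failed tool result is reported as "tool_error".
func errorType(rpcErr *rpcError, toolIsError bool) string {
	switch {
	case rpcErr != nil:
		return strconv.Itoa(rpcErr.Code)
	case toolIsError:
		return "tool_error"
	default:
		return ""
	}
}

func main() {
	fmt.Println(errorType(&rpcError{Code: -32601}, false)) // JSON-RPC error code as string
	fmt.Println(errorType(nil, true))                      // tool-level failure
}
```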

Span Naming (lines 161-170):

  • Current: mcp.tools/call
  • Standard: tools/call get_weather (include target when available)
  • Format: {mcp.method.name} {target} where target is tool/prompt name
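
The naming rule can be expressed as a small function. The name spanName is hypothetical; falling back to the bare method name when no target is available is an assumption based on the "include target when available" wording.

```go
package main

import "fmt"

// spanName (hypothetical helper) builds a span name in the
// "{mcp.method.name} {target}" format. When there is no target, such as for
// tools/list, only the method name is used.
func spanName(method, target string) string {
	if target == "" {
		return method
	}
	return method + " " + target
}

func main() {
	fmt.Println(spanName("tools/call", "get_weather"))
	fmt.Println(spanName("tools/list", ""))
}
```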

2. Add Standard Metrics

File: pkg/telemetry/middleware.go

New Standard Metrics (alongside existing):

  • mcp.client.operation.duration (histogram, seconds)
  • mcp.server.operation.duration (histogram, seconds)
  • Use recommended buckets: [0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1, 2, 5, 10, 30, 60, 120, 300]

Keep existing toolhive_mcp_* metrics for backward compatibility.

3. Implement W3C Trace Context Propagation

Critical Feature: Enable distributed tracing across MCP boundaries.

3a. Context Injection (vMCP → Backends)

Files:

  • New: pkg/telemetry/propagation.go - W3C Trace Context helpers
  • pkg/vmcp/client/client.go - Inject before backend calls

Implementation: Inject traceparent and tracestate into params._meta:

{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "params": {
    "name": "get-weather",
    "_meta": {
      "traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
      "tracestate": "rojo=00f067aa0ba902b7"
    }
  }
}

Code Structure:

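// propagation.go

// InjectTraceContext writes the active trace context (traceparent, tracestate)
// from ctx into params._meta so the backend can continue the trace.
func InjectTraceContext(ctx context.Context, params map[string]interface{}) {
    meta := getOrCreateMeta(params) // returns params["_meta"], creating it if absent
    carrier := &MetaCarrier{meta: meta}
    otel.GetTextMapPropagator().Inject(ctx, carrier)
}

// MetaCarrier adapts params._meta to propagation.TextMapCarrier.
type MetaCarrier struct {
    meta map[string]interface{}
}
// MetaCarrier must implement Get, Set, and Keys to satisfy TextMapCarrier.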
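
A minimal sketch of what the carrier and the getOrCreateMeta helper might look like follows. To keep it dependency-free, the TextMapCarrier interface is redeclared locally with the same method set as propagation.TextMapCarrier in go.opentelemetry.io/otel/propagation; real code would use that interface directly. Ignoring non-string _meta values in Get is an assumption.

```go
package main

import "fmt"

// TextMapCarrier mirrors the method set of propagation.TextMapCarrier,
// redeclared here only so the sketch compiles without dependencies.
type TextMapCarrier interface {
	Get(key string) string
	Set(key, value string)
	Keys() []string
}

// MetaCarrier reads and writes propagation fields in params._meta.
type MetaCarrier struct {
	meta map[string]interface{}
}

// Compile-time check that MetaCarrier satisfies the interface.
var _ TextMapCarrier = (*MetaCarrier)(nil)

// Get returns the string stored under key, or "" if absent or not a string.
func (c *MetaCarrier) Get(key string) string {
	s, _ := c.meta[key].(string)
	return s
}

// Set stores value under key. The meta map must be non-nil.
func (c *MetaCarrier) Set(key, value string) { c.meta[key] = value }

// Keys lists every key currently present in _meta.
func (c *MetaCarrier) Keys() []string {
	keys := make([]string, 0, len(c.meta))
	for k := range c.meta {
		keys = append(keys, k)
	}
	return keys
}

// getOrCreateMeta returns params["_meta"] as a map, creating it when it is
// missing or has an unexpected type.
func getOrCreateMeta(params map[string]interface{}) map[string]interface{} {
	if meta, ok := params["_meta"].(map[string]interface{}); ok {
		return meta
	}
	meta := map[string]interface{}{}
	params["_meta"] = meta
	return meta
}

func main() {
	params := map[string]interface{}{"name": "get-weather"}
	carrier := &MetaCarrier{meta: getOrCreateMeta(params)}
	carrier.Set("traceparent", "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	fmt.Println(params["_meta"])
}
```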

3b. Context Extraction (Clients → ToolHive)

File: pkg/telemetry/middleware.go (around line 114)

Implementation: Extract trace context from incoming params._meta and use as parent for server span:

if parsedMCP := mcpparser.GetParsedMCPRequest(ctx); parsedMCP != nil && parsedMCP.Meta != nil {
    carrier := &MetaCarrier{meta: parsedMCP.Meta}
    ctx = otel.GetTextMapPropagator().Extract(ctx, carrier)
}
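
For tests around extraction, it can be handy to check that a traceparent value found in params._meta is well formed. The sketch below validates only the version 00 format from the W3C Trace Context specification; the function name validTraceparent is hypothetical, and the OpenTelemetry propagator performs its own validation, so this is meant as a test aid rather than a replacement.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// validTraceparent reports whether s is a well-formed version-00 traceparent:
// "00-<32 hex trace-id>-<16 hex parent-id>-<2 hex flags>", lowercase hex,
// with trace-id and parent-id not all zeros.
func validTraceparent(s string) bool {
	parts := strings.Split(s, "-")
	if len(parts) != 4 || parts[0] != "00" {
		return false
	}
	wantLen := []int{2, 32, 16, 2}
	for i, p := range parts {
		if len(p) != wantLen[i] || p != strings.ToLower(p) {
			return false
		}
		if _, err := hex.DecodeString(p); err != nil {
			return false
		}
	}
	// All-zero trace-id or parent-id is invalid.
	return strings.Trim(parts[1], "0") != "" && strings.Trim(parts[2], "0") != ""
}

func main() {
	fmt.Println(validTraceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"))
	fmt.Println(validTraceparent("not-a-traceparent"))
}
```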

4. Add Client-Side Spans for vMCP

File: pkg/vmcp/client/client.go

Current: Only SERVER spans when serving requests
Needed: CLIENT spans when vMCP calls backend MCP servers

Operations to Instrument:

  • initialize - Protocol handshake
  • tools/list, tools/call
  • resources/list, resources/read
  • prompts/list, prompts/get

Span Kind: Use trace.SpanKindClient for these operations.

5. Add Session Duration Metrics

Files:

  • pkg/vmcp/server/session_adapter.go - Track session lifecycle
  • Proxy components - Track session termination

Metrics:

  • mcp.client.session.duration (histogram, seconds)
  • mcp.server.session.duration (histogram, seconds)

Attributes:

  • mcp.protocol.version
  • network.protocol.name
  • network.transport
  • error.type (if session terminated with error)

Backward Compatibility

Approach: Emit standard names by default. During the transition period, emit both legacy and standard names when the legacy flag is enabled.

Configuration: Add optional flag:

telemetry:
  useLegacyAttributes: false  # default: standard only

CLI Flag: --otel-use-legacy-attributes (enables dual emission)
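
Dual emission could look something like the sketch below. The legacyNames table and withLegacy function are hypothetical; the table only covers renames whose values are unchanged. mcp.transport is left out on purpose because its legacy value (for example stdio) differs from the network.transport value (pipe) and would need separate handling.

```go
package main

import "fmt"

// legacyNames maps each standard attribute key to the legacy key it
// replaces, for renames where the value itself does not change.
var legacyNames = map[string]string{
	"mcp.method.name":            "mcp.method",
	"jsonrpc.request.id":         "mcp.request.id",
	"gen_ai.tool.name":           "mcp.tool.name",
	"gen_ai.tool.call.arguments": "mcp.tool.arguments",
	"gen_ai.prompt.name":         "mcp.prompt.name",
}

// withLegacy returns a copy of attrs. When useLegacy is true, every
// attribute with a known legacy name is also emitted under that name.
func withLegacy(attrs map[string]string, useLegacy bool) map[string]string {
	out := make(map[string]string, len(attrs)*2)
	for k, v := range attrs {
		out[k] = v
		if legacy, ok := legacyNames[k]; ok && useLegacy {
			out[legacy] = v
		}
	}
	return out
}

func main() {
	attrs := map[string]string{"gen_ai.tool.name": "get_weather", "mcp.session.id": "s1"}
	fmt.Println(len(withLegacy(attrs, false)), "attributes without the flag")
	fmt.Println(len(withLegacy(attrs, true)), "attributes with the flag")
}
```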

Timeline:

  • Ship standard-compliant attributes/metrics immediately
  • Announce deprecation after 6 months
  • Remove legacy support in v2.0

Components Affected

  • pkg/telemetry/middleware.go - MCP proxy telemetry (spans, metrics, attributes)
  • pkg/telemetry/propagation.go - New file for trace context helpers
  • pkg/vmcp/client/client.go - CLIENT spans and context injection
  • pkg/vmcp/server/session_adapter.go - Session duration tracking
  • pkg/telemetry/config.go - Backward compatibility configuration
  • cmd/thv-operator/api/v1alpha1/*_types.go - CRD telemetry specs
  • docs/observability.md - Update documentation
  • Test files: Update assertions for new attribute names

Testing Requirements

  • Update test expectations in pkg/telemetry/middleware_test.go
  • Update E2E tests in test/e2e/telemetry_middleware_e2e_test.go
  • Add trace propagation E2E test (vMCP → backend → vMCP chain)
  • Validate span hierarchy (CLIENT/SERVER relationship)
  • Test session duration tracking
  • Verify histogram buckets

Success Criteria

  • All required attributes emitted per standard
  • Span names follow {method} {target} format
  • Standard metrics recorded with correct units/attributes
  • W3C Trace Context propagates through params._meta
  • CLIENT spans created for vMCP backend calls
  • Session duration metrics tracked
  • Network transport values mapped correctly (stdio→pipe; sse, streamable-http→tcp)
  • Documentation updated
  • Backward compatibility maintained (with flag)
  • No performance regression
