[logstash] Enable TSDB for metrics data streams #17401

Draft
AndersonQ wants to merge 2 commits into elastic:main from AndersonQ:16512-logstash-metrics-TSDB
Conversation

AndersonQ (Member) commented Feb 12, 2026

Proposed commit message

[logstash] Enable TSDB for metrics data streams

Enable TSDB for health_report and node_cel data streams in the Logstash 
integration and add metric_type annotations to pipeline and plugins.

Dimensions added:
- health_report: agent.id, logstash.node.name, logstash.node.uuid, logstash.pipeline.id
- node_cel: agent.id, logstash.node.stats.logstash.name
- pipeline: agent.id, logstash.pipeline.host.name
- plugins: agent.id, logstash.pipeline.host.name

`agent.id` is added as a dimension to all data streams so that events carrying
errors from failed metric fetches can still be indexed, deriving their TSDB ID
from `agent.id` plus the timestamp.

Annotate numeric fields with the appropriate metric_type for the health_report,
node_cel, pipeline, and plugins data streams.

metric_type corrections (counter → gauge):
- node_cel: jvm.threads.count, jvm.threads.peak_count, jvm.mem.heap_max_in_bytes,
queue.events_count
- pipeline: logstash.pipeline.queues.events
- plugins: beats.peak_connections, beats.current_connections

Assisted by Cursor
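
For context, the package changes boil down to two kinds of annotations. Below is a minimal sketch of what they look like, assuming the usual integration package layout (the exact files and full field lists are in the PR diff):

# data_stream/<name>/manifest.yml (sketch): switch the data stream to TSDB
elasticsearch:
  index_mode: "time_series"

# data_stream/<name>/fields/fields.yml (sketch): a dimension and two metric types
- name: logstash.node.name
  type: keyword
  dimension: true
- name: logstash.node.stats.pipelines.events.in
  type: long
  metric_type: counter   # monotonically increasing total
- name: logstash.node.stats.jvm.threads.count
  type: long
  metric_type: gauge     # can go up and down, hence the counter -> gauge corrections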
Data stream changes summary

Data Stream: health_report

Dimensions Added

| Field | Type |
| --- | --- |
| logstash.node.name | keyword |
| logstash.node.uuid | keyword |
| logstash.pipeline.id | keyword |

Fields with metric_type Added

| Field | Type | metric_type |
| --- | --- | --- |
| logstash.pipeline.impacts.severity | short | gauge |
| logstash.pipeline.flow.worker_utilization.current | float | gauge |
| logstash.pipeline.flow.worker_utilization.last_1_hour | float | gauge |
| logstash.pipeline.flow.worker_utilization.last_5_minutes | float | gauge |
| logstash.pipeline.flow.worker_utilization.last_15_minutes | float | gauge |
| logstash.pipeline.flow.worker_utilization.lifetime | float | gauge |
| logstash.pipeline.flow.worker_utilization.last_1_minute | float | gauge |
| logstash.pipeline.flow.worker_utilization.last_24_hours | float | gauge |

Data Stream: node_cel

Dimensions Added

| Field | Type |
| --- | --- |
| logstash.node.stats.logstash.name | keyword |

Fields with metric_type Added

| Field | Type | metric_type |
| --- | --- | --- |
| logstash.node.stats.pipelines.reloads.failures | long | counter |
| logstash.node.stats.pipelines.reloads.successes | long | counter |
| logstash.node.stats.pipelines.queue.events_count | long | gauge |
| logstash.node.stats.pipelines.queue.queue_size_in_bytes | long | gauge |
| logstash.node.stats.pipelines.queue.max_queue_size_in_bytes | long | gauge |
| logstash.node.stats.pipelines.events.in | long | counter |
| logstash.node.stats.pipelines.events.out | long | counter |
| logstash.node.stats.pipelines.events.filtered | long | counter |
| logstash.node.stats.pipelines.events.duration_in_millis | long | counter |
| logstash.node.stats.pipelines.events.queue_push_duration_in_millis | long | counter |

Fields with metric_type Changed

| Field | Type | Old metric_type | New metric_type |
| --- | --- | --- | --- |
| logstash.node.stats.jvm.threads.count | long | counter | gauge |
| logstash.node.stats.jvm.threads.peak_count | long | counter | gauge |
| logstash.node.stats.jvm.mem.heap_max_in_bytes | long | counter | gauge |
| logstash.node.stats.queue.events_count | long | counter | gauge |

Data Stream: pipeline

Dimensions Added

| Field | Type |
| --- | --- |
| logstash.pipeline.host.name | keyword |

Fields with metric_type Added

| Field | Type | metric_type |
| --- | --- | --- |
| logstash.pipeline.info.batch_size | long | gauge |
| logstash.pipeline.info.batch_delay | long | gauge |
| logstash.pipeline.info.workers | long | gauge |

Fields with metric_type Changed

| Field | Type | Old metric_type | New metric_type |
| --- | --- | --- | --- |
| logstash.pipeline.queues.events | long | counter | gauge |

Data Stream: plugins

Dimensions Added

| Field | Type |
| --- | --- |
| logstash.pipeline.host.name | keyword |

Fields with metric_type Changed

| Field | Type | Old metric_type | New metric_type |
| --- | --- | --- | --- |
| logstash.pipeline.plugin.input.metrics.beats.peak_connections | long | counter | gauge |
| logstash.pipeline.plugin.input.metrics.beats.current_connections | long | counter | gauge |

Open questions

Once the data streams are migrated to TSDB, they no longer accept documents that
contain only the error from a failed metric fetch, as those documents lack the
necessary dimensions (routing fields). When testing, I started the agent before
Logstash was ready and got a few such errors; when testing the migration with
https://github.com/elastic/TSDB-migration-test-kit, those documents could not be
migrated.

If we want to keep this behaviour, we'd need to add more dimensions so that
those events can be ingested.

In my tests, the error events were present in the health_report and node_cel
data streams.

Here is an example of the error documents:

{"error": {"type": "illegal_argument_exception", "reason": "Error extracting routing: source didn't contain any routing fields"}, "status": 400, "document": {"cloud": {"availability_zone": "us-central1-f", "instance": {"name": "anderson-logstash", "id": "8077647130981829769"}, "provider": "gcp", "service": {"name": "GCE"}, "machine": {"type": "e2-standard-4"}, "project": {"id": "elastic-observability"}, "region": "us-central1", "account": {"id": "elastic-observability"}}, "input": {"type": "cel"}, "agent": {"name": "anderson-logstash", "id": "27ad10fc-cef3-424a-879c-23721c867517", "type": "filebeat", "ephemeral_id": "c2cfc104-c50f-4906-af6e-7ddcf911012a", "version": "8.17.10"}, "@timestamp": "2026-02-12T11:32:41.562Z", "ecs": {"version": "8.0.0"}, "data_stream": {"namespace": "default", "type": "metrics", "dataset": "logstash.node"}, "elastic_agent": {"id": "27ad10fc-cef3-424a-879c-23721c867517", "version": "8.17.10", "snapshot": false}, "host": {"hostname": "anderson-logstash", "os": {"kernel": "6.1.0-43-cloud-amd64", "codename": "bookworm", "name": "Debian GNU/Linux", "type": "linux", "family": "debian", "version": "12 (bookworm)", "platform": "debian"}, "containerized": false, "ip": ["10.128.0.72", "fe80::4001:aff:fe80:48"], "name": "anderson-logstash", "id": "370ef8b742434f90a470fd961035344e", "mac": ["42-01-0A-80-00-48"], "architecture": "x86_64"}, "error": {"message": "failed eval: ERROR: <input>:7:7: Get \"http://localhost:9600/_node/stats?graph=true&vertices=true\": dial tcp [::1]:9600: connect: connection refused\n |       ? {\n | ......^"}, "event": {"agent_id_status": "verified", "ingested": "2026-02-12T11:32:50Z", "dataset": "logstash.node"}}}

Here is an example of the agent event log for the error:

{"log.level":"warn","@timestamp":"2026-02-16T10:01:10.838Z","message":"Cannot index event '{\"@timestamp\":\"2026-02-16T10:01:00.473Z\",\"agent\":{\"id\":\"a66ab834-dd3a-4f6b-aa64-7738f1655bab\",\"version\":\"8.17.10\",\"ephemeral_id\":\"4590be95-02e2-4273-a8ed-087074c06506\",\"name\":\"anderson-logstash\",\"type\":\"filebeat\"},\"host\":{\"id\":\"6f91b331f776434eb70c91014391f0ef\",\"containerized\":false,\"ip\":[\"10.128.0.107\",\"fe80::4001:aff:fe80:6b\"],\"mac\":[\"42-01-0A-80-00-6B\"],\"hostname\":\"anderson-logstash\",\"name\":\"anderson-logstash\",\"architecture\":\"x86_64\",\"os\":{\"platform\":\"debian\",\"version\":\"12 (bookworm)\",\"family\":\"debian\",\"name\":\"Debian GNU/Linux\",\"kernel\":\"6.1.0-43-cloud-amd64\",\"codename\":\"bookworm\",\"type\":\"linux\"}},\"cloud\":{\"region\":\"us-central1\",\"project\":{\"id\":\"elastic-observability\"},\"account\":{\"id\":\"elastic-observability\"},\"provider\":\"gcp\",\"service\":{\"name\":\"GCE\"},\"instance\":{\"name\":\"anderson-logstash\",\"id\":\"8447075611918163866\"},\"machine\":{\"type\":\"e2-standard-4\"},\"availability_zone\":\"us-central1-f\"},\"error\":{\"message\":\"failed eval: ERROR: <input>:7:7: Get \\\"http://localhost:9601/_node/stats?graph=true&vertices=true\\\": dial tcp [::1]:9601: connect: connection refused\\n |       ? {\\n | ......^\"},\"input\":{\"type\":\"cel\"},\"data_stream\":{\"namespace\":\"default\",\"type\":\"metrics\",\"dataset\":\"logstash.node\"},\"ecs\":{\"version\":\"8.0.0\"},\"event\":{\"dataset\":\"logstash.node\"},\"elastic_agent\":{\"version\":\"8.17.10\",\"id\":\"a66ab834-dd3a-4f6b-aa64-7738f1655bab\",\"snapshot\":false}}\n, Meta: {\"input_id\":\"cel-logstash-ac5b6da5-0cbf-4f95-b1a3-7019b9947c2a\",\"raw_index\":\"metrics-logstash.node-default\",\"stream_id\":\"cel-logstash.node-ac5b6da5-0cbf-4f95-b1a3-7019b9947c2a\"}' (status=400): {\"type\":\"illegal_argument_exception\",\"reason\":\"Error extracting routing: source didn't contain any routing fields\"}, dropping event!","component":{"binary":"filebeat","dataset":"elastic_agent.filebeat","id":"cel-default","type":"cel"},"log":{"source":"cel-default"},"ecs.version":"1.6.0","log.logger":"elasticsearch","log.origin":{"file.line":528,"file.name":"elasticsearch/client.go","function":"github.com/elastic/beats/v7/libbeat/outputs/elasticsearch.(*Client).applyItemStatus"},"service.name":"filebeat","log.type":"event","ecs.version":"1.6.0"}

One option could be to add agent.id as a dimension; that way the error events
could be indexed, deriving their TSDB ID from agent.id plus the timestamp, which
seems acceptable. However, it means additional mappings and dimensions for this
corner case.
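
For illustration, that option amounts to a single extra fields entry per data stream (a sketch; the real change may reference the ECS definition via external: ecs instead):

# hypothetical fields.yml entry: agent.id declared as a dimension, so that
# error-only documents still carry at least one TSDB routing field
- name: agent.id
  type: keyword
  dimension: true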

I checked that the errors are still logged and still appear on the agent
dashboards that surface errors, such as "concerning agents". So the errors
aren't lost; nevertheless, it's still a breaking change for anyone who relies on
documents with the error key to know something is wrong.

Checklist

  • [ ] I have reviewed tips for building integrations and this pull request is aligned with them.
  • [ ] I have verified that all data streams collect metrics or logs.
  • [ ] I have added an entry to my package's changelog.yml file.
  • [ ] I have verified that Kibana version constraints are current according to guidelines.
  • [ ] I have verified that any added dashboard complies with Kibana's Dashboard good practices

How to test this PR locally

I used a modified version
of the TSDB migration test kit
to test the migration of the data streams to TSDB. The kit reindexes the source
index into the destination index, and that approach fails if there are documents
that cannot be ingested into the TSDB index, for example the "error" events
mentioned above. Thus, I modified it to scan the source index and index the
documents with the bulk API, saving the failed and the duplicated documents into
two separate files.

You may try both versions of the test kit.

  • set up an Elastic Stack, version 8.17.10 (8.17 is the lowest supported version)

Agent + Logstash output setup

Let's set up two nodes. All paths are relative to the Logstash directory.

Logstash node 1:

agent.conf:

input {
  elastic_agent {
    port => 5044
    enrich => none
    ssl_enabled => false
  }
}

output {
  elasticsearch {
    cloud_id => "cloudID"
    data_stream => true
    ssl_enabled => true
    user => "elastic"
    password => "changeme"
  }
}

config/logstash.yml:

node.name: logstash-node-1

log.level: debug
log.format.json.fix_duplicate_message_fields: true

path.logs: /tmp/logstash-node-1/logs

Start the node:

cd logstash-node-1
./bin/logstash -f agent.conf

Logstash node 2:

agent.conf:

input {
  elastic_agent {
    port => 5045
    enrich => none
    ssl_enabled => false
  }
}

output {
  elasticsearch {
    cloud_id => "cloudID"
    data_stream => true
    ssl_enabled => true
    user => "elastic"
    password => "changeme"
  }
}

config/logstash.yml:

node.name: logstash-node-2
api.http.port: 9601

log.level: debug
log.format.json.fix_duplicate_message_fields: true

path.logs: /tmp/logstash-node-2/logs

Start the node:

cd logstash-node-2
./bin/logstash -f agent.conf

Elastic Agent

  • create an agent policy
  • add a logstash output with hosts: ["localhost:5044", "localhost:5045"]
  • set that output as the output for integrations
  • generate some logs: flog -t log -o /tmp/agent/in/log.ndjson -w -f json -l -p 1048576 -d 500ms
  • set up a filestream input to collect the generated logs (see the sketch after this list)
  • install the agent with --namespace logstash-output
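
A hypothetical filestream input for the generated logs (Filebeat-style sketch; with Fleet, the equivalent is configured through the Custom Logs integration UI):

# collect the flog output from /tmp/agent/in/log.ndjson
filebeat.inputs:
  - type: filestream
    id: flog-logs
    paths:
      - /tmp/agent/in/log.ndjson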

Elastic Agent with Logstash integration

  • create an agent policy
  • add two Logstash integrations, one to monitor each node:
    • integration 1:
      • Metrics (Elastic Agent) -> Logstash URL: http://localhost:9600
      • Logs paths: /tmp/logstash-node-1/logs/logstash-plain*.log
      • Slowlogs paths: /tmp/logstash-node-1/logs/logstash-slowlog-plain*.log
      • Metrics (Stack Monitoring) -> Hosts: http://localhost:9600
    • integration 2:
      • Metrics (Elastic Agent) -> Logstash URL: http://localhost:9601
      • Logs paths: /tmp/logstash-node-2/logs/logstash-plain*.log
      • Slowlogs paths: /tmp/logstash-node-2/logs/logstash-slowlog-plain*.log
      • Metrics (Stack Monitoring) -> Hosts: http://localhost:9601
  • install the agent with --namespace monitoring

Verify

  • verify the agent-logstash-output agent is sending data; check there are logs
    from path /tmp/agent/in/log.ndjson
  • verify the agent-monitoring agent is collecting data; check the Logstash
    dashboards. I recommend letting it run for a good while, so there is a good
    amount of data for https://github.com/elastic/TSDB-migration-test-kit to use
    when testing the TSDB migration.
  • build and install the integration from this PR: elastic-package build -v && elastic-package install -v
  • run the TSDB-migration-test-kit for each of the changed data streams
  • check that the dashboards work fine with both the old and the new data

Check failures are ingested as metrics

tl;dr: the cel input produces an event with error.message if it fails to fetch
data. The only dimension present on such an event is agent.id. Given that the
event has at least one dimension present, it should be ingested.
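
The shape that matters for routing is small. A sketch in YAML, with values borrowed from the example document above:

# minimal fields a fetch-error event needs in order to be routed in TSDB:
# the agent.id dimension plus the timestamp
"@timestamp": "2026-02-12T11:32:41.562Z"
agent:
  id: "27ad10fc-cef3-424a-879c-23721c867517"   # the only dimension present
error:
  message: "failed eval: ... connection refused"   # truncated for brevity
data_stream:
  type: metrics
  dataset: logstash.node
  namespace: default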

  • go to Logstash node 2 and stop it, so the agent cannot fetch data from it
  • observe the agent logs; you should see errors like:
{
  "log.level": "warn",
  "message": "Failed to index 4 events in last 10s: events were dropped! Look at the event log to view the event and cause.",
  "log.logger": "elasticsearch"
}
  • go to the "Logstash Stack Monitoring Metrics" data view in discover
  • filter by your error.message exists: error.message : *
  • ensure you see connection errors like: error making http request: Get "http://localhost:9601/": dial tcp 127.0.0.1:9601: connect: connection refused

Related issues

  • [Logstash integration] Migrate data streams to TSDB

@AndersonQ AndersonQ self-assigned this Feb 12, 2026
@AndersonQ AndersonQ added the Integration:logstash and Team:Elastic-Agent-Data-Plane labels Feb 12, 2026
@AndersonQ AndersonQ force-pushed the 16512-logstash-metrics-TSDB branch from 1b1af21 to 6cbf6df on February 12, 2026 16:53
@AndersonQ AndersonQ changed the title from "WIP: [logstash] Enable TSDB for metrics data streams" to "[logstash] Enable TSDB for metrics data streams" Feb 13, 2026
github-actions (Contributor) commented:

Vale Linting Results

Summary: 4 warnings, 2 suggestions found

⚠️ Warnings (4)
| File | Line | Rule | Message |
| --- | --- | --- | --- |
| packages/logstash/docs/README.md | 33 | Elastic.Latinisms | Latin terms and abbreviations are a common source of confusion. Use 'and so on' instead of 'etc'. |
| packages/logstash/docs/README.md | 38 | Elastic.Latinisms | Latin terms and abbreviations are a common source of confusion. Use 'for example' instead of 'e.g'. |
| packages/logstash/docs/README.md | 40 | Elastic.Latinisms | Latin terms and abbreviations are a common source of confusion. Use 'for example' instead of 'e.g'. |
| packages/logstash/docs/README.md | 75 | Elastic.DontUse | Don't use 'Note that'. |
💡 Suggestions (2)
| File | Line | Rule | Message |
| --- | --- | --- | --- |
| packages/logstash/docs/README.md | 36 | Elastic.WordChoice | Consider using 'can, might' instead of 'may', unless the term is in the UI. |
| packages/logstash/docs/README.md | 75 | Elastic.WordChoice | Consider using 'refer to (if it's a document), view (if it's a UI element)' instead of 'see', unless the term is in the UI. |

The Vale linter checks documentation changes against the Elastic Docs style guide.

To use Vale locally or report issues, refer to Elastic style guide for Vale.

@elastic-vault-github-plugin-prod

🚀 Benchmarks report

Package logstash 👍(0) 💚(0) 💔(2)

| Data stream | Previous EPS | New EPS | Diff (%) | Result |
| --- | --- | --- | --- | --- |
| log | 7142.86 | 4950.5 | -2192.36 (-30.69%) | 💔 |
| slowlog | 6451.61 | 4405.29 | -2046.32 (-31.72%) | 💔 |

To see the full report comment with /test benchmark fullreport

@elasticmachine

💚 Build Succeeded


cc @AndersonQ

@andrewkroh andrewkroh added the documentation label Feb 16, 2026