Changelog
What's shipping
Product updates from the Infrawatch team. We ship every 2–3 weeks. Every release improves correlation accuracy or reduces noise — we don't ship dashboards for the sake of it.
Config change fingerprinting on Platform tier
Infrawatch now captures Kubernetes ConfigMap and Helm release diffs that fall within the correlation window. For example, when a deployment rolls out 4 minutes before an incident and the window for that service covers it, we surface the config diff alongside the signal cluster.
- Helm release change detection via Kubernetes audit log
- ConfigMap diff viewer on the incident timeline
- Fingerprint hash linking repeated config patterns to prior incidents
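To illustrate the fingerprint idea, here is a minimal Python sketch. It assumes a fingerprint is a SHA-256 hash over a canonical JSON form of the changed keys; the function names, the 16-character truncation, and the in-memory index are hypothetical and are not Infrawatch's actual implementation.

```python
import hashlib
import json
from collections import defaultdict


def diff_configmap(old: dict, new: dict) -> dict:
    """Return {key: [old_value, new_value]} for every key that changed."""
    changed = {}
    for key in sorted(set(old) | set(new)):
        if old.get(key) != new.get(key):
            changed[key] = [old.get(key), new.get(key)]
    return changed


def fingerprint(diff: dict) -> str:
    """Hash a diff deterministically (sorted keys, compact separators)."""
    canonical = json.dumps(diff, sort_keys=True, separators=(",", ":"))
    # Truncating to 16 hex characters is an illustrative choice only.
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical index: fingerprint -> incident IDs it was seen with.
incidents_by_fingerprint = defaultdict(list)


def record(diff: dict, incident_id: str):
    """Store the incident under the diff's fingerprint and return
    (fingerprint, prior incidents that shared the same fingerprint)."""
    fp = fingerprint(diff)
    prior = list(incidents_by_fingerprint[fp])
    incidents_by_fingerprint[fp].append(incident_id)
    return fp, prior
```

Because the hash is computed over a canonical form, the same config change produces the same fingerprint regardless of key order, which is what lets a repeated pattern be linked back to earlier incidents.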
Root cause heatmap
A new visualization that shows which services and config changes appear most frequently across your incident history. It helps you find the small set of causes behind most of your incidents, for example the 10% of causes behind 60% of incidents.
- 7-day and 30-day heatmap views
- Linkable to individual incidents for drill-down
- Available on Platform tier
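The ranking behind a heatmap like this can be sketched as a frequency count. The Python below is a rough illustration only: the incident records, the `cause` field, and the `top_causes_covering` helper are assumptions made for the example, not part of the product.

```python
from collections import Counter


def cause_counts(incidents):
    """Count how often each cause appears across incidents."""
    return Counter(incident["cause"] for incident in incidents)


def top_causes_covering(counts, share):
    """Return the most frequent causes that together account for at
    least `share` (0.0 to 1.0) of all incidents."""
    total = sum(counts.values())
    if total == 0:
        return []
    selected, covered = [], 0
    for cause, n in counts.most_common():
        if covered / total >= share:
            break
        selected.append(cause)
        covered += n
    return selected


# Hypothetical incident history: one cause dominates.
history = (
    [{"cause": "payments-configmap"}] * 7
    + [{"cause": "ingress-helm-release"}] * 2
    + [{"cause": "cache-eviction"}]
)
dominant = top_causes_covering(cause_counts(history), 0.6)
```

With this sample history, a single cause accounts for 7 of 10 incidents, so it alone crosses the 60% mark.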
Alert deduplication tuning
We improved the deduplication algorithm to handle flapping alerts, meaning alerts that fire, resolve, and re-fire within the same incident window. These are now grouped into a single incident thread instead of spawning duplicates.
- Flap detection with configurable re-fire threshold (default 3×)
- Dedup stats added to incident detail view
- PagerDuty alerts now carry their dedup key through to Infrawatch
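One way to picture flap detection is as a count of re-fires per dedup key. The Python sketch below assumes the threshold means "three re-fires after a resolve within one incident window"; that reading, along with the `AlertEvent` type and function names, is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertEvent:
    dedup_key: str
    timestamp: float  # seconds since the incident window opened
    state: str        # "firing" or "resolved"


def count_refires(events):
    """Count 'firing' events that follow a 'resolved' event."""
    refires = 0
    seen_fire = False
    previous = None
    for event in sorted(events, key=lambda e: e.timestamp):
        if event.state == "firing":
            if previous == "resolved" and seen_fire:
                refires += 1
            seen_fire = True
        previous = event.state
    return refires


def is_flapping(events, threshold=3):
    """Treat an alert as flapping once it re-fires `threshold` times."""
    return count_refires(events) >= threshold


def make_events(key, fire_count):
    """Build an alternating fire/resolve sequence for illustration."""
    events = []
    for i in range(fire_count):
        events.append(AlertEvent(key, i * 20.0, "firing"))
        events.append(AlertEvent(key, i * 20.0 + 10.0, "resolved"))
    return events
```

In this sketch, an alert that fires four times (three re-fires) would be classed as flapping and its events kept in one thread, while an alert that fires three times would not.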
Runbook attachment per incident type
Platform teams can now attach runbook URLs to incident signature patterns. When Infrawatch identifies a known pattern, the runbook link is surfaced to the on-call engineer in the Slack alert and the incident detail view.
- Match runbooks to signal fingerprints
- Runbook preview renders Markdown from GitHub / Confluence / Notion URLs
- Incident resolution notes auto-populate from runbook completion
OpenTelemetry Collector v0.95 support
Updated the OTLP ingestion pipeline to support OTel Collector v0.95, adding OTLP/HTTP JSON ingestion and improved span batching.
- OTLP/HTTP JSON ingestion (previously OTLP/gRPC only)
- Span link correlation for distributed trace→incident mapping
Custom correlation rules
Platform tier users can now write custom correlation rules in YAML. Override the default 2-minute window per service cluster, add service affinity rules, or exclude noisy signals from correlation entirely.
- YAML-based rule editor in the Infrawatch dashboard
- Per-cluster window overrides (range 30s – 15min)
- Signal exclusion patterns
- Rule dry-run mode using historical incident data
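The window override limits can be sketched as a small validation step. The Python below is illustrative only: it uses a plain dict in place of a parsed YAML rule file, and the duration syntax (`30s`, `5min`), the cluster names, and the function names are assumptions rather than the documented rule schema.

```python
import re

DEFAULT_WINDOW_S = 120   # default 2-minute window
MIN_WINDOW_S = 30        # lower bound for overrides
MAX_WINDOW_S = 15 * 60   # upper bound for overrides

_UNITS = {"s": 1, "min": 60}
_DURATION = re.compile(r"^(\d+)(s|min)$")


def parse_duration(text: str) -> int:
    """Convert a string such as '45s' or '5min' to seconds."""
    match = _DURATION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def window_for(cluster: str, overrides: dict) -> int:
    """Return the correlation window in seconds for a service cluster,
    falling back to the default when no override is present."""
    raw = overrides.get(cluster)
    if raw is None:
        return DEFAULT_WINDOW_S
    seconds = parse_duration(raw)
    if not MIN_WINDOW_S <= seconds <= MAX_WINDOW_S:
        raise ValueError(
            f"window for {cluster!r} must be between 30s and 15min"
        )
    return seconds


# Hypothetical overrides, as they might look after loading a rule file.
overrides = {"payments": "5min", "batch-jobs": "15min"}
```

Here `window_for("payments", overrides)` yields 300 seconds, an unlisted cluster falls back to 120 seconds, and a value such as `10s` or `16min` is rejected.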
Early access launch
Infrawatch is open for early access. The core Starter tier is live: metric and Kubernetes event correlation, alert deduplication, and Slack/PagerDuty integration. We're onboarding platform teams manually, and each team gets a founder-led setup session.
- Multi-source signal ingestion (Prometheus, k8s Events, Alertmanager)
- 2-minute correlation window with 15s precision
- Incident severity scoring
- Slack alert with correlated signal summary
- PagerDuty incident creation
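As a rough illustration of a 2-minute window with 15s precision, the Python sketch below snaps timestamps to 15-second buckets and keeps signals whose bucket falls in the 2 minutes leading up to the incident. Both the bucketing interpretation of "precision" and the backward-looking window are assumptions, and the signal names are made up.

```python
WINDOW_S = 120     # 2-minute correlation window
PRECISION_S = 15   # timestamps are snapped to 15-second buckets


def bucket(timestamp: float) -> int:
    """Snap a timestamp (in seconds) down to its 15-second bucket."""
    return int(timestamp // PRECISION_S) * PRECISION_S


def correlated(signals, incident_ts: float):
    """Return the (name, timestamp) signals whose bucket lies within
    the window that ends at the incident's bucket."""
    end = bucket(incident_ts)
    start = end - WINDOW_S
    return [s for s in signals if start <= bucket(s[1]) <= end]


# Hypothetical signals around an incident at t=1000s.
signals = [
    ("prometheus:latency_p99", 880.0),
    ("k8s:pod_restart", 860.0),
    ("alertmanager:error_rate", 1004.0),
    ("prometheus:cpu", 1010.0),
]
matches = correlated(signals, 1000.0)
```

With the incident at t=1000s, the window spans buckets 870 through 990, so the signals at 880s and 1004s are kept while those at 860s and 1010s fall outside it.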