Why we rebuilt the agent
The original Kubeadapt agent required Prometheus, OpenCost, node-exporter, and kube-state-metrics before it could collect a single metric. Every cluster needed a full monitoring stack installed up front. Setup was slow, RBAC was complex, and upgrades broke things.
We started over.
kubeadapt-agent is a ground-up rewrite. One binary, one Deployment, one dependency: metrics-server.
What changed
Zero monitoring stack required
Prometheus, OpenCost, node-exporter, kube-state-metrics: all four are gone. kubeadapt-agent reads resource usage directly from the Kubernetes Metrics API. If your cluster has metrics-server installed, you're ready to go.
GPU metrics, automatically
kubeadapt-agent detects NVIDIA GPUs at startup. If dcgm-exporter is running anywhere in your cluster, the agent finds it and starts collecting GPU utilization, memory, temperature, and power draw per device. Standard and MIG-partitioned GPUs are both supported out of the box. No configuration needed.
Real-time collection
The old agent re-fetched every resource on a fixed interval, pulling the same data over and over. kubeadapt-agent uses Kubernetes watch connections: it syncs once at startup, then receives only change events in real time. When a pod scales or a node joins, the agent knows instantly.
It tracks 20+ resource types out of the box: nodes, pods, deployments, statefulsets, daemonsets, jobs, HPAs, VPAs, persistent volumes, and more.
Auto-discovery
On startup the agent probes the cluster and adapts to what's available:
- Cloud provider (AWS, GCP, Azure)
- VPA and Karpenter if installed
- GPU nodes and NVIDIA metrics exporters
- metrics-server availability
Nothing to configure.
Smarter data, not just more data
Before each snapshot is sent, the agent enriches raw metrics automatically:
- Workload mapping: Every pod is traced back to its parent deployment, statefulset, or daemonset. No orphaned metrics.
- Resource totals: Cluster-wide CPU, memory, GPU, and storage usage are pre-calculated and ready for the dashboard.
Before and after
| Old agent | kubeadapt-agent | |
|---|---|---|
| External dependencies | 4 (Prometheus, OpenCost, node-exporter, kube-state-metrics) | 1 (metrics-server) |
| GPU support | Manual config | Auto-discovered |
| Data collection | Periodic full-list API calls | Informer-based (real-time) |
| Cloud/VPA/Karpenter | Manual or N/A | Auto-detected |
| Health reporting | Basic | Full diagnostics per snapshot |
Migration
The original agent Helm chart is deprecated. The backend accepts data from both agents during the transition, so you can switch over with zero downtime.
What's next
This is the foundation for smarter right-sizing, real-time cost attribution, and automated savings. With a lightweight agent that just works, we ship faster and you spend less time on setup.