altitudes®
FinOps · May 07, 2026 · 7 min read

Kubernetes cost visibility is an engineering problem.

Your cluster runs at 68 percent CPU utilization. Capacity planning says that is healthy. The FinOps review says you are spending €180,000 a month on compute. Neither number tells you which team owns which slice. That gap is an engineering problem, and it stays until you build the attribution layer.

Why cluster utilization misleads

Cloud cost dashboards report at the node level. Kubernetes schedules at the pod level. The gap between those two views is where the waste hides.

A cluster at 68 percent average CPU utilization might contain namespaces running at 90 percent and others running at 20 percent. The averages cancel out in the aggregate. The billing does not.
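A small worked example shows how the average hides the spread. The namespace names and core counts below are hypothetical, and the sketch assumes each namespace's utilization is measured against the cores allocated to it.

```python
# Hypothetical figures: (namespace, allocated cores, utilization of those cores).
namespaces = [
    ("checkout", 48, 0.90),
    ("batch-reports", 22, 0.20),
]


def cluster_utilization(rows):
    """Core-weighted average utilization across all namespaces."""
    total_cores = sum(cores for _, cores, _ in rows)
    used_cores = sum(cores * util for _, cores, util in rows)
    return used_cores / total_cores


def idle_cores(rows):
    """Allocated but idle cores, per namespace."""
    return {name: cores * (1 - util) for name, cores, util in rows}


print(f"cluster utilization: {cluster_utilization(namespaces):.0%}")
for name, idle in idle_cores(namespaces).items():
    print(f"{name}: {idle:.1f} idle cores")
```

With these numbers the aggregate comes out at 68 percent, while most of the idle cores sit in a single namespace.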

The three numbers you actually need: cost per namespace, cost per workload within a namespace, and the gap between what a pod requests and what it actually consumes. A pod that requests 4 CPU and uses 0.3 CPU is wasting 3.7 CPU worth of cluster capacity. The cluster utilization metric reads fine. The pod is the problem. You cannot see it from the dashboard you are looking at.
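The request-versus-actual gap is simple arithmetic once both numbers are available. A minimal sketch, assuming the usage figure is an average taken from your metrics stack; the class and pod name are illustrative, not part of any tool's API.

```python
from dataclasses import dataclass


@dataclass
class PodSample:
    name: str
    cpu_request: float  # cores requested in the pod spec
    cpu_used: float     # cores actually consumed, e.g. averaged over a week

    @property
    def wasted_cpu(self) -> float:
        """Requested cores that the pod does not use."""
        return max(self.cpu_request - self.cpu_used, 0.0)

    @property
    def request_utilization(self) -> float:
        """Share of the request that is actually consumed."""
        if self.cpu_request == 0:
            return 0.0
        return self.cpu_used / self.cpu_request


# The pod from the paragraph above: requests 4 CPU, uses 0.3 CPU.
pod = PodSample("api-server", cpu_request=4.0, cpu_used=0.3)
print(f"{pod.name}: {pod.wasted_cpu:.1f} CPU wasted, "
      f"{pod.request_utilization:.1%} of request used")
```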

Requests vs limits vs actual: reading the three together

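Kubernetes cost follows a simple rule: you pay for node capacity. The scheduler places pods according to their resource requests, and the node pools (or the cluster autoscaler) must supply enough capacity for those requests to fit. Actual consumption plays no part in scheduling and does not appear on the bill.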

The waste pattern we encounter most often: teams set resource requests conservatively high for safety margin. They set limits much higher, or leave them absent. Actual consumption sits at 10 to 30 percent of the request. The cluster schedules based on the request. Nodes run at the size needed to satisfy the request. The bill is accurate to the capacity allocated, not to the work performed.
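To put a price on that pattern, a rough sketch follows. It assumes a deliberately simplified model in which a node's monthly cost is split by CPU request share only (memory is ignored); this is not how OpenCost or Kubecost compute allocation, and the node price and function names are hypothetical.

```python
def requested_cost(node_monthly_cost, node_cores, cpu_request):
    """Monthly cost attributed to a pod, proportional to its CPU request."""
    return node_monthly_cost * cpu_request / node_cores


def wasted_cost(node_monthly_cost, node_cores, cpu_request, cpu_used):
    """Portion of the attributed cost that pays for requested but unused CPU."""
    if cpu_request <= 0:
        return 0.0
    idle_fraction = max(cpu_request - cpu_used, 0.0) / cpu_request
    return requested_cost(node_monthly_cost, node_cores, cpu_request) * idle_fraction


# Hypothetical node: 16 cores at 400 euros a month.
# The pod requests 4 cores and uses 20 percent of that request.
cost = requested_cost(400.0, 16, 4.0)
waste = wasted_cost(400.0, 16, 4.0, 0.8)
print(f"attributed: {cost:.2f} EUR/month, of which wasted: {waste:.2f} EUR/month")
```

In this example the pod is charged for a quarter of the node, and four fifths of that charge buys capacity it never touches.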

On a typical mid-market Kubernetes estate with 8 to 15 node pools, request-versus-actual waste averages 25 to 35 percent of total cluster cost. Fixing it requires visibility first, then a targeted right-sizing pass on the pods with the largest request-to-actual gap.

The tooling that makes this visible

OpenCost is a CNCF incubating project that allocates node cost to namespaces and individual workloads. It runs in-cluster, reads the Kubernetes API, and emits Prometheus metrics. The cost model accounts for on-demand, spot, and reserved node pricing. Because the output is plain Prometheus metrics, an existing Grafana instance can chart it through its Prometheus data source, and existing workloads need no changes.

Kubecost extends the same allocation model with multi-cluster views, savings recommendations, and a budget alerting layer. The free tier covers most teams. The commercial tier adds SSO and programmatic API access.

Both tools are table stakes, not differentiators. The differentiation is in who looks at the dashboard and what they do with it. That is the part we work on.

"In most clusters, 20 percent of workloads account for 70 percent of the wasted request capacity. The right-sizing is a PR, not an infrastructure change."

Danny Zak / FinOps Lead

What acting on it looks like

The first sprint is visibility. Deploy OpenCost, connect it to existing Grafana, add a namespace-cost panel to the team dashboard next to CPU and memory. Cost becomes a first-class signal on the same screen as the other engineering metrics.

The second sprint is right-sizing. Identify the four or five workloads with the largest gap between request and actual. In most clusters, 20 percent of workloads account for 70 percent of the wasted request capacity. The right-sizing change is a PR against the Helm values or the Kubernetes manifests. It is not an infrastructure change and it does not require a maintenance window.
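Picking the right-sizing candidates can be sketched as a sort over the request-to-actual gap. The workload names and figures below are made up for illustration, and the 70 percent target simply mirrors the rule of thumb above.

```python
def top_wasters(workloads, target_share=0.70):
    """Return the workloads with the largest request gap, in descending order,
    stopping once they cover `target_share` of all wasted CPU.

    `workloads` maps a name to a (cpu_request, cpu_used) tuple.
    """
    gaps = {name: max(req - used, 0.0) for name, (req, used) in workloads.items()}
    total = sum(gaps.values())
    if total == 0:
        return []

    selected, covered = [], 0.0
    for name, gap in sorted(gaps.items(), key=lambda kv: kv[1], reverse=True):
        if covered / total >= target_share:
            break
        selected.append(name)
        covered += gap
    return selected


# Hypothetical cluster with ten workloads: (cores requested, cores used).
workloads = {
    "search": (50.0, 5.0), "checkout": (36.0, 6.0), "auth": (6.0, 1.0),
    "cart": (5.0, 1.0), "email": (5.0, 1.0), "pdf": (4.0, 1.0),
    "cron": (4.0, 1.0), "audit": (3.0, 1.0), "flags": (3.0, 1.0),
    "docs": (3.0, 1.0),
}
print(top_wasters(workloads))
```

Here two of the ten workloads cover three quarters of the wasted request capacity, so the right-sizing pass starts with two PRs.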

The third sprint is policy. LimitRange objects cap the requests any single container or pod can set within a namespace. ResourceQuota objects cap each namespace's total requests, and raising a quota beyond a threshold requires a FinOps review. A monthly chargeback report goes to each team's engineering lead with three numbers: cost last month, cost versus the month before, and cost as a percentage of total cluster spend.
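The chargeback report needs very little machinery. A minimal sketch of the three numbers per team, assuming monthly cost per team has already been exported from the allocation tool; the team name, figures, and field names are hypothetical.

```python
def chargeback_row(team, cost_last_month, cost_prior_month, cluster_total):
    """Build the three chargeback numbers for one team (amounts in euros)."""
    return {
        "team": team,
        "cost_last_month": cost_last_month,
        "change_vs_prior_month": cost_last_month - cost_prior_month,
        "share_of_cluster_pct": round(100 * cost_last_month / cluster_total, 1),
    }


# Hypothetical team on a cluster that costs 180,000 euros a month.
row = chargeback_row("payments", 27_000, 30_000, 180_000)
print(f"{row['team']}: {row['cost_last_month']} EUR, "
      f"{row['change_vs_prior_month']:+} EUR vs prior month, "
      f"{row['share_of_cluster_pct']}% of cluster spend")
```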

Teams that run all three sprints report 20 to 30 percent Kubernetes cost reduction in the first quarter. The reduction comes from right-sizing, not from removing workloads. No features go away. The bill goes down.

Written by Danny Zak, FinOps Lead