Monitoring costs of containerized workloads in EKS using OpenCost and AWS Managed Prometheus / Grafana

Oleksii Bebych

January 25, 2024

Problem statement

Using the cloud is convenient and has many advantages: you can allocate as much capacity as you need immediately, deploy globally fairly quickly, and focus on the business instead of maintaining a data center. On the other hand, you need to be careful about costs, understand how cloud providers charge you, and monitor your spending continuously. AWS Billing and Cost Management provides detailed reports, and AWS Budgets helps with planning and with alerting in unforeseen situations. For containerized workloads, however, especially in EKS, there is no native way to look inside the cluster and see which workloads cost more than expected, or to identify overprovisioned resources and optimize their usage. Many tools that address this have appeared on the market recently.

In this particular case, we wanted to avoid paying license fees for new products and to reuse our existing tooling as much as possible. OpenCost was chosen as a free and lightweight application that can be integrated with Prometheus and Grafana, which we already use for overall monitoring. More information about configuring them is in this post.

Solution overview

OpenCost remote write to AMP

In the previous post about Prometheus remote write configuration, we covered how to install and configure the Kube Prometheus Stack with AWS Managed Prometheus as persistent storage for metrics.

OpenCost advertises a similar capability:

Here is an example of the values in the Helm chart.
The Prometheus and OpenCost Helm charts were installed as part of the infrastructure via Terraform, along with the EKS cluster itself:

### Variables
variable "amp_workspace_id" { type = string }
variable "opencost_helm_version" { default = "1.29.0" }
variable "opencost_service_account_name" { type = string }

data "aws_iam_policy_document" "opencost-oidc-assume-role-policy" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]
    effect  = "Allow"

    condition {
      test     = "StringEquals"
      variable = "${replace(var.iam_openid_provider_url, "https://", "")}:sub"
      values   = ["system:serviceaccount:${var.namespace}:${var.opencost_service_account_name}"]
    }

    principals {
      identifiers = [var.iam_openid_provider_arn]
      type        = "Federated"
    }
  }
}

resource "aws_iam_role" "opencost-irsa-role" {
  assume_role_policy = data.aws_iam_policy_document.opencost-oidc-assume-role-policy.json
  name               = "${var.eks_cluster_name}-${var.opencost_service_account_name}-role"
}

resource "kubernetes_service_account" "opencost-irsa" {
  automount_service_account_token = true
  metadata {
    name      = var.opencost_service_account_name
    namespace = var.namespace
    annotations = {
      "eks.amazonaws.com/role-arn" = aws_iam_role.opencost-irsa-role.arn
    }
  }
}

### Inline IAM Policy
resource "aws_iam_role_policy" "eks-system-opencost" {
  name = "opencost-policy"
  role = aws_iam_role.opencost-irsa-role.id

  policy = <<-EOF
  {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Action": [
          "aps:RemoteWrite",
          "aps:GetSeries",
          "aps:GetLabels",
          "aps:GetMetricMetadata",
          "aps:QueryMetrics"
        ],
        "Resource": "*"
      }
    ]
  }

  EOF
}

### OpenCost helm chart
resource "helm_release" "opencost" {
  name       = "opencost-charts"
  repository = "https://opencost.github.io/opencost-helm-chart"
  chart      = "opencost"
  version    = var.opencost_helm_version

  create_namespace = false
  namespace        = var.namespace

  values = [<<EOF

    serviceAccount:
      create: false
      name: ${kubernetes_service_account.opencost-irsa.metadata[0].name}

    opencost:
      ui:
        enabled: false
      prometheus:
        internal:
          enabled: false
        external:
          enabled: false
        amp:
          enabled: true  # If true, opencost will be configured to remote_write and query from Amazon Managed Service for Prometheus.
          workspaceId: ${var.amp_workspace_id}

      sigV4Proxy:
        image: public.ecr.aws/aws-observability/aws-sigv4-proxy:1.7
        name: aps
        port: 8005
        region: ${var.aws_region}
        host: "aps-workspaces.${var.aws_region}.amazonaws.com" # The hostname for AMP service.

      nodeSelector:
        pool: system
      tolerations:
        - key: dedicated
          operator: Equal
          value: system
          effect: NoSchedule

    EOF
  ]
}

The key elements are:

Using an IAM role for the service account (IRSA) to follow the principle of least privilege.
Overriding several Helm values to use the created service account, disable the UI, enable remote write to AMP, and schedule OpenCost on the “system” nodes, separately from the main workload, by tolerating their taints.

Despite this configuration, no new metrics from OpenCost appeared in Prometheus. There were no errors in the logs and nothing in discussions on the internet, so I started looking for another way.

Scraping OpenCost metrics with Prometheus

Since OpenCost itself did not push its metrics to Amazon Managed Prometheus, I tried a different approach. OpenCost creates a Kubernetes service that exposes /metrics:

% kubectl get svc -n monitoring
NAME                                             TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
opencost-charts                                  ClusterIP   172.20.203.255   <none>        9003/TCP            22h

% kubectl port-forward service/opencost-charts -n monitoring 9003:9003
Forwarding from 127.0.0.1:9003 -> 9003
Forwarding from [::1]:9003 -> 9003
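
With the port-forward running, a quick way to see what the service exposes is to fetch /metrics and list the distinct metric names. The sketch below is illustrative only: the function names are hypothetical, and the default URL assumes the local port-forward shown above.

```python
import re
import urllib.request

# A sample line in the Prometheus text format starts with the metric name,
# optionally followed by {labels}, and then the value.
_METRIC_NAME = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")


def metric_names(exposition: str) -> list[str]:
    """Return the sorted, de-duplicated metric names found in a /metrics payload."""
    names = set()
    for line in exposition.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = _METRIC_NAME.match(line)
        if match:
            names.add(match.group(0))
    return sorted(names)


def fetch_metrics(url: str = "http://127.0.0.1:9003/metrics", timeout: float = 10.0) -> str:
    """Download the raw exposition text (assumes the port-forward is still open)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

While the port-forward is open, `print("\n".join(metric_names(fetch_metrics())))` lists every metric name the service currently exposes.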

So we can add an additional scrape config to our Prometheus operator, which is also deployed via Terraform along with the EKS cluster:

### Prometheus-operator helm
resource "helm_release" "prometheus" {
  name       = "kube-prometheus-stack"
  repository = "https://prometheus-community.github.io/helm-charts"
  chart      = "kube-prometheus-stack"
  version    = var.helm_version

  create_namespace = true
  namespace        = var.namespace

  values = [<<EOF

    alertmanager:
      enabled: false

    prometheus:
      serviceAccount:
        create: false
        name: ${kubernetes_service_account.irsa.metadata[0].name}
      prometheusSpec:
        additionalScrapeConfigs: |
          - job_name: opencost
            honor_labels: true
            scrape_interval: 1m
            scrape_timeout: 10s
            scheme: http
            metrics_path: /metrics
            static_configs:
            - targets: ['opencost-charts:9003']

        remoteWrite:
          - url: ${var.amp_remote_write_url}
            sigv4:
              region: ${var.aws_region}
            queue_config:
              max_samples_per_send: 1000
              max_shards: 200
              capacity: 2500

        nodeSelector:
          pool: system
        tolerations:
          - key: dedicated
            operator: Equal
            value: system
            effect: NoSchedule

    prometheusOperator:
      enabled: true
      nodeSelector:
        pool: system
      tolerations:
        - key: dedicated
          operator: Equal
          value: system
          effect: NoSchedule

    grafana:
      enabled: false

    kube-state-metrics:
      enabled: true
      nodeSelector:
        pool: system
      tolerations:
        - key: dedicated
          operator: Equal
          value: system
          effect: NoSchedule
    EOF
  ]
}

Connecting to the Prometheus UI via port-forward:

% kubectl port-forward service/kube-prometheus-stack-prometheus -n monitoring 9090:9090 
Forwarding from 127.0.0.1:9090 -> 9090
Forwarding from [::1]:9090 -> 9090
Handling connection for 9090
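
The same check can be done without the UI, because Prometheus serves target status over its HTTP API at /api/v1/targets. The following is a minimal sketch, assuming the port-forward above is open on 127.0.0.1:9090. The helper names are hypothetical. The job name `opencost` comes from the scrape config above.

```python
import json
import urllib.request


def job_target_health(targets_response: dict, job: str) -> dict[str, str]:
    """Map each scrape URL of the given job to its reported health."""
    active = targets_response.get("data", {}).get("activeTargets", [])
    return {
        target.get("scrapeUrl", ""): target.get("health", "unknown")
        for target in active
        if target.get("labels", {}).get("job") == job
    }


def fetch_targets(base_url: str = "http://127.0.0.1:9090", timeout: float = 10.0) -> dict:
    """Fetch the target list from the Prometheus HTTP API (needs the port-forward)."""
    with urllib.request.urlopen(f"{base_url}/api/v1/targets", timeout=timeout) as response:
        return json.load(response)
```

With the port-forward open, `job_target_health(fetch_targets(), "opencost")` returns one entry per OpenCost target. An empty result would suggest that the scrape config was not picked up.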

Checking targets:

The target is up, and new metrics are being scraped:

You can find the full list of metrics on the OpenCost website.
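
To give a feel for how these metrics turn into a cost figure, here is a rough sketch that sums an hourly-cost vector and scales it to a month. It assumes the input is the JSON returned by the Prometheus instant-query endpoint (/api/v1/query) for an hourly cost metric such as `node_total_hourly_cost`. The function name and the 730 hours-per-month factor are my own assumptions, not something OpenCost prescribes.

```python
def monthly_cost(query_response: dict, hours_per_month: float = 730.0) -> float:
    """Sum an hourly-cost instant vector and scale it to an approximate month."""
    if query_response.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    samples = query_response["data"]["result"]
    # Each sample looks like {"metric": {...labels}, "value": [timestamp, "value-as-string"]}.
    hourly_total = sum(float(sample["value"][1]) for sample in samples)
    return hourly_total * hours_per_month
```

The result is only an estimate: it extrapolates the current hourly price and ignores nodes that come and go during the month.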

So the new architecture looks like this (simplified a bit):

Visualization in Grafana

There are several Grafana dashboards available, but not all of them actually work. I’ve found a couple of interesting ones.

For example, 11270-kubecost.
This dashboard shows your Kubernetes cluster costs:

Cluster Wide (Live and Estimative)
Relative price of spot instances
Namespace (Live and Estimative)
Price variation between days and weeks
APP (Live and average)
Price comparison with 7 days ago
PVC Costs

Another one is the Kubecost Dashboard for Grafana Cloud.

Conclusion

Among the numerous solutions available for monitoring the costs of containerized workloads in an EKS cluster, this post looked at the free and lightweight OpenCost, integrated with AWS Managed Prometheus. OpenCost also works as a standalone solution with its own simple web UI, but in this particular case we already had AWS Managed Prometheus and Grafana for overall monitoring, so we decided to integrate OpenCost with them and keep all metrics and visualization in one place. For some reason, OpenCost did not send metrics directly to AWS Managed Prometheus via the “remote write” configuration, even though this is documented. As a workaround, I scraped the OpenCost metrics with the Prometheus operator, which then forwards them to AWS Managed Prometheus, and that works. I also found a couple of detailed Grafana dashboards on the internet and showed what they look like. Monitor your cloud spend =)