Ensure Stackdriver Monitoring is set to Enabled on Kubernetes Engine Clusters

Overview

Use Stackdriver Monitoring to monitor signals and build operations in your Kubernetes Engine clusters. Stackdriver Monitoring can access metrics about CPU utilization, some disk traffic metrics, network traffic, and uptime information. It uses the Monitoring agent to access additional system resources and application services in virtual machine instances.

Rationale

Enabling Stackdriver Monitoring gives you both system metrics and custom metrics. System metrics are measurements of the cluster's infrastructure, such as CPU or memory usage. For system metrics, Stackdriver creates a Deployment that periodically connects to each node, collects metrics about its Pods and containers, and sends those metrics to Stackdriver. Metrics for usage of system resources are collected from the following sources: CPU, memory, evictable memory, non-evictable memory, and disk.

Remediation guidance

Using Console

Go to the Kubernetes Engine page in the GCP Console by visiting https://console.cloud.google.com/kubernetes/list
Select each reported Kubernetes cluster for which monitoring is disabled.
Click the EDIT button and set 'Stackdriver Monitoring' to Enabled.

Using Command Line

To enable monitoring for an existing cluster, run the following command:

gcloud container clusters update [CLUSTER_NAME] --zone [COMPUTE_ZONE] --monitoring-service monitoring.googleapis.com
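
When several clusters are flagged, the same command can be generated once per cluster. The following is a minimal Python sketch, assuming a hand-maintained list of (cluster name, zone) pairs. The function name and the example cluster names are hypothetical, and only the flags shown in the command above are used. It prints the commands instead of running them so they can be reviewed first.

```python
import shlex

# Value taken from the gcloud command above.
MONITORING_SERVICE = "monitoring.googleapis.com"


def build_update_command(cluster_name, zone):
    """Return the argv list that enables monitoring on one existing cluster."""
    return [
        "gcloud", "container", "clusters", "update", cluster_name,
        "--zone", zone,
        "--monitoring-service", MONITORING_SERVICE,
    ]


# Hypothetical clusters; replace with the clusters reported for your project.
flagged_clusters = [
    ("example-cluster-a", "us-central1-a"),
    ("example-cluster-b", "europe-west1-b"),
]

for name, zone in flagged_clusters:
    # Dry run: print a shell-safe command line instead of executing it.
    print(shlex.join(build_update_command(name, zone)))
```

To execute a command instead of printing it, the argv list could be passed to subprocess.run(cmd, check=True) on a machine where gcloud is installed and authenticated.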

Impact

You are charged for the accrued storage costs when you export logs to another Google Cloud Platform service, such as BigQuery. Exporting logs from Stackdriver has no Stackdriver charge.

Default Value

By default, Stackdriver Monitoring is enabled when you create a new cluster using the gcloud command-line tool or Google Cloud Platform Console.

References

https://cloud.google.com/kubernetes-engine/docs/how-to/creating-a-container-cluster
https://cloud.google.com/kubernetes-engine/docs/how-to/monitoring?hl=en_US
https://cloud.google.com/monitoring/agent/

Notes

If you are using Stackdriver Logging, Stackdriver Error Reporting, Debugging, or Stackdriver Trace, and you are not using any services from Stackdriver Monitoring, then you do not have to associate your GCP project with a Stackdriver account and you do not have to select a service tier. By default, Stackdriver limits the features available to your project to those features in the Basic Tier of service.

Service-wide remediation

Recommended when many resources are affected: fix the platform baseline first so new resources inherit the secure setting, then remediate the existing flagged resources in batches.

Google Cloud

Use organization or folder policies where available, shared project templates, logs and alerting baselines, and IaC modules so new resources inherit the secure setting.

Operational rollout

Fix the baseline first at the account, subscription, project, cluster, or tenant scope that owns this control.
Remediate the currently affected resources in batches, starting with internet-exposed and production assets.
Re-scan and track approved exceptions with an owner and expiry date.
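
The rollout order and exception tracking above can be sketched as follows. This is illustrative only: the record fields (internet_exposed, production, owner, expiry), the batch size, and both function names are assumptions for the example, not part of any scanner's schema.

```python
from datetime import date


def remediation_batches(resources, batch_size=2):
    """Order resources so internet-exposed and production assets come first,
    then split the ordered list into batches of batch_size."""
    ordered = sorted(
        resources,
        # False sorts before True, so exposed/production resources lead.
        key=lambda r: (not r.get("internet_exposed", False),
                       not r.get("production", False)),
    )
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]


def active_exceptions(exceptions, today):
    """Keep only exceptions that name an owner and have not yet expired."""
    return [
        e for e in exceptions
        if e.get("owner") and e.get("expiry") is not None and e["expiry"] >= today
    ]


# Hypothetical inventory and exception list.
resources = [
    {"name": "dev-cluster"},
    {"name": "prod-cluster", "production": True},
    {"name": "edge-cluster", "internet_exposed": True, "production": True},
]
exceptions = [
    {"resource": "dev-cluster", "owner": "team-a", "expiry": date(2030, 1, 1)},
    {"resource": "old-cluster", "owner": "team-b", "expiry": date(2020, 1, 1)},
]

for number, batch in enumerate(remediation_batches(resources), start=1):
    print(number, [r["name"] for r in batch])
print([e["resource"] for e in active_exceptions(exceptions, date.today())])
```

Exceptions that lack an owner or have passed their expiry date drop out of the active list, so the corresponding resources show up again as findings on the next scan.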

Query logic

These are the stored checks tied to this control.

Stackdriver Monitoring is set to Enabled on Kubernetes Engine Clusters

Connectors

Google Cloud

Covered asset types

Cluster

Expected check: eq []

gkeClusters(where:{monitoringService_NOT:"monitoring.googleapis.com"}){...AssetFragment}
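
The check passes when the query returns no clusters. The same logic can be sketched in Python over cluster records, assuming each record is a dict carrying a monitoringService key as in the query above. The function name and the sample records are hypothetical.

```python
EXPECTED_SERVICE = "monitoring.googleapis.com"


def clusters_failing_check(clusters):
    """Return the clusters whose monitoringService is not the expected value.

    A record without the key is treated as failing (an assumption of this sketch).
    """
    return [c for c in clusters if c.get("monitoringService") != EXPECTED_SERVICE]


# Hypothetical sample records.
sample = [
    {"name": "example-ok", "monitoringService": "monitoring.googleapis.com"},
    {"name": "example-off", "monitoringService": "none"},
]

failing = clusters_failing_check(sample)
# The control passes only when the result equals [] (the "eq []" expectation).
if failing == []:
    print("PASS")
else:
    print("FAIL:", [c["name"] for c in failing])
```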
