Agent version
7.78.2
Bug Report
Description
After setting use_stats_summary_as_source: true in the kubelet check, sum aggregations of container resource utilization metrics roughly double:
- kubernetes.cpu.usage.total
- kubernetes.memory.usage
- kubernetes.memory.working_set
The change is visible only on sum: aggregations, which is consistent with the same container series being emitted by both the cAdvisor source (/metrics/cadvisor) and the Summary API source (/stats/summary) at the same time, instead of Summary replacing cAdvisor. Per-series values look unchanged; the total roughly doubles.

This is not a real load increase. container.cpu.usage (collected directly from cgroups, independent of the kubelet endpoints) is flat across the same change window, while sum:kubernetes.cpu.usage.total steps up ~2x at deploy time. See attached screenshots.
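The arithmetic can be illustrated with a minimal sketch (metric names follow the report; container names and values are invented for illustration):

```python
# Minimal model of the double-count: each (metric, container, source) triple is
# one emitted series. All container names and values here are hypothetical.
def total(series, metric):
    """Sum a metric across all emitted series, as a sum: aggregation would."""
    return sum(v for (m, _container, _source), v in series.items() if m == metric)

# Before the flag: cAdvisor is the only source for kubernetes.* metrics.
before = {
    ("kubernetes.cpu.usage.total", "web", "cadvisor"): 100.0,
    ("kubernetes.cpu.usage.total", "db", "cadvisor"): 50.0,
}

# After enabling use_stats_summary_as_source: the Summary source is added
# alongside cAdvisor instead of replacing it, so each container is reported
# twice with an (almost) identical per-series value.
after = dict(before)
after[("kubernetes.cpu.usage.total", "web", "summary")] = 100.0
after[("kubernetes.cpu.usage.total", "db", "summary")] = 50.0

assert total(before, "kubernetes.cpu.usage.total") == 150.0
assert total(after, "kubernetes.cpu.usage.total") == 300.0  # sum doubles
# Per-series values are unchanged; only the aggregate doubles.
```

Any single series still reads 100.0 or 50.0, which is why per-container graphs look normal while sum: steps up 2x.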

Environment
- Datadog Agent: 7.78.2
- Kubernetes: EKS
- Container runtime: containerd, runc workloads (no gVisor on the affected nodes)
- Setting that triggers the change: use_stats_summary_as_source: true in the kubelet check
Expected behavior
When use_stats_summary_as_source: true, the kubelet check should emit kubernetes.cpu.* and kubernetes.memory.* for each container from a single source (Summary), so existing dashboards / monitors using sum: remain correct.
Actual behavior
sum:kubernetes.cpu.usage.total and sum:kubernetes.memory.working_set step up by roughly 2x at the moment the flag is enabled. Cross-checks:
- sum:container.cpu.usage (cgroup-direct) is unchanged, so host load did not change.
- system.cpu.* shows no corresponding jump.
- The pattern is consistent with cAdvisor-derived and Summary-derived series being emitted concurrently for the same containers.
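The concurrent-emission hypothesis can also be checked directly against the two kubelet endpoints by comparing which containers each one reports. The sketch below runs against invented sample payloads (the endpoint paths are from this report; the cAdvisor exposition excerpt and Summary JSON are hypothetical stand-ins for real captures):

```python
import json
import re

# Hypothetical excerpt of the Prometheus exposition from /metrics/cadvisor.
cadvisor_text = """\
container_cpu_usage_seconds_total{container="app",pod="web-0"} 123.4
container_cpu_usage_seconds_total{container="db",pod="db-0"} 56.7
"""

# Hypothetical excerpt of the JSON returned by /stats/summary.
summary_json = json.loads("""
{"pods": [
  {"podRef": {"name": "web-0"},
   "containers": [{"name": "app", "cpu": {"usageCoreNanoSeconds": 123400000000}}]},
  {"podRef": {"name": "db-0"},
   "containers": [{"name": "db", "cpu": {"usageCoreNanoSeconds": 56700000000}}]}
]}
""")

# Container names visible to each source.
cadvisor_containers = set(re.findall(r'container="([^"]+)"', cadvisor_text))
summary_containers = {
    c["name"] for pod in summary_json["pods"] for c in pod["containers"]
}

# Every container visible in both sources is double-counted if the check
# emits kubernetes.* series from both at once.
both = cadvisor_containers & summary_containers
assert both == {"app", "db"}
```

On a real node the same comparison can be made with captures of the two endpoints; any container in the intersection is a double-count candidate.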
Why other metric families are unaffected
| Metric family | Collector | Source | Affected by use_stats_summary_as_source? |
| --- | --- | --- | --- |
| system.cpu.* | system check | host kernel stats | No |
| container.cpu.*, container.memory.* | container check | cgroups + runtime | No |
| kubernetes.cpu.*, kubernetes.memory.* | kubelet check | /metrics/cadvisor and/or /stats/summary | Yes, and currently both, hence the 2x |
Two independent ground-truth sources (system.* and container.*) stay flat while the kubelet-derived family doubles, which points at double counting inside the kubelet check rather than a workload change.
Screenshots
(attached below) Step-up in sum:kubernetes.cpu.usage.total and sum:kubernetes.memory.working_set at the time use_stats_summary_as_source: true was rolled out, alongside sum:container.cpu.usage flat across the same window.
Workaround
For standard runc workloads, reverting use_stats_summary_as_source to its default (false) avoids the duplication.
However, we originally enabled use_stats_summary_as_source: true because we need the kubelet Summary (CRI) source for sandboxed workloads such as gVisor (runsc), where cAdvisor does not report per-container resource usage at all (see #44084). For those workloads we have no workaround: turning the flag off loses the metrics entirely, while turning it on double-counts every other container on the node.
Agent configuration
```yaml
init_config:
  loader: core
instances:
  - kubelet_metrics_endpoint: https://localhost:10250/metrics
    use_stats_summary_as_source: true
    min_collection_interval: 20
```
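With this configuration applied, duplicate emission can be spotted by listing the kubernetes.* series the check produces and looking for identical (metric, tags) pairs within one collection cycle. The excerpt below is a hypothetical, simplified rendering of such a series list (metric names from the report; container and pod tags invented), used only to show the duplicate-detection step:

```shell
# Hypothetical excerpt of per-cycle series (metric + identifying tags), one
# line per emitted series; tag values are illustrative, not a real capture.
cat <<'EOF' > /tmp/kubelet_series.txt
kubernetes.cpu.usage.total kube_container_name:app pod_name:web-0
kubernetes.cpu.usage.total kube_container_name:app pod_name:web-0
kubernetes.memory.working_set kube_container_name:app pod_name:web-0
kubernetes.memory.working_set kube_container_name:app pod_name:web-0
kubernetes.memory.usage kube_container_name:sidecar pod_name:web-0
EOF

# Any line printed here is emitted more than once per cycle, i.e. a
# double-count candidate for sum: aggregations.
sort /tmp/kubelet_series.txt | uniq -d
```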
Operating System
Linux (EKS nodes)