EKS nodes reporting kubelet eviction due to image garbage collection and the node-oom tuning that prevented pod restarts during deployments

Amazon EKS (Elastic Kubernetes Service) continues to be the go-to option for enterprises running containerized workloads at scale in the AWS cloud. However, like all Kubernetes-based platforms, EKS clusters occasionally suffer from node-level issues. One such issue reported by users involves the kubelet triggering evictions because of aggressive image garbage collection. In practice, this can result in pod disruptions during deployments, impacting reliability and uptime. Fortunately, through careful analysis and node-level memory tuning, it is possible to prevent such incidents.

TL;DR: EKS nodes may report Kubelet evictions due to image garbage collection when they run low on disk or memory. This can cause unnecessary pod restarts during deployments. By tuning the node’s out-of-memory (OOM) behavior using kernel parameters and properly configuring Kubelet thresholds, you can prevent disruptions. Smart node-oom tuning acts as an effective safeguard against unplanned pod terminations while keeping deployments seamless.

Understanding Image Garbage Collection in Kubernetes

Image garbage collection is a process where the Kubernetes Kubelet reclaims disk space by removing unused container images from the node. The logic makes sense — over time, new containers get deployed with updated images, leaving behind old ones. But when this mechanism becomes overly aggressive, it may interfere with active pods or trigger kubelet-level evictions.
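
The pace of this clean-up is governed by a handful of kubelet settings. A minimal sketch, expressed as KubeletConfiguration fields (how these reach an EKS node depends on your node group bootstrap or custom AMI; the values shown happen to be the upstream defaults):

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Start deleting unused images when imagefs usage crosses this percentage...
imageGCHighThresholdPercent: 85
# ...and keep deleting until usage falls back below this percentage
imageGCLowThresholdPercent: 80
# Leave images alone until they have been unused for at least this long
imageMinimumGCAge: 2m

With defaults like these, garbage collection only wakes up once the image filesystem is already fairly full, which is exactly why it tends to coincide with deployment-time image pulls.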

The key insight is that while image garbage collection aims to prevent nodes from running out of disk space, it can sometimes conflict with other resource needs — most notably, memory and CPU. Resource pressure can result in eviction of critical pods, particularly during deployments when container images are rapidly pulled, unpacked, and run.

Why This Matters During Deployments

During a typical deployment, new pods are scheduled across various nodes. These pods often require fresh container images, which are pulled from registries such as Amazon ECR. If the node already holds a large number of images, disk usage may breach the configured eviction threshold for imagefs (the filesystem where the container runtime stores images and writable layers). This triggers the kubelet to start removing unused images to recover space. Under tight memory margins, however, this action can set off broader resource contention, resulting in:

  • Eviction of low-priority pods
  • Pod restarts that affect availability
  • Delays in rolling updates or rollout failures

Kubernetes treats container image removal as a low-cost operation, but surrounding conditions such as memory pressure and node-level limits can give it complex consequences. Understanding this behavior is key to avoiding disruptions.
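
For context, the kubelet also ships with hard eviction thresholds out of the box. Exact values can differ by version and platform, but the documented Linux defaults look roughly like this when written as KubeletConfiguration fields:

evictionHard:
  memory.available: "100Mi"
  nodefs.available: "10%"
  nodefs.inodesFree: "5%"
  imagefs.available: "15%"

A node whose image filesystem drifts above roughly 85% usage is therefore already close to imagefs eviction before a deployment adds a burst of fresh pulls.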

Dissecting a Real-World Scenario

In a production EKS cluster, multiple nodes began reporting frequent evictions accompanied by logs similar to:

eviction manager: must evict pod(s) to reclaim imagefs

Upon reviewing kubectl describe node, administrators observed the imagefs.available signal falling below its threshold. The node had accumulated a substantial number of images, many of them unused; they had not been cleared earlier because image garbage collection had not run for some time. When garbage collection finally kicked in during deployments, it added to the load, compounding memory pressure and ultimately causing pod OOM (out-of-memory) kills.

More troubling was the fact that the evicted pods were deployment-critical services. Each eviction triggered a restart loop, making the rollout take longer and increasing the risk of timeouts in CI/CD pipelines.
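
A few read-only checks are usually enough to confirm a node is in this state (the node name below is a placeholder, and the containerd path assumes a standard EKS AMI):

# Node conditions: look for DiskPressure or MemoryPressure reporting True
kubectl describe node <node-name>

# Recent evictions across the cluster
kubectl get events -A --field-selector reason=Evicted

# Disk usage on the filesystem backing container images (run on the node itself)
df -h /var/lib/containerd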

Remedy through Node-Level OOM Tuning

To avoid further disruptions, engineers dug into the root causes and implemented node-level out-of-memory (OOM) tuning. By adjusting Linux kernel parameters and the kubelet configuration, they reduced unnecessary pod evictions under memory pressure. The key parameters tuned included the following (a sysctl sketch follows the list):

  • vm.overcommit_memory = 2: Switches the kernel to strict overcommit accounting, so allocations that would exceed the commit limit are refused up front rather than granted and later reclaimed by the OOM killer.
  • vm.panic_on_oom = 0: Keeps the default behavior of killing the offending process on OOM instead of panicking the whole node, which is crucial for keeping the remaining workloads stable.
  • systemd-oomd configuration: On newer AMIs that ship systemd-oomd, its configuration was customized so that eviction decisions were driven by cgroup memory limits and observed usage patterns.
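
A minimal sketch of how such settings can be applied persistently on the nodes, for example from node group user data (the file name is arbitrary and the values simply mirror the list above, not a general recommendation):

# /etc/sysctl.d/99-node-oom.conf
# Strict overcommit accounting: refuse allocations beyond the commit limit
vm.overcommit_memory = 2
# On OOM, kill the offending task instead of panicking the whole node
vm.panic_on_oom = 0

# Apply without a reboot
sudo sysctl --system

Strict overcommit in particular deserves load testing before rollout, since it can cause allocation failures for processes that rely on large virtual memory mappings.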

The Kubelet eviction thresholds were also revisited:

--eviction-hard=imagefs.available<10%,memory.available<500Mi

These thresholds were adjusted so that image garbage collection would not start too early and interfere with concurrent image pulls. By allowing more headroom on the imagefs signal and a larger memory.available buffer, image GC and application deployments could coexist peacefully.
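
Expressed as a KubeletConfiguration fragment rather than a command-line flag, a more tolerant setup along those lines might look like this (the values are illustrative, not prescriptive):

evictionHard:
  imagefs.available: "10%"
  memory.available: "500Mi"
evictionSoft:
  imagefs.available: "15%"
  memory.available: "750Mi"
evictionSoftGracePeriod:
  imagefs.available: "2m"
  memory.available: "2m"
# Give evicted pods time to shut down cleanly instead of being killed outright
evictionMaxPodGracePeriod: 60
# Require sustained recovery before clearing a pressure condition, to avoid flapping
evictionPressureTransitionPeriod: 5m

The soft thresholds give the kubelet a grace window to reclaim images calmly before the hard thresholds start forcing pod evictions.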

Additionally, the team ensured that critical pods ran in the Guaranteed QoS class (requests equal to limits) and carried a higher PriorityClass, so they ranked last in the kubelet's eviction ordering. This reduced the probability of mission-critical pods becoming victims of image-garbage-collection side effects.
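
A hedged sketch of that idea uses a PriorityClass together with requests equal to limits, which places the pod in the Guaranteed QoS class (all names here are made up for illustration):

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: deployment-critical          # hypothetical name
value: 100000
globalDefault: false
description: Pods that must survive node-pressure evictions
---
apiVersion: v1
kind: Pod
metadata:
  name: checkout-api                 # hypothetical workload
spec:
  priorityClassName: deployment-critical
  containers:
    - name: app
      image: <your-registry>/checkout-api:1.0   # placeholder image
      resources:
        requests:                    # requests == limits puts the pod in Guaranteed QoS
          cpu: "500m"
          memory: "512Mi"
        limits:
          cpu: "500m"
          memory: "512Mi"

The kubelet ranks eviction candidates first by whether they exceed their requests and then by priority, so a Guaranteed pod with a high PriorityClass sits near the back of the queue.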

Memory-Aware Pod Strategies

Outside of system-level tuning, changes were made at the application and deployment level to further insulate microservices from resource competition:

  • Defining resource requests and limits for memory explicitly for all workloads
  • Pinning persistent workloads to dedicated node groups using affinity
  • Implementing liveness and readiness probes with thresholds such that temporary delays during image pulls didn’t trigger pod restarts

This layered approach helped ensure that a single pressure point, such as aggressive garbage collection or tight resource margins, did not cascade into larger application failures during deployments.
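
Pulled together, a deployment manifest applying these measures might look roughly like the fragment below (the label key, probe path, image, and numbers are placeholders to adapt per workload):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders-service               # hypothetical service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: orders-service
  template:
    metadata:
      labels:
        app: orders-service
    spec:
      affinity:
        nodeAffinity:                # pin to a dedicated node group
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: workload-tier       # placeholder node label
                    operator: In
                    values: ["persistent"]
      containers:
        - name: app
          image: <your-registry>/orders-service:1.0   # placeholder image
          resources:
            requests:
              memory: "512Mi"        # explicit memory request and limit
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          readinessProbe:
            httpGet: { path: /healthz, port: 8080 }
            periodSeconds: 10
            failureThreshold: 6      # tolerate slow warm-up without failing the rollout
          livenessProbe:
            httpGet: { path: /healthz, port: 8080 }
            initialDelaySeconds: 30  # give the container time to start before liveness counts
            periodSeconds: 20
            failureThreshold: 3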

Better Observability for Predictive Actions

Engineers also made monitoring enhancements using Prometheus and Grafana dashboards to alert on early warning signs such as:

  • Spike in container image pull counts
  • Rapid erosion of imagefs.available
  • Increased OOM kill count in /var/log/messages

These metrics helped predict resource contention and allowed nodes to be scaled preemptively or clean-up jobs to be triggered ahead of brownout conditions. The team also began running image clean-up on targeted nodes proactively, rather than leaving the kubelet to reclaim space on its own while already under pressure.
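
As a sketch of what that alerting can look like when the Prometheus Operator and node_exporter are in place (the PrometheusRule resource, metric names, and thresholds below are assumptions about your particular monitoring stack):

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: node-image-pressure          # hypothetical rule name
spec:
  groups:
    - name: node-image-pressure
      rules:
        - alert: ImageFsRunningLow
          # Root filesystem below 20% free; assumes imagefs shares the root volume
          expr: node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"} < 0.20
          for: 10m
          labels:
            severity: warning
        - alert: KubeletEvictionsObserved
          # Any kubelet-reported evictions over the last 15 minutes
          expr: increase(kubelet_evictions[15m]) > 0
          labels:
            severity: warning

Alerting a few percentage points before the kubelet's own thresholds buys time to scale the node group or trigger a clean-up job, instead of reacting to evictions after the fact.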

Conclusion: A Fine Balance

Kubernetes brings powerful scheduling and orchestration capabilities, but its behavioral nuances around resource management require proactive planning. In the case of image garbage collection in EKS, the goal is not only to free disk space but to ensure the process does not intervene destructively during peak workflows such as deployments.

Node-oom tuning helps fortify nodes against transient memory spikes and ensures predictability in deployment rollouts. A well-tuned node honors resource limits, protects key pods, and minimizes infrastructure-based disruptions. Investments here offer high returns in platform resilience and developer confidence.

Ultimately, teams must treat node configuration as a first-class citizen in the Kubernetes ecosystem. By blending system-level tuning, kubelet adjustments, and smarter scheduling at the pod level, EKS environments can achieve high availability with less drama during high-change operations.