Optimize Kubernetes Performance: Distribute CPUs Across NUMA Nodes
Is your Kubernetes application underperforming? Bottlenecks can arise from uneven CPU allocation across NUMA (Non-Uniform Memory Access) nodes. Learn how the distribute-cpus-across-numa CPUManager policy option can help boost performance.
What is distribute-cpus-across-numa and Why Should You Care?
By default, the Kubernetes CPU Manager's static policy packs CPUs onto a single NUMA node until it is full before spilling over to the next one. This can lead to performance issues, especially for parallel applications that rely on synchronized operations.
- The Problem: Imagine a parallel job where one worker is consistently slower because it has fewer CPUs available on its NUMA node than its peers do; with synchronized operations, every other worker ends up waiting for that straggler at each step.
- The Solution: The distribute-cpus-across-numa policy option ensures that CPU allocations are evenly spread across NUMA nodes. It allows application developers to create environments where no single worker suffers from NUMA effects more than another.
Who Benefits from Even CPU Distribution?
This feature is particularly valuable for:
- High-performance computing (HPC) applications
- Applications using parallel algorithms with barrier synchronization
- Workloads where consistent CPU access across NUMA nodes is critical
How distribute-cpus-across-numa Works
The distribute-cpus-across-numa policy option is implemented within the static CPUManager policy in Kubernetes. When enabled, the allocation algorithm tries to do the following:
- Even Distribution: When a container's exclusive CPU request cannot be satisfied from a single NUMA node, the CPU Manager splits the allocation as evenly as possible across NUMA nodes. For example, a container requesting 80 exclusive CPUs on a machine whose four NUMA nodes each have 36 allocatable CPUs might receive 20 CPUs from each node instead of a packed 36 + 36 + 8 split.
- Best Effort Allocation: If a perfectly even split isn't possible, the remaining CPUs are assigned so that the imbalance across NUMA nodes stays as small as possible.
Enabling the distribute-cpus-across-numa Policy Option
To enable the distribute-cpus-across-numa feature, you'll need to:
- Enable the CPUManagerPolicyBetaOptions feature gate on your kubelet.
- Set the CPUManager policy to static.
- Include distribute-cpus-across-numa in the list of CPUManager policy options in your kubelet configuration, as shown in the sketch below.
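Here is a minimal KubeletConfiguration sketch with those three settings. The reservedSystemCPUs value is a placeholder: the static policy requires a non-empty CPU reservation for system and Kubernetes daemons, so adjust it (or use kubeReserved/systemReserved) to suit your nodes.

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  CPUManagerPolicyBetaOptions: true      # required while the policy option is in beta
cpuManagerPolicy: static                 # exclusive CPU pinning only happens under the static policy
cpuManagerPolicyOptions:
  distribute-cpus-across-numa: "true"    # spread exclusive CPU allocations across NUMA nodes
reservedSystemCPUs: "0"                  # placeholder reservation; the static policy needs one
```

If you are also switching the policy itself from none to static, you may additionally need to remove the kubelet's cpu_manager_state file before restarting; check the CPU management documentation for your Kubernetes version.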
Important: Enabling or disabling this feature requires a kubelet restart.
Verifying the CPU Distribution
Once the policy option is enabled, you can verify its effectiveness by:
- Deploying a pod with a nodeSelector targeting a node with multiple NUMA nodes.
- Requesting exclusive CPUs for the container.
- Verifying that the allocated CPUs are evenly distributed across the NUMA nodes. You can do this by examining the container's CPU affinity or by checking the cpu_manager_numa_allocation_spread metric. A sample pod spec is sketched below.
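As a concrete sketch (the pod name, node label value, and image are placeholders), the pod below requests whole CPUs with limits equal to requests, giving it Guaranteed QoS so the static policy pins it to exclusive CPUs:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: numa-spread-test                           # placeholder name
spec:
  nodeSelector:
    kubernetes.io/hostname: multi-numa-worker      # placeholder: a node with more than one NUMA node
  containers:
  - name: worker
    image: busybox:1.36
    command: ["sleep", "3600"]
    resources:
      requests:
        cpu: "8"                                   # whole CPUs with requests == limits -> Guaranteed QoS,
        memory: "256Mi"                            # so the static policy grants 8 exclusive CPUs
      limits:
        cpu: "8"
        memory: "256Mi"
```

Once the pod is running, inspect Cpus_allowed_list in /proc/1/status inside the container (for example via kubectl exec) and compare the assigned CPU IDs against the node's NUMA topology from lscpu or numactl --hardware to confirm that the allocation is spread across NUMA nodes.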
Upgrade and Downgrade Considerations
- This feature is opt-in, so upgrades and downgrades should not impact running workloads.
- Existing workloads will continue to run uninterrupted, with any future workloads having their CPUs allocated according to the policy in place.
Monitoring CPU Distribution Across NUMA Nodes
Kubernetes provides metrics to help you monitor the distribution of CPUs across NUMA nodes. Key metrics include:
- cpu_manager_numa_allocation_spread: Shows how CPUs are distributed across NUMA nodes. Look for a more even distribution when the policy option is enabled. One way to collect this metric is sketched below.
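These metrics are exposed on the kubelet's /metrics endpoint. One way to collect them is with Prometheus; the snippet below is a minimal sketch that assumes Prometheus runs in-cluster with a service account authorized to scrape the kubelet, so adapt the TLS and authorization settings to your environment.

```yaml
scrape_configs:
- job_name: kubelet-cpu-manager
  scheme: https
  tls_config:
    insecure_skip_verify: true                     # illustration only; configure a proper CA in production
  authorization:
    credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  kubernetes_sd_configs:
  - role: node                                     # one scrape target per node, pointing at its kubelet
  metric_relabel_configs:
  - source_labels: [__name__]
    regex: ".*cpu_manager.*"                       # keep only CPU Manager related series
    action: keep
```

For a quick ad-hoc check, you can also run kubectl get --raw /api/v1/nodes/<node-name>/proxy/metrics and search the output for the CPU Manager metrics.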
Potential Caveats
- Node Availability: Effective CPU distribution requires sufficient CPU resources available on multiple NUMA nodes.
- Existing Workloads: Pre-existing CPU allocations might affect the ability to achieve a perfectly balanced distribution.
Conclusion
The distribute-cpus-across-numa policy option offers a powerful way to optimize Kubernetes performance for NUMA-aware applications. By distributing CPUs evenly, you can minimize bottlenecks and improve the overall efficiency of your workloads. If you're running parallel applications in Kubernetes, consider exploring this feature.