Hao Liang's Blog

Embrace the World with Cloud Native and Open-source

【Scheduling】From kube-scheduler Extender to kube-scheduler Framework

I. Introduction Readers who follow the Kubernetes Scheduler SIG (Special Interest Group) will know that in the recently released Kubernetes 1.19, the Scheduler Framework replaces the original scheduler working mode and officially offers the scheduler to users in plug-in form. Compared with the “four-piece set” of the old scheduler (Predicate, Priority, Bind, and Preemption), the new scheduler framework is more flexible and introduces a total of 11 extension points.

【Scheduling】Priority and preemption mechanism, affinity scheduling, in-tree scheduling algorithm (new features in version 1.19)

1. Priority and preemption mechanism During scheduling, kube-scheduler takes one Pod at a time from the scheduling queue (SchedulingQueue) and runs one round of scheduling for it. So in what order are Pods placed in the scheduling queue? The Pod resource object supports a Priority attribute. Pods with a higher priority are placed toward the front of the scheduling queue and are scheduled first. If a high-priority Pod fails to schedule because no suitable node is found, it is placed in the UnschedulableQueue and enters the preemption phase.
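The queue ordering described above can be illustrated with a small sketch. This is not kube-scheduler's actual implementation (which is written in Go); the class name `SchedulingQueue`, its methods, and the Pod names are hypothetical and only model "higher priority first, then first come first served".

```python
import heapq
import itertools


class SchedulingQueue:
    """Toy model of a priority-ordered scheduling queue (illustrative only)."""

    def __init__(self):
        self._heap = []
        # Monotonic counter used as a tie-breaker so equal priorities pop in arrival order.
        self._counter = itertools.count()
        # Pods that failed scheduling wait here (a stand-in for the UnschedulableQueue).
        self.unschedulable = []

    def add(self, name, priority):
        # heapq is a min-heap, so the priority is negated to pop the highest first.
        heapq.heappush(self._heap, (-priority, next(self._counter), name))

    def pop(self):
        neg_priority, _, name = heapq.heappop(self._heap)
        return name, -neg_priority

    def mark_unschedulable(self, name, priority):
        self.unschedulable.append((name, priority))

    def __len__(self):
        return len(self._heap)


queue = SchedulingQueue()
queue.add("batch-job", 0)
queue.add("web-frontend", 1000)
queue.add("cache", 1000)

# Drain the queue: the two priority-1000 Pods come out first, in arrival order.
order = [queue.pop()[0] for _ in range(len(queue))]
print(order)
```

The tie-breaking counter is a design choice of this sketch; it keeps the ordering deterministic when several Pods share the same priority.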

【Scheduling】kube-scheduler architecture design and startup process code breakdown

1. kube-scheduler architecture design The core function of the scheduler is to find the most suitable node for a Pod to run on. In small clusters, each scheduling cycle traverses all nodes in the cluster to find the most suitable one. In large clusters, each scheduling cycle traverses only a subset of the nodes and picks the most suitable node from that subset. The whole scheduling process is divided into three main stages: pre-selection, optimization, and binding.
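A minimal sketch of the three stages follows, assuming a simplified resource model with only CPU and memory. The `Node` and `Pod` classes, the scoring rule (prefer the node with the most resources left after placement), and all names are assumptions made for illustration; they are not kube-scheduler's real types or algorithms.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_cpu: int  # millicores
    free_mem: int  # MiB


@dataclass
class Pod:
    name: str
    cpu: int  # requested millicores
    mem: int  # requested MiB


def filter_nodes(pod, nodes):
    """Pre-selection: keep only nodes with enough free CPU and memory."""
    return [n for n in nodes if n.free_cpu >= pod.cpu and n.free_mem >= pod.mem]


def score_node(pod, node):
    """Optimization: hypothetical score, higher means more headroom after placement."""
    return (node.free_cpu - pod.cpu) + (node.free_mem - pod.mem)


def schedule(pod, nodes):
    """Run pre-selection, optimization, and binding; return the chosen node name or None."""
    feasible = filter_nodes(pod, nodes)
    if not feasible:
        return None  # no node fits, the Pod stays unscheduled
    best = max(feasible, key=lambda n: score_node(pod, n))
    # Binding: reserve the resources on the chosen node.
    best.free_cpu -= pod.cpu
    best.free_mem -= pod.mem
    return best.name


nodes = [
    Node("node-a", 500, 1024),
    Node("node-b", 2000, 4096),
    Node("node-c", 4000, 8192),
]
chosen = schedule(Pod("web", 1000, 2048), nodes)
print(chosen)
```

In this example node-a is removed during pre-selection, and node-c wins the optimization stage because it has the most headroom.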

【Scheduling】Working principle of kube-scheduler: preemption mechanism in Priority algorithm

1. Why is the preemption mechanism needed? When a Pod fails to be scheduled, it stays in the Pending state for the time being. The scheduler does not retry the Pod until the Pod is updated or the cluster state changes. In real business scenarios, however, online and offline services are treated differently. If a Pod of an online service cannot be scheduled because of insufficient resources, offline services need to give up some of their resources so that the online service can run.
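The idea of offline services yielding resources to an online service can be sketched as a victim-selection step on a single node. This is only an illustration under simplifying assumptions (CPU is the only resource, lowest-priority Pods are evicted first); the function `pick_victims` and the Pod names are hypothetical and do not reproduce the scheduler's actual preemption logic.

```python
def pick_victims(running, needed_cpu, free_cpu, preemptor_priority):
    """Choose lower-priority Pods to evict until the preemptor fits.

    running: list of (name, priority, cpu) tuples for Pods on one node.
    Returns a list of victim names, or None if eviction cannot free enough CPU.
    """
    # Only Pods with strictly lower priority may be evicted; try the lowest first.
    candidates = sorted(
        (p for p in running if p[1] < preemptor_priority),
        key=lambda p: p[1],
    )
    victims = []
    for name, _priority, cpu in candidates:
        if free_cpu >= needed_cpu:
            break
        victims.append(name)
        free_cpu += cpu
    return victims if free_cpu >= needed_cpu else None


running = [
    ("offline-etl", 0, 800),
    ("offline-report", 10, 600),
    ("online-api", 1000, 500),
]
victims = pick_victims(running, needed_cpu=1000, free_cpu=100, preemptor_priority=1000)
print(victims)
```

Here both offline Pods are evicted, while `online-api` is never a candidate because its priority is not lower than the preemptor's.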

【Code Breakdown】Kubernetes scheduler--Analysis of Predicates preselection algorithm

Scheduler workflow When we use a K8S cluster, we often need to create, modify, and delete Deployment controllers. K8S then creates, destroys, and reschedules Pods on the appropriate nodes. This scheduling is carried out by the K8S Scheduler. The Scheduler's workflow is shown below: The Informer component continuously watches for changes to Pod information in etcd (through the API server). To be precise, it watches for changes to the Spec.nodeName field of the Pod. Once

【Code Breakdown】Kubernetes scheduler--Analysis of Priority optimization algorithm

Scheduler workflow When we use a K8S cluster, we often need to create, modify, and delete Deployment controllers. K8S then creates, destroys, and reschedules Pods on the appropriate nodes. This scheduling is carried out by the K8S Scheduler. The Scheduler's workflow is shown below: The Informer component continuously watches for changes to Pod information in etcd (through the API server). To be precise, it watches for changes to the Spec.nodeName field of the Pod. Once it detects that this field is empty, it concludes that there are Pods in the cluster that have not yet been scheduled to a Node.
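The "empty nodeName means unscheduled" check can be illustrated as follows. This sketch does not use a real Informer or the Kubernetes client; it assumes Pods are plain dictionaries shaped like a Pod manifest (`metadata.name`, `spec.nodeName`), and the function name `unscheduled_pods` is hypothetical.

```python
def unscheduled_pods(pods):
    """Return the names of Pods whose spec.nodeName is missing or empty."""
    return [
        pod["metadata"]["name"]
        for pod in pods
        if not pod.get("spec", {}).get("nodeName")
    ]


pods = [
    {"metadata": {"name": "web-1"}, "spec": {"nodeName": "node-a"}},
    {"metadata": {"name": "web-2"}, "spec": {"nodeName": ""}},
    {"metadata": {"name": "web-3"}, "spec": {}},
]
pending = unscheduled_pods(pods)
print(pending)
```

Treating a missing key and an empty string the same way is a simplification of this sketch; both cases stand for a Pod that has not been bound to a node.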