This is the first scheduler plug-in I contributed, in 2020, to scheduler-plugins, the open source project of the Kubernetes sig-scheduling group.
1. Background
Related PR: PR103: Pod State Scheduling Plugin
Source code: Pod State Scheduling
- The native Kubernetes scheduler's scoring phase (Score) does not take into account Pods on a node that are in the Terminating state.
- The native Kubernetes scheduler's scoring phase (Score) does not take into account Pods that have been nominated to a node (Nominated Pods).
2. Function introduction
The Pod State scheduler plug-in implements a scoring extension (Score Plugin):
- The more Terminating Pods a node has, the higher the node scores in the scoring phase (these Pods are about to be unbound from the node and release their resources, so the node should score higher).
- The more Nominated Pods a node has, the lower the node scores in the scoring phase (these Pods are about to be bound to the node and consume resources, so the node should score lower).
3. Implementation principle
How the scoring algorithm works:
- Count the Pods in the Terminating state and the Pods in the Nominated state on each node separately
- Add 1 to the node's score for each Terminating Pod
- Subtract 1 from the node's score for each Nominated Pod
func (ps *PodState) score(nodeInfo *framework.NodeInfo) (int64, *framework.Status) {
	var terminatingPodNum, nominatedPodNum int64
	// get nominated Pods for node from nominatedPodMap
	nominatedPodNum = int64(len(ps.handle.PreemptHandle().NominatedPodsForNode(nodeInfo.Node().Name)))
	for _, p := range nodeInfo.Pods {
		// Pod is terminating if DeletionTimestamp has been set
		if p.Pod.DeletionTimestamp != nil {
			terminatingPodNum++
		}
	}
	return terminatingPodNum - nominatedPodNum, nil
}
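To make the rule concrete, here is a minimal, self-contained sketch of the same counting logic outside the scheduler framework. The `pod` and `node` types and the `rawScore` function are hypothetical stand-ins for illustration only; they are not part of the plug-in or of the Kubernetes API.

```go
package main

import "fmt"

// pod is a hypothetical stand-in for the scheduler's Pod info; only the
// property the score depends on is modeled.
type pod struct {
	name        string
	terminating bool // stands in for DeletionTimestamp != nil
}

// node is a hypothetical stand-in for framework.NodeInfo plus the number
// of Pods nominated to that node.
type node struct {
	name      string
	pods      []pod
	nominated int
}

// rawScore applies the rule described above:
// +1 for each Terminating Pod, -1 for each Nominated Pod.
func rawScore(n node) int64 {
	var terminating int64
	for _, p := range n.pods {
		if p.terminating {
			terminating++
		}
	}
	return terminating - int64(n.nominated)
}

func main() {
	nodes := []node{
		{name: "node-a", pods: []pod{{"p1", true}, {"p2", true}, {"p3", false}}},
		{name: "node-b", pods: []pod{{"p4", false}}, nominated: 1},
		{name: "node-c"},
	}
	for _, n := range nodes {
		fmt.Printf("%s: %d\n", n.name, rawScore(n))
	}
	// Prints:
	// node-a: 2
	// node-b: -1
	// node-c: 0
}
```

Note that the raw score can be negative, which is why it has to be rescaled before the framework can use it.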
Normalizing the score range:
- Find the highest and lowest raw scores across all nodes
- Compute the source range (oldRange) as the highest score minus the lowest score
- Compute the target range (newRange) as the framework's maximum node score minus its minimum node score (framework.MaxNodeScore - framework.MinNodeScore)
- Apply the formula: ((node's raw score - lowest score across all nodes) * newRange / oldRange) + framework.MinNodeScore
- If all nodes have the same raw score (oldRange is 0), every node receives framework.MinNodeScore
func (ps *PodState) NormalizeScore(ctx context.Context, state *framework.CycleState, pod *v1.Pod, scores framework.NodeScoreList) *framework.Status {
	// Find highest and lowest scores.
	var highest int64 = -math.MaxInt64
	var lowest int64 = math.MaxInt64
	for _, nodeScore := range scores {
		if nodeScore.Score > highest {
			highest = nodeScore.Score
		}
		if nodeScore.Score < lowest {
			lowest = nodeScore.Score
		}
	}
	// Transform the highest to lowest score range to fit the framework's min to max node score range.
	oldRange := highest - lowest
	newRange := framework.MaxNodeScore - framework.MinNodeScore
	for i, nodeScore := range scores {
		if oldRange == 0 {
			scores[i].Score = framework.MinNodeScore
		} else {
			scores[i].Score = ((nodeScore.Score - lowest) * newRange / oldRange) + framework.MinNodeScore
		}
	}
	return nil
}
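As a worked example of the formula, the sketch below runs the same rescaling on plain integers. It assumes framework.MinNodeScore is 0 and framework.MaxNodeScore is 100; the `normalize` function and the two constants are hypothetical names used only for this illustration.

```go
package main

import (
	"fmt"
	"math"
)

// Assumed values of framework.MinNodeScore and framework.MaxNodeScore.
const (
	minNodeScore int64 = 0
	maxNodeScore int64 = 100
)

// normalize rescales raw scores in place to [minNodeScore, maxNodeScore],
// following the same steps as NormalizeScore above.
func normalize(scores []int64) {
	var highest int64 = math.MinInt64
	var lowest int64 = math.MaxInt64
	for _, s := range scores {
		if s > highest {
			highest = s
		}
		if s < lowest {
			lowest = s
		}
	}
	oldRange := highest - lowest
	newRange := maxNodeScore - minNodeScore
	for i, s := range scores {
		if oldRange == 0 {
			// All nodes scored the same: there is nothing to rank.
			scores[i] = minNodeScore
		} else {
			scores[i] = (s-lowest)*newRange/oldRange + minNodeScore
		}
	}
}

func main() {
	// Raw scores for three nodes: lowest = -1, highest = 2, oldRange = 3.
	scores := []int64{2, -1, 0}
	normalize(scores)
	fmt.Println(scores) // [100 0 33]

	// Equal raw scores give oldRange = 0, so every node gets the minimum.
	equal := []int64{1, 1, 1}
	normalize(equal)
	fmt.Println(equal) // [0 0 0]
}
```

The middle node ends up at 33 rather than 33.3 because the arithmetic is integer division, as in the plug-in's code.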
4. Usage
Scheduler plug-in configuration example:
apiVersion: kubescheduler.config.k8s.io/v1beta1
kind: KubeSchedulerConfiguration
leaderElection:
  leaderElect: false
clientConnection:
  kubeconfig: "REPLACE_ME_WITH_KUBE_CONFIG_PATH"
profiles:
- schedulerName: default-scheduler
  plugins:
    score:
      enabled:
      - name: PodState