Baidu AI Cloud Container Engine (CCE) Dynamic Scheduling Plugin Guide
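Overview:
cce-dysched-extender is a plugin for the default Kubernetes scheduler. It uses the scheduler extender mechanism to register Filter and Prioritize hooks with kube-scheduler, which lets it influence the default scheduler's placement decisions.
Node metrics come from the metrics-server component, so make sure metrics-server is working properly in the cluster before you deploy cce-dysched-extender.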
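One way to confirm this prerequisite is to check that the metrics API is registered and available. The sketch below assumes metrics-server is exposed through the standard v1beta1.metrics.k8s.io APIService; the function name metrics_api_available is made up for illustration.

```shell
# Hypothetical helper: succeeds only if the metrics API served by
# metrics-server reports the condition Available=True.
metrics_api_available() {
  local status
  # Read the status of the "Available" condition; empty if the APIService is missing.
  status=$(kubectl get apiservice v1beta1.metrics.k8s.io \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}' 2>/dev/null) || status=""
  [ "$status" = "True" ]
}
```

If the function succeeds, running kubectl top nodes should then print per-node CPU and memory usage.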
CCE Dynamic Scheduling Plugin
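The plugin provides the following main features:
- Filter stage: nodes whose resource usage exceeds a threshold are filtered out;
- Prioritize stage: nodes with lower resource usage receive a higher priority, with CPU compared first and memory second.
Set the resource tolerance thresholds to suit your workload with the --tolerance-memory-rate and --tolerance-cpu-rate parameters; both default to 80%.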
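To build intuition for the two stages, the sketch below approximates them on the output of kubectl top nodes --no-headers. It is not the plugin's implementation: the function name rank_nodes is hypothetical, and treating a node exactly at the threshold as acceptable is an assumption.

```shell
# Hypothetical helper: approximates Filter + Prioritize on stdin lines in
# `kubectl top nodes --no-headers` format:
#   NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
# Arguments: CPU threshold and memory threshold in percent (default 80).
rank_nodes() {
  local cpu_max="${1:-80}" mem_max="${2:-80}"
  awk -v cpu_max="$cpu_max" -v mem_max="$mem_max" '
    {
      cpu = $3; mem = $5
      sub(/%$/, "", cpu); sub(/%$/, "", mem)
      # Skip nodes whose usage columns are not numeric (metrics missing).
      if (cpu !~ /^[0-9]+$/ || mem !~ /^[0-9]+$/) next
      # Filter: drop nodes above either tolerance threshold (assumed strict >).
      if (cpu + 0 > cpu_max + 0 || mem + 0 > mem_max + 0) next
      print $1, cpu, mem
    }' | sort -k2,2n -k3,3n   # Prioritize: lowest CPU first, then lowest memory
}
```

For example, kubectl top nodes --no-headers | rank_nodes 80 80 lists the surviving nodes as "name cpu% memory%", with the most preferred candidate on the first line.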
Installation and Deployment
- Deploy the cce-dysched-extender plugin by running kubectl apply -f all-in-one.yaml. The all-in-one.yaml file contains the following:
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    app: cce-dysched-extender
    app.kubernetes.io/component: controller
    app.kubernetes.io/name: cce-dysched-extender
  name: cce-dysched-extender
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app: cce-dysched-extender
    app.kubernetes.io/name: cce-dysched-extender
  name: cce-dysched-extender
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - pods
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - metrics.k8s.io
  resources:
  - nodes
  - pods
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    app: cce-dysched-extender
    app.kubernetes.io/name: cce-dysched-extender
  name: cce-dysched-extender
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cce-dysched-extender
subjects:
- kind: ServiceAccount
  name: cce-dysched-extender
  namespace: kube-system
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/cce-load-balancer-internal-vpc: "true"
  labels:
    app: cce-dysched-extender
  name: cce-dysched-extender
  namespace: kube-system
spec:
  ports:
  - name: http
    port: 8080
    targetPort: 8080
  selector:
    app: cce-dysched-extender
  sessionAffinity: None
  type: LoadBalancer
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: cce-dysched-extender
  name: cce-dysched-extender
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cce-dysched-extender
  template:
    metadata:
      labels:
        app: cce-dysched-extender
    spec:
      containers:
      - args:
        - -v=3
        - --tolerance-cpu-rate=80
        - --tolerance-memory-rate=80
        image: registry.baidubce.com/cce-plugin-pro/cce-dysched-extender:v0.3.0
        imagePullPolicy: Always
        livenessProbe:
          httpGet:
            path: /healthz
            port: http
            scheme: HTTP
          periodSeconds: 60
        name: dysched-dysched
        ports:
        - containerPort: 8080
          name: http
          protocol: TCP
        readinessProbe:
          httpGet:
            path: /healthz
            port: http
            scheme: HTTP
          periodSeconds: 60
        resources:
          limits:
            cpu: "4"
            memory: 4Gi
          requests:
            cpu: 100m
            memory: 100Mi
      serviceAccountName: cce-dysched-extender
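After applying the manifest, it can help to confirm that the Deployment actually became ready before moving on. The following is a minimal sketch; the function name wait_for_extender and the 120-second timeout are illustrative choices, not values from this guide.

```shell
# Hypothetical helper: waits for the extender Deployment to roll out, then
# lists its pods using the app label from the manifest.
wait_for_extender() {
  # Blocks until the rollout completes or the (arbitrary) timeout expires.
  kubectl -n kube-system rollout status deployment/cce-dysched-extender --timeout=120s || return 1
  kubectl -n kube-system get pods -l app=cce-dysched-extender
}
```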
- Get the IP address of the cce-dysched-extender service:
$ export DYSCHED_EXTENDER_SVC_IP=$(kubectl get service cce-dysched-extender -n kube-system | awk '{print $4}' | grep -v EXTER)
$ echo $DYSCHED_EXTENDER_SVC_IP
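Parsing the table output works once the load balancer is provisioned, but the EXTERNAL-IP column can still read pending right after deployment. A possible alternative, sketched below, polls the Service status with jsonpath. It assumes the load balancer reports an IP (not a hostname) in .status.loadBalancer.ingress[0].ip; the function name, retry count, and delay are hypothetical.

```shell
# Hypothetical helper: polls until the Service has a load balancer IP and
# prints it. Arguments: number of attempts (default 30), delay in seconds (default 5).
get_extender_ip() {
  local tries="${1:-30}" delay="${2:-5}" ip
  while [ "$tries" -gt 0 ]; do
    # Empty output means no IP has been assigned yet.
    ip=$(kubectl get service cce-dysched-extender -n kube-system \
      -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null) || ip=""
    if [ -n "$ip" ]; then
      echo "$ip"
      return 0
    fi
    tries=$((tries - 1))
    sleep "$delay"
  done
  echo "no load balancer IP assigned yet" >&2
  return 1
}
```

It could be used as export DYSCHED_EXTENDER_SVC_IP=$(get_extender_ip).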
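The remaining steps must be performed on every master node.
- Edit /etc/kubernetes/scheduler-policy.json (create it if it does not exist) so that it contains the following, replacing <DYSCHED_EXTENDER_SVC_IP> with the value obtained in the previous step: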
{
  "kind": "Policy",
  "apiVersion": "v1",
  "extenders": [
    {
      "urlPrefix": "http://<DYSCHED_EXTENDER_SVC_IP>:8080/dysched/extender",
      "prioritizeVerb": "prioritize",
      "filterVerb": "filter",
      "enableHttps": false,
      "nodeCacheCapable": true,
      "ignorable": true,
      "weight": 10
    }
  ]
}
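Because the same file has to be created on every master node, generating it from the IP may be less error-prone than editing it by hand. The sketch below prints the policy shown above with the placeholder filled in; the function name render_scheduler_policy is hypothetical.

```shell
# Hypothetical helper: prints the scheduler policy for a given extender IP.
render_scheduler_policy() {
  local ip="$1"
  if [ -z "$ip" ]; then
    echo "usage: render_scheduler_policy <extender-ip>" >&2
    return 1
  fi
  # Same content as the policy file in this guide, with the IP substituted.
  cat <<EOF
{
  "kind": "Policy",
  "apiVersion": "v1",
  "extenders": [
    {
      "urlPrefix": "http://${ip}:8080/dysched/extender",
      "prioritizeVerb": "prioritize",
      "filterVerb": "filter",
      "enableHttps": false,
      "nodeCacheCapable": true,
      "ignorable": true,
      "weight": 10
    }
  ]
}
EOF
}
```

For example: render_scheduler_policy "$DYSCHED_EXTENDER_SVC_IP" > /etc/kubernetes/scheduler-policy.json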
- Add --policy-config-file=/etc/kubernetes/scheduler-policy.json to the kube-scheduler startup parameters by editing its service unit with the following command:
$ vim /etc/systemd/system/kube-scheduler.service
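After editing, a quick check that the flag is really present in the unit file can save a failed restart. This is a minimal sketch; the function name has_policy_flag is hypothetical, and it only looks for the flag text without validating the rest of the unit. Note that upstream kube-scheduler removed --policy-config-file in Kubernetes v1.23, so this step applies to clusters older than that.

```shell
# Hypothetical helper: succeeds if the unit file passes the policy file to
# kube-scheduler. Argument: unit file path (defaults to the path in this guide).
has_policy_flag() {
  local unit_file="${1:-/etc/systemd/system/kube-scheduler.service}"
  # "--" stops option parsing so the pattern is not read as a grep flag.
  grep -q -- '--policy-config-file=/etc/kubernetes/scheduler-policy.json' "$unit_file"
}
```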
- Restart kube-scheduler with the following command:
$ systemctl daemon-reload && systemctl restart kube-scheduler
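Once the scheduler has restarted, a short check like the one below can confirm that it is running and that the extender is reachable from the master node. It assumes the /healthz path used by the container probes is also reachable through the Service on port 8080; the function name verify_dysched_setup is hypothetical.

```shell
# Hypothetical helper: checks that kube-scheduler is active and that the
# extender answers on /healthz. Argument: the extender service IP.
verify_dysched_setup() {
  local ip="$1"
  systemctl is-active --quiet kube-scheduler \
    || { echo "kube-scheduler is not active" >&2; return 1; }
  # -f makes curl fail on HTTP errors; --max-time bounds the wait.
  curl -fsS --max-time 5 "http://${ip}:8080/healthz" >/dev/null \
    || { echo "extender health check failed" >&2; return 1; }
  echo "ok"
}
```

For example: verify_dysched_setup "$DYSCHED_EXTENDER_SVC_IP"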