- Need: let Kubernetes use only part of the host node's CPU and memory
(Example: on a node with 16 CPUs and 32 GB of memory, let k8s use only 8 CPUs and 16 GB)
Summary
- The kubelet's systemReserved option limits the resources k8s can use on a node
- To apply it without manually restarting the kubelet, use the kubelet with dynamic config
🚧 This makes node resources editable, so these settings need to be handled very carefully
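As a rough sketch of the sizing arithmetic for the example above: Allocatable is Capacity minus the reservations and the hard eviction threshold, so the amount to reserve is Capacity minus the target minus that threshold. The function name is hypothetical, the 100Mi value is the kubelet's default memory.available threshold, and 32 GB is treated as exactly 32Gi for simplicity.

```shell
#!/bin/sh
# Sketch: how much to put in systemReserved so that Allocatable hits a target.
# Allocatable = Capacity - kubeReserved - systemReserved - hard eviction threshold

# reserve_for_target CAPACITY TARGET [OTHER]
# All arguments use the same unit (millicores for CPU, Mi for memory).
# OTHER is everything else that gets subtracted (kubeReserved, eviction threshold).
reserve_for_target() {
  capacity=$1
  target=$2
  other=${3:-0}
  echo $(( capacity - target - other ))
}

# Example from the note: 16 CPUs / 32Gi node, 8 CPUs / 16Gi left for k8s.
cpu_reserve_m=$(reserve_for_target 16000 8000)          # millicores
mem_reserve_mi=$(reserve_for_target 32768 16384 100)    # Mi, minus the 100Mi threshold

echo "systemReserved cpu=${cpu_reserve_m}m memory=${mem_reserve_mi}Mi"
```

With these inputs the script prints `systemReserved cpu=8000m memory=16284Mi`. Reserving a round 16Gi instead leaves Allocatable slightly below 16Gi, which is what the tests below observe.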
Setup Test Environment
- 2 GCP nodes (8 CPUs, 32 GB each) on Ubuntu 18.04
- Docker 18.06.2-ce, using cgroup
- Kubernetes 1.17, installed with kubeadm
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl
sudo curl -fsSLo /usr/share/keyrings/kubernetes-archive-keyring.gpg https://packages.cloud.google.com/apt/doc/apt-key.gpg
echo "deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
sudo apt-get install -y kubelet=1.17.0-00 kubeadm=1.17.0-00 kubectl=1.17.0-00
sudo apt-mark hold kubelet kubeadm kubectl
- swapoff
- kubeadm init (master)
kubeadm init --pod-network-cidr=192.168.0.0/16
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 10.178.0.41:6443 --token 71a5zg.9tzbr53ygw4dcul9 --discovery-token-ca-cert-hash sha256:dc08975ed40c701f20d18c1945510cef0bb76ee3cb88a82614e3a325aeab9f0b
- using calico
#calico 3.17 for kube 1.17
kubectl apply -f https://docs.projectcalico.org/archive/v3.17/manifests/calico.yaml
- check node ready
kubectl get nodes
NAME STATUS ROLES AGE VERSION
kube-reserve-compute-resource-1 Ready master 12m v1.17.0
kube-reserve-compute-resource-2 Ready <none> 9m53s v1.17.0
Test-1 : start kubelet with the systemReserved option
Result-1 : Capacity stays the same, Allocatable is reduced
- kubectl describe node
kubectl describe node kube-reserve-compute-resource-1
Capacity:
cpu: 8
ephemeral-storage: 9983232Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 32884412Ki
pods: 110
Allocatable:
cpu: 8
ephemeral-storage: 9200546596
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 32782012Ki
pods: 110
kubectl describe node kube-reserve-compute-resource-2
Capacity:
cpu: 8
ephemeral-storage: 9983232Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 32884412Ki
pods: 110
Allocatable:
cpu: 8
ephemeral-storage: 9200546596
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 32782012Ki
pods: 110
- edit kubelet config in kube-reserve-compute-resource-1
(config path : /var/lib/kubelet/config.yaml)
apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 0s
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 0s
    cacheUnauthorizedTTL: 0s
cgroupDriver: systemd
clusterDNS:
- 10.96.0.10
clusterDomain: cluster.local
cpuManagerReconcilePeriod: 0s
evictionPressureTransitionPeriod: 0s
fileCheckFrequency: 0s
healthzBindAddress: 127.0.0.1
healthzPort: 10248
httpCheckFrequency: 0s
imageMinimumGCAge: 0s
kind: KubeletConfiguration
nodeStatusReportFrequency: 0s
nodeStatusUpdateFrequency: 0s
rotateCertificates: true
runtimeRequestTimeout: 0s
staticPodPath: /etc/kubernetes/manifests
streamingConnectionIdleTimeout: 0s
syncFrequency: 0s
volumeStatsAggPeriod: 0s
systemReserved:
  cpu: 4000m
  memory: 16Gi
kubeReserved:
  cpu: 200m
  memory: 2Gi
- kubelet restart
kubectl drain kube-reserve-compute-resource-1 --ignore-daemonsets
systemctl stop kubelet
systemctl stop docker
vi /var/lib/kubelet/config.yaml
# edit above file
systemctl start docker
systemctl start kubelet
kubectl uncordon kube-reserve-compute-resource-1
- kubectl describe node1
kubectl describe node kube-reserve-compute-resource-1
...
Capacity:
cpu: 8
ephemeral-storage: 9983232Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 32884412Ki
pods: 110
Allocatable:
cpu: 3800m
ephemeral-storage: 9200546596
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 13907644Ki
pods: 110
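The new Allocatable values for node 1 follow from the config: the kubelet subtracts systemReserved, kubeReserved and the hard eviction threshold from Capacity. A minimal check of that arithmetic, assuming the default memory.available<100Mi threshold (the config above does not set evictionHard); the helper names are hypothetical:

```shell
#!/bin/sh
# Sketch: reproduce node 1's Allocatable from its Capacity and the reservations.

# Convert a Gi, Mi or Ki quantity (e.g. 16Gi, 100Mi) to Ki. Other suffixes are not handled.
to_ki() {
  case $1 in
    *Gi) echo $(( ${1%Gi} * 1024 * 1024 )) ;;
    *Mi) echo $(( ${1%Mi} * 1024 )) ;;
    *Ki) echo "${1%Ki}" ;;
    *)   echo "unsupported quantity: $1" >&2; return 1 ;;
  esac
}

# allocatable CAPACITY RESERVED... : subtract every reservation from capacity.
allocatable() {
  result=$1
  shift
  for reserved in "$@"; do
    result=$(( result - reserved ))
  done
  echo "$result"
}

# Memory in Ki: capacity, systemReserved, kubeReserved, default eviction threshold.
mem=$(allocatable 32884412 "$(to_ki 16Gi)" "$(to_ki 2Gi)" "$(to_ki 100Mi)")
# CPU in millicores: 8 cores, systemReserved 4000m, kubeReserved 200m.
cpu=$(allocatable 8000 4000 200)

echo "memory: ${mem}Ki"   # 13907644Ki, as reported by kubectl describe
echo "cpu: ${cpu}m"       # 3800m, as reported by kubectl describe
```

The same subtraction with only the 100Mi threshold gives the 32782012Ki seen before the change.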
- edit kubelet config in kube-reserve-compute-resource-2
vi config.yaml
...
systemReserved:
  cpu: "4"
  memory: 16Gi
kubectl drain kube-reserve-compute-resource-2 --ignore-daemonsets
systemctl stop kubelet
systemctl stop docker
vi /var/lib/kubelet/config.yaml
#edit above file
systemctl start docker
systemctl start kubelet
kubectl uncordon kube-reserve-compute-resource-2
- kubectl describe node2
kubectl describe node kube-reserve-compute-resource-2
Capacity:
cpu: 8
ephemeral-storage: 9983232Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 32884420Ki
pods: 110
Allocatable:
cpu: 4
ephemeral-storage: 9200546596
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 16004804Ki
pods: 110
- create pods
Deployment with 10 replicas, each pod requesting 1 CPU and 4Gi of memory
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 10
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: "1"
            memory: "4Gi"
          limits:
            cpu: "1"
            memory: "4Gi"
→ remove the master taint so that pods can also be scheduled on the master node
kubectl taint nodes --all node-role.kubernetes.io/master-
Some pods stay Pending because of the limited Allocatable resources
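A back-of-the-envelope count shows why not all replicas can run. A pod is scheduled only where both its CPU and memory requests fit into the node's Allocatable. The sketch below ignores the requests of pods already running on the nodes (calico, coredns, kube-proxy and so on), so it gives an upper bound; the function name is hypothetical.

```shell
#!/bin/sh
# Sketch: upper bound on how many 1 CPU / 4Gi pods fit into each node's Allocatable.

# max_pods ALLOC_CPU_M ALLOC_MEM_KI POD_CPU_M POD_MEM_KI
# Integer division per resource; the scarcer resource decides.
max_pods() {
  by_cpu=$(( $1 / $3 ))
  by_mem=$(( $2 / $4 ))
  if [ "$by_cpu" -lt "$by_mem" ]; then echo "$by_cpu"; else echo "$by_mem"; fi
}

pod_cpu_m=1000                      # cpu: "1"
pod_mem_ki=$(( 4 * 1024 * 1024 ))   # memory: "4Gi"

# Allocatable values taken from the kubectl describe output above.
node1=$(max_pods 3800 13907644 "$pod_cpu_m" "$pod_mem_ki")
node2=$(max_pods 4000 16004804 "$pod_cpu_m" "$pod_mem_ki")

echo "node1 fits at most $node1 pods, node2 at most $node2"
echo "at least $(( 10 - node1 - node2 )) of the 10 replicas cannot be scheduled"
```

Each node fits at most 3 such pods (memory is the limit on node 2, both CPU and memory on node 1), so at least 4 of the 10 replicas cannot be placed.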
Test 1-1 : add the reserved option without a manual kubelet restart (for production nodes)
Result : Not possible without restarting the kubelet, unless the kubelet runs with the dynamic config option → dynamic config is deprecated as of 1.22
→ The kubeadm reconfigure procedure linked below may achieve something similar
https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-reconfigure/
- edit kubelet service to use dynamic-config
cd /etc/systemd/system/kubelet.service.d
vi 10-kubeadm.conf
ExecStart=/usr/bin/kubelet --dynamic-config-dir=/var/lib/kubelet-dynamic $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
:wq
systemctl daemon-reload
systemctl restart kubelet
- extract configmap
kubectl get configmap -n kube-system
NAME DATA AGE
calico-config 4 4h22m
coredns 1 4h23m
extension-apiserver-authentication 6 4h23m
kube-proxy 2 4h23m
kubeadm-config 2 4h23m
kubelet-config-1.17 1 4h23m
kubectl get configmap -n kube-system kubelet-config-1.17 -oyaml > kubelet-config-node2.yaml
- edit configmap (for the kind: ConfigMap fields at the bottom, it is fine to keep only the ones highlighted in red)
apiVersion: v1
data:
  kubelet: |
    ...
    systemReserved:
      cpu: "4"
      memory: 16Gi
kind: ConfigMap
metadata:
  name: kubelet-config-node2 # set the name
  namespace: kube-system
- create configmap
kubectl create -f kubelet-config-node2.yaml
> configmap/kubelet-config-node2 created
- edit the node so that it points to the new config
kubectl edit node kube-reserve-compute-resource-2
spec:
  configSource:
    configMap:
      kubeletConfigKey: kubelet
      name: kubelet-config-node2
      namespace: kube-system
  podCIDR: 192.168.1.0/24
  podCIDRs:
  - 192.168.1.0/24
- wait a moment, then check that Allocatable has changed
kubectl describe nodes kube-reserve-compute-resource-2
...
Capacity:
cpu: 8
ephemeral-storage: 9983232Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 32884420Ki
pods: 110
Allocatable:
cpu: 4
ephemeral-storage: 9200546596
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 16004804Ki
pods: 110
...
Normal KubeletConfigChanged 98s kubelet, kube-reserve-compute-resource-2 Kubelet restarting to use /api/v1/namespaces/kube-system/configmaps/kubelet-config-node2, UID: f9cc30d5-ff15-4a02-a24d-abc21148e3dc, ResourceVersion: 39145, KubeletConfigKey: kubelet
Normal NodeAllocatableEnforced 87s kubelet, kube-reserve-compute-resource-2 Updated Node Allocatable limit across pods
Normal Starting 87s kubelet, kube-reserve-compute-resource-2 Starting kubelet.
References
- kubelet config
Kube Reserved (k8s 1.8 and later)
- Kubelet Flag: --kube-reserved=[cpu=100m][,][memory=100Mi][,][ephemeral-storage=1Gi][,][pid=1000]
- Kubelet Flag: --kube-reserved-cgroup=
System Reserved (k8s 1.8 and later)
- Kubelet Flag: --system-reserved=[cpu=100m][,][memory=100Mi][,][ephemeral-storage=1Gi][,][pid=1000]
- Kubelet Flag: --system-reserved-cgroup=
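These flags are the command-line counterparts of the systemReserved and kubeReserved fields edited in Test-1. As a small illustration of the flag syntax, the hypothetical helper below prints the flags that correspond to node 1's values; how they are passed to the kubelet (for example through the KUBELET_EXTRA_ARGS variable seen in the unit file) is left as an assumption.

```shell
#!/bin/sh
# Sketch: build kubelet reservation flags from CPU and memory values.

# reserved_flag KIND CPU MEMORY, where KIND is "system" or "kube".
reserved_flag() {
  printf '%s\n' "--$1-reserved=cpu=$2,memory=$3"
}

# The values used for node 1 in Test-1.
reserved_flag system 4000m 16Gi   # prints --system-reserved=cpu=4000m,memory=16Gi
reserved_flag kube 200m 2Gi       # prints --kube-reserved=cpu=200m,memory=2Gi
```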
Explicitly Reserved CPU List (k8s 1.17 and later), for CPU isolation
FEATURE STATE: Kubernetes v1.17 [stable]
Kubelet Flag: --reserved-cpus=0-3
This option defines an explicit cpuset for the system and Kubernetes daemons as well as interrupts and timers, so that the remaining CPUs on the system can be used exclusively for workloads.
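The flag value is a CPU list made of comma-separated IDs and inclusive ranges. As a small illustration, the hypothetical helper below counts how many CPUs such a list covers and how many remain for workloads on an 8-CPU node like the ones in this test. It assumes a well-formed list without overlapping entries.

```shell
#!/bin/sh
# Sketch: count the CPUs covered by a list such as "0-3" or "0-1,4,6-7".

count_cpus() {
  total=0
  old_ifs=$IFS
  IFS=,                       # split the list on commas
  for part in $1; do
    case $part in
      *-*) total=$(( total + ${part#*-} - ${part%-*} + 1 )) ;;  # inclusive range
      *)   total=$(( total + 1 )) ;;                           # single CPU id
    esac
  done
  IFS=$old_ifs
  echo "$total"
}

reserved=$(count_cpus "0-3")
echo "reserved: $reserved CPUs, left for workloads on an 8-CPU node: $(( 8 - reserved ))"
```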
Example Scenario
Here is an example to illustrate Node Allocatable computation:
- Node has 32Gi of memory, 16 CPUs and 100Gi of Storage
- --kube-reserved is set to cpu=1,memory=2Gi,ephemeral-storage=1Gi
- --system-reserved is set to cpu=500m,memory=1Gi,ephemeral-storage=1Gi
- --eviction-hard is set to memory.available<500Mi,nodefs.available<10%
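Working through this scenario: for each resource, Allocatable is Capacity minus kube-reserved, system-reserved and the hard eviction threshold (CPU has no eviction threshold). A minimal sketch of that arithmetic, using millicores for CPU and Mi for memory and storage so that everything stays in integers; the variable names are illustrative:

```shell
#!/bin/sh
# Sketch: Node Allocatable for the example scenario.
# CPU in millicores, memory and storage in Mi.

cpu_alloc=$(( 16000 - 1000 - 500 ))             # 16 CPUs - kube 1 - system 500m

mem_capacity=$(( 32 * 1024 ))
mem_alloc=$(( mem_capacity - 2 * 1024 - 1 * 1024 - 500 ))   # memory.available<500Mi

disk_capacity=$(( 100 * 1024 ))
disk_eviction=$(( disk_capacity * 10 / 100 ))   # nodefs.available<10%
disk_alloc=$(( disk_capacity - 1 * 1024 - 1 * 1024 - disk_eviction ))

echo "cpu: ${cpu_alloc}m"         # 14500m, i.e. 14.5 CPUs
echo "memory: ${mem_alloc}Mi"     # 29196Mi, roughly 28.5Gi
echo "storage: ${disk_alloc}Mi"   # 90112Mi, i.e. 88Gi
```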