uninitialized taint
I have upgraded my LKE cluster to Kubernetes v1.24, and the nodes never reached the initialized state. (The issue has been ongoing for 3 hours.)
The nodes still carry the uninitialized taint:
kubectl get nodes -o json | jq '.items[].spec'
{
  "podCIDR": "10.2.5.0/24",
  "podCIDRs": [
    "10.2.5.0/24"
  ],
  "taints": [
    {
      "effect": "NoSchedule",
      "key": "node.cloudprovider.kubernetes.io/uninitialized",
      "value": "true"
    }
  ]
}
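As I understand it, this taint is normally removed by the Linode cloud-controller-manager once it has initialized the node (on LKE that component seems to run in the managed control plane, since it does not appear in the kube-system pod list below). If it helps, the taint can be cleared manually as a temporary workaround, with the caveat that node metadata normally set by the CCM may still be missing and the taint may come back when the node is recycled:
# temporary workaround only; the CCM normally removes this taint itself
kubectl taint nodes lke38402-61346-63a9eb700c21 node.cloudprovider.kubernetes.io/uninitialized:NoSchedule-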
kubectl get pods -A
shows
NAMESPACE NAME READY STATUS RESTARTS AGE
ingress ingress-nginx-controller-74cb6699df-655xq 0/1 Pending 0 80m
ingress ingress-nginx-controller-74cb6699df-mqn8x 0/1 Pending 0 80m
ingress ingress-nginx-controller-74cb6699df-ms8sf 0/1 Pending 0 80m
ingress ingress-nginx-controller-74cb6699df-sdv2t 0/1 Pending 0 80m
kube-system calico-kube-controllers-9f8c48f46-bl8jt 1/1 Running 0 21m
kube-system calico-kube-controllers-9f8c48f46-ldckh 0/1 Error 0 68m
kube-system calico-node-df568 1/1 Running 0 21m
kube-system coredns-9cc8b85c6-6fth5 1/1 Running 0 20m
kube-system coredns-9cc8b85c6-d4r75 1/1 Running 1 (19m ago) 68m
kube-system csi-linode-controller-0 0/4 Init:CrashLoopBackOff 3 (3m33s ago) 21m
kube-system csi-linode-node-6bwwh 0/2 Init:CrashLoopBackOff 8 (3m3s ago) 21m
kube-system kube-proxy-hmb92 1/1 Running 0 20m
kubernetes-dashboard dashboard-metrics-scraper-8c47d4b5d-4tqdk 0/1 Pending 0 80m
kubernetes-dashboard kubernetes-dashboard-67bd8fc546-sj9bw 0/1 Pending 0 80m
kubectl describe pod csi-linode-controller-0 -n kube-system
shows
Name: csi-linode-controller-0
Namespace: kube-system
Priority: 0
Service Account: csi-controller-sa
Node: lke38402-61346-63a9eb700c21/139.162.74.50
Start Time: Mon, 26 Dec 2022 20:53:21 +0100
Labels: app=csi-linode-controller
controller-revision-hash=csi-linode-controller-6549d77457
role=csi-linode
statefulset.kubernetes.io/pod-name=csi-linode-controller-0
Annotations: cni.projectcalico.org/containerID: b74b47cf06429054d6f770679850bd1c3b925f41fcb15c728ed346ad8538e0d7
cni.projectcalico.org/podIP: 10.2.5.6/32
cni.projectcalico.org/podIPs: 10.2.5.6/32
Status: Pending
IP: 10.2.5.6
IPs:
IP: 10.2.5.6
Controlled By: StatefulSet/csi-linode-controller
Init Containers:
init:
Container ID: containerd://73c89328a61d110c36e31916db6b9ba52105cb264b1ef76046a2253ba639e3e3
Image: bitnami/kubectl:1.16.3-debian-10-r36
Image ID: docker.io/bitnami/kubectl@sha256:c4a8d9c0cd9c5f903830ea64816c83adf307ff1d775bc3e5b77f1d49d3960205
Port: <none>
Host Port: <none>
Command:
/scripts/get-linode-id.sh
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Mon, 26 Dec 2022 21:11:11 +0100
Finished: Mon, 26 Dec 2022 21:11:12 +0100
Ready: False
Restart Count: 3
Environment:
NODE_NAME: (v1:spec.nodeName)
Mounts:
/linode-info from linode-info (rw)
/scripts from get-linode-id (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-lw2kp (ro)
Containers:
csi-provisioner:
Container ID:
Image: linode/csi-provisioner:v3.0.0
Image ID:
Port: <none>
Host Port: <none>
Args:
--volume-name-prefix=pvc
--volume-name-uuid-length=16
--csi-address=$(ADDRESS)
--default-fstype=ext4
--v=2
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Environment:
ADDRESS: /var/lib/csi/sockets/pluginproxy/csi.sock
KUBERNETES_SERVICE_HOST: eec3a87f-3ea9-475a-b8f9-3b18898f131e.ap-northeast-2.linodelke.net
KUBERNETES_SERVICE_PORT: 443
Mounts:
/var/lib/csi/sockets/pluginproxy/ from socket-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-lw2kp (ro)
csi-attacher:
Container ID:
Image: linode/csi-attacher:v3.3.0
Image ID:
Port: <none>
Host Port: <none>
Args:
--v=2
--csi-address=$(ADDRESS)
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Environment:
ADDRESS: /var/lib/csi/sockets/pluginproxy/csi.sock
KUBERNETES_SERVICE_HOST: eec3a87f-3ea9-475a-b8f9-3b18898f131e.ap-northeast-2.linodelke.net
KUBERNETES_SERVICE_PORT: 443
Mounts:
/var/lib/csi/sockets/pluginproxy/ from socket-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-lw2kp (ro)
csi-resizer:
Container ID:
Image: linode/csi-resizer:v1.3.0
Image ID:
Port: <none>
Host Port: <none>
Args:
--v=2
--csi-address=$(ADDRESS)
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Environment:
ADDRESS: /var/lib/csi/sockets/pluginproxy/csi.sock
KUBERNETES_SERVICE_HOST: eec3a87f-3ea9-475a-b8f9-3b18898f131e.ap-northeast-2.linodelke.net
KUBERNETES_SERVICE_PORT: 443
Mounts:
/var/lib/csi/sockets/pluginproxy/ from socket-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-lw2kp (ro)
linode-csi-plugin:
Container ID:
Image: linode/linode-blockstorage-csi-driver:v0.5.0
Image ID:
Port: <none>
Host Port: <none>
Args:
--endpoint=$(CSI_ENDPOINT)
--token=$(LINODE_TOKEN)
--url=$(LINODE_API_URL)
--node=$(NODE_NAME)
--bs-prefix=$(LINODE_BS_PREFIX)
--v=2
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Environment:
CSI_ENDPOINT: unix:///var/lib/csi/sockets/pluginproxy/csi.sock
LINODE_API_URL: <set to the key 'apiurl' in secret 'linode'> Optional: false
LINODE_BS_PREFIX:
NODE_NAME: (v1:spec.nodeName)
LINODE_TOKEN: <set to the key 'token' in secret 'linode'> Optional: false
KUBERNETES_SERVICE_HOST: eec3a87f-3ea9-475a-b8f9-3b18898f131e.ap-northeast-2.linodelke.net
KUBERNETES_SERVICE_PORT: 443
LINODE_URL: <set to the key 'apiurl' in secret 'linode'> Optional: false
Mounts:
/linode-info from linode-info (rw)
/scripts from get-linode-id (rw)
/var/lib/csi/sockets/pluginproxy/ from socket-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-lw2kp (ro)
Conditions:
Type Status
Initialized False
Ready False
ContainersReady False
PodScheduled True
Volumes:
socket-dir:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
linode-info:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
get-linode-id:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: get-linode-id
Optional: false
kube-api-access-lw2kp:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: :NoSchedule op=Exists
:NoExecute op=Exists
CriticalAddonsOnly op=Exists
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 22m default-scheduler Successfully assigned kube-system/csi-linode-controller-0 to lke38402-61346-63a9eb700c21
Normal Pulled 22m (x3 over 22m) kubelet Container image "bitnami/kubectl:1.16.3-debian-10-r36" already present on machine
Normal Created 22m (x3 over 22m) kubelet Created container init
Normal Started 22m (x3 over 22m) kubelet Started container init
Warning BackOff 22m (x3 over 22m) kubelet Back-off restarting failed container
Normal SandboxChanged 20m (x2 over 21m) kubelet Pod sandbox changed, it will be killed and re-created.
Normal Pulled 19m (x4 over 20m) kubelet Container image "bitnami/kubectl:1.16.3-debian-10-r36" already present on machine
Normal Created 19m (x4 over 20m) kubelet Created container init
Normal Started 19m (x4 over 20m) kubelet Started container init
Warning BackOff 55s (x93 over 20m) kubelet Back-off restarting failed container
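The describe output only shows that the init container running /scripts/get-linode-id.sh exits with code 1; its logs should say why. For reference, something like the following (container name taken from the output above) would pull the current and previous attempts:
kubectl logs -n kube-system csi-linode-controller-0 -c init
kubectl logs -n kube-system csi-linode-controller-0 -c init --previous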
kubectl describe pod csi-linode-node-6bwwh -n kube-system
shows
Name: csi-linode-node-6bwwh
Namespace: kube-system
Priority: 2000001000
Priority Class Name: system-node-critical
Service Account: csi-node-sa
Node: lke38402-61346-63a9eb700c21/139.162.74.50
Start Time: Mon, 26 Dec 2022 20:55:09 +0100
Labels: app=csi-linode-node
controller-revision-hash=857f78fd89
pod-template-generation=3
role=csi-linode
Annotations: <none>
Status: Pending
IP: 139.162.74.50
IPs:
IP: 139.162.74.50
Controlled By: DaemonSet/csi-linode-node
Init Containers:
init:
Container ID: containerd://4c8a21699fb22702d03484aa5065c33f342a602d39bc91fb2492f067c0f7b3fe
Image: bitnami/kubectl:1.16.3-debian-10-r36
Image ID: docker.io/bitnami/kubectl@sha256:c4a8d9c0cd9c5f903830ea64816c83adf307ff1d775bc3e5b77f1d49d3960205
Port: <none>
Host Port: <none>
Command:
/scripts/get-linode-id.sh
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Mon, 26 Dec 2022 21:16:53 +0100
Finished: Mon, 26 Dec 2022 21:16:53 +0100
Ready: False
Restart Count: 9
Environment:
NODE_NAME: (v1:spec.nodeName)
Mounts:
/linode-info from linode-info (rw)
/scripts from get-linode-id (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-xx7bq (ro)
Containers:
csi-node-driver-registrar:
Container ID:
Image: linode/csi-node-driver-registrar:v1.3.0
Image ID:
Port: <none>
Host Port: <none>
Args:
--v=2
--csi-address=$(ADDRESS)
--kubelet-registration-path=$(DRIVER_REG_SOCK_PATH)
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Environment:
ADDRESS: /csi/csi.sock
DRIVER_REG_SOCK_PATH: /var/lib/kubelet/plugins/linodebs.csi.linode.com/csi.sock
KUBE_NODE_NAME: (v1:spec.nodeName)
KUBERNETES_SERVICE_HOST: eec3a87f-3ea9-475a-b8f9-3b18898f131e.ap-northeast-2.linodelke.net
KUBERNETES_SERVICE_PORT: 443
Mounts:
/csi from plugin-dir (rw)
/registration from registration-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-xx7bq (ro)
csi-linode-plugin:
Container ID:
Image: linode/linode-blockstorage-csi-driver:v0.5.0
Image ID:
Port: <none>
Host Port: <none>
Args:
--endpoint=$(CSI_ENDPOINT)
--token=$(LINODE_TOKEN)
--url=$(LINODE_API_URL)
--node=$(NODE_NAME)
--v=2
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Environment:
CSI_ENDPOINT: unix:///csi/csi.sock
LINODE_API_URL: <set to the key 'apiurl' in secret 'linode'> Optional: false
NODE_NAME: (v1:spec.nodeName)
KUBERNETES_SERVICE_HOST: eec3a87f-3ea9-475a-b8f9-3b18898f131e.ap-northeast-2.linodelke.net
KUBERNETES_SERVICE_PORT: 443
LINODE_URL: <set to the key 'apiurl' in secret 'linode'> Optional: false
LINODE_TOKEN: <set to the key 'token' in secret 'linode'> Optional: false
Mounts:
/csi from plugin-dir (rw)
/dev from device-dir (rw)
/linode-info from linode-info (rw)
/scripts from get-linode-id (rw)
/var/lib/kubelet from pods-mount-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-xx7bq (ro)
Conditions:
Type Status
Initialized False
Ready False
ContainersReady False
PodScheduled True
Volumes:
linode-info:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
get-linode-id:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: get-linode-id
Optional: false
registration-dir:
Type: HostPath (bare host directory volume)
Path: /var/lib/kubelet/plugins_registry/
HostPathType: DirectoryOrCreate
kubelet-dir:
Type: HostPath (bare host directory volume)
Path: /var/lib/kubelet
HostPathType: Directory
plugin-dir:
Type: HostPath (bare host directory volume)
Path: /var/lib/kubelet/plugins/linodebs.csi.linode.com
HostPathType: DirectoryOrCreate
pods-mount-dir:
Type: HostPath (bare host directory volume)
Path: /var/lib/kubelet
HostPathType: Directory
device-dir:
Type: HostPath (bare host directory volume)
Path: /dev
HostPathType:
udev-rules-etc:
Type: HostPath (bare host directory volume)
Path: /etc/udev
HostPathType: Directory
udev-rules-lib:
Type: HostPath (bare host directory volume)
Path: /lib/udev
HostPathType: Directory
udev-socket:
Type: HostPath (bare host directory volume)
Path: /run/udev
HostPathType: Directory
sys:
Type: HostPath (bare host directory volume)
Path: /sys
HostPathType: Directory
kube-api-access-xx7bq:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: :NoSchedule op=Exists
:NoExecute op=Exists
CriticalAddonsOnly op=Exists
node.kubernetes.io/disk-pressure:NoSchedule op=Exists
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/network-unavailable:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists
node.kubernetes.io/pid-pressure:NoSchedule op=Exists
node.kubernetes.io/unreachable:NoExecute op=Exists
node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 25m default-scheduler Successfully assigned kube-system/csi-linode-node-6bwwh to lke38402-61346-63a9eb700c21
Normal Pulled 21m (x5 over 23m) kubelet Container image "bitnami/kubectl:1.16.3-debian-10-r36" already present on machine
Normal Created 21m (x5 over 23m) kubelet Created container init
Normal Started 21m (x5 over 23m) kubelet Started container init
Warning BackOff 3m42s (x92 over 23m) kubelet Back-off restarting failed container
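Since the init script is mounted from the get-linode-id ConfigMap and the CSI containers read the token and apiurl keys from the linode secret (both names are visible in the describe output above), it may also be worth confirming those objects survived the upgrade, for example:
kubectl -n kube-system get configmap get-linode-id -o yaml
kubectl -n kube-system get secret linode
The init image is bitnami/kubectl:1.16.3, which is far behind the v1.24 API server, so I am not sure whether that version skew is related.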
I have tried recycling all nodes; it did not help.
I have tried scaling the cluster up; it did not help.
I have tried scaling down to 1 node; it did not help.
I have tried resetting the kubeconfig; it did not help.
I have spun up a new cluster, and after I deleted one failed Calico pod, that new cluster is working fine. (It is not related to this cluster.)
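For reference, the equivalent on this cluster would presumably be deleting the stuck kube-system pods so their controllers recreate them (pod names taken from the listing above):
kubectl -n kube-system delete pod calico-kube-controllers-9f8c48f46-ldckh
kubectl -n kube-system delete pod csi-linode-controller-0 csi-linode-node-6bwwh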
I assume this has something to do with the history of the old cluster. I was originally running k8s v1.21, then upgraded to v1.22, and later to v1.23 (I had issues there as well: PVCs in a StatefulSet were not mapped correctly).
Is there anything I can do to save the old cluster, or is the recommended approach to create a new cluster for each k8s version?
2 Replies
From support I have received a request to run kubectl -n kube-system get pods -owide:
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
calico-kube-controllers-9f8c48f46-bl8jt 1/1 Running 0 16h 10.2.5.4 lke38402-61346-63a9eb700c21 <none> <none>
calico-kube-controllers-9f8c48f46-ldckh 0/1 Error 0 17h 10.2.5.3 lke38402-61346-63a9eb700c21 <none> <none>
calico-node-df568 1/1 Running 0 16h 139.162.74.50 lke38402-61346-63a9eb700c21 <none> <none>
coredns-9cc8b85c6-6fth5 1/1 Running 0 16h 10.2.5.3 lke38402-61346-63a9eb700c21 <none> <none>
coredns-9cc8b85c6-d4r75 1/1 Running 1 (16h ago) 17h 10.2.5.5 lke38402-61346-63a9eb700c21 <none> <none>
csi-linode-controller-0 0/4 Init:CrashLoopBackOff 194 (18s ago) 16h 10.2.5.6 lke38402-61346-63a9eb700c21 <none> <none>
csi-linode-node-6bwwh 0/2 Init:CrashLoopBackOff 198 (4m51s ago) 16h 139.162.74.50 lke38402-61346-63a9eb700c21 <none> <none>
kube-proxy-hmb92 1/1 Running 0 16h 139.162.74.50 lke38402-61346-63a9eb700c21 <none> <none>
From my point of view the results are the same as the kubectl get pods -A output above.
Calico and the CSI pods are in a failed state, and the node still has the uninitialized taint.
Note that I have tried scaling up to 3 nodes from my original 2. Right now I am down to one node, which in my opinion should have no effect on Calico or the CSI driver.
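In case it is useful, node status and recent kube-system events can be watched with plain kubectl while waiting for the taint to clear (nothing cluster-specific here):
kubectl get nodes -o wide --watch
kubectl -n kube-system get events --sort-by=.lastTimestamp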