k8s volume reattachment takes 7 minutes after node disconnect!
Hi,
I have a test k8s cluster running on Linode and it works just fine. I have been very happy with it so far, but I just discovered one problem today.
There were 2 nodes in the cluster, and I scaled it down to one node. There were 3 volumes attached to the cluster via PVCs, and it seems all of them were attached to the node that was shut down.
I started receiving error messages that the PVs could not be reattached. Everything only started working again after 7 minutes (I did not change anything in the configs).
Here are the messages from the notification e-mails, with timestamps:
lke770-947-5e0609564edb - (184299783) System Shutdown - Completed Mon, 30 Dec 2019 15:39:52 GMT
lke770-947-5e0609564edb - (184299784) Inactivate Linode - Completed Mon, 30 Dec 2019 15:40:01 GMT
lke770-947-5e06098e443d - (184300280) Attach Volume - pvcc1390606cfac4605 - Completed Mon, 30 Dec 2019 15:47:26 GMT
lke770-947-5e06098e443d - (184300281) Attach Volume - pvc8b986173ea434649 - Completed Mon, 30 Dec 2019 15:47:26 GMT
lke770-947-5e06098e443d - (184300282) Attach Volume - pvccb7ce4c5453345b9 - Completed Mon, 30 Dec 2019 15:47:26 GMT
So in short, if something happens to a node, it will take up to SEVEN MINUTES for the volumes to be reattached and the workloads to come back, and there is no way to speed this up. Do I understand that correctly?
Thanks in advance, Alex
1 Reply
There might be a conflict with Storage Object in Use Protection causing the delay. With this feature enabled, Persistent Volume Claims (PVCs) that are actively in use by a pod, and their corresponding Persistent Volumes (PVs), are not removed from the system. This is meant to protect against data loss, but it can cause the kind of delay you're experiencing. There are a few ways to tweak your configuration to better fit your use case; these can be found in the Kubernetes documentation.
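If you want to check whether that protection is involved here, you can inspect one of the affected claims directly (replace <your-pvc-name> with one of the names shown by the first command; it's just a placeholder):

kubectl get pvc
kubectl get pvc <your-pvc-name> -o jsonpath='{.metadata.finalizers}'

When the StorageObjectInUseProtection admission plugin is enabled, the claim carries the kubernetes.io/pvc-protection finalizer, and a delete request will sit in a Terminating state for as long as a pod is still using that claim.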
A more general way to look at this is that the lifetime of a PVC is separate from the lifetime of the pods that use it. A good starting point for troubleshooting is to examine your PVC configurations to determine how they currently handle the deletion of a node in your cluster.
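As a quick way to see that separation in your own cluster, you can look at the Persistent Volumes behind your claims and check each one's reclaim policy (<pv-name> below is a placeholder for one of the volumes listed by the first command):

kubectl get pv
kubectl describe pv <pv-name>

A reclaim policy of Retain keeps the underlying Linode Volume and its data around regardless of what happens to pods or nodes, while Delete removes it once the claim itself is deleted; in neither case is the volume's lifetime tied to the pod that mounts it.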