StatefulSet VolumeClaimTemplates stopped working recently
I have a StatefulSet that has been running on Linode for a few months, and after a recent cluster upgrade, the pod is no longer able to mount the volume.
To illustrate the problem, I created the simplest possible StatefulSet (straight out of the Kubernetes documentation, with some Linode-specific changes to the spec) and tried to deploy it to my cluster:
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: nginx
  serviceName: "nginx"
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: nginx
        image: k8s.gcr.io/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "linode-block-storage-retain"
      resources:
        requests:
          storage: 10Gi
This fails with the following error:
$ kubectl describe pod web-0
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedAttachVolume 0s (x3 over 2s) attachdetach-controller AttachVolume.Attach failed for volume "pvc-6675d1525e224fd5" : csinode.storage.k8s.io "lke14531-17810-5fca422f0e88" not found
$ kubectl get pvc -o wide
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE VOLUMEMODE
www-web-0 Bound pvc-6675d1525e224fd5 10Gi RWO linode-block-storage-retain 6m51s Filesystem
As you can see from the above, the volume fails to attach with an error that the CSINode object for the node does not exist -- despite the fact that the PVC clearly does exist and is bound to a PV.
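For reference, one way to confirm this symptom (a sketch; the node name is the one from the error above) is to compare the cluster's Node objects with their CSINode counterparts -- in a healthy cluster every node should have a matching CSINode entry:

# Every node listed here should also appear under csinodes
kubectl get nodes
kubectl get csinodes

# Inspect the CSINode entry the attach-detach controller says is missing
kubectl describe csinode lke14531-17810-5fca422f0e88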
Has anyone else had this problem?
If I could just get one complete simple example of a StatefulSet with volumeClaimTemplates that works on Linode, I would be satisfied.
Here's the version info:
$ kubectl version
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.11", GitCommit:"27522a29febbcc4badac257763044d0d90c11abd", GitTreeState:"clean", BuildDate:"2021-09-15T19:16:25Z", GoVersion:"go1.15.15", Compiler:"gc", Platform:"linux/amd64"}
Thanks in advance for any help or ideas!
2 Replies
I just deployed a new LKE cluster running Kubernetes v1.20 to test and was able to successfully deploy a StatefulSet using the manifest you provided.
$ k get pods
NAME READY STATUS RESTARTS AGE
web-0 1/1 Running 0 9m33s
Taking a look at the error you're getting, it looks like the issue is that, for whatever reason, the csi-linode-controller is unable to find the CSINode object for that node in your cluster. I'd recommend rebooting or recycling the nodes in your cluster to see if that helps. You can recycle the nodes in your node pool by clicking the Recycle Node Pool button on your Cluster Summary page.
If you're still having trouble, it might be helpful to get some logs from the csi-attacher container in the csi-linode-controller-0 pod. You can find this information by running kubectl logs -n kube-system csi-linode-controller-0 csi-attacher.
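As a quick sketch of that check (assuming the default LKE pod names shown above), it's also worth confirming the CSI controller pod is actually running before pulling its logs:

# Confirm the Linode CSI pods are up in kube-system
kubectl get pods -n kube-system | grep csi-linode

# Then pull the attacher container's logs
kubectl logs -n kube-system csi-linode-controller-0 csi-attacher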
Hi rl0nergan,
Thanks for investigating this. I upgraded my cluster to 1.21 and the problem vanished. My apologies for bothering support before first rebooting and making sure I was on the latest version.