Linode Block Storage CSI Driver
Intro
Block Storage and Container Orchestration had a big year at Linode.
In February 2018, Linode announced Block Storage Volumes. In the 10 months since its introduction, thousands of users have provisioned petabytes of storage through the service. What started in only one region has become available in seven regions with more on the way. This has provided fast access that requires the least user configuration and minimal system overhead.
Enter the Linode Block Storage Container Storage Interface (CSI) driver. With block storage service available in nearly all of Linode's regions, the service is well suited for use by persistent storage claims in container orchestrators like Kubernetes and Mesos. In November 2018, we built on the work of external collaborators (1) to promote the Linode Block Storage CSI driver to Linode's GitHub organization.
Edit March 14, 2019: us-southeast (Atlanta) does not have Block Storage support. The CSI driver will fail in this environment with events that contain the error message returned from the Linode API.
What is a CSI?
The Container Storage Interface specification provides an abstract interface that enables any container orchestrator to make persistent storage claims against a multitude of storage backends, including a user's chosen cloud provider. The CSI specification is being adopted broadly, not just at Linode.
The Linode Block Storage CSI driver adheres to the CSI 1.0.0 specification, which was released on November 15, 2018. With the introduction of the Linode CSI, Linode users can take advantage of the dynamic storage provisioning offered by container orchestrators like Kubernetes, backed by Block Storage Volumes attached to their Linodes.
Persistent Storage use in Kubernetes on Linode
When deploying stateful applications in Kubernetes, such as databases (MySQL, PostgreSQL), data stores (Redis, MongoDB), or file stores (NFS), it's desirable to back those services with persistent storage. Pods providing these services can access and share volumes that are externally and dynamically managed by persistent storage interfaces (2).
Prior to the Linode CSI, Linode users deploying Kubernetes clusters traditionally had to make use of fast local SSD storage through tools like Rook with FlexVolume support. Local storage, even when distributed throughout a cluster, cannot be relied on in every case. Storage that can outlive a pod, node, or cluster is a requirement for many environments. With the addition of Linode Block Storage, a second tier of storage could be added to clusters (3). The CSI driver offered a clear path forward for persistent storage claims on Linode Block Storage.
Usage
Recommended Installers
To get a Linode-integrated Kubernetes experience, we recommend using the linode-cli or the terraform-linode-k8s module directly.
These installers bundle Linode-aware addons, like the Cloud Controller Manager (CCM) (4), the Container Storage Interface (CSI) driver, and External-DNS. Linode's recommended installers provision clusters that make use of Linode's regional private IP network to avoid high transfer usage. The K8s masters and nodes will be created with Kubernetes node names that match the Linode labels and Linux hostnames.
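If you have already provisioned a cluster with one of these installers, one way to confirm the Linode addons are present is to look for their pods in the kube-system namespace. This is only a sketch; the exact pod name patterns (csi-linode, ccm, external-dns) are assumptions and may differ between installer versions.
$ kubectl get pods -n kube-system | grep -iE 'csi-linode|ccm|external-dns'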
Installing the CSI
Install the manifest which creates the roles, bindings, sidecars, and the CSI driver necessary to use Linode Block Storage in a Kubernetes cluster.
This step is not necessary if the cluster was created with one of the Linode recommended installers. This will only work with Kubernetes 1.13+.
kubectl apply -f https://raw.githubusercontent.com/linode/linode-blockstorage-csi-driver/master/pkg/linode-bs/deploy/releases/linode-blockstorage-csi-driver-v0.0.3.yaml
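After applying the manifest, a quick way to confirm the pieces are in place is to check for the StorageClass and the driver pods. The StorageClass name comes from the manifest output below; pod namespaces may vary between releases, so the second command searches all namespaces.
$ kubectl get storageclass linode-block-storage
$ kubectl get pods --all-namespaces | grep csi-linode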
Consuming Persistent Volumes
Using the Linode CSI is very simple. The region of a dynamically created volume will automatically match the region of the Kubernetes node. There is currently a minimum size requirement of 10GiB and a maximum volume count per node of eight (which includes the local disk(s), e.g. sda, sdb).
Create a manifest named csi-example.yaml, and add it to the cluster using kubectl apply -f csi-example.yaml.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: csi-example-pvc
spec:
  accessModes:
    - ReadWriteOnce # Currently, the only supported accessMode
  resources:
    requests:
      storage: 10Gi # Linode Block Storage has a minimum of 10GiB
  storageClassName: linode-block-storage
---
apiVersion: v1
kind: Pod
metadata:
  name: csi-example-pod
spec:
  containers:
    - name: csi-example-container
      image: busybox
      volumeMounts:
        - mountPath: "/data"
          name: csi-example-volume
      command: [ "sleep", "1000000" ]
  volumes:
    - name: csi-example-volume
      persistentVolumeClaim:
        claimName: csi-example-pvc # This must match the metadata name of the desired PVC
Within a minute, the pod should be running. Interact with the storage through the pod:
$ kubectl exec -it csi-example-pod -- /bin/sh -c "echo persistence > /data/example.txt; ls -l /data"
total 20
-rw-r--r-- 1 root root 12 Dec 5 13:06 example.txt
drwx------ 2 root root 16384 Dec 5 06:03 lost+found
Delete the pod and recreate it from the manifest, and the same volume will be attached again with its data intact.
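A minimal way to demonstrate this with the manifest above (the expected output assumes the echo command from the previous step was run):
$ kubectl delete pod csi-example-pod
$ kubectl apply -f csi-example.yaml
$ kubectl exec -it csi-example-pod -- cat /data/example.txt   # should print "persistence"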
Additional Resources
Additional details are available in the linode-blockstorage-csi-driver project's README.md.
Features and Limitations
Features
- Creates Linode Block Storage Volumes on demand
- Single Node Writer / Read Write Once types
- Volumes include topology.linode.com/region annotations
- Prefix Volume Labels
Design Choices
- Defines the Linode CSI as the default StorageClass
- Default ReclaimPolicy deletes unused volumes (an override is sketched after this list)
- Waits up to 300s for volume creation
- Formats volumes as ext4
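If the delete-on-release default doesn't suit your environment, an additional StorageClass can be defined next to the bundled one. This sketch assumes the driver's provisioner name is linodebs.csi.linode.com; confirm it with kubectl get storageclass linode-block-storage -o yaml before relying on it, and note that the StorageClass name here is hypothetical.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linode-block-storage-retain   # hypothetical name for this example
provisioner: linodebs.csi.linode.com  # confirm against the bundled StorageClass
reclaimPolicy: Retain                 # keep the Block Storage Volume when the PVC is deleted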
Limitations
- Currently requires a minimum volume size of 10GB ($1/mo as of Dec 2018)
- Currently supports Kubernetes 1.13+
- Currently supports 7 Block Storage device attachments (see below)
- Linode Label must match the Kubernetes Node name (a quick check is sketched after this list)
- Snapshots are not supported
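To spot a label/node-name mismatch ahead of time, the two listings can be compared side by side. This is only a sketch; the linode-cli output flags shown here may differ between CLI versions.
$ kubectl get nodes -o name
$ linode-cli linodes list --text --format label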
Todo
- Support Volume Tags
- Support Resize - https://github.com/container-storage-interface/spec/pull/334/files
- Support Raw BlockStorage
- Support Partitioning
- Support attaching existing volumes
- Support Per-Directory use on formatted volumes
- Support Read-only mounts (single-reader)
Summary
The CSI enables us to support persistent storage claims for container orchestrators like Kubernetes, and those claims enable us to support users' stateful applications. We look forward to 2019, when we plan to provide additional support for container orchestrators.
Footnotes
The development team benefited from the work of AppsCode, Ciaran Liedeman, Ricardo Ramirez (Linode Docker Volume driver) and the Kubernetes Storage SIG contributors in producing early proof-of-concept drivers.
Kubernetes volume plugins began life in-tree, with support for only a few specific cloud providers. Within the Kubernetes community, the Storage and Cloud-Provider Special Interest Groups (SIGs) have been working to remove volume plugins and other cloud-specific code from the core Kubernetes codebase. This resulted in out-of-tree FlexVolume drivers and, most recently, Container Storage Interface (CSI) drivers.
These could even be added through Rook, although there are complexities to this configuration, and provisioning the attached volumes would be a manual effort. To reduce complexity and simplify Kubernetes access to Linode's Block Storage, AppsCode's Pharmer team created a FlexVolume plugin for Linode. There were limitations with FlexVolume, such as root access requirements for installation and provisioning.
The CCM informs the Kubernetes API of Node status and removals based on the Linode API. If a Linode is deleted through the manager, the pods from that node will be rescheduled to other nodes and the node will be removed from the cluster. When a service uses type: LoadBalancer, a Linode NodeBalancer will be automatically provisioned and configured to route traffic from the public IP address of the NodeBalancer to the private IP address of any of the nodes capable of providing the service (all nodes, thanks to kube-proxy).
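For illustration, a minimal Service of this type might look like the sketch below; the name, selector, and ports are hypothetical and need to match your own workload.
apiVersion: v1
kind: Service
metadata:
  name: example-lb             # hypothetical name
spec:
  type: LoadBalancer           # the CCM provisions a Linode NodeBalancer for this Service
  selector:
    app: example               # must match the labels on the backing pods
  ports:
    - port: 80                 # NodeBalancer port
      targetPort: 8080         # container port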
8 Replies
Getting this when trying to install on Kubernetes 1.13:
$ kubectl apply -f https://raw.githubusercontent.com/linode/linode-blockstorage-csi-driver/master/pkg/linode-bs/deploy/releases/linode-blockstorage-csi-driver-v0.0.3.yaml
customresourcedefinition.apiextensions.k8s.io/csinodeinfos.csi.storage.k8s.io created
customresourcedefinition.apiextensions.k8s.io/csidrivers.csi.storage.k8s.io created
serviceaccount/csi-node-sa created
clusterrole.rbac.authorization.k8s.io/driver-registrar-role created
clusterrolebinding.rbac.authorization.k8s.io/driver-registrar-binding created
serviceaccount/csi-controller-sa created
clusterrole.rbac.authorization.k8s.io/external-provisioner-role created
clusterrolebinding.rbac.authorization.k8s.io/csi-controller-provisioner-binding created
clusterrole.rbac.authorization.k8s.io/external-attacher-role created
clusterrolebinding.rbac.authorization.k8s.io/csi-controller-attacher-binding created
clusterrole.rbac.authorization.k8s.io/external-snapshotter-role created
clusterrolebinding.rbac.authorization.k8s.io/csi-controller-snapshotter-binding created
storageclass.storage.k8s.io/linode-block-storage created
statefulset.apps/csi-linode-controller created
daemonset.extensions/csi-linode-node created
error: unable to recognize "https://raw.githubusercontent.com/linode/linode-blockstorage-csi-driver/master/pkg/linode-bs/deploy/releases/linode-blockstorage-csi-driver-v0.0.3.yaml": no matches for kind "CSIDriver" in version "csi.storage.k8s.io/v1alpha1"
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.1", GitCommit:"eec55b9ba98609a46fee712359c7b5b365bdd920", GitTreeState:"clean", BuildDate:"2018-12-13T10:39:04Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.1", GitCommit:"eec55b9ba98609a46fee712359c7b5b365bdd920", GitTreeState:"clean", BuildDate:"2018-12-13T10:31:33Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"linux/amd64"}
Thanks, @recipedude. The piece of the yaml that is not being accepted follows (the comment block hints at the problem):
---
# pkg/linode-bs/deploy/kubernetes/02-csi-driver.yaml
# Requires CSIDriverRegistry feature gate (alpha in 1.12)
# xref: https://raw.githubusercontent.com/kubernetes/csi-api/master/pkg/crd/manifests/csinodeinfo.yaml
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: csidrivers.csi.storage.k8s.io
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  version: v1alpha1
  group: csi.storage.k8s.io
  names:
    kind: CSIDriver
    plural: csidrivers
  scope: Cluster
  validation:
    openAPIV3Schema:
      properties:
        spec:
          description: Specification of the CSI Driver.
          properties:
            attachRequired:
              description: Indicates this CSI volume driver requires an attach operation,
                and that Kubernetes should call attach and wait for any attach operation
                to complete before proceeding to mount.
              type: boolean
            podInfoOnMountVersion:
              description: Indicates this CSI volume driver requires additional pod
                information (like podName, podUID, etc.) during mount operations.
              type: string
---
According to the list of feature gates, CSIDriverRegistry is not enabled by default in K8s 1.13.1.
Try applying the yaml again after ensuring the required feature gates are enabled with --feature-gates=CSIDriverRegistry=true,CSINodeInfo=true added to your Kubelet and API Server. (These feature flags are enabled in the terraform-linode-k8s module for the API Server and the Kubelet.)
I'll have to update the listed requirements and the versions they were introduced in.
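For reference, on a kubeadm-based 1.13 cluster one way to enable these gates is through the kubeadm and kubelet configuration files. This is a sketch; adapt it to your own provisioning tooling.
# kubeadm ClusterConfiguration excerpt (applies to the API Server)
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
apiServer:
  extraArgs:
    feature-gates: "CSIDriverRegistry=true,CSINodeInfo=true"
---
# KubeletConfiguration excerpt (applies to the Kubelet)
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  CSIDriverRegistry: true
  CSINodeInfo: true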
This seemed like a fabulous feature and was working great during initial testing. However, we just ran into the volume attach limit quite quickly when scaling up our test environment. With effectively only 6 PVs per node (8 volume limit minus local SSDs) this seems to be of very limited use in real production situations.
Are there plans to relax this 8 volume constraint?
@inviscid Plans to increase this limit have already been enacted, but the CSI driver hasn't caught up, yet.
Volumes previously had to be bound to one of the Linode Boot Config sda-sdh device slots. The API has already removed that limitation by permitting a Volume to be attached without a ConfigID reference (persist_across_boots).
https://developers.linode.com/api/docs/v4#operation/attachVolume
These volumes will not be reattached by Linode on reboot, which is not a problem in Kubernetes since the CSI will either reattach the volume on reboot, or the volume (and related pods) will have already been rescheduled to another node.
In LinodeGo, persist_across_boots should be added: https://github.com/linode/linodego/pull/81
The offending lines in the CSI driver are:
Since the persist_across_boots value defaults to true, for backward compatibility this LinodeGo field will need to be a reference to a boolean, and it will need to be explicitly set to false here:
Excellent! Thanks for the quick response and fix.
We will make use of it as soon as it is available.
We were looking at the documentation and noticed a constraint that we would like to understand a little better.
https://developers.linode.com/api/docs/v4#operation/attachVolume
In there, it states that the limit of block device volumes that can be attached to a node is equal to the node's RAM in GB. Can you help us understand why this constraint is in place?
We certainly understand there needs to be some reasonableness check so someone doesn't attempt attaching 200 volumes to a Nanode, but some use cases we have require user-specific volumes (JupyterHub), which can add up quickly.
Thanks…
@inviscid Your reasoning is accurate. There is an appreciable amount of RAM and CPU overhead involved with attaching and sustaining block storage connections to virtual machines. GCP has similar restrictions in place.
The #linode channel on the Kubernetes Slack is a great place to give this feedback to the developers (and get their responses).
This limit has now been increased based on the size of the Linode. This is supported in linode-blockstorage-csi-driver v0.7.0 and later. Please check the release notes for a detailed explanation: https://github.com/linode/linode-blockstorage-csi-driver/releases/tag/v0.7.0
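On clusters and driver versions recent enough to report attach limits, the per-node value should be visible on the CSINode object. This is only a sketch and assumes the driver publishes an allocatable count under its driver entry.
$ kubectl get csinode <node-name> -o yaml   # look for spec.drivers[].allocatable.count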