Table of contents
- An introduction to VMware vSphere cloud native storage
- Kubernetes PVs, PVCs and Pods
- The vSphere Container Storage Interface driver
- vSphere cloud native storage features and benefits
- The transformation from VCP (vSphere Cloud Provider) to CSI (Container Storage Interface)
- vSAN Cloud Native Storage integration
- Wrap up
In this article, we will have a look at how to provision storage for Kubernetes workloads directly on vSphere cloud native storage without resorting to an extra software layer in between.
In the infrastructure world, storage is the building block of data persistence and comes in many different shapes and forms, including vSphere cloud native storage. Shared storage lets you leverage clustering services and enables a number of data protection scenarios that are vital to most IT environments. A solid storage backend gives IT departments much-appreciated peace of mind, as storage outages are among the most dreaded failure scenarios in any IT professional’s mind. Despite the growing number of storage solutions available on the market, provisioning shared storage in a vSphere environment is now considered mainstream, as it is a tried and tested process. VMware vSphere environments offer several storage options such as VMFS, vSAN, vVols and NFS to store your virtual machines and other resources consumed by the hypervisors.
In recent years, VMware has extended the reach of vSphere storage backends and the capabilities of the vSphere suite to integrate more closely with modern applications, in other words, container workloads and microservices that leverage vSphere cloud native storage. This is an area VMware has invested in heavily since acquiring several cloud native companies, such as Pivotal, to build its VMware Tanzu portfolio.
While the flexibility and customization potential of Kubernetes is unbeatable, its complexity means that the learning curve is fairly steep compared to other infrastructure solutions. Let’s see how vSphere Cloud Native Storage deals with that.
An introduction to VMware vSphere cloud native storage
First of all, what is Cloud Native? The term has been somewhat of a buzzword these last few years and it keeps appearing in more and more places. Cloud Native mostly refers to infrastructure-agnostic container workloads that are built to run in the cloud. That means moving away from monolithic software architectures towards a separation of duties: microservices are service-specific workloads interacting with each other in a streamlined fashion. Kubernetes is the container orchestration platform that enabled this revolution and has become the de-facto industry standard for running containers in enterprise settings.
Having said that, not all workloads running on Kubernetes can be stateless and ephemeral. Stateful applications still need to store data, configuration and other resources permanently on backends such as vSphere cloud native storage, so that the data remains even after ruthlessly killing a bunch of pods. This is where persistent volumes (PVs) come in: Kubernetes resources that let you provision storage on a specific backend, like vSphere cloud native storage, to store data persistently.
“VMware CNS supports most types of vSphere storage”
Kubernetes PVs, PVCs and Pods
VMware Tanzu is an awesome product; however, it is easy for a vSphere admin to jump headfirst into it with no prior knowledge of Kubernetes just because it has the “VMware” label on it. This makes the learning process incredibly confusing and is not a great way to start on this journey. So, before we dig in, I’d like to cover a few Kubernetes terms for those that aren’t too familiar with them. More will follow in the next chapter.
- Pod: A pod is the smallest schedulable unit for workloads; you manage pods, not containers. A pod can contain one or more containers, but a container only ever belongs to one pod. The pod definition contains information about volumes, networking and how to run the containers.
- Persistent Volume (PV): A PV is an object that defines storage which can be connected to pods. It can be backed by various sources such as temporary local storage, a local folder, NFS, or an external storage provider through a CSI driver.
- Persistent Volume Claim (PVC): PVCs are like storage requests that let you assign specific persistent volumes to pods.
- Storage Class (SC): Storage classes let you define different tiers of storage or infrastructure-specific parameters so that PVs can be provisioned on a certain type of storage without having to be too specific, much like storage policies in the vSphere world (a quick kubectl example follows this list).
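If you already have access to a cluster, a quick way to get a feel for these objects is to list them with kubectl. This is only a minimal sketch; the output obviously depends on what is running in your cluster.

# List pods in the current namespace
kubectl get pods

# List persistent volume claims and the persistent volumes they are bound to
kubectl get pvc
kubectl get pv

# List the storage classes available in the cluster
kubectl get storageclass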
The vSphere Container Storage Interface driver
The terms described in the previous chapter are the building blocks of provisioning vSphere Cloud Native Storage. Now let’s quickly touch on what a Container Storage Interface (CSI) driver is. As mentioned earlier, persistent volumes are storage resources that let a pod store data on a specific storage type. There are a number of built-in storage types to work with, but the strength of Kubernetes is its extensibility. Much like you can add third-party plugins to vCenter or array-specific Path Selection Policies to vSphere, you can interact with third-party storage devices in Kubernetes by using drivers distributed by the vendor, which plug into the Container Storage Interface. Most storage vendors now offer CSI drivers and VMware is obviously one of them with the vSphere Container Storage Interface driver, or vSphere CSI, which enables vSphere cloud native storage.
When a PVC requests a persistent volume on vSphere, the vSphere CSI driver translates the request into instructions vCenter understands. vCenter then creates the vSphere cloud native storage volume, attaches it to the VM running the Kubernetes node, and from there it is attached to the pod itself. The added benefit is that vCenter reports information about the container volumes in the vSphere client, with more or less detail depending on the version you are running. This is what is called vSphere Cloud Native Storage.
“vSphere cloud native storage lets you provision persistent volumes on vSphere storage”
Now, in order to leverage vSphere cloud native storage, the CSI provider must be installed in the cluster. If you aren’t sure or you are just getting started, you can use CAPV or Tanzu Community Edition to fast-track this step. Regardless, the configuration that tells the CSI driver how to communicate with vCenter is contained in a Kubernetes secret (named csi-vsphere-config by default) that is mapped as a volume on the vSphere CSI controller. You can display the CSI driver configuration by decoding the secret:
kubectl get secrets csi-vsphere-config -n kube-system -o jsonpath='{.data.csi-vsphere\.conf}' | base64 -d
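The decoded output is an INI-style configuration file. As a rough illustration only (every value below is a placeholder and the exact fields vary with the driver version), it looks something like this:

[Global]
cluster-id = "my-k8s-cluster"

[VirtualCenter "vcenter.example.com"]
user = "administrator@vsphere.local"
password = "********"
port = "443"
datacenters = "my-datacenter"
insecure-flag = "true"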
“The vSphere CSI driver communicates with vCenter to provision vSphere cloud native storage”
vSphere cloud native storage features and benefits
Part of the job of an SRE (Site Reliability Engineer), or whatever title you give to the IT professional managing Kubernetes environments, is to work with storage provisioning. We are not talking about presenting iSCSI LUNs or FC zoning here; we are working a level higher in the stack. The physical shared storage is already provisioned and we need a way to provide a backend for Kubernetes persistent volumes. vSphere cloud native storage greatly simplifies this process with the ability to match vSphere storage policies with Kubernetes storage classes. That way, when you request a PV in Kubernetes, you get a virtual disk created directly on the datastore.
Note that these disks are not of the same type as the traditional virtual disks created with virtual machines. This could be a blog post of its own but, in a nutshell, they are called Improved Virtual Disks (IVD), First Class Disks (FCD) or managed virtual disks. This type is needed because it is a named virtual disk that is not associated with a VM, as opposed to traditional disks, which can only be provisioned by being attached to a VM.
The other benefit of using vSphere cloud native storage is better visibility of what’s being provisioned in a single pane of glass (a.k.a. vSphere web client). With vSphere CNS, you can view your container volumes in the vSphere UI and find out what VM (a.k.a. Kubernetes node) the volume is connected to along with extra information such as labels, storage policy… I will show you that part in a bit.
Note that support for vSphere CSI features will depend on your environment, and you may or may not be able to leverage them in full. This is obviously subject to change across versions, so you can find the up-to-date list here.
Functionality | vSphere Container Storage Plug-in Support
--- | ---
vSphere Storage DRS | No
vSAN File Service on Stretched Cluster | No
vCenter Server High Availability | No
vSphere Container Storage Plug-in Block or File Snapshots | No
ESXi Cluster Migration Between Different vCenter Server Systems | No
vMotion | Yes
Storage vMotion | No
Cross vCenter Server Migration (moving workloads across vCenter Server systems and ESXi hosts) | No
vSAN, Virtual Volumes, NFS 3, and VMFS Datastores | Yes
NFS 4 Datastore | No
Highly Available and Distributed Clustering Services | No
vSAN HCI Mesh | No
VM Encryption | Yes
Thick Provisioning on Non-vSAN Datastores (for Virtual Volumes, it depends on capabilities exposed by third-party storage arrays) | No
Thick Provisioning on vSAN Datastores | Yes
New features are added with each release of the vSphere Container Storage Plug-in, such as:
- Snapshot support for block volumes
- Exposed metrics for Prometheus monitoring
- Support for volume topology
- Performance and resiliency improvements
- Online volume expansion (a quick sketch of this one follows the list)
- vSphere Container Storage support on VMware Cloud on AWS (VMC)
- ReadWriteMany volumes using vSAN file services
- And others…
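To illustrate online volume expansion, an existing PVC can be grown in place by patching its storage request, provided the storage class it uses was created with allowVolumeExpansion set to true and your CSI driver version supports the feature. This is a minimal sketch using a placeholder claim name:

# Grow the PVC "my-pvc" (placeholder name) to 10Gi
# Prerequisite (assumption): the storage class has allowVolumeExpansion: true
kubectl patch pvc my-pvc -p '{"spec":{"resources":{"requests":{"storage":"10Gi"}}}}'

# Watch the claim while the resize is processed
kubectl get pvc my-pvc -w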
The transformation from VCP (vSphere Cloud Provider) to CSI (Container Storage Interface)
Originally, cloud provider-specific functionality was integrated natively within the main Kubernetes tree, in what are called in-tree modules. Kubernetes is a fast-changing landscape with a community that strives to make the product as scalable and efficient as possible. The growing popularity of the platform meant more and more providers jumped on the train, which made this model hard to maintain and difficult to scale. As a result, vendor-specific functionality must now be removed from the Kubernetes code and offered as out-of-tree plug-ins. That way, vendors can maintain their own software independently from the main Kubernetes repo.
This is the case with the in-tree vSphere Volume plugin, which was part of the Kubernetes code and is being deprecated and removed from future versions in favor of the current, out-of-tree vSphere CSI driver. In order to simplify the shift from the in-tree vSphere Volume plugin to vSphere CSI, Kubernetes added a CSI Migration feature to provide a seamless transition.
The migration allows existing volumes using the in-tree vSphere Volume plugin to continue to function, even once the code has been removed from Kubernetes, by routing all volume operations to the vSphere CSI driver. If you want to know more, the procedure is described in this VMware blog.
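As a rough illustration, the migration is controlled through Kubernetes feature gates on the kubelet and control plane components. Recent Kubernetes releases enable them by default, so treat the flag below as an example for versions where they still need to be set explicitly:

# Feature gates routing in-tree vSphere volume operations to the CSI driver
--feature-gates=CSIMigration=true,CSIMigrationvSphere=true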
“vSphere cloud native storage includes additional and modern features with vSphere CSI driver compared to the in-tree vSphere volume plugin”
vSAN Cloud Native Storage integration
I will demonstrate here how to provision vSphere cloud native storage on vSAN without going too much into the details. The prerequisite for this demonstration is to have a Kubernetes cluster running on a vSphere infrastructure with the vSphere CSI driver installed. If you want a head start and to skip the step of installing the CSI driver, you can use CAPV or Tanzu Community Edition to deploy your Kubernetes cluster.
Anyway, in order to use vSphere cloud native storage, we will create a storage class in our Kubernetes cluster that matches the vSAN storage policy, then we will create a Persistent Volume Claim using that storage class, attach it to a pod and see how vCenter displays it in the vSphere client.
- First, I create a Storage Class that matches the name of the vSAN storage policy, which is “vSAN Default Storage Policy”. The annotation field means that PVCs will use this storage class unless specified otherwise. It will obviously depend on which vSAN storage policy you want to set as the default one.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: vsan-default-policy
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: csi.vsphere.vmware.com
parameters:
  storagepolicyname: "vSAN Default Storage Policy"
“The storage class references the vSAN storage policy and the storage provisioner (vSphere CSI driver)”
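Assuming the manifest above is saved as storageclass.yaml (the file name is just an example), it can be applied and checked like this:

kubectl apply -f storageclass.yaml

# The new storage class should be listed and flagged as (default)
kubectl get storageclass vsan-default-policy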
- Then I create a persistent volume claim (PVC) that references the storage class. The storage request will be the size of the virtual disk backing the PV.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-altaro-blog
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: vsan-default-policy
“The PVC creates a VMware CNS volume with a PV”
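Assuming the PVC manifest is saved as pvc.yaml (again, an arbitrary file name), applying it and checking that it binds looks like this:

kubectl apply -f pvc.yaml

# The claim should reach the Bound status once the volume has been provisioned
kubectl get pvc test-altaro-blog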
- You should now see a persistent volume provisioned by the PVC.
“The PVC should automatically create a PV”
- At this point you should see the vSphere cloud-native storage in the vSphere client by browsing to Cluster > Monitor > Container Volumes.
The volume name matches the name of the persistent volume claim. I also tagged it in Kubernetes to show how the tags are displayed in the vSphere client.
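For reference, the tags displayed in the vSphere client come from Kubernetes labels, which can be added to the claim with something like the following (the key and value are made up for this example):

kubectl label pvc test-altaro-blog app=altaro-demo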
- You can get details if you click on the icon to the left of the volume. You will find the storage policy and datastore, and you’ll see that no VM is attached to it yet.
- In the Kubernetes objects tab, you will find information such as the namespace in use, the type of cluster…
- The Physical Placement tab then shows you where the vSAN components backing this vSphere cloud-native storage are stored on the hosts.
- At this point the vSphere cloud native storage is created but it isn’t used by any pod in Kubernetes. I created a basic pod to consume the PVC.
apiVersion: v1
kind: Pod
metadata:
  name: test-pod-altaro
spec:
  volumes:
    - name: test-pv-altaro
      persistentVolumeClaim:
        claimName: test-altaro-blog
  containers:
    - name: test-cont-altaro
      image: nginx
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: test-pv-altaro
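For reference, assuming the pod manifest above is saved as pod.yaml (a file name chosen for this example), creating the pod and checking its placement looks like this:

kubectl apply -f pod.yaml

# The NODE column shows which Kubernetes node (i.e. which VM) runs the pod
kubectl get pod test-pod-altaro -o wide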
Notice where the pod is scheduled, on node “test-clu-145-md-0-5966988d9d-s97vm”.
- At this point, the newly created pod gets the volume attached, and this is quickly reflected in the vSphere client, where you can see the VM running the node on which the pod is scheduled.
- If you open the settings of said VM, you will find an attached disk, which is the vSphere cloud native storage created earlier.
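If you prefer a CLI check over the vSphere client and have govc installed, listing the devices attached to the node VM should show the extra disk. This is only a sketch: the VM name is the one from this example and the usual govc connection environment variables are assumed to be set.

# List the virtual devices attached to the node VM, including the CNS-backed disk
govc device.ls -vm test-clu-145-md-0-5966988d9d-s97vm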
Wrap up
Most IT pros will agree that the learning curve of Kubernetes is fairly steep, as it is a maze of components, plugins and third-party products that can seem daunting at first. However, they will also agree that Kubernetes has been one of the fastest-growing technologies of the last 5 years. The big players in the tech industry have all jumped on the bandwagon and either developed their own product or added support and managed services for it. VMware is one of them with its Tanzu portfolio, and vSphere cloud native storage is a critical component of this stack as it reduces complexity by offering vSphere storage to Kubernetes workloads. The cool thing about it is that it is made easy to use thanks to the CSI driver plugin architecture and is tightly integrated with the vSphere client for added visibility.