VMware DRS – Distributed Resource Scheduler

 

 

 

High Availability (HA) and Distributed Resource Scheduler (DRS) are the main reasons why you would set up a vSphere cluster. In today’s post, I’ll be covering the benefits and functionality of VMware DRS while briefly touching on how to set it up.

 

What is DRS?


Let’s begin by highlighting some of the benefits derived from using DRS.

Load Balancing – In a nutshell, DRS monitors resource utilization and availability on the constituent hosts of a cluster. Depending on how it’s configured, DRS will either recommend migrations or automatically migrate virtual machines from a host running low on resources to one that can sustain the additional load and supply the resources the VMs require. The main function of DRS is thus to ensure that a VM is allocated the compute resources it needs to run optimally.

Power Management – VMware Distributed Power Management (DPM) is a sub-component of DRS which places one or more ESXi hosts in standby mode if the remaining hosts are found to provide sufficient excess capacity. When resources start running low, DPM powers hosts back on to keep capacity at an optimum level.

Virtual Machine Placement – Using DRS groups, affinity and anti-affinity rules, you can specify which virtual machines will reside on which hosts. You can also lock the placement of mutually dependent VMs to a specific host for improved performance.

Resource Pools – While resource pools are not exclusive to DRS since they can be created on any ESXi host, it is only after you enable DRS that you are able to create resource pools on those hosts which are members of a cluster.

Storage DRS – This feature is independent of DRS on an ESXi cluster, but it deserves a mention even though I won’t cover it in any detail. Put simply, if you have several datastores, you can group them under a datastore cluster for which you can optionally enable Storage DRS (sDRS) as shown in Figure 1. From there on, sDRS takes care of load balancing disk space and I/O for the virtual machines residing within that datastore cluster.

Figure 1 – Turning on sDRS (using the C# vSphere Client)
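To make the load-balancing idea more concrete, here is a minimal Python sketch. It is not the actual DRS algorithm; it assumes a single resource dimension and a made-up imbalance metric (the population standard deviation of per-host load), and every name in it (`host_loads`, `imbalance`, `recommend_move`, the host and VM names) is hypothetical.

```python
from statistics import pstdev


def host_loads(placement, vm_demand, host_capacity):
    """Return each host's load as total VM demand divided by host capacity."""
    loads = {}
    for host, capacity in host_capacity.items():
        used = sum(vm_demand[vm] for vm, h in placement.items() if h == host)
        loads[host] = used / capacity
    return loads


def imbalance(placement, vm_demand, host_capacity):
    """Illustrative imbalance metric: spread of host loads (0.0 means perfectly even)."""
    return pstdev(list(host_loads(placement, vm_demand, host_capacity).values()))


def recommend_move(placement, vm_demand, host_capacity):
    """Return the single (vm, source, target) move that lowers imbalance most, or None."""
    best = None
    best_score = imbalance(placement, vm_demand, host_capacity)
    for vm, source in placement.items():
        for target in host_capacity:
            if target == source:
                continue
            trial = {**placement, vm: target}  # what the cluster would look like after the move
            score = imbalance(trial, vm_demand, host_capacity)
            if score < best_score:
                best, best_score = (vm, source, target), score
    return best


# Hypothetical two-host cluster: esx1 is at 70% load, esx2 at 10%.
host_capacity = {"esx1": 100, "esx2": 100}
vm_demand = {"a": 40, "b": 30, "c": 10}
placement = {"a": "esx1", "b": "esx1", "c": "esx2"}

print(recommend_move(placement, vm_demand, host_capacity))  # ('b', 'esx1', 'esx2')
```

In Manual or Partially Automated mode the result would surface as a recommendation; in Fully Automated mode the move would simply be carried out.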

 

What are the requirements?


Basic – You will need at least two ESXi hosts participating in a cluster managed by vCenter Server. Every ESXi host must be configured for vMotion. Each host will preferably be allocated a 1Gbit link on a private network reserved solely for vMotion traffic.

Storage – A SAN or NAS based shared-storage solution providing iSCSI or NFS based datastores mounted on every ESXi host in the cluster. Datastore naming should be consistent across all hosts.

Processors – Preferably, all hosts should have the same type of processor(s) so that vMotion can transfer a VM and resume its state correctly. Once the VM is transferred, the processor(s) on the destination host should present the same instruction set and pick up executing instructions from where the source host processor(s) stopped. Enhanced vMotion Compatibility (EVC) should be enabled wherever dissimilar processors are used.

Licensing – As of Feb. 2016, an Enterprise Plus license is required to enable DRS; see A quick look at VMware vSphere Editions and Licensing.
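The basic and storage requirements above can be expressed as a small pre-flight check. The Python sketch below works on plain dictionaries rather than a live vCenter inventory; the function name, the dictionary keys and the host and datastore names are all assumptions made for illustration, and it simplifies matters by treating every datastore as one that should be shared.

```python
def drs_readiness_issues(hosts):
    """Return a list of human-readable problems; an empty list means no issue was found.

    hosts: list of dicts with the (hypothetical) keys
           'name', 'vmotion_enabled' and 'datastores' (a set of datastore names).
    """
    issues = []
    if len(hosts) < 2:
        issues.append("cluster needs at least two ESXi hosts")

    for host in hosts:
        if not host["vmotion_enabled"]:
            issues.append(f"{host['name']}: vMotion is not configured")

    # Simplification: every datastore seen anywhere is expected on every host.
    all_datastores = set().union(*(h["datastores"] for h in hosts))
    for host in hosts:
        missing = sorted(all_datastores - set(host["datastores"]))
        if missing:
            issues.append(f"{host['name']}: cannot see datastore(s) {', '.join(missing)}")
    return issues


example_hosts = [
    {"name": "esx1", "vmotion_enabled": True, "datastores": {"ds-a", "ds-b"}},
    {"name": "esx2", "vmotion_enabled": False, "datastores": {"ds-a"}},
]
for issue in drs_readiness_issues(example_hosts):
    print(issue)
```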

 

Are there any gotchas?


Software companies like Oracle and Microsoft require you to purchase a license for every host on which you plan to run products such as Microsoft SQL Server or Oracle Database. If you have large clusters, the price tag will quickly inflate. As you’ll see further on, you could use VM-Host affinity rules to make sure that such VMs are “preferably” placed only on those hosts for which you acquired licenses. You could also opt to disable DRS altogether for the specific VMs. While I’m not covering HA here, note that the same issue will arise when a host fails since the VMs that were running on it are optionally restarted on another host, one that has not necessarily been licensed.

Licensing is generally a complex and often obfuscated topic, so make sure to understand the requirements and repercussions before enabling DRS (and HA) for products burdened by restrictive licensing schemes. This will ensure that, come audit season, you’re not caught violating licensing agreements.

 

Setting up DRS


Enabling DRS is very simple. Just right-click on the cluster name, select Edit Settings and turn it on as shown in Figure 2.

Figure 2 – Enabling DRS on a cluster (using the C# vSphere Client)

 

Simply turning on DRS will suffice for most environments. However, you need to be aware that the default automation level is Fully Automated. This means that DRS will automatically migrate VMs across hosts whenever it deems it necessary. There are three levels of automation:

Manual – with this mode selected, DRS will advise you to migrate VMs when resources are running low. Any subsequent action requires user intervention. As shown in Figure 3, DRS keeps on prompting you until you select a host on which you want a VM powered up.

Figure 3 – Selecting the host on which a VM is powered up

 

Partially Automated – in this mode, DRS will automatically place newly powered-up VMs on a host that is guaranteed to provide the required resources. During the course of normal operations, DRS will also make suggestions about any VMs that would benefit from being migrated to another host. To check which, click on the cluster name and select the DRS tab while in Hosts and Clusters view using the vSphere client. Clicking the Apply Recommendations button will migrate the respective VMs to the DRS-selected hosts. Suggestions are also given when DRS is run in manual mode.

DRS carries out a check every 5 minutes but you can force it to run at any time by clicking on Run DRS as shown highlighted in red in Figure 4.

Figure 4 – Manually running DRS

 

Fully Automated – As the name implies, DRS will automatically migrate VMs according to need as shown in Figure 5. Be careful with the Migration Threshold setting which, if set too aggressively, may trigger an inordinate number of migrations, more so in large environments. This may have an impact on overall performance, specifically on the storage and network fronts, due to the increased demand for IOPS and bandwidth.

Figure 5 – Setting the migration threshold
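One way to picture the migration threshold is as a filter over prioritized recommendations. The Python sketch below assumes that each recommendation carries a priority from 1 (most urgent) to 5, and that a threshold of N lets through priorities 1 to N, so that 1 is the most conservative setting and 5 the most aggressive. The function and the sample data are hypothetical.

```python
def recommendations_to_apply(recommendations, threshold):
    """Keep only the recommendations urgent enough for the given threshold.

    recommendations: list of (vm, target_host, priority) tuples, priority 1..5.
    threshold: 1 (conservative) to 5 (aggressive).
    """
    if not 1 <= threshold <= 5:
        raise ValueError("threshold must be between 1 and 5")
    return [rec for rec in recommendations if rec[2] <= threshold]


pending = [
    ("vm1", "esx2", 1),  # e.g. needed to satisfy a rule or evacuate a host
    ("vm2", "esx3", 3),
    ("vm3", "esx2", 5),  # marginal improvement only
]

print(recommendations_to_apply(pending, 1))  # [('vm1', 'esx2', 1)]
print(recommendations_to_apply(pending, 5))  # all three moves
```

The more aggressive the threshold, the more low-benefit moves get through, which is where the extra vMotion traffic comes from.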

 

The automation level can also be set individually for each VM. Doing so overrides the cluster settings.

Figure 6 – Overriding cluster enforced settings
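The override logic amounts to a lookup with a fallback. In this small Python sketch the level names and the VM name `sql01` are placeholders, not values taken from any VMware API.

```python
def effective_automation_level(cluster_level, vm_overrides, vm):
    """Return the per-VM override if one exists, otherwise the cluster-wide level."""
    return vm_overrides.get(vm, cluster_level)


cluster_level = "fully automated"
vm_overrides = {"sql01": "disabled"}  # e.g. a VM pinned for licensing reasons

print(effective_automation_level(cluster_level, vm_overrides, "sql01"))  # disabled
print(effective_automation_level(cluster_level, vm_overrides, "web01"))  # fully automated
```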

 

DRS Groups and Rules


There are instances where you’d want a group of virtual machines to run on the same host or group of hosts. There are also instances when two or more VMs are required to run on separate hosts to mitigate potential performance issues. Keeping a heavily used MS-SQL VM separate from an equally heavily used Exchange VM is one example. DRS allows for this as follows:

 

VM-VM Affinity Rules

  • Keep VMs together (Affinity) – group 2 or more VMs such that they are always hosted on the same host.
  • Separate VMs (Anti-Affinity) – group 2 or more VMs such that they run independently of each other on different hosts.

Note: If any two rules conflict, the older one is left enabled while the most recent is disabled. You can, however, select which rule to enable. In the following example I set up two rules. The first specifies that VM a and VM b should be kept together. The second, on the contrary, specifies that the two VMs should be kept apart, thus resulting in a conflict with the first rule. A red icon next to the rule will alert you to existing conflicts (Figure 7).

Figure 7 – Conflicting rules
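The "older rule wins" behaviour can be sketched in a few lines of Python. This is only a model of what the note describes, assuming that two rules conflict when they are of opposite kinds and share at least two VMs; the `VmVmRule` class and `add_rule` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VmVmRule:
    name: str
    kind: str          # "affinity" or "anti-affinity"
    vms: frozenset     # names of the VMs the rule covers
    enabled: bool = True


def add_rule(rules, new_rule):
    """Append new_rule, disabling it if it conflicts with an older enabled rule."""
    for old in rules:
        shares_a_pair = len(old.vms & new_rule.vms) >= 2
        if old.enabled and old.kind != new_rule.kind and shares_a_pair:
            new_rule.enabled = False  # the older rule stays in force
            break
    rules.append(new_rule)
    return new_rule


rules = []
together = add_rule(rules, VmVmRule("keep-together", "affinity", frozenset({"VM a", "VM b"})))
apart = add_rule(rules, VmVmRule("keep-apart", "anti-affinity", frozenset({"VM a", "VM b"})))

print(together.enabled, apart.enabled)  # True False
```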

 

VM-Host Affinity Rules

Virtual machines to hosts – Bind one or more VMs to a pre-defined DRS group of hosts

Note: No rule checking is performed for VM-Host affinity rules so you may end up with conflicting rules. Again, the older rule takes precedence with the new one being automatically disabled. Care should also be exercised when creating this type of rule since any action violating a required affinity rule may prevent:

  • DRS from evacuating VMs when a host is placed in maintenance mode.
  • DRS from placing virtual machines at power-on or load balancing them.
  • HA from performing failovers.
  • DPM from placing hosts into standby mode.

Consult the following guide (specifically 83-86) for further details.

DRS Groups and Rules can be set from the Cluster settings shown below. You only need to create groups when setting up “Virtual machines to hosts rules” since the option is not available when creating affinity and anti-affinity rules (See Figures 8 and 9).

Figure 8 – Setting up VM and Host DRS groups

 

Figure 9 – Creating VM rules

 

Pay particular attention when creating Virtual Machines to Hosts rules. You are given four options (see Figure 10 – options boxed in green), each of which behaves differently despite the similar wording.

Be wary of using rules starting with must as this implies strict imposition. In practical terms, let’s say you create a must run on hosts in group rule for a particular VM. If for any reason the hosts in the referenced group are offline, the VM will not migrate and/or is prevented from powering up, unless of course you disable or delete the rule. This may also lead to a host affinity rule violation due to any of the unwanted scenarios previously mentioned. If this happens, disable the offending rule and run DRS manually or wait for it to do so automatically at 5-minute intervals. Stuck processes, such as placing a host in maintenance mode, will resume after a short while.

Unless absolutely necessary, avoid using must and opt instead for should. This simply sets a preference on which ESXi host to use. If none are available, DRS selects the next best option.

Figure 10 – Specifying the rule type
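The difference between must and should is easiest to see side by side. The following Python sketch models the four rule types as plain strings and picks a host from those currently online; it is a simplified illustration under those assumptions, not how DRS itself is implemented, and `pick_host` and the host names are hypothetical.

```python
def pick_host(online_hosts, group_hosts, rule):
    """Return the host a VM would land on, or None if the rule blocks placement.

    online_hosts: hosts currently available, ordered best candidate first.
    group_hosts:  the DRS host group the rule refers to.
    rule: "must run on", "should run on", "must not run on" or "should not run on".
    """
    in_group = [h for h in online_hosts if h in group_hosts]
    out_of_group = [h for h in online_hosts if h not in group_hosts]

    if rule in ("must run on", "should run on"):
        allowed, fallback = in_group, out_of_group
    elif rule in ("must not run on", "should not run on"):
        allowed, fallback = out_of_group, in_group
    else:
        raise ValueError(f"unknown rule type: {rule}")

    if allowed:
        return allowed[0]
    if rule.startswith("should") and fallback:
        return fallback[0]  # a preference only: take the next best option
    return None             # a strict rule: the VM is not placed


licensed_group = {"esx1", "esx2"}

# Both licensed hosts are offline; only esx3 is left.
print(pick_host(["esx3"], licensed_group, "must run on"))    # None
print(pick_host(["esx3"], licensed_group, "should run on"))  # esx3
```

With a must rule the VM stays down until a host in the group comes back or the rule is disabled, which is exactly the situation described above.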

 

Monitoring DRS


Switching to the Summary tab, you will find a vSphere DRS information pane on the upper-right part of the screen as per Fig. 11. You are presented with DRS information which includes the set automation level, a list of outstanding recommendations and the load-balancing status of the cluster. You’ll also find a link to a Resource Distribution Chart which opens up a window showing the load distribution across the cluster as per Fig. 12.

Figure 11 – DRS status window

 

Figure 12 – DRS Resource Distribution chart

 

Disabling DRS


If for any reason you need to disable DRS, you must keep a couple of things in mind. The first is that you WILL LOSE the existing Resource Pool hierarchy, including VM membership; I learned this the hard way a couple of times, fun times!

On a positive note, you can save the resource pool tree via the vSphere Web client for future imports which is perhaps another reason why you should ditch the legacy vSphere client. Note however that while the process will restore the original resource pool hierarchy IT WILL NOT restore VM membership, meaning you are left with a bunch of empty resource pools. In addition, you will not be able to re-import the resource pool tree if you created new resource pools after re-enabling DRS. Quite honestly, I don’t see this being of much use but better this than nothing.

Figure 13 – Disabling DRS / Resource Pool removal warning

 

The second point to keep in mind is that rules set up prior to disabling DRS will still apply. In fact, if you re-enable DRS, all previously set rules will magically reappear. According to this article, one should be wary of should rules when DRS is disabled as apparently you can expect the unexpected.

Note: I tried replicating the above point on vCenter Server 6.0 / ESXi 6.0 and the should rules I specified worked just fine irrespective of DRS being enabled or not.

 

Conclusion


I believe I covered most of what there is to cover on the subject. I did not explore DPM as I currently lack the resources to test the feature out thoroughly but will try to in the future. Something I’ll definitely be covering in the near future is High Availability as it ties in neatly with DRS.

With that said, do keep an eye on this space to learn more on VMware products and technology.

UPDATE: I talk about HA in Setting up VMware High Availability on a vSphere Cluster. Do have a look!


23 thoughts on "VMware DRS – Distributed Resource Scheduler"

  • Ahmed says:

    Hey, Nice Article but you forgot to mention what vsphere license is required in the requirements section.

  • Sahar Nassif says:

    Our DRS is configured to be fully automated, and a VM-VM affinity rule is set to keep about 5 VMs together on the same host.
    However, every time I try to move the 5 VMs from their host to a different host, they all move successfully to the target host and then revert to the original host, even though the target host has better resources than the source host.

    What could be the reason?

    • Luke Orellana says:

      Hi Sahar,

      You may have affinity rules created to ensure that those 5 VMs are configured to stay on that particular host. I would check that first. Also, the algorithm that DRS uses wasn’t always accurate until vSphere 7 came out and provided a better experience. It may be that the DRS algorithm is calculating that those 5 VMs are better off on the original host even though they’re not.
