Table of contents
- 1. Mis-Provisioning Resources in Hyper-V
- 2. Creating Too Many Networks and/or Virtual Adapters
- 3. Creating Too Many Virtual Switches
- 4. Optimizing Page Files
- 5. Not Leveraging Dynamic Memory
- 6. Leaving Default VM Configurations
- 7. Not Troubleshooting the Right Thing
- 8. Overloading the Management Operating System
- 9. Leaving the Management OS in Workgroup Mode When there is a Perfectly Good AD Domain Available
- 10. Not Testing
- 11. Avoiding PowerShell
- 12. Not Figuring Out Licensing in Advance
I’ve seen a lot of questions from those who have recently deployed Hyper-V for the first time. Some just need a few pointers to iron out minor glitches, but some are in really bad shape. Here are some of the most common deployment mistakes and their solutions.
1. Mis-Provisioning Resources in Hyper-V
There are a lot of ways to get the hardware wrong. This is usually the result of not having system profiles or of taking advice from people who don’t have to write the checks for your systems.
Improper Balance of CPU and Memory
You are almost guaranteed to run out of memory resources long before you run out of CPU. Don’t be one of those poor souls that buys dual 20-core CPUs with 64GB of RAM. Memory can’t be shared. Even though you can use Dynamic Memory to squeeze in more VMs than might otherwise fit, the memory that each VM uses belongs only to that VM. CPU, on the other hand, spreads out nicely. To keep it short, it’s because CPU cores aren’t dedicated to anything and will easily handle the load from multiple virtual machines.
The best thing to do is run Performance Monitor traces against systems you intend to place into your new virtual environment. If you can’t do that, try to get some idea from people who have. If you can’t do that, you can talk to the manufacturers of the applications you’ll be virtualizing. They’re going to overstate their needs, but it’s a starting point. If all else fails, Microsoft used to recommend 8-12 virtual CPUs per physical core for virtualized server operating systems. It’s not wonderful, as explained in the linked article, but it’s better than nothing.
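If you’d rather script that baseline than build a Performance Monitor data collector set by hand, a quick Get-Counter capture works on any recent Windows Server. This is only a sketch: the counter list, sample interval, and output path are placeholder values, not recommendations.

```powershell
# Sketch of a baseline capture; adjust counters, interval, and duration to taste.
$counters = '\Processor(_Total)\% Processor Time',
            '\Memory\Available MBytes'

# Sample every 15 seconds for an hour (240 samples) and save the results
# in a .blg file that Performance Monitor can open later.
Get-Counter -Counter $counters -SampleInterval 15 -MaxSamples 240 |
    Export-Counter -Path 'C:\PerfLogs\baseline.blg' -FileFormat BLG
```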
Improper Balance of Networked Storage and Network Connectivity
Two things I’ve noticed about storage are that people dramatically overestimate just how much storage speed they need and dramatically overestimate just how fast storage can perform. They’ll hear that RAID-10 is the fastest RAID build, so they’ll stick six or eight disks in a RAID-10 array and then blow a bunch of money connecting to it over dual 10GbE adapters. It will probably work, but mainly because people who build that sort of configuration usually don’t need a great deal of disk performance.
The short explanation is: spinning disks are very slow and 10GbE networking is very fast. A bonded 10GbE pair is very, very fast. You’re going to need a lot more than 6 or 8 disks to come anywhere near keeping that thing satisfied. Even if you’ve got that many disks, you’re still going to need lots and lots of demand, or the line is still going to be mostly empty.
It’s more important to size disk appropriately than CPU, because storage costs can escalate quickly. Try to get an idea of what your real I/O needs will be in advance. Don’t blindly ask people for advice; you’ll be advised to over-buy every time. If you don’t know: databases performing hundreds of transactions per minute, or more, need lots of I/O. Most everything else doesn’t. In aggregate, disk needs can be high, of course, but a dozen VMs that each average 30 IOPS of load won’t even bog down four 15k disks.
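If you want a rough number before you buy, sample the disk transfer rate on the physical servers you intend to virtualize. The sketch below uses the standard PhysicalDisk counter; the five-minute window is only an example, so run it during a busy period to get a meaningful average and peak.

```powershell
# "Disk Transfers/sec" is the combined read+write IOPS across all physical disks.
Get-Counter -Counter '\PhysicalDisk(_Total)\Disk Transfers/sec' `
    -SampleInterval 5 -MaxSamples 60 |
    ForEach-Object { $_.CounterSamples.CookedValue } |
    Measure-Object -Average -Maximum
```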
Improper Balance of SSD and Spinning Disks
SSD is obviously a fantastic fit for virtualization loads. It’s really fast, and the low latency makes the scattered access patterns of multiple virtual machines a non-issue. We have a while until it gets cheap enough to replace spinning disk, though. In the interim, we can build hybrid arrays that use both. With Storage Spaces, we can even do it with commodity server hardware.
Unfortunately, these SSDs are often used inappropriately. More than once, I’ve seen newcomers install Hyper-V Server on a pair of SSDs and use their spinning disks to hold virtual machines. This is a horrible misapplication of disk resources. Hyper-V Server, or Windows Server with Hyper-V, is going to do a lot of disk churn when it’s first turned on and then it’s going to sit idle. It’s your virtual machines that need disk I/O. If you’re not going to put those SSDs to use holding VM data, yank them and put them in a different computer where they can earn their keep. Better still, use them in a tiered Storage Spaces configuration as mentioned in the previous paragraph.
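If you do want to put a handful of SSDs and spinners to work together, a tiered Storage Space is the sort of thing I mean. This is a minimal sketch only: the pool name, tier sizes, and resiliency setting are invented, and it assumes a single storage subsystem with poolable JBOD disks on Server 2012 R2 or later.

```powershell
# Pool every disk that is eligible for pooling (assumes one storage subsystem).
$pool = New-StoragePool -FriendlyName 'VMPool' `
    -StorageSubSystemFriendlyName (Get-StorageSubSystem).FriendlyName `
    -PhysicalDisks (Get-PhysicalDisk -CanPool $true)

# Define the SSD and HDD tiers within the pool.
$ssdTier = New-StorageTier -StoragePoolFriendlyName 'VMPool' -FriendlyName 'SSDTier' -MediaType SSD
$hddTier = New-StorageTier -StoragePoolFriendlyName 'VMPool' -FriendlyName 'HDDTier' -MediaType HDD

# Carve out a mirrored, tiered virtual disk to hold virtual machine data.
New-VirtualDisk -StoragePoolFriendlyName 'VMPool' -FriendlyName 'VMData' `
    -StorageTiers $ssdTier, $hddTier -StorageTierSizes 200GB, 2TB `
    -ResiliencySettingName Mirror
```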
Improper Balance of Networking Resources
If you’re building a standalone Hyper-V host with two network cards, the tribal knowledge has been to dedicate one to the management operating system and cram all the virtual machines into the other. In 2008 R2 and prior, you either had to do that or you had to go off-support and make a manufacturer team. With native teaming, you don’t have to do that anymore.
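A minimal sketch of what that looks like on 2012 R2, assuming two physical adapters named NIC1 and NIC2 (the team and switch names are made up): team them, put one virtual switch on top, and let the management OS share it.

```powershell
# Create a native team from both physical adapters (the Dynamic algorithm requires 2012 R2).
New-NetLbfoTeam -Name 'HostTeam' -TeamMembers 'NIC1', 'NIC2' `
    -TeamingMode SwitchIndependent -LoadBalancingAlgorithm Dynamic

# Bind a single virtual switch to the team and share it with the management OS.
New-VMSwitch -Name 'vSwitch' -NetAdapterName 'HostTeam' -AllowManagementOS $true
```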
For clustered guests, I see people balancing networking resources in all sorts of odd ways. Virtual machines are given one or two physical adapters while an entire physical adapter is left for CSV traffic. Again, that was a necessity in 08 R2 and earlier, but not anymore. In 2012 R2, a dedicated CSV network is all but pointless, provided that you have sufficient bandwidth available for cluster communications in general.
There are a lot of ways to screw up networking in Hyper-V. For this section, just spend some time thinking about where you need the most bandwidth to keep your virtualized applications happy, and design accordingly.
Improper Focus of Resources
I know that all the cool kids are telling you to always use fixed VHDXs for everything, but it’s lazy, terrible advice. Yes, they’re a bit faster, and yes, they’ll prevent you from ever accidentally over-provisioning storage. But they’re not much faster, and I have full faith that you can do the minor math necessary to keep from over-provisioning. If you provision virtual machines with C: drives for guest operating systems and other VHDX files for data, then that C: drive is a perfect candidate for the dynamically expanding format. You can set it for 60 GB and expect it to stay well under 40 for its entire life. If you’ve got 10 virtual machines, that’s a minimum of 200 GB of savings. Or, if you unquestioningly use fixed, a minimum of 200 GB of completely wasted space.
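For the record, creating a dynamically expanding disk is a one-liner; the path and size below are just examples, and the second command shows how to compare the space a VHDX actually consumes on the host against what the guest sees.

```powershell
# Dynamically expanding 60 GB C: drive for a hypothetical guest.
New-VHD -Path 'C:\VMs\Web01\Web01-C.vhdx' -SizeBytes 60GB -Dynamic

# FileSize is what the host is really spending; Size is what the guest sees.
Get-VHD -Path 'C:\VMs\Web01\Web01-C.vhdx' | Select-Object Path, VhdType, FileSize, Size
```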
The same goes for a lot of other aspects of virtualization. Before you design your deployment: Stop! Think! What are the most likely bottlenecks you will face? That is where you focus, not based on some list of false always/never items dreamed up by somebody who copy/pasted it from someone else who copied it from someone else who invented it out of whole cloth.
The thing is, over-committing resources is sort of “what we do” in the virtualization space. If you’re uncomfortable with that, then maybe you’re just not ready for virtualization. In 2014 (or whatever year you happen to be reading this), it’s simply a waste to provision much of anything in a virtual environment at a 1:1 ratio unless you’re demonstrably certain that those resources are going to be consumed at that ratio.
2. Creating Too Many Networks and/or Virtual Adapters
This mistake category usually stems from not understanding the Hyper-V virtual switch. For a standalone Hyper-V system, the management operating system needs to be present in exactly one network using exactly one IP. Preferably, it will be routed (have access to a gateway), but you can skip the gateway if your security requirements demand it. That single IP should register in DNS so your management system(s) can reach it by name. That IP doesn’t need to have anything in common with any of the guests at all. It doesn’t have to be on the same subnet(s) or VLAN(s). You don’t need to create an IP for the management OS in all of the VLANs/subnets that your guests will be using. You certainly don’t want to create a virtual adapter for anything to do with virtual machine traffic. Your guests will not be routing any traffic through the management operating system. The only other network presence you need to create for the management operating system would be for any iSCSI connections.
For a clustered host, you must have a management IP address and a cluster network IP address. You should create a cluster network to dedicate to Live Migrations. You can create additional cluster communications networks if you want, but they’re only useful if you can use SMB multichannel. You might also need some IPs for iSCSI connections. That’s it. Just as with the standalone, don’t go creating a lot of IP addresses within VM networks and don’t go creating virtual adapters in the management operating system for virtual machine traffic.
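For a converged clustered host, that minimal adapter set might look like the sketch below. The switch name and VLAN IDs are placeholders; adjust them to your own environment and repeat on every node.

```powershell
# Two extra management-OS virtual adapters: one for cluster traffic, one for Live Migration.
Add-VMNetworkAdapter -ManagementOS -Name 'Cluster' -SwitchName 'vSwitch'
Add-VMNetworkAdapter -ManagementOS -Name 'LiveMigration' -SwitchName 'vSwitch'

# Tag each onto its own VLAN (IDs are examples only).
Set-VMNetworkAdapterVlan -ManagementOS -VMNetworkAdapterName 'Cluster' -Access -VlanId 10
Set-VMNetworkAdapterVlan -ManagementOS -VMNetworkAdapterName 'LiveMigration' -Access -VlanId 11
```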
3. Creating Too Many Virtual Switches
You probably only need one virtual switch. There are use cases for multiple virtual switches, but not many. One would be if you have some guests you want to place in a DMZ while other guests need to stay on the corporate network. You could use dedicated physical switches with a dedicated virtual switch for the DMZ guests. It should be pointed out that for most organizations, VLANs or network virtualization can probably achieve a sufficient degree of isolation.
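As a quick illustration of the VLAN approach (the VM names and VLAN IDs are invented), one switch can carry both DMZ and corporate guests with their traffic kept apart at layer 2:

```powershell
# Both guests connect to the same virtual switch; the VLAN tags keep them separated.
Set-VMNetworkAdapterVlan -VMName 'DMZ-Web01' -Access -VlanId 50
Set-VMNetworkAdapterVlan -VMName 'Corp-App01' -Access -VlanId 10
```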
4. Optimizing Page Files
The purpose of a page file is to give applications access to more memory than is physically available to the operating system. The operating system’s preference is to use it to hold memory that is rarely accessed. If your page files have such a performance dependency that you need to put significant time into deciding where they will go, then the memory they hold is not rarely accessed. That means you broke something way back at step 1, or you did it right and some condition changed. Note that I’m not talking about preparation for Replica. As for the hypervisor’s page file, it will see near-zero usage. You just need to be sure you have enough space for it.
5. Not Leveraging Dynamic Memory
When we bloggers and book authors write about Dynamic Memory, we exert so much effort scaring people away from using it on SQL and Exchange servers that a lot of people never use it for anything. There isn’t as much slack in memory as there is with CPU or disk, but it’s still there. The really nice thing about Dynamic Memory is that you can adjust it on the fly. Set what you think is a good minimum, erring toward the high side, and what you think is a good maximum, erring toward the low side. You can always reduce minimums and increase maximums. What you can’t do is modify fixed memory while the guest is on.
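As a sketch, something like the following sets a sensibly conservative range on an existing guest (the VM name and values are examples; the guest must be off to enable or disable Dynamic Memory itself):

```powershell
# Conservative Dynamic Memory range; loosen the minimum/maximum later as real usage emerges.
Set-VMMemory -VMName 'App01' -DynamicMemoryEnabled $true `
    -StartupBytes 2GB -MinimumBytes 1GB -MaximumBytes 6GB
```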
6. Leaving Default VM Configurations
If you use the wizard to create a virtual machine, you get a single vCPU. If you enable Dynamic Memory, its maximum will be 1TB. If your guest operating system is post-XP/Server 2003, you want at least 2 vCPUs. You almost definitely don’t want Dynamic Memory to allow up to 1TB, and not least because you don’t have the physical memory to back it up. As I said in #5, you can always increase the maximum, but you have to turn the guest off to reduce it. If it’s at a ridiculously high number and you get some greedy or memory-leaking application running, one guest can throw off the entire balance of the host’s allocation to every other guest.
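Adjusting those defaults takes seconds while the new VM is still off; the name and values below are only examples.

```powershell
# Give the wizard-built VM a second vCPU and a realistic Dynamic Memory ceiling.
Set-VMProcessor -VMName 'App02' -Count 2
Set-VMMemory    -VMName 'App02' -DynamicMemoryEnabled $true -MaximumBytes 8GB
```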
7. Not Troubleshooting the Right Thing
“I bought a 120-core server with 32TB of RAM and a SAN with 720 SSD disks and Hyper-V guest disk access is SOOOOO SLOW and it shouldn’t be because look at all this hardware!”
See the problem?
“Oh, my network connection between the host and the SAN? Well, they have 10GB cards so that’s not it.”
Did we really get an answer?
“Oh, the interconnects? Well, they go into Cisco switches that then go to a couple of old Novell Netware servers running TCP/IP-to-IPX/SPX gateways through a couple of really old Ethernet II hubs connected by 847 feet of Cat-3 cable, about 350 feet of which runs through an air conditioner bank, over an exposed hilltop, and around a nuclear waste dump. Why? Is that a problem?”
OK, so that’s never happened. I think. But, the point is, people never seem to remember all of the components that go into making this stuff work.
It’s not just resources, either.
“I virtualized a web server and no one can get to it! What is wrong with Hyper-V? Didn’t Microsoft test this stuff?”
Two or three questions later.
“Well, yeah, I left the firewall on without a port 80 exception and didn’t put the guest’s adapter into the correct VLAN or give it good IP information, but still, it’s all because Hyper-V!”
Virtual machines can’t do anything magical. They have to be configured correctly, just like a physical environment.
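The boring checks find these problems far more often than anything hypervisor-related. A hypothetical example (the guest name, host name, and port are placeholders):

```powershell
# Is the guest's adapter connected to the right switch and VLAN, with sane IP information?
Get-VMNetworkAdapter -VMName 'Web01' | Select-Object SwitchName, Status, IPAddresses
Get-VMNetworkAdapterVlan -VMName 'Web01'

# From a client machine, test the actual service before blaming Hyper-V.
Test-NetConnection -ComputerName 'web01.corp.example.com' -Port 80
```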
8. Overloading the Management Operating System
The management operating system should run virtual machines and backup software. If you want, it can run anti-malware software. End of list.
Want to run Active Directory domain services? A software router? A web server? A Team Fortress 2 dedicated server? Hmm, I wonder where that sort of thing could go… Oh, look! A hypervisor! A thing that can run virtual machines! Up to 1,024 on the same host! If it isn’t a virtual machine, or something to back up a virtual machine, or something meant to prevent the hypervisor or a virtual machine from being compromised, then it goes inside a virtual machine. Every time. If you don’t want to virtualize it, then find another physical system. If you don’t want to do that, then I’m sorry, I just don’t think you’re understanding this whole virtualization thing.
9. Leaving the Management OS in Workgroup Mode When there is a Perfectly Good AD Domain Available
Is your Hyper-V host in the DMZ? No? Then join it to the domain if you have one. I know you’ve been told that it’s more secure to leave it in workgroup configuration, but you’ve been told very, very wrong. Are you concerned that if someone compromises the management operating system and it’s in the domain that they’ll also be able to access all your guests? Guess what? If it’s in the workgroup and they compromise it, that situation is not any better! If an attacker gains access to the host, s/he has, at minimum, read access to all the VHDX files for all its guests. If any of them are in the domain, there is no functional difference whether or not the host is a domain member. Worse, your host has been compromised and the only barrier you put up to stop it was workgroup-grade security.
If you’re doing it because all of your domain controllers are virtual and you got conned by the chicken-and-egg myth, we debunked that a while ago. The only possible chicken-and-egg scenario is if your DCs are all virtualized and stored on SMB 3 shares, because SMB 3 connections are refused if they can’t be authenticated. The first, best answer is “don’t put [all] virtual DCs on SMB 3 shares”. The last, worst answer is “don’t join the host to the domain”, especially considering that it won’t fix your SMB 3 problem either.
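Joining the management operating system to the domain is a single line from an elevated prompt (the domain name below is a placeholder):

```powershell
# Join the host to the existing AD domain and reboot to finish.
Add-Computer -DomainName 'corp.example.com' -Credential (Get-Credential) -Restart
```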
10. Not Testing
Failure to test is sort of a universal failure of IT shops. Just because a set of hardware and a particular configuration should work doesn’t mean that it will. Plan -> deploy -> test -> go live. Order variance is not acceptable.
11. Avoiding PowerShell
I can understand not wanting to learn something new. It’s hard, it breaks the pattern that works for you, and there are so many other things you need to do that there’s just no time. The thing is, there’s a lot of Hyper-V that you can only get to with PowerShell, and the same is true of most Microsoft server products. It seems like a lot, but once you automate something that you used to have to do manually, you become addicted pretty quickly. Don’t cut off the best tool you’ve got.
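Here’s a small taste of the payoff: one line that shows every running guest and what it’s consuming right now, which would take a lot of clicking in Hyper-V Manager.

```powershell
# Running guests with current CPU percentage and assigned memory (rounded to GB).
Get-VM | Where-Object State -eq 'Running' |
    Select-Object Name, CPUUsage,
        @{ Name = 'MemoryGB'; Expression = { [math]::Round($_.MemoryAssigned / 1GB, 1) } }
```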
12. Not Figuring Out Licensing in Advance
I’m going to estimate that about 90% of the unofficial material out there on licensing is junk. I’ve written a couple of posts on it, most recently this one, but it’s clear that we’re not winning the war on licensing ignorance. It would be nice if Microsoft would publish a clearer licensing FAQ, but they don’t. The thing is, you know that you have to buy licenses, and you probably know who you’re going to buy them from. Rather than doing a bunch of searches and scouring forums and listening to people that are not authorized to speak on the subject, just call that vendor. If they’re an authorized reseller of Microsoft licenses, they’ve got someone on staff that will be able to ask you a couple of questions and tell you within the span of a few minutes exactly what you need to buy.
This is really important. If you’re ever audited (all it takes to trigger an audit is a single phone call) and you’re out of compliance, the fines can get astronomical. They do tend to show leniency when you can make a convincing case that you didn’t know any better. However, out of the organizations I know that have been audited, all of them that were out of compliance had to pay at least a symbolic fine — and I’m pretty sure they disagreed with the auditors on how much money constitutes “symbolic”. Why did they have to pay? Because the phone call is free, that’s why. No one can really say they faced meaningful barriers to finding out what to do. You might say, “Hey, this Eric Siron guy on this page here said that I could do this.” Know what that gets you? A fine. Are they going to come after me? No. Am I going to pay it? No. If I made a mistake or wasn’t clear enough, then you have my apologies and sympathies, but nothing else. The responsibility is yours, and yours alone, to get this right. Make the phone call.
98 thoughts on "12 Common Hyper-V Deployment Mistakes"
Nice article, thank you! Thankfully I haven’t been guilty of too many !!
hello eric!
great article, as usual… 🙂 one thing about the dynamic memory though: we run a lot of webservers on our hyper-v cluster and initially i had dynamic memory configured on them. but then i observed that they always ran at the lowest limit of the range and that they would not use the offered maximum memory-spread for operating system level file caching. and that caching is a big performance plus on http-servers, as it reduces disc I/O greatly. so i put those back on static allocation. as said, just my observation.
regards, lukas
That’s certainly good knowledge. I’m not advocating that people should just blindly use Dynamic Memory. I’m advocating for people to not blindly avoid Dynamic Memory. It should never be used anywhere that it doesn’t make sense to.
I wonder though, in your case, if it might make sense to use a higher Startup value. When a VM starts, it thinks that all it has available is that value. Hyper-V will dynamically increase the maximum available as demand increases, thereby raising the upper limit of system memory that the VM sees. When Hyper-V reclaims memory, it uses the balloon driver. The virtual machine still thinks that it has that same maximum. So, basically, you’ll get different results from a VM that has a 512MB startup, 512MB minimum, and 2GB maximum than you would get from a VM that has a 2GB startup, a 512MB minimum, and a 2GB maximum.
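To make that comparison concrete (the VM name is a placeholder, and you would apply only one of these), the two profiles look like this:

```powershell
# Profile 1: small startup; the guest boots seeing only 512MB and grows on demand.
Set-VMMemory -VMName 'Web01' -DynamicMemoryEnabled $true `
    -StartupBytes 512MB -MinimumBytes 512MB -MaximumBytes 2GB

# Profile 2: full-size startup; the guest boots seeing 2GB, so its caches size themselves larger.
Set-VMMemory -VMName 'Web01' -DynamicMemoryEnabled $true `
    -StartupBytes 2GB -MinimumBytes 512MB -MaximumBytes 2GB
```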
I think it is a Windows Server thing but Windows Server 2003 never goes beyond the Startup RAM assigned. So if you assign 16GB startup and 10GB minimum and 20GB maximum the effective maximum is 16GB. For Windows 2012 onwards this doesn’t seem to be a problem and it can actually stretch out to the maximum. For SQL Servers it is best to use a fixed amount of RAM.
Did you configure 4GT? If not, 32-bit W2003S will stop at 16GB anyway. https://technet.microsoft.com/en-us/library/cc786709(v=ws.10).aspx Just a friendly reminder, W2003S goes out of support in less than a month.
SQL Server can now support hot-add memory (various version and edition restrictions apply), so Dynamic Memory is no longer a must-not.
Good read, it’s good to be challenged on some of the smart choices we make that may turn out to be stupid. 🙂
Keep challenging,
-Paul
I have noticed as well that dynamic memory really does not work, again and again i have seen where the system go to the low end of the memory range, then they cant seem to get it back (windows guests) and i end up having to reboot the guest.
There’s a patch for that. https://support.microsoft.com/en-us/kb/3095308
All of my systems are operating fine with that in place.
Is it a good idea to join 2 Hyper-V 2012 R2 Core hosts (1 primary and 1 replica) to the domain when the DC/DNS/DHCP server is a guest on the primary host?
Thanks a lot.
Vittorio
Yes.
http://www.altaro.com/hyper-v/domain-joined-hyper-v-host/
http://www.altaro.com/hyper-v/virtualized-domain-controllers-4-myths-12-best-practices/
I’m barely containing my fury at this point not being able to access the internet from my VM.
Is there anywhere that gives UP TO DATE information on what TO DO instead of what NOT TO DO?
Everyone tells me to use a bridge, but every time I create a bridged connection w/ my ethernet adapter and the virtual switch, it has me restart my computer and when that’s done, both the virtual switch and the bridge are GONE!
I’m about to scream seriously!
Nevermind
OK, I gather that you’re frustrated.
But, first, every single item on this list includes “do this” along with the “don’t do that”. Every single item. If that was some sort of shot at the world in general, fine, but leave me out of it.
Second, you’ve looped me in at chapter 6 of your saga and I haven’t even seen the prologue yet.
So, I’m probably not going to be a lot of help with no more information than you’ve left here. But, I will say that 90% of the broken virtual switch implementations that have been brought to me are radically overthought and overengineered. The remaining 10% are using something like a wifi adapter that is just never going to work because its engineers never had that purpose in mind.
In the nearly seven years that I’ve been using Hyper-V, I’ve never once tried to use bridging as a solution. I don’t know who is suggesting that or why, but it would probably be my last choice.
Hi, nice article. What are your thoughts on joining host to the domain when you have a single server setup and the only DC is a VM?
Join it: https://www.altaro.com/hyper-v/domain-joined-hyper-v-host/.
terrible idea. if the VM isn’t booted, what domain will the hyper-v host authenticate against? For a single server setup the hyper-v host should stay in a workgroup. The only alternative is to have a physical server DC as well; then you can join a single Hyper-V host to the domain, as you will at least have 1 server available to authenticate against.
A Hyper-V system behaves perfectly well when it hosts its own domain controller, even in a single host/DC environment. This has been repeatedly tested and proven.
I should have added that your VM’s will boot up without having to logon to the Hyper-V host, so it technically will work and not lock you out assuming you have your VM’s setup to power up on server boot. I would recommend though that you definitely leave yourself a break glass local account in case the VM domain controller does not boot up for whatever reason and then you cannot logon to the Hyper-V host to troubleshoot it. For this reason I prefer to keep at least one physical GC on the network. But you can do it either way, just make sure you have that local admin account for safety.
That doesn’t make any sense. If you have a single Hyper-V host running its own domain controller, then the only way to permanently lock yourself out is by intentionally disabling cached credentials AND intentionally disabling the local host administrator account AND if the virtual domain controller fails or isn’t set to auto-start. Or, in other words, if you intentionally architect for failure, you’re at a higher risk of failure. Even if you have done all of the above, you should still be able to restore your virtualized domain controller from backup to another machine, even a Windows 10 desktop, and that will be enough to get everything going again. It’s just a waste of money to keep a physical DC as a security blanket against a problem that literally no one in a small environment ever has any reason to cause for themselves.
Most intelligent comment I’ve read on this particular topic in like…………………….EVER !
Great article Eric. But I think “You can set it for 60 GB and expect it to stay well under 40 for its entire life” is not correct. With all the Windows Updates and upgrades you can expect more than 40 GB used. If you want to upgrade from 2012 to 2016 at any point, you will also need more space.
Every single one of my Windows Server VMs has a C: VHDX under 40GB, with the exception of the SQL VMs because SQL. Most of my Server Core VMs are under 15GB. If routine patches and upgrades are causing your VM to balloon over 40GB, then something is broken or you’re putting more in the C: partition than just Windows.
I stand by my statement.
Eric, great article. Thanks, Paul
“More than once, I’ve seen newcomers install Hyper-V Server on a pair of SSDs and use their spinning disks to hold virtual machines. This is a horrible misapplication of disk resources. Hyper-V Server, or Windows Server with Hyper-V, is going to do a lot of disk churn when it’s first turned on and then it’s going to sit idle. It’s your virtual machines that need disk I/O.”
I guess I am not understanding the reason why this is a bad idea. Wouldn’t having the host on SSDs help it read the VMs? Dell has the new BOSS controller that is designed exactly for this purpose, as well as freeing up those two slots that would otherwise be used for the host.
Also, I need some advice. I just got a new server and I was wondering what the best practice is for disk layout for the VMs. Is it better to create one large virtual disk and put all the VMs inside, or to put each VM on its own virtual disk? As an example, I have eight 4 TB drives and roughly 4 important systems: 3 SQL databases and 1 file server, each using roughly 1.5 TB apiece, for now. Should I turn all eight 4 TB drives into one large 16 TB RAID-10 virtual disk and stick all the VMs in there, or should I split them into 4 RAID-10 virtual disks of 4 TB each and put each server on its own virtual disk?

I can see that the advantage of the large 16 TB array is that if any of them grow past 4 TB I can just reallocate accordingly; it is more dynamic. However, there seems to be a higher possibility of losing all the VMs if enough drives go down at once, and if I need to expand more than one machine, I have to back up and restore ALL the VMs. On the other hand, putting each VM in its own partition means that if I ever need to expand one past 4 TB, I can back up just that one VM, replace just those 2 physical disks with larger drives, and throw the VM back on, without even affecting the other VMs. I am just not sure how this works out for the performance of the RAID controller and its read and write abilities.
Your Thoughts?
No. Putting the hypervisor on an SSD will not help it retrieve data from spinning disks more quickly. I think you’re thinking of a system that uses SSDs as a cache for spinners. You can’t get there like this. Also, that’s not what the BOSS card does. It has its best use in systems where one big RAID can’t solve the problem for some reason.
Use a RAID build with the highest number of spindles possible. Multiple smaller RAID systems is a micromanagement hell that wastes IOPS potential. Divide your singular array into two or more big partitions, if that makes you feel better. Monitor for drive failures. Keep good backups. Reallocation should not be a thing that you do. Expansion to handle new drives is OK, of course.
Yet another superb article Eric! Learning so much from you about things that are rarely mentioned in books and videos. Thanks!