I recently had a client call me in a panic with their entire environment down. It was a real-world example of a lesson learned the hard way.
This particular client was fully on the virtualization path and had begun to virtualize pretty much everything in sight. The hypervisor they had chosen for the task was Microsoft’s Hyper-V. They ran multiple Hyper-V hosts with shared storage, redundant network connections to VLANs and the like. The environment was built out with enough resources and had been happily chugging along for an extended period of time, providing resources to infrastructure, application and business-critical virtual machines.
During a routine building power event, they were forced to shut down all systems. After the power event, the Hyper-V hosts powered up but were unable to access the shared storage they needed to power up any virtual machines. Here is why:
- All Domain Controllers in the environment had been virtualized and moved to the highly available Cluster Shared Volumes (CSVs) on the shared storage.
- Hyper-V leverages Cluster File Services to mount and access Cluster Shared Volumes.
- Cluster File Services uses an Active Directory Service Account for permissions and access.
- The Domain Controllers (on the CSVs) were powered down, so they could not authenticate the Cluster File Services’ service account. Without that authentication the CSVs could not be mounted, and without the CSVs the Domain Controllers could not be powered back on. And around and around we go.
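To make the loop easier to see, here is a minimal Python sketch that models the startup dependencies above as a graph and looks for a cycle. The component names are hypothetical labels I made up for illustration. They are not real Hyper-V or Active Directory identifiers, and nothing here queries a live environment.

```python
from typing import Dict, List, Optional

# Hypothetical labels: each key needs everything in its list to be
# available before it can start.
DEPENDS_ON: Dict[str, List[str]] = {
    "domain_controller_vm": ["cluster_shared_volume"],
    "cluster_shared_volume": ["service_account_auth"],
    "service_account_auth": ["domain_controller_vm"],
}


def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of nodes, or None if there is none."""
    path: List[str] = []  # nodes on the current depth-first path
    done = set()          # nodes already proven cycle-free

    def visit(node: str) -> Optional[List[str]]:
        if node in done:
            return None
        if node in path:
            # We walked back onto our own path: that is the loop.
            return path[path.index(node):] + [node]
        path.append(node)
        for dep in graph.get(node, []):
            cycle = visit(dep)
            if cycle:
                return cycle
        path.pop()
        done.add(node)
        return None

    for start in graph:
        cycle = visit(start)
        if cycle:
            return cycle
    return None


if __name__ == "__main__":
    cycle = find_cycle(DEPENDS_ON)
    print(" -> ".join(cycle) if cycle else "no circular dependency")
```

Any cycle that includes the Domain Controllers means the environment cannot cold start on its own after a full shutdown.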
Obviously, the easiest solution is to always keep a physical domain controller up and running in the environment (it is also useful as a reliable time source). If you are resource constrained or just strongly pro-virtualization, you can instead make sure NOT to put your only Domain Controllers on the shared storage volumes. You can easily keep one on each Hyper-V host’s local storage. Even with the entire domain down, you can always log in locally to the Hyper-V host and power up the locally stored Domain Controller virtual machine.
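If you keep any kind of VM inventory, this rule is easy to check automatically. The sketch below is only an illustration. The `VM` record, the `"local"` and `"csv"` storage labels, and the VM and host names are all assumptions of mine, not anything exported by Hyper-V. It simply reports whether at least one Domain Controller could be started without the shared storage.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VM:
    name: str
    host: str
    is_domain_controller: bool
    storage: str  # hypothetical labels: "local" or "csv"


def bootstrap_dcs(vms: List[VM]) -> List[str]:
    """Names of Domain Controller VMs that live on a host's local storage."""
    return [
        vm.name
        for vm in vms
        if vm.is_domain_controller and vm.storage == "local"
    ]


def can_cold_start(vms: List[VM]) -> bool:
    """True if at least one DC can be powered on without the shared storage."""
    return len(bootstrap_dcs(vms)) > 0


if __name__ == "__main__":
    # Made-up inventory resembling the client's layout before the outage.
    inventory = [
        VM("dc01", "hv-host-1", True, "csv"),
        VM("dc02", "hv-host-2", True, "csv"),
        VM("app01", "hv-host-1", False, "csv"),
    ]
    if can_cold_start(inventory):
        print("OK: bootstrap DCs:", ", ".join(bootstrap_dcs(inventory)))
    else:
        print("WARNING: every Domain Controller is on shared storage")
```

Run against the made-up inventory above, the check prints the warning. Moving either DC to local storage on its host makes it pass.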
A side note:
Those more familiar with VMware could easily overlook this, since it is not an issue there. Although vCenter authentication is handled by Active Directory, the hosts running ESX do not depend on AD service accounts for any host operations, including accessing the shared storage and power operations on the VMs.
Fortunately for this client, they were able to restore a DC to local storage and get the environment back up.