Safely virtualizing Windows Server 2012 Active Directory via Generation-ID

Windows Server 2012 generation ID is a great new feature that will allow use to safely virtualize a domain controller, on specific hypervisors. One of the really great features that hypervisors have had for ages is the ability to perform snapshots, then roll back to a prior state with a click of a mouse. Invaluable feature in both the lab, and in production.

I know during all my (failed) vSphere 5.1 installs I practically wore out the revert to snapshot button in vCenter. But, there is at least one class of VMs that you almost NEVER want to roll back from a snapshot with, those which are vector-clock synchronized software such as Active Directory.

Why is rolling back AD bad? I mean why is rolling back AD *REALLY* bad? Microsoft has these little things called USNs, or Update Sequence Numbers. A USN is an Active Directory database instance counter which gets incremented each time an update to AD is made. USNs are unique to each DC, and use a monotonically increasing value. USNs are used to determine what changes need to be replicated to other DCs.

When you revert to a snapshot a USN rollback occurs. What can happen if a USN rollback occurs? Lots of bad things, such as missing AD objects, wrong security group memberships, passwords are reset, and re-appearing AD objects. Also, DCs that are rolled back may accumulate many changes which never get replicated to other DCs. In short, the AD consistency of your forest is SHOT.  Starting with Windows Server 2003 SP1 and later, an event log ID 2095 is generated if a USN roll-back is detected, but it’s up to you to fix the mess. Microsoft has a great KB article here that goes into a lot more detail.

What has Microsoft done in Windows Server 2012 (and Windows 8) to address this problem? They’ve introduced a safeguard called a VM-Generation ID, which can be implemented by any hypervisor. This generation ID can be used by applications and operating systems to detect if a virtual machine has been rolled back in time, and take appropriate measures.

So what happens when AD detects that the Generation IDs have changed? First, it dumps the RID pool, then does a non-authoritative synchronization of the SYSVOL folder. AD replication is then re-established to other DCs, to bring the reverted DC back into a consistent state with the rest of the forest.

Sounds great right? Well it is, but only a very limited number of hypervisors support VM-Generation ID. As of this writing the hypervisors are Hyper-V 3.0, vSphere 5.0 U2, and vSphere 5.1. Since a USN rollback is quite unpleasant, you of course want to verify that WS2012 and your hypervisor are playing nice and using the Generation-ID feature. If you look in the Directory Service event log, you will see event ID 2168 and 2172. In the screenshots below they have the same Generation-ID, since the VM was not reverted to a previous snapshot.

To test out this new feature I fired up my vCenter 5.1 web console and took a snapshot of my WS2012 domain controller. After the snapshot completed, I created a new group on another DC, then reverted the WS2012 DC back to my snapshot. Let’s look in the event viewer and see what happened:

Yes, AD realized it was reverted back to a prior snapshot…

Microsoft even tells you that snapshots are not backups, and silly, use an AD aware backup program to restore AD.

And now life is almost good…

Let’s freshen up FRS a little bit while we are at it…

Nothing like a new database to start off the day with…

A touch of USN cleanup…

And a few minutes later, everything is back in sync! As you can see from the screenshots, Microsoft is very verbose in the logs on exactly what is happening and why. In a very large forest with a lot of DCs the recovery process could take longer.

So under what circumstances does the Generation-ID change and not change? Here’s a list:

Generation-ID NOT changed when:
VM is paused or resumed
VM is rebooted
VM host reboots
VM is vMotioned/Live Migrated

Generation-ID IS changed when:
VM starts executing a snapshot
VM is recovered from a backup
VM is failed over in a disaster recovery environment
VM is imported, copied, or cloned

This feature alone should be a huge driver for deploying WS2012 based DCs on all of your hypervisors. Never thought I’d say this..but happy snapshotting your domain controllers! For even more detailed information on virtualized domain controllers, Microsoft has a great series of articles here you can read.

P.S. This feature does NOT work with array-based snapshots. The hypervisor tracks and creates the new Generation-IDs. So DO NOT revert a domain controller back to a prior state by reverting to a previous snapshot that your array created vice your hypervisor. With the forthcoming VVOLS in vSphere .Next, Generation-ID could be supported with hardware-snapshot offloads but we will have to wait and see if that’s the case.

Related Posts

Subscribe
Notify of
1 Comment
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
v88
July 14, 2017 12:49 pm

Hey Derek,

Does Nutanix replication or a storage array based replication from site A-Primary to site B-DR (for DR purposes) changes this Generation-ID on destination when the DC is brought up in site B during disaster? Is it a supported solution for a domain controller (storage level replication for a DC disaster recovery)?