This session was extremely technical and covered the inner workings of HA. For more in-depth detail, I strongly suggest getting the VMware vSphere 5.1 Clustering Deepdive book.
- HA protects against three failure modes: host and VM failures; host network isolation and datastore PDL; and guest OS hangs and application crashes
- Datastore accessibility outages occur infrequently but have a large cost
- vSphere 5.0 introduced FDM, or Fault Domain Manager, which completely replaces the 4.x HA agent and software.
- Datastores are used for two purposes by HA: as a communication channel between FDM agents and as persistent storage for configuration information
- Heartbeat datastores: two are chosen for each host, enabling the master to detect VM power states.
- Best practice: Use the “leave powered on” host isolation response option
- In 5.0 U1, when a datastore enters Permanent Device Loss (PDL), guest I/O to it triggers the VM to be killed, and HA restarts it on a host that can access the datastore.
- Futures for HA:
  - Add support for All Paths Down (APD)
  - Triggered by PDL/APD declaration rather than guest I/Os
  - Full customization of responses
  - Full user interface and detailed reporting
  - VM placement sensitive to accessibility
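
To make the heartbeat datastore notes above a bit more concrete, here is a minimal Python sketch of how a master could combine network heartbeats with datastore heartbeats to tell a failed host from one that is only cut off from the network. This is a simplified toy model, not FDM's actual code: the names `HostState`, `HostObservation` and `classify_host` are made up for illustration, and the real agent uses more signals (pings, for example) and timeouts.

```python
from dataclasses import dataclass
from enum import Enum


class HostState(Enum):
    LIVE = "live"
    ISOLATED_OR_PARTITIONED = "isolated or partitioned"
    FAILED = "failed"


@dataclass
class HostObservation:
    # Hypothetical inputs as seen from the master's point of view.
    network_heartbeat: bool    # heartbeats still arrive over the management network
    datastore_heartbeat: bool  # host still updates its heartbeat datastores


def classify_host(obs: HostObservation) -> HostState:
    """Simplified decision: network first, datastore as the tie-breaker."""
    if obs.network_heartbeat:
        return HostState.LIVE
    if obs.datastore_heartbeat:
        # Host is alive but unreachable over the network, so its VMs
        # may still be running and should not be blindly restarted.
        return HostState.ISOLATED_OR_PARTITIONED
    # Neither channel shows signs of life: treat the host as failed.
    return HostState.FAILED


if __name__ == "__main__":
    print(classify_host(HostObservation(False, True)).value)
```

The point of the second channel is the middle case: without datastore heartbeats, the master could not distinguish a dead host from a network problem.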