vSphere 5 High Availability: Bring on the Blades
vSphere 5 has many new and exciting features. This post will concentrate on High Availability (HA) and how it affects blade designs. While HA is certainly not new, it has been rewritten from the ground up to be more scalable and flexible than ever. The old HA software was based on Automated Availability Manager (AAM), licensed from Legato, which is why HA had its own set of binaries and log files.
One of the problems with this now-legacy software was the method it used to track the availability of host resources. Prior to vSphere 5, HA used the concept of primary nodes: a maximum of five per HA cluster, chosen by an election process at boot time. These five primary nodes kept track of the cluster state so that, when an HA failover occurred, the virtual machines could be restarted on an available host in the cluster. Without the primary nodes, there was no visibility into the cluster state, so if all five primary nodes failed, HA could not function.
This was not usually an issue in rackmount infrastructures. However, it posed some challenges in a blade infrastructure, where a chassis failure can cause multiple blades to fail. Blade environments should typically have at least two chassis for failover reasons. If a single chassis provided all the resources for an HA cluster, that one chassis failure could cause an entire cluster outage. You'll see in the diagram below that using multiple chassis does not, by itself, mean the entire HA cluster is protected.
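How likely is this worst case? If we make the simplifying assumption that the boot-time election scatters the five primary nodes uniformly at random across the hosts (the real election order is not random in any documented sense, so treat this strictly as a back-of-the-envelope sketch), basic combinatorics puts a number on the risk:

```python
from math import comb

def p_all_primaries_one_chassis(hosts_per_chassis: int,
                                n_chassis: int,
                                primaries: int = 5) -> float:
    """Probability that all HA primary nodes land on a single chassis,
    assuming primaries are picked uniformly at random across all hosts
    (a simplifying assumption for illustration only)."""
    total_hosts = hosts_per_chassis * n_chassis
    # Any one of the n_chassis chassis could hold all the primaries.
    return n_chassis * comb(hosts_per_chassis, primaries) / comb(total_hosts, primaries)

# Two fully populated 8-blade chassis forming one 16-host cluster:
print(f"{p_all_primaries_one_chassis(8, 2):.1%}")  # roughly 2.6%
```

A ~2.6% chance per election may sound small, but it is a silent failure mode: nothing warns you that the cluster is one chassis failure away from losing HA entirely.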
In this case, two chassis are populated with blades, but the HA primary nodes all ended up on the same chassis. If that chassis were to fail, HA would not function and the virtual machines would not restart on the remaining hosts in the other chassis. The way to design around this scenario prior to vSphere 5 is depicted in the diagram below.
No more than four blades within a chassis should be part of the same HA cluster. This does not mean that the entire chassis cannot be populated; the remaining slots could be used for a second HA cluster. It does, however, hinder single-cluster scalability from a hardware perspective.
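The four-blades-per-chassis rule can be expressed as a simple placement check. The sketch below is a hypothetical validation helper (the host and chassis names are invented), not a VMware tool:

```python
from collections import Counter

MAX_PRIMARY_NODES = 5  # pre-vSphere 5 HA elects at most five primary nodes

def survives_chassis_failure(host_to_chassis: dict) -> bool:
    """True if no single chassis failure can take out every possible
    primary node of this HA cluster, i.e. every chassis holds fewer
    hosts than the maximum number of primary nodes (at most four)."""
    blades_per_chassis = Counter(host_to_chassis.values())
    return max(blades_per_chassis.values()) < MAX_PRIMARY_NODES

# 8 blades split 4/4 across two chassis is safe; a 5/3 split is not:
print(survives_chassis_failure(
    {f"esx{i:02d}": ("A" if i < 4 else "B") for i in range(8)}))  # True
print(survives_chassis_failure(
    {f"esx{i:02d}": ("A" if i < 5 else "B") for i in range(8)}))  # False
```

Checks like this were worth scripting into a design review, because nothing in pre-vSphere 5 HA itself enforced the rule.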
vSphere 5 HA
Some significant changes in vSphere 5 HA address this challenge. HA was completely rewritten as the Fault Domain Manager (FDM). The new HA software is baked into ESXi and no longer relies on the AAM binaries. The idea of primary nodes has been abandoned; in its place is a single "Master" node, with every other node in the cluster acting as a "Slave" node. All nodes in an FDM-based HA cluster keep track of the cluster state. The Master controls the distribution of cluster state information to the Slaves, but any node in the cluster can initiate the HA failover process. The failover process also includes electing a new Master in the event that the Master itself is the node that fails. As you can see from the diagram below, a chassis failure can no longer take out an entire HA cluster that is stretched across multiple chassis.
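The Master/Slave behavior described above can be sketched as a toy model. Note that the election rule used here (highest host name wins) is a deliberate simplification for illustration, not VMware's actual FDM election algorithm, and all host names are invented:

```python
class FdmCluster:
    """Toy model of FDM-style HA: one Master, all remaining hosts are
    Slaves, and every host tracks cluster state. Simplified sketch only."""

    def __init__(self, hosts):
        self.hosts = set(hosts)
        self.master = max(self.hosts)  # simplified stand-in for the election

    def fail_host(self, host):
        self.hosts.discard(host)
        if host == self.master and self.hosts:
            # The surviving Slaves elect a new Master, so HA keeps working
            # even when the failed chassis held the old Master.
            self.master = max(self.hosts)

cluster = FdmCluster(["esx01", "esx02", "esx03", "esx04"])
print(cluster.master)        # esx04
cluster.fail_host("esx04")   # the chassis holding the Master fails
print(cluster.master)        # esx03 -- a new Master takes over
```

The key contrast with the old model: losing the Master costs the cluster nothing but a re-election, whereas losing all five primary nodes left the old HA blind.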
The new FDM-based HA in vSphere 5 is much more resilient and allows large single clusters to scale in a blade environment. Blade architectures were certainly viable before, but now they can be utilized even more fully, without compromises when it comes to HA.