
VCE, EMC, VMware, Cisco and TBL: Innovating Together

Limelight Private Cloud as a Service

With Limelight PCaaS – Private Cloud as a Service – TBL has developed strategic alliances with VCE, EMC, VMware and Cisco to bring you access to the most flexible and reliable technology and infrastructure possible, at a per-user monthly fee and with no capital expenditures.

With Limelight PCaaS, we manage an advanced network and infrastructure solution for you. We take care of maintenance, updates, backups, integrations, configurations and more, all on equipment that sits at your location. By having your own private cloud, you can have the peace of mind of predictable costs and security with none of the hassle.

Through these strategic alliances with VCE, EMC, VMware and Cisco, you’ll seamlessly utilize the best in server infrastructure, desktop virtualization, and collaboration, enabling you to focus on your business, customers, and service while TBL focuses on serving your organization with Limelight PCaaS.

Learn More About Limelight PCaaS

Stretched Clusters: Use Cases and Challenges Part I – HA

I have been hearing a lot of interest from my clients lately about stretched vSphere clusters. I can certainly see the appeal from a simplicity standpoint. At least on the surface. Let’s take a look at the perceived benefits, risks, and the reality of stretched vSphere clusters today.

First, let’s define what I mean by a stretched vSphere cluster. I am talking about a vSphere (HA/DRS) cluster where some hosts exist in one physical datacenter and some hosts exist in another physical datacenter. These datacenters can be geographically separated or even on the same campus. Some of the challenges will be the same regardless of the geographic location.

To keep things simple, let’s look at a scenario where the cluster is stretched across two different datacenters on the same campus. This is a scenario that I see attempted quite often.

 

[Diagram: a vSphere cluster stretched across two datacenters on the same campus]

 

This cluster is stretched across two datacenters. For this example let’s assume that each datacenter has an IP-based storage array that is accessible to all the hosts in the cluster and the link between the two datacenters is Layer 2. This means that all of the hosts in the cluster are Layer 2 adjacent. At first glance, this configuration may be desirable because of its perceived elegance and simplicity. Let’s take a look at the perceived functionality.

  • If either datacenter has a failure, the VMs should be restarted on the other datacenter’s hosts via High Availability (HA).
  • No need for manual intervention or something like Site Recovery Manager.

Unfortunately, perceived functionality and actual functionality differ in this scenario. Let’s take a look at an HA failover scenario from a storage perspective first.

  • If virtual machines fail over from hosts in one datacenter to hosts in the other datacenter, their storage will still be accessed from the originating datacenter.
  • This means the surviving hosts will be accessing storage that is not local to their datacenter, as shown in the diagram below.

[Diagram: failed-over VMs in the surviving datacenter still accessing storage in the originating datacenter]

This situation is not ideal in most cases, especially if the original datacenter is completely isolated, because then the storage cannot be accessed at all. Let’s take a look at what happens when one datacenter loses communication with the other datacenter, but not with its own local hosts. This is depicted in the diagram below.

[Diagram: the inter-datacenter link fails while each datacenter retains connectivity to its own hosts]

  • Prior to vSphere 5.0, if the link between the datacenters went down, or some other communication disruption happened at this point in the network, each set of hosts would think that the other set was down. This is a problem because each datacenter would attempt to bring up the other datacenter’s virtual machines. This is known as a split-brain scenario.
  • As of vSphere 5.0, each datacenter would form its own network partition from an HA perspective and proceed to operate as two independent clusters (although with some limitations) until connectivity was restored between the datacenters.
  • However, this scenario is still not ideal because of how the storage is accessed.

So what can be done? Well, beyond VM-to-Host affinity rules, if the sites are truly to be active/standby (with the standby site perhaps running lower-priority VMs), the cluster should be split into two different clusters, and perhaps even into different vCenter instances (one for each site) if Site Recovery Manager (SRM) will be used to automate the failover process. If there is a use case for a single cluster, then external technology needs to be used. Specifically, the storage access problem can be addressed by using a technology like VPLEX from EMC. In short, VPLEX allows one to have a distributed (across two datacenters) virtual volume that can be used for a datastore in the vSphere cluster. This is depicted in the diagram below.

 

[Diagram: an EMC VPLEX distributed virtual volume backing a shared datastore across both datacenters]

A detailed explanation of VPLEX is beyond the scope of this post. At a high level, the distributed volume can be accessed by all the hosts in the stretched cluster. VPLEX keeps track of which virtual machines should be running on the local storage that backs the distributed virtual volume. In the case of a complete site failure, VPLEX can determine that the virtual machines should be restarted on the underlying storage that is local to the other datacenter’s hosts.

Technology is bringing us closer to location aware clusters. However, we are not quite there yet for a number of use cases as external equipment and functionality tradeoffs need to be considered. If you have the technology and can live with the functionality tradeoffs, then stretched clusters may work for your infrastructure. The simple design choice for many continues to be separate clusters.
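If you do choose a single stretched cluster, the VM-to-Host affinity rules mentioned above can at least keep each site’s VMs on that site’s hosts during normal operation. Here is a minimal sketch of creating a “should run” affinity rule, assuming pyVmomi; the vCenter address, the cluster name “StretchedCluster”, and the “esx-a” / “app-a” naming conventions are hypothetical placeholders.

```python
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim
import ssl

# Connect to vCenter (address and credentials are placeholders).
ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="secret", sslContext=ctx)
content = si.RetrieveContent()

# Locate the stretched cluster by name.
cl_view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.ClusterComputeResource], True)
cluster = next(c for c in cl_view.view if c.name == "StretchedCluster")
cl_view.Destroy()

# Assume site A hosts and VMs follow a naming convention.
site_a_hosts = [h for h in cluster.host if h.name.startswith("esx-a")]
vm_view = content.viewManager.CreateContainerView(cluster, [vim.VirtualMachine], True)
site_a_vms = [v for v in vm_view.view if v.name.startswith("app-a")]
vm_view.Destroy()

# Build host/VM groups plus a non-mandatory VM-to-Host rule in one reconfigure call.
spec = vim.cluster.ConfigSpecEx(
    groupSpec=[
        vim.cluster.GroupSpec(operation="add",
                              info=vim.cluster.HostGroup(name="SiteA-Hosts",
                                                         host=site_a_hosts)),
        vim.cluster.GroupSpec(operation="add",
                              info=vim.cluster.VmGroup(name="SiteA-VMs",
                                                       vm=site_a_vms)),
    ],
    rulesSpec=[
        vim.cluster.RuleSpec(operation="add",
                             info=vim.cluster.VmHostRuleInfo(
                                 name="SiteA-VMs-should-run-on-SiteA-Hosts",
                                 enabled=True,
                                 mandatory=False,  # "should" rule, not "must"
                                 vmGroupName="SiteA-VMs",
                                 affineHostGroupName="SiteA-Hosts"))
    ])
cluster.ReconfigureComputeResource_Task(spec=spec, modify=True)
Disconnect(si)
```

A non-mandatory (“should”) rule still lets HA restart the VMs on the other site’s hosts during a failure, whereas a mandatory (“must”) rule would prevent that.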

Running a Lean Branch Office with the Cisco UCS Express

Centralized management brings organizations more control over resources with fewer equipment assets in the field. There are many cases where equipment may be needed in a branch office to speed access time to a resource or eliminate the dependency on a network link to the central datacenter. It is very common to see at least one, if not multiple, servers at the branch office to provide file/print services or user authentication. Perhaps the servers are providing some service that is specialized to a particular business (banking applications come to mind here). Whatever service is being provided, sometimes it is better to maintain local access at the branch. So there are servers to maintain at the branch office, as well as networking gear and other such devices.

What if you could consolidate your branch office services with your router? That is exactly what the Cisco UCS Express is meant to do. The UCS Express is a Services-Ready Engine (SRE) module that works in Integrated Services Router Generation 2 (ISR G2) routers. This module is a server that you can run VMware ESXi on to provide branch office services. Here is an example of an ISR G2 device:

 

Cisco UCS Express ISR G2 port schematics

 

The slots you see at the bottom of the device are where the SRE UCS Express modules are located. A UCS Express module is shown below.

 

Cisco UCS Express main schematics

 

Here are a couple of the highlights of this architecture:

  • (1) or (2) 500 GB hot-swap drive options are available
  • (1) or (2) core CPU options are available
  • 4 or 8 GB of RAM is available
  • Hardware iSCSI initiator offload is available if you need to connect to an external iSCSI device
  • Direct SRE-to-LAN connectivity reduces cabling
  • Maintenance is covered under SMARTnet

This architecture provides all that a branch office may need by virtualizing several branch office services onto the SRE UCS Express Module. The ESXi instance can be managed centrally by your existing vCenter installation. This gives you the benefits of local service access and centralized management while reducing the equipment needs at the branch office. Pretty slick.
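As a rough illustration of the centralized-management point, the branch ESXi instance on the UCS Express module could be registered with the central vCenter programmatically. This is only a sketch, assuming pyVmomi; the vCenter, datacenter and branch host names are hypothetical.

```python
from pyVim.connect import SmartConnect
from pyVmomi import vim
import ssl

ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.hq.example.com", user="administrator@vsphere.local",
                  pwd="secret", sslContext=ctx)
content = si.RetrieveContent()

# Find the datacenter object that holds branch-office hosts.
dc = next(d for d in content.rootFolder.childEntity
          if isinstance(d, vim.Datacenter) and d.name == "BranchOffices")

# Connection details for the ESXi instance running on the SRE module.
connect_spec = vim.host.ConnectSpec(hostName="esxi-branch01.example.com",
                                    userName="root",
                                    password="secret",
                                    force=False)

# Add the branch host as a standalone host under the datacenter.
# (In practice vCenter may also require the host's SSL thumbprint in the spec.)
dc.hostFolder.AddStandaloneHost_Task(spec=connect_spec, addConnected=True)
```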

If you would like to discuss how this architecture might be able to help your organization or want further technical details, please feel free to contact me.

Memory Management in vSphere – Where We Are Today

This is a quick post to discuss where vSphere stands with memory management today. vSphere has many mechanisms for reclaiming memory before resorting to paging to disk. Let’s briefly look at these methods.

 

Memory Reclamation

  • Transparent Page Sharing (TPS)
    • Think of this as deduplication for memory. Identical pages of memory are shared among many VMs instead of storing a separate copy of that same page for every VM. This can have a tremendous impact on the amount of RAM used on a given host if there are many identical pages.
  • Ballooning
    • This method increases memory pressure inside the guest so that memory that is not being actively used can be reclaimed. If the hypervisor were to simply start taking memory pages away from guests, the guest operating systems would not react well to that. So, ballooning is a way to place artificial pressure on the guest VM so that the guest itself pages out unused memory to disk. Then, the hypervisor can reclaim that memory without disrupting the guest OS.
  • Memory compression
    • This method attempts to compress memory pages that would normally be swapped out via hypervisor swapping. This is preferable to swapping, as there can be a performance impact when memory is swapped to disk.
  • Hypervisor swapping
    • This is the last resort for memory management. The memory pages are swapped to disk. New in vSphere 5 is support for swapping these memory pages to SSDs, which improves performance when swapping is unavoidable.

As you can see, there are many memory management techniques in vSphere that allow for greater consolidation ratios. The hypervisor in the virtual infrastructure does much more than just host guest VM images. There is a lot going on under the hood to consider before choosing a specific hypervisor to serve as the foundation for your infrastructure. Feel free to contact me if you would like to discuss any of the “under the hood” features of vSphere.
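To see these reclamation techniques at work in your own environment, the per-VM counters that vCenter already tracks can be pulled programmatically. Here is a minimal sketch, assuming pyVmomi; the vCenter address and credentials are placeholders.

```python
from pyVim.connect import SmartConnect
from pyVmomi import vim
import ssl

ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="secret", sslContext=ctx)
content = si.RetrieveContent()

view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.VirtualMachine], True)
for vm in view.view:
    qs = vm.summary.quickStats
    # sharedMemory, balloonedMemory and swappedMemory are reported in MB;
    # compressedMemory is reported in KB.
    print(f"{vm.name}: shared={qs.sharedMemory} "
          f"ballooned={qs.balloonedMemory} "
          f"compressed={qs.compressedMemory} "
          f"swapped={qs.swappedMemory}")
view.Destroy()
```

Watching these counters over time shows which reclamation stage a host is leaning on long before it ever resorts to hypervisor swapping.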

Security in a Virtualized World

For our August Lunch & Learn presentation “Security in a Virtualized World,” TBL’s virtualization expert Harley Stagner was joined by Bryan Miller of Syrinx Technologies. The speakers each brought their own unique perspectives to the subject matter, as Harley’s job is to help clients build virtual infrastructures, and Bryan’s job is to see if he can break into them (with his clients’ permission, of course).

Harley and Bryan discussed the challenges of managing a virtualized infrastructure while maintaining system security.  Focus areas included patching issues and best practices for auditing and hardening your system.  In addition, Harley and Bryan covered  vulnerabilities that are germane to virtualized environments, while also reviewing the newest compliance guidelines.

To conclude, Harley and Bryan provided a live demonstration of vMotion and explained how it can be exploited. Bryan demonstrated how, during a typical vMotion session, information can easily be exposed if proper security measures are not taken.

A big TBL “Thank You” to all those who attended the sessions in Virginia Beach and Richmond in August. In September, TBL Networks is participating in two very exciting events. On Wednesday, September 14, TBL is hosting an exclusive Lunch & Learn in Virginia Beach entitled “Cisco Collaboration Meets Virtualization.” This presentation will be led by TBL Networks’ CCIE and Collaboration Practice Lead Engineer Patrick Tredway. For our Richmond readers, please join us at RichTech’s Annual TechLinks Golf Tournament on Monday, September 12th. TBL Networks is serving as a Reception Sponsor for this event, and we look forward to meeting everyone in the technology field in Central Virginia for a great day of golf and socializing.


For more information on Bryan Miller and Syrinx Technologies, please visit http://www.syrinxtech.com

vSphere 5 High Availability: Bring on the Blades

vSphere 5 has many new and exciting features. This post will concentrate on High Availability (HA) and how it affects blade designs. While HA is certainly not new, it has been rewritten from the ground up to be more scalable and flexible than ever. The old HA software was based on Automated Availability Manager (AAM) licensed from Legato. This is why HA had its own set of binaries and log files.

One of the problems with this “now legacy” software was the method it used to track the availability of host resources. HA prior to vSphere 5 used the concept of primary nodes. There were a maximum of (5) primary nodes per HA cluster. These nodes were chosen by an election process at boot time. The (5) primary nodes kept track of the cluster state so that when an HA failover occurred, the virtual machines could restart on an available host in the cluster. Without the primary nodes, there was no visibility into the cluster state. So, if all (5) primary nodes failed, HA could not function.

This was not usually an issue in rackmount infrastructures. However, it posed some challenges in a blade infrastructure, where a chassis failure can cause multiple blades to fail. Blade environments should typically have at least two chassis for failover reasons. If there was only a single chassis providing resources for an HA cluster, a failure of that single chassis could cause an entire cluster outage. You’ll see in the diagram below that just because multiple chassis are used does not mean that the entire HA cluster is protected.

 

[Diagram: all (5) HA primary nodes landing on blades in the same chassis]

 

In this case, two chassis are used and populated with blades. However, the HA primary nodes all ended up on the same chassis. If that chassis were to fail, HA would not function and the virtual machines would not restart on the remaining hosts in the other chassis. The way to design around this scenario prior to vSphere 5 is depicted in the diagram below.

 

[Diagram: no more than (4) blades per chassis participating in the same HA cluster across two chassis]

 

No more than (4) blades should be part of the same HA cluster within a chassis. This does not mean that the entire chassis cannot be populated. The remaining slots in the chassis could be used for a second HA cluster. This scenario hinders single cluster scalability from a hardware perspective.

 

vSphere 5 HA

Some significant changes were made in vSphere 5 HA that address this challenge. HA was completely rewritten as Fault Domain Manager (FDM). The new HA software is baked into ESXi and does not rely at all on the AAM binaries. The idea of primary nodes has been abandoned. In its place is the concept of a single “Master” node and many “Slave” nodes (every other node in the cluster). All of the nodes in an FDM-based HA cluster can keep track of the cluster state. The “Master” node controls the distribution of cluster state information to the “Slave” nodes. However, any node in the cluster can initiate the HA failover process. The new HA failover process also includes electing a new “Master” node in the event that the Master itself is the node that fails. As you can see from the diagram below, a chassis failure can no longer take out an entire HA cluster that is stretched across multiple chassis.

 

[Diagram: the FDM “Master” and “Slave” nodes spread across multiple chassis]

 

The new FDM HA in vSphere 5 is much more resilient and allows the scaling of large single clusters in a blade environment. While blade architectures were certainly viable before, now those architectures can be utilized even more fully without compromises when it comes to HA.
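If you want to confirm how FDM has laid out the roles across your blades, each host reports its HA state through the API. Here is a quick sketch, assuming pyVmomi; the cluster name “BladeCluster” is a hypothetical placeholder.

```python
from pyVim.connect import SmartConnect
from pyVmomi import vim
import ssl

ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="secret", sslContext=ctx)
content = si.RetrieveContent()

view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.ClusterComputeResource], True)
cluster = next(c for c in view.view if c.name == "BladeCluster")
view.Destroy()

for host in cluster.host:
    das = host.runtime.dasHostState
    # dasHostState.state is e.g. 'master', 'connectedToMaster',
    # 'networkPartitionedFromMaster' or 'networkIsolated'.
    state = das.state if das else "unknown (HA not configured?)"
    print(f"{host.name}: {state}")
```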

A Structured Virtual Infrastructure Approach Part IV: Shared Storage Options

The storage platform serves as the most important foundational piece of a virtual infrastructure. There are certainly many options to choose from, and those options generally fall into two main categories: block storage and file system storage. Let’s take a look at these two categories.

Block Storage

This method of providing shared storage to a VMware cluster has been supported the longest. At its core, block storage presents a set of physical disks as a logical disk to a host (ESX server in this case). This is a very well understood method of providing storage for the virtual infrastructure. There are a couple of protocols that we can use to provide this type of storage: Fibre Channel and iSCSI.

Fibre Channel

  • Fibre Channel uses a dedicated Fibre Channel fabric to provide connectivity for the storage.
  • Fibre Channel was built from the ground up as a storage protocol.
  • Fibre Channel is the most mature protocol for block storage presentation.

iSCSI

  • iSCSI can use the same Ethernet fabric as your LAN traffic. However, it is best to use a separate fabric dedicated to storage.
  • iSCSI is an IP based storage protocol that utilizes the existing TCP/IP stack.
  • iSCSI is a relatively new protocol for block storage.

The protocol chosen to support the Block Storage infrastructure will depend on a number of factors that go beyond the scope of this post. I will say that if you are building a greenfield environment with the Cisco UCS, it is already utilizing Fibre Channel over Ethernet. So, choosing a Fibre Channel fabric is a good way to go.
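If iSCSI is the route chosen, enabling the software iSCSI initiator and pointing it at the array’s discovery address can be done through the API as well. This is only a sketch, assuming pyVmomi; the host name and target address are placeholders, and a production design would also bind dedicated VMkernel ports to the initiator.

```python
from pyVim.connect import SmartConnect
from pyVmomi import vim
import ssl

ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="secret", sslContext=ctx)
content = si.RetrieveContent()

view = content.viewManager.CreateContainerView(content.rootFolder, [vim.HostSystem], True)
host = next(h for h in view.view if h.name == "esx01.example.com")
view.Destroy()

storage = host.configManager.storageSystem
storage.UpdateSoftwareInternetScsiEnabled(True)   # turn on the software iSCSI initiator

# Find the software iSCSI HBA and point it at the array's discovery address.
hba = next(a for a in storage.storageDeviceInfo.hostBusAdapter
           if isinstance(a, vim.host.InternetScsiHba) and a.isSoftwareBased)
target = vim.host.InternetScsiHba.SendTarget(address="192.168.50.10", port=3260)
storage.AddInternetScsiSendTargets(iScsiHbaDevice=hba.device, targets=[target])
storage.RescanAllHba()
```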

File System Storage

This method of providing shared storage to a VMware cluster utilizes the Network File System (NFS) protocol. It doesn’t present a logical disk to the host server; the host simply connects to a network share or mount point over the IP network. This certainly offers some advantages in reduced management complexity, at the cost of file system overhead compared with presenting a block disk to the host.
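For the file system approach, presenting an NFS export as a datastore to every host in the cluster is a single API call per host. Below is a minimal sketch, assuming pyVmomi; the NFS server name and export path are hypothetical.

```python
from pyVim.connect import SmartConnect
from pyVmomi import vim
import ssl

ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="secret", sslContext=ctx)
content = si.RetrieveContent()

view = content.viewManager.CreateContainerView(content.rootFolder, [vim.HostSystem], True)
for host in view.view:
    spec = vim.host.NasVolume.Specification(
        remoteHost="vnx-nas.example.com",   # NFS server (e.g. a VNX Data Mover)
        remotePath="/vsphere_datastore01",  # exported file system
        localPath="vsphere_datastore01",    # datastore name as seen by vSphere
        accessMode="readWrite")
    host.configManager.datastoreSystem.CreateNasDatastore(spec)
view.Destroy()
```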

Which to Choose?

There are block-only storage devices and there are file system-only storage devices. So, which one should you choose?

It was a trick question. You shouldn’t have to choose. There are unified storage arrays (like the EMC VNX series) that offer Fibre Channel, iSCSI, NFS and CIFS from the same array. This is definitely a good way to go for today’s needs and tomorrow’s scalability. We have discussed the methods for providing VMware-specific storage, but I want to focus on one more protocol option in this post. Common Internet File System (CIFS) is the protocol used for Windows file serving. Most of the clients I deal with have Windows file servers with large hard drives. This can introduce some challenges when virtualizing those servers.

  • A single VMFS datastore (without resorting to extents) can only be 2 TB minus 512 bytes. This means a single file system needs to fit within that limit.
  • Also, large VMDK files are more difficult to manage.
  • Why virtualize the file server at all?

I typically recommend that the files from the file servers be consolidated onto the unified storage array. This offers several advantages:

  • The file system can expand without restrictions placed on it by Windows.
  • There are fewer Windows Servers to patch and maintain if the files are consolidated onto the array.
  • No need to take resources on the VMware infrastructure if the files are consolidated onto the array.

If the infrastructure typically uses Linux file servers, NFS can be used for the same purpose.

Block-only or File System-only storage arrays hinder flexibility in the infrastructure. Shouldn’t the storage platform be as flexible as the virtual infrastructure can be?

Two TBL Engineers Named vExperts 2011

TBL Networks is very proud to announce that two of our Data Center Engineers, Sean Crookston and Harley Stagner, have been named as vExperts 2011 by VMware. Sean and Harley received this designation as recognition of their contributions to the VMware, virtualization, and cloud computing communities.

According to VMware, “the vExperts are people who have gone above and beyond their day jobs in their contributions to the virtualization and VMware user community. vExperts are the bloggers, the book authors, the VMUG leaders, the tool builders and town criers, the tinkerers and speakers and thinkers who are moving us all forward as an IT industry.”

TBL Networks’ Solutions Engineer Sean Crookston also has the title of VMware Certified Advanced Professional in Data Center Administration (VCAP-DCA). Sean is only the 47th person worldwide to achieve this elite virtualization certification.

TBL Networks’ Account Engineer Harley Stagner is the first VMware Certified Design Expert (VCDX) in Virginia, and just the 46th person worldwide to hold this title.  The VCDX is the highest certification available from VMware.

Congratulations again to Sean Crookston and Harley Stagner – vExperts 2011.

A Structured Virtual Infrastructure Approach Part III: Compute Platform Software

In Part II of the Structured Virtual Infrastructure Approach Series, we explored the Cisco Unified Computing System (UCS) hardware. This post will explore the UCS management software. Up to 20 chassis can be managed with a single instance of the UCS Manager. The UCS Manager is included with the 6100 series fabric interconnects. All of the blades in the infrastructure can be managed through this single interface. Below, we’ll discuss some of the features that make this interface unique among compute platform management interfaces.

Complete Compute Infrastructure Management

  • All the chassis / blades (up to 20 chassis worth) are managed in this single interface.
  • The management is not “per chassis” like legacy blade systems.
  • Consolidated management means efficient management for the entire compute platform.

Service Profiles

  • All of the items that make a single server (blade) unique are abstracted into a Service Profile.
  • This may include WWNs, MACs, BIOS settings, boot order, firmware revisions, etc.
  • WWNs and MACs are pulled from pools that can be defined.
  • Even the KVM management IPs are pulled from a pool, so the administrator does not have to manage those IPs at all.
  • You can create a Service Profile template with all of these characteristics and create Service Profiles from the template.
  • When you need to deploy a new blade, all of the unique adjustments are already completed by the Service Profile template.
  • With the M81KR Virtual Interface Card (VIC), the number of interfaces assigned to a blade can be defined in the Service Profile template.
  • Even though a single mezzanine card in a blade will only have (2) 10 Gb ports, the M81KR VIC allows you to define up to 56 FC/Ethernet interfaces. This allows for more familiar vSphere networking setups like the one below:

[Diagram: a vSphere networking setup using multiple vNICs/vHBAs carved out of the M81KR VIC]

The diagram above is a setup that can be used with the Cisco Nexus 1000v. It would be impossible to do this setup on the UCS B-Series without the M81KR VIC. We’ll explore why a networking setup like this may be necessary when we get to the vSphere specific posts in this series.

Role Based Access Control

  • Even though the components are converging (storage, compute, networking) the different teams responsible for those components can still maintain access control for their particular area of responsibility.
  • The UCS manager has permissions that can be applied in such a way that each team only has access to the administrative tab(s) that they are responsible for.
  • Network team –> Network Tab, Server Team –> Server Tab, Storage Team –> Storage Tab.
  • The UCS manager also supports several different authentication methods, including local and LDAP based authentication.

What vSphere does for operating system instances, the UCS does for the blade hardware. It abstracts the unique configuration items into a higher software layer so that they can be managed more easily from a single location. The next post in this series will take a look at some storage platform hardware. It’s not just about carving out disks for the virtual infrastructure any longer. We’ll take a look at some options that integrate well with a modern vSphere infrastructure.
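For a sense of what that single point of management looks like programmatically, Cisco also publishes a Python SDK for UCS Manager. The sketch below assumes the ucsmsdk package; the UCS Manager address, credentials, and the exact attribute names shown are best-effort placeholders rather than a definitive reference.

```python
# A hedged sketch assuming Cisco's ucsmsdk Python SDK (pip install ucsmsdk).
from ucsmsdk.ucshandle import UcsHandle

handle = UcsHandle("ucsm.example.com", "admin", "secret")
handle.login()

# Service Profiles (class lsServer) abstract WWNs, MACs, BIOS settings, boot order, etc.
for sp in handle.query_classid("lsServer"):
    print(sp.dn, sp.name, sp.assoc_state)

# Physical blades across all managed chassis appear under class computeBlade.
for blade in handle.query_classid("computeBlade"):
    print(blade.dn, blade.model, blade.serial)

handle.logout()
```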

Cisco Expands UC Virtualization Support

Stand back… this is a pretty big announcement! As of June 7, 2011, Cisco began supporting some Collaboration (formerly Unified Communications) applications running in a virtual environment on hardware other than its own Unified Computing System (UCS). This is the first of hopefully many steps toward widening support for the benefits we often realize with typical desktop and server applications running on a VMware hypervisor. The details are as follows.

 

Cisco is pleased to announce expanded virtualization of Cisco Unified Communications starting Jun 7, 2011.

On Jun 7 Cisco will add two additional virtualized UC offers. Customers will then have three deployment options:

1. UC on UCS – Tested Reference Configurations

2. UC on UCS – Specs-based VMware hardware support

3. HP and IBM – Specs-based VMware hardware support

Phase 1 support begins Jun 7, 2011 and should include the following (see www.cisco.com/go/uc-virtualized for final products and versions supported):

– Cisco Unified Communications Manager 8.0.2+ and 8.5.1

– Cisco Unified Communications Manager – Session Management Edition 8.5.1

– Cisco Unified Communications Management Suite

– Cisco Unity Connection 8.0.2+ and 8.5.1

– Cisco Unity 7.0.2+ (with Fiber Channel SAN only)

– Cisco Unified Contact Center Express and IP IVR 8.5.1

Support for additional products and versions will phase in over the rest of CY11.

Specs-based VMware hardware support adds the following

– UC Compute support for UCS, HP, IBM servers on VMware’s hardware compatibility list and running Intel Xeon 5600 / 7500 family CPUs

– UC Network support for 1Gb through 10Gb NIC, CNA, HBA and Cisco VIC adapters that are supported by above servers

– UC Storage support for DAS, SAN (Fiber Channel, iSCSI, FCoE) and NAS (NFS).

– More co-resident UC VMs per physical server if more powerful CPUs are used

– Note that UC / non-UC / 3rd-party co-residency is still not supported.

– Note that hardware oversubscription is still not supported by UC.

– No changes to VMware product, version or feature support by UC

 

This most certainly gives us far more agility for the manner in which we deploy these applications. More info to come as I get it…