Halloween Cookbook: Tips and tricks to roust goblins from your datacenter

If you’re an existing user of CloudPhysics, here is a brief tutorial on how you can roust goblins from your virtual datacenter using CloudPhysics analytics.

Note: CloudPhysics provides powerful filters, so all of these checks can be run across all your vCenters or on any small slice of your infrastructure.

Availability – Health Checks

Finding HA configuration issues:

The CloudPhysics HA Cluster Health Check card provides a quick overview of your clusters’ health state. According to our global dataset, 40% of HA clusters do not have admission control enabled. Without admission control, VMware HA cannot guarantee that all the VMs on your cluster can be powered on after a host failure. To check whether your clusters have this setting configured correctly, click on the HA Cluster Health card, then choose your vCenter and the HA cluster.

HA Cluster Health
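
To make the risk concrete, here is a minimal sketch of the kind of question admission control answers: if the largest host fails, is there still enough memory for every VM? This is a simplified aggregate check, not VMware's actual admission control calculation and not anything the card exposes. The function name and the input lists are hypothetical, and the check ignores CPU and whether each individual VM fits on a single surviving host.

```python
def survives_host_failure(host_memory_gb, vm_memory_gb, failures=1):
    """Return True if total VM memory still fits after losing the largest hosts.

    host_memory_gb: memory capacity of each host in the cluster (hypothetical input)
    vm_memory_gb:   memory needed by each VM (hypothetical input)
    failures:       number of host failures to tolerate
    """
    if failures >= len(host_memory_gb):
        return False  # no hosts left to run anything
    # Worst case: assume the largest hosts are the ones that fail.
    surviving = sorted(host_memory_gb)[: len(host_memory_gb) - failures]
    return sum(vm_memory_gb) <= sum(surviving)


# Three 256 GB hosts running 400 GB of VMs can lose one host.
ok = survives_host_failure([256, 256, 256], [200, 200])
# Two 256 GB hosts running 300 GB of VMs cannot.
not_ok = survives_host_failure([256, 256], [150, 150])
```

A cluster that fails even this rough check is a good candidate to inspect first in the card.
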
Finding critical knowledge base articles that matter:

VMware and hardware system vendors release new knowledge base (KB) articles and update existing ones on a regular basis. Every month roughly 200 KB articles are updated, and about 10 of them relate to serious system outage or data loss issues. To find out which of these critical KB articles apply to your environment, open the Knowledge Base Advisor card. From the card you can read the KB articles and see the hosts in your environment that are associated with them. You can also type a particular host or VM name along with symptoms to further narrow down the KB articles that apply.

Knowledge Base Advisor
Bully and victim VMs:

All it takes is a few bully VMs to wreak havoc and cause serious performance degradation. Our global dataset shows that, on average, each bully VM victimizes 5 other VMs. To identify the VM bullies in your datacenter, head over to the Datastore Contention card. Choose the datacenter, cluster, or datastore that you are interested in and you’ll see all the datastores that have experienced contention in the last 24 hours. Click on a datastore and you’ll see the contention phases, along with the culprit and victim VMs for each phase.

Culprits and Victim Analysis

Utilization – Dead Space and Zombie VMs

Locating dead space:

Dead space is space that was previously allocated but whose data has since been deleted, leaving it unused. Our analysis shows that over 26% of the disk space used by virtual machines is dead space, and it can be reclaimed in a few simple steps. First, run the disk shrink operation within the VM using VMware Tools. Then (re-)convert the disk to a thin virtual disk. To locate VMs with dead space, head over to the VM Space Saver card. Here you will see a summary of the total space savings opportunity.

Dead Space Analysis

Below the summary you can also see the savings contribution from individual VMs, so you can easily identify which VMs offer the biggest space savings opportunity.

Potential Space Savings
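
If you want to reason about the numbers yourself, the ranking can be sketched as follows. This assumes dead space is approximated as the space a VM consumes on the datastore minus the space its guest file systems report as used. That is an illustrative assumption, not necessarily how the VM Space Saver card computes it, and the field names are hypothetical.

```python
def estimate_dead_space_gb(vms):
    """Return (vm_name, dead_gb) pairs, largest savings opportunity first.

    Each VM is a dict with hypothetical keys:
      datastore_used_gb: space the VM's disks consume on the datastore
      guest_used_gb:     space the guest file systems report as in use
    """
    results = []
    for vm in vms:
        # Clamp at zero so measurement skew never yields negative savings.
        dead = max(0.0, vm["datastore_used_gb"] - vm["guest_used_gb"])
        results.append((vm["name"], dead))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


sample = [
    {"name": "web01", "datastore_used_gb": 80.0, "guest_used_gb": 50.0},
    {"name": "db01", "datastore_used_gb": 200.0, "guest_used_gb": 120.0},
]
ranked = estimate_dead_space_gb(sample)
total_gb = sum(dead for _, dead in ranked)
```
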
Locating zombie VMs:

Our global dataset shows that 16% of virtual machines are zombie VMs (VMs in a powered-off or suspended state). These VMs are using valuable disk space on your storage array. To locate them and identify the space savings opportunity they represent, head over to the Unused VMs card. Here you will see a summary of the potential space savings and the number of VMs broken down by days of inactivity.

Unused VMs

Below the summary you also get a listing of the individual VMs.

Unused VMs List
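
The breakdown by days of inactivity can be sketched in a few lines. This is only an illustration of the grouping idea, assuming you have a list of VM records from your own inventory. The field names, the bucket boundaries (30 and 90 days), and the sample data are all hypothetical and are not taken from the card.

```python
from collections import Counter

INACTIVE_STATES = {"poweredOff", "suspended"}
BUCKETS = [(30, "0-30 days"), (90, "31-90 days")]  # hypothetical boundaries
OLDEST = "over 90 days"


def bucket_for(days_inactive):
    """Map a number of inactive days to a bucket label."""
    for upper, label in BUCKETS:
        if days_inactive <= upper:
            return label
    return OLDEST


def summarize_unused(vms):
    """Count inactive VMs per bucket and total the disk space they hold."""
    counts = Counter()
    reclaimable_gb = 0.0
    for vm in vms:
        if vm["power_state"] not in INACTIVE_STATES:
            continue  # running VMs are not zombies
        counts[bucket_for(vm["days_inactive"])] += 1
        reclaimable_gb += vm["disk_gb"]
    return dict(counts), reclaimable_gb


sample = [
    {"name": "test01", "power_state": "poweredOff", "days_inactive": 12, "disk_gb": 40.0},
    {"name": "old-app", "power_state": "suspended", "days_inactive": 200, "disk_gb": 100.0},
    {"name": "web01", "power_state": "poweredOn", "days_inactive": 0, "disk_gb": 80.0},
]
counts, reclaimable_gb = summarize_unused(sample)
```
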

Security Vulnerabilities

Finding ESX hosts with Heartbleed vulnerability:

VMware has released patches for the Heartbleed vulnerability that affects ESX hosts. Our global dataset shows that over 22% of ESX hosts remain unpatched. To find out whether any of your ESX hosts are among them, go to the CloudPhysics card store and find the “A Heartbleed Check for ESX 5.5” card. Add that card to “My Cards” and click on it. Any unpatched hosts will be listed there.

Heartbleed Check
Finding Exposure to Shellshock:

Shellshock impacts all Linux virtual machines and older versions of ESX hosts that have a console OS (Classic ESX). To find all the Linux virtual machines in your environment, head over to the OS Inventory card. Within the card, select OS type = Linux to get a listing of all Linux virtual machines.

OS Inventory Card

To find out whether you have older versions of ESX hosts, go to the CloudPhysics card store, locate the Older & Unsupported ESX Classic Hosts card, and add it to “My Cards.” Then click on the card to see a listing of the hosts that are running the classic version of ESX and are vulnerable to the Shellshock issue.

Classic ESX Hosts

End of Life – Unsupported Software

Windows Server 2003 accounts for 25% of all existing Windows VMs, and it reaches the end of its support lifecycle next year. Windows XP accounts for 5.4% of all Windows VMs and has already reached its end of support this year. To get a listing of all the Windows 2003/XP/NT VMs, go to the OS Inventory card, select Guest OS Family = Windows, and then subselect Windows 2003/NT/XP.

Windows VMs
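
If you keep your own list of guest OS names, the same filter is easy to approximate. The sketch below is an illustration only: the guest OS strings are made-up examples of how an inventory might label VMs, and the substring matching is an assumption rather than how the OS Inventory card classifies guests.

```python
EOL_MARKERS = ("Windows Server 2003", "Windows XP", "Windows NT")


def eol_windows_share(guest_os_names):
    """Return the end-of-life Windows guests and their share of all Windows guests."""
    windows = [name for name in guest_os_names if "Windows" in name]
    eol = [name for name in windows if any(m in name for m in EOL_MARKERS)]
    share = len(eol) / len(windows) if windows else 0.0
    return eol, share


# Hypothetical inventory labels.
inventory = [
    "Microsoft Windows Server 2003 (32-bit)",
    "Microsoft Windows Server 2012 (64-bit)",
    "Red Hat Enterprise Linux 6 (64-bit)",
    "Microsoft Windows XP Professional (32-bit)",
]
eol, share = eol_windows_share(inventory)
```
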

We hope this mini-tutorial helps you hunt down the goblins in your datacenter this Halloween.