nested hypervisor (on ESXi 5.1)

Sometimes you might want to run a hypervisor on a hypervisor for testing purposes. This is how you pass through the required CPU extensions in ESXi 5.1.

Remember you will also need to enable promiscuous mode on the networking side.

How to Enable Nested ESXi & Other Hypervisors in vSphere 5.1

There are some changes with Nested Virtualization in vSphere 5.1, officially known as VHV (Virtual Hardware-Assisted Virtualization). If you are using vSphere 5.0 to run nested ESXi or other nested hypervisors, then please take a look at the instructions in this article. With vSphere 5.1, there have been a few minor changes to enable VHV.

  1. The new Virtual Hardware 9 compatibility is required when creating your nested ESXi VM; Virtual Hardware 8 will not work if you are running ESXi 5.1 on your physical host. You will still need to enable promiscuous mode on the portgroup that will be used for your nested ESXi VM for network connectivity.
  2. vhv.allow = "true" is no longer valid for ESXi 5.1 to enable VHV. A new parameter has been introduced, vhv.enable = "true", which is defined on a per-VM basis to provide finer granularity of VHV support. This also allows for better portability between VMware's hosted products such as VMware Fusion and Workstation, as they also support the vhv.enable parameter.
  3. You can now enable VHV on a per-VM basis using the new vSphere Web Client, which basically adds the vhv.enable = "true" parameter to the VM's .vmx configuration file (a scripted alternative is sketched below).
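
A minimal PowerCLI sketch of the scripted route, assuming a powered-off VM called "nested-esxi01" and a portgroup called "NestedESXi" (both names are made up; adjust to suit):

# Add vhv.enable = "true" to the VM's .vmx (VM must be powered off)
$vm = Get-VM -Name "nested-esxi01"
New-AdvancedSetting -Entity $vm -Name "vhv.enable" -Value "TRUE" -Confirm:$false -Force

# Enable promiscuous mode on the nested ESXi portgroup
Get-VMHost -VM $vm | Get-VirtualPortGroup -Name "NestedESXi" |
    Get-SecurityPolicy | Set-SecurityPolicy -AllowPromiscuous $true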


vmware – updating vcenter email alert for monitoring

I’ve used the following PowerShell script to assist with setting up a consistent email alert across various VMware environments.


# Usage ;
# Please manually connect to vCenter first using "Connect-VIServer" -- this prevents usernames and passwords being saved with the script.
# Update the variable below with your email address(es), using a comma as a delimiter
$MailtoAddresses = "monitoring@company.com"

#---- These Alarms will send a single email message and not repeat ----
$LowPriorityAlarms="Timed out starting Secondary VM",`
"No compatible host for Secondary VM",`
"Virtual Machine Fault Tolerance vLockStep interval Status Changed",`
"Migration error",`
"Exit standby error",`
"License error",`
"Virtual machine Fault Tolerance state changed",`
"VMKernel NIC not configured correctly",`
"Unmanaged workload detected on SIOC-enabled datastore",`
"Host IPMI System Event Log status",`
"Host Baseboard Management Controller status",`
"License user threshold monitoring",`
"Datastore capability alarm",`
"Storage DRS recommendation",`
"Storage DRS not supported on host",`
"Datastore is in multiple datacenters",`
"Insufficient vSphere HA failover resources",`
"License capacity monitoring",`
"Pre-4.1 host connected to SIOC-enabled datastore",`
"Virtual machine cpu usage",`
"Virtual machine memory usage",`
"License inventory monitoring"

#---- These Alarms will repeat every 24 hours ----
$MediumPriorityAlarms=`
"Virtual machine error",`
"Health status changed alarm",`
"Host cpu usage",`
"Health status monitoring",`
"Host memory usage",`
"Cannot find vSphere HA master agent",`
"vSphere HA host status",`
"Host service console swap rates",`
"vSphere HA virtual machine monitoring action",`
"vSphere HA virtual machine monitoring error"

#---- These Alarms will repeat every 2 hours ----
$HighPriorityAlarms=`
"Host connection and power state",`
"Host processor status",`
"Host memory status",`
"Host hardware fan status",`
"Host hardware voltage",`
"Host hardware temperature status",`
"Host hardware power status",`
"Host hardware system board status",`
"Host battery status",`
"Status of other host hardware objects",`
"Host storage status",`
"Host error",`
"Host connection failure",`
"Cannot connect to storage",`
"Network connectivity lost",`
"Network uplink redundancy lost",`
"Network uplink redundancy degraded",`
"Thin-provisioned LUN capacity exceeded",`
"Datastore cluster is out of space",`
"vSphere HA failover in progress",`
"vSphere HA virtual machine failover failed",`
"Datastore usage on disk"

#--- Set Alarm Action for Low Priority Alarms ---
Foreach ($LowPriorityAlarm in $LowPriorityAlarms) {
Get-AlarmDefinition -Name "$LowPriorityAlarm" | Get-AlarmAction -ActionType SendEmail | Remove-AlarmAction -Confirm:$false
Get-AlarmDefinition -Name "$LowPriorityAlarm" | New-AlarmAction -Email -To @($MailtoAddresses)
# Get-AlarmDefinition -Name "$LowPriorityAlarm" | Get-AlarmAction -ActionType SendEmail | New-AlarmActionTrigger -StartStatus "Green" -EndStatus "Yellow"
Get-AlarmDefinition -Name "$LowPriorityAlarm" | Get-AlarmAction -ActionType SendEmail | New-AlarmActionTrigger -StartStatus "Yellow" -EndStatus "Red" # This ActionTrigger is enabled by default.
# Get-AlarmDefinition -Name "$LowPriorityAlarm" | Get-AlarmAction -ActionType SendEmail | New-AlarmActionTrigger -StartStatus "Red" -EndStatus "Yellow"
# Get-AlarmDefinition -Name "$LowPriorityAlarm" | Get-AlarmAction -ActionType SendEmail | New-AlarmActionTrigger -StartStatus "Yellow" -EndStatus "Green"
}

#--- Set Alarm Action for Medium Priority Alarms ---
Foreach ($MediumPriorityAlarm in $MediumPriorityAlarms) {
Get-AlarmDefinition -Name "$MediumPriorityAlarm" | Get-AlarmAction -ActionType SendEmail | Remove-AlarmAction -Confirm:$false
Set-AlarmDefinition "$MediumPriorityAlarm" -ActionRepeatMinutes (60 * 24) # 24 hours
Get-AlarmDefinition -Name "$MediumPriorityAlarm" | New-AlarmAction -Email -To @($MailtoAddresses)
# Get-AlarmDefinition -Name "$MediumPriorityAlarm" | Get-AlarmAction -ActionType SendEmail | New-AlarmActionTrigger -StartStatus "Green" -EndStatus "Yellow"
Get-AlarmDefinition -Name "$MediumPriorityAlarm" | Get-AlarmAction -ActionType SendEmail | Get-AlarmActionTrigger | Select -First 1 | Remove-AlarmActionTrigger -Confirm:$false
Get-AlarmDefinition -Name "$MediumPriorityAlarm" | Get-AlarmAction -ActionType SendEmail | New-AlarmActionTrigger -StartStatus "Yellow" -EndStatus "Red" -Repeat
# Get-AlarmDefinition -Name "$MediumPriorityAlarm" | Get-AlarmAction -ActionType SendEmail | New-AlarmActionTrigger -StartStatus "Red" -EndStatus "Yellow"
# Get-AlarmDefinition -Name "$MediumPriorityAlarm" | Get-AlarmAction -ActionType SendEmail | New-AlarmActionTrigger -StartStatus "Yellow" -EndStatus "Green"
}

#---Set Alarm Action for High Priority Alarms---
Foreach ($HighPriorityAlarm in $HighPriorityAlarms) {
Get-AlarmDefinition -Name "$HighPriorityAlarm" | Get-AlarmAction -ActionType SendEmail| Remove-AlarmAction -Confirm:$false
Set-AlarmDefinition "$HighPriorityAlarm" -ActionRepeatMinutes (60 * 2) # 2 hours
Get-AlarmDefinition -Name "$HighPriorityAlarm" | New-AlarmAction -Email -To @($MailtoAddresses)
# Get-AlarmDefinition -Name "$HighPriorityAlarm" | Get-AlarmAction -ActionType SendEmail | New-AlarmActionTrigger -StartStatus "Green" -EndStatus "Yellow"
Get-AlarmDefinition -Name "$HighPriorityAlarm" | Get-AlarmAction -ActionType SendEmail | Get-AlarmActionTrigger | Select -First 1 | Remove-AlarmActionTrigger -Confirm:$false
Get-AlarmDefinition -Name "$HighPriorityAlarm" | Get-AlarmAction -ActionType SendEmail | New-AlarmActionTrigger -StartStatus "Yellow" -EndStatus "Red" -Repeat
# Get-AlarmDefinition -Name "$HighPriorityAlarm" | Get-AlarmAction -ActionType SendEmail | New-AlarmActionTrigger -StartStatus "Red" -EndStatus "Yellow"
# Get-AlarmDefinition -Name "$HighPriorityAlarm" | Get-AlarmAction -ActionType SendEmail | New-AlarmActionTrigger -StartStatus "Yellow" -EndStatus "Green"
}
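
For completeness, I run it like this (connecting interactively first so no credentials live in the script; the vCenter name and script filename below are just examples):

Add-PSSnapin VMware.VimAutomation.Core
Connect-VIServer vcenter01.company.com
.\vcenter-alarm-emails.ps1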
This is another version I created that just grabs all alarms and sets an email trigger on each. Note that it will delete all existing alarm actions (so make sure you don't have SNMP traps etc. that you still need).


# v3 : grab all alarms from vCenter (so should work on all versions) and use these for the alarm variables
# Usage ;
# Please manually connect to vCenter first using "Connect-VIServer" -- this prevents usernames and passwords being saved with the script.
# Any alarm that is currently active will send an email alert -- please confirm appropriate values for triggers before running this script.

# Set notification emails here;
$MailTo = "monitoring@company.com"

# Define alarms to be set; (keep the objects -- piping to ft would break the Name lookup below)
$Alarms = Get-AlarmDefinition | Sort-Object Name

foreach ($Alarm in $Alarms)
{
# Delete trigger;
Get-AlarmDefinition -Name $Alarm.Name | Get-AlarmAction | Remove-AlarmAction -Confirm:$false

# Create trigger;
Get-AlarmDefinition -Name $Alarm.Name | New-AlarmAction -Email -To "$MailTo"
}

UPDATE : I generally use the following script now. It's less to maintain, and covers any alarms that haven't been managed yet.

# Author : Darren Taylor
# v5 : grab all alarms from vCenter (so should work on all versions) and use these for the alarm variables
# Usage ;
# Please manually connect to vCenter first using "Connect-VIServer" -- this prevents usernames and passwords being saved with the script.
# Any alarm that is currently active will send an email alert -- please confirm appropriate values for triggers before running this script.
#
# Note ;
# This script needs to be modified to exclude alarms that are not critical (exclusive rather than inclusive)
# Once the exceptions list is updated, re-run the script.

# ------------------- VARIABLES -------------------

# Set notification emails here;
$MailTo = "vcenter@company.com"

# These are the names of the alarms to ignore -- i.e. do NOT set up an email alert
# THESE ALARMS ARE CONSIDERED NON-CRITICAL
$Exceptions = `
"Virtual machine cpu usage",`
"Virtual machine memory usage"

# ------------------- CODE ONLY BELOW -------------------

# TODO:
# Change triggers on some alarms?

# Define alarms to be set; (ALL alarms)
$Alarms = Get-AlarmDefinition | Sort-Object Name | Select-Object Name

foreach ($Alarm in $Alarms)
{

# Show which alarm is being processed
Write-Host "Setting Alarm... " -NoNewLine; Write-Host $Alarm.Name -NoNewLine;

# Delete trigger; (clears all existing EMAIL triggers)
Get-AlarmDefinition -Name $Alarm.Name | Get-AlarmAction -ActionType:SendEmail | Remove-AlarmAction -Confirm:$false;

# Exceptions to the email trigger
$SetAlarm = 1;
foreach ($Exception in $Exceptions) {if($Alarm.Name -eq $Exception){$SetAlarm=0; Write-Host " Ignored" -foregroundcolor red;}}

# Create trigger;
if($SetAlarm -eq 1){Get-AlarmDefinition -Name $Alarm.Name | New-AlarmAction -Email -To $MailTo}

}


vmware and NLB (unicast vs multicast)

Thought I’d get a bit of clarification around this one. VMware states:

VMware recommends that you use multicast mode, because unicast mode forces the physical switches on the LAN to broadcast all Network Load Balancing traffic to every machine on the LAN.

There are some issues that come to the surface if you are using unicast. Where possible you should use multicast NLB, as unicast can cause complications around the vSwitch configuration (the requirement to set switch notifications to "No", which seemingly breaks port-ID teaming). Details below;

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1556

• All members of the NLB cluster must be running on the same ESX host.
• All members of the NLB cluster must be connected to a single portgroup on the virtual switch.
• VMotion for unicast NLB virtual machines is not supported (unless you want to migrate ALL NLB members to a different ESX host).
• The Forged Transmits security policy on the portgroup must be set to Accept.
• The transmission of RARP packets must be prevented on the portgroup/virtual switch, as explained in the later part of the article (both covered in the sketch below).
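
A hedged PowerCLI sketch for those last two items (the host and portgroup names are made up):

# Forged Transmits -> Accept on the NLB portgroup
Get-VMHost esx01.company.com | Get-VirtualPortGroup -Name "NLB" |
    Get-SecurityPolicy | Set-SecurityPolicy -ForgedTransmits $true

# Stop RARP/notify packets for unicast NLB: set Notify Switches to No
Get-VMHost esx01.company.com | Get-VirtualPortGroup -Name "NLB" |
    Get-NicTeamingPolicy | Set-NicTeamingPolicy -NotifySwitches $false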

Here is a bit about each type… (from vmware docs)

Multicast

Multicast mode allows communication among hosts because it adds a Layer 2 multicast address to the cluster instead of changing the cluster's MAC address. Communication among hosts is possible because the hosts retain their original unique media access control (MAC) addresses and already have unique dedicated IP addresses. However, the address resolution protocol (ARP) reply that is sent by a host in the cluster (in response to an ARP request) maps the cluster’s unicast IP address to its multicast MAC address.

Some routers do not support the resolution of unicast IP addresses to multicast MAC addresses, and they discard the ARP reply. As a result, an administrator must add a static ARP entry in the router, mapping the cluster IP address to its MAC address.

  • Can use a single NIC
  • Add a static ARP entry to the default gateway (example below)
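
On a Cisco gateway the static ARP entry looks something like the line below. The IP and MAC are made-up example values – use your cluster's actual multicast MAC (03bf:…, derived from the cluster IP), which NLB Manager shows you:

arp 192.168.1.100 03bf.c0a8.0164 ARPA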

Unicast

Unicast mode works seamlessly with all routers and Layer 2 switches. However, this mode induces switch flooding, a condition in which all switch ports are flooded with Network Load Balancing
traffic, even ports to which servers not involved in Network Load Balancing are attached. To communicate among hosts, you must have a second virtual adapter for each host.

Normally, switched environments avoid port flooding when a switch learns the MAC addresses of the hosts that are sending network traffic through it. The Network Load Balancing cluster masks the cluster’s MAC address for all outgoing traffic to prevent the switch from learning the MAC address.

On an ESX host, the VMkernel sends a reverse address resolution protocol (RARP) packet each time
certain actions occur—for example, when a virtual machine is powered on, when there is a teaming failover, or when certain VMotion operations occur. The RARP packet gives physical switches the MAC
address of the virtual machine involved in the action. In a Network Load Balancing cluster environment, after a Network Load Balancing node is powered on, the notification in the RARP packet exposes the MAC address of the cluster NIC. As a result, switches might begin to send all inbound traffic destined for the Network Load Balancing cluster through one switch port to a single node of the cluster.

Because the virtual switch operates with complete data about the underlying MAC addresses of the virtual NICs inside each virtual machine, it always correctly forwards packets containing a MAC address
matching that of a running virtual machine. As a result of this behavior, the virtual switch does not forward traffic destined for the Network Load Balancing MAC address outside the virtual environment
into the physical network, because it is able to forward it to a local virtual machine.

  • Requires two NICs if you want host-to-host communication.

vmware – cpu ready (what to look for) and performance

keeping things simple – below are the numbers to look for when you're investigating CPU scheduling issues within a VMware environment…

Look for the cpu “ready” counter which is a percentage of time that the virtual machine was ready, but could not get scheduled to run on the physical CPU. CPU ready time is dependent on the number of virtual machines on the host and their CPU loads.

These are the peak / average figures to watch for in vmware performance graphs. Anything above 5% cpu ready time is worth fixing. (Note : the sampling rate is different depending on which view you are looking at)

Realtime : 1000 x 20 (20 second sample) x 5% = 1000

Daily : 1000 x 300 (5 min sample) x 5% = 15000
Weekly : 1000 x 1800 (30 min sample) x 5% = 90000
Monthly : 1000 x 7200 (2hrs sample) x 5% = 360000

Note : above is 1vCPU only, see this reference for some more detail and some nice charts!

http://vmtoday.com/2013/01/cpu-ready-revisted-quick-reference-charts/
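
To pull the realtime figure yourself, here's a rough PowerCLI sketch (assumes an existing vCenter connection; 20-second samples, aggregate instance, divided by vCPU count so the 5% rule above applies per vCPU):

# CPU ready % = ready summation (ms) / sample interval (ms) x 100
Get-VM | Where-Object { $_.PowerState -eq "PoweredOn" } | ForEach-Object {
    $ready = Get-Stat -Entity $_ -Stat cpu.ready.summation -Realtime -MaxSamples 1 |
             Where-Object { $_.Instance -eq "" }          # "" = aggregate of all vCPUs
    $pct = [math]::Round($ready.Value / (20 * 1000) * 100 / $_.NumCpu, 2)
    if ($pct -gt 5) { Write-Host "$($_.Name) : $pct% CPU ready" }
}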

EMC Celerra Optimizations for VMware on NFS

go here http://blog.scottlowe.org/2010/01/31/emc-celerra-optimizations-for-vmware-on-nfs/

Turn on the uncached write mechanism for NFS file systems used as VMware datastores. This can have a significant performance improvement for VMDKs on NFS but isn’t the default setting. From the Control Station, you can use this command to turn on the uncached write mechanism:
server_mount <data mover name> -option <options>,uncached <file system name> <mount point>
Be sure to review pages 99 through 101 of the VMware on Celerra best practices document for more information on the uncached write mechanism and any considerations for its use.
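
As a concrete (made-up) example, with a data mover called server_2 and a filesystem called fs_nfs01 mounted at /fs_nfs01, that would look something like:

server_mount server_2 -option rw,uncached fs_nfs01 /fs_nfs01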

vmware and load balancing NFS trunks

Straight from the post below, this is the best way (currently) to load balance your NFS datastores…  No MPIO magic here unfortunately.

http://communities.vmware.com/message/1466595#1466595

Basically you can set up IP aliases on the NFS side and then set up multiple connections (each using a unique IP) per datastore on each ESX host. This works well if you are using a team of NICs running IP-hash load balancing…

Static EtherChannel.

My setup is as follows:

ESXi 4.0 U1, Cisco 3750 Switches, and NetApp NFS on the storage side.

I have a total of 8 nics. I divided the nics into 3 groups.

2 nics on vSwitch0 for Mgmt & vMotion
3 nics on vSwitch1 for VMs (multiple port groups (3 VLANs))
3 nics on vSwitch2 for IP Storage (mostly NFS, a little iSCSI)
(On vSwitch3 I also have a VM port group for iSCSI access from within the VM)

Since I have 3 nics on my IP Storage port group I needed a way to utilize all three nics and not have the server just use one for ingress and egress traffic. This was done by:

Setting up a static EtherChannel on the cisco switch (Port Channel).
Configuring the cisco switch to IP Hash
Configure the vSwitch to “Route based on IP Hash” as well.

The next part is to create multiple datastores on the NFS device. Each of my NFS datastores is about 500GB in size. The reason for this is that my larger LUNs are iSCSI and are accessed directly from the VM using the MS iSCSI initiator.
My NetApp NAS has an address of, let's say, 192.168.1.50. So all my datastores are accessible by utilizing the address of “\\192.168.1.50\NFS-Store#”. This will not be useful, as the ESX box and the Cisco switch will always use the same nic/port to access the NAS device. This is due to the algorithm (IP HASH) used to decide what link it'll go over. So to resolve the issue, I added IP aliases to the NFS box. NetApp allows multiple IP addresses pointing to the same NFS export; I suspect EMC would do the same. So, I added 2 aliases, 51 & 52. Now my NFS datastores are accessible by using IP addresses 192.168.1.50, .51, & .52.

So I went ahead and added the datastores to the ESX box using the multiple IP addresses:

Datastore1 = \\192.168.1.50\NFS-Store1
Datastore2 = \\192.168.1.51\NFS-Store2
Datastore3 = \\192.168.1.52\NFS-Store3

If you have more datastores it’ll just repeat: Datastore4 = \\192.168.1.50\NFS-Store4 and so on…

Having multiple datastores and an address for each, the 3 nics on the ESX box dedicated to IP Storage all get utilized. It does not aggregate the bandwidth but it does use all three to send and receive packets. So the fastest speed you will get is, theoretically, 1Gbit each way, but it is better than trying to cram all the traffic over 1 nic.

I also enabled Jumbo Frames on the vSwitch as well as the vmNic for IP Storage. (need the best performance!)
I should mention that your NFS storage device should have EtherChannel setup on it as well. Otherwise, you’ll be on the same boat just on the other end of it.

Hope it helps!

Larry B.

I should mention that you should not use different addresses to access the same NFS share (datastore). It is not supported and may cause you issues.
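
If you'd rather script the ESX-side pieces of the above, here's a rough PowerCLI sketch – the host name, vSwitch name, and export paths are assumptions based on the example IPs:

$esx = Get-VMHost esx01.company.com

# Teaming must match the switch's static EtherChannel: route based on IP hash
Get-VirtualSwitch -VMHost $esx -Name vSwitch2 |
    Get-NicTeamingPolicy | Set-NicTeamingPolicy -LoadBalancingPolicy LoadBalanceIP

# Mount each datastore via a different alias so IP hash spreads them across uplinks
New-Datastore -VMHost $esx -Nfs -Name NFS-Store1 -NfsHost 192.168.1.50 -Path /vol/NFS-Store1
New-Datastore -VMHost $esx -Nfs -Name NFS-Store2 -NfsHost 192.168.1.51 -Path /vol/NFS-Store2
New-Datastore -VMHost $esx -Nfs -Name NFS-Store3 -NfsHost 192.168.1.52 -Path /vol/NFS-Store3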

vmware – hp procurve lacp / trunk

Cisco's EtherChannel and HP's LACP are very similar – probably why I assumed both are supported by VMware. But as per below, that is not the case.

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1004048

From a ProCurve perspective the difference between a "trunk" trunk and a "static LACP" trunk is the total bandwidth per connection. A typical endpoint on a "trunk" trunk can transmit various connections down both pipes, but only receive down one – sometimes referred to as TLB (transmit load balancing). In VMware's case, when you are using a "trunk" trunk you will have IP hash set as the load-balancing policy, effectively meaning a different NIC will be used on the VMware side for each connection a virtual machine makes.

This most probably explains why we hit a 1Gbit limit per connection with VMware, since a switch "trunk" can only receive back down a single interface. Whereas with LACP the interface is the team itself (i.e. it is considered one larger single interface) – and can load balance. This matches up with what I've seen in the live switch statistics.

HP's LACP – should be used where possible – between switches and servers that support it. LACP is a protocol that is wrapped around packets (which is why both ends need to support it).

"trunk" trunks – have to be used with VMware at the moment – limit each connection's bandwidth to a single interface (i.e. you will never get more than 1Gbit per connection if your NICs are all 1Gbit). An example of both trunk types is below.
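
For reference, on the ProCurve side the difference between the two is just the trailing keyword – these are alternatives for the same ports, and the port/group numbers are examples only:

trunk 1-3 trk1 trunk     (static "trunk" trunk - pair with "Route based on IP hash" on the vSwitch)
trunk 1-3 trk1 lacp      (static LACP trunk - for links where both ends speak LACP)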

Linux – install vmware tools onto guest

fire up the VM, then run the following after initiating a VMware Tools install…

mount /dev/cdrom /mnt/cdrom
cd /tmp
tar zxf /mnt/cdrom/VMwareTools-x.x.x.gz
cd vmware-tools-distrib
./vmware-install.pl

Then just follow the prompts through to the end.

If you're running Fedora or similar, make sure you've got gcc and kernel headers… (you'll probably have to update the kernel too)

yum update
shutdown -r now
yum install -y gcc make kernel-devel perl

Ubuntu 12.x

apt-get install open-vm-tools

some notes from fedora 13…

Did you also copy the missing/misplaced include file?

(Having just updated the kernel I am getting the original messages, so have copied them below as I workaround the problem)

= = = First I get:

What is the location of the directory of C header files that match your running
kernel? [/usr/src/linux/include] /usr/src/kernels/2.6.33.5-112.fc13.x86_64/include

The directory of kernel headers (version @@VMWARE@@ UTS_RELEASE) does not match
your running kernel (version 2.6.33.5-112.fc13.x86_64). Even if the module
were to compile successfully, it would not load into the running kernel.

= = = Then over in another session at
/usr/src/kernels/2.6.33.5-112.fc13.x86_64/include

[Tom@tlsf13a include]$ find . -iname '*relea*'
./config/kernel.release
./generated/utsrelease.h
[Tom@tlsf13a include]$ sudo cp -p generated/utsrelease.h linux/

= = = Then back in first session:

What is the location of the directory of C header files that match your running
kernel? [/usr/src/linux/include] /usr/src/kernels/2.6.33.5-112.fc13.x86_64/include

Extracting the sources of the vmmemctl module.

= = = and the vmware-config-tools.pl runs ….
(well, all but vmci builds … :-/ )

vmware srm – replicating over a wan – optimizations

I've been working through an SRM setup and have been looking at ways to optimize the amount of traffic that is sent over the WAN. The first obvious move is to shift your VMware swap files off the replicated LUNs (a quick PowerCLI sketch below).
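
Note this covers the VMkernel swap (.vswp) rather than the in-guest Windows page file (more on that below). A rough sketch, assuming a cluster called "ProdCluster" and a non-replicated datastore called "swap-noreplica" visible to each host:

# Store .vswp files in each host's designated swapfile datastore instead of with the VM
Set-Cluster -Cluster "ProdCluster" -VMSwapfilePolicy InHostDatastore -Confirm:$false
Get-Cluster "ProdCluster" | Get-VMHost |
    Set-VMHost -VMSwapfileDatastore (Get-Datastore "swap-noreplica")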

Another way is to reduce the sync frequency – i.e. how often is your replication technology trying to keep the source and destination in sync? Increasing this window can sometimes help you out, but that all depends on your deltas.

For example – in the case of a Windows page file on an active server (SQL etc.) it could "potentially" change the whole file within an hour. If your replication was set to every hour and the page file was 4GB, then you'd be sending at least 4GB every hour. Changing the sync on your replicated LUN to 8hrs instead would mean you'd only send the 4GB of "delta" (i.e. blocks that have changed since the original snap).

The problem is that you would typically want to sync your virtual machines on a more frequent schedule than 8hrs. So this is where you need to move your Windows page files onto a separate LUN (also replicated), but on a larger sync window (perhaps syncing only once if your servers are static).

monitoring changed blocks (CBT) for replication of vmware virtual machines

Check this link for a great script to monitor changes made to a virtual machine via the vmware CBT API. This is perfect for finding culprit machines that are generating a lot of replication traffic if you are replicating over a WAN.

http://www.vmguru.com/index.php/articles-mainmenu-62/scripting/105-using-powershell-to-track-block-change-sizes-over-time

If you have some disks on a virtual machine you don't want this script to capture, then just set them as independent disks (so no snapshots can take place); a quick PowerCLI example is below. This is handy if you have your Windows page file on a separate disk that you don't want measured as part of the CBT changes.
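A sketch, assuming a VM called "app01" whose page-file disk is "Hard disk 2" (remove any existing snapshots first):

# Independent disks are excluded from snapshots, so snapshot-based CBT tracking skips them
Get-HardDisk -VM "app01" -Name "Hard disk 2" |
    Set-HardDisk -Persistence IndependentPersistent -Confirm:$false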

CBT_Tracker