VMware – HA issues

Most HA issues turn out to be DNS related, so first make sure that your vCenter server can ping all of your hosts by FQDN without issue. In some cases, though, a stubborn server may not want to play the game even when everything is configured properly.
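The FQDN check can be scripted as a rough sketch like the one below. It assumes a POSIX shell on a machine that uses the same DNS servers as vCenter; the host names and the check_hosts helper are made up for illustration, and the probe command is kept in a variable because ping flags differ between platforms.

```shell
#!/bin/sh
# Sketch only: report which hosts answer when addressed by FQDN.
# PROBE is the command run against each host (assumption: "ping -c 1"
# works on your platform; override PROBE if it does not).
PROBE="${PROBE:-ping -c 1}"

check_hosts() {
    failed=0
    for host in "$@"; do
        # $PROBE is deliberately unquoted so "ping -c 1" splits into words
        if $PROBE "$host" >/dev/null 2>&1; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    # non-zero exit status if any host failed the probe
    return "$failed"
}

# Example call with placeholder names:
# check_hosts esx01.example.local esx02.example.local
```

Any host reported as FAIL is worth fixing in DNS (or the hosts file) before trying anything more drastic.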

This method is considered a “last resort”, as you’ll need to run some CLI commands on the ESX box, but I have found it useful in a few situations.

This page has a great write-up on which files HA uses and how to temporarily stop the HA service: http://itknowledgeexchange.techtarget.com/virtualization-pro/vmware-ha-failure-got-you-down/

Remember, to get to the command line on ESXi, go to the server's console, press Alt-F1, then type “unsupported” (note: you cannot see what you are typing) and enter the root password.

The main bits are as follows:

Stop the HA service

service vmware-aam stop

Check that HA has stopped (if any processes are still listed, use the kill command to kill them)

ps ax | grep aam | grep -v grep
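If that listing is not empty, the leftover PIDs can be pulled out and killed in one go. The following is only a sketch: pid_column and kill_pids are helper names made up here, and it assumes awk is available on the console.

```shell
#!/bin/sh
# Sketch only: helpers for cleaning up processes that survived the stop.

# Read "ps ax" style lines on stdin and print just the PID (first column).
pid_column() {
    awk '{ print $1 }'
}

# Read one PID per line on stdin and send each a plain TERM signal.
# Escalating to "kill -9" is left as a manual decision.
kill_pids() {
    while read -r pid; do
        kill "$pid" 2>/dev/null || echo "could not signal $pid"
    done
}

# Chained with the check above, this would look like:
# ps ax | grep aam | grep -v grep | pid_column | kill_pids
```

Run the ps check again afterwards to confirm that nothing is left.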

Move the current HA config files to a backup directory (before restarting HA)

cd /etc/opt/vmware/aam

mkdir .old

mv * .old

mv .[a-z]* .old
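One thing to expect: the last mv also matches .old itself, so it should complain that it cannot move .old into a subdirectory of itself while still moving the other hidden files. A slightly more careful version of the same backup, as a sketch only (backup_config_dir is a helper name made up here, not something from the linked article), could look like this:

```shell
#!/bin/sh
# Sketch only: move every entry of a directory, hidden ones included,
# into a ".old" subdirectory, skipping ".old" itself.
backup_config_dir() {
    dir="$1"
    mkdir -p "$dir/.old" || return 1
    # ".[!.]*" matches hidden entries but not "." or ".."
    # (names starting with two dots are not covered by this sketch)
    for entry in "$dir"/* "$dir"/.[!.]*; do
        # skip patterns that matched nothing
        if [ ! -e "$entry" ] && [ ! -L "$entry" ]; then
            continue
        fi
        # never try to move the backup directory into itself
        if [ "$entry" = "$dir/.old" ]; then
            continue
        fi
        mv "$entry" "$dir/.old/"
    done
}

# On the ESX box this would be called as:
# backup_config_dir /etc/opt/vmware/aam
```

Either way the end result is the same: an empty config directory with everything tucked away in .old.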

Then go back to vCenter and select Reconfigure for VMware HA on the affected host. Fingers crossed that it starts up and reconfigures without any issues.

VirtualBox – crashing / freezing

I’ve had some problems since my upgrade to VirtualBox 2.2.0 on OpenSolaris. After some time all of my Linux boxes seem to just die: the virtual machine simply stops responding. Strangely, there was no problem with my Windows VMs after the update.

From what I can tell, it looks like the upgrade turned off “IO APIC”, and this is the bit that seemed to cause the problem. Re-enabling it on all of my Linux boxes seems to have fixed things. I’ll continue testing for another week and update this post if any problems recur.
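The setting can be flipped in the GUI, but with several guests it is quicker from the command line. The sketch below assumes VBoxManage is on the PATH, that the VMs are powered off, and that your version accepts the double-dash option form (older releases used -ioapic); the VM names and the enable_ioapic helper are placeholders.

```shell
#!/bin/sh
# Sketch only: switch IO APIC on for a list of VMs.
# VBOXMANAGE can be overridden, e.g. with a full path to the binary.
VBOXMANAGE="${VBOXMANAGE:-VBoxManage}"

enable_ioapic() {
    for vm in "$@"; do
        # modifyvm only works while the VM is powered off
        "$VBOXMANAGE" modifyvm "$vm" --ioapic on
    done
}

# Example call with placeholder VM names:
# enable_ioapic "debian-test" "centos-build"
```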

Updated : 01/09/2009

Here is a bit more on IO APIC from the VirtualBox wiki (from a Windows perspective):
http://www.virtualbox.org/wiki/Migrate_Windows

The hardware dependent portion of the Windows kernel is dubbed “Hardware Abstraction Layer” (HAL). While hardware vendor specific HALs have become very rare, there are still a number of HALs shipped by Microsoft. Here are the most common HALs (for more information, refer to this article: http://support.microsoft.com/kb/309283):

Hal.dll (Standard PC)
Halacpi.dll (ACPI HAL)
Halaacpi.dll (ACPI HAL with IO APIC)

If you perform a Windows installation with default settings in VirtualBox, Halacpi.dll will be chosen as VirtualBox enables ACPI by default but disables the IO APIC by default. A standard installation on a modern physical PC or VMware will usually result in Halaacpi.dll being chosen as most systems nowadays have an IO APIC and VMware chose to virtualize it by default (VirtualBox disables the IO APIC because it is more expensive to virtualize than a standard PIC). So as a first step, you either have to enable IO APIC support in VirtualBox or replace the HAL. Replacing the HAL can be done by booting the VM from the Windows CD and performing a repair installation.

Updated : 5/09/2009

I’ve had even more problems, with OpenSolaris crashing completely after upgrading to a newer version of VirtualBox (3.0.4), and have since reverted to 2.2.0, which has fixed a lot of the hanging issues I had encountered.