I have spent a lot of time recently building a so-called poor man’s HA for ESXi 5.1 without vCenter, which I have named “Heartbeat”. Many questions came up in the early stage, such as whether it could be done at all and how I should go about it. After two weeks of effort, I’m happy to say that it is starting to pay off.
VMware vMA is a good platform to start with, as I don’t have to worry about setting up the remote CLI commands required for communicating with ESXi. After carefully designing the overall process flow, I ran every command I needed one after another, just to make sure everything could later be done from a shell script.
I then created the three scripts below:
- Heartbeat - script to check node status
- Failover - automatically moves (fails over) VMs to the surviving node when a node failure is detected
- Cron - a process that runs periodically in the background to call the Heartbeat script
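To make the relationship between the three scripts more concrete, here is a minimal sketch of the check-then-failover logic. It is written in Python rather than as the actual shell scripts, and it is only an illustration under assumptions: the names `ping_once`, `node_is_down` and `heartbeat`, the three-attempt threshold and the use of a plain ICMP ping (with Linux `ping` flags) are all hypothetical and not taken from the real tool. The failover step itself is passed in as a callback, since the exact ESXi commands are not shown here.

```python
import subprocess
from typing import Callable, Iterable, List


def ping_once(host: str, timeout_s: int = 2) -> bool:
    """Return True if one ICMP echo to `host` succeeds (assumes Linux ping flags)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def node_is_down(
    host: str,
    probe: Callable[[str], bool] = ping_once,
    attempts: int = 3,
) -> bool:
    """Treat the node as down only if every probe attempt fails.

    Requiring several misses in a row avoids failing over on a single
    dropped packet. `any` stops at the first successful probe.
    """
    return not any(probe(host) for _ in range(attempts))


def heartbeat(
    peer: str,
    protected_vms: Iterable[str],
    failover: Callable[[str], None],
    probe: Callable[[str], bool] = ping_once,
    attempts: int = 3,
) -> List[str]:
    """Check the peer node; if it is down, call `failover` for each protected VM.

    Returns the list of VMs that were handed to `failover` (empty if the
    peer is still alive). `failover` is a placeholder for whatever restarts
    the VM on the surviving node.
    """
    if not node_is_down(peer, probe, attempts):
        return []
    moved = []
    for vm in protected_vms:
        failover(vm)
        moved.append(vm)
    return moved
```

In this sketch, the Cron part would simply be a scheduled job that calls `heartbeat(...)` at a fixed interval. Passing the probe and the failover action in as parameters keeps the decision logic testable without a real host.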
A couple of scenarios (host and network failure) have been tested and, as expected, the protected VMs were successfully restarted on the surviving node. So, what is next? Although “Heartbeat” can already fail VMs over automatically from one host to another when a host fails, I have to admit it is not a complete tool yet. I’m still looking for a way to overcome a bug and a split-brain concern that show up after a few successful failovers, which I hope to settle by the end of this month. In the meantime, I would be glad if any of you would test it in your own test environment and send me some feedback on the outcome.
PS: I have tested it on both the free and paid versions of ESXi 5.1 and, as expected, I cannot get it to run on free ESXi because the API is locked. Nevertheless, my feeling is that it can still be done on free ESXi, though the method would have to be different.