New Orchestrated Restarts in vSphere 6.5

November 16, 2017, 6:55 am

vSphere 6.5 was released at the end of 2016 and so, at this point, has been on the market for about a year. VMware introduced several new features in vSphere 6.5, and many of them are very useful. However, people don’t always take the time to read and understand these new features well enough to see that they solve problems that might already exist. One such feature that I’d like to focus on today is the new HA feature called Orchestrated Restarts.

In prior releases, vSphere High Availability (HA) has served to reliably restart VMs on available hosts should one host fail. It does this by building a manifest of protected VMs and, through a master-slave relationship structure, making those manifests known to other cluster members. A fun fact that I’ve used in interviews when assessing someone’s VMware skill set is that HA does not require vCenter for its operation, although it does for the setup. In other words, HA is able to restart VMs from a failed host even if vCenter is unavailable for any reason.

The gap with HA, until vSphere 6.5 that is, was that it had no knowledge of the interdependencies among the VMs it was restarting. So, in some situations, HA could restart a VM before another VM it depends on, which leaves the application unavailable even after every VM has returned to service. In vSphere 6.5, VMware addressed this need with a new enhancement to HA called Orchestrated Restarts, in which you can declare those dependencies and their order so HA restarts VMs in the necessary sequence.

This feature is eminently useful for multi-tier applications, and one such application that can benefit tremendously is vRealize Automation (vRA). In this article, I’ll walk through this feature and illustrate how you can leverage it to increase the availability of vRA in the face of disaster, and I’ll also cover a couple of other best practices with vSphere features when dealing with similar stacks.

In prior versions of HA, there was no dependency awareness: HA just restarted any and all VMs it knew about, in any order. The focus was on making them power on and that’s it. There were (and still are) restart priorities which can be set, but not a dependency chain. In vSphere 6.5, this changed with Orchestrated Restarts.

With special rules set in the web client, we can determine the order in which power-ons should occur. First, let’s look at a common vRA architecture. These are the machines present.

We’ve got a couple of front-end servers (App), manager and web roles (IaaS), a vSphere Agent server (Agent), and a couple of DEM workers (DEM). The front-end servers have to be available before anything else, followed by IaaS, and then the rest. So, effectively, we have a 3-tier structure.

The dependencies follow this order: App must be available before IaaS, and IaaS must be available before Agent or DEM.
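
To make the ordering concrete, here is a minimal Python sketch that derives the restart order from those dependencies. It is only a model of the idea and does not talk to vCenter. The tier names follow the vRA-Tier naming used for the groups later on, and VM names such as App02, IaaS02, DEM01, and DEM02 are assumptions for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical tier membership mirroring the three-tier layout described above.
TIERS = {
    "vRA-Tier1": ["App01", "App02"],
    "vRA-Tier2": ["IaaS01", "IaaS02"],
    "vRA-Tier3": ["Agent01", "DEM01", "DEM02"],
}

# Each tier maps to the set of tiers that must be available before it.
DEPENDS_ON = {
    "vRA-Tier1": set(),
    "vRA-Tier2": {"vRA-Tier1"},
    "vRA-Tier3": {"vRA-Tier2"},
}


def restart_order(depends_on):
    """Return tier names in an order that satisfies every dependency."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for tier in restart_order(DEPENDS_ON):
        print(tier, "->", ", ".join(TIERS[tier]))
```

A topological sort is more than a three-link chain strictly needs, but it carries over unchanged to applications whose tiers fan out or share dependencies.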

Going back over to vCenter, we have to first create our groups or tiers. From the Hosts and Clusters view, click the cluster object, then Configure, and go down to VM/Host Groups.

We’ll add a new group and put the App servers in it.

Do the same for the other tiers, with the third tier having three VMs. You should end up with three VM groups, one per tier.

Now that you have those tiers, go down to VM/Host Rules beneath it. Here is where the new feature resides. In the past, there was just affinity, anti-affinity, and host pinning. In 6.5, there is an additional option now called “Virtual Machines to Virtual Machines.”

This is the rule type we want to leverage, so we’ll create a new rule based on this and select the first two tiers.

This rule says anything in vRA-Tier1 must be restarted before anything in vRA-Tier2 in the case where a host failure takes out members from both groups. Now we repeat the process for tiers 2 and 3. Once complete, you should have at least two rules in place, possibly more if you’re following these instructions for another application.
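
The effect of these rules on a single host failure can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the vSphere API: each rule is a (restart first, restart after) pair of group names, vRA-Tier3 and most VM names are hypothetical, and the function only works out which of the failed VMs fall into which restart batch.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical group membership; vRA-Tier1 and vRA-Tier2 are the names used above.
GROUPS = {
    "vRA-Tier1": {"App01", "App02"},
    "vRA-Tier2": {"IaaS01", "IaaS02"},
    "vRA-Tier3": {"Agent01", "DEM01", "DEM02"},
}

# Each rule reads: restart the first group before the second group.
RULES = [
    ("vRA-Tier1", "vRA-Tier2"),
    ("vRA-Tier2", "vRA-Tier3"),
]


def restart_batches(failed_vms, groups, rules):
    """Split the failed VMs into ordered batches that honor the rules."""
    graph = {name: set() for name in groups}
    for first, then in rules:
        graph[then].add(first)  # 'then' depends on 'first'
    batches = []
    for group in TopologicalSorter(graph).static_order():
        hit = sorted(groups[group] & set(failed_vms))
        if hit:  # groups with no failed members are skipped
            batches.append(hit)
    return batches


if __name__ == "__main__":
    # A host that was running one VM from each tier.
    print(restart_batches({"App01", "IaaS01", "Agent01"}, GROUPS, RULES))
```

Groups with no members on the failed host drop out of the result, which matches the point that the rule only matters when a failure takes out members of more than one group.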

After they’ve been saved, you should see tasks kick off indicating that these rules are being populated on the underlying ESXi hosts.

In my case, I’m running vSAN, and since vSAN and HA are very closely coupled, the HA rules serve as vSAN cluster updates as well. By the way, this is another opportunity to exercise best practice with a distributed or enterprise vRealize Automation stack: we need to ensure machines of like tier are separated across hosts to increase availability. That is also done here, by specifying anti-affinity rules that keep the App servers apart from each other, and likewise the IaaS servers and the others. My full complement of rules includes both the group dependency rules and the anti-affinity rules.
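
As a quick illustration of what the anti-affinity rules are protecting against, the following Python sketch checks a VM-to-host placement for like-tier VMs that share a host. It is a standalone model rather than anything DRS or HA exposes; the host names (esx-a, esx-b), the group names, and the placement data are made up for the example.

```python
from collections import defaultdict


def anti_affinity_violations(placement, groups):
    """Return {group: {host: [vms]}} for every host running 2+ VMs of one group.

    placement maps VM name -> host name; groups maps group name -> set of VMs.
    """
    violations = {}
    for group, vms in groups.items():
        by_host = defaultdict(list)
        for vm in sorted(vms):
            if vm in placement:
                by_host[placement[vm]].append(vm)
        crowded = {host: names for host, names in by_host.items() if len(names) > 1}
        if crowded:
            violations[group] = crowded
    return violations


if __name__ == "__main__":
    # Hypothetical placement: both App servers have ended up on the same host.
    placement = {
        "App01": "esx-a", "App02": "esx-a",
        "IaaS01": "esx-a", "IaaS02": "esx-b",
    }
    groups = {"app": {"App01", "App02"}, "iaas": {"IaaS01", "IaaS02"}}
    print(anti_affinity_violations(placement, groups))
```

If both App servers sit on one host, losing that host takes the whole front-end tier down at once, so the ordering rules would then be restarting an entire tier instead of one member.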

Now that we have the VM groups and the orchestration rules, let’s configure a couple of other important points to make this stack function better. In vRA, the front-end (café) appliance(s) usually take some time to boot up because of the number of services that are involved. This process, even with a well-performing infrastructure, can still take several minutes to complete, so we should complement these orchestrated restart rules with a delay that’ll properly allow the front-end to start up before attempting to start other tiers. After all, there’s no point starting other tiers if they have to be restarted manually later because the first tier isn’t yet ready for action.

Let’s go down to VM Overrides and add a couple of rules. This is something else that’s great about vSphere 6.5: the ability to fine-tune how HA restarts VMs based on conditions. Add a new rule and put both App servers in there.

There are three key things we want to change. First, the VM restart priority. By default, an HA cluster has a Medium restart priority where everything is of equal weight. We want to set the front-end appliances a bit higher than that because they serve as the login portal, so HA should favor them when allocating resources for restarts. Next, the “start next priority VMs when” setting allows us to instruct HA when to begin starting VMs in the next rule. There are a few options here.

The default in the cluster, unless it’s overridden, is “Resources allocated,” which simply means as soon as resources have been set aside for the VM on a host, so basically immediately. “Powered On” waits for confirmation that the VM was actually powered on rather than just attempted. But the one that’s extremely helpful here, and the one I’d suggest setting, is “Guest Heartbeats detected.” This setting allows ESXi to listen for heartbeats from VMware Tools, which is usually a good indicator that the VM in question has reached a suitable run state for its applications to initialize.

Then, back in the list, an additional delay of 120 seconds gives the front-end services further time to start before HA attempts to start any IaaS servers. If guest heartbeats are still not detected once the configured timeout elapses, HA stops waiting and starts the other VMs anyway. This is extremely helpful when you need all members to come up, even at the expense of some pieces maybe needing to be rebooted again. Rinse and repeat for your second tier containing your IaaS layer. Using the same settings as the front-end tier is just fine.
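
The interaction between the restart condition, the additional delay, and the timeout can be expressed as a small Python function. Treat this as a simplified model under assumptions, not HA’s actual logic: the 600-second timeout is a placeholder value (check what your cluster or override is set to), and the model assumes the additional delay is only applied when the condition was actually met.

```python
def next_tier_start(condition_met_at, additional_delay=120, timeout=600):
    """Seconds after the current tier's restart at which the next tier is released.

    condition_met_at: seconds until guest heartbeats were detected,
                      or None if they were never detected.
    additional_delay: extra wait after the condition is met (120 s in this article).
    timeout:          maximum time to wait for the condition (assumed value).
    """
    if condition_met_at is not None and condition_met_at <= timeout:
        # Condition met in time: wait the additional delay, then move on.
        return condition_met_at + additional_delay
    # Condition never met (or met too late): move on when the timeout expires.
    return timeout


if __name__ == "__main__":
    print(next_tier_start(120))   # heartbeats after 2 minutes
    print(next_tier_start(None))  # heartbeats never detected
```

With heartbeats detected two minutes in, the next tier is released at the four-minute mark in this model. With no heartbeats at all, the next tier is released when the timeout expires, so the rest of the stack still comes up.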

Great! Now the only thing left is to test. I’ll kill a host in my lab to see what happens. Obviously, you may not want to do this in a production situation.

At 7:55pm, I killed the first host (10.10.40.246), which was running App01, IaaS01, and Agent01. Here’s the state before.

Now after the host has died and vCenter acknowledges that, the membership looks like the following.

Those VMs show disconnected with an unknown status. Let’s see how HA behaves.

Ok, good, so at 8:00pm it started up App01 as it should have once vSAN updated its cluster status. Normally this failover is a bit quicker when not using vSAN.

Next, when guest heartbeats were detected, the 2-minute countdown started.

So at 8:04, it then started up IaaS01 followed by Agent01 similarly. After a few minutes, the stack is back up and functional.
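
For anyone who wants to sanity-check that timeline, the arithmetic is short enough to write out in Python. The timestamps are the ones reported in this test, rewritten in 24-hour form. Because they are only accurate to the minute, the implied time to first guest heartbeat is a rough estimate, and the variable names are just for illustration.

```python
from datetime import datetime


def minutes_between(earlier, later):
    """Whole minutes between two same-day HH:MM timestamps (24-hour clock)."""
    fmt = "%H:%M"
    delta = datetime.strptime(later, fmt) - datetime.strptime(earlier, fmt)
    return int(delta.total_seconds() // 60)


# Timestamps reported in the test above, in 24-hour form.
HOST_KILLED = "19:55"
APP01_STARTED = "20:00"
IAAS01_STARTED = "20:04"
ADDITIONAL_DELAY_MIN = 2  # the 120-second additional delay from the override

failover = minutes_between(HOST_KILLED, APP01_STARTED)      # host failure to first restart
tier_gap = minutes_between(APP01_STARTED, IAAS01_STARTED)   # tier 1 restart to tier 2 restart
heartbeat_wait = tier_gap - ADDITIONAL_DELAY_MIN             # implied wait for guest heartbeats

if __name__ == "__main__":
    print("failover minutes:", failover)
    print("gap between tiers:", tier_gap)
    print("implied heartbeat wait:", heartbeat_wait)
```

The four-minute gap between App01 and IaaS01 is consistent with roughly two minutes of waiting for guest heartbeats plus the two-minute additional delay.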

Pretty great enhancements in vSphere 6.5 related to availability if you ask me, and all these features are extremely handy when running vRA on top.

I hope this has been a useful illustration of a couple of the new features in vSphere 6.5 and how you can leverage them to provide even greater availability to vRealize Automation. Go forward and use these features anywhere you have application dependencies, and if you aren’t on vSphere 6.5 yet, start planning for it now!
