
Virtualizing Microsoft Clustering Services (MSCS)-Windows 2012 on vSphere Best Practices


Hi All ...

 

Microsoft Clustering Services (MSCS) is one of the oldest HA solutions in the IT world and one of the hardest to configure. Although I have limited personal experience with MS Failover Clustering, I know the severe pain of deploying, testing and troubleshooting this solution. Microsoft has developed it considerably since its first version; the versions available now are MS Clustering Service on Windows 2008 R2, Windows 2012 and Windows 2012 R2. With vSphere 5.x, MSCS can be virtualized and is fully supported by Microsoft.

In this part, we'll talk about MSCS on Windows 2012 and best practices for deploying it in vSphere 5.x environments. These best practices are collected from published VMware best practices guides and the Microsoft best practices guide for MSCS on Windows 2012. I followed the same style as the previous post and divided them into six categories: the Design Qualifiers (AMPRS - Availability, Manageability, Performance, Recoverability and Security) plus Scalability.


Availability:

1-) Use vSphere HA with MSCS to provide an additional level of availability to your protected application.

 

2-) Use vSphere DRS in Partially Automated mode with MSCS so that clustered VMs get automatic placement only when they are powered on. Clustered VMs use SCSI bus sharing, which means they must not be migrated with vMotion, so fully automated DRS load balancing can't be used for them. If the vSphere cluster hosting the clustered VMs is set to Fully Automated DRS, change the VM-specific DRS automation level of the clustered VMs to Partially Automated (see the sketch after this list).

 

3-) Affinity Rules:

With a Cluster-in-a-box configuration, use a VM-VM affinity rule to keep all clustered VMs together on the same host. With a Cluster-across-boxes or Physical-Virtual cluster, use a VM-VM anti-affinity rule to keep the VMs on different hosts. vSphere HA doesn't respect VM affinity/anti-affinity rules by default, so when a host fails HA may violate them. In vSphere 5.1, set the cluster advanced option "ForceAffinePoweron" to 1 so that affinity rules are respected. In vSphere 5.5, set "ForceAffinePoweron" to 1 and "das.respectVmVmAntiAffinityRules" to true so that affinity and anti-affinity rules, respectively, are respected (the sketch after this list covers both options).

 

4-) Use VM Monitoring (part of vSphere HA) to watch the VMware Tools heartbeats of the clustered VMs and restart them in case of a guest OS failure.
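Below is a minimal pyVmomi (vSphere Python SDK) sketch of the settings from points 2 and 3 above, not a definitive implementation: the vCenter address, credentials, cluster name and node names are placeholders, and the affinity/anti-affinity rules themselves are assumed to exist already (they can be created through the same ClusterConfigSpecEx via its rulesSpec property).

```python
# Minimal sketch, assuming pyVmomi is installed and the placeholder names below
# (vCenter, cluster, node VMs) are replaced with real ones.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="***", sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()

def find(vimtype, name):
    """Return the first managed object of the given type with the given name."""
    view = content.viewManager.CreateContainerView(content.rootFolder, [vimtype], True)
    try:
        return next(o for o in view.view if o.name == name)
    finally:
        view.Destroy()

cluster = find(vim.ClusterComputeResource, "Prod-Cluster")
nodes = [find(vim.VirtualMachine, n) for n in ("MSCS-Node1", "MSCS-Node2")]

spec = vim.cluster.ConfigSpecEx(
    # Per-VM DRS override: placement at power-on only, no automatic vMotion.
    drsVmConfigSpec=[
        vim.cluster.DrsVmConfigSpec(
            operation="add",
            info=vim.cluster.DrsVmConfigInfo(
                key=vm, enabled=True, behavior="partiallyAutomated"))
        for vm in nodes],
    # DRS advanced option so affinity rules are honoured at power-on (5.1+).
    drsConfig=vim.cluster.DrsConfigInfo(
        option=[vim.option.OptionValue(key="ForceAffinePoweron", value="1")]),
    # HA advanced option so anti-affinity rules are honoured (5.5).
    dasConfig=vim.cluster.DasConfigInfo(
        option=[vim.option.OptionValue(key="das.respectVmVmAntiAffinityRules",
                                       value="true")]))

WaitForTask(cluster.ReconfigureComputeResource_Task(spec=spec, modify=True))
Disconnect(si)
```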

 


Performance:

1-) Memory Sizing:

Don't over-commit memory on ESXi hosts running clustered VMs. Memory over-commitment can cause short pauses in these VMs, which are very sensitive to time delays, and can trigger false failovers. One way to guarantee this for the cluster nodes themselves is to reserve all of their configured memory, as in the sketch below.
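A minimal sketch of that reservation approach, reusing the pyVmomi session and find() helper from the Availability sketch above; the node name in the commented usage line is a placeholder.

```python
# Minimal sketch: reserve all of a clustered VM's configured memory so the
# host never balloons or swaps its pages.
from pyVmomi import vim
from pyVim.task import WaitForTask

def reserve_all_memory(vm):
    # memoryReservationLockedToMax keeps the reservation equal to the
    # configured memory size, even if the VM is resized later.
    spec = vim.vm.ConfigSpec(memoryReservationLockedToMax=True)
    WaitForTask(vm.ReconfigVM_Task(spec=spec))

# reserve_all_memory(find(vim.VirtualMachine, "MSCS-Node1"))
```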

 

2-) Storage:

a- SCSI Driver:

Supported SCSI driver per guest OS (Windows):

- LSI Logic Parallel: Windows 2003 SP1 or SP2 (32/64-bit)

- LSI Logic SAS: Windows 2008 SP2 or 2008 R2 SP1 (32/64-bit)

- LSI Logic SAS: Windows 2012 (vSphere 5.5.x) or Windows 2012 R2 (vSphere 5.5 U1 or later)

Keep in mind that the guest OS disk and the shared quorum disk must sit on separate SCSI controllers, i.e. the OS disk on SCSI (0:x) and the shared disk on SCSI (1:x); a sketch of this layout follows.
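A minimal pyVmomi sketch of that layout, reusing the session and find() helper from the Availability sketch: the guest OS disk stays on the default controller at SCSI(0:x), while the shared quorum disk gets its own LSI Logic SAS controller at SCSI(1:0) with physical bus sharing (the cluster-across-boxes case). The naa.* device path, the size argument and the sharing mode are placeholders.

```python
# Minimal sketch: add a second LSI Logic SAS controller with bus sharing and
# attach the shared quorum RDM to it at SCSI(1:0).
from pyVmomi import vim
from pyVim.task import WaitForTask

def add_shared_quorum_rdm(vm, rdm_device_name, size_kb):
    ctrl = vim.vm.device.VirtualDeviceSpec(
        operation="add",
        device=vim.vm.device.VirtualLsiLogicSASController(
            key=-101,                      # temporary negative key
            busNumber=1,                   # -> SCSI(1:x)
            sharedBus="physicalSharing"))  # "virtualSharing" for cluster-in-a-box

    disk = vim.vm.device.VirtualDeviceSpec(
        operation="add",
        fileOperation="create",            # create the RDM mapping file
        device=vim.vm.device.VirtualDisk(
            key=-102,
            controllerKey=-101,            # attach to the new controller
            unitNumber=0,                  # SCSI(1:0)
            capacityInKB=size_kb,          # size of the mapped LUN
            backing=vim.vm.device.VirtualDisk.RawDiskMappingVer1BackingInfo(
                deviceName=rdm_device_name,        # e.g. "/vmfs/devices/disks/naa.xxxx"
                compatibilityMode="physicalMode",  # or "virtualMode"
                diskMode="independent_persistent",
                fileName="")))                     # mapping file stored with the VM

    WaitForTask(vm.ReconfigVM_Task(
        spec=vim.vm.ConfigSpec(deviceChange=[ctrl, disk])))
```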


b- Disk Types for OS Disks:

For the OS disks of clustered VMs, it's recommended to use thick-provisioned disks instead of thin-provisioned ones for maximum performance.

 

c- Disk Types Supported for Shared Quorum Disk:

Supported quorum disk types per vSphere version, cluster configuration type and guest OS (Windows):

vSphere 5.x:

- Cluster-in-a-box (recommended configuration): Windows 2003 SP1/SP2, 2008 SP2 or 2008 R2 SP1; disk type: eager-zeroed thick-provisioned virtual disk (.vmdk), local or on FC SAN; SCSI bus sharing: virtual.

- Cluster-in-a-box: Windows 2003 SP1/SP2, 2008 SP2 or 2008 R2 SP1; disk type: virtual-mode RDM on FC SAN; SCSI bus sharing: virtual.

- Cluster-across-boxes (recommended configuration): Windows 2003 SP1/SP2, 2008 SP2 or 2008 R2 SP1; disk type: physical-mode RDM on FC SAN; SCSI bus sharing: physical.

- Cluster-across-boxes: Windows 2003 SP1/SP2; disk type: virtual-mode RDM on FC SAN; SCSI bus sharing: physical.

- Physical-Virtual: Windows 2003 SP1/SP2, 2008 SP2 or 2008 R2 SP1; disk type: physical-mode RDM on FC SAN; SCSI bus sharing: physical.

vSphere 5.5 only:

- Cluster-in-a-box (recommended configuration): Windows 2008 SP2, 2008 R2 SP1, 2012 or 2012 R2 (2012 R2 requires vSphere 5.5 U1); disk type: eager-zeroed thick-provisioned virtual disk (.vmdk), local or on iSCSI/FCoE SAN; SCSI bus sharing: virtual.

- Cluster-in-a-box: same OS versions; disk type: virtual-mode RDM on iSCSI/FCoE SAN; SCSI bus sharing: virtual.

- Cluster-across-boxes (recommended configuration): same OS versions; disk type: physical-mode RDM on iSCSI/FCoE SAN; SCSI bus sharing: physical.

- Physical-Virtual: same OS versions; disk type: physical-mode RDM on iSCSI/FCoE SAN; SCSI bus sharing: physical.

Keep in mind that:

- In-guest iSCSI target sharing for the quorum disk is supported for any clustering configuration and any of the supported OS versions.

- vSphere 5.5.x also supports in-guest FCoE target sharing for the quorum disk.

- Mixing Cluster-across-boxes and Cluster-in-a-box configurations isn't supported, nor is mixing different versions of vSphere in a single cluster.

- Mixing different storage protocols to reach the quorum disk isn't supported, e.g. the first node connected to the quorum disk over iSCSI while the second is connected over FC.

- Mixing different initiator types for the same storage protocol is supported only on vSphere 5.5.x, e.g. Host 1 connecting with the software iSCSI initiator while Host 2 uses a hardware iSCSI initiator; the same goes for FCoE.


d- Mark the shared RDM LUN that holds the quorum disk as perennially reserved on every ESXi host that participates in the cluster or can see the LUN, to prevent those hosts from taking a long time to boot or to rescan storage (a sketch follows the KB link below).

Check the following KB for more information:

VMware KB: ESXi/ESX hosts with visibility to RDM LUNs being used by MSCS nodes with RDMs may take a long time to sta…
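A minimal sketch of automating that setting across hosts, assuming SSH is enabled on the hosts and the paramiko library is installed; the host names, credentials and the naa.* device identifier are placeholders, and the esxcli syntax is the 5.x form documented in the KB above. The same command can of course be run by hand in each host's ESXi shell.

```python
# Minimal sketch: set --perennially-reserved=true for the quorum RDM device
# on every ESXi host that can see it.
import paramiko

HOSTS = ["esxi01.example.com", "esxi02.example.com"]
DEVICE = "naa.xxxxxxxxxxxxxxxx"   # placeholder device identifier

for host in HOSTS:
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(host, username="root", password="***")
    cmd = ("esxcli storage core device setconfig "
           "-d {} --perennially-reserved=true".format(DEVICE))
    stdin, stdout, stderr = ssh.exec_command(cmd)
    print(host, stdout.read().decode(), stderr.read().decode())
    ssh.close()
```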


e- Storage Array Multi-pathing Policy:

For clustered-VM configurations on vSphere 5.1, which require an FC SAN, a specific multi-pathing policy must be set to control how the ESXi hosts connect to that FC SAN:

Multi-pathing plugin: NMP. Path Selection Policy per SAN type:

- Generic: Round Robin.

- EMC Clariion / EMC VNX (using SATP: ALUA_CX): Fixed.

- IBM 2810XIV (using SATP: ALUA): MRU.

- IBM 2810XIV, Hitachi, NetApp Data ONTAP 7-Mode (using SATP: Default_AA): Fixed.

- EMC Symmetrix (using SATP: SYMM): Fixed.

In vSphere 5.5 or later, this issue was resolved, according to both VMware KB: Using the PSP_RR path selection policy with MSCS results in quorum disk problems and VMware KB: MSCS support enhancements in vSphere 5.5.
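A hedged pyVmomi sketch of setting the PSP for the quorum RDM on one host, reusing the session and find() helper from the Availability sketch; the host name, device identifier and policy string are placeholders and should follow the table above for your array.

```python
# Minimal sketch: set the path selection policy of one device on one host.
from pyVmomi import vim

def set_psp(host, canonical_name, psp="VMW_PSP_FIXED"):
    ss = host.configManager.storageSystem
    # Map the device's ScsiLun key to its multipath logical-unit entry.
    lun_key = next(l.key for l in ss.storageDeviceInfo.scsiLun
                   if l.canonicalName == canonical_name)
    mp_lun = next(l for l in ss.storageDeviceInfo.multipathInfo.lun
                  if l.lun == lun_key)
    ss.SetMultipathLunPolicy(
        lunId=mp_lun.id,
        policy=vim.host.MultipathInfo.LogicalUnitPolicy(policy=psp))

# set_psp(find(vim.HostSystem, "esxi01.example.com"), "naa.xxxxxxxxxxxxxxxx")
```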

 

3-) Guest Disk IO Timeout:

From inside the guest OS, it's recommended to increase the disk I/O timeout to more than 60 seconds via the following registry value:

“HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Disk\TimeOutValue”.
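A minimal sketch of that change, assuming it runs inside the Windows guest with administrative rights; the value 190 is only an example of a setting above 60 seconds.

```python
# Minimal sketch: raise the guest disk I/O timeout via the registry.
import winreg

KEY_PATH = r"System\CurrentControlSet\Services\Disk"

with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH, 0,
                    winreg.KEY_SET_VALUE) as key:
    # TimeOutValue is a REG_DWORD expressed in seconds.
    winreg.SetValueEx(key, "TimeOutValue", 0, winreg.REG_DWORD, 190)
```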

 

 

4-) Network:

a- Choose the latest vNIC type available to the guest OS. VMXNET3 is preferred for both the private and public networks; it gives the highest throughput with the lowest latency and CPU overhead.

b- Back each port group with at least two physical NICs for redundancy and NIC teaming. Connect each physical NIC to a different physical switch for maximum redundancy.

c- Consider separating the different traffic types (vMotion, management, production, Fault Tolerance, etc.), either physically or logically using VLANs.

d- Clustered VMs should have two vNICs: one for the public network and one for the heartbeat network. For Cluster-across-boxes, back the heartbeat port group with two physical NICs for redundancy (a sketch adding the heartbeat vNIC follows).
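A minimal sketch of adding the second (heartbeat) vNIC as VMXNET3, reusing the pyVmomi session and find() helper from the Availability sketch; the port group name "Heartbeat-PG" is a placeholder for a standard vSwitch port group.

```python
# Minimal sketch: add a VMXNET3 vNIC connected to the heartbeat port group.
from pyVmomi import vim
from pyVim.task import WaitForTask

def add_heartbeat_nic(vm, portgroup_name="Heartbeat-PG"):
    nic = vim.vm.device.VirtualDeviceSpec(
        operation="add",
        device=vim.vm.device.VirtualVmxnet3(
            backing=vim.vm.device.VirtualEthernetCard.NetworkBackingInfo(
                deviceName=portgroup_name),
            connectable=vim.vm.device.VirtualDevice.ConnectInfo(
                startConnected=True, connected=True)))
    WaitForTask(vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(deviceChange=[nic])))

# add_heartbeat_nic(find(vim.VirtualMachine, "MSCS-Node1"))
```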


 

Manageability:

1-) Time Sync:
Time synchronization is one of the most important things in clustered environments. It's recommended to do the following:

a- Let all your clustered VMs sync their time with the domain controllers only, not with VMware Tools.

b- Completely disable time sync between the clustered VMs and their hosts through VMware Tools (even after you uncheck the box on the VM settings page, the VM can still sync with the host through VMware Tools during startup, resume, snapshot operations, etc.), according to the following KB (a configuration sketch follows item c):

VMware KB: Disabling Time Synchronization  

c- Sync all ESXi hosts in the virtual infrastructure to the same Stratum 1 NTP server, which should be the same time source as your forest/domain.
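A hedged sketch of item b, pushing the VMware Tools time-sync settings into a clustered VM's configuration via pyVmomi extraConfig, reusing the session and find() helper from the Availability sketch. The option names below are the ones listed in the KB for 5.x; treat them as an assumption and verify them against the KB revision that matches your build.

```python
# Minimal sketch: disable both periodic and event-driven Tools time sync.
from pyVmomi import vim
from pyVim.task import WaitForTask

TIME_SYNC_OFF = {
    "tools.syncTime": "0",                 # periodic sync
    "time.synchronize.continue": "0",      # event-driven sync options below
    "time.synchronize.restore": "0",
    "time.synchronize.resume.disk": "0",
    "time.synchronize.shrink": "0",
    "time.synchronize.tools.startup": "0",
}

def disable_tools_time_sync(vm):
    spec = vim.vm.ConfigSpec(
        extraConfig=[vim.option.OptionValue(key=k, value=v)
                     for k, v in TIME_SYNC_OFF.items()])
    WaitForTask(vm.ReconfigVM_Task(spec=spec))

# disable_tools_time_sync(find(vim.VirtualMachine, "MSCS-Node1"))
```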

 

2-) Supported OS’s and Number of Nodes:

- 2 nodes: Windows 2003 SP1 or SP2 (32/64-bit), per vSphere 5.1 hosts.

- 2 nodes: FCoE SAN hosting the quorum disk, with vSphere 5.1 U2 and Windows 2008/2012.

- 5 nodes: Windows 2008 SP2 or 2008 R2 SP1 (32/64-bit).

- 5 nodes: Windows 2012 (vSphere 5.5.x) or Windows 2012 R2 (vSphere 5.5 U1 or later).

- 5 nodes: FC SAN hosting the quorum disk, with vSphere 5.1 U2 and Windows 2012.

 

 

Recoverability:

1-) Maintain a proper backup/restore plan. This helps in case of total corruption of a cluster node that requires a full restore onto bare metal or a VM. Also, regularly test restoring your backup sets to verify their effectiveness.

2-) Maintain a proper DR/BC plan. Clustering configurations won't help much in a total data center failure. Test your DR/BC plan from time to time, at least twice per year.

 

 

Security:

1-) All security procedures used to secure physical Microsoft clusters should also be applied to the clustered VMs, such as role-based access policies.

2-) Follow the VMware Hardening Guide (v5.1/v5.5) for additional procedures to secure both your VMs and vCenter Server.


 

Scalability:

For greater scalability, consider upgrading your clustered VMs to Windows Server 2012. With vSphere 5.5.x and Windows Server 2012, the quorum disk can be hosted on an iSCSI or FCoE SAN, and the issue with the Round Robin PSP is solved (under certain conditions mentioned in this KB).

 

 

I know your mind is twisted by this point, but that's MSCS as we know it, and unfortunately it carries the same configuration complexity with it into the virtual world on vSphere 5.1/5.5. I hope this guide makes it a little easier to configure your Microsoft cluster on vSphere. For more details or further explanation, refer to the References section.

Share the knowledge ...

 

References:

-- Virtualizing MS Business Critical Applications by Matt Liebowitz and Alex Fontana.

-- Virtualizing MS Clustering Services on vSphere 5.1.
-- Virtualizing MS Clustering Services on vSphere 5.5.

-- vSphere Design, 2nd Edition (Sybex), by Scott Lowe, Kendrick Coleman and Forbes Guthrie.

