
vSphere 5.x Notes & Tips - Part VII: vSphere Datastore Configuration - Advanced/Troubleshooting:


Hi All ...
In this seventh part of our series, we'll go through several advanced configurations of vSphere Datastores, such as Datastore Clusters, Storage I/O Control (SIOC) and Storage vMotion.

Please bear with me, as this is one of the longer parts of the series.

Credits:

  • Cormac Hogan
  • Duncan Epping
  • Chad Sakac
  • Frank Denneman
  • Ashraf Al-Dabbas
  • Michael Webster
  • Scott Lowe

Now, Let's Start...

 

 

1. Locking Mechanisms:

At any given time, only one host may control the storage array while it changes the array's metadata; this is where locking mechanisms are required. The first is SCSI Reservation, in which an ESXi host locks the entire datastore to modify its metadata. This can happen in many operations, such as:

1-) Creating, deleting or modifying VMDK files.

2-) Modifying a VMFS datastore.

3-) Creating or deleting snapshots.

Under heavy VMFS updates or snapshot activity, SCSI Reservation Conflicts can occur, leading to a loss of performance. This can be reduced by VAAI, which brings us to the second mechanism: Atomic Test and Set (ATS). Available only on VAAI-aware storage arrays, ATS lets an ESXi host lock just the relevant disk sector instead of the entire datastore; it's the standard locking mechanism on VMFS-5 datastores backed by VAAI-capable arrays.
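To check whether a given device's array supports ATS (and the other VAAI primitives), you can query its VAAI status from the ESXi shell; the device identifier below is hypothetical:

    # Lists ATS, Clone, Zero and Delete status for the device:
    esxcli storage core device vaai status get -d naa.60000000000000000000000000000001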

A nice official article on the VMware vSphere blog by Cormac Hogan describes it clearly:

VMFS Locking Uncovered | VMware vSphere Blog - VMware Blogs

 

 

2. Volume Extent:

An official VMware article by Cormac Hogan describes VMFS Volume Extents:

VMFS Extents - Are they bad, or simply misunderstood? | VMware vSphere Blog - VMware Blogs

 

 

3. Boot from SAN:

Boot from SAN is supported when the ESXi host is connected to the array over Fibre Channel, FCoE or iSCSI. For iSCSI, it requires either a hardware iSCSI initiator or, with the software initiator, a network adapter that supports iBFT (iSCSI Boot Firmware Table).

 

 

4. Adding a Snapshot/a Clone of a LUN:

When you present a copy of a LUN (an array snapshot or clone), ESXi detects that the VMFS signature doesn't match the device and offers two options. When adding a clone to replace a failed original, you keep the original signature of the LUN (force-mount). When adding a snapshot alongside the still-present original, you change the original signature to a new one (resignature), since duplicate signatures aren't allowed.
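The corresponding ESXi 5.x shell commands, as a quick sketch (the volume label is hypothetical):

    # List detected snapshot/replica volumes:
    esxcli storage vmfs snapshot list

    # Mount a copy while keeping its existing signature (original must not be present):
    esxcli storage vmfs snapshot mount -l "DS01"

    # Assign a new signature so the copy can coexist with the original:
    esxcli storage vmfs snapshot resignature -l "DS01"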

 

 

5. Storage Queues:

A storage queue holds the I/O requests waiting to be processed. These queues live in several places along the I/O path, such as:

1-) Host kernel: this queue is adjustable and has a default depth of 32 I/O requests.

2-) Storage array: this queue depends on the array itself.

The deeper the queues fill, the higher the latency.
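To see a device's queue depth and live queue statistics on a host (the device identifier is hypothetical):

    # Shows "Device Max Queue Depth" among other properties:
    esxcli storage core device list -d naa.60000000000000000000000000000001

    # Interactively: run esxtop, press 'u' for the disk-device view and watch
    # DQLEN (queue depth), ACTV (active I/Os) and QUED (queued I/Os):
    esxtop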

The following nice article by Duncan Epping describes it clearly:

No one likes queues - Yellow Bricks

Another deep-dive article by Chad Sakac is also worth reading:

VMware I/O queues, micro-bursting, and multipathing - Virtual Geek

Last, Frank Denneman wrote a really nice article about how to calculate the optimal queue depth:

Increasing the queue depth? - frankdenneman.nl

 

 

6. Datastores Clusters & Storage Distributed Resource Scheduler (SDRS):

Datastore clusters are similar to host clusters, but with datastores instead of hosts. A datastore cluster must be homogeneous, i.e. contain either NFS datastores or VMFS datastores, not a mix. Storage DRS (SDRS) is a feature similar to DRS but at the storage level, balancing on storage space and I/O latency. It leverages Storage vMotion (SvMotion) to move VMs' disks between datastores to keep space usage and I/O latency in check. This nice and simple article by Ashraf Al-Dabbas is a good entry point for Datastore Clusters & SDRS:

VMware Storage DRS | Datastore Clusters

The following series of articles by Frank Denneman deep-dives into datastore clusters, their connectivity types (fully and partially connected) and Storage DRS (SDRS):

Architecture and design of Datastore clusters - frankdenneman.nl

Storage DRS Partially connected datastore clusters

Impact of load balancing on datastore cluster configuration - frankdenneman.nl

Storage DRS and Multi-extents datastores - frankdenneman.nl

Connecting multiple DRS clusters to a single Storage DRS datastore cluster. - frankdenneman.nl

Aggregating datastores from multiple storage arrays into one Storage DRS datastore cluster. - frankdenneman.nl

Last, keep in mind that SDRS always tries to keep both space utilization and I/O latency below their configured thresholds: by default it evaluates I/O latency every 8 hours and space utilization every 5 minutes, and SDRS recommendations are generated during the 8-hour invocation.
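As a rough worked example, assuming the vSphere 5.x default thresholds of 80% space utilization and 15 ms I/O latency: on a 2 TB datastore, SDRS starts considering Storage vMotion recommendations once used space passes about 1.6 TB, or when the 8-hour evaluation shows average I/O latency above 15 ms.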

 

 

7. Storage vMotion Datamovers in Different Types of Datastores:

A Data Mover is the VMkernel component that moves data from place to place during Storage vMotion or other storage migrations (e.g. clones). There are three types of vSphere datamovers: FSDM, FS3DM and FS3DM with hardware offload (VAAI).

The following article by Duncan Epping describes the difference between them from a SAN (VMFS-based) datastore perspective:

Blocksize impact? - Yellow Bricks

For the NAS (NFS-based) datastore perspective, read the following article by Michael Webster:

VMware Storage vMotion, Data Movers, Thin Provisioning, Barriers to Monster VMs - Long White Virtu…
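Any operation that copies a VMDK end-to-end exercises one of these datamovers; a simple way to trigger one from the ESXi shell is a vmkfstools clone (the paths and disk names below are hypothetical):

    # Clone a virtual disk to another datastore as thin-provisioned:
    vmkfstools -i /vmfs/volumes/ds1/vm1/vm1.vmdk /vmfs/volumes/ds2/vm1/vm1-clone.vmdk -d thin

Whether the copy runs through FSDM, FS3DM or is offloaded to the array depends on factors such as VAAI support and the source/destination block sizes, as the articles above explain.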



8. vMotion and Storage vMotion Cost Rules:

VMware uses the concept of a "cost" for each vMotion or SvMotion operation that takes place. Each host, datastore and physical NIC has its own budget, from which the cost of every concurrent vMotion or SvMotion operation is deducted. This controls how many operations can run simultaneously, hence the limit of eight simultaneous vMotion operations and only two simultaneous SvMotion operations per host, among other limits.

For more details, check the following articles about vMotion and Storage vMotion Cost Rules by Frank Denneman on his blog:

Limiting the number of Storage vMotions - frankdenneman.nl

Limiting the number of concurrent vMotions - frankdenneman.nl

Keep in mind that, as stated by Frank himself, changing these limits is not supported by VMware.

 

 

9. Storage IO Control (SIOC):

Storage I/O Control (SIOC), introduced in vSphere 4.1, distributes the available storage resources to virtual machines in proportion to their shares when a datastore becomes congested. The following technical paper from VMware covers SIOC in vSphere 4.1; unfortunately, it hasn't been updated for vSphere 5.5:

http://www.vmware.com/files/pdf/techpaper/VMW-vSphere41-SIOC.pdf
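As a simplified worked example of proportional shares (the numbers are illustrative): suppose congestion is detected on a datastore whose device queue depth is 64, with VM A holding 1,000 shares and VM B holding 2,000. SIOC divides the I/O slots 1:2, so A ends up with roughly 21 slots and B with roughly 43.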

Next, a nice article by Duncan Epping about SIOC:

Storage I/O Control, the basics

For the enhancements made to SIOC in vSphere 5.1, read the following articles by Cormac Hogan:

vSphere 5.1 Storage Enhancements - Part 8: Storage I/O Control | CormacHogan.com

Storage I/O Control - Workload Injector Behaviour | CormacHogan.com

Last, the following article describes the relationship between SIOC and Storage DRS and how they co-operate in vSphere 5.1; it likely applies to vSphere 5.5 as well:

vSphere 5.1 Storage DRS load balancing and SIOC threshold enhancements - frankdenneman.nl

 

 

10. Pluggable Storage Architecture (PSA):

The PSA framework is an important part of the VMkernel, responsible for claiming storage devices (except NFS-based storage), multipathing, I/O routing and more. It's designed to integrate with third-party modules from hardware vendors, so the maximum capabilities of any storage device can be used.

The following PSA deep-dive series by Cormac Hogan is really great, as he explains every single detail clearly:

Pluggable Storage Architecture (PSA) Deep-Dive: Part 1 | CormacHogan.com

Pluggable Storage Architecture (PSA) Deep-Dive: Part 2 | CormacHogan.com

Pluggable Storage Architecture (PSA) Deep-Dive: Part 3 | CormacHogan.com

Pluggable Storage Architecture (PSA) Deep-Dive: Part 4 | CormacHogan.com

Automating the IOPS setting in the Round Robin PSP | CormacHogan.com

The following VMware blog article, also by Cormac Hogan, summarizes how PSA routes an I/O operation:

Path failure and related SATP/PSP behaviour | VMware vSphere Blog - VMware Blogs

I summarized that post in the following points, describing how the PSA framework routes an I/O request from a storage device managed by the default Native Multipathing Plugin (NMP); see the esxcli commands after the list for inspecting these components on a host:

1-) When an I/O request leaves a VM, the Native Multipathing Plugin (NMP) calls the Path Selection Plugin (PSP) assigned to the device to handle it.

2-) The PSP determines which path will be used to carry the I/O request.

3-) If the I/O operation finishes successfully, NMP reports success and the next I/O request is handled.

4-) If the I/O operation finishes with an error, NMP calls the Storage Array Type Plugin (SATP) to interpret the error code.

5-) The SATP interprets the error code and, if needed, marks the path dead and activates an inactive one.

6-) NMP calls the PSP again to handle the I/O request, a new path is selected instead of the dead one, and the operation is retried.
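To inspect these components on an ESXi 5.x host, the following esxcli commands show which SATP and PSP the NMP has assigned and the state of each path; the device identifier is hypothetical:

    # Per-device SATP and PSP assignments:
    esxcli storage nmp device list

    # Available SATPs and their default PSPs:
    esxcli storage nmp satp list

    # Available PSPs:
    esxcli storage nmp psp list

    # Paths (and their active/dead states) for a single device:
    esxcli storage nmp path list -d naa.60000000000000000000000000000001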

 

 

11. vSphere APIs for Array Integration (VAAI):

This API is used for integration between supported storage array models and ESXi hosts, offloading some storage operations to be executed at the array level instead of the host level. This gives much better performance and reduces host CPU overhead. The following white paper from VMware covers VAAI:

http://www.vmware.com/files/pdf/techpaper/VMware-vSphere-Storage-API-Array-Integration.pdf

Another VMware KB article about VAAI covers a lot of FAQs:

VMware KB: Frequently Asked Questions for vStorage APIs for Array Integration

VAAI can be used in the following operations:

1-) Accelerating processes such as cloning, creating VMs from templates, writing to thick or thin disks, and Storage vMotion.

2-) Handling VMFS clustered locking (ATS locking) and metadata operations for VMs.

3-) Creating thick or thin disks on NFS storage (requires a vendor software plugin to be installed on the ESXi hosts).

VAAI can’t be used in the following cases:

1-) The datastore spans different types of arrays (by using extents).

2-) Converting from thick-provisioned to thin-provisioned while cloning from a template.

3-) The source and destination VMFS volumes have different block sizes.

4-) Cloning VMs with snapshots, or View replicas.

5-) Storage vMotion between two datastores that are not on the same array.

For full VAAI operation, the following advanced settings must be set:

Adv. Option                        Value  Description
DataMover/HardwareAcceleratedInit  1      HW-accelerated creation of VMFS files using VAAI, e.g. creating thick-provisioned disks.
DataMover/HardwareAcceleratedMove  1      HW-accelerated moving of VMFS files using VAAI, e.g. Storage vMotion.
VMFS3/HardwareAcceleratedLocking   1      HW-accelerated locking (Atomic Test & Set - ATS).
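These options can be checked and set from the ESXi shell; a minimal sketch using the option paths above:

    # Check the current value of a VAAI-related advanced option:
    esxcli system settings advanced list -o /DataMover/HardwareAcceleratedInit

    # Enable all three primitives (1 = enabled, 0 = disabled):
    esxcli system settings advanced set -o /DataMover/HardwareAcceleratedInit -i 1
    esxcli system settings advanced set -o /DataMover/HardwareAcceleratedMove -i 1
    esxcli system settings advanced set -o /VMFS3/HardwareAcceleratedLocking -i 1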

 

 

12. vSphere APIs for Storage Awareness (VASA):

This API presents storage array capabilities to the ESXi hosts connected to the array. It can be used when implementing Storage Profiles, which depend on either VASA-presented capabilities or user-defined capabilities.

Scott Lowe wrote a nice article about VASA:

A Deeper Look at VASA - blog.scottlowe.org



13. Storage Filters:

Storage Filters are advanced settings configured at the vCenter Server level that control how LUNs are presented to ESXi hosts. They can be useful in some special cases.
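For reference, the four filters map to the following vCenter Server advanced settings keys (all default to true; setting one to false disables that filter):

    config.vpxd.filter.vmfsFilter                   # VMFS Filter
    config.vpxd.filter.rdmFilter                    # RDM Filter
    config.vpxd.filter.SameHostAndTransportsFilter  # Same Host and Transports Filter
    config.vpxd.filter.hostRescanFilter             # Host Rescan Filter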

The following VMware documentation page describes these advanced settings and their syntax:

vSphere 5.5 Documentation Center - Storage Filtering

Also, Duncan Epping has his nice deep-diving article about Storage Filters as well:

Storage Filters - Yellow Bricks

Last, this article, also by Duncan Epping, covers only the Host Rescan Filter:

Automatic rescan of your HBAs.... - Yellow Bricks

 

 

14. Troubleshooting iSCSI Storage Issues:

These KB articles published by VMware cover troubleshooting iSCSI storage connectivity issues:

VMware KB: Troubleshooting iSCSI LUN connectivity issues on ESX/ESXi hosts

VMware KB: Troubleshooting ESXi/ESX connectivity to iSCSI arrays using software initiators

VMware KB: Troubleshooting ESX and ESXi connectivity to iSCSI arrays using hardware initiators
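When working through those KBs, a few ESXi shell checks come up repeatedly; a minimal sketch (the IP addresses and vmk interface are hypothetical, and -I option availability varies by ESXi version):

    # Ping the iSCSI target from the VMkernel interface used for iSCSI:
    vmkping -I vmk1 192.168.1.100

    # Verify the iSCSI TCP port is reachable:
    nc -z 192.168.1.100 3260

    # List iSCSI adapters, then rescan for devices:
    esxcli iscsi adapter list
    esxcli storage core adapter rescan --all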

 

 

15. Troubleshooting NFS Storage Issues:

The following KB articles published by VMware cover troubleshooting NFS storage issues:

VMware KB: Troubleshooting connectivity issues to an NFS datastore on ESX and ESXi hosts

VMware KB: Using nfsstat3 to troubleshoot NFS error: Failed to get object: No connection
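Similarly, a few quick checks from the ESXi shell (the server IP, export path and volume name are hypothetical):

    # Verify the NFS server is reachable over the VMkernel network:
    vmkping 192.168.1.200

    # List currently mounted NFS datastores and their states:
    esxcli storage nfs list

    # (Re)mount an NFS export as a datastore:
    esxcli storage nfs add -H 192.168.1.200 -s /export/vol1 -v nfs_ds01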

 

 

Share the Knowledge ....


Previous: vSphere 5.x Notes & Tips - Part VI:

Next: vSphere 5.x Notes & Tips - Part VIII:

