Description: This document outlines considerations around optimisation of vCenter Server instances and best practice recommendations to maximise performance of your vCenter ecosystem. Each item listed should be addressed in the context of the target environment as there is no one solution to optimise the vCenter management environment. The following is simply a list of recommendations that should, to some extent, improve performance in most environments.
Prerequisites:
• Microsoft Windows vCenter Server
• Microsoft SQL Database Instance
Part 1 – vCenter Server
Description | Recommendations / Best Practices / KB Links |
Virtual Server Sizing | Ensure that the vCenter virtual system(s) are sized appropriately for the inventory size. Where vCenter components are separated and distributed across multiple virtual machines, ensure that all systems meet the sizing recommendations set out in the installation and configuration documentation: https://www.vmware.com/support/pubs/vsphere-esxi-vcenter-server-pubs.html (vSphere 5.1) |
Distribute vCenter Services across multiple virtual machines | Depending on inventory size, multiple virtual machines can be used to accommodate different vCenter roles. VMware recommends separating VMware vCenter, SSO Server, Update Manager and SQL for flexibility during maintenance and to improve scalability of the vCenter management ecosystem. |
Dedicated Management Cluster | For anything other than the smallest of environments, VMware recommends separating all vSphere management components onto a separate out-of-band management cluster. The primary benefits of management component separation include: · Facilitating quicker troubleshooting and problem resolution, as management components are strictly contained in a relatively small and manageable cluster. · Providing resource isolation between workloads running in the production environment and the actual systems used to manage the infrastructure. · Separating the management components from the resources they are managing. |
vCenter to Host operational latency | The number of network hops between the vCenter Server and the ESXi host affects operational latency. The ESXi host should reside as few network hops away from the vCenter Server as possible. |
vCenter to SQL Server operational latency | The number of network hops between the vCenter Server and the SQL database affects operational latency. Where possible, vCenter should reside on the same network segment as the supporting database. If appropriate, configure an affinity rule to ensure that the vCenter Server and database server reside on the same ESXi host, reducing latency still further. |
Java Max Heap Size | Ensure that the maximum heap size for each Java virtual machine is set correctly based on the inventory size. Check the JVM heap settings for vCenter, Inventory Service, SSO and the Web Client, and monitor Web Services to verify. (vSphere 5.1) |
Client Connections | Attempt to limit the number of clients connected to vCenter Server as this affects its performance. This is particularly the case for the traditional Windows C# client. |
Performance Monitoring | Use performance monitoring tools to ensure the health of the vCenter ecosystem and troubleshoot problems as they arise. Where appropriate, configure a vC Ops Custom Dashboard for vCenter/Management components. Also ensure that appropriate alerts and notifications exist on the performance monitoring tools. |
Virtual disk type | All management virtual machine VMDKs should be provisioned in eagerZeroedThick format. This provides approximately a 10-20 percent performance improvement over the other two disk formats. |
vCenter vNIC type | Use the VMXNET3 paravirtualized network adapter to maximise network throughput and efficiency and to reduce latency. |
ODBC Connection | Ensure that the vCenter and VUM ODBC connections are configured with the minimum permissions required for daily operations. Additional permissions are required during installation and upgrade activities but not for day to day operations. Please refer to the Service Account Permissions section below. |
vCenter Logs Clean Up | vCenter Server has no automated way of purging old vCenter log files. These files can grow and consume significant disk space on the vCenter Server. Consider a scheduled task, run every three or six months, to delete or move log files older than the period defined by business requirements. For example, the VBScript below could be used to clean up old log files from vCenter. The script deletes files that are older than a fixed number of days, defined in the DateDiff comparison (line 9), from the path set in the GetFolder call (line 6). It can be configured to run as a scheduled task using the Windows Task Scheduler.
Dim Fso
Dim Directory
Dim Modified
Dim Files
Set Fso = CreateObject("Scripting.FileSystemObject")
Set Directory = Fso.GetFolder("C:\ProgramData\VMware\VMware VirtualCenter\Logs\")  ' log folder to clean
Set Files = Directory.Files
For Each Modified In Files
    If DateDiff("D", Modified.DateLastModified, Now) > 180 Then Modified.Delete  ' retention period in days
Next
For more information, refer to KB article KB1021804, Location of vCenter Server log files. For additional information on modifying logging levels in vCenter, please refer to KB1004795 and KB1001584. Note: Once a log file reaches its maximum size, it is rotated and numbered in the form component-nnn.log, and rotated files may be compressed. |
Statistics Levels | Statistics collection intervals determine the frequency at which statistics queries occur, the length of time statistical data is stored in the database, and the type of statistical data that is collected. As historical performance statistics can take up to 90% of the vCenter Server database size, they are the primary factor in the performance and scalability of the vCenter Server database. You can view the collected historical statistics through the performance charts in the vSphere Web Client, through the traditional Windows client or through command-line monitoring utilities for up to 1 year after the data was first written to the database. Ensure that statistics collection is set as conservatively as possible so that the system does not become overloaded. At the same time, it is equally important that the retention of this historical data meets the customer’s data compliance requirements. Because statistics data consumes such a large proportion of the database, proper management of vCenter statistics is an important consideration for overall database health. In addition, processing this data through a series of rollup jobs, so that the SQL Server does not become overloaded, is a key consideration for vCenter Server performance. For instance, set a database data retention period of 60 days so that performance data is not retained beyond 60 days. |
Task and Events Retention | Ensure that task and event retention levels are set as conservatively as possible whilst meeting the customer’s data retention requirements. Every task or event executed via vCenter is stored in the database. For example, a task is created when a user powers a virtual machine on or off, and an event is created when something occurs, such as vCPU usage for a VM changing to red. vCenter Server has a Database Retention Policy setting that allows you to specify after how long vCenter Server tasks and events should be deleted. This correlates to a database rollup job that purges the data from the database after the selected period of time. Whilst these tables consume a relatively small amount of database space compared to statistical data, it is good practice to consider this option for further database optimisation. For instance, by default, vCenter is configured to store task and event data for 180 days. However, it may be possible, based on the customer’s compliance requirements, to configure vCenter not to retain event and task data in the database beyond 60 days. |
vCenter Server Backup Best Practice | In addition to scheduling regular backups of the vCenter database, the backups for the vCenter Server should also include the SSL certificates and license key information. |
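As an illustration of the vCenter Logs Clean Up recommendation in Part 1, the same age-based purge can be sketched in Python rather than VBScript. This is a minimal sketch, not a supported VMware tool: the function name purge_old_logs and its dry_run switch are hypothetical, and it compares elapsed seconds rather than calendar-day boundaries, so its cut-off can differ slightly from the DateDiff("D", ...) test in the VBScript.

```python
import time
from pathlib import Path


def purge_old_logs(log_dir, max_age_days=180, now=None, dry_run=False):
    """Remove regular files in log_dir last modified more than max_age_days ago.

    Returns the sorted names of the files that were (or, with dry_run=True,
    would be) removed. Subfolders are left untouched.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # retention period in seconds
    removed = []
    for entry in Path(log_dir).iterdir():
        if entry.is_file() and entry.stat().st_mtime < cutoff:
            if not dry_run:
                entry.unlink()
            removed.append(entry.name)
    return sorted(removed)


if __name__ == "__main__":
    # Path taken from the VBScript example; only a dry run is performed here.
    log_dir = Path(r"C:\ProgramData\VMware\VMware VirtualCenter\Logs")
    if log_dir.is_dir():
        for name in purge_old_logs(log_dir, dry_run=True):
            print("would delete:", name)
```

Running with dry_run=True first lists the candidates without deleting anything, which is a reasonable check before scheduling the real purge.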
Part 2 – SQL DB Server
SQL Database Server Disk Configuration | The database data file generates mostly random I/O, while database transaction logs generate mostly sequential I/O. The traffic for these files is almost always simultaneous so it’s preferable to keep these files on two separate storage resources that don’t share disks or I/O. Therefore, where a large inventory demands it, ensure that the vCenter Server database uses separate drives for data and logs which, in turn, are backed by different physical disks. |
tempDB Separation | For large inventories, place tempDB on a different drive, backed by different physical disks from those used by the vCenter database files or transaction logs. |
Reduce Allocation Contention in SQL Server tempDB database | Use multiple data files to increase the I/O throughput to tempDB. Configure 1:1 alignment between tempDB files and vCPUs (up to eight) by spreading tempDB across at least as many equally sized files as there are vCPUs. For instance, where 4 vCPUs exist on the SQL server, create three additional tempDB data files, and make them all equally sized. They should also be configured to grow in equal amounts. After changing the configuration, restart the SQL Server instance. For more information please refer to: http://support.microsoft.com/kb/2154845 |
Database Connection Pool | vCenter Server starts, by default, with a database connection pool of 50 threads. This pool is then dynamically sized according to vCenter workload. If high load is expected due to a large inventory, the size of the pool can be increased to 128 threads. This will increase the memory consumption and load time of the vCenter Server. To change the pool size, edit the vpxd.cfg file and add the connection pool setting, where ‘128’ is the number of connection threads to be configured. |
Table Statistics | Update statistics of the tables and indexes on a regular basis for better overall performance of the database. Create an SQL job to carry out this task or alternatively it should form part of a vSphere database maintenance plan. |
Index Fragmentation (Not Applicable to vCenter 5.1 or newer) | Check for fragmentation of index objects and recreate indexes if needed. Fragmentation occurs in the vCenter database because of the statistics rollups. Defragment when fragmentation exceeds 30%. See KB1003990. Note: With the enhancements and design changes made in the vCenter Server 5.1 database and later versions, this is no longer applicable or required. |
Database Recovery Model | Set the database recovery model to SIMPLE. This model reduces the disk space needed for the transaction logs as well as decreasing I/O load. Choosing the Recovery Model for a Database: http://msdn.microsoft.com/en-us/library/ms175987(SQL.90).aspx How to View or Change the Recovery Model of a Database in SQL Server Management Studio: http://msdn.microsoft.com/en-us/library/ms189272(SQL.90).aspx |
Virtual Disk Type | Where the vCenter database server is a virtual machine, ensure that all VMDKs are provisioned in eagerZeroedThick format. This option provides approximately a 10-20 percent performance improvement over the other two disk formats. |
Verify SQL Rollup Jobs | Ensure that the SQL Agent rollup jobs were created on the SQL Server during the vCenter installation. For the full set of stored procedures and jobs, please refer to the appropriate article below. If necessary, recreate the MSSQL Agent rollup jobs. Note that detaching, attaching, importing, and restoring a database to a newer version of MSSQL Server does not automatically recreate these jobs. To recreate these jobs, if missing, please refer to: KB1004382. KB 2033096 (vSphere 5.1 & 5.5) http://kb.vmware.com/kb/2033096 KB 2006097 (vSphere 5.0) http://kb.vmware.com/kb/2006097 Ensure that myDB references the vCenter Server database and not the master or some other database. If these jobs reference any other database, you must delete and recreate the jobs. |
Ensure database jobs are running correctly | Monitor scheduled database jobs to ensure they are running correctly. For more information, refer to KB article: Checking the status of vCenter Server performance rollup jobs: KB2012226 |
Verify MSSQL Permissions | Ensure that the local, SQL and AD Permissions required are in place and align with the principle of least privilege (see below) |
If necessary, truncate all unrequired performance data from the database (Purging Historical Statistical Performance Data) | For more information, refer to KB article: Reducing the size of the vCenter Server database when the rollup scripts take a long time to run, KB1007453. To truncate all performance data from vCenter Server 5.1 and 5.5: Warning: This procedure permanently removes all historical performance data. Take a backup of the database/schema before proceeding. |
Shrink Database | After purging historical data from the database optionally shrink the database. This is an online procedure to reduce the database size and to free up space on the VMDK, however, this activity will not in itself improve performance. Shrinking the size of the VMware vCenter Server SQL database KB1036738 |
Rebuilding Indexes to Optimize the Performance of SQL Server | Configure a regular maintenance job to rebuild indexes. KB2009918 |
VPX_HIST_STAT Table Sizes | VMware recommends a fill factor of 70% for the four VPX_HIST_STAT tables. If the fill factor is too high for the resources on the server, SQL Server has to spend time splitting pages, which equates to additional I/O. If you are experiencing high unexplained I/O in the environment, monitor the SQL Server Access Methods object: Page Splits/sec. Page splits are expensive and cause tables to perform more poorly due to fragmentation, so the fewer page splits you have, the better your system will perform. Decreasing the fill factor of your indexes increases the amount of empty space on each data page, and the more empty space there is, the fewer page splits you will experience. On the other hand, too much unnecessary empty space can also hurt performance, because less data is stored per page, so more disk I/O is needed to read tables and less data can be held in the buffer cache. High Page Splits/sec will result in the database being larger than necessary and having more pages to read during normal operations. Determining where growth is occurring in the VMware vCenter Server database (1028356): http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&externalId=1028356 Troubleshooting VPX_HIST_STAT table sizes in VMware vCenter Server 5.1: KB2038474 Reducing the size of the vCenter Server database when the rollup scripts take a long time to run: KB1007453 |
Monitor Database Growth | Monitor database growth over a period of time to ensure the database is functioning as expected. For more information, refer to KB article: Determining where growth is occurring in the vCenter Server database KB1028356 |
Schedule and Verify Regular Database Backups | The vCenter, SSO, VUM and SRM servers are by themselves stateless. The databases are far more critical since they store all the configuration and state information for each of the management components. These databases must be backed up nightly, and the restore process for each database needs to be tested periodically. Ensure that a schedule of regular backups of the vCenter database exists and, based on the requirements of the business, periodically restore and mount databases from backup onto a non-production system to ensure that a clean recovery is possible should database corruption or data loss occur in the production environment. |
Create a Maintenance Plan for vSphere databases | Work with your DBAs to create a daily and weekly database maintenance plan. For instance: · Check Database Integrity · Rebuild Index · Update Statistics · Back Up Database (Full) · Maintenance Cleanup Task Note: DO NOT SHRINK DB IN MAINTENANCE PLAN UNLESS THERE IS A SPECIFIC REQUIREMENT TO RECLAIM DISK SPACE |
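As an illustration of the tempDB allocation recommendation in Part 2, the file count and the matching T-SQL can be worked out with a short Python helper. This is a sketch under stated assumptions: the function names, the T:\TempDB folder, the tempdevN logical names and the 1024 MB / 256 MB sizes are all hypothetical placeholders, and the generated statements should be reviewed by a DBA before being run. It assumes the first data file still has SQL Server's default logical name, tempdev.

```python
def tempdb_data_file_count(vcpus, cap=8):
    """One tempDB data file per vCPU, capped at eight as recommended above."""
    if vcpus < 1:
        raise ValueError("vcpus must be at least 1")
    return min(vcpus, cap)


def tempdb_statements(vcpus, size_mb=1024, growth_mb=256, data_dir="T:\\TempDB"):
    """Build T-SQL that gives every tempDB data file the same size and growth.

    data_dir, size_mb and growth_mb are illustrative values only.
    """
    count = tempdb_data_file_count(vcpus)
    sizing = f"SIZE = {size_mb}MB, FILEGROWTH = {growth_mb}MB"
    # Resize the existing primary data file so that all files match.
    stmts = [f"ALTER DATABASE tempdb MODIFY FILE (NAME = N'tempdev', {sizing});"]
    # Add the remaining files: count - 1 additional, equally sized files.
    for n in range(2, count + 1):
        stmts.append(
            f"ALTER DATABASE tempdb ADD FILE (NAME = N'tempdev{n}', "
            f"FILENAME = N'{data_dir}\\tempdev{n}.ndf', {sizing});"
        )
    return stmts


if __name__ == "__main__":
    # 4 vCPUs: one MODIFY FILE plus three ADD FILE statements.
    for statement in tempdb_statements(4):
        print(statement)
```

For the 4 vCPU example in the table, the helper emits one MODIFY FILE statement for the existing file and three ADD FILE statements, matching the "three additional data files" guidance.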
Part 3 - Service Account Permissions (Least Privilege)
vCenter Service Account | Required by the ODBC connection for access to the database. The vCenter service account must be configured with the db_owner role for normal operational use. In addition, the vCenter database account used to make the ODBC connection requires the db_owner role on the MSDB system database during installation or upgrade of the vCenter Server. This permission facilitates the installation of the SQL Agent jobs for vCenter statistics rollups. Typically, the DBA would grant the vCenter service account the db_owner role on the MSDB system database only when installing or upgrading vCenter, then revoke that role when these activities are complete. |
RSA_DBA | Only required for SSO 5.1. The RSA_DBA account is a local SQL account which is used for creating the schema (DDL) and requires db_owner permissions. |
RSA_USER | Only required for SSO 5.1. The RSA_USER account reads and writes data (DML only). |
VUM Service Account | Although it is installed on a 64-bit operating system, VUM is a 32-bit application and requires a 32-bit ODBC connection created from “C:\Windows\SysWOW64\odbcad32.exe”. The VUM service account must be granted the db_owner role on the VUM database. The installation of vCenter Update Manager 5.x with a Microsoft SQL back-end database also requires the ODBC connection account to temporarily have the db_owner role on the MSDB system database. This was a new requirement in vSphere 5.0. As with the vCenter service account, the DBA would typically grant the VUM service account the db_owner role on the MSDB system database only when installing or upgrading the VUM component of vCenter, then revoke that role when this task is complete. |
SRM Service Account | Despite being a 64-bit application, SRM requires a 32-bit ODBC connection from “C:\Windows\SysWOW64\odbcad32.exe”. The SRM service account must be configured with the db_owner role on the SRM database. |
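The temporary MSDB grant described in Part 3 can be scripted so that the revoke step is not forgotten. The Python sketch below only builds the two T-SQL statements; it does not connect to SQL Server. The function name and the CORP\svc-vcenter account are hypothetical, and it assumes the account is already mapped to a user in the msdb database. sp_addrolemember and sp_droprolemember are the long-standing SQL Server procedures for role membership; newer SQL Server versions also offer ALTER ROLE for the same purpose.

```python
def msdb_db_owner_statements(account):
    """Return (grant, revoke) T-SQL for temporary db_owner membership on msdb.

    The account must already exist as a user in msdb (assumption).
    """
    quoted = account.replace("'", "''")  # escape single quotes for T-SQL literals
    grant = f"USE msdb; EXEC sp_addrolemember N'db_owner', N'{quoted}';"
    revoke = f"USE msdb; EXEC sp_droprolemember N'db_owner', N'{quoted}';"
    return grant, revoke


if __name__ == "__main__":
    # Hypothetical service account name, used for illustration only.
    grant, revoke = msdb_db_owner_statements("CORP\\svc-vcenter")
    print("Before install or upgrade:", grant)
    print("After install or upgrade: ", revoke)
```

Generating both statements together keeps the grant and its matching revoke in the same change record, which fits the least-privilege approach described above.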