Hi All
In this part, we'll talk about vSphere Networking: basic concepts and best-practice guides only. I hope you enjoy it.
Credits:
- Mohammed Raffic
- Mike Da Costa
- Sreekanth Setty
- Frank Denneman
- Rickard Nobel
- Scott Lowe
- Jason Nash
Let's Start......
1. Although it may appear simple, this confused me in the beginning: in the Networking section, the physical NICs of vSphere hosts (pNICs) are called vmnics, while the virtual NICs of VMs are called vNICs.
2. Concept of Switching in vSwitch:
A virtual switch (vSwitch) doesn't learn MAC addresses. It only keeps a table of the MAC addresses of the vNICs connected to it and their ports. When a frame arrives with a destination MAC address not in that table, the vSwitch forwards it straight to the uplink ports. There is therefore no need for Spanning Tree Protocol on the vSwitch, and no exposure to MAC-flooding style Denial of Service (DoS) or Spanning Tree Protocol attacks on it. That's also why it's recommended to configure all physical switch ports connected to a vSphere host as edge ports (e.g. PortFast): once the link comes up, the port sends and receives frames immediately, with no delay for Spanning Tree calculations, which makes media disconnection/connection state transitions fast.
3. vSphere Distributed Switch (vDS):
An advanced type of switch in vSphere. It consists of two components:
1-) Data Plane (I/O Plane): a hidden vSphere Standard Switch (host proxy switch) on every host that is part of the vDS. Responsible for the actual data flow through the vDS.
2-) Management Plane (Control Plane): integrated into vCenter. Responsible for all configuration and management of the vDS.
An ESXi host can be connected to several vDSs.
The following official document by VMware gives full technical information about vDS:
http://www.vmware.com/files/pdf/techpaper/vsphere-distributed-switch-best-practices.pdf
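As a rough, scriptable complement (not from the paper), here is a minimal pyVmomi (Python) sketch that lists each vDS known to vCenter along with its member hosts. The vCenter address and credentials are placeholders; error handling is omitted for brevity.
```python
# Minimal pyVmomi sketch: list each vDS and its member hosts.
# Hostname/credentials are placeholders - adjust for your environment.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab only; use proper certs in production
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="secret", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.DistributedVirtualSwitch], True)
    for dvs in view.view:
        members = [m.config.host.name for m in dvs.config.host]
        print(f"{dvs.name}: {len(members)} host(s) -> {', '.join(members)}")
    view.Destroy()
finally:
    Disconnect(si)
```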
4. vSwitch Security Policy:
This is about the security policies on both the vSphere Standard Switch (vSS) and the vSphere Distributed Switch (vDS). The following official article by VMware describes all of them:
vSphere 5.5 Documentation Center - Security Policies
The following table is just a summary of it:
| Policy | Behavior |
| Promiscuous Mode | If set to Accept, all traffic at the corresponding level is exposed to sniffing (switch: exposed to all VMs; port group: exposed to all VMs in that port group; etc.). |
| Forged Transmits | If set to Reject, the ESXi host drops any outbound frame whose source MAC address differs from the VM's configured MAC address. It controls outbound traffic only. |
| MAC Address Changes | If set to Reject, the ESXi host drops any inbound frame for a VM whose guest OS has changed its MAC address to a value different from the one recorded in the .vmx file. It controls inbound traffic only. |
Another KB article by VMware for explaining how Promiscuous Mode works:
VMware KB: How promiscuous mode works at the virtual switch and portgroup levels
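For illustration only, here is a hedged pyVmomi sketch that sets all three policies to Reject on a standard-switch port group. It assumes a `host` object (`vim.HostSystem`) was already retrieved, e.g. via a container view as in the earlier sketch, and the port group name is a placeholder.
```python
# Sketch: set all three security policies to Reject on a vSS port group.
# Assumes `host` is a vim.HostSystem obtained beforehand; "VM Network" is a placeholder.
from pyVmomi import vim

net_sys = host.configManager.networkSystem
for pg in net_sys.networkInfo.portgroup:
    if pg.spec.name == "VM Network":
        spec = pg.spec
        spec.policy.security = vim.host.NetworkPolicy.SecurityPolicy(
            allowPromiscuous=False,   # Reject promiscuous mode
            macChanges=False,         # Reject MAC address changes
            forgedTransmits=False)    # Reject forged transmits
        net_sys.UpdatePortGroup(pgName=pg.spec.name, portgrp=spec)
```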
5. Port Binding Types:
This official KB published by VMware explains all about Port Binding types:
VMware KB: Choosing a port binding type in ESX/ESXi
Keep in mind the following: the Ephemeral type is the best solution for VDI port groups. It does not depend on vCenter Server, and it scales out/in with the number of online virtual desktops.
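A minimal sketch, assuming an existing `dvs` object (`vim.DistributedVirtualSwitch`) and a hypothetical port group name, of creating such an ephemeral-binding port group with pyVmomi:
```python
# Sketch: create an ephemeral-binding port group for VDI desktops on an existing vDS.
# Assumes `dvs` is a vim.DistributedVirtualSwitch obtained beforehand.
from pyVmomi import vim

pg_spec = vim.dvs.DistributedVirtualPortgroup.ConfigSpec()
pg_spec.name = "VDI-Desktops-PG"   # placeholder name
# "ephemeral": no vCenter dependency, ports are created/destroyed on demand;
# the other values are "earlyBinding" (static) and "lateBinding" (deprecated)
pg_spec.type = "ephemeral"
task = dvs.AddDVPortgroup_Task(spec=[pg_spec])
```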
6. VLAN Configuration: External Switch Tagging (EST) vs. Virtual Switch Tagging (VST) vs. Virtual Guest Tagging (VGT):
This describes the different methods of VLAN tagging in vSphere 5.1 networking. The following article by Mohammed Raffic covers all of them:
http://www.vmwarearena.com/2012/07/vlan-tagging-vst-est-vgt-on-vmware.html
Keep in mind the following:
VLAN ID 0 | Communicates with untagged traffic only (no-VLAN traffic); any tagging is done externally (EST). |
VLAN ID 4095 | Equivalent to trunking on physical switches: it passes traffic of any VLAN through without changing the VLAN tag in the data frame. Used when you want to set VLANs from the guest OS itself (Virtual Guest Tagging, VGT). |
Another thing: VLAN tagging via the DCUI is only for tagging the management VMkernel port (vmk#).
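To tie the table to configuration, here is a hedged pyVmomi sketch of setting the port group VLAN ID on a vSS, under the same assumptions as the earlier sketch (`host` already retrieved, placeholder port group name). The three values correspond to VST, EST, and VGT.
```python
# Sketch: VLAN tagging modes on a vSS port group (assumes `host` as before).
net_sys = host.configManager.networkSystem
for pg in net_sys.networkInfo.portgroup:
    if pg.spec.name == "VM Network":   # placeholder port group
        spec = pg.spec
        spec.vlanId = 100    # VST: the vSwitch tags/untags frames with VLAN 100
        # spec.vlanId = 0    # EST: untagged traffic only, tagging done upstream
        # spec.vlanId = 4095 # VGT: trunk all VLANs through; the guest OS tags
        net_sys.UpdatePortGroup(pgName=pg.spec.name, portgrp=spec)
```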
7. Best Practices for VLAN Tagging:
1-) Don’t use the native VLAN, as any data sent on the native VLAN is untagged. If you use it, components expecting tagged frames will receive untagged ones, which leads to serious miscommunication.
2-) Check which VLANs are reserved on the physical switches in use. Some VLANs, such as VLAN 1, are reserved for legacy communication and are not recommended for use.
3-) Change the pre-configured native VLAN on the physical switches to a VLAN ID that is not commonly used, e.g. 3500.
4-) Trunk only the required VLANs on each port, not all VLANs.
8. Traffic Shaping Policy:
On a vSphere Standard Switch (vSS): applies to outbound (egress) traffic only, i.e. traffic that leaves the vSS. It can be applied to the entire vSS or to a single port group.
On a vSphere Distributed Switch (vDS): applies to inbound (ingress) and/or outbound (egress) traffic, i.e. traffic that enters and leaves the vDS. It can be applied to the entire vDS or to a single port group.
Keep in mind that the limits are enforced per port, whether the policy is set on a single port group or on the entire switch.
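A minimal pyVmomi sketch of enabling egress shaping on a vSS port group, under the same assumptions as the earlier sketches (`host` already retrieved, placeholder port group name); the bandwidth values are arbitrary examples.
```python
# Sketch: enable egress traffic shaping on a vSS port group (assumes `host` as before).
# averageBandwidth/peakBandwidth are in bits per second, burstSize in bytes.
from pyVmomi import vim

net_sys = host.configManager.networkSystem
for pg in net_sys.networkInfo.portgroup:
    if pg.spec.name == "VM Network":   # placeholder port group
        spec = pg.spec
        spec.policy.shapingPolicy = vim.host.NetworkPolicy.TrafficShapingPolicy(
            enabled=True,
            averageBandwidth=100 * 1000 * 1000,  # 100 Mbit/s sustained
            peakBandwidth=200 * 1000 * 1000,     # 200 Mbit/s bursts
            burstSize=50 * 1024 * 1024)          # 50 MB burst allowance
        net_sys.UpdatePortGroup(pgName=pg.spec.name, portgrp=spec)
```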
9. Physical NICs (pNICs) Teaming Policies:
1-) Route based on Source Virtual Port ID Policy:
When a VM is powered on for the first time, its vNIC comes up connected to a certain port on the vSwitch, and that port's virtual port ID determines which pNIC the traffic is hashed to. A VM with multiple vNICs occupies one vSwitch port per vNIC, each hashed independently, so under this scheme all traffic from a given vNIC always leaves through the same pNIC on the host.
A VMkernel port has its own MAC address and is treated like a VM; it is typically the first to come up, so it takes the first virtual port ID and is therefore always mapped to the first available pNIC.
If this mechanism leads to congestion on a certain pNIC, the only remedy is to vMotion some VMs to another host so their traffic is redistributed across different pNICs.
2-) Route based on Source MAC Address:
Used when VMs have more than one vNIC each. The hash to a pNIC is based on the source MAC address, so each vNIC's MAC address is hashed to a certain pNIC.
If this mechanism leads to congestion on a certain pNIC, the only remedy is to vMotion some VMs to another host so their traffic is redistributed across different pNICs.
3-) Route based on IP-Hash Policy:
This blog post by Mike Da Costa, published officially by VMware, takes a deep dive into the IP-Hash load-balancing policy:
Troubleshooting Network Teaming Problems with IP Hash | VMware Support Insider - VMware Blogs
Another KB from VMware summarizes the calculations in Mike's article (a worked Python sketch of the calculation follows this list):
VMware KB: Troubleshooting IP-Hash outbound NIC selection
4-) Route based on Physical NICs (pNICs) Load Policy (Load-based Teaming Policy - LBT):
This policy is only available on the vSphere Distributed Switch (vDS). It starts out like the Virtual Port ID policy, i.e. the initial hashing uses the virtual port ID mechanism. It continues that way until the load on a certain pNIC exceeds a limit (70%) for a continuous time window (30 seconds); then it re-maps some ports to a different pNIC to relieve the congestion.
The following nice article by Sreekanth Setty, published on the official VMware blog, describes this policy in detail:
VMware Load-Based Teaming (LBT) Performance | VMware VROOM! Blog - VMware Blogs
5-) Some Comparisons:
IP-Hash based Policy vs. pNICs Load based Policy:
This excellent article by Frank Denneman compares the IP-Hash policy with the pNICs-load-based (LBT) policy:
IP-Hash versus LBT - frankdenneman.nl
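Closing out this section, here is the sketch promised under 3-) above. As documented in the KBs, IP-Hash picks an uplink by XORing the source and destination IPv4 addresses and taking the result modulo the number of active uplinks. This small, self-contained Python sketch reproduces that arithmetic; the addresses and uplink count are arbitrary examples.
```python
# Self-contained sketch of the IP-hash uplink selection described in the KBs above:
# XOR the 32-bit source and destination IPv4 addresses, then take the result
# modulo the number of active uplinks.
import ipaddress

def ip_hash_uplink(src_ip: str, dst_ip: str, uplink_count: int) -> int:
    """Return the index of the uplink chosen for this source/destination pair."""
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    return (src ^ dst) % uplink_count

# One VM talking to three different destinations over two uplinks:
for dst in ("10.0.0.10", "10.0.0.11", "10.0.0.12"):
    print(dst, "-> uplink", ip_hash_uplink("192.168.1.50", dst, 2))
```
Note that a given source/destination pair always maps to the same uplink; the load only spreads across many IP pairs, which is also why IP-Hash requires static link aggregation (EtherChannel) on the physical switch side.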
10. Network Failover Detection: Beacon Probing:
Beacon probing is a way for virtual switches to discover dead network paths and initiate the failover procedure on their physical NICs. Official KB from VMware about beacon probing and its limitations:
VMware KB: What is beacon probing?
11. Notify Switches:
This feature allows virtual switches to notify the upstream physical switches about VM locations (which switch ports connect to the ESXi hosts running these VMs), their MAC addresses, and any changes in VM networking. It must be set to Yes in all situations unless you use Microsoft Network Load Balancing in unicast mode. The following article by Rickard Nobel describes it clearly:
The vSwitch Notify Switches setting | Rickard Nobel
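A one-field pyVmomi sketch, under the same assumptions as earlier (`host` retrieved, placeholder port group name), of setting Notify Switches on a vSS port group's teaming policy:
```python
# Sketch: set Notify Switches on a vSS port group (assumes `host` as before).
# Set it to False only when running Microsoft NLB in unicast mode.
from pyVmomi import vim

net_sys = host.configManager.networkSystem
for pg in net_sys.networkInfo.portgroup:
    if pg.spec.name == "VM Network":   # placeholder port group
        spec = pg.spec
        if spec.policy.nicTeaming is None:
            spec.policy.nicTeaming = vim.host.NetworkPolicy.NicTeamingPolicy()
        spec.policy.nicTeaming.notifySwitches = True
        net_sys.UpdatePortGroup(pgName=pg.spec.name, portgrp=spec)
```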
12. Best Practice for Physical NICs (pNICs) Connections:
1-) Use redundant pNICs from different cards (onboard, PCI, etc.) on each vSwitch if possible.
2-) Never share pNICs configured for iSCSI/NFS Storage traffic.
3-) Fault Tolerance (FT) doesn't require redundant pNICs; one dedicated pNIC is enough. When that pNIC fails, vCenter automatically restarts the secondary VM on another host so that replication with the primary continues.
4-) Typically, and if possible, with 1 Gb pNICs use 6 pNICs/host when using FC storage, or 8 pNICs/host when using iSCSI or NFS storage. With 10 Gb pNICs, typically use 2 pNICs/host.
5-) It's recommended to use Network IO Control (NIOC) when using 10 Gb pNICs.
13. Best Practice for Management Traffic:
For Management Traffic (VMKernel, iSCSI, vMotion, etc.), try to separate it from production traffic using VLANs and physical switches. The following article by Scott Lowe is all about separating management traffic and best practices for it:
vSphere Networking Design Presentation - blog.scottlowe.org
And the following table summarizes it all:
| Traffic Type | VMkernel | vMotion | iSCSI | Fault Tolerance (FT) |
| BW Usage | Low | Medium | High | Medium to High (depends on number of protected VMs) |
| BW Usage Type | Continuous | Discontinuous bursts | Continuous | Continuous |
| Best Practice | Can be mixed with vMotion using different VLANs. | Can be mixed with VMkernel or FT in some situations, using different VLANs. | Keep separate from any other type of traffic. | Keep separate when the number of protected VMs is high; otherwise it can be mixed with VMkernel traffic or vMotion using different VLANs. |
14. vSphere Distributed Switch Designing Best Practices:
This technical white paper from VMware is a nice designing and implementing best practices guide for vDS:
http://www.vmware.com/files/pdf/techpaper/vsphere-distributed-switch-best-practices.pdf
15. Using Multiple Port Groups for Logical/Physical Separation of Data on vSphere Distributed Switch (vDS):
On a single vDS, you can logically and physically separate all types of data. For doing so, create the following matrix:
| Traffic Type | Port Group | Teaming Policy | Active Uplinks | Standby Uplinks | Unused Uplinks |
| iSCSI | iSCSI - PG1 | None | dvuplink a | None | dvuplink x,y,z |
| vMotion | vMotionPG1 | … | … | … | … |
| VMs Data | VM-1 | … | … | … | … |
| FT | FT PG | … | … | … | … |
| … | … | … | … | … | … |
By filling in this matrix, you can design how to separate your data: logically into separate port groups, each tagged with a certain VLAN ID, and then physically by dedicating certain physical uplinks to each port group.
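The iSCSI row above could be sketched in pyVmomi roughly as follows; `pg` is assumed to be the target `vim.dvs.DistributedVirtualPortgroup` looked up beforehand, and the uplink names are placeholders. Uplinks listed in neither array behave as unused for this port group.
```python
# Sketch: pin a dvPortgroup to explicit uplinks (one row of the matrix above).
from pyVmomi import vim

order = vim.dvs.VmwareDistributedVirtualSwitch.UplinkPortOrderPolicy()
order.inherited = False
order.activeUplinkPort = ["dvuplink1"]    # active uplink(s)
order.standbyUplinkPort = ["dvuplink2"]   # standby uplink(s)

teaming = vim.dvs.VmwareDistributedVirtualSwitch.UplinkPortTeamingPolicy()
teaming.inherited = False
teaming.uplinkPortOrder = order

port_cfg = vim.dvs.VmwareDistributedVirtualSwitch.VmwarePortConfigPolicy()
port_cfg.uplinkTeamingPolicy = teaming

spec = vim.dvs.DistributedVirtualPortgroup.ConfigSpec()
spec.configVersion = pg.config.configVersion  # required for reconfigure
spec.defaultPortConfig = port_cfg
task = pg.ReconfigureDVPortgroup_Task(spec=spec)
```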
16. Migrating vSphere Hosts from vSphere Standard Switch (vSS) to vSphere Distributed Switch (vDS) (Walking Hosts over in a Production Environment):
This guide is a summary of the walking-hosts approach demonstrated by Jason Nash in his VSOS course, which I have modified a bit:
1-) Create the vDS with all the port groups that will be needed.
2-) Configure each port group with the required uplinks, in case of physical separation using different uplinks.
3-) Configure each port group's options (VLAN, traffic shaping, etc.).
4-) Migrate one or two pNICs from the vSS on the first host to the new vDS (a scripted sketch of this first move follows the list), then create a temporary VMkernel port on its new port group on the vDS to test connectivity to that host and the configuration of the new vDS. Repeat for each VMkernel port group used on that host's vSS. After testing is finished, delete those temporary ports or keep them according to your needs.
5-) Migrate a VM or two to their port group(s) on the vDS and test connectivity to them.
6-) Begin migrating all VM Kernel ports from vSS to their corresponding port groups on the vDS.
7-) Migrate the rest of VMs to their corresponding port groups on the vDS.
8-) Migrate the rest of pNICs on the vSS to the vDS.
9-) Remove vSS.
10-) If the hosts are configured differently, skip this step and go to step 11. After finishing the first host, you can migrate the remaining hosts' vSSs in one shot using the (Add Hosts to vDS) wizard: for each host you select the pNICs and the uplinks to assign them to, the VMkernel ports and their target port groups on the vDS, and the same for VMs. This migrates the rest of the VM infrastructure networking to the vDS.
11-) If the hosts are configured differently, repeat steps 4-9 to walk the other hosts from vSS to vDS one by one.
12-) If you want to choose which uplink each pNIC is assigned to, you can add a host without migrating anything using the (Add Hosts to vDS) wizard: select the host only, finish the wizard, open the vDS tab on the host's Networking page, select (Manage Physical Adapters), and choose which uplink is assigned to which pNIC. With that method, however, you still have to repeat steps 4-9 to walk the hosts over.
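As a rough scripted illustration of the first move in step 4, here is a hedged pyVmomi sketch that joins a host to an existing vDS with a single pNIC as an uplink; `dvs` (`vim.DistributedVirtualSwitch`) and `host` (`vim.HostSystem`) are assumed to be looked up already, and the pNIC name is a placeholder.
```python
# Sketch of step 4's first move: join `host` to `dvs` with a single pNIC.
from pyVmomi import vim

pnic = vim.dvs.HostMember.PnicSpec()
pnic.pnicDevice = "vmnic1"        # the pNIC being walked over to the vDS

backing = vim.dvs.HostMember.PnicBacking()
backing.pnicSpec = [pnic]

member = vim.dvs.HostMember.ConfigSpec()
member.operation = "add"          # join the host to the vDS
member.host = host
member.backing = backing

spec = vim.DVSConfigSpec()
spec.configVersion = dvs.config.configVersion
spec.host = [member]
task = dvs.ReconfigureDvs_Task(spec=spec)
```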
Share the knowledge ....
Previous: vSphere 5.x Notes & Tips - Part VII: