Dear readers
Welcome to a new blog post about a specific NSX-T Edge Node VM deployment with only a single Edge Node N-VDS. You may have seen the 2019 VMworld session "Next-Generation Reference Design with NSX-T: Part 1" (CNET2061BU or CNET2061BE) from Nimish Desai. On one of his slides he mentions how we can deploy a single NSX-T Edge Node N-VDS instead of three Edge Node N-VDS. This new approach (available since NSX-T 2.5 for the Edge Node VM) with a single Edge Node N-VDS has the following advantages:
- Multiple TEPs to load balance overlay traffic for different overlay segments
- Same NSX-T Edge Node N-VDS design for VM-based and Bare Metal (with 2 pNICs)
- Only two Transport Zones (Overlay & VLAN) assigned to a single N-VDS
The diagram below shows the slide with a single Edge Node N-VDS from one of the VMware sessions (CNET2061BU):
However, the single N-VDS Edge Node design comes with one additional requirement and one recommendation:
- vDS trunk port group configuration to carry multiple VLANs (requirement)
- VLAN pinning for deterministic North/South flows (recommendation)
This blog talks mainly about the second bullet point and how we can achieve correct VLAN pinning. Correct VLAN pinning requires multiple individual configuration steps at different levels, for example the vDS trunk port group teaming or the N-VDS named teaming policy configuration. The goal behind this VLAN pinning is a deterministic end-to-end path.
When configured correctly, the BGP session is forced to follow the data forwarding path, and hence the MAC addresses of the Tier-0 Gateway Layer 3 Interfaces (LIF) are learned only on the expected ToR/Leaf switch trunk interfaces.
In this blog the NSX-T Edge Node VMs are deployed on ESXi hosts which are NOT prepared for NSX-T. The two ESXi hosts belong to a single vSphere Cluster used exclusively for NSX-T Edge Node VMs. There are a few good reasons NOT to prepare ESXi hosts with NSX-T when they host only NSX-T Edge Node VMs:
- It is not required and does not cost you extra licenses
- Better NSX-T upgradeability (you don't need to evacuate the NSX-T Edge Node VMs with vMotion so that the host can enter maintenance mode during the host NSX-T software upgrade; every vMotion of an NSX-T Edge Node VM causes a short, unnecessary data plane glitch)
- Shorter NSX-T upgrade cycles (for every NSX-T upgrade you only need to upgrade the ESXi hosts used for the payload VMs and the NSX-T Edge Node VMs themselves, but not the ESXi hosts where the Edge Nodes are deployed)
- vSphere HA can be turned off (do we want to move a highly loaded packet forwarding node with vMotion during a vSphere HA event on a host? No, I don't think so, as the routing HA model is much quicker)
- Simplified DRS settings (do we want to move an NSX-T Edge Node with vMotion to balance the resources?)
- Typically a resource pool is not required
We should never underestimate how important smooth upgrade cycles are. Upgrade cycles are time-consuming events and are typically required multiple times per year.
Keeping the ESXi hosts NOT prepared for NSX-T is considered best practice and should be the choice in any NSX-T deployment that can afford a dedicated vSphere Cluster only for NSX-T Edge Node VMs. Installing NSX-T on the ESXi hosts where you have deployed your NSX-T Edge Node VMs (called a collapsed design) is appropriate for customers who have a low number of ESXi hosts and want to keep the CAPEX costs low.
The diagram below shows the lab test bed of a single ESXi host with a single Edge Node appliance which has only a single N-VDS. The relevant configuration steps are marked with 1 to 4.
The NSX-T Edge Node VM is configured with two transport zones. The same overlay transport zone is used for the compute ESXi hosts where I host the payload VMs. Both transport zones are assigned to a single N-VDS, called NY-HOST-NVDS. The name might confuse you a little, but the same NY-HOST-NVDS is used for all compute ESXi hosts prepared with NSX-T, which indicates that only a single N-VDS is required regardless of whether it runs on an Edge Node or on a compute ESXi host. You may of course select a different name for the N-VDS.
The single N-VDS (NY-HOST-NVDS) on the Edge Node is configured with an Uplink Profile (see more details below) with two static TEP IP addresses, which allows us to load balance the Geneve encapsulated overlay traffic for North/South. Both Edge Node FastPath interfaces (fp-eth0 & fp-eth1) are mapped to a labelled Active Uplink name as part of the default teaming policy.
There are 4 areas where we need to take care of the correct settings.
<1> - At the physical ToR/Leaf Switch Level
The trunk ports will allow only the required VLANs
- VLAN 60 - NSX-T Edge Node management interface
- VLAN 151 - Geneve TEP VLAN
- VLAN 160 - Northbound Uplink VLAN for NY-N3K-LEAF-10
- VLAN 161 - Northbound Uplink VLAN for NY-N3K-LEAF-11
The resulting interface configuration along with the relevant BGP configuration is shown in the table below. Please note that for redundancy reasons both Northbound Uplink VLANs 160 and 161 are allowed on the trunk configuration. Under normal conditions, NY-N3K-LEAF-10 will learn only MAC addresses from VLAN 60, 151 and 160 and NY-N3K-LEAF-11 will learn only MAC addresses from VLAN 60, 151 and 161.
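As a small illustration, the allowed-VLAN list on the trunk ports (`60,151,160-161` in Table 1) can be checked against the four required VLANs. This is only a sketch: the function names `expand_vlans` and `missing_vlans` are hypothetical helpers and not part of any Cisco or VMware tooling.

```python
def expand_vlans(spec: str) -> set[int]:
    """Expand an NX-OS style VLAN list such as '60,151,160-161' into a set of IDs."""
    vlans: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = (int(x) for x in part.split("-", 1))
            vlans.update(range(low, high + 1))
        else:
            vlans.add(int(part))
    return vlans


# VLANs every ESXi-facing trunk must carry, taken from the list above.
REQUIRED_VLANS = {
    60: "NSX-T Edge Node management",
    151: "Geneve TEP",
    160: "Northbound uplink for NY-N3K-LEAF-10",
    161: "Northbound uplink for NY-N3K-LEAF-11",
}


def missing_vlans(allowed_spec: str) -> set[int]:
    """Return the required VLANs that the trunk does not allow."""
    return set(REQUIRED_VLANS) - expand_vlans(allowed_spec)


print(missing_vlans("60,151,160-161"))  # set()
print(missing_vlans("60,151,160"))      # {161}
```

An empty result means the trunk carries everything the Edge Node needs, including the "other" uplink VLAN that is only used during a failover.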
Table 1 - Nexus ToR/LEAF Switch Configuration
NY-N3K-LEAF-10 Interface Configuration | NY-N3K-LEAF-11 Interface Configuration |
---|---|
interface Ethernet1/2<br>description *NY-ESX50A-VMNIC2*<br>switchport mode trunk<br>switchport trunk allowed vlan 60,151,160-161<br>spanning-tree port type edge trunk | interface Ethernet1/2<br>description *NY-ESX50A-VMNIC3*<br>switchport mode trunk<br>switchport trunk allowed vlan 60,151,160-161<br>spanning-tree port type edge trunk |
interface Ethernet1/4<br>description *NY-ESX51A-VMNIC2*<br>switchport mode trunk<br>switchport trunk allowed vlan 60,151,160-161<br>spanning-tree port type edge trunk | interface Ethernet1/4<br>description *NY-ESX51A-VMNIC3*<br>switchport mode trunk<br>switchport trunk allowed vlan 60,151,160-161<br>spanning-tree port type edge trunk |
router bgp 64512<br>router-id 172.16.3.10<br>log-neighbor-changes<br>---snip---<br>neighbor 172.16.160.20 remote-as 64513<br>update-source Vlan160<br>timers 4 12<br>address-family ipv4 unicast<br>neighbor 172.16.160.21 remote-as 64513<br>update-source Vlan160<br>timers 4 12<br>address-family ipv4 unicast | router bgp 64512<br>router-id 172.16.3.11<br>log-neighbor-changes<br>---snip---<br>neighbor 172.16.161.20 remote-as 64513<br>update-source Vlan161<br>timers 4 12<br>address-family ipv4 unicast<br>neighbor 172.16.161.21 remote-as 64513<br>update-source Vlan161<br>timers 4 12<br>address-family ipv4 unicast |
As part of the Cisco Nexus 3048 BGP configuration we see that only NY-N3K-LEAF-10 terminates the BGP session on VLAN 160 and only NY-N3K-LEAF-11 terminates the BGP session on VLAN 161.
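The peering layout can be sanity-checked with a few lines of Python. The sketch below copies the neighbor addresses from Table 1 and assumes /24 uplink subnets (the mask is taken from the Tier-0 interface output later in this post). The dictionary layout and the function name `misplaced_neighbors` are hypothetical.

```python
import ipaddress

# Neighbor addresses copied from Table 1. The /24 subnets are an assumption
# based on the Tier-0 LIF addresses (172.16.160.x/24 and 172.16.161.x/24).
LEAF_PEERS = {
    "NY-N3K-LEAF-10": {
        "update_source": "Vlan160",
        "subnet": "172.16.160.0/24",
        "neighbors": ["172.16.160.20", "172.16.160.21"],
    },
    "NY-N3K-LEAF-11": {
        "update_source": "Vlan161",
        "subnet": "172.16.161.0/24",
        "neighbors": ["172.16.161.20", "172.16.161.21"],
    },
}


def misplaced_neighbors(peers: dict) -> list[tuple[str, str]]:
    """Return (leaf, neighbor) pairs whose address is outside the leaf's uplink subnet."""
    bad = []
    for leaf, cfg in peers.items():
        net = ipaddress.ip_network(cfg["subnet"])
        for nbr in cfg["neighbors"]:
            if ipaddress.ip_address(nbr) not in net:
                bad.append((leaf, nbr))
    return bad


print(misplaced_neighbors(LEAF_PEERS))  # []
```

An empty list confirms that each leaf only peers with Tier-0 addresses in its own uplink VLAN, which is the precondition for the pinning described in this post.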
<2> - At the vDS Port Group Level
The vDS is configured with a total of four vDS port groups:
- Port Group (Type VLAN): NY-VDS-PG-ESX5x-NSXT-EDGE-MGMT60: carries only VLAN 60 and has an active/standby teaming policy
- Port Group (Type VLAN): NY-vDS-PG-ESX5x-EDGE2-Dummy999: this dummy port group is used for the remaining unused Edge Node FastPath interface (fp-eth2) to prevent NSX-T from reporting its admin status as down
- Port Group (Type VLAN trunking): NY-vDS-PG-ESX5x-EDGE2-EDGE-TrunkA: Carries the Edge Node TEP VLAN 151 and Uplink VLAN 160
- Port Group (Type VLAN trunking): NY-vDS-PG-ESX5x-EDGE2-EDGE-TrunkB: Carries the Edge Node TEP VLAN 151 and Uplink VLAN 161
The two trunk port groups each have only one vDS-Uplink active; the other vDS-Uplink is set to standby. This is required so that the Uplink VLAN traffic, along with the BGP session, is forwarded only on the specific vDS-Uplink (each vDS-Uplink is mapped to the corresponding pNIC) under normal conditions. With these settings we achieve:
- A deterministic failover order
- Symmetric bandwidth for both overlay and North/South traffic
- The BGP session between the Tier-0 Gateway and the ToR/Leaf switches should stay UP even when one or both physical links between the ToR/Leaf switches and the ESXi hosts go down (the BGP session is then carried over the trunk link between the ToR/Leaf switches).
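The active/standby behaviour of the two trunk port groups can be modelled in a few lines. This is a simplified sketch and not how vSphere implements teaming. The vDS uplink names `Uplink1` and `Uplink2` are assumptions (the post does not name them); `Uplink1` is assumed to map to vmnic2 (towards NY-N3K-LEAF-10) and `Uplink2` to vmnic3 (towards NY-N3K-LEAF-11).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrunkPortGroup:
    name: str
    vlans: tuple[int, ...]
    active: str   # vDS uplink used under normal conditions
    standby: str  # vDS uplink used only when the active one is down


# Uplink names are hypothetical; the VLAN lists come from the port group list above.
TRUNK_A = TrunkPortGroup("NY-vDS-PG-ESX5x-EDGE2-EDGE-TrunkA", (151, 160), "Uplink1", "Uplink2")
TRUNK_B = TrunkPortGroup("NY-vDS-PG-ESX5x-EDGE2-EDGE-TrunkB", (151, 161), "Uplink2", "Uplink1")


def carrying_uplink(pg: TrunkPortGroup, link_up: dict[str, bool]) -> Optional[str]:
    """Return the vDS uplink that carries the port group's traffic, or None if both are down."""
    for uplink in (pg.active, pg.standby):
        if link_up.get(uplink, False):
            return uplink
    return None


normal = {"Uplink1": True, "Uplink2": True}
leaf10_link_down = {"Uplink1": False, "Uplink2": True}

print(carrying_uplink(TRUNK_A, normal))            # Uplink1
print(carrying_uplink(TRUNK_B, normal))            # Uplink2
print(carrying_uplink(TRUNK_A, leaf10_link_down))  # Uplink2
```

In the failure case both trunks share the remaining uplink, so VLAN 160 reaches NY-N3K-LEAF-10 only via the link between the two leaf switches, which matches the third bullet above.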
The table below highlights the relevant VLAN and Teaming settings:
Table 2 - vDS Port Group Configuration
NY-vDS-PG-ESX5x-EDGE2-EDGE-TrunkA Configuration | NY-vDS-PG-ESX5x-EDGE2-EDGE-TrunkB Configuration |
---|---|
<3> - At the NSX-T Uplink Profile Level
The NSX-T Uplink Profile is a global construct that defines how traffic leaves a Transport Node or an Edge Transport Node.
The single Uplink Profile used for the two Edge Node FastPath interfaces (fp-eth0 & fp-eth1) needs to be extended with two additional Named Teaming Policies to steer the North/South uplink traffic to the corresponding ToR/Leaf switch.
- The default teaming policy must be configured as Source-port-ID load balancing with the two Active Uplinks (I am using the labels EDGE-UPLINK1 & EDGE-UPLINK2)
- An additional teaming policy called NY-Named-Teaming-N3K-LEAF-10 is configured as a failover teaming policy with a single Active Uplink (label EDGE-UPLINK1)
- An additional teaming policy called NY-Named-Teaming-N3K-LEAF-11 is configured as a failover teaming policy with a single Active Uplink (label EDGE-UPLINK2)
Please note that the Active Uplink labels used in the default teaming policy and in the additional Named Teaming Policies need to be the same.
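The three teaming policies and the label rule can be written down as plain data plus a small consistency check. The dictionary keys and policy strings below are illustrative only; they are not claimed to match the NSX-T API schema, and `profile_problems` is a hypothetical helper.

```python
# Illustrative model of the Edge Uplink Profile described above.
UPLINK_PROFILE = {
    "default_teaming": {
        "policy": "LOADBALANCE_SOURCE_PORT_ID",
        "active": ["EDGE-UPLINK1", "EDGE-UPLINK2"],
    },
    "named_teamings": [
        {"name": "NY-Named-Teaming-N3K-LEAF-10", "policy": "FAILOVER", "active": ["EDGE-UPLINK1"]},
        {"name": "NY-Named-Teaming-N3K-LEAF-11", "policy": "FAILOVER", "active": ["EDGE-UPLINK2"]},
    ],
}


def profile_problems(profile: dict) -> list[str]:
    """Check that each named teaming pins exactly one uplink label known to the default teaming."""
    problems = []
    default_labels = set(profile["default_teaming"]["active"])
    for teaming in profile["named_teamings"]:
        if len(teaming["active"]) != 1:
            problems.append(f"{teaming['name']}: expected exactly one active uplink")
        unknown = set(teaming["active"]) - default_labels
        if unknown:
            problems.append(f"{teaming['name']}: unknown uplink label(s) {sorted(unknown)}")
    return problems


print(profile_problems(UPLINK_PROFILE))  # []
```

A typo in one of the labels, for example `EDGE-UPLINK3`, would show up as an "unknown uplink label" entry instead of silently breaking the pinning.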
<4> - At the NSX-T Uplink VLAN Segment Level
To activate the previously configured Named Teaming Policies for the Tier-0 VLAN segments 160 and 161, we first need to assign the Named Teaming Policies to the VLAN transport zone.
The last step is to configure each of the two Uplink VLAN segments (160 & 161) with the corresponding Named Teaming Policy. NSX-T 2.5.1 requires the VLAN segment to be configured with the Named Teaming Policy in the "legacy" Advanced Networking & Security UI. The recently released NSX-T 3.0 will support this in the Policy UI.
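The dependency between the two steps (policy listed on the VLAN transport zone first, then referenced by the segment) can be expressed as a simple check. The segment and policy names come from this post; the data layout and the function `unpinned_segments` are hypothetical and only meant as a sketch.

```python
# Named Teaming Policies assigned to the VLAN transport zone.
VLAN_TZ_POLICIES = {"NY-Named-Teaming-N3K-LEAF-10", "NY-Named-Teaming-N3K-LEAF-11"}

# Uplink VLAN segments and the Named Teaming Policy each one references.
SEGMENTS = {
    "NY-T0-EDGE-UPLINK-SEGMENT-160": {"vlan": 160, "teaming": "NY-Named-Teaming-N3K-LEAF-10"},
    "NY-T0-EDGE-UPLINK-SEGMENT-161": {"vlan": 161, "teaming": "NY-Named-Teaming-N3K-LEAF-11"},
}


def unpinned_segments(segments: dict, tz_policies: set[str]) -> list[str]:
    """Return segments whose teaming policy is missing or not assigned to the transport zone."""
    return sorted(
        name
        for name, seg in segments.items()
        if seg.get("teaming") not in tz_policies
    )


print(unpinned_segments(SEGMENTS, VLAN_TZ_POLICIES))  # []
```

A segment that appears in the result would not be pinned by its named policy and would presumably follow the default teaming policy, so its uplink traffic would no longer be deterministic.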
Table 3 - NSX-T VLAN Segment Configuration
VLAN Segment NY-T0-EDGE-UPLINK-SEGMENT-160 | VLAN Segment NY-T0-EDGE-UPLINK-SEGMENT-161 |
---|---|
Verification
The resulting topology with both NSX-T Edge Nodes and the previously shown steps is shown below. It shows how the Tier-0 VLAN segments 160 and 161 are "routed" through the different levels from the Tier-0 Gateway towards the Nexus Leaf switches via the vDS trunk port groups.
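The same end-to-end path can be traced hop by hop with a handful of lookup tables. Segment, policy, uplink label and switch names come from this post. The attachment of fp-eth0/fp-eth1 to TrunkA/TrunkB and the active pNIC per trunk are inferred from the MAC address tables further down rather than quoted from the configuration, so treat them as assumptions.

```python
# One lookup table per configuration level (steps 4 down to 1).
SEGMENT_TO_POLICY = {
    160: "NY-Named-Teaming-N3K-LEAF-10",
    161: "NY-Named-Teaming-N3K-LEAF-11",
}
POLICY_TO_UPLINK = {
    "NY-Named-Teaming-N3K-LEAF-10": "EDGE-UPLINK1",
    "NY-Named-Teaming-N3K-LEAF-11": "EDGE-UPLINK2",
}
UPLINK_TO_FASTPATH = {"EDGE-UPLINK1": "fp-eth0", "EDGE-UPLINK2": "fp-eth1"}
# Inferred, not quoted from the post:
FASTPATH_TO_TRUNK = {
    "fp-eth0": "NY-vDS-PG-ESX5x-EDGE2-EDGE-TrunkA",
    "fp-eth1": "NY-vDS-PG-ESX5x-EDGE2-EDGE-TrunkB",
}
TRUNK_TO_ACTIVE_PNIC = {
    "NY-vDS-PG-ESX5x-EDGE2-EDGE-TrunkA": "vmnic2",
    "NY-vDS-PG-ESX5x-EDGE2-EDGE-TrunkB": "vmnic3",
}
PNIC_TO_LEAF = {"vmnic2": "NY-N3K-LEAF-10", "vmnic3": "NY-N3K-LEAF-11"}


def pinned_path(vlan: int) -> list[str]:
    """Resolve the path an uplink VLAN takes under normal conditions."""
    policy = SEGMENT_TO_POLICY[vlan]
    uplink = POLICY_TO_UPLINK[policy]
    fastpath = UPLINK_TO_FASTPATH[uplink]
    trunk = FASTPATH_TO_TRUNK[fastpath]
    pnic = TRUNK_TO_ACTIVE_PNIC[trunk]
    return [f"VLAN {vlan}", policy, uplink, fastpath, trunk, pnic, PNIC_TO_LEAF[pnic]]


for vlan in (160, 161):
    print(" -> ".join(pinned_path(vlan)))
```

If any single table points the "wrong" way, the path for that VLAN ends on the other leaf switch, which is exactly the misconfiguration the MAC address check in the verification is meant to catch.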
The best way to verify that all your settings are correct is to check on which ToR/Leaf trunk port the MAC addresses of the Tier-0 Gateway Layer 3 interfaces are learned. These Layer 3 interfaces belong to the Tier-0 Service Router (SR). You can get the MAC addresses via the CLI.
Table 4 - NSX-T Tier-0 Layer 3 Interface Configuration
ny-edge-transport-node-20(tier0_sr)> get interfaces | ny-edge-transport-node-21(tier0_sr)> get interfaces |
---|---|
Interface: 2f83fda5-0da5-4764-87ea-63c0989bf059<br>Ifuid: 276<br>Name: NY-T0-LIF160-EDGE-20<br>Internal name: uplink-276<br>Mode: lif<br>IP/Mask: 172.16.160.20/24<br>MAC: 00:50:56:97:51:65<br>LS port: 40102113-c8af-4d4e-a94d-ca44f9efe9a5<br>Urpf-mode: STRICT_MODE<br>DAD-mode: LOOSE<br>RA-mode: SLAAC_DNS_TRHOUGH_RA(M=0, O=0)<br>Admin: up<br>Op_state: up<br>MTU: 9000 | Interface: a3d7669a-e81c-43ea-81c0-dd60438284bc<br>Ifuid: 289<br>Name: NY-T0-LIF160-EDGE-21<br>Internal name: uplink-289<br>Mode: lif<br>IP/Mask: 172.16.160.21/24<br>MAC: 00:50:56:97:84:c3<br>LS port: 045cd486-d8c5-4df5-8784-2e49862771f4<br>Urpf-mode: STRICT_MODE<br>DAD-mode: LOOSE<br>RA-mode: SLAAC_DNS_TRHOUGH_RA(M=0, O=0)<br>Admin: up<br>Op_state: up<br>MTU: 9000 |
Interface: a1f0d5d0-3883-4e04-b985-e391ec1d9711<br>Ifuid: 281<br>Name: NY-T0-LIF161-EDGE-20<br>Internal name: uplink-281<br>Mode: lif<br>IP/Mask: 172.16.161.20/24<br>MAC: 00:50:56:97:a7:33<br>LS port: d180ee9a-8e82-4c59-8195-ea65660ea71a<br>Urpf-mode: STRICT_MODE<br>DAD-mode: LOOSE<br>RA-mode: SLAAC_DNS_TRHOUGH_RA(M=0, O=0)<br>Admin: up<br>Op_state: up<br>MTU: 9000 | Interface: 2de46a54-3dba-4ddc-abe7-5b713260e7d4<br>Ifuid: 296<br>Name: NY-T0-LIF161-EDGE-21<br>Internal name: uplink-296<br>Mode: lif<br>IP/Mask: 172.16.161.21/24<br>MAC: 00:50:56:97:ec:1b<br>LS port: c32e2109-32d0-4c0f-a916-bfba01fdd6ac<br>Urpf-mode: STRICT_MODE<br>DAD-mode: LOOSE<br>RA-mode: SLAAC_DNS_TRHOUGH_RA(M=0, O=0)<br>Admin: up<br>Op_state: up<br>MTU: 9000 |
The MAC address tables show that ToR/Leaf switch NY-N3K-LEAF-10 learns the Tier-0 Layer 3 MAC addresses from VLAN 160 locally and from VLAN 161 via Portchannel 1 (Po1).
Likewise, the MAC address tables show that ToR/Leaf switch NY-N3K-LEAF-11 learns the Tier-0 Layer 3 MAC addresses from VLAN 161 locally and from VLAN 160 via Portchannel 1 (Po1).
Table 5 - ToR/Leaf Switch MAC Address Table for Northbound Uplink VLAN 160 and 161
ToR/Leaf Switch NY-N3K-LEAF-10 | ToR/Leaf Switch NY-N3K-LEAF-11 |
---|---|
NY-N3K-LEAF-10# show mac address-table dynamic vlan 160<br>Legend:<br>* - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC<br>age - seconds since last seen,+ - primary entry using vPC Peer-Link,<br>(T) - True, (F) - False, C - ControlPlane MAC, ~ - vsan<br>VLAN MAC Address Type age Secure NTFY Ports<br>---------+-----------------+--------+---------+------+----+------------------<br>* 160 0050.5697.5165 dynamic 0 F F Eth1/2<br>* 160 0050.5697.84c3 dynamic 0 F F Eth1/4 | NY-N3K-LEAF-11# show mac address-table dynamic vlan 160<br>Legend:<br>* - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC<br>age - seconds since last seen,+ - primary entry using vPC Peer-Link,<br>(T) - True, (F) - False, C - ControlPlane MAC, ~ - vsan<br>VLAN MAC Address Type age Secure NTFY Ports<br>---------+-----------------+--------+---------+------+----+------------------<br>* 160 0050.5697.5165 dynamic 0 F F Po1<br>* 160 0050.5697.84c3 dynamic 0 F F Po1<br>* 160 780c.f049.0c81 dynamic 0 F F Po1 |
NY-N3K-LEAF-10# show mac address-table dynamic vlan 161<br>Legend:<br>* - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC<br>age - seconds since last seen,+ - primary entry using vPC Peer-Link,<br>(T) - True, (F) - False, C - ControlPlane MAC, ~ - vsan<br>VLAN MAC Address Type age Secure NTFY Ports<br>---------+-----------------+--------+---------+------+----+------------------<br>* 161 0050.5697.a733 dynamic 0 F F Po1<br>* 161 0050.5697.ec1b dynamic 0 F F Po1<br>* 161 502f.a8a8.717c dynamic 0 F F Po1 | NY-N3K-LEAF-11# show mac address-table dynamic vlan 161<br>Legend:<br>* - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC<br>age - seconds since last seen,+ - primary entry using vPC Peer-Link,<br>(T) - True, (F) - False, C - ControlPlane MAC, ~ - vsan<br>VLAN MAC Address Type age Secure NTFY Ports<br>---------+-----------------+--------+---------+------+----+------------------<br>* 161 0050.5697.a733 dynamic 0 F F Eth1/2<br>* 161 0050.5697.ec1b dynamic 0 F F Eth1/4<br>* 161 780c.f049.0c81 dynamic 0 F F Po1 |
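The comparison done by eye in Table 5 can also be scripted. The sketch below converts the colon-separated MAC format of the NSX-T CLI into the dotted NX-OS format and checks whether the Tier-0 MAC addresses are learned on a host-facing `Eth` port rather than on `Po1`. The function names are hypothetical, and the parsing only assumes the entry layout visible in the output above.

```python
def to_nxos_mac(mac: str) -> str:
    """Convert 00:50:56:97:51:65 (NSX-T CLI) to 0050.5697.5165 (NX-OS)."""
    digits = "".join(ch for ch in mac.lower() if ch not in ":-.")
    if len(digits) != 12 or any(ch not in "0123456789abcdef" for ch in digits):
        raise ValueError(f"not a MAC address: {mac!r}")
    return ".".join(digits[i:i + 4] for i in range(0, 12, 4))


def learned_ports(show_output: str) -> dict[str, str]:
    """Map MAC -> port for the entry lines of 'show mac address-table dynamic'."""
    ports = {}
    for line in show_output.splitlines():
        fields = line.split()
        # Entry lines look like: * 160 0050.5697.5165 dynamic 0 F F Eth1/2
        if len(fields) >= 8 and fields[1].isdigit() and fields[3] == "dynamic":
            ports[fields[2]] = fields[-1]
    return ports


def learned_on_host_port(lif_macs: list[str], show_output: str) -> bool:
    """True if every given MAC is learned on an Ethernet port (not via the port channel)."""
    ports = learned_ports(show_output)
    return all(ports.get(to_nxos_mac(mac), "").startswith("Eth") for mac in lif_macs)


# Tier-0 LIF MAC addresses for VLAN 160 (Table 4) and the entries from Table 5.
LIF160_MACS = ["00:50:56:97:51:65", "00:50:56:97:84:c3"]
LEAF10_VLAN160 = """\
* 160 0050.5697.5165 dynamic 0 F F Eth1/2
* 160 0050.5697.84c3 dynamic 0 F F Eth1/4
"""
LEAF11_VLAN160 = """\
* 160 0050.5697.5165 dynamic 0 F F Po1
* 160 0050.5697.84c3 dynamic 0 F F Po1
* 160 780c.f049.0c81 dynamic 0 F F Po1
"""

print(learned_on_host_port(LIF160_MACS, LEAF10_VLAN160))  # True
print(learned_on_host_port(LIF160_MACS, LEAF11_VLAN160))  # False
```

`True` on NY-N3K-LEAF-10 and `False` on NY-N3K-LEAF-11 is the expected result for VLAN 160; for VLAN 161 the result should be the other way around.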
As we have seen in the Edge Transport Node configuration, each Edge Node has two statically configured TEP IP addresses. Both FastPath interfaces load balance the Geneve encapsulated overlay traffic. Table 8 lists the MAC addresses of the Edge Node FastPath interfaces, so that the Edge Node TEP MAC addresses learned on the ToR/Leaf switches can be verified.
Table 7 - ToR/Leaf Switch MAC Address Table for Edge Node TEP VLAN 151
ToR/Leaf Switch NY-N3K-LEAF-10 | ToR/Leaf Switch NY-N3K-LEAF-11 |
---|---|
NY-N3K-LEAF-10# show mac address-table dynamic vlan 151<br>Legend:<br>* - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC<br>age - seconds since last seen,+ - primary entry using vPC Peer-Link,<br>(T) - True, (F) - False, C - ControlPlane MAC, ~ - vsan<br>VLAN MAC Address Type age Secure NTFY Ports<br>---------+-----------------+--------+---------+------+----+------------------<br>* 151 0050.5697.5165 dynamic 0 F F Eth1/2<br>* 151 0050.5697.84c3 dynamic 0 F F Eth1/4<br>* 151 0050.5697.a733 dynamic 0 F F Po1<br>* 151 0050.5697.ec1b dynamic 0 F F Po1<br>* 151 502f.a8a8.717c dynamic 0 F F Po1 | NY-N3K-LEAF-11# show mac address-table dynamic vlan 151<br>Legend:<br>* - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC<br>age - seconds since last seen,+ - primary entry using vPC Peer-Link,<br>(T) - True, (F) - False, C - ControlPlane MAC, ~ - vsan<br>VLAN MAC Address Type age Secure NTFY Ports<br>---------+-----------------+--------+---------+------+----+------------------<br>* 151 0000.0c9f.f097 dynamic 0 F F Po1<br>* 151 0050.5697.5165 dynamic 0 F F Po1<br>* 151 0050.5697.84c3 dynamic 0 F F Po1<br>* 151 0050.5697.a733 dynamic 0 F F Eth1/2<br>* 151 0050.5697.ec1b dynamic 0 F F Eth1/4<br>* 151 780c.f049.0c81 dynamic 0 F F Po1 |
Table 8 - NSX-T Edge Node TEP MAC Addresses
ny-edge-transport-node-20> | ny-edge-transport-node-21> |
---|---|
ny-edge-transport-node-20> get interface fp-eth0 \| find MAC<br>MAC address: 00:50:56:97:51:65<br>ny-edge-transport-node-20> get interface fp-eth1 \| find MAC<br>MAC address: 00:50:56:97:a7:33 | ny-edge-transport-node-21> get interface fp-eth0 \| find MAC<br>MAC address: 00:50:56:97:84:c3<br>ny-edge-transport-node-21> get interface fp-eth1 \| find MAC<br>MAC address: 00:50:56:97:ec:1b |
For the sake of completeness, the table below shows that only ToR/Leaf switch NY-N3K-LEAF-10 learns the two Edge Node management MAC addresses from VLAN 60 locally, while ToR/Leaf switch NY-N3K-LEAF-11 learns them only via Portchannel 1 (Po1). This is expected, as we have configured the teaming policy as active/standby at the vDS port group level. The Edge Node N-VDS is not relevant for the Edge Node management interface.
Table 9 - ToR/Leaf Switch MAC Address Table for Edge Node Management VLAN 60
ToR/Leaf Switch NY-N3K-LEAF-10 | ToR/Leaf Switch NY-N3K-LEAF-11 |
---|---|
NY-N3K-LEAF-10# show mac address-table dynamic vlan 60<br>Legend:<br>* - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC<br>age - seconds since last seen,+ - primary entry using vPC Peer-Link,<br>(T) - True, (F) - False, C - ControlPlane MAC, ~ - vsan<br>VLAN MAC Address Type age Secure NTFY Ports<br>---------+-----------------+--------+---------+------+----+------------------<br>* 60 0050.5697.1e49 dynamic 0 F F Eth1/4<br>* 60 0050.5697.4555 dynamic 0 F F Eth1/2<br>* 60 502f.a8a8.717c dynamic 0 F F Po1 | NY-N3K-LEAF-11# show mac address-table dynamic vlan 60<br>Legend:<br>* - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC<br>age - seconds since last seen,+ - primary entry using vPC Peer-Link,<br>(T) - True, (F) - False, C - ControlPlane MAC, ~ - vsan<br>VLAN MAC Address Type age Secure NTFY Ports<br>---------+-----------------+--------+---------+------+----+------------------<br>* 60 0000.0c9f.f03c dynamic 0 F F Po1<br>* 60 0050.5697.1e49 dynamic 0 F F Po1<br>* 60 0050.5697.4555 dynamic 0 F F Po1 |
Please note that I always highly recommend running a few failover tests to confirm that the NSX-T Edge Node deployment works as expected.
I hope you had a little bit of fun reading this blog about a single N-VDS on the Edge Node with VLAN pinning.
Software Inventory:
vSphere version: VMware ESXi, 6.5.0, 15256549
vCenter version: 6.5.0, 10964411
NSX-T version: 2.5.1.0.0.15314288 (GA)
Cisco Nexus 3048 NX-OS version: 7.0(3)I7(6)
Blog history
Version 1.0 - 13.04.2020 - first published version