Dear readers
Welcome to this new blog about static routing with the NSX-T Tier-0 Gateway. The majority of our customers use BGP between the Tier-0 Gateway and the Top of Rack (ToR) switches to exchange IP prefixes. For those customers who prefer static routing, this blog covers the two design options:
- Design Option 1: Static Routing using SVI as Next Hop with NSX-T Edge Nodes in Active/Active Mode to support ECMP for North/South
- Design Option 2: Static Routing using SVI as Next Hop with NSX-T Edge Nodes in Active/Standby Mode using HA VIP
I have the impression that the second design option, a Tier-0 Gateway with two NSX-T Edge Nodes in Active/Standby mode using HA VIP, is widely known, while the first option, NSX-T Edge Nodes in Active/Active mode leveraging ECMP with static routing, is relatively unknown. The first option is, for example, also a valid Enterprise PKS (now called Tanzu Kubernetes Grid Integrated - TKGI) design option (with a shared Tier-1 Gateway) and can equally be used with vSphere 7 with Kubernetes (Project Pacific) where BGP is not allowed or not preferred. I am sure the reader is aware that a Tier-0 Gateway in Active/Active mode cannot be enabled for stateful services (e.g. Edge firewall).
Before we start configuring these two design options, we need to describe the overall lab topology, the physical and logical setup, and the NSX-T Edge Node setup including the main NSX-T Edge Node installation steps. For both options we will configure only a single N-VDS on the NSX-T Edge Node. This is not a requirement, but it is considered a fairly simple design option. The other popular design options typically consist of three embedded N-VDS on the NSX-T Edge Node for design option 1 and two embedded N-VDS on the NSX-T Edge Node for design option 2.
Logical Lab Topology
The lab setup is pretty simple. For an easy comparison between the two options, I have configured both design options in parallel. The most relevant part for this blog is between the two Tier-0 Gateways and the two ToR switches acting as Layer 3 Leaf switches. The configuration and design of the Tier-1 Gateway and the compute vSphere cluster hosting the eight Ubuntu workload VMs are identical for both design options. There is only a single Tier-1 Gateway per Tier-0 Gateway configured, each with two overlay segments. The eight Ubuntu workload VMs run on a separate compute vSphere cluster called NY-CLUSTER-COMPUTE1 with only two ESXi hosts and are evenly distributed across the two hosts. These two compute ESXi hosts are prepared for NSX-T and have only a single overlay Transport Zone configured. The four NSX-T Edge Node VMs run on another vSphere cluster, called NY-CLUSTER-EDGE1. This vSphere cluster again has only two ESXi hosts. A third vSphere cluster called NY-CLUSTER-MGMT is used for the management components, like vCenter and the NSX-T Managers. Details about the compute and management vSphere clusters are not relevant for this blog and hence are deliberately omitted.
The diagram below shows the NSX-T logical topology, the most relevant vSphere objects, and the NSX-T overlay and VLAN segments (for the NSX-T Edge Node North/South connectivity).
Physical Setup
Let's first have a look at the physical setup used for our four VM-based NSX-T Edge Nodes. Understanding the physical setup is no less important than understanding the logical setup. Two Nexus 3048 ToR switches configured as Layer 3 Leaf switches are used. They have a Layer 3 connection towards a single spine (not shown) and two Layer 2 trunks combined into a single port channel with LACP between the two ToR switches. The two ESXi hosts (ny-esx50a and ny-esx51a) have a total of four pNICs, assigned to two different virtual Distributed Switches (vDS). Please note, the Nexus 3048 switches are not configured with Cisco vPC, even though this would also be a valid option.
The relevant physical links for the NSX-T Edge Node connectivity are only the four green links connected to vDS2.
These two ESXi hosts (ny-esx50a and ny-esx51a) are NOT prepared for NSX-T. They belong to a single vSphere cluster exclusively used for NSX-T Edge Node VMs. There are a few good reasons NOT to prepare ESXi hosts with NSX-T when they host only NSX-T Edge Node VMs:
- It is not required
- Better NSX-T upgradability (you don't need to evacuate the VM-based NSX-T Edge Nodes with vMotion so the host can enter maintenance mode during the NSX-T host software upgrade; every vMotion of a VM-based NSX-T Edge Node causes a short, unnecessary data plane glitch)
- Shorter NSX-T upgrade cycles (for every NSX-T upgrade you only need to upgrade the ESXi hosts that carry the payload VMs and the VM-based NSX-T Edge Nodes themselves, but not the ESXi hosts where your Edge Nodes are deployed)
- vSphere HA can be turned off (do we really want to move a highly loaded packet forwarding node like an NSX-T Edge Node with vMotion in a vSphere HA host failure event? No - the routing HA model reacts faster to a failure event)
- Simplified DRS settings (do we really want to move VM-based NSX-T Edge Nodes with vMotion to balance resources?)
- Typically a resource pool is not required
We should never underestimate how important smooth upgrade cycles are. Upgrades are time-consuming events and are typically required multiple times per year.
Leaving the ESXi hosts NOT prepared for NSX-T is considered best practice and should be applied in every NSX-T deployment that can afford a dedicated vSphere cluster just for VM-based NSX-T Edge Nodes. Installing NSX-T on the ESXi hosts where the VM-based NSX-T Edge Nodes are deployed (called a collapsed design) is valid too and appropriate for customers with a low number of ESXi hosts who want to keep CAPEX costs low.
ESXi Host vSphere Networking
The first virtual Distributed Switch (vDS1) is used for host vmkernel networking only. The typical vmkernel interfaces are attached to three different port groups. The second virtual Distributed Switch (vDS2) is used for NSX-T VM-based Edge Node networking only. All vDS port groups are tagged with the appropriate VLAN ID, with the exception of the three uplink trunk port groups (more details later). Both virtual Distributed Switches are configured with an MTU of 9000 bytes. I am using different Geneve Tunnel End Point (TEP) VLANs for the compute ESXi hosts (VLAN 150 for ny-esx70a and ny-esx71a) and for the two NSX-T VM-based Edge Nodes (VLAN 151) running on the ESXi hosts ny-esx50a and ny-esx51a. In such a setup this is not a requirement, but it helps to distribute the BUM traffic replication effort leveraging the hierarchical 2-Tier replication mode. The "dummy" port group is used to connect the unused NSX-T Edge Node fast path interfaces (fp-ethX); the attachment to a dummy port group avoids NSX-T reporting these interfaces with admin status down.
Table 1 - vDS Setup Overview
Name in Diagram | vDS Name | Physical Interfaces | Port Groups |
---|---|---|---|
vDS1 | NY-vDS-ESX5x-EDGE1 | vmnic0 and vmnic1 | NY-vDS-PG-ESX5x-EDGE1-VMK0-Mgmt50, NY-vDS-PG-ESX5x-EDGE1-VMK1-vMotion51, NY-vDS-PG-ESX5x-EDGE1-VMK2-ipStorage52 |
vDS2 | NY-vDS-ESX5x-EDGE2 | vmnic2 and vmnic3 | NY-vDS-PG-ESX5x-EDGE2-EDGE-Mgmt60 (Uplink 1 active, Uplink 2 standby), NY-vDS-PG-ESX5x-EDGE2-EDGE-TrunkA (Uplink 1 active, Uplink 2 unused), NY-vDS-PG-ESX5x-EDGE2-EDGE-TrunkB (Uplink 1 unused, Uplink 2 active), NY-vDS-PG-ESX5x-EDGE2-EDGE-TrunkC (Uplink 1 active, Uplink 2 active), NY-vDS-PG-ESX5x-EDGE2-Dummy999 (Uplink 1 and Uplink 2 unused) |
The combined diagram below shows the most relevant NY-vDS-ESX5x-EDGE2 port group settings regarding VLAN trunking and Teaming and Failover.
Logical VLAN Setup
The ToR switches are configured with the four relevant VLANs (60, 151, 160 and 161) for the NSX-T Edge Nodes and the associated Switched Virtual Interfaces (SVI). The VLANs 151, 160 and 161 (VLAN 161 is not used in design option 2) are carried over the three vDS trunk port groups (NY-vDS-PG-ESX5x-EDGE2-EDGE-TrunkA, NY-vDS-PG-ESX5x-EDGE2-EDGE-TrunkB and NY-vDS-PG-ESX5x-EDGE2-EDGE-TrunkC). The SVIs on the Nexus 3048 for Edge Management (VLAN 60) and for the Edge Node TEP (VLAN 151) are configured with HSRPv2 with a VIP of .254. The two SVIs on the Nexus 3048 for the uplink VLANs (160 and 161) are configured without HSRP. VLAN 999 for the dummy port group does not exist on the ToR switches. The Tier-1 Gateway is not shown in the diagrams below.
Please note, the dotted lines to SVI161 respectively SVI160 indicate that the VLAN/SVI configuration exists on the ToR switch but is not used for static routing in design option 1 (Active/Active ECMP).
Likewise, the dotted line to SVI161 in design option 2 indicates that the VLAN/SVI configuration exists on the ToR switches but is not used for static routing in design option 2 (Active/Standby with HA VIP). More details about the static routing are shown in a later step.
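To make the ToR side more tangible, the following is a minimal NX-OS sketch of the relevant SVIs on NY-N3K-LEAF-10. The physical SVI address in VLAN 151 and the HSRP group number are assumptions; the .254 VIP, the VLAN numbers and the VLAN 160 SVI address are taken from the lab described above:

```
feature interface-vlan
feature hsrp

! Edge Node TEP VLAN - HSRPv2 with VIP .254 (the physical SVI address .252 is an assumption)
interface Vlan151
  no shutdown
  ip address 172.16.151.252/24
  hsrp version 2
  hsrp 151
    ip 172.16.151.254

! Uplink VLAN - plain SVI without HSRP, later used as static routing next hop
interface Vlan160
  no shutdown
  ip address 172.16.160.254/24
```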
NSX-T Edge Node Deployment
The NSX-T Edge Node deployment option with the single Edge Node N-VDS is simple and has been discussed in one of my other blogs. In this lab exercise I have done an NSX-T Edge Node OVA installation, followed by the "join" command and the final step of the NSX-T Edge Transport Node configuration. The NSX-T UI installation option is also valid, but my personal preference is the OVA deployment option. The most relevant aspects for such an NSX-T Edge Node setup are the correct placement of the dot1q tagging and the correct mapping of the NSX-T Edge Node interfaces to the virtual Distributed Switch (vDS2) trunk port groups (A & B for option 1 and C for option 2), as shown in the diagrams below.
The diagram below shows the overall NSX-T Edge Node setup and the network selection for NSX-T Edge Nodes 20 & 21 during the OVA deployment for design option 1:
The diagram below shows the overall NSX-T Edge Node setup and the network selection for NSX-T Edge Nodes 22 & 23 during the OVA deployment for design option 2:
After the successful OVA deployment, the "join" command must be used to connect the management plane of the NSX-T Edge Nodes to the NSX-T Managers. The "join" command requires the NSX-T Manager thumbprint. Jump with SSH to the first NSX-T Manager and read the API thumbprint, then jump via SSH to every OVA-deployed NSX-T Edge Node and execute the "join" command. The two steps are shown in the table below:
Table 2 - NSX-T Edge Node "join" to the NSX-T Managers
Step | Command Example | Device | Comments |
---|---|---|---|
Read API thumbprint | ny-nsxt-manager-21> get certificate api thumbprint ea90e8cc7adb6d66994a9ecc0a930ad4bfd1d09f668a3857e252ee8f74ba1eb4 | first NSX-T Manager | N/A |
Join each NSX-T Edge Node to the NSX-T Managers | ny-nsxt-edge-node-20> join management-plane ny-nsxt-manager-21.corp.local thumbprint ea90e8cc7adb6d66994a9ecc0a930ad4bfd1d09f668a3857e252ee8f74ba1eb4 username admin - Password for API user: - Node successfully registered as Fabric Node: 437e2972-bc40-11ea-b89c-005056970bf2 (repeat for all other NSX-T Edge Nodes) | every NSX-T Edge Node previously deployed through OVA | NSX-T will sync the configuration with the two other NSX-T Managers. Do not join using the NSX-T Manager VIP FQDN/IP. |
The resulting UI after the "join" command is shown below. The configuration state must show "Configure NSX".
NSX-T Edge Transport Node Configuration
Before we can start with the NSX-T Edge Transport Node configuration, we need to make sure that the Uplink Profiles are ready. The two design options require two different Uplink Profiles. The two diagrams below show the two different Uplink Profiles for the NSX-T Edge Transport Nodes:
The Uplink Profile "NY-EDGE-UPLINK-PROFILE-SRC-ID-TEP-VLAN151" is used for design option 1 and requires for Multi-TEP the teaming policy "LOADBALANCE_SRCID" with two active uplinks (EDGE-UPLINK01 and EDGE-UPLINK02). Two additional named teaming policies are configured for proper ECMP data plane forwarding; please see the blog "Single NSX-T Edge Node N-VDS with correct VLAN pinning" for more details. I am using the same named teaming configuration for design option 1 as in that blog, where I used BGP instead of static routing. As already mentioned, the dot1q tagging (Transport VLAN = 151) for the two TEP interfaces is required as part of this Uplink Profile configuration.
The Uplink Profile "NY-EDGE-UPLINK-PROFILE-FAILOVER-TEP-VLAN151" is used for design option 2 and requires the teaming policy "FAILOVER_ORDER" with only a single active uplink (EDGE-UPLINK01). Named teaming policies are not required. Again, the dot1q tagging for the single TEP interface (Transport VLAN = 151) is required as part of this Uplink Profile configuration.
The NSX-T Edge Transport Node configuration itself is straightforward and is shown in the two diagrams below, one NSX-T Edge Transport Node per design option.
NSX-T Edge Transport Nodes 20 & 21 (design option 1) use the previously configured Uplink Profile "NY-EDGE-UPLINK-PROFILE-SRC-ID-TEP-VLAN151". Two static TEP IP addresses are configured and the two uplinks (EDGE-UPLINK01 & EDGE-UPLINK02) are mapped to the fast path interfaces (fp-eth0 & fp-eth1).
NSX-T Edge Transport Nodes 22 & 23 (design option 2) use the previously configured Uplink Profile "NY-EDGE-UPLINK-PROFILE-FAILOVER-TEP-VLAN151". A single static TEP IP address is configured and the single uplink (EDGE-UPLINK01) is mapped to the fast path interface (fp-eth0).
Please note, the required configuration of the two NSX-T Transport Zones and the single N-VDS switch is not shown.
The NSX-T Edge Transport Nodes ny-nsxt-edge-node-20 and ny-nsxt-edge-node-21 are assigned to the NSX-T Edge cluster NY-NSXT-EDGE-CLUSTER01, and the NSX-T Edge Transport Nodes ny-nsxt-edge-node-22 and ny-nsxt-edge-node-23 are assigned to the NSX-T Edge cluster NY-NSXT-EDGE-CLUSTER02. This NSX-T Edge cluster configuration is also not shown.
NSX-T Tier-0 Gateway Configuration
The base NSX-T Tier-0 Gateway configuration is straightforward and is shown in the two diagrams below.
The Tier-0 Gateway NY-T0-GATEWAY-01 (design option 1) is configured in Active/Active mode along with the association with the NSX-T Edge Cluster NY-NSXT-EDGE-CLUSTER01.
The Tier-0 Gateway NY-T0-GATEWAY-02 (design option 2) is configured in Active/Standby mode along with the association with the NSX-T Edge Cluster NY-NSXT-EDGE-CLUSTER02. In this example the preemptive failover mode is selected and the first NSX-T Edge Transport Node (ny-nsxt-edge-node-22) is the preferred node (the active node when both nodes are up and running).
The next step of the Tier-0 Gateway configuration covers the Layer 3 interfaces (LIF) for the northbound connectivity towards the ToR switches.
The next two diagrams show the IP topologies, including the ToR switch IP configuration, along with the resulting NSX-T Tier-0 Gateway Layer 3 interface configuration for design option 1 (A/A ECMP).
The next diagram shows the IP topology, including the ToR switch IP configuration, along with the resulting NSX-T Tier-0 Gateway interface configuration for design option 2 (A/S HA VIP).
The HA VIP configuration requires that both NSX-T Edge Transport Node interfaces belong to the same Layer 2 segment. Here I am using the previously configured Layer 3 interfaces (LIF); both belong to the same VLAN segment 160 (NY-T0-VLAN-SEGMENT-160).
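As a hedged reference, the HA VIP can also be expressed through the NSX-T Policy API. In the sketch below the locale-services ID ("default") and the two interface IDs are assumptions; the VIP 172.16.160.24/24 is the one used later as Next Hop for the southbound static routes:

```
PATCH /policy/api/v1/infra/tier-0s/NY-T0-GATEWAY-02/locale-services/default
{
  "ha_vip_configs": [
    {
      "enabled": true,
      "vip_subnets": [
        { "ip_addresses": [ "172.16.160.24" ], "prefix_len": 24 }
      ],
      "external_interface_paths": [
        "/infra/tier-0s/NY-T0-GATEWAY-02/locale-services/default/interfaces/uplink-edge-node-22",
        "/infra/tier-0s/NY-T0-GATEWAY-02/locale-services/default/interfaces/uplink-edge-node-23"
      ]
    }
  ]
}
```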
All the previous steps are probably known by the majority of readers. However, the next step is about the static routing configuration; it highlights the relevant configuration to achieve ECMP with two NSX-T Edge Transport Nodes in Active/Active mode.
Design Option 1 - Static Routing (A/A ECMP)
The first step in design option 1 is the Tier-0 static route configuration for northbound traffic. The most common approach is to configure default routes northbound.
Two default routes, each with a different Next Hop (172.16.160.254 and 172.16.161.254), are configured on NY-T0-GATEWAY-01. This is the first step to achieve ECMP for northbound traffic towards the ToR switches. The diagram below shows the corresponding NSX-T Tier-0 Gateway static routing configuration. Please keep in mind that at the NSX-T Edge Transport Node level, each Edge Transport Node will have two default route entries. This is shown in the table underneath.
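For reference, the same northbound configuration can be expressed through the NSX-T Policy API as a single 0.0.0.0/0 static route entry carrying both Next Hops; a minimal sketch, assuming the route ID "default-route" (network and Next Hops as configured above):

```
PATCH /policy/api/v1/infra/tier-0s/NY-T0-GATEWAY-01/static-routes/default-route
{
  "network": "0.0.0.0/0",
  "next_hops": [
    { "ip_address": "172.16.160.254", "admin_distance": 1 },
    { "ip_address": "172.16.161.254", "admin_distance": 1 }
  ]
}
```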
The difference between the logical construct configuration (Tier-0 Gateway) and the "physical" construct configuration (the Edge Transport Nodes) might already be familiar, as we see the same behavior with BGP. This approach limits configuration errors. With BGP we typically configure only two BGP peers towards the two ToR switches, but each NSX-T Edge Transport Node gets two BGP sessions realized.
The diagram below shows the setup with the two default routes (in black) northbound.
Please note, the configuration steps showing how to configure the Tier-1 Gateway (NY-T1-GATEWAY-GREEN) and how to connect it to the Tier-0 Gateway are not shown.
Table 3 - NSX-T Edge Transport Node Routing Table for Design Option 1 (A/A ECMP)
ny-nsxt-edge-node-20 (Service Router):

ny-nsxt-edge-node-20(tier0_sr)> get route 0.0.0.0/0

Flags: t0c - Tier0-Connected, t0s - Tier0-Static, b - BGP, t0n - Tier0-NAT, t1s - Tier1-Static, t1c - Tier1-Connected, t1n: Tier1-NAT, t1l: Tier1-LB VIP, t1ls: Tier1-LB SNAT, t1d: Tier1-DNS FORWARDER, t1ipsec: Tier1-IPSec, isr: Inter-SR, > - selected route, * - FIB route

Total number of routes: 1

t0s> * 0.0.0.0/0 [1/0] via 172.16.160.254, uplink-307, 03:29:43
t0s> * 0.0.0.0/0 [1/0] via 172.16.161.254, uplink-309, 03:29:43
ny-nsxt-edge-node-20(tier0_sr)>

ny-nsxt-edge-node-21 (Service Router):

ny-nsxt-edge-node-21(tier0_sr)> get route 0.0.0.0/0

Flags: t0c - Tier0-Connected, t0s - Tier0-Static, b - BGP, t0n - Tier0-NAT, t1s - Tier1-Static, t1c - Tier1-Connected, t1n: Tier1-NAT, t1l: Tier1-LB VIP, t1ls: Tier1-LB SNAT, t1d: Tier1-DNS FORWARDER, t1ipsec: Tier1-IPSec, isr: Inter-SR, > - selected route, * - FIB route

Total number of routes: 1

t0s> * 0.0.0.0/0 [1/0] via 172.16.160.254, uplink-292, 03:30:42
t0s> * 0.0.0.0/0 [1/0] via 172.16.161.254, uplink-306, 03:30:42
ny-nsxt-edge-node-21(tier0_sr)>
The second step is to configure static routing southbound from the ToR switches towards the NSX-T Edge Transport Nodes. This step is required to achieve ECMP for southbound traffic. Each ToR switch is configured with a total of four static routes to forward traffic to the destination overlay networks within NSX-T. We can easily see that each NSX-T Edge Transport Node is used twice as a Next Hop in the static route entries.
Table 4 - Nexus ToR Switches Static Routing Configuration and Resulting Routing Table for Design Option 1 (A/A ECMP)
NY-N3K-LEAF-10 static route configuration:

ip route 172.16.240.0/24 Vlan160 172.16.160.20
ip route 172.16.240.0/24 Vlan160 172.16.160.21
ip route 172.16.241.0/24 Vlan160 172.16.160.20
ip route 172.16.241.0/24 Vlan160 172.16.160.21

NY-N3K-LEAF-11 static route configuration:

ip route 172.16.240.0/24 Vlan161 172.16.161.20
ip route 172.16.240.0/24 Vlan161 172.16.161.21
ip route 172.16.241.0/24 Vlan161 172.16.161.20
ip route 172.16.241.0/24 Vlan161 172.16.161.21

Resulting routing table on NY-N3K-LEAF-10:

NY-N3K-LEAF-10# show ip route static
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

172.16.240.0/24, ubest/mbest: 2/0
    *via 172.16.160.20, Vlan160, [1/0], 03:26:44, static
    *via 172.16.160.21, Vlan160, [1/0], 03:26:58, static
172.16.241.0/24, ubest/mbest: 2/0
    *via 172.16.160.20, Vlan160, [1/0], 03:26:44, static
    *via 172.16.160.21, Vlan160, [1/0], 03:26:58, static
---snip---
NY-N3K-LEAF-10#

Resulting routing table on NY-N3K-LEAF-11:

NY-N3K-LEAF-11# show ip route static
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

172.16.240.0/24, ubest/mbest: 2/0
    *via 172.16.161.20, Vlan161, [1/0], 03:27:39, static
    *via 172.16.161.21, Vlan161, [1/0], 03:27:51, static
172.16.241.0/24, ubest/mbest: 2/0
    *via 172.16.161.20, Vlan161, [1/0], 03:27:39, static
    *via 172.16.161.21, Vlan161, [1/0], 03:27:51, static
---snip---
NY-N3K-LEAF-11#
Again, these steps are straightforward and show how we can achieve ECMP with static routing for North/South traffic. But what happens when, as an example, one of the two NSX-T Edge Transport Nodes is down? Let's assume ny-nsxt-edge-node-20 is down. Traffic from the Spine switches will still be forwarded to both ToR switches, and once the ECMP hash is calculated, the traffic is forwarded to one of the four Next Hops (the four Edge Transport Node Layer 3 interfaces). Depending on the hash calculation, the selected Next Hop could be 172.16.160.20 or 172.16.161.20, and both of these interfaces belong to ny-nsxt-edge-node-20. This traffic will be blackholed and dropped! But why do the ToR switches still announce the overlay networks 172.16.240.0/24 and 172.16.241.0/24 to the Spine switches? The reason is simple: for both ToR switches the static route entries are still valid, as VLAN 160/161 and the Next Hops are still UP. So from the ToR switch routing table perspective everything is fine. These static route entries will potentially never go down, as the Next Hop IP addresses belong to VLAN 160 or VLAN 161, and these VLANs stay in the state UP as long as a single physical port is UP and part of one of these VLANs (assuming the ToR switch itself is up and running). Even when all attached ESXi hosts are down, the inter-switch link between the two ToR switches is still UP, and hence VLAN 160 and VLAN 161 are still UP. Please keep in mind, this problem does not exist with BGP: thanks to the BGP keepalives, once an NSX-T Edge Transport Node is down, the ToR switch tears down the BGP session and invalidates the local route entries.
But how can we solve this blackholing issue with static routing? The answer is Bidirectional Forwarding Detection (BFD) for static routing.
What is BFD?
BFD is nothing else than a purpose-built keepalive protocol that routing protocols, including first hop redundancy protocols (e.g. HSRP or VRRP), can subscribe to. Various protocols can piggyback on a single BFD session. In the context of NSX-T, BFD can detect failures in sub-seconds (NSX-T Bare Metal Edge Nodes with 3 x 50ms) or near sub-seconds (NSX-T VM-based Edge Nodes with 3 x 500ms); with an interval of 500ms and a multiplier of 3, a peer is declared dead after 3 x 500ms = 1.5 seconds without received keepalives. All protocols have some way of detecting failure, usually timer-related. Tuning these timers can theoretically get you sub-second failure detection too, but this produces unnecessarily high overhead, as these protocols weren't designed with that in mind. BFD was specifically built for fast failure detection while maintaining a low CPU load. Please keep in mind, if you have, as an example, BGP running between two directly connected physical routers, there is no need for BFD sessions to detect a link failure, as the routing protocol will detect the link-down event instantly. But for two routers (e.g. Tier-0 Gateways) connected through intermediate Layer 2/3 nodes (physical infra, vDS, etc.), where the routing protocol cannot detect a link-down event, the failure must be detected through a dead timer. Welcome to the virtual world!! BFD was later enhanced with the capability to support static routing too; here the driver was not low CPU load and fast failure detection, but extending the functionality of static routes with keepalives.
So, how can we apply BFD for static routing in our lab? Multiple configuration steps are required.
Before we can associate BFD with the static routes on the NSX-T Tier-0 Gateway NY-T0-GATEWAY-01, the creation of a BFD profile for static routes is required. This is shown in the diagram below. I am using the same BFD parameters (Interval = 500ms and Declare Dead Multiple = 3) as NSX-T 3.0 defines by default for BFD registered for BGP.
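If you prefer the NSX-T Policy API over the UI for this step, a BFD profile with these parameters could be sketched as follows (the profile ID NY-BFD-PROFILE-STATIC is an assumption):

```
PATCH /policy/api/v1/infra/bfd-profiles/NY-BFD-PROFILE-STATIC
{
  "interval": 500,
  "multiple": 3
}
```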
The next step is the configuration of BFD peers for static routing at Tier-0 Gateway level. For the BFD peers I am using the same Next Hop IP addresses (172.16.160.254 and 172.16.161.254) as I have used for the static routes northbound towards the ToR switches. Again, this BFD peer configuration is done at Tier-0 Gateway level, but the realization of the BFD peers happens at Edge Transport Node level: on each of the two NSX-T Edge Transport Nodes (Service Router), two BFD sessions are realized. The appropriate BFD source interface on the Tier-0 Gateway (the Layer 3 LIF) is automatically selected by NSX-T, but as you can see, NSX-T allows you to specify the BFD source interface too.
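A hedged Policy API sketch for one of the two BFD peers on NY-T0-GATEWAY-01 could look like this (the peer ID is an assumption; the profile path refers to the BFD profile created above; the second peer towards 172.16.161.254 is configured analogously):

```
PATCH /policy/api/v1/infra/tier-0s/NY-T0-GATEWAY-01/static-routes/bfd-peers/bfd-peer-leaf10
{
  "peer_address": "172.16.160.254",
  "enabled": true,
  "bfd_profile_path": "/infra/bfd-profiles/NY-BFD-PROFILE-STATIC"
}
```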
The table below shows the global BFD timer configuration and the BFD peers with source and peer (destination) IP.
Table 5 - NSX-T Edge Transport Node BFD Configuration
ny-nsxt-edge-node-20 (Service Router):

ny-nsxt-edge-node-20(tier0_sr)> get bfd-config
Logical Router
UUID            : 1cfd7da2-f37c-4108-8f19-7725822f0552
vrf             : 2
lr-id           : 8193
name            : SR-NY-T0-GATEWAY-01
type            : PLR-SR

Global BFD configuration
Enabled         : True
Min RX Interval : 500
Min TX Interval : 500
Min RX TTL      : 255
Multiplier      : 3

Port            : 64a2e029-ad69-4ce1-a40e-def0956a9d2d
Session BFD configuration
Source          : 172.16.160.20
Peer            : 172.16.160.254
Enabled         : True
Min RX Interval : 500
Min TX Interval : 500
Min RX TTL      : 255
Multiplier      : 3

Port            : 371a9b3f-d669-493a-a46b-161d3536b261
Session BFD configuration
Source          : 172.16.161.20
Peer            : 172.16.161.254
Enabled         : True
Min RX Interval : 500
Min TX Interval : 500
Min RX TTL      : 255
Multiplier      : 3

ny-nsxt-edge-node-20(tier0_sr)>

ny-nsxt-edge-node-21 (Service Router):

ny-nsxt-edge-node-21(tier0_sr)> get bfd-config
Logical Router
UUID            : a2ea4cbc-c486-46a1-a663-c9c5815253af
vrf             : 1
lr-id           : 8194
name            : SR-NY-T0-GATEWAY-01
type            : PLR-SR

Global BFD configuration
Enabled         : True
Min RX Interval : 500
Min TX Interval : 500
Min RX TTL      : 255
Multiplier      : 3

Port            : a5454564-ef1c-4e30-922f-9876b9df38df
Session BFD configuration
Source          : 172.16.160.21
Peer            : 172.16.160.254
Enabled         : True
Min RX Interval : 500
Min TX Interval : 500
Min RX TTL      : 255
Multiplier      : 3

Port            : 8423e83b-0a69-44f4-90d1-07d8ece4f55e
Session BFD configuration
Source          : 172.16.161.21
Peer            : 172.16.161.254
Enabled         : True
Min RX Interval : 500
Min TX Interval : 500
Min RX TTL      : 255
Multiplier      : 3

ny-nsxt-edge-node-21(tier0_sr)>
BFD in general, and also for static routing, requires that the peering side is configured with BFD too, to ensure BFD keepalives are sent out and answered. Once the BFD peers are configured on the Tier-0 Gateway, the ToR switches require the appropriate BFD peer configuration as well. This is shown in the table below. Each ToR switch gets two BFD peer configurations, one for each NSX-T Edge Transport Node.
Table 6 - Nexus ToR Switches BFD for Static Routing Configuration
NY-N3K-LEAF-10:

feature bfd
!
ip route static bfd Vlan160 172.16.160.20
ip route static bfd Vlan160 172.16.160.21

NY-N3K-LEAF-11:

feature bfd
!
ip route static bfd Vlan161 172.16.161.20
ip route static bfd Vlan161 172.16.161.21
Once both ends of the BFD peering are configured correctly, the BFD sessions should come up and the static routes should be installed in the routing table.
The table below shows the two BFD neighbors for static routing (interface VLAN160 respectively VLAN161). The BFD neighbor on interface Eth1/49 is the BFD peer towards the Spine switch and is registered for OSPF. For static routing, NX-OS does not list "static routing" as the registered protocol; it shows "netstack", presumably because static routes are owned by the NX-OS netstack (IP stack) component rather than a routing protocol process.
Table 7 - Nexus ToR Switches BFD for Static Routing Configuration and Verification
NY-N3K-LEAF-10# show bfd neighbors

OurAddr         NeighAddr       LD/RD                  RH/RS  Holdown(mult)  State  Int      Vrf
172.16.160.254  172.16.160.20   1090519041/2635291218  Up     1099(3)        Up     Vlan160  default
172.16.160.254  172.16.160.21   1090519042/3842218904  Up     1413(3)        Up     Vlan160  default
172.16.3.18     172.16.3.17     1090519043/1090519041  Up     5629(3)        Up     Eth1/49  default
NY-N3K-LEAF-10#

NY-N3K-LEAF-11# show bfd neighbors

OurAddr         NeighAddr       LD/RD                  RH/RS  Holdown(mult)  State  Int      Vrf
172.16.161.254  172.16.161.20   1090519041/591227029   Up     1384(3)        Up     Vlan161  default
172.16.161.254  172.16.161.21   1090519042/2646176019  Up     1385(3)        Up     Vlan161  default
172.16.3.22     172.16.3.21     1090519043/1090519042  Up     4696(3)        Up     Eth1/49  default
NY-N3K-LEAF-11#

NY-N3K-LEAF-10# show bfd neighbors details

OurAddr         NeighAddr       LD/RD                  RH/RS  Holdown(mult)  State  Int      Vrf
172.16.160.254  172.16.160.20   1090519041/2635291218  Up     1151(3)        Up     Vlan160  default

Session state is Up and not using echo function
Local Diag: 0, Demand mode: 0, Poll bit: 0, Authentication: None
MinTxInt: 500000 us, MinRxInt: 500000 us, Multiplier: 3
Received MinRxInt: 500000 us, Received Multiplier: 3
Holdown (hits): 1500 ms (0), Hello (hits): 500 ms (22759)
Rx Count: 20115, Rx Interval (ms) min/max/avg: 83/1921/437 last: 348 ms ago
Tx Count: 22759, Tx Interval (ms) min/max/avg: 386/386/386 last: 24 ms ago
Registered protocols: netstack
Uptime: 0 days 2 hrs 26 mins 39 secs, Upcount: 1
Last packet: Version: 1 - Diagnostic: 0
             State bit: Up - Demand bit: 0
             Poll bit: 0 - Final bit: 0
             Multiplier: 3 - Length: 24
             My Discr.: -1659676078 - Your Discr.: 1090519041
             Min tx interval: 500000 - Min rx interval: 500000
             Min Echo interval: 0 - Authentication bit: 0
Hosting LC: 1, Down reason: None, Reason not-hosted: None

OurAddr         NeighAddr       LD/RD                  RH/RS  Holdown(mult)  State  Int      Vrf
172.16.160.254  172.16.160.21   1090519042/3842218904  Up     1260(3)        Up     Vlan160  default

Session state is Up and not using echo function
Local Diag: 0, Demand mode: 0, Poll bit: 0, Authentication: None
MinTxInt: 500000 us, MinRxInt: 500000 us, Multiplier: 3
Received MinRxInt: 500000 us, Received Multiplier: 3
Holdown (hits): 1500 ms (0), Hello (hits): 500 ms (22774)
Rx Count: 20105, Rx Interval (ms) min/max/avg: 0/1813/438 last: 239 ms ago
Tx Count: 22774, Tx Interval (ms) min/max/avg: 386/386/386 last: 24 ms ago
Registered protocols: netstack
Uptime: 0 days 2 hrs 26 mins 46 secs, Upcount: 1
Last packet: Version: 1 - Diagnostic: 0
             State bit: Up - Demand bit: 0
             Poll bit: 0 - Final bit: 0
             Multiplier: 3 - Length: 24
             My Discr.: -452748392 - Your Discr.: 1090519042
             Min tx interval: 500000 - Min rx interval: 500000
             Min Echo interval: 0 - Authentication bit: 0
Hosting LC: 1, Down reason: None, Reason not-hosted: None

OurAddr         NeighAddr       LD/RD                  RH/RS  Holdown(mult)  State  Int      Vrf
172.16.3.18     172.16.3.17     1090519043/1090519041  Up     5600(3)        Up     Eth1/49  default

Session state is Up and using echo function with 500 ms interval
Local Diag: 0, Demand mode: 0, Poll bit: 0, Authentication: None
MinTxInt: 500000 us, MinRxInt: 2000000 us, Multiplier: 3
Received MinRxInt: 2000000 us, Received Multiplier: 3
Holdown (hits): 6000 ms (0), Hello (hits): 2000 ms (5309)
Rx Count: 5309, Rx Interval (ms) min/max/avg: 7/2101/1690 last: 399 ms ago
Tx Count: 5309, Tx Interval (ms) min/max/avg: 1689/1689/1689 last: 249 ms ago
Registered protocols: ospf
Uptime: 0 days 2 hrs 29 mins 29 secs, Upcount: 1
Last packet: Version: 1 - Diagnostic: 0
             State bit: Up - Demand bit: 0
             Poll bit: 0 - Final bit: 0
             Multiplier: 3 - Length: 24
             My Discr.: 1090519041 - Your Discr.: 1090519043
             Min tx interval: 500000 - Min rx interval: 2000000
             Min Echo interval: 500000 - Authentication bit: 0
Hosting LC: 1, Down reason: None, Reason not-hosted: None

NY-N3K-LEAF-10#

NY-N3K-LEAF-11# show bfd neighbors details

OurAddr         NeighAddr       LD/RD                  RH/RS  Holdown(mult)  State  Int      Vrf
172.16.161.254  172.16.161.20   1090519041/591227029   Up     1235(3)        Up     Vlan161  default

Session state is Up and not using echo function
Local Diag: 0, Demand mode: 0, Poll bit: 0, Authentication: None
MinTxInt: 500000 us, MinRxInt: 500000 us, Multiplier: 3
Received MinRxInt: 500000 us, Received Multiplier: 3
Holdown (hits): 1500 ms (0), Hello (hits): 500 ms (22634)
Rx Count: 19972, Rx Interval (ms) min/max/avg: 93/1659/438 last: 264 ms ago
Tx Count: 22634, Tx Interval (ms) min/max/avg: 386/386/386 last: 127 ms ago
Registered protocols: netstack
Uptime: 0 days 2 hrs 25 mins 47 secs, Upcount: 1
Last packet: Version: 1 - Diagnostic: 0
             State bit: Up - Demand bit: 0
             Poll bit: 0 - Final bit: 0
             Multiplier: 3 - Length: 24
             My Discr.: 591227029 - Your Discr.: 1090519041
             Min tx interval: 500000 - Min rx interval: 500000
             Min Echo interval: 0 - Authentication bit: 0
Hosting LC: 1, Down reason: None, Reason not-hosted: None

OurAddr         NeighAddr       LD/RD                  RH/RS  Holdown(mult)  State  Int      Vrf
172.16.161.254  172.16.161.21   1090519042/2646176019  Up     1162(3)        Up     Vlan161  default

Session state is Up and not using echo function
Local Diag: 0, Demand mode: 0, Poll bit: 0, Authentication: None
MinTxInt: 500000 us, MinRxInt: 500000 us, Multiplier: 3
Received MinRxInt: 500000 us, Received Multiplier: 3
Holdown (hits): 1500 ms (0), Hello (hits): 500 ms (22652)
Rx Count: 20004, Rx Interval (ms) min/max/avg: 278/1799/438 last: 337 ms ago
Tx Count: 22652, Tx Interval (ms) min/max/avg: 386/386/386 last: 127 ms ago
Registered protocols: netstack
Uptime: 0 days 2 hrs 25 mins 58 secs, Upcount: 1
Last packet: Version: 1 - Diagnostic: 0
             State bit: Up - Demand bit: 0
             Poll bit: 0 - Final bit: 0
             Multiplier: 3 - Length: 24
             My Discr.: -1648791277 - Your Discr.: 1090519042
             Min tx interval: 500000 - Min rx interval: 500000
             Min Echo interval: 0 - Authentication bit: 0
Hosting LC: 1, Down reason: None, Reason not-hosted: None

OurAddr         NeighAddr       LD/RD                  RH/RS  Holdown(mult)  State  Int      Vrf
172.16.3.22     172.16.3.21     1090519043/1090519042  Up     4370(3)        Up     Eth1/49  default

Session state is Up and using echo function with 500 ms interval
Local Diag: 0, Demand mode: 0, Poll bit: 0, Authentication: None
MinTxInt: 500000 us, MinRxInt: 2000000 us, Multiplier: 3
Received MinRxInt: 2000000 us, Received Multiplier: 3
Holdown (hits): 6000 ms (0), Hello (hits): 2000 ms (5236)
Rx Count: 5236, Rx Interval (ms) min/max/avg: 553/1698/1690 last: 1629 ms ago
Tx Count: 5236, Tx Interval (ms) min/max/avg: 1689/1689/1689 last: 1020 ms ago
Registered protocols: ospf
Uptime: 0 days 2 hrs 27 mins 26 secs, Upcount: 1
Last packet: Version: 1 - Diagnostic: 0
             State bit: Up - Demand bit: 0
             Poll bit: 0 - Final bit: 0
             Multiplier: 3 - Length: 24
             My Discr.: 1090519042 - Your Discr.: 1090519043
             Min tx interval: 500000 - Min rx interval: 2000000
             Min Echo interval: 500000 - Authentication bit: 0
Hosting LC: 1, Down reason: None, Reason not-hosted: None

NY-N3K-LEAF-11#
The table below shows the BFD sessions on the Tier-0 Gateway Service Router (SR). The CLI shows the BFD peers and source IP addresses along with the state. Please note, BFD does not require that both ends of the BFD peering are configured with identical interval and multiplier values, but for troubleshooting reasons identical parameters are recommended.
Table 8 - NSX-T Edge Transport Node BFD Verification
ny-nsxt-edge-node-20 (Service Router):

ny-nsxt-edge-node-20(tier0_sr)> get bfd-sessions
BFD Session
Dest_port              : 3784
Diag                   : No Diagnostic
Encap                  : vlan
Forwarding             : last true (current true)
Interface              : 64a2e029-ad69-4ce1-a40e-def0956a9d2d
Keep-down              : false
Last_cp_diag           : No Diagnostic
Last_cp_rmt_diag       : No Diagnostic
Last_cp_rmt_state      : up
Last_cp_state          : up
Last_fwd_state         : UP
Last_local_down_diag   : No Diagnostic
Last_remote_down_diag  : No Diagnostic
Last_up_time           : 2020-07-07 15:42:23
Local_address          : 172.16.160.20
Local_discr            : 2635291218
Min_rx_ttl             : 255
Multiplier             : 3
Received_remote_diag   : No Diagnostic
Received_remote_state  : up
Remote_address         : 172.16.160.254
Remote_admin_down      : false
Remote_diag            : No Diagnostic
Remote_discr           : 1090519041
Remote_min_rx_interval : 500
Remote_min_tx_interval : 500
Remote_multiplier      : 3
Remote_state           : up
Router                 : 1cfd7da2-f37c-4108-8f19-7725822f0552
Router_down            : false
Rx_cfg_min             : 500
Rx_interval            : 500
Service-link           : false
Session_type           : LR_PORT
State                  : up
Tx_cfg_min             : 500
Tx_interval            : 500

BFD Session
Dest_port              : 3784
Diag                   : No Diagnostic
Encap                  : vlan
Forwarding             : last true (current true)
Interface              : 371a9b3f-d669-493a-a46b-161d3536b261
Keep-down              : false
Last_cp_diag           : No Diagnostic
Last_cp_rmt_diag       : No Diagnostic
Last_cp_rmt_state      : up
Last_cp_state          : up
Last_fwd_state         : UP
Last_local_down_diag   : No Diagnostic
Last_remote_down_diag  : No Diagnostic
Last_up_time           : 2020-07-07 15:42:24
Local_address          : 172.16.161.20
Local_discr            : 591227029
Min_rx_ttl             : 255
Multiplier             : 3
Received_remote_diag   : No Diagnostic
Received_remote_state  : up
Remote_address         : 172.16.161.254
Remote_admin_down      : false
Remote_diag            : No Diagnostic
Remote_discr           : 1090519041
Remote_min_rx_interval : 500
Remote_min_tx_interval : 500
Remote_multiplier      : 3
Remote_state           : up
Router                 : 1cfd7da2-f37c-4108-8f19-7725822f0552
Router_down            : false
Rx_cfg_min             : 500
Rx_interval            : 500
Service-link           : false
Session_type           : LR_PORT
State                  : up
Tx_cfg_min             : 500
Tx_interval            : 500

ny-nsxt-edge-node-20(tier0_sr)>

ny-nsxt-edge-node-21 (Service Router):

ny-nsxt-edge-node-21(tier0_sr)> get bfd-sessions
BFD Session
Dest_port              : 3784
Diag                   : No Diagnostic
Encap                  : vlan
Forwarding             : last true (current true)
Interface              : a5454564-ef1c-4e30-922f-9876b9df38df
Keep-down              : false
Last_cp_diag           : No Diagnostic
Last_cp_rmt_diag       : No Diagnostic
Last_cp_rmt_state      : up
Last_cp_state          : up
Last_fwd_state         : UP
Last_local_down_diag   : No Diagnostic
Last_remote_down_diag  : No Diagnostic
Last_up_time           : 2020-07-07 15:42:15
Local_address          : 172.16.160.21
Local_discr            : 3842218904
Min_rx_ttl             : 255
Multiplier             : 3
Received_remote_diag   : No Diagnostic
Received_remote_state  : up
Remote_address         : 172.16.160.254
Remote_admin_down      : false
Remote_diag            : No Diagnostic
Remote_discr           : 1090519042
Remote_min_rx_interval : 500
Remote_min_tx_interval : 500
Remote_multiplier      : 3
Remote_state           : up
Router                 : a2ea4cbc-c486-46a1-a663-c9c5815253af
Router_down            : false
Rx_cfg_min             : 500
Rx_interval            : 500
Service-link           : false
Session_type           : LR_PORT
State                  : up
Tx_cfg_min             : 500
Tx_interval            : 500

BFD Session
Dest_port              : 3784
Diag                   : No Diagnostic
Encap                  : vlan
Forwarding             : last true (current true)
Interface              : 8423e83b-0a69-44f4-90d1-07d8ece4f55e
Keep-down              : false
Last_cp_diag           : No Diagnostic
Last_cp_rmt_diag       : No Diagnostic
Last_cp_rmt_state      : up
Last_cp_state          : up
Last_fwd_state         : UP
Last_local_down_diag   : No Diagnostic
Last_remote_down_diag  : No Diagnostic
Last_up_time           : 2020-07-07 15:42:15
Local_address          : 172.16.161.21
Local_discr            : 2646176019
Min_rx_ttl             : 255
Multiplier             : 3
Received_remote_diag   : No Diagnostic
Received_remote_state  : up
Remote_address         : 172.16.161.254
Remote_admin_down      : false
Remote_diag            : No Diagnostic
Remote_discr           : 1090519042
Remote_min_rx_interval : 500
Remote_min_tx_interval : 500
Remote_multiplier      : 3
Remote_state           : up
Router                 : a2ea4cbc-c486-46a1-a663-c9c5815253af
Router_down            : false
Rx_cfg_min             : 500
Rx_interval            : 500
Service-link           : false
Session_type           : LR_PORT
State                  : up
Tx_cfg_min             : 500
Tx_interval            : 500

ny-nsxt-edge-node-21(tier0_sr)>
Design Option 2 - Static Routing (A/S HA VIP)
The first step in design option 2 is, again, the Tier-0 static route configuration for northbound traffic, and the most common approach is a default route northbound. As already mentioned, HA VIP requires that both NSX-T Edge Transport Node interfaces belong to the same Layer 2 segment (NY-T0-VLAN-SEGMENT-160). A single default route with two different Next Hops (172.16.160.253 and 172.16.160.254, the two ToR SVIs in VLAN 160) is configured on NY-T0-GATEWAY-02; note that the routing tables below confirm these Next Hops. With this design we can also achieve ECMP for northbound traffic towards the ToR switches. The diagram below shows the setup with the two default routes (in black) northbound and the corresponding NSX-T Tier-0 Gateway static routing configuration. Please keep in mind again that at the NSX-T Edge Transport Node level each Edge Transport Node will have two default route entries, realized from the single default route with two Next Hops configured at Tier-0 Gateway level. This is shown in the table underneath.
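The corresponding Policy API sketch for design option 2 differs from design option 1 only in the Next Hops, which now both reside in VLAN 160 (the route ID is again an assumption):

```
PATCH /policy/api/v1/infra/tier-0s/NY-T0-GATEWAY-02/static-routes/default-route
{
  "network": "0.0.0.0/0",
  "next_hops": [
    { "ip_address": "172.16.160.253", "admin_distance": 1 },
    { "ip_address": "172.16.160.254", "admin_distance": 1 }
  ]
}
```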
Please note, the configuration steps showing how to configure the Tier-1 Gateway (NY-T1-GATEWAY-BLUE) and how to connect it to the Tier-0 Gateway are not shown.
Table 9 - NSX-T Edge Transport Node Routing Table for Design Option 2 (A/S HA VIP)
ny-nsxt-edge-node-22 (Service Router):

ny-nsxt-edge-node-22(tier0_sr)> get route 0.0.0.0/0

Flags: t0c - Tier0-Connected, t0s - Tier0-Static, b - BGP, t0n - Tier0-NAT, t1s - Tier1-Static, t1c - Tier1-Connected, t1n: Tier1-NAT, t1l: Tier1-LB VIP, t1ls: Tier1-LB SNAT, t1d: Tier1-DNS FORWARDER, t1ipsec: Tier1-IPSec, isr: Inter-SR, > - selected route, * - FIB route

Total number of routes: 1

t0s> * 0.0.0.0/0 [1/0] via 172.16.160.253, uplink-278, 00:00:27
t0s> * 0.0.0.0/0 [1/0] via 172.16.160.254, uplink-278, 00:00:27
ny-nsxt-edge-node-22(tier0_sr)>

ny-nsxt-edge-node-23 (Service Router):

ny-nsxt-edge-node-23(tier0_sr)> get route 0.0.0.0/0

Flags: t0c - Tier0-Connected, t0s - Tier0-Static, b - BGP, t0n - Tier0-NAT, t1s - Tier1-Static, t1c - Tier1-Connected, t1n: Tier1-NAT, t1l: Tier1-LB VIP, t1ls: Tier1-LB SNAT, t1d: Tier1-DNS FORWARDER, t1ipsec: Tier1-IPSec, isr: Inter-SR, > - selected route, * - FIB route

Total number of routes: 1

t0s> * 0.0.0.0/0 [1/0] via 172.16.160.253, uplink-279, 00:00:57
t0s> * 0.0.0.0/0 [1/0] via 172.16.160.254, uplink-279, 00:00:57
ny-nsxt-edge-node-23(tier0_sr)>
The second step is to configure static routing southbound from the ToR switches towards the NSX-T Edge Transport Nodes. Each ToR switch is configured with two static routes to forward traffic to the destination overlay networks (overlay segments 172.16.242.0/24 and 172.16.243.0/24) within NSX-T. The Next Hop for each of these static routes is the NSX-T Tier-0 Gateway HA VIP.
The table below shows the static routing configuration on the ToR switches and the resulting routing tables. The Next Hop is the Tier-0 Gateway HA VIP 172.16.160.24 for all static routes.
Table 10 - Nexus ToR Switches Static Routing Configuration and Resulting Routing Table for Design Option 2 (A/S HA VIP)
NY-N3K-LEAF-10 and NY-N3K-LEAF-11 (identical configuration):

ip route 172.16.242.0/24 Vlan160 172.16.160.24
ip route 172.16.243.0/24 Vlan160 172.16.160.24

Resulting routing table on NY-N3K-LEAF-10:

NY-N3K-LEAF-10# show ip route static
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

172.16.240.0/24, ubest/mbest: 2/0
    *via 172.16.160.20, Vlan160, [1/0], 02:51:34, static
    *via 172.16.160.21, Vlan160, [1/0], 02:51:41, static
172.16.241.0/24, ubest/mbest: 2/0
    *via 172.16.160.20, Vlan160, [1/0], 02:51:34, static
    *via 172.16.160.21, Vlan160, [1/0], 02:51:41, static
172.16.242.0/24, ubest/mbest: 1/0
    *via 172.16.160.24, Vlan160, [1/0], 02:55:42, static
172.16.243.0/24, ubest/mbest: 1/0
    *via 172.16.160.24, Vlan160, [1/0], 02:55:42, static
NY-N3K-LEAF-10#

Resulting routing table on NY-N3K-LEAF-11:

NY-N3K-LEAF-11# show ip route static
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

172.16.240.0/24, ubest/mbest: 2/0
    *via 172.16.161.20, Vlan161, [1/0], 02:53:04, static
    *via 172.16.161.21, Vlan161, [1/0], 02:53:12, static
172.16.241.0/24, ubest/mbest: 2/0
    *via 172.16.161.20, Vlan161, [1/0], 02:53:04, static
    *via 172.16.161.21, Vlan161, [1/0], 02:53:12, static
172.16.242.0/24, ubest/mbest: 1/0
    *via 172.16.160.24, Vlan160, [1/0], 02:55:03, static
172.16.243.0/24, ubest/mbest: 1/0
    *via 172.16.160.24, Vlan160, [1/0], 02:55:03, static
NY-N3K-LEAF-11#
Failover Sanity Checks
The table below shows the resulting static routing tables on both ToR switches for different NSX-T Edge Transport Node failure cases, together with a short comment per case.
Table 11 - Failover Sanity Check
Failover case 1: All NSX-T Edge Transport Nodes are UP

NY-N3K-LEAF-10# show ip route static
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

172.16.240.0/24, ubest/mbest: 2/0
    *via 172.16.160.20, Vlan160, [1/0], 00:58:27, static
    *via 172.16.160.21, Vlan160, [1/0], 00:58:43, static
172.16.241.0/24, ubest/mbest: 2/0
    *via 172.16.160.20, Vlan160, [1/0], 00:58:27, static
    *via 172.16.160.21, Vlan160, [1/0], 00:58:43, static
172.16.242.0/24, ubest/mbest: 1/0
    *via 172.16.160.24, Vlan160, [1/0], 01:02:47, static
172.16.243.0/24, ubest/mbest: 1/0
    *via 172.16.160.24, Vlan160, [1/0], 01:02:47, static
NY-N3K-LEAF-10#

NY-N3K-LEAF-11# show ip route static
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

172.16.240.0/24, ubest/mbest: 2/0
    *via 172.16.161.20, Vlan161, [1/0], 00:59:10, static
    *via 172.16.161.21, Vlan161, [1/0], 00:59:25, static
172.16.241.0/24, ubest/mbest: 2/0
    *via 172.16.161.20, Vlan161, [1/0], 00:59:10, static
    *via 172.16.161.21, Vlan161, [1/0], 00:59:25, static
172.16.242.0/24, ubest/mbest: 1/0
    *via 172.16.160.24, Vlan160, [1/0], 01:01:21, static
172.16.243.0/24, ubest/mbest: 1/0
    *via 172.16.160.24, Vlan160, [1/0], 01:01:21, static
NY-N3K-LEAF-11#

Failover case 2: NSX-T Edge Transport Node ny-nsxt-edge-node-20 is DOWN, all other NSX-T Edge Transport Nodes are UP

NY-N3K-LEAF-10# show ip route static
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

172.16.240.0/24, ubest/mbest: 1/0
    *via 172.16.160.21, Vlan160, [1/0], 01:01:01, static
172.16.241.0/24, ubest/mbest: 1/0
    *via 172.16.160.21, Vlan160, [1/0], 01:01:01, static
172.16.242.0/24, ubest/mbest: 1/0
    *via 172.16.160.24, Vlan160, [1/0], 01:05:05, static
172.16.243.0/24, ubest/mbest: 1/0
    *via 172.16.160.24, Vlan160, [1/0], 01:05:05, static
NY-N3K-LEAF-10#

NY-N3K-LEAF-11# show ip route static
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

172.16.240.0/24, ubest/mbest: 1/0
    *via 172.16.161.21, Vlan161, [1/0], 01:01:21, static
172.16.241.0/24, ubest/mbest: 1/0
    *via 172.16.161.21, Vlan161, [1/0], 01:01:21, static
172.16.242.0/24, ubest/mbest: 1/0
    *via 172.16.160.24, Vlan160, [1/0], 01:03:17, static
172.16.243.0/24, ubest/mbest: 1/0
    *via 172.16.160.24, Vlan160, [1/0], 01:03:17, static
NY-N3K-LEAF-11#

Comment: The route entries with ny-nsxt-edge-node-20 as Next Hop (172.16.160.20 and 172.16.161.20) are removed by BFD.

Failover case 3: NSX-T Edge Transport Node ny-nsxt-edge-node-21 is DOWN, all other NSX-T Edge Transport Nodes are UP

NY-N3K-LEAF-10# show ip route static
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

172.16.240.0/24, ubest/mbest: 1/0
    *via 172.16.160.20, Vlan160, [1/0], 00:02:40, static
172.16.241.0/24, ubest/mbest: 1/0
    *via 172.16.160.20, Vlan160, [1/0], 00:02:40, static
172.16.242.0/24, ubest/mbest: 1/0
    *via 172.16.160.24, Vlan160, [1/0], 01:12:13, static
172.16.243.0/24, ubest/mbest: 1/0
    *via 172.16.160.24, Vlan160, [1/0], 01:12:13, static
NY-N3K-LEAF-10#

NY-N3K-LEAF-11# show ip route static
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

172.16.240.0/24, ubest/mbest: 1/0
    *via 172.16.161.20, Vlan161, [1/0], 00:03:04, static
172.16.241.0/24, ubest/mbest: 1/0
    *via 172.16.161.20, Vlan161, [1/0], 00:03:04, static
172.16.242.0/24, ubest/mbest: 1/0
    *via 172.16.160.24, Vlan160, [1/0], 01:10:28, static
172.16.243.0/24, ubest/mbest: 1/0
    *via 172.16.160.24, Vlan160, [1/0], 01:10:28, static
NY-N3K-LEAF-11#

Comment: The route entries with ny-nsxt-edge-node-21 as Next Hop (172.16.160.21 and 172.16.161.21) are removed by BFD.

Failover case 4: NSX-T Edge Transport Node ny-nsxt-edge-node-22 is DOWN, all other NSX-T Edge Transport Nodes are UP

NY-N3K-LEAF-10# show ip route static
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

172.16.240.0/24, ubest/mbest: 2/0
    *via 172.16.160.20, Vlan160, [1/0], 00:06:55, static
    *via 172.16.160.21, Vlan160, [1/0], 00:00:09, static
172.16.241.0/24, ubest/mbest: 2/0
    *via 172.16.160.20, Vlan160, [1/0], 00:06:55, static
    *via 172.16.160.21, Vlan160, [1/0], 00:00:09, static
172.16.242.0/24, ubest/mbest: 1/0
    *via 172.16.160.24, Vlan160, [1/0], 01:16:28, static
172.16.243.0/24, ubest/mbest: 1/0
    *via 172.16.160.24, Vlan160, [1/0], 01:16:28, static
NY-N3K-LEAF-10#

NY-N3K-LEAF-11# show ip route static
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

172.16.240.0/24, ubest/mbest: 2/0
    *via 172.16.161.20, Vlan161, [1/0], 00:07:01, static
    *via 172.16.161.21, Vlan161, [1/0], 00:00:16, static
172.16.241.0/24, ubest/mbest: 2/0
    *via 172.16.161.20, Vlan161, [1/0], 00:07:01, static
    *via 172.16.161.21, Vlan161, [1/0], 00:00:16, static
172.16.242.0/24, ubest/mbest: 1/0
    *via 172.16.160.24, Vlan160, [1/0], 01:14:25, static
172.16.243.0/24, ubest/mbest: 1/0
    *via 172.16.160.24, Vlan160, [1/0], 01:14:25, static
NY-N3K-LEAF-11#

Comment: A single NSX-T Edge Transport Node used for HA VIP being down does not change the routing table.

Failover case 5: NSX-T Edge Transport Node ny-nsxt-edge-node-23 is DOWN, all other NSX-T Edge Transport Nodes are UP

NY-N3K-LEAF-10# show ip route static
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

172.16.240.0/24, ubest/mbest: 2/0
    *via 172.16.160.20, Vlan160, [1/0], 00:10:58, static
    *via 172.16.160.21, Vlan160, [1/0], 00:04:12, static
172.16.241.0/24, ubest/mbest: 2/0
    *via 172.16.160.20, Vlan160, [1/0], 00:10:58, static
    *via 172.16.160.21, Vlan160, [1/0], 00:04:12, static
172.16.242.0/24, ubest/mbest: 1/0
    *via 172.16.160.24, Vlan160, [1/0], 01:20:31, static
172.16.243.0/24, ubest/mbest: 1/0
    *via 172.16.160.24, Vlan160, [1/0], 01:20:31, static
NY-N3K-LEAF-10#

NY-N3K-LEAF-11# show ip route static
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

172.16.240.0/24, ubest/mbest: 2/0
    *via 172.16.161.20, Vlan161, [1/0], 00:11:30, static
    *via 172.16.161.21, Vlan161, [1/0], 00:04:45, static
172.16.241.0/24, ubest/mbest: 2/0
    *via 172.16.161.20, Vlan161, [1/0], 00:11:30, static
    *via 172.16.161.21, Vlan161, [1/0], 00:04:45, static
172.16.242.0/24, ubest/mbest: 1/0
    *via 172.16.160.24, Vlan160, [1/0], 01:18:54, static
172.16.243.0/24, ubest/mbest: 1/0
    *via 172.16.160.24, Vlan160, [1/0], 01:18:54, static
NY-N3K-LEAF-11#

Comment: A single NSX-T Edge Transport Node used for HA VIP being down does not change the routing table.

Failover case 6: NSX-T Edge Transport Nodes ny-nsxt-edge-node-20 and ny-nsxt-edge-node-21 are DOWN, all other NSX-T Edge Transport Nodes are UP

NY-N3K-LEAF-10# show ip route static
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

172.16.242.0/24, ubest/mbest: 1/0
    *via 172.16.160.24, Vlan160, [1/0], 01:24:06, static
172.16.243.0/24, ubest/mbest: 1/0
    *via 172.16.160.24, Vlan160, [1/0], 01:24:06, static
NY-N3K-LEAF-10#

NY-N3K-LEAF-11# show ip route static
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

172.16.242.0/24, ubest/mbest: 1/0
    *via 172.16.160.24, Vlan160, [1/0], 01:22:54, static
172.16.243.0/24, ubest/mbest: 1/0
    *via 172.16.160.24, Vlan160, [1/0], 01:22:54, static
NY-N3K-LEAF-11#

Comment: All route entries related to design option 1 are removed by BFD.
I hope you had a little bit of fun reading this blog post about static routing with NSX-T. Now, with the knowledge of how to achieve ECMP with static routing, you might have a new and interesting design option for your customers' NSX-T deployments.
Software Inventory:
vSphere version: VMware ESXi, 6.5.0, 15256549
vCenter version: 6.5.0, 10964411
NSX-T version: 3.0.0.0.0.15946738 (GA)
Cisco Nexus 3048 NX-OS version: 7.0(3)I7(6)
Blog history
Version 1.0 - 08.07.2020 - first published version