Building an NSX-T nested lab with Eve-NG, virtualised Arista switches, BGP, ECMP and the kitchen sink attached.

Since my last tweet and blogpost on a bug in NSX-T when deploying on a nested ESXi host, I've had a few requests from people to describe the actual lab setup used, the procedure and a quick-and-dirty guide to get started with NSX-T on vSphere. This blogpost will focus on the actual physical lab setup, the virtual lab setup, the physical (and virtual) networking, and the actual NSX-T implementation and setup.

Note: This blog presumes the reader has decent skills in networking and virtualisation, as we'll be going full-on inception on our virtualised network. If you're having issues wrapping your head around nested labs, you're in for a ride.

Due to the size of this blogpost a table of contents has been provided.

Introduction to NSX-T

For the people not familiar with NSX-T (also known as "Transformers", hence the spiffy product logo, and not to be confused with the Honda NSX-T), NSX-T is a new product from VMware, based on concepts from both NSX-v as well as NSX-MH (short for Multi-Hypervisor). Most people will mainly be aware of NSX for vSphere (which is what most people refer to when they talk about VMware NSX); NSX-MH was the Nicira implementation for non-vSphere platforms. While it was comparable to NSX for vSphere in some aspects, the intended user base, feature set, configuration and deployment were completely different.

A network that can transform into a sports car
I too have always dreamed about a network that transforms into a car. Or a car with Layer 3 routing capabilities.

Now, with NSX-T we see a convergence of vSphere-based and non-vSphere based SDN into a single platform. While NSX-T is an entirely new product and technical comparisons cannot be made directly between NSX-T and NSX-v or NSX-MH, a lot of concepts from both NSX-v and NSX-MH have been kept and function similarly from a conceptual point of view. While feature parity is not there yet, NSX-T offers some significant benefits over NSX-v such as control plane based routing configuration, improvements to redundancy (BGP-MED and BFD come to mind), decoupling of the vCenter management layer, a significantly better API, and much more. While you'll mostly see NSX-v in your typical vSphere based datacenter environment, don't be surprised if you start hearing the name NSX-T much more in the coming year.

Physical lab setup

My physical lab consists of the following components:

  • a Supermicro server with the following specs:
    • 6 core Xeon-D with hyperthreading
    • 128GB Memory
    • 1TB Samsung 960 Pro M.2 NVMe
    • 4x 4TB WD Red 5400RPM
    • 2x 1GbE and 2x 10GbE (note that the 10GbE ports are only running at 1GbE speeds because of the switch).
  • A second Supermicro server with a 128GB SSD, a 4 core CPU and 32GB of memory, which is mostly used if I need some physical hardware to mess around with things.
  • a Ubiquiti EdgeSwitch Lite 24x 1GbE switch.
  • a Ubiquiti EdgeRouter Lite providing WAN connectivity and services such as firewalling and dynamic routing.

On this box we run ESXi 6.5 with a vCenter server and all other related infrastructure components such as AD, NSX-v, FreeNAS, etc.

L3 connectivity is provided by the EdgeSwitch Lite, with the exception of some VLANs such as DMZ or transit VLANs which are L2 up to the EdgeRouter Lite. A quick overview of my network infrastructure at home can be seen below.

At the distributed vswitch level we've created the following relevant port groups:

  • 2 Trunk port groups which trunk all VLANs in the 23xx and 24xx ranges respectively, plus the 10xx range. These are used to trunk traffic to our nested ESXi hosts.
  • 1 Trunk port group which trunks all VLANs in the range 10xx. This is used to connect our virtual switches running in eve-NG to the physical network.
  • Access port groups for transit networks, which will be used to connect the NSX Edges to the physical network.
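To make the trunk layout a bit more tangible, here is a minimal Python sketch of which trunk carries which VLAN. It assumes that "23xx" means VLAN IDs 2300-2399 (and likewise for 24xx and 10xx), which is only one reading of the notation, and the port group names are hypothetical.

```python
# Hypothetical model of the trunk port groups described above.
# Assumption: "23xx" is read as VLAN IDs 2300-2399, "24xx" as 2400-2499
# and "10xx" as 1000-1099. The port group names are made up.
TRUNKS = {
    "trunk-site1": [range(2300, 2400), range(1000, 1100)],
    "trunk-site2": [range(2400, 2500), range(1000, 1100)],
    "trunk-eve-ng": [range(1000, 1100)],
}


def carries(port_group: str, vlan_id: int) -> bool:
    """Return True if the given trunk port group passes this VLAN ID."""
    return any(vlan_id in vlan_range for vlan_range in TRUNKS[port_group])


def trunks_for(vlan_id: int) -> list[str]:
    """List every trunk port group that passes this VLAN ID."""
    return [name for name in TRUNKS if carries(name, vlan_id)]


if __name__ == "__main__":
    print(trunks_for(1012))  # a 10xx VLAN is carried by all three trunks
    print(trunks_for(2301))  # a 23xx VLAN only reaches the site 1 trunk
```

The point of the overlap is that the 10xx VLANs are visible to both the nested hosts and the Eve-NG switches, while the 23xx and 24xx ranges stay site-local.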

For the purposes of this lab, we are running the following components on this machine:

  • 2 Nested ESXi hosts with the following specs:
    • 4 virtual Nics connected to their "site-local" trunk ports as described above (either 23xx or 24xx). 2 Nics have been assigned to the standard interfaces (management, vMotion, storage, etc.), while two have been left unassigned. This is important for later.
    • 2 vCPUs
    • 12GB Memory
    • 8 GB local storage
    • Connected to a FreeNAS provided iSCSI datastore
  • 1 Virtual machine running Eve-NG
    • 2 virtual nics, 1 connected to my management VLAN, 1 connected to the 10xx trunk port.

Eve-NG allows you to run virtualised network components from pretty much every major vendor inside a virtual machine, providing you with nearly all of the features that do not require hardware support.

These ESXi machines were then connected to the vCenter and each placed in its own cluster. Note that I would normally use William Lam's vGhetto nested lab deployment, but for this lab I wanted to build the machines myself.

So the structure we're getting in vCenter is as follows:

  • vCenter
    • Cluster "Nested - Site 1"
      • vesxi101.int.vxsan.com (Host)
        • Nested VM for testing purposes
    • Cluster "Nested - Site 2"
      • vesxi201.int.vxsan.com (Host)
        • Nested VM for testing purposes
    • Cluster "Physical"
      • pesxi101
        • vesxi101.int.vxsan.com (VM)
        • vesxi201.int.vxsan.com (VM)

If you're getting confused right now, consider the fact that you can leave the physical server and vCenter components out of the equation for now. They'll be purely used to deploy workloads on, but this will be no different than if you were to deploy all remaining components on the nested ESXi hosts.

Virtual network setup

So earlier in this blog post I told you we'd be going full-on inception? This is where the fun really starts. Remember the Eve-NG machine we deployed earlier? We're running two Arista vEOS QEMU images on there which are connected to the VLAN 10xx trunk port connected to the Eve-NG VM. This means that all VLAN tagging passed through the Eve-NG VM arrives at the Arista vEOS images untouched, which allows us to do VLAN trunking all the way at the virtual switch level.

Logically, this looks something like this:

Yo dawg

Conceptually, it looks something like this:

While the whole setup procedure for Eve-NG is out of scope for this blog post, there are great guides out there on how to set it up and how the product actually works. I've just got a few tips when deploying this:

  • remember to enable MAC Address changes and promiscuous mode on your trunk port groups and prevent yourself from looking like an idiot.
  • make sure your MTUs match. This cost me a significant amount of time, but the Arista vEOS will use MTU 9214 by default on its port groups. If you have a different MTU configured on your physical switch, things will break and you'll be tearing out your hair trying to figure out why it's not working. Make sure your MTU matches end-to-end.
  • Nested virtualised networking will perform like a drunk potato. You're running QEMU on top of vSphere. Your switch has no hardware offload whatsoever. You're most likely going to see 50+ms latency even between two switches running on the same Eve-NG machine. Don't expect this to perform well in any way whatsoever. This is purely for testing purposes.
  • Do not configure BFD with this setup, shit will break in unexpected and hilarious ways involving explosions and women and children manning the lifeboats.
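The MTU tip above is easy to check on paper before you start debugging. The following is a minimal Python sketch that groups the hops in a path by their configured MTU. The 9214 value is the vEOS default mentioned above; the hop names and the 9000 values are made-up examples and not taken from this lab.

```python
from collections import defaultdict


def group_by_mtu(path: dict[str, int]) -> dict[int, list[str]]:
    """Group hop names by their configured MTU."""
    groups: dict[int, list[str]] = defaultdict(list)
    for hop, mtu in path.items():
        groups[mtu].append(hop)
    return dict(groups)


def is_consistent(path: dict[str, int]) -> bool:
    """True when every hop in the path uses the same MTU."""
    return len(set(path.values())) == 1


# Hypothetical path: only the 9214 default comes from the text above.
EXAMPLE_PATH = {
    "physical switch": 9000,
    "distributed vSwitch trunk": 9000,
    "Arista vEOS": 9214,
}

if __name__ == "__main__":
    if not is_consistent(EXAMPLE_PATH):
        for mtu, hops in sorted(group_by_mtu(EXAMPLE_PATH).items()):
            print(f"MTU {mtu}: {', '.join(hops)}")
```

Anything that shows up in a group of its own is a good first suspect.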

NSX-T deployment

Now that we have our physical and our nested network set up, I'd like to apologise for the lengthy introduction. However, I do believe it is required to understand all the following steps, since so much of the NSX-T configuration depends on it, and the amount of nesting involved makes it very easy to make mistakes. Now that we're done with this, we can finally get to the interesting bit: deploying NSX-T!

The topology we're going to be building is as follows:

Preparation

Start off by downloading all the required bits from your software source, either the Nicira portal or the VMware partner portal. We really need all the bits, as there's no automatic deployment of any component from the NSX Manager, as you'd expect if you are familiar with NSX-v. Once you're done you should have the following files:

  • nsx-manager-1.1.0.0.0.4788147.ova
  • nsx-controller-1.1.0.0.0.4788146.ova
  • nsx-edge-1.1.0.0.0.4788148.ova
  • nsx-edge-1.1.0.0.0.4788148.iso
  • nsx-lcp-1.1.0.0.0.4788198-esx65.zip

All components can be deployed directly on the physical ESXi host outside of your nested lab. Unlike NSX-v, there is no requirement to run any components on hosts being managed by NSX itself. For simplicity's sake, I recommend not deploying any components on the nested hosts, to avoid increasing the yodawg-level even more.

In addition, make sure you've reserved static IP addresses for all components. As a precaution, I've also created A and PTR records for all components apart from the VTEP interfaces.

Deploying the NSX-T Manager and controller.

In this step, we'll prepare our deployment by doing the following:

  • Deploy the NSX manager.
  • Deploy the NSX controllers.
  • Join the controllers to our management plane.

First, we deploy the NSX Manager. This can be done in the regular way by deploying an OVF. All the extended properties are relatively straightforward. I would recommend enabling SSH as it makes configuration significantly simpler.

Next, deploy at least one controller. Again, this is an OVF deployment without any advanced configuration properties, and enabling SSH significantly increases your quality of life.

Once the controller is deployed, log in to both the manager and the controller through either the console or through ssh and run the following commands.

Note that if you are using PuTTY, for some reason both the NSX Manager and the controller will disconnect the session unexpectedly. I have no idea what's causing this, so I've used bash on Windows for this lab.

On the manager:

NSX-Manager> get certificate api thumbprint  
973eacda1ec9ff72139dnotgettingthisonesorrya4ef7b3beb4ab6a47bc19287  

Note: store this thumbprint for later, you'll need it quite a few times.
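Since the thumbprint gets pasted into several join commands in this post, a small helper can catch copy-paste mistakes before they reach the CLI. This is only a sketch: the function name is hypothetical, and the check assumes the API thumbprint is a SHA-256 digest (64 hexadecimal characters). The command format itself is the one used throughout this post.

```python
import ipaddress
import string


def join_management_plane(manager_ip: str, thumbprint: str,
                          username: str = "admin") -> str:
    """Build the 'join management-plane' CLI line used in this post."""
    ipaddress.ip_address(manager_ip)  # raises ValueError if this is not an IP
    cleaned = thumbprint.strip().lower()
    # Assumption: the thumbprint is a SHA-256 digest, so 64 hex characters.
    if len(cleaned) != 64 or not all(c in string.hexdigits for c in cleaned):
        raise ValueError("thumbprint should be 64 hexadecimal characters")
    return (f"join management-plane {manager_ip} "
            f"username {username} thumbprint {cleaned}")


if __name__ == "__main__":
    fake_thumbprint = "ab12" * 16  # placeholder, not a real thumbprint
    print(join_management_plane("172.22.100.22", fake_thumbprint))
```

The same line is reused for the controllers and later for the edges, so generating it once avoids retyping the thumbprint.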

On the controller:

join management-plane ip.of.nsx.mgr username admin thumbprint nsx.mgr.thumbprint  

Enter the password for your nsx manager, wait for a minute, and ensure that your manager is connected by running the following.

On the controller:

nsxtc> get managers  
- 172.22.100.22    Connected
nsxtc> set control-cluster security-model shared-secret  
Secret: enteryoursecrethere  
nsxtc> initialize control-cluster  
nsxtc> get control-cluster status  
is master: true  
in majority: true  
uuid                                 address              status  
dc59f305-354f-4cdd-bf4b-f4e7f424a2b2 172.22.100.23        active  

Ensure that "is master" and "in majority" are both true, and that your controller is listed.
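If you capture that output in a script, the same check can be done automatically. The sketch below parses text shaped like the `get control-cluster status` output shown above; the parser and its function names are hypothetical and only handle this exact layout.

```python
def parse_control_cluster_status(text: str):
    """Parse the status output into (flags, members)."""
    flags: dict[str, bool] = {}
    members: list[dict[str, str]] = []
    in_table = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if in_table:
            # Table rows: uuid, address, status separated by whitespace.
            uuid, address, status = line.split()
            members.append({"uuid": uuid, "address": address, "status": status})
        elif line.startswith("uuid"):
            in_table = True  # header row, member rows follow
        elif ":" in line:
            key, value = line.split(":", 1)
            flags[key.strip()] = value.strip() == "true"
    return flags, members


def controller_healthy(text: str, address: str) -> bool:
    """True if the node is master, in majority and listed as active."""
    flags, members = parse_control_cluster_status(text)
    listed = any(m["address"] == address and m["status"] == "active"
                 for m in members)
    return flags.get("is master", False) and flags.get("in majority", False) and listed


SAMPLE = """\
is master: true
in majority: true
uuid                                 address              status
dc59f305-354f-4cdd-bf4b-f4e7f424a2b2 172.22.100.23        active
"""

if __name__ == "__main__":
    print(controller_healthy(SAMPLE, "172.22.100.23"))
```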

Next, on the manager:

nsxtm> get management-cluster status  
Number of nodes in management cluster: 1  
- 172.22.100.22    (UUID 4221BA38-FC25-C278-C4D2-B003603F760D) Online

Management cluster status: STABLE

Number of nodes in control cluster: 1  
- 172.22.100.23    (UUID dc59f305-354f-4cdd-bf4b-f4e7f424a2b2)

Control cluster status: STABLE


Ensure that all statuses show as STABLE and your controller is listed. If you wish to join more controllers, deploy them in the same way as the first one and run the following on each new controller:

nsxtc> set control-cluster security-model shared-secret  
Secret: enteryouroriginalsecrethere  
nsxtc> get control-cluster certificate thumbprint  
8caf5b9721446f98dc5dnotgettingthisoneeitherc9702d2fc4d6a469b42499  

Then, on the master controller run the following for each of the new controllers:

nsxtc> join control-cluster ip.of.new.controller thumbprint new.controller.thumbprint  

Then, on each new controller run the following:

nsxtc> activate control-cluster  

Prepare and configure hosts

Once this is all done we can log in to the NSX Manager UI. Open a browser and point it towards whatever you configured as the NSX Manager's hostname and you'll be presented with the most glorious UI a piece of networking kit has had in a long while. It's clean, it's responsive, it slices, it dices, it makes julienne fries!

The best reason to install NSX-T? No more flash UI.

Next, open the Fabric tab. Your hosts list should be empty, so let's change that. Click the "add" button, enter a name for your host, its management IP address, your operating system (ESXi) and the username and password. Leave the thumbprint empty as this will automatically be resolved. After adding the host, wait a few minutes until everything is green and the software version shows up correctly. Note that if everything is not green and you're running ESXi 6.5 nested on ESXi 6.5, ensure that you've read https://vxsan.com/nsx-t-esxi-host-preparation-fails-errno-1-operation-not-permitted-it-is-not-safe-to-continue/ and have disabled secure boot for now.

Prepare transport zones and transport nodes.

Next up, we'll prepare the transport zones. For those coming from NSX-v, the concept of transport zones is slightly different. A transport zone is no longer purely an overlay concept, but dictates which networks can be connected to from a hypervisor or an edge cluster. As such, transport zones come in two forms: Overlay or VLAN transport zones.

An Overlay transport zone is what ties the transport nodes together and transports the actual virtual wires. A VLAN transport zone is used for your Tier-0 edges to connect the virtualised network to the physical network.

First off, start with creating an overlay transport zone. Open the Fabric -> Transport zones menu and click add.

Enter a name for your transport zone, an optional description and a host switch name. The host switch name will be used to create a mapping between physical nics on your ESXi hosts and either logical router nics or VTEP interfaces.

Next, create at least one IP pool for your VTEP interfaces. Open Inventory -> Groups -> IP Pools and create at least one IP pool. As I've created two different "sites" in my lab, I've created two IP pools using distinct IP subnets. This part is very similar to NSX-v's IP pool creation: enter a name, add an IP range, gateway, subnet and optional DNS servers.
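Before typing the pools into the UI, it can help to sanity-check them. Below is a minimal sketch using Python's standard `ipaddress` module: it verifies that the range and gateway sit inside the pool's subnet and that the two site pools do not overlap. The `IpPool` class and all addresses are hypothetical examples, not the values used in this lab.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class IpPool:
    """Hypothetical stand-in for the fields of a VTEP IP pool."""
    name: str
    subnet: str
    first: str
    last: str
    gateway: str

    def problems(self) -> list[str]:
        """Return a list of human-readable issues, empty when the pool is sane."""
        net = ipaddress.ip_network(self.subnet)
        issues = []
        for label in ("first", "last", "gateway"):
            value = getattr(self, label)
            if ipaddress.ip_address(value) not in net:
                issues.append(f"{self.name}: {label} {value} is outside {net}")
        if ipaddress.ip_address(self.first) > ipaddress.ip_address(self.last):
            issues.append(f"{self.name}: range start is after range end")
        return issues


def overlapping(a: IpPool, b: IpPool) -> bool:
    """True if the subnets of two pools overlap."""
    return ipaddress.ip_network(a.subnet).overlaps(ipaddress.ip_network(b.subnet))


# Made-up example pools, one per "site".
SITE1 = IpPool("vtep-site1", "10.23.50.0/24", "10.23.50.10", "10.23.50.50", "10.23.50.1")
SITE2 = IpPool("vtep-site2", "10.24.50.0/24", "10.24.50.10", "10.24.50.50", "10.24.50.1")

if __name__ == "__main__":
    print(SITE1.problems() + SITE2.problems())
    print("overlap:", overlapping(SITE1, SITE2))
```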

When the IP pools are created, open Fabric -> Profiles and create a new Uplink Profile. The uplink profile determines the VTEP interface configuration. While you can use the default, it is recommended to create your own. I've provided an overview of the uplink profiles I've created to serve as an example, but right now you only need one or two (depending on whether you're using separate L3 domains for your ESXi hosts):

Enter a name and an optional description. If you want LAGs, configure a lag (which has its use cases for physical edges, but that's something we're not going into right now). Set a Teaming policy, and depending on your teaming policy set active and standby uplinks as uplink-1 and uplink-2.

Next, set a transport VLAN which will be your VTEP vlan, and set the MTU to a minimum of 1600.

Once the Uplink profiles are created, add your ESXi nodes as a transport node. A transport node defines a node as participating in your network fabric. Open Fabric -> Nodes -> Transport Nodes and click "Add".

Enter a name (recommended to use the name of your host), select your ESXi host from the dropdown, and select the Overlay transport zone you've just created. Then, in the new node switch, enter the host switch name you've configured in the transport zone, and in the Uplink profile enter a name for your VTEP. For the Uplink profile, select the Uplink profile you've created, and for IP assignment select either "Use IP Pool", "Use DHCP" or "Use Static IP List". The first two should speak for themselves if you have NSX-v experience; the third one is new and allows you to configure a static IP directly on a host.

If you are using DHCP, ensure you have your DHCP server set up. If you are using IP pools, click "Create and use new IP pool" and enter a name and a subnet in the same way you'd do this in NSX-v.

Next, in the physical NICs section, select a physical device and assign it an uplink from your uplink profile. Remember when I said to leave two NICs unused in your nested ESXi host? This is the reason why, as NSX-T needs dedicated physical NICs for its VTEP interfaces.

In our case we add vmnic3 and vmnic4 to be our uplink-1 and uplink-2 respectively. Note that there is a weird UI bug when adding and selecting a second vmnic from the dropdown, but you can enter the name manually.

After waiting a short while the transport node should show up and its status should be "Success". If you log in to vSphere and open the physical adapter configuration of your ESXi host it should show the physical nic is assigned to your overlay switch, even though the switch itself does not appear anywhere within vSphere. I believe this is a quality of life issue and it would be nice to show the switch itself as an opaque switch, but for now we can use this to validate the host is actually connected to the NSX-T opaque host switch.

Now that we have prepared our cluster, let's deploy some edge hosts!

Edge host deployment and configuration

Before we start with the actual edge host configuration, there are some things to consider if you're coming from NSX-v. Whereas an Edge in NSX-v is a virtual appliance performing the actual routing, NAT and advanced services inside the VM, the concept of an edge is slightly different in NSX-T.

An edge is either a virtual or physical form factor server, which serves as the control plane for routing and advanced services in NSX-T. Whereas in NSX-v one would deploy an edge per router, NSX-T virtualises these resources as containers running inside an edge node, also named VRFs. So instead of having multiple edges, an NSX-T environment will only have a limited number of edges, which then create resources such as Tier-0 and Tier-1 routers, DHCP servers and NAT services on demand. Not only does this significantly reduce VM sprawl, it also simplifies configuration, allows for rapid provisioning and reduces complexity of the environment.

Edges can be deployed in multiple ways (OVA deployment, PXEBoot, ISO installation). In this deployment, we'll focus on the OVA deployment, but from a technical point of view they are all equal.

Start off by deploying the edge through the OVA deployment. During the deployment one can select the size, which determines the throughput and performance of advanced services. For this deployment we'll select small as it is only a PoC and our resources are limited.

When it comes to the selection of network ports, you have to follow a specific setup. While this can be changed afterwards, it is recommended to do it correctly at deployment time:

  • Network 0 should be connected to your management VLAN port group as this will be used to configure the edge and log in to it over SSH.
  • Network 1 should be connected to either your trunk or a dedicated VTEP port group.
  • Network 2 and 3 should be connected to either individual /31 transit VLAN port groups or a trunk containing your uplinks. This design choice is entirely dependent on your Tier-0 configuration; for this example we'll use dedicated port groups for transit purposes.

Just as with NSX-v, it is recommended to set these transit port groups to active/unused and unused/active and have the actual routing protocol handle redundancy. Fortunately we've already preprovisioned these transit VLANs for this kind of use case; if you haven't, you'll have to build them yourself.
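A /31 transit network holds exactly two addresses, one for each end of the link. As a small illustration, the Python sketch below lists both ends of a /31 using the standard `ipaddress` module. The two prefixes are the transit networks that appear in the routing output later in this post; the function name is hypothetical.

```python
import ipaddress


def transit_endpoints(cidr: str) -> tuple[str, str]:
    """Return the two usable addresses of a /31 point-to-point network."""
    net = ipaddress.ip_network(cidr)
    if net.prefixlen != 31:
        raise ValueError(f"{cidr} is not a /31 transit network")
    # A /31 has no network or broadcast address: both addresses are usable.
    return str(net[0]), str(net[1])


if __name__ == "__main__":
    for transit in ("172.20.255.12/31", "172.20.255.14/31"):
        low, high = transit_endpoints(transit)
        print(f"{transit}: {low} <-> {high}")
```

In this lab the lower address of each pair is the upstream router and the higher one is the edge uplink, which is why the BGP next hops later on are 172.20.255.12 and 172.20.255.14.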

Once the edges are deployed (you did deploy two, didn't you?), log in to the edge over SSH if you've enabled that during the deployment, and run the following:

nsxte-0> join management-plane ip.of.nsx.mgr username admin thumbprint nsxt.mgr.thumbprint  

If you don't have the thumbprint anymore, you can retrieve it from the manager with the command get certificate api thumbprint.

You should get a message stating that the edge successfully joined the NSX fabric as a fabric node with its UUID. Repeat this step for each edge node.

When you log back in to the NSX Manager UI, you should see your edge nodes under Fabric -> Nodes -> Edges. Note that LCP connectivity is not available yet; this is entirely normal until the host is joined as a transport node. Next up, we'll prepare the edge as an actual transport node and join it to the transport zones.

First off, we'll need to create transport zones for your VLAN uplinks. Remember I told you how transport zones are different from NSX-v? This is where that comes in. When we created a transport zone for logical switching earlier, we selected the type as Overlay. Now, we'll go to Fabric -> Transport zones and create 4 transport zones of type VLAN, one for each of our transit VLANs.

After the transport zones have been created, add your Edge node as a transport node to join it to the fabric. Open Fabric -> Nodes -> Transport Nodes and add a new transport node.

This procedure is the same as for an ESXi host, with the difference that this edge node will be connected to multiple transport zones. If you consider your typical NSX-v design, you'd have port groups for transit VLANs and a VTEP VLAN on your vSphere edge cluster, and this is very similar.

Enter a name and select your node, then in the transport zones field select your overlay and two VLAN transport zones. Note that you can add as many transport zones to an edge transport node as you wish, but for our use case we're using two VLAN transport zones and a single overlay zone.

Next, we add the Node switches. This is very similar to the ESXi transport node configuration, however we need to create multiple node switches, one for each transport zone. When you've configured the first node switch, click "Add new node switch" at the bottom to create new ones for each transport zone. Again, the edge switch names must match what you've configured in the transport zone. When creating a VLAN backed node switch, you cannot configure an IP profile, because a VLAN transport zone does not require VTEP interfaces.
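The two rules from the paragraph above (switch names must match the transport zone, and VLAN-backed switches get no IP assignment) are easy to get wrong when clicking through three node switches. The sketch below models them in Python. All class names, field names and zone names other than tz-overlay are hypothetical, and this is a plain data model, not the NSX-T API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TransportZone:
    name: str
    kind: str          # "OVERLAY" or "VLAN"
    host_switch: str   # host switch name configured on the transport zone


@dataclass(frozen=True)
class NodeSwitch:
    name: str
    ip_assignment: Optional[str] = None  # e.g. an IP pool name, None for VLAN


def validate(zones: list[TransportZone], switches: list[NodeSwitch]) -> list[str]:
    """Check a transport node's switches against its transport zones."""
    problems = []
    by_name = {switch.name: switch for switch in switches}
    for zone in zones:
        switch = by_name.get(zone.host_switch)
        if switch is None:
            problems.append(f"{zone.name}: no node switch named {zone.host_switch!r}")
        elif zone.kind == "VLAN" and switch.ip_assignment is not None:
            problems.append(f"{zone.name}: VLAN-backed switch cannot have an IP assignment")
        elif zone.kind == "OVERLAY" and switch.ip_assignment is None:
            problems.append(f"{zone.name}: overlay switch needs an IP assignment for its VTEP")
    return problems


# Hypothetical edge node: one overlay zone plus two VLAN transit zones.
ZONES = [
    TransportZone("tz-overlay", "OVERLAY", "hs-overlay"),
    TransportZone("tz-vlan-1012", "VLAN", "hs-vlan-1012"),
    TransportZone("tz-vlan-1014", "VLAN", "hs-vlan-1014"),
]
SWITCHES = [
    NodeSwitch("hs-overlay", ip_assignment="vtep-site1"),
    NodeSwitch("hs-vlan-1012"),
    NodeSwitch("hs-vlan-1014"),
]

if __name__ == "__main__":
    print(validate(ZONES, SWITCHES) or "edge transport node looks consistent")
```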

Also note that for virtual edge transport nodes, only a single virtual nic can be bound to a tz overlay. For virtual edges, this is something that would be resolved through your distributed switch to which the virtual appliance is connected. For physical edges you can use active/standby uplinks as normal. This is also the reason we've created separate uplink profiles for our edges.

When you're done, the edge transport node should look something like this:

When the status of the edges is all OK (check for the LCP status in the edge panel, note that this may take a few minutes), start creating an edge cluster. Open the Fabric -> Nodes -> Edge clusters panel and add a new edge cluster. Note that this is mandatory, even if you only have a single edge transport node.

Enter the name, an optional description and select the standard edge cluster profile. Next, click edit next to the transport nodes, select "Virtual machine" and select both edges. If you've deployed edges as a physical machine (which can still be virtual but needs to be deployed through the ISO or PXEBoot), they'll appear as physical machines.

Once the edge cluster is created, we can actually start creating logical routers. Open the Routing panel, then add a new router and select "New Tier-0 router". Tier-0 routers are what connect your physical network to your virtualised network fabric. Unlike the NSX Edge service gateway however, the Tier-0 router is fully distributed and purely functions as a physical to virtual perimeter router. As such, it does not support the features that a Tier-1 router does.

Enter a name and an optional description, select your edge cluster and select your high availability mode. This choice is up to you, active-standby is comparable to NSX-v's HA deployment (but without the long failover times), whereas active-active will configure ECMP to your physical routers. As we like to explore ECMP, we'll select active-active.

When deploying the logical router, you'll notice the speed at which it deploys. As there are no appliances or virtual machines to deploy or boot, new routing instances can literally be configured at the speed at which a container can be spun up, which will make automated deployments significantly easier.

Next, create a Tier-1 router. Tier-1 routers connect your logical switches to your Tier-0 routers and provide advanced services such as DHCP and NAT, while still being fully distributed. They can be thought of as a hybrid between NSX-v's distributed logical router and edge service gateway, with all routing being performed in-kernel while stateful services have their control plane on the edge service router which exists on the edge transport node.

When creating a Tier-1 router enter a name and optional description and select your upstream Tier-0 Router. This automatically creates internal logical switches between the Tier-0 and Tier-1 router. Next, select your edge cluster, edge cluster members and preferred member. Due to the way Tier-1 routers work, they cannot be configured as active-active and we'll need to select a primary member on which services run.

When both Tier-0 and Tier-1 routers have been created, your router should look something like this.

Next, to create connectivity, we'll create logical switches. This is very comparable to NSX-v, with one major difference: VLAN backed port groups are also logical switches, as opposed to port groups in vSphere.

Open the switching panel and start adding logical switches. For this demo we've created the following logical switches:

  • LS1
  • LS2
  • vlan-1012
  • vlan-1014
  • vlan-1016
  • vlan-1018

The LS logical switches are created in the tz-overlay transport zone, and the vlan logical switches are created in their respective VLAN transport zone. The VLAN logical switches have been created with VLAN tag 0, as these are connected to a port group configured as an access port. If you were to connect these to a trunk port, a VLAN tag must be configured.

Note that for the logical switches you must select a Replication mode, either Hierarchical Two-Tier or Head. Hierarchical Two-Tier is very comparable to NSX-v's Hybrid replication mode where a UTEP is elected in each L2 broadcast domain which replicates traffic to other broadcast domains. Head is very similar to unicast replication mode. Note that there is no pure multicast option anymore. If you want to, switching profiles can be configured which allow for advanced security, monitoring and traffic shaping options such as DSCP/CoS, Spoofguard, ARP and DHCP snooping, etc.
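To get a feel for the difference between the two replication modes, here is a deliberately simplified Python model that counts how many copies of a broadcast frame the source host has to send. It assumes that with Head replication the source sends one copy to every other tunnel endpoint, and that with Hierarchical Two-Tier it sends one copy to each endpoint in its own L2 domain plus one copy per remote domain, which the elected endpoint there then fans out. It also assumes every endpoint has a workload on the logical switch. The site names and counts are made up.

```python
def head_copies(endpoints_per_domain: dict[str, int], source_domain: str) -> int:
    """Copies sent by the source with Head replication (simplified model)."""
    if source_domain not in endpoints_per_domain:
        raise KeyError(source_domain)
    return sum(endpoints_per_domain.values()) - 1


def two_tier_copies(endpoints_per_domain: dict[str, int], source_domain: str) -> int:
    """Copies sent by the source with Hierarchical Two-Tier (simplified model)."""
    local = endpoints_per_domain[source_domain] - 1
    # One copy per remote L2 domain; the replicator there handles the rest.
    remote = sum(1 for domain, count in endpoints_per_domain.items()
                 if domain != source_domain and count > 0)
    return local + remote


if __name__ == "__main__":
    # Hypothetical fabric: three L2 domains with four tunnel endpoints each.
    fabric = {"site1": 4, "site2": 4, "site3": 4}
    print("head:", head_copies(fabric, "site1"))
    print("two-tier:", two_tier_copies(fabric, "site1"))
```

In a two-host nested lab the difference is negligible, so either mode will do here.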

When configured, your logical switches should look comparable to this.

Next, go back to your router overview. Click the Tier-1 Router name and select configuration. Add a Logical Router port, enter a name, select your logical switch to connect the port to and configure an IP address and subnet mask.

If you open the Switching -> Ports panel, you should see your logical routers being connected to your logical switch. When you click the attachment you can see the details about the logical router's port. Note that in this example I've also created DHCP services on the logical switches, but this is not required and left as an exercise for the reader.

Now that your Tier-1 router has been configured, we can test basic connectivity. Create virtual machines on your nested ESXi hosts, and when connecting their virtual NICs to a network, you should see two new port groups with a unique interface. This is an opaque switch and is unique to NSX-T. It cannot be managed or configured from vSphere, is unique to each host and is completely independent of vCenter.

Confirm that, once an IP address has been configured on the virtual machines, you can ping the Tier-1 logical router IP address and the VM on the other logical switch.

Now that we've configured connectivity between logical switches, it's time to configure actual physical connectivity. Open the Routing panel, and select the Tier-0 router we've configured earlier. Add a new router port, enter a name and optional description and select the uplink type. Next, select the first transport node and the first transit VLAN logical switch. The configuration should look something like this.

Repeat these steps for each transit logical switch, and ensure the correct logical switches are connected to the correct transport node.

When finished, the result should be one downlink port to your Tier-1 router and four uplink ports to your VLAN logical switches:

Configure upstream routing

Next, enter the Routing panel in the Tier-0 router and select BGP. First, edit the global BGP configuration. In our case we'll set BGP to enabled, enable ECMP, leave graceful restart disabled and set our local AS to 65538. Next, add a new Neighbour.

Enter the following information:

  • Neighbour address: the IP address of your BGP neighbour.
  • Local Address: select only the transit VLAN your BGP neighbour is connected to.
  • Remote AS: enter the remote AS of your neighbour.
  • Keep Alive Time: this is dependent on the equipment you are peering with.
  • Hold Down Time: this is dependent on the equipment you are peering with.

When all of this has been configured, we can log in to either of the Tier-0 router appliances and run the following commands

get logical-router  

Look for the SERVICE_ROUTER_TIER_0 and note down its VRF. This is the routing engine responsible for BGP peering. Next, run vrf vrfid, where vrfid is the VRF ID of the Tier-0 service router, to enter the VRF.

Next, run get route bgp to confirm that we are receiving routes from our upstream physical devices, and we are in fact getting ECMP enabled routes

nsxte-0(tier0_sr)> get route bgp

Flags: c - connected, s - static, b - BGP, ns - nsx_static  
nc - nsx_connected, rl - router_link, t0n: Tier0-NAT, t1n: Tier1-NAT


b    0.0.0.0/0            [20/0]        via 172.20.255.12  
b    0.0.0.0/0            [20/0]        via 172.20.255.14  
b    172.16.0.0/12        [20/0]        via 172.20.255.12  
b    172.16.0.0/12        [20/0]        via 172.20.255.14  
b    192.168.0.0/16       [200/0]       via 0.0.0.0  
b    192.168.1.0/24       [20/0]        via 172.20.255.12  
b    192.168.1.0/24       [20/0]        via 172.20.255.14  
b    217.63.252.0/23      [20/0]        via 172.20.255.12  
b    217.63.252.0/23      [20/0]        via 172.20.255.14  

Note that we are also getting 192.168.x.x routes. These are in fact OSPF routes for another lab, advertised by NSX-v and redistributed into BGP by my physical router.
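Spotting ECMP by eye works for nine lines, but not for a real routing table. The Python sketch below groups the next hops per prefix from output shaped like the `get route bgp` listing above and returns the prefixes that have more than one. The regular expression is only an assumption based on the layout shown in this post.

```python
import re
from collections import defaultdict

# Matches lines such as: "b    0.0.0.0/0    [20/0]    via 172.20.255.12"
ROUTE_RE = re.compile(
    r"^(?P<flag>\w+)\s+(?P<prefix>\S+/\d+)\s+\[\d+/\d+\]\s+via\s+(?P<next_hop>\S+)"
)


def next_hops(output: str) -> dict[str, list[str]]:
    """Map each prefix to the list of next hops seen for it."""
    table: dict[str, list[str]] = defaultdict(list)
    for line in output.splitlines():
        match = ROUTE_RE.match(line.strip())
        if match:
            table[match["prefix"]].append(match["next_hop"])
    return dict(table)


def ecmp_prefixes(output: str) -> dict[str, list[str]]:
    """Return only the prefixes that have more than one next hop."""
    return {p: hops for p, hops in next_hops(output).items() if len(hops) > 1}


SAMPLE = """\
b    0.0.0.0/0            [20/0]        via 172.20.255.12
b    0.0.0.0/0            [20/0]        via 172.20.255.14
b    172.16.0.0/12        [20/0]        via 172.20.255.12
b    172.16.0.0/12        [20/0]        via 172.20.255.14
b    192.168.0.0/16       [200/0]       via 0.0.0.0
b    192.168.1.0/24       [20/0]        via 172.20.255.12
b    192.168.1.0/24       [20/0]        via 172.20.255.14
b    217.63.252.0/23      [20/0]        via 172.20.255.12
b    217.63.252.0/23      [20/0]        via 172.20.255.14
"""

if __name__ == "__main__":
    for prefix, hops in ecmp_prefixes(SAMPLE).items():
        print(f"{prefix}: {', '.join(hops)}")
```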

The next step is to configure route redistribution from our Tier-1 to our Tier-0. One important thing to note is that these routes are distributed through the NSX-T control plane and not through a routing protocol.

Open the Tier-1 router, click routing and select "Route advertisement". First, configure the global options. We'll enable route advertisement and advertise all NSX connected routes. We won't advertise NAT routes or Static routes since we're not using those in our lab.

When we run get route on our Tier-0 service router VRF, we can see that two additional routes have been injected into the routing table.

nsxte-0(tier0_sr)> get route

Flags: c - connected, s - static, b - BGP, ns - nsx_static  
nc - nsx_connected, rl - router_link, t0n: Tier0-NAT, t1n: Tier1-NAT

Total number of routes: 11

b    0.0.0.0/0            [20/0]        via 172.20.255.12  
b    0.0.0.0/0            [20/0]        via 172.20.255.14  
ns   10.1.0.0/24          [3/0]         via 169.254.0.1  
ns   10.2.0.0/24          [3/0]         via 169.254.0.1  
rl   100.64.0.0/31        [0/0]         via 169.254.0.1  
c    169.254.0.0/28       [0/0]         via 169.254.0.2  
b    172.16.0.0/12        [20/0]        via 172.20.255.12  
b    172.16.0.0/12        [20/0]        via 172.20.255.14  
c    172.20.255.12/31     [0/0]         via 172.20.255.13  
c    172.20.255.14/31     [0/0]         via 172.20.255.15  
b    192.168.0.0/16       [200/0]       via 0.0.0.0  
b    192.168.1.0/24       [20/0]        via 172.20.255.12  
b    192.168.1.0/24       [20/0]        via 172.20.255.14  
b    217.63.252.0/23      [20/0]        via 172.20.255.12  
b    217.63.252.0/23      [20/0]        via 172.20.255.14  

The ns (NSX static) routes, a special route type used for NSX routing, have been propagated by the Tier-1 logical router into the routing table of the Tier-0 logical router.
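
To pick the ns routes out of the table without counting by hand, here is a minimal Python sketch. It assumes the get route output has been pasted in as text, and parse_routes is a hypothetical helper of my own, not part of any NSX-T tooling. As a side effect it also suggests that "Total number of routes: 11" counts distinct prefixes rather than next-hop lines, although that is my reading of the output and not something I have confirmed in documentation.

```python
# Route lines copied from the "get route" output above.
SAMPLE = """\
b    0.0.0.0/0            [20/0]        via 172.20.255.12
b    0.0.0.0/0            [20/0]        via 172.20.255.14
ns   10.1.0.0/24          [3/0]         via 169.254.0.1
ns   10.2.0.0/24          [3/0]         via 169.254.0.1
rl   100.64.0.0/31        [0/0]         via 169.254.0.1
c    169.254.0.0/28       [0/0]         via 169.254.0.2
b    172.16.0.0/12        [20/0]        via 172.20.255.12
b    172.16.0.0/12        [20/0]        via 172.20.255.14
c    172.20.255.12/31     [0/0]         via 172.20.255.13
c    172.20.255.14/31     [0/0]         via 172.20.255.15
b    192.168.0.0/16       [200/0]       via 0.0.0.0
b    192.168.1.0/24       [20/0]        via 172.20.255.12
b    192.168.1.0/24       [20/0]        via 172.20.255.14
b    217.63.252.0/23      [20/0]        via 172.20.255.12
b    217.63.252.0/23      [20/0]        via 172.20.255.14
"""


def parse_routes(text):
    """Return (flag, prefix, next_hop) tuples for every route line."""
    routes = []
    for line in text.splitlines():
        parts = line.split()
        # A route line looks like: <flag> <prefix> [d/m] via <next hop>
        if len(parts) == 5 and parts[3] == "via":
            routes.append((parts[0], parts[1], parts[4]))
    return routes


def prefixes_with_flag(routes, flag):
    """List the prefixes carrying a given flag, in table order."""
    return [prefix for f, prefix, _ in routes if f == flag]


if __name__ == "__main__":
    routes = parse_routes(SAMPLE)
    unique_prefixes = {prefix for _, prefix, _ in routes}
    print("route lines:", len(routes))
    print("unique prefixes:", len(unique_prefixes))
    print("ns routes:", prefixes_with_flag(routes, "ns"))
```

Run against the sample, the only ns entries are 10.1.0.0/24 and 10.2.0.0/24, which are the two routes advertised by the Tier-1 router.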

The last step we need to perform is to redistribute our routes into the physical world. Open the Tier-0 router, select Routing and open "Route redistribution". First, ensure that route redistribution is enabled. Next, add a new route redistribution. Enter a name, an optional description and select your sources. In our case we're only selecting "NSX Static", as we have no need to distribute NATted or manual static routes. Note that NSX static routes are auto-configured routes redistributed into the fabric, whereas static routes are manually configured static routes on the Tier-1 router. This can cause some confusion, and initially cost me a good 30 minutes to discover why route redistribution was not working.

As we can see on the Arista switches, we're receiving our 10.1.0.0/24 and 10.2.0.0/24 routes:

localhost#show ip route bgp

VRF name: default  
Codes: C - connected, S - static, K - kernel,  
       O - OSPF, IA - OSPF inter area, E1 - OSPF external type 1,
       E2 - OSPF external type 2, N1 - OSPF NSSA external type 1,
       N2 - OSPF NSSA external type2, B I - iBGP, B E - eBGP,
       R - RIP, I L1 - ISIS level 1, I L2 - ISIS level 2,
       O3 - OSPFv3, A B - BGP Aggregate, A O - OSPF Summary,
       NG - Nexthop Group Static Route, V - VXLAN Control Service

Gateway of last resort:  
 B E    0.0.0.0/0 [200/0] via 172.20.255.8, Vlan1008

 B E    10.1.0.0/24 [200/0] via 172.20.255.17, Vlan1016
 B E    10.2.0.0/24 [200/0] via 172.20.255.17, Vlan1016
 B E    172.20.255.0/31 [200/0] via 172.20.255.8, Vlan1008
 B E    172.20.255.2/31 [200/0] via 172.20.255.8, Vlan1008
 B E    172.20.255.4/31 [200/0] via 172.20.255.8, Vlan1008
 B E    172.20.255.6/31 [200/0] via 172.20.255.8, Vlan1008
 B E    172.20.255.10/31 [200/0] via 172.20.255.8, Vlan1008
 B E    172.20.0.0/16 [200/0] via 172.20.255.8, Vlan1008
 B E    172.21.100.0/24 [200/0] via 172.20.255.8, Vlan1008
 B E    172.21.101.0/24 [200/0] via 172.20.255.8, Vlan1008
 B E    172.21.102.0/24 [200/0] via 172.20.255.8, Vlan1008
 B E    172.22.0.0/16 [200/0] via 172.20.255.8, Vlan1008
 B E    172.23.0.0/16 [200/0] via 172.20.255.8, Vlan1008
 B E    172.24.0.0/16 [200/0] via 172.20.255.8, Vlan1008
 B E    192.168.1.0/24 [200/0] via 172.20.255.8, Vlan1008
 B E    192.168.0.0/16 [200/0] via 172.20.255.19, Vlan1018
 B E    217.63.252.0/23 [200/0] via 172.20.255.8, Vlan1008
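
A quick way to sanity-check the next hops in this table is to match them against the /31 transit links on the switch. The sketch below uses Python's standard ipaddress module; the interface addresses are taken from the vEOS 1 configuration at the end of this post, and peer_of and interface_for_next_hop are hypothetical helper names of my own.

```python
import ipaddress

# SVI addresses from the vEOS 1 configuration in this post.
LOCAL_INTERFACES = {
    "Vlan1008": "172.20.255.9/31",
    "Vlan1016": "172.20.255.16/31",
    "Vlan1018": "172.20.255.18/31",
}


def peer_of(interface_cidr):
    """Return the other address on a /31 point-to-point link."""
    iface = ipaddress.ip_interface(interface_cidr)
    if iface.network.prefixlen != 31:
        raise ValueError("expected a /31 link, got %s" % iface.network)
    # A /31 holds exactly two addresses: ours and the peer's.
    return next(addr for addr in iface.network if addr != iface.ip)


def interface_for_next_hop(next_hop, interfaces=LOCAL_INTERFACES):
    """Return the local interface whose subnet contains the next hop."""
    address = ipaddress.ip_address(next_hop)
    for name, cidr in interfaces.items():
        if address in ipaddress.ip_interface(cidr).network:
            return name
    return None


if __name__ == "__main__":
    for name, cidr in LOCAL_INTERFACES.items():
        print(name, cidr, "peer:", peer_of(cidr))
    print("172.20.255.17 is reached via", interface_for_next_hop("172.20.255.17"))
```

For the 10.1.0.0/24 and 10.2.0.0/24 routes, the next hop 172.20.255.17 lands on Vlan1016, which matches the interface shown in the route table.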

And that's it! To prove that we've got connectivity, I've performed an HTTP GET to my own blog, which shows that we can connect to an external network!

Hopefully this has helped you in understanding how NSX-T works, the differences between NSX-v and NSX-T and how you too can build your own multi-site NSX-T lab with relatively limited resources by using nested networking equipment and nested hypervisors.

Configurations

For completeness, the Arista vEOS configurations have been provided below:

vEOS 1

! Command: show running-config
! device: localhost (vEOS, EOS-4.18.1F)
!
! boot system flash:/vEOS-lab.swi
!
transceiver qsfp default-mode 4x10G  
!
spanning-tree mode mstp  
!
no aaa root  
!
username admin role network-admin secret sha512 redacted  
!
vlan 1008,1016,1018,2200,2203  
!
interface Ethernet1  
   no switchport
   ip address 172.31.255.1/31
!
interface Ethernet2  
   mtu 9214
   switchport access vlan 2200
   switchport mode trunk
!
interface Ethernet3  
!
interface Management1  
!
interface Vlan1008  
   ip address 172.20.255.9/31
!
interface Vlan1016  
   ip address 172.20.255.16/31
!
interface Vlan1018  
   ip address 172.20.255.18/31
!
interface Vlan2000  
!
interface Vlan2200  
   mtu 9214
   ip address 172.22.100.135/24
!
interface Vlan2203  
   ip address 172.22.103.145/24
!
ip routing  
!
router bgp 65536  
   neighbor pg-nsxt-1 peer-group
   neighbor pg-nsxt-1 remote-as 65538
   neighbor pg-nsxt-1 maximum-routes 12000
   neighbor 172.20.255.8 remote-as 65535
   neighbor 172.20.255.8 maximum-routes 12000
   neighbor 172.20.255.17 peer-group pg-nsxt-1
   neighbor 172.20.255.19 peer-group pg-nsxt-1
   aggregate-address 172.16.0.0/12 summary-only
!
end  

vEOS 2

localhost(config-router-bgp)#show running-config  
! Command: show running-config
! device: localhost (vEOS, EOS-4.18.1F)
!
! boot system flash:/vEOS-lab.swi
!
transceiver qsfp default-mode 4x10G  
!
spanning-tree mode mstp  
!
no aaa root  
!
username admin role network-admin secret sha512 redacted  
!
vlan 1010,1012,1014,2200,2203  
!
interface Ethernet1  
   no switchport
   ip address 172.31.255.0/31
!
interface Ethernet2  
   switchport access vlan 2200
   switchport mode trunk
!
interface Ethernet3  
!
interface Management1  
!
interface Vlan1010  
   ip address 172.20.255.11/31
!
interface Vlan1012  
   ip address 172.20.255.12/31
!
interface Vlan1014  
   ip address 172.20.255.14/31
!
interface Vlan2200  
   mtu 9214
   ip address 172.22.100.136/24
!
interface Vlan2203  
   ip address 172.22.103.146/24
!
ip routing  
!
router bgp 65537  
   neighbor pg-nsxt-1 peer-group
   neighbor pg-nsxt-1 remote-as 65538
   neighbor pg-nsxt-1 maximum-routes 12000
   neighbor 172.20.255.10 remote-as 65535
   neighbor 172.20.255.10 maximum-routes 12000
   neighbor 172.20.255.13 peer-group pg-nsxt-1
   neighbor 172.20.255.15 peer-group pg-nsxt-1
   aggregate-address 172.16.0.0/12 summary-only
!
end
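
One detail in both configurations worth a second look is aggregate-address 172.16.0.0/12 summary-only. As I understand it, this makes the switch advertise the aggregate and suppress the more specific prefixes inside it, which would explain why the NSX-T edge only shows a single 172.16.0.0/12 route instead of all the 172.2x networks in the Arista table. The minimal sketch below uses Python's ipaddress module to show which prefixes fall inside the aggregate; the prefix list is a hand-picked subset of the routes in this post and split_by_aggregate is a hypothetical helper.

```python
import ipaddress

# The aggregate configured on both vEOS switches.
AGGREGATE = ipaddress.ip_network("172.16.0.0/12")

# A hand-picked subset of prefixes seen in the route tables in this post.
SEEN = [
    "172.20.0.0/16",
    "172.21.100.0/24",
    "172.22.0.0/16",
    "10.1.0.0/24",
    "192.168.1.0/24",
    "217.63.252.0/23",
]


def split_by_aggregate(prefixes, aggregate=AGGREGATE):
    """Split prefixes into those inside and those outside the aggregate."""
    covered, outside = [], []
    for prefix in prefixes:
        network = ipaddress.ip_network(prefix)
        # subnet_of() requires Python 3.7 or newer.
        if network.subnet_of(aggregate):
            covered.append(prefix)
        else:
            outside.append(prefix)
    return covered, outside


if __name__ == "__main__":
    covered, outside = split_by_aggregate(SEEN)
    print("inside 172.16.0.0/12:", covered)
    print("outside 172.16.0.0/12:", outside)
```

The 172.20, 172.21 and 172.22 prefixes all sit inside the aggregate, while the NSX-T segments (10.1.0.0/24 and 10.2.0.0/24) do not, so they are unaffected by the summary.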