Why we really need an updated NSX design guide…

With the release of NSX-T 3.0 and vSphere 7, a new option arrived in the form of the vSphere Distributed Switch. Finally, ESXi hosts require no N-VDS at all; they can freely consume vDS port groups.

I am not stating that all of the options below are valid and must be used. I am here to learn which ones are not recommended. I don’t discuss active-active T0 designs here, only active-standby.

However, the options for Edge VM networking have multiplied. So while there are many options out there, I’d like to discuss a handful of them. I have no exact answer as to which is good or bad, which is better or even worse.

In the examples below I am peering in one VLAN with Switch1/2, which form one IRF – there will be one example where two VLANs are used and, naturally, two peers.

There is one very important difference based on how many uplink VLANs are required:

  • one
  • two

You also need to decide how many TEP interfaces you need:

  • one
  • two

Then put this together with the load-balancing options of a vDS.

As long as a given port group has one active uplink, it does not matter which of the above options you use. However, there are some designs where you can have multiple active links, and then things get a little difficult to understand. In this article I simply stick with the default load-balancing policy (Route Based on Originating Virtual Port) and will not even think through what the others would do with the traffic of an Edge VM. While the CPU requirement for MAC hash and IP hash is not significant, I prefer traffic to flow in and out as quickly as possible (call me an old-fashioned guy).

(To give you the background of my logic here – which might be faulty – just sketch the traffic matrix of a given Edge VM’s TEP interface. It talks to host TEPs and to other Edge TEPs only. The from/to matrix is limited and will not change based on any workload VM’s traffic, as it might in the old-fashioned VLAN-based world. There is no need to calculate a hash for every packet, because it makes no sense.)
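To illustrate the point, here is a minimal sketch (with hypothetical names, not the actual vDS implementation) of the default policy, Route Based on Originating Virtual Port: each virtual port is deterministically pinned to one active uplink, so no per-packet hashing is involved at all.

```python
from typing import List

# Simplified model of the vDS default teaming policy,
# "Route Based on Originating Virtual Port". Illustrative only:
# the real switch also re-pins ports when an uplink fails.
def select_uplink(port_id: int, active_uplinks: List[str]) -> str:
    # Each virtual port is deterministically pinned to one uplink;
    # no per-packet MAC/IP hashing is performed.
    return active_uplinks[port_id % len(active_uplinks)]

uplinks = ["Uplink1", "Uplink2"]
# Frames from the same virtual port always exit the same pNIC:
assert select_uplink(10, uplinks) == select_uplink(10, uplinks)
```

This is exactly why, for the Edge TEP’s small and static traffic matrix, the fancier hashing policies buy you nothing.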

You can use LAG as well – however, I like to implement simple designs with the lowest amount of black magic, so I prefer not to use LACP in the age of 25/50/100 Gbit links.

Before we kick it off, let me remind you of this fact:

https://docs.vmware.com/en/VMware-NSX-T-Data-Center/3.1/installation/GUID-50FDFDFB-F660-4269-9503-39AE2BBA95B4.html

Yes. This means that if you don’t use a named teaming policy – for VLAN traffic only failover order is supported, with one active uplink and NO standby – the uplink VLAN will stick to the first uplink, fp-eth0, and if that uplink dies, the uplink VLAN will not fail over to the fp-eth1 interface.
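A toy model of that restriction (names are illustrative), assuming the behavior described in the linked doc:

```python
from typing import Dict, List, Optional

# Toy model of the documented restriction: for VLAN-backed (uplink)
# traffic, only failover order with ONE active uplink and NO standby
# is supported, so a dead active link is simply not replaced.
def vlan_uplink_state(active: str, standby: List[str],
                      link_up: Dict[str, bool]) -> Optional[str]:
    if link_up[active]:
        return active
    for s in standby:          # empty in this teaming configuration
        if link_up[s]:
            return s
    return None                # no failover target: VLAN traffic stops

links = {"fp-eth0": False, "fp-eth1": True}
# fp-eth0 dies and there is no standby: the uplink VLAN stays down,
# even though fp-eth1 is perfectly healthy.
assert vlan_uplink_state("fp-eth0", [], links) is None
```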

Also, before we start, let’s check this:

https://docs.vmware.com/en/VMware-NSX-T-Data-Center/3.1/installation/GUID-370D06E1-1BB6-4144-A654-7AF2542C3136.html

What this says is that if you run at least vDS 6.6, you must enable MAC learning, or you will have issues in a multi-TEP design when you lose one uplink and the failed TEP IP and its MAC address have to move to the other NIC of the Edge VM. Note that after you have created such a vDS and the necessary port groups, you still have to enable MAC learning separately – you cannot do that from the GUI!
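To see why MAC learning matters here, consider this toy Python model (not the vDS implementation, names are made up) of a forwarding table on the Edge trunk port group when a TEP MAC moves between vNICs:

```python
from typing import Dict, Optional

# Toy model of why MAC learning is needed on Edge trunk port groups
# in a multi-TEP design: on TEP failover the TEP's MAC shows up
# behind a different vNIC, and without learning the forwarding
# table keeps pointing at the stale port.
class TrunkPortGroup:
    def __init__(self, mac_learning: bool):
        self.mac_learning = mac_learning
        self.mac_table: Dict[str, str] = {}

    def frame_seen(self, src_mac: str, port: str) -> None:
        # A learning switch refreshes its table from every source MAC;
        # without learning, only the initially known port is kept.
        if self.mac_learning or src_mac not in self.mac_table:
            self.mac_table[src_mac] = port

    def lookup(self, dst_mac: str) -> Optional[str]:
        return self.mac_table.get(dst_mac)

tep_mac = "00:50:56:aa:bb:cc"
pg = TrunkPortGroup(mac_learning=True)
pg.frame_seen(tep_mac, "edge-vnic2")   # TEP behind the first vNIC
pg.frame_seen(tep_mac, "edge-vnic3")   # uplink lost, TEP moves over
assert pg.lookup(tep_mac) == "edge-vnic3"   # learning follows the move

stale = TrunkPortGroup(mac_learning=False)
stale.frame_seen(tep_mac, "edge-vnic2")
stale.frame_seen(tep_mac, "edge-vnic3")
assert stale.lookup(tep_mac) == "edge-vnic2"  # traffic is blackholed
```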

Multiple TEP – One uplink VLAN – vDS PGs with one uplink and no standby – named teaming policy

This requires two named teaming policies for the same single uplink VLAN. On the T0, one Edge VM will have its uplink interface pinned to uplink-1, and the other Edge VM will have it the other way around.

The downside of this design is that while TEP traffic will flow through both physical links, uplink traffic will use only one, and once that uplink is gone, the T0 will fail over no matter what.

Multiple TEP – One uplink VLAN – vDS PGs with two uplinks, one active and one standby – named teaming policy

Pretty much the same as option one, but here the vDS port groups can fall back to the other uplink if the active one is dead.

You might be asking why. I ask the same 😀 The advantage here is that while Edge VM1 is active, overlay traffic will use both physical links and uplink traffic the first one. If that uplink dies, no failover happens at the T0. However, if Edge VM1 is down for some reason, Edge VM2 will use the second physical uplink for uplink traffic in normal operation, and will only fall back to the first physical uplink if something happens to the second switch or the connectivity to it.
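The failover behavior described above can be sketched like this (a simplified model with hypothetical uplink names, not actual vDS logic): each Edge VM’s uplink port group pins a different active uplink, with the other physical uplink configured as standby.

```python
from typing import Dict, Optional

# Sketch of the active/standby variant: Edge VM1's uplink PG uses
# Uplink1 active / Uplink2 standby, Edge VM2's PG the reverse.
def effective_uplink(active: str, standby: str,
                     link_up: Dict[str, bool]) -> Optional[str]:
    if link_up[active]:
        return active
    if link_up[standby]:
        return standby
    return None

links = {"Uplink1": True, "Uplink2": True}
# Normal operation: the two Edge VMs use different physical links.
assert effective_uplink("Uplink1", "Uplink2", links) == "Uplink1"  # Edge VM1
assert effective_uplink("Uplink2", "Uplink1", links) == "Uplink2"  # Edge VM2

links["Uplink2"] = False
# Switch2 connectivity fails: Edge VM2's port group falls back to
# Uplink1, so nothing has to fail over at the T0 level.
assert effective_uplink("Uplink2", "Uplink1", links) == "Uplink1"
```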

Multiple TEP – One uplink VLAN – vDS PGs with two uplinks, one active and one standby – no named teaming policy

As mentioned above, here the uplink traffic always flows through fp-eth0, regardless of which Edge VM hosts the active side of a given T0.

Multiple TEP – Two uplink VLANs – vDS PGs with one uplink and no standby – named teaming policy

This is really the suggested design, no question. In this design, the vDS port groups have one uplink configured and no standby – this is how VMware Cloud Foundation does it as well (link). If that uplink dies, then peering in that VLAN goes down and the other link has to handle all the uplink and overlay traffic.

Multiple TEP – Two uplink VLANs – vDS PGs with one active uplink and one standby – named teaming policy

Sure, there are some deployments out there like this. It is pretty similar to the one above, but with standby adapters for the uplink port groups. This variant also appears in the NSX design guide on page 165.

It is worth mentioning that here you must peer in two VLANs.

https://nsx-labs.livefire.solutions/m/94286/l/1228931-3-2-create-trunk-port-groups-to-interconnect-edge-nodes

Single TEP – One uplink VLAN – vDS PG with one active and one standby

Error handling is the task of the vDS, so the Edge VM uplink design is simple. Only one physical NIC is used at any time for all traffic.

Questions

This is not a post about how it must be done. This is an article to start the discussion, because a new NSX Design Guide is needed really quickly now: there are millions of options out there, and while they all work, no one can judge which is the recommended one. That said, I’d like to single out the two that I prefer.

I am posting this on my LinkedIn profile to open a discussion, as I am not a fan of blog-based forums. https://www.linkedin.com/posts/activity-6730766933223911424-1Xlm

Before anyone moves a deployed system into production, resiliency tests are mandatory. Nothing fancy here, everyone does this – but the whole Edge VM networking stack is abstract, two cubed actually: the vDS is abstract with its uplinks, the port group is abstract, and so is the uplink profile. When you disconnect a single network cable, the proper side must go down (if using no standby, of course), and while the vDS has a physical interface called Uplink1, that does not mean the uplink profile of the Edge will be in line with it. Test and test again!