VCAP – Design: VMware vSphere Distributed Switch best practice notes


In this post I will share my notes on the VMware vSphere Distributed Switch best practices document, in preparation for the VCAP Design exam. (Document link: https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/techpaper/vsphere-distributed-switch-best-practices-white-paper.pdf)

Notes:

  • The document has not been updated for vSphere 6; some of the information applies to v5.x only.
  • The document contains a lot of useful information. This post is only study notes; if you are not familiar with the technologies discussed in the document, please read the document first.
  • I have chosen to cover only the rack server configurations, but the document also covers blade server configurations.

Design Considerations:

Goals:

  • No single point of failure
  • Isolate traffic types
  • Use of traffic management and optimization

Component configuration:

  • Rack server with eight 1GbE NICs
  • Rack server with two 10GbE NICs
  • Physical switch – capable of switch clustering

Traffic and bandwidth usage:

  • Management – Low
  • vMotion – High
  • FT – Medium to high
  • iSCSI/NFS – High
  • VM traffic – Depends on the application

 

Deployment Configuration:

Hosts and clusters: the examples use two clusters, each containing two ESXi hosts

MGMT: vCenter will be used for centralized management

Network Infrastructure:

Physical switches: provide secure, reliable L2 connectivity

Traffic type diagram: all of the traffic types mentioned above use separate port groups (PGs) with assigned VMkernel adapters

Source: VMware vSphere Distributed Switch best practice document

VMware vSphere Distributed Switch best practices configuration:

  • All hosts will be deployed with the same number of physical NICs
  • Virtual ports connected to a PG will share the same properties
  • PG considerations (summarized in the sketch after this list):
    • Number of virtual ports
    • VLANs / VLAN trunking / private VLANs
    • Port binding
    • Bidirectional traffic shaping
    • Port security
  • Use Network I/O Control (NIOC)
  • Physical network switch considerations
    • VLANs – enable VST (Virtual Switch Tagging) on the vSwitch and trunk the VLANs on the physical switch ports
    • STP: not supported on the vDS, so make sure there are no loops in the configuration
    • STP: use PortFast on the ESXi host-facing physical switch ports
    • STP: use BPDU guard to enforce the STP boundary
  • Use link aggregation to increase throughput
  • Enable Link State Tracking
  • Configure a consistent MTU end to end (jumbo frames where used)
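The per-port-group decisions above can be kept together in a small checklist record. Below is a minimal sketch in plain Python (not a vSphere API; all field names and example values are my own placeholders):

```python
from dataclasses import dataclass

@dataclass
class PortGroupDesign:
    """One row of the per-port-group design checklist (illustrative only, not a vSphere API)."""
    name: str
    vlan_id: int          # VLAN used to isolate this traffic type
    num_ports: int        # number of virtual ports
    port_binding: str     # e.g. "static" or "ephemeral"
    teaming_policy: str   # e.g. "explicit failover" or "route based on physical NIC load"
    traffic_shaping: bool # bidirectional traffic shaping enabled?
    port_security: bool   # port security policies applied?

# Placeholder entries for two of the traffic types listed above (VLAN IDs are made up).
port_groups = [
    PortGroupDesign("PG-MGMT", vlan_id=10, num_ports=8, port_binding="static",
                    teaming_policy="explicit failover", traffic_shaping=False, port_security=True),
    PortGroupDesign("PG-vMotion", vlan_id=20, num_ports=8, port_binding="static",
                    teaming_policy="explicit failover", traffic_shaping=False, port_security=True),
]
for pg in port_groups:
    print(pg)
```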

Now let’s look at the examples in the document

Rack server with eight 1GbE NICs:

Source: VMware vSphere Distributed Switch best practice document

 

Design option 1 – Static configuration, no NIOC and no load balancing based on physical NIC load:

  • Switches and NICs: four NICs are connected to the first access layer switch and the other four to the second access switch to avoid a single point of failure.
  • dvUplinks: eight dvUplinks are configured. VMware recommends renaming the dvUplinks to something more meaningful.
  • dvPortGroup: each traffic type mentioned above is configured with its own PG.
  • Teaming: all traffic types except VM traffic use “explicit failover” (illustrated in the sketch after this list); VM traffic is set to “Route based on physical NIC load”.
  • Recommendation: use VLANs to isolate traffic types
  • Physical switch: trunking enabled for the VLANs; STP with PortFast and BPDU guard on the ESXi host-facing ports
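To make the “explicit failover” idea concrete, here is a rough sketch of an active/standby uplink mapping for the infrastructure port groups. The pairings shown are placeholders; the actual uplink-to-port-group assignment is the one in the source diagram.

```python
# Illustrative active/standby (explicit failover) mapping for the infrastructure
# port groups; the real pairing comes from the diagram in the VMware paper.
failover_order = {
    # port group:   (active dvUplink, standby dvUplink) -- placeholder pairings
    "PG-MGMT":    ("dvUplink1", "dvUplink2"),
    "PG-vMotion": ("dvUplink2", "dvUplink1"),
    "PG-FT":      ("dvUplink3", "dvUplink4"),
    "PG-iSCSI":   ("dvUplink4", "dvUplink3"),
}

# VM traffic is not listed here: its port group uses "Route based on physical NIC load"
# across the remaining uplinks instead of a fixed active/standby order.
for pg, (active, standby) in failover_order.items():
    print(f"{pg}: active={active}, standby={standby}")
```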

Design option 2 – Dynamic configuration with NIOC and load balancing based on physical NIC load:

The logical design diagram is the same as option 1, but the configuration differs based on monitoring traffic patterns and bandwidth usage over time. For the configuration we will look at the following traffic bandwidth needs:

  • MGMT (<1 Gbps)
  • vMotion (1 Gbps)
  • FT (1 Gbps)
  • iSCSI (1 Gbps)
  • VMs (2 Gbps)

Configuration:

  • Switches and NICs: same as option 1
  • dvUplinks: all uplinks are active; no standby links
  • dvPortGroup: the teaming policy is set to “Route based on physical NIC load”
  • Recommendation: use VLANs to isolate traffic types
  • Use NIOC

NIOC custom share calculation: the example in the document uses 5 shares for MGMT (<1 Gbps), 10 shares each for vMotion, FT and iSCSI (1 Gbps), and 20 shares for VMs (2 Gbps).

Calculation:

  • Calculate the total number of shares
  • Calculate the percentage of bandwidth for each traffic type: share value / total shares
  • Multiply that percentage by the total bandwidth of one uplink (1 Gbps = 1,000 Mbps)

Results (the sketch after this list reproduces these numbers):

Total shares = MGMT (5) + vMotion (10) + FT (10) + iSCSI (10) + VMs (20) = 55

  • MGMT: (5 / 55) × 1,000 Mbps = 90.91 Mbps
  • vMotion: (10 / 55) × 1,000 Mbps = 181.81 Mbps
  • FT: (10 / 55) × 1,000 Mbps = 181.81 Mbps
  • iSCSI: (10 / 55) × 1,000 Mbps = 181.81 Mbps
  • VMs: (20 / 55) × 1,000 Mbps = 363.63 Mbps
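The arithmetic above can be reproduced with a short sketch (plain Python; the helper name is mine, not from the VMware paper):

```python
# NIOC custom-share arithmetic for the 8x1GbE example.

def nioc_bandwidth_mbps(shares, uplink_mbps):
    """Split one uplink's bandwidth across traffic types in proportion to their NIOC shares."""
    total_shares = sum(shares.values())
    return {traffic: share / total_shares * uplink_mbps for traffic, share in shares.items()}

shares = {"MGMT": 5, "vMotion": 10, "FT": 10, "iSCSI": 10, "VMs": 20}
for traffic, mbps in nioc_bandwidth_mbps(shares, uplink_mbps=1_000).items():
    # MGMT ≈ 90.9, vMotion/FT/iSCSI ≈ 181.8, VMs ≈ 363.6 (matches the figures above)
    print(f"{traffic}: {mbps:.2f} Mbps")
```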

 

Rack server with two 10GbE NICs:

Source: VMware vSphere Distributed Switch best practice document

 

Design option 1 – Static configuration, no NIOC and no load balancing based on physical NIC load:

  • Switches and NICs: one NIC is connected to the first access layer switch and the other to the second access switch to avoid a single point of failure.
  • dvUplinks: two dvUplinks are configured. VMware recommends renaming the dvUplinks to something more meaningful.
  • dvPortGroup: each traffic type mentioned above is configured with its own PG.
  • Teaming: all traffic types except VM traffic use “explicit failover”; VM traffic is set to “Route based on physical NIC load”.
  • Recommendation: use VLANs to isolate traffic types
  • Physical switch: trunking enabled for the VLANs; STP with PortFast and BPDU guard on the ESXi host-facing ports

Design option 2 – Dynamic configuration with NIOC and load balancing based on physical NIC load:

The logical design diagram is the same as option 1, but the configuration differs based on monitoring traffic patterns and bandwidth usage over time.

For the configuration we will look at the following traffic bandwidth needs:

  • MGMT (<1 Gbps)
  • vMotion (2 Gbps)
  • FT (1 Gbps)
  • iSCSI (2 Gbps)
  • VMs (2 Gbps)

Configuration:

  • Switches and NICs: same as option 1
  • dvUplinks: all uplinks are active; no standby links
  • dvPortGroup: the teaming policy is set to “Route based on physical NIC load”
  • Recommendation: use VLANs to isolate traffic types
  • Use NIOC

NIOC custom share calculation (same method as above; see the snippet after this list):

  • Total shares = 5 + 20 + 10 + 20 + 20 = 75
  • MGMT: (5 / 75) × 10,000 Mbps = 667 Mbps
  • vMotion: (20 / 75) × 10,000 Mbps = 2.67 Gbps
  • FT: (10 / 75) × 10,000 Mbps = 1.33 Gbps
  • iSCSI: (20 / 75) × 10,000 Mbps = 2.67 Gbps
  • VMs: (20 / 75) × 10,000 Mbps = 2.67 Gbps
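The same arithmetic applied to the 10GbE shares above, as a small self-contained sketch (variable names are mine):

```python
# NIOC allocations for the 2x10GbE example: shares from the list above, one 10 GbE uplink.
shares = {"MGMT": 5, "vMotion": 20, "FT": 10, "iSCSI": 20, "VMs": 20}
uplink_mbps = 10_000
total = sum(shares.values())
for traffic, share in shares.items():
    # MGMT ≈ 667 Mbps; vMotion, iSCSI and VMs ≈ 2,667 Mbps; FT ≈ 1,333 Mbps
    print(f"{traffic}: {share / total * uplink_mbps:,.0f} Mbps")
```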

 

Summary:

In the static design options for both the 8x1GbE and 2x10GbE rack servers, resiliency is achieved through the active/standby uplink configuration and security is handled by isolating the traffic types; the drawback of this approach is that I/O resources are underutilized.

In the dynamic design options for both the 8x1GbE and 2x10GbE rack servers we also achieve resiliency, and this approach takes advantage of the vSphere Distributed Switch’s advanced capabilities, so I/O resources are utilized efficiently. This design option is recommended.

Thanks for reading

Mordi.
