Routing & Switching Design
The Aruba ESP Campus Routing & Switching Design section describes the technologies and design principles used in the design of a layer 2 and layer 3 Aruba ESP topology and control plane.
Spanning Tree Protocol
High availability is the primary goal for any enterprise conducting business on an ongoing basis. Layer 2 loops cause catastrophic network disruptions, making prevention and timely removal of loops a critical part of network administration.
The Spanning Tree Protocol (STP) dynamically discovers and removes layer 2 loops in a network. This section covers STP topology, the type of STP to use, and supplemental features to enable.
One method of increasing availability between network infrastructure is to establish layer 2 redundancy using multiple links, so an individual link failure does not result in a network outage. Multiple strategies can be applied to prevent the redundant connections from forwarding layer 2 frames in an infinite loop. The Aruba ESP architecture uses Virtual Switching Extension (VSX), discussed in the following Network Resiliency section, to prevent loops between centrally administered network switches. STP in combination with Loop Protect is configured primarily to resolve accidental loops created by users in the access layer.
With many different types of STP and varying network devices using different defaults, it is important to standardize on a common version for predictable STP topology. The recommended STP version for Aruba Gateways and Switches is Rapid per VLAN Spanning Tree (Rapid PVST+).
STP and Root Bridge Selection
Enable STP on all devices providing layer 2 services to prevent accidental loops.
STP creates a loop-free topology by selecting a root bridge and subsequently permitting only one port on each non-root switch to forward frames in the direction of the root. The root bridge is dynamically selected, using the lowest bridge ID as its primary selector. The bridge ID begins with a bridge priority, which can be set administratively to influence root bridge selection. Aggregation switches and collapsed core switches should have low bridge priorities to ensure that a switch at that layer becomes the root bridge of the network. The root bridge should be a device that is central to the aggregation of VLANs for downstream devices. In the campus topologies discussed in this guide, the root bridge candidates are the collapsed core, access aggregation, and services aggregation devices.
In the three-tier wired design, the root bridges are the access aggregation switches and service aggregation switches. Virtual Switching Extension (VSX) and multichassis link aggregation groups (MC-LAGs) are used to allow dual connections between the access and aggregation layers without the need for STP on individual links. Each set of aggregation switches is a separate layer 2 domain with its own root bridge, but they do not interfere with each other because STP does not extend over the layer 3 boundary between the devices. In this example, the aggregation switches are set with a low bridge priority to ensure that one switch in each VSX pair becomes the root bridge. The core devices are layer 3 switches and do not run STP.
Spanning Tree Protocol root bridge placement
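The root bridge placement above can be sketched in AOS-CX configuration. The VLAN list and priority value below are illustrative assumptions; the priority keyword takes a multiplier of 4096, and exact syntax should be verified against the AOS-CX CLI reference for the running release.

```
! Aggregation / collapsed core switch intended as root bridge (sketch)
spanning-tree mode rpvst
spanning-tree                           ! enable spanning tree globally
spanning-tree vlan 10,20,30             ! run an RPVST+ instance per user VLAN
spanning-tree vlan 10,20,30 priority 0  ! 0 x 4096 = lowest value, wins root election
```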
STP Supplemental Features
STP has several supplemental features that help keep the network stable. Here is a brief overview of supplemental features, with justifications for enabling them.
Root Guard prevents a port from accepting a superior bridge protocol data unit (BPDU) that would change the root bridge placement; if a superior BPDU arrives on a Root Guard-enabled port, the port is blocked until the BPDUs stop. Enable Root Guard on the aggregation or collapsed core downlinks to prevent access switches from becoming the root of the network. Do not enable it on the links connecting the aggregation switches to the core switch.
Admin Edge allows a port to transition directly to forwarding without going through the listening and learning phases of STP. Use it only on ports connected to a single end device, or to a PC daisy-chained behind a phone. Use Admin Edge with caution: because STP does not run on these ports, a loop introduced through them cannot be detected by STP. This feature should be used only for client-facing ports on access switches.
BPDU Guard automatically blocks any port on which it detects a BPDU. Enable this feature on client-facing ports of access switches to ensure that BPDUs are never accepted on access ports, preventing accidental loops and spoofed BPDU packets.
BPDU Filter ignores BPDUs sent to an interface and does not send its own BPDUs to interfaces. The main use for this feature is in a multitenancy environment when the servicing network does not want to participate in the customer’s STP topology. A BPDU Filter–enabled interface still allows other switches to participate in their own STP topologies. Using BPDU Filter is not recommended unless the network infrastructure does not need to participate in the STP topology.
Loop Protect is a supplemental protocol to STP that can detect loops when a device creating a loop does not support STP and also drops BPDUs. Loop Protect automatically disables ports to block a detected loop and re-enables ports when the loop disappears. This feature should be turned on for all access port interfaces to prevent accidental loops from port to port. Loop Protect should not be enabled on the uplink interfaces of access switches or in the core, aggregation, or collapsed core layers of the network.
Fault Monitor automatically detects excessive traffic and link errors. It can log events, send SNMP traps, or temporarily disable a port. Enable Fault Monitor in notification mode for all recognized faults on all interfaces, but do not enable its port-disable action, because Loop Protect is used to stop loops.
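As a hedged sketch, the features above might be applied as follows on AOS-CX switches; the interface numbers are placeholders, and command syntax can vary by platform and release.

```
! Access switch: client-facing port
interface 1/1/1
    spanning-tree port-type admin-edge   ! end-device port, forward immediately
    spanning-tree bpdu-guard             ! block the port if any BPDU arrives
    loop-protect                         ! catch loops from devices that drop BPDUs

! Aggregation or collapsed core: downlink toward an access switch
interface 1/1/48
    spanning-tree root-guard             ! ignore superior BPDUs from downstream
```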
Network Resiliency
Switching Resiliency Technologies
For campus switches, Aruba recommends either a two-tier LAN with collapsed core or a three-tier LAN with a routed core. In both designs, common features can be enabled to ensure that the network is highly resilient. The two-tier campus is shown on the left below; the three-tier campus is on the right.
Two-Tier and Three-Tier Wired
Virtual Switching Extension (VSX)
VSX enables two AOS-CX switches to appear as a single switch to downstream-connected devices. Use VSX in a collapsed core or aggregation point to add redundancy. In a standard link aggregation group (LAG), multiple physical interfaces between two devices are combined into a single logical link. Virtual Switching Extension (VSX) extends this capability by combining ports across two AOS-CX switches on one side of the LAG, referred to as a multi-chassis LAG (MC-LAG). The VSX pair appears as a single layer 2 switch to downstream connected devices, which can be another switch, a gateway, or an individual network host. The active gateway features enable configuration of a redundant Layer 3 network gateway using a shared IP and MAC address. Dual-homing connected devices in this manner adds network resiliency by eliminating a single point of failure for upstream connectivity.
From a management/control plane perspective, each switch is independent. VSX pairs synchronize MAC, ARP, STP, and other state tables over an inter-switch link (ISL). The ISL is a standard LAG between the VSX pair designated to run the ISL protocol.
VSX is supported on Aruba CX 6400, CX 8100, CX 8320, CX 8325, CX 8360, CX 8400, and CX 9300 models. A VSX pair must consist of two switches of the same model; for example, a CX 8320 cannot be paired with a CX 8325.
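A minimal VSX sketch follows, assuming LAG 256 for the ISL, placeholder member ports, and keepalives carried over the OOBM management VRF; verify syntax against the AOS-CX VSX guide for the release in use.

```
! VSX primary member (the peer is configured with "role secondary")
interface lag 256
    no shutdown
    no routing
    vlan trunk native 1
    vlan trunk allowed all               ! ISL best practice: permit all VLANs
interface 1/1/31
    lag 256
interface 1/1/32
    lag 256
vsx
    inter-switch-link lag 256
    role primary
    keepalive peer 192.168.0.2 source 192.168.0.1 vrf mgmt   ! OOBM keepalive
```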
VSX Pair Placement
The access switch uses a standard LAG connection and sees the VSX pair as a single upstream switch. Separating the physical connections across the two VSX-paired switches minimizes the fault domain for each link. It also minimizes service impact during a VSX Live Upgrade, because each VSX member has its own control plane and its own link to the downstream access devices.
Traditional STP vs VSX
MC-LAG enables all uplinks between adjacent switches to be active and passing traffic for higher capacity and availability, as shown in the right side of the figure below.
Note: When using LAG or MC-LAG, STP is not required to prevent loops on those links, but it should remain enabled as an additional safeguard.
VSX Terminology
LACP—The Link Aggregation Control Protocol (LACP) combines two or more physical ports into a single trunk interface for redundancy and increased capacity.
LACP Fallback—LAGs with LACP Fallback enabled allow an active LACP interface to connect with its peer before it receives LACP protocol data units (PDUs). This feature is useful for access switches using Zero Touch Provisioning (ZTP) connecting to LACP-configured aggregation switches.
Inter-switch link—Best practice for configuring the ISL LAG is to permit all VLANs. Specifying a restrictive list of VLANs is valid if the network administrator requires more control.
MC-LAG—These LAGs should be configured with the specific VLANs and use LACP active mode. MC-LAGs should NOT be configured with all-VLAN permission.
VSX keepalive—The VSX keepalive is a User Datagram Protocol (UDP) probe that sends hellos between the two VSX nodes to detect a split-brain situation. The keepalives should be sent using the out-of-band management (OOBM) port connected to a dedicated management network, or enabled on a direct IP connection using a dedicated physical link between the VSX pair members.
Active gateway—This provides a redundant IP gateway for endpoints using a more modern approach than VRRP. It must be configured on both the VSX primary and secondary switches, and both devices must assign the same active gateway MAC address. A limited number of unique virtual MACs are configurable per switch, so it is best practice to reuse the same active gateway MAC value for all active gateway IP assignments across all VLANs. Active gateway MACs should be assigned from the four locally administered MAC address ranges:
- x2-xx-xx-xx-xx-xx
- x6-xx-xx-xx-xx-xx
- xA-xx-xx-xx-xx-xx
- xE-xx-xx-xx-xx-xx
Note: x is any hexadecimal value.
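For illustration, an active gateway assignment might look like the sketch below, using a MAC from the x2 range. The VLAN, IP addresses, and MAC value are assumptions; the same MAC and gateway IP are configured on both VSX members.

```
! Identical active gateway on both VSX members (addresses are examples)
interface vlan 10
    ip address 10.1.10.2/24                    ! peer uses 10.1.10.3/24
    active-gateway ip mac 12:02:00:00:01:00    ! locally administered MAC (x2 range)
    active-gateway ip 10.1.10.1                ! shared gateway IP used by hosts
```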
PIM Dual—If the network is multicast-enabled, the default PIM DR is the VSX primary switch. The VSX secondary also can establish PIM peering to avoid a long convergence time in case of VSX primary failure. Aruba recommends configuring PIM as active-active for each VSX pair.
VSX and Rapid PVST—Aruba recommends configuring VSX and STP per the design guidelines outlined in the best-practices document below.
Note: For additional detail and coverage of VSX use cases outside the scope of this guide, refer to the latest version of the Virtual Switching Extension Guide as found on the HPE Networking Support Portal.
Virtual Switching Framework
Stacking allows multiple access switches to be connected to each other and behave like a single switch. Stacking combines multiple physical devices into one virtual switch, increasing port density and allowing management and configuration from one IP address. This reduces the total number of managed devices while better utilizing the port capacity in an access wiring closet. Stack members share the uplink ports, which provides additional bandwidth and redundancy.
AOS-CX access switches provide front plane stacking using the Virtual Switching Framework (VSF) feature, using two of the four front panel SFP ports operating at 10G, 25G, 50G, or 100G speeds. VSF combines the control and management planes of both switches in a VSF stack, which allows for simpler management and redundancy in the access closet. VSF is supported on Aruba CX 6200 and 6300 switches.
VSF supports up to ten members on a 6300 and up to eight members on a 6200. Aruba recommends a ring topology for the stacked switches. A ring topology can be used for two to ten switches and provides link-failure fault tolerance, because the devices can still reach the commander or standby switch using the secondary path.
The commander and standby switches should have separate connections to the pair of upstream aggregation switches. If the commander fails, the standby can still forward traffic, limiting the failure to the commander switch. The recommended interface for switch stacking links is a 50G direct-attach cable (DAC), which allows enough bandwidth for traffic across members.
There are three stacking-device roles:
- The commander conducts overall management of the stack and synchronizes forwarding databases with the standby.
- The standby provides redundancy for the stack and takes over stack management operations if the commander becomes unavailable or if an administrator forces a commander failover.
- Members are not part of the overall stack management, but they must manage their local subsystems and ports to operate correctly as part of the stack. The commander and standby also are responsible for their own local subsystems and ports.
VSF Connections
To mitigate the effects of a VSF split stack, a split-detection mechanism known as multi-active detection (MAD) must be enabled on the commander and standby. MAD uses a connection between the OOBM ports of the commander and standby members to detect when a split occurs. Aruba recommends connecting the OOBM ports directly with an Ethernet cable.
VSF OOBM MAD and links
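A VSF stack with OOBM-based split detection could be sketched as below; the member numbering, link ports, and `type` part number are placeholders that must match the actual hardware, and syntax may differ across AOS-CX releases.

```
! Two-member 6300 VSF ring sketch (part number is a placeholder)
vsf member 1
    type jl668a                ! must match the actual switch model
    link 1 1/1/25
    link 2 1/1/26
vsf member 2
    type jl668a
    link 1 2/1/25
    link 2 2/1/26
vsf split-detect mgmt          ! MAD over the directly cabled OOBM ports
```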
Quality of Service
Quality of service (QoS) refers to a network’s ability to provide higher levels of service using traffic prioritization and control mechanisms. Applying the proper QoS policy is important for real-time traffic (such as on Skype or Zoom) and business-critical applications (such as Oracle or Salesforce).
To accurately configure QoS on a network, consider several aspects, including bit rate, throughput, path availability, delay, jitter, and loss. The last three—delay, jitter, and loss—are easily improved with a queuing algorithm that enables the administrator to prioritize applications with higher requirements over those with lower requirements. The areas of the network that require queuing are those with constrained bandwidth, such as the wireless or WAN sections. Wired LAN uplinks are designed to carry the appropriate amount of traffic for the expected bandwidth requirements. Since QoS queuing does not take effect until active congestion occurs, it is not typically needed on LAN switches.
The easiest strategy to deploy QoS is to identify the critical applications running in the network and give them a higher level of service using the QoS scheduling techniques described in this guide. The remaining applications stay in the best-effort queue to minimize upfront configuration and lower the day-to-day operational effort of troubleshooting a complex QoS policy. If additional applications become critical, they are added to the list. This can be repeated as needed without requiring a comprehensive policy for all applications. This strategy is often used by organizations that do not have a corporatewide QoS policy or that are troubleshooting application performance problems in specific areas of the network.
One example prioritizes real-time applications along with a few other critical applications that require fast response times because users are waiting on remote servers. Identifying business-critical applications and giving them special treatment on the network helps employees remain productive. Real-time applications are placed into a strict priority queue, and business-critical applications are given a higher level of service during congestion.
The administrator must limit the amount of traffic placed into strict priority queues to prevent oversaturation of interface buffers. Giving all traffic priority defeats the purpose of creating a QoS strategy. After critical traffic is identified and placed in the appropriate queues, the rest of the traffic is placed in a default queue with a lower level of service. If the higher-priority applications do not use the bandwidth assigned, the default queue uses all available bandwidth.
Traffic Classification and Marking
Aruba recommends using the access switch as a QoS policy enforcement point for traffic over the LAN. This means selected applications are identified by IP address and port number at the access switch and marked for special treatment. Optionally, traffic can be queued on the uplinks, but this is not required in an adequately designed campus LAN environment. Any applications that are not identified are re-marked with a value of zero, giving them a best-effort level of service. The diagram below shows where traffic is classified, marked, and queued as it passes through the switch.
Classification, Marking, and Queuing
In a typical enterprise network, applications with similar characteristics are categorized based on how they operate over the network. Then, the applications go in different queues according to category. For example, if broadcast video or multimedia streaming applications are not used for business purposes, there is no need to account for them in a QoS policy. However, if Skype and Zoom are used for making business-related calls, the traffic must be identified and given a higher priority. Specific traffic that is not important to the business (for example: YouTube, gaming, and general web browsing) should be identified and placed into the “scavenger” class, where it is dropped first and with the highest frequency during times of congestion.
A comprehensive QoS policy requires categorization of business-relevant and scavenger-class applications on the network. Layer 3 and layer 4 classifications group applications into categories with similar characteristics. After sorting the critical applications from the others, they are combined into groups for queuing.
Scheduling algorithms rely on classification markings to identify applications passing through a network device. Applications marked at layer 3 rather than layer 2 will carry the markings throughout the life of a packet.
The goal of the QoS policy is to allow critical applications to share the available bandwidth with minimal system impact and engineering effort.
DiffServ Code Point (DSCP)
Marking in layer 3 uses the IP type of service (ToS) byte with either the IP precedence three most significant bit values (from 0 to 7) or the DSCP six most significant bit values (from 0 to 63). The DSCP values are more common because they provide a higher level of QoS granularity. They are also backward compatible with IP precedence because of their leftmost placement in the ToS byte.
Layer 3 markings are added in the standards-based IP header, so they remain with the packet as it travels across the network. When an additional IP header is added to a packet, such as traffic in an IPsec tunnel, the inner header DSCP marking must be copied to the outer header to allow the network equipment along the path to use the values.
DSCP marking
Several RFCs are associated with the DSCP values as they pertain to the per-hop behaviors (PHBs) of traffic passing through the various network devices along its path. The diagram below shows the relationship between PHB and DSCP and their associated RFCs.
DSCP relationship with PHBs
Voice traffic is marked with the highest priority using an Expedited Forwarding (EF) class. Multimedia applications, broadcast, and video conferencing are placed into an assured forwarding (AF31) class to give them a percentage of the available bandwidth as they cross the network. Signaling, network management, transactional, and bulk applications are given an assured forwarding (AF21) class. Finally, default and scavenger applications are placed into the default (DF) class to provide them a reduced amount of bandwidth but not completely starve them during times of interface congestion. The figure below shows an example of six application types mapped to a four-class LAN queuing model.
Six application types in a 4-class LAN queuing model
The best-effort entry at the end of the QoS policy marks all application flows that are not recognized by the layer 3 / layer 4 classification into the best-effort queue. This prevents end-users who mark their own packets from getting higher-priority access across the network.
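A hypothetical AOS-CX marking policy for an access switch is sketched below. The class names, subnets, and port numbers are invented for illustration, and the final catch-all entry re-marks unidentified traffic to best effort; verify classifier and policy syntax in the AOS-CX CLI reference for the release in use.

```
! Hypothetical edge classification and marking policy
class ip VOICE-RTP
    10 match udp any any range 16384 32767     ! assumed RTP port range
class ip BUSINESS-APP
    10 match tcp any 10.2.0.0/16 eq 1521       ! e.g. a database listener
class ip CATCH-ALL
    10 match any any any
policy EDGE-MARKING
    10 class ip VOICE-RTP action dscp EF
    20 class ip BUSINESS-APP action dscp AF21
    30 class ip CATCH-ALL action dscp DF       ! re-mark everything else to 0
interface 1/1/1
    apply policy EDGE-MARKING in
```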
Traffic Prioritization and Queuing
The Aruba switch supports up to eight queues, but this example uses only four queues. The real-time interactive applications are combined into one strict priority queue. In contrast, multimedia and transactional applications are placed into deficit round-robin (DRR) queues, and the last queue is used for scavenger and default traffic. DRR is a packet-based scheduling algorithm that groups applications into classes and shares available capacity among them according to a percentage of bandwidth defined by the network administrator. Each DRR queue gets its fair share of bandwidth during congestion, but all of them can use as much bandwidth as needed when there is no congestion.
The outbound interface requires the DSCP values shown in the second column to queue the applications. The weighted values used in the DRR LAN scheduler column are added, and each DRR queue is given a share of the total. The values must be adjusted according to the volume of applications in each category. This is often a trial-and-error process because the QoS policy affects the applications in the environment. As shown in the four-class example below, the queues are sequentially assigned in top-down order.
QoS Summary for Aruba Switch
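The four-class model above might translate into a schedule profile like the sketch below. The queue numbers and weights are assumptions, and many AOS-CX platforms require every hardware queue to be accounted for in a profile, so treat this as a starting point and consult the platform QoS guide.

```
! Four-class scheduling sketch (queue numbering is an assumption)
qos schedule-profile LAN-4CLASS
    strict queue 7             ! real-time (EF): serviced first, keep it small
    dwrr queue 5 weight 30     ! multimedia (AF31)
    dwrr queue 2 weight 60     ! signaling / transactional / bulk (AF21)
    dwrr queue 0 weight 10     ! default and scavenger (DF)
apply qos schedule-profile LAN-4CLASS
```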
Maximum Transmission Unit (MTU) size
Packet size must be considered to achieve optimal traffic flow and network performance. The Layer 3 Maximum Transmission Unit (MTU) defines the maximum IP or IPv6 packet size that can be transmitted through the network without fragmentation. Layer 2 MTU defines the largest Ethernet frame size supported on a network link without dropping the frame. MTU should be configured at both Layer 2 and Layer 3 across the entire network. On some devices, MTU is also referred to as “jumbo frames” and should be enabled.
Network tunneling methods add headers to packets, which can increase the packet size beyond default MTU values. A network administrator must ensure MTU values can accommodate the increased packet size of tunneled traffic in the full end-to-end network path between tunnel endpoints. Using jumbo frames or increasing the MTU size can prevent fragmentation and improve network performance. HPE Aruba Networking does not support fragmentation of GRE or VXLAN tunnels.
Setting the Layer 2 and Layer 3 MTU to 9198 bytes on CX switches is recommended for the following advanced feature sets, which require MTUs greater than the default value of 1500:
- User-based tunneling: GRE is used to tunnel ingress traffic from a switch interface to a mobility gateway.
- Distributed overlay fabric: Traffic is encapsulated with a VXLAN header before being routed through the underlay.
- WLAN tunnel mode: Client traffic is tunneled from access points to mobility gateways.
Note: The AOS-CX Layer 2 MTU configuration value specifies the maximum Ethernet payload, which represents the same bytes as the Layer 3 MTU. The maximum Ethernet frame size is auto-calculated by adding 22 bytes (14 bytes for the Ethernet frame header, 4 bytes to accommodate an 802.1Q header, and 4 bytes for the Ethernet frame check sequence (FCS)).
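A jumbo-MTU interface sketch follows; the interface number is a placeholder. With the 22-byte overhead described in the note above, a 9198-byte payload yields a 9220-byte maximum frame.

```
! Uplink carrying tunneled (GRE/VXLAN) traffic
interface 1/1/49
    mtu 9198        ! Layer 2 payload size (maximum frame becomes 9220 bytes)
    ip mtu 9198     ! Layer 3 MTU for routed IP traffic on this interface
```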
OSPF Routing
Aruba ESP best practice uses OSPF for its simplicity and ease of configuration. OSPF is a dynamic, link-state, standards-based routing protocol commonly deployed in campus networks. OSPF provides fast convergence and excellent scalability, making it a good choice for large networks because it can grow with the network without requiring redesign.
OSPF defines areas to limit routing advertisements and to allow for route summarization. The Aruba ESP campus design uses a single area for the campus LAN. Multi-area, backbone designs are considered when connecting multiple campus or WAN topologies. OSPF is often used to exchange routes between the campus LAN and a WAN gateway or a DMZ firewall.
The ESP underlay best practice configuration uses OSPF point-to-point links between aggregation and core devices. Interfaces on aggregation switches providing layer 3 services to downstream hosts are configured as members of the OSPF domain. Configure the OSPF router with passive-interface default to prevent unintended adjacencies from forming with devices plugged into a layer 2 access port, and disable passive operation on any port where an OSPF neighbor is expected.
When configuring access switches, best practice is to configure an IP address in the management VLAN and to enable OSPF on that VLAN IP interface. Adding /32 loopback interfaces to OSPF also lays the foundation for a high-reliability management network.
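The practices above could be sketched for an aggregation switch as follows, assuming placeholder router IDs, addresses, and interfaces; passive operation is the default, disabled only on the routed uplink where a neighbor is expected.

```
! OSPF sketch for an aggregation switch (values are placeholders)
router ospf 1
    router-id 10.0.0.11
    passive-interface default        ! no adjacencies unless explicitly enabled
    area 0.0.0.0
interface loopback 0
    ip address 10.0.0.11/32
    ip ospf 1 area 0.0.0.0
interface 1/1/49                     ! routed point-to-point uplink to the core
    ip address 10.255.0.1/31
    ip ospf 1 area 0.0.0.0
    ip ospf network point-to-point
    no ip ospf passive               ! OSPF neighbor expected on this link
```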
The diagram below illustrates the OSPF fundamentals of a three-tier campus LAN.
OSPF in Three-Tier Wired
IP Multicast
IP multicast efficiently sends a single traffic stream to multiple interested receivers using a destination IP address in the 224.0.0.0/4 block. Each IP address in the block is referred to as a multicast group. Traffic sent to an individual group from a source is delivered to all hosts that express interest in receiving traffic for the group. Multicast forwarding optimizations prune traffic from Layer 2 and Layer 3 data links without interested listeners. Multicast applications include audio/video streaming, service discovery, and host imaging.
HPE Aruba Networking uses Protocol Independent Multicast – Sparse Mode (PIM-SM) to route multicast traffic between subnets. PIM-SM routers form adjacencies between connected Layer 3 routed interfaces. A PIM-SM domain is the contiguous set of PIM-SM adjacent routers, typically consisting of the full set of campus and data center Layer 3 switches. The following added features support discovery and delivery of multicast traffic:
- Rendezvous point (RP).
- Multicast Source Discovery Protocol (MSDP).
- Bootstrap router (BSR) mechanism.
- Internet Group Management Protocol (IGMP).
- IGMP snooping.
Typically, the multicast group’s source IP address is unknown to an interested listener and its directly attached router. One or more Rendezvous Points (RPs) maintain a centralized mapping of multicast groups to their source IP addresses. PIM-SM routers with sources directly attached register the source/group data with the RP. Any PIM-SM router with an interested listener can use the RP to establish the initial flow of multicast traffic without knowledge of the source IP address.
Anycast RP is the most common redundancy strategy for the RP function. The same loopback IP address is assigned to two or more PIM-SM routers. PIM Register messages are sent by source-attached routers to the anycast RP address and received by the nearest RP. Multicast Source Discovery Protocol (MSDP) shares multicast source/group information among the complete set of anycast RPs, so each RP maintains a complete set of source/group information. Any RP can facilitate the initial flow of multicast traffic. If one RP fails, the others continue the RP function without requiring configuration changes or updates on other members of the PIM domain.
RP interfaces and MSDP are configured on campus core switches.
The BSR mechanism is built into the PIM protocol for the purpose of selecting an RP dynamically and distributing the RP IP address throughout the PIM-SM domain. Core switches are configured as BSR candidates using a unique loopback IP address and as RP candidates using the shared anycast IP loopback. After a single switch is elected as the BSR, a candidate RP address is selected. When using anycast redundancy, all RP candidates advertise the same anycast IP address. BSR then advertises the selected RP address to all participating routers, freeing the administrator from having to configure an RP address on each network router.
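On a core switch, the anycast RP and BSR candidacy described above could be sketched as follows. The loopback numbering and addresses are assumptions, MSDP peering between the cores is configured separately, and PIM must also be enabled on each routed interface; verify syntax in the AOS-CX multicast guide.

```
! Core switch 1 sketch (core switch 2 shares only the anycast loopback address)
interface loopback 0
    ip address 10.0.0.1/32           ! unique per switch: BSR candidate source
interface loopback 1
    ip address 10.0.0.100/32         ! anycast RP address shared by both cores
router pim
    enable
    bsr-candidate source-ip-interface loopback0
    rp-candidate source-ip-interface loopback1 group-prefix 224.0.0.0/4
interface 1/1/1                      ! example routed interface toward aggregation
    ip pim-sparse enable
```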
IGMP is enabled on default gateway SVI interfaces, which serve as the IGMP queriers for their respective networks. An IGMP querier maintains a list of interested listeners for multicast groups on its local network segment. When a host is interested in receiving multicast traffic, it sends an IGMP join request. When the IGMP querier receives the join, the switch initiates the PIM join process toward the RP to start a multicast flow. When a listener is no longer interested in receiving traffic, it sends an IGMP leave message. The IGMP querier also periodically verifies interest in multicast groups by sending an IGMP query to all attached hosts. If no interested listeners remain for an established multicast group on a PIM router, the switch initiates a PIM prune process to stop the flow of multicast traffic.
The HPE Aruba Networking AOS-CX Multicast Guide in the HPE Networking Support Portal and IETF RFC 7761 provide more detailed information on PIM-SM and IGMP operations.
IGMP snooping monitors IGMP joins, leaves, and queries to optimize layer 2 multicast forwarding state. Configure IGMP snooping on access and aggregation switches in a three-tier topology, or on access and core switches in a two-tier topology.
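A minimal sketch of querier and snooping placement, assuming VLAN 10: in the three-tier design the SVI lives on the aggregation VSX pair, while snooping is enabled on the layer 2 access switch.

```
! Aggregation switch: SVI acts as the IGMP querier for VLAN 10
interface vlan 10
    ip igmp enable

! Access switch: constrain multicast flooding to interested ports
vlan 10
    ip igmp snooping enable
```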
Dynamic Multicast Optimization (DMO) further optimizes multicast traffic in wireless networks.
Aruba ESP Campus LAN Design Summary
The ESP campus wired LAN provides network access for employees, APs, and Internet of Things (IoT) devices. The campus LAN also becomes the logical choice for interconnecting the WAN, data center, and Internet access, making it a critical part of the network.
The simplified access, aggregation, and core design delivers the following benefits:
- An intelligent access layer protects from attacks while maintaining user transparency within the layer 2 VLAN boundaries.
- The aggregation and core layers provide IP routing using OSPF and IP multicast using PIM sparse mode with redundant BSRs and RPs.
- The services aggregation connects critical networking devices such as corporate servers, WAN routers, and Internet edge firewalls.
- The core is a high-speed dual-switch interconnection that provides path redundancy and sub-second failover for nonstop forwarding of packets.
- Combining the core and services aggregation into a single layer enables the network to scale when a standalone core is not required.