Nexus 9300

data sheet beauty shots of switches

Beauty shots of two switch family members

Introduced September 2014 : End of life announced August 2017

n.b. The 92xx switches are newer than the 93xx. Covered separately.

The 9300 and 9500 switches were introduced in November 2013. These switches can run in either NX-OS mode or ACI mode. In NX-OS mode, they are stand-alone switches. In ACI mode, they are managed by a controller.

The 9300 uses Broadcom's Trident II as its main switching engine. An important thing to remember about the Trident II is that it has 128 SerDes that can each run at 10 Gb/s. It can be organized as 128 ports of 10 Gb/s, or 32 ports of 40 Gb/s. Each 40 Gb/s port uses four SerDes in parallel. There is no data path in or out of the Trident II except through the SerDes. It is not possible to augment the 12 MBytes of packet buffers managed directly by the Trident II.
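The lane arithmetic above can be sketched in a few lines. This is only an illustration built from the figures quoted in the text (128 lanes at 10 Gb/s each); the function names and the `reserved_lanes` parameter are hypothetical and do not come from any Broadcom or Cisco tool.

```python
# Lane arithmetic for a 128-lane switch chip, using only the figures
# quoted in the text. All names here are illustrative assumptions.
SERDES_LANES = 128      # SerDes lanes on the chip
LANE_RATE_GBPS = 10     # speed of each lane in Gb/s


def port_speed_gbps(lanes_per_port: int) -> int:
    """Speed of one port built from `lanes_per_port` lanes in parallel."""
    if lanes_per_port <= 0:
        raise ValueError("lanes_per_port must be positive")
    return lanes_per_port * LANE_RATE_GBPS


def port_count(lanes_per_port: int, reserved_lanes: int = 0) -> int:
    """Number of ports available after setting aside `reserved_lanes`.

    `reserved_lanes` is a hypothetical knob for lanes that are not
    exposed as ports (for example, lanes wired to another ASIC).
    """
    if lanes_per_port <= 0:
        raise ValueError("lanes_per_port must be positive")
    if not 0 <= reserved_lanes <= SERDES_LANES:
        raise ValueError("reserved_lanes out of range")
    return (SERDES_LANES - reserved_lanes) // lanes_per_port


def aggregate_gbps() -> int:
    """Total one-way bandwidth through all lanes."""
    return SERDES_LANES * LANE_RATE_GBPS


if __name__ == "__main__":
    for lanes in (1, 4):
        print(f"{port_count(lanes)} ports of {port_speed_gbps(lanes)} Gb/s")
    print(f"aggregate: {aggregate_gbps()} Gb/s")
```

Because every path in or out of the chip goes through a SerDes, any lane set aside for something other than a port comes out of the same budget of 128.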

Ivan Pepelnjak has a relevant blog post on how Nexus and Trident II fit together. It contains a link to a Cisco Live! slide deck on the Nexus 9500/9300 switches.

Cisco calls the main switching engine the NFE (Network Forwarding Engine). They have dedicated some of the SerDes to interconnect to a Cisco proprietary ASIC that can be used to augment the fixed 12 MByte packet memory of the Trident II. Cisco calls this ASIC the ALE (Application Leaf Engine). This ASIC has other functions besides augmenting switch buffers. It has 10 Gb/s SerDes for its interconnect to the Trident II and additional SerDes that provide ports intended to connect the switch to the network core.

Diagram shows ALE attached to Trident II

If you dig deep enough through the 9300 docs, you'll find explicit mention of Trident II. This is more than speculation. There is a lot more to the 9300 family than just Broadcom. Note that referencing the Trident II ports as Front-Panel Ports is confusing. The ALE ports at the top of the diagram are also on the front panel. The Nexus 9500 uses the same (or similar) ASICs. The top ports are used to connect to the chassis backplane instead of being exposed on the front panel.

A 23-page paper details how the ALE and Trident II work together. It is complicated. At least, I think it is complex. For one thing, there are actually two different ALE ASICs: the ALE and ALE-2. They have 40 MB and 25 MB of buffer memory, respectively. It is not clear from the data sheets which 93xx model has which ALE variant. The difference between ALE and ALE-2 is more than just the size of the interconnects and the amount of packet memory. Each allocates its memory in its own special way. The following figure, also from the cited paper, shows how ALE either does or doesn't augment packet memory on a per-port basis:

Diagram shows buffer boost path through ALE

If a port has buffer boost turned on, all its packets will take a trip in and out of the ALE. Cisco calls this hairpinning. If buffer boost is not on, packets between front ports on the NFE will not take the extra trip through the ALE. All the buffers in the hairpin section of the ALE are global, available to any port configured to use them. Climbing down into the weeds, the 40 MB ALE divides memory into three separate pools that are reserved for northbound, southbound and hairpin traffic. Relying on this description rather than reading the paper (or asking Cisco) is not recommended. The point, I think, is that the amount of packet memory may depend on the direction through the maze.

The 9300 has dynamic queue limits to keep one port from hogging all the buffer resources. Limit factors are selected from eleven settings that cap a single queue or port to between 1 percent [option 0] and 89 percent [option 10] of available buffer space. The default setting is option 8, so the cap is 67 percent. The 9300 also has burst profiles that help decide whether, when there is congestion, packets should be preferentially dropped from the long flow or the short flow.
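Only three of the eleven settings are quoted here: option 0 (1 percent), option 8 (67 percent) and option 10 (89 percent). The sketch below is a guess at the shape of the full table, not Cisco's documented formula. It assumes the classic dynamic-threshold rule, where a queue may grow to alpha times the free buffer space, which works out to a cap of alpha / (1 + alpha) of the pool, and it assumes alpha = 2^(option - 7). That assumption happens to reproduce the three quoted figures after rounding; the values it gives for the other options are unverified.

```python
# Hypothetical reconstruction of the dynamic limit table.
# ASSUMPTION: cap = alpha / (1 + alpha), with alpha = 2 ** (option - 7).
# This matches the three figures quoted in the text but is not taken
# from Cisco documentation.
MIN_OPTION = 0
MAX_OPTION = 10
DEFAULT_OPTION = 8


def alpha_for_option(option: int) -> float:
    """Assumed scaling factor for a given limit option (0..10)."""
    if not MIN_OPTION <= option <= MAX_OPTION:
        raise ValueError("option must be between 0 and 10")
    return 2.0 ** (option - 7)


def cap_percent(option: int = DEFAULT_OPTION) -> float:
    """Largest share of the buffer pool one queue could hold, in percent."""
    alpha = alpha_for_option(option)
    return 100.0 * alpha / (1.0 + alpha)


if __name__ == "__main__":
    for option in range(MIN_OPTION, MAX_OPTION + 1):
        print(f"option {option:2d}: about {cap_percent(option):5.1f} percent")
```

Treat the in-between rows as illustrative only and check them against the cited paper before relying on them.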

Because I need to select a number to put in the master table, I think the most extra ALE memory you can count on to absorb a microburst is 9 MBytes. You can get more under some circumstances in one direction. That is in addition to the 10 or 12 MBytes in the Trident II. I am going to say 21. So, there. Note that there are lots of piles of memory littering the landscape. It has become truly difficult to answer the question "how much memory is there?" My guess is that this is not the only example of this kind of complexity.
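For the record, the tally behind that number is just addition. The sketch below uses the figures from this page; the function and constant names are made up for illustration.

```python
# Tally of the buffer a microburst can count on, per the estimate in the text.
TRIDENT_II_MB = 12   # on-chip packet buffer (the text also mentions 10)
ALE_EXTRA_MB = 9     # extra ALE memory the author is willing to count on


def microburst_budget_mb(on_chip_mb: int = TRIDENT_II_MB,
                         ale_extra_mb: int = ALE_EXTRA_MB) -> int:
    """On-chip buffer plus the dependable share of ALE memory, in MBytes."""
    return on_chip_mb + ale_extra_mb


if __name__ == "__main__":
    print(microburst_budget_mb())      # using 12 MBytes on chip
    print(microburst_budget_mb(10))    # using the lower 10 MByte figure
```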

Miercom Report

Miercom is a test-for-hire house. They were commissioned by Cisco to do Buffer Performance Testing to show the bona fides of the Nexus relative to the Arista 7150S, which uses an Intel Fulcrum SoC.