Rockport Networks Launches 300 Gbps Switchless Fabric, Reveals 396-Node Deployment at TACC

This article by Tiffany Trader was originally published October 27 on www.hpcwire.com.

Rockport Networks emerged from stealth this week with the launch of its 300 Gbps switchless networking architecture, focused on the needs of the high-performance computing and advanced-scale AI markets. Early customers include the Texas Advanced Computing Center (TACC), which has installed the networking technology across part of its Frontera system, and DiRAC/Durham University, which is also using the networking gear. The high-performance networking group at Ohio State is also engaged with Rockport, lending its expertise with standards support.

Rockport’s distributed switching capability is implemented by its patented rNOS software, the network operating system that runs across the network cards. The software consumes no server resources and is invisible to the server, which sees only a high-performance Ethernet NIC. Network functions are distributed down to each node, and the nodes are directly connected to one another over passive cabling. There is a distributed control plane and a distributed routing plane, and nodes are self-discovering, self-configuring and self-healing, according to Rockport. The software determines the best path through the network to minimize congestion and latency, while breaking packets down into smaller pieces (Rockport calls these FLITs) to ensure high-priority messages are not blocked by bulk data.


In addition to rNOS, the Rockport Networks solution consists of three parts:

  • The Rockport NC1225: This half-height, half-length PCI Express card is the only active component of the Rockport solution. It replaces standard network interface cards (NICs) and plugs directly into servers or storage enclosures. The card has three elements: a standard Ethernet host interface; an FPGA, which enables upgradeable firmware; and embedded optics. The NC1225 implements the host interface as well as the distributed switching by running Rockport’s rNOS software in the embedded hardware, and it “adaptively aggregates the bandwidth of multiple parallel network paths, drawing from 300 Gbps of available network capacity,” according to the company.
  • The Rockport SHFL: The SHFL is a new passive cabling invention consisting of a box that is pre-wired to form a topology and distribute links correctly, either within the rack or across racks. The cables themselves are commodity cables, specifically MTP/MPO-24 to the SHFL and MTP/MPO-32 between SHFLs. This is standard passive fiber optic cabling, and all the optics are on the endpoint cards. To be clear, SHFLs have no power and no electronics. One model has 24 ports, but Rockport offers different form factors for different network topologies at different scales. The maximum distance for a link is about 100 meters, but due to the mesh connectivity, the network will still operate even if some links exceed that distance.
  • The Rockport Autonomous Network Manager (ANM): The ANM provides high-level and granular insights into the active network. It continuously monitors all aspects of the network and stores historical data for 60 days with the most recent seven days collected and presented at high fidelity.

The product that is currently shipping is based on an advanced version of a 6D torus with high path diversity, according to Rockport Chief Technology Officer Matt Williams. It supports up to 1,500 nodes at present, but the architecture is designed to scale beyond 100,000 nodes, leveraging topologies like Dragonfly, the CTO said.
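A plain 6D torus gives a sense of where the path diversity comes from: every node has a direct link to its two neighbors along each of six dimensions, with wrap-around at the edges. The sketch below computes those neighbors for a small hypothetical torus; the dimension sizes are invented for illustration and do not reflect Rockport's actual wiring, which the company describes as more advanced.

```python
# Illustrative sketch: direct neighbors of a node in a 6D torus
# (dimension sizes are hypothetical, not Rockport's topology).

def torus_neighbors(coord, dims):
    """Return the set of direct neighbors of `coord` in a torus whose
    per-dimension sizes are `dims`, with wrap-around in every dimension."""
    neighbors = set()
    for axis in range(len(dims)):
        for step in (-1, 1):
            n = list(coord)
            n[axis] = (n[axis] + step) % dims[axis]
            neighbors.add(tuple(n))
    return neighbors

dims = (3, 3, 3, 3, 3, 3)  # a small hypothetical 6D torus: 3**6 = 729 nodes
nbrs = torus_neighbors((0, 0, 0, 0, 0, 0), dims)
print(len(nbrs))  # 12 direct links per node when every dimension size is >= 3
```

With two directions in each of six dimensions, each node has up to 12 direct links, so traffic between any pair of nodes has many alternative routes to steer around congestion.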

Matthew Williams of Rockport Networks

To test and validate its solution, Rockport Networks has been working with the Texas Advanced Computing Center (TACC) in Austin for about a year. Under the auspices of its new Rockport Center of Excellence, TACC recently installed Rockport networking across 396 nodes of its Frontera supercomputer. (The ~8,000-node Dell system, ranked number ten on the Top500 list, uses Nvidia-Mellanox HDR InfiniBand as its primary interconnect.) The Rockport-connected nodes are being leveraged for production science in support of quantum computing research, pandemic-related research and urgent response computing, addressing disruptive weather events and other large-scale disasters.

“TACC is very pleased to be a Rockport Center of Excellence. We run diverse advanced computing workloads which rely on high-bandwidth, low-latency communication to sustain performance at scale,” stated Dan Stanzione, director of TACC and associate vice president for research at UT-Austin. “We’re excited to work with innovative new technology like Rockport’s switchless network design.

“Our team is seeing promising initial results in terms of congestion and latency control. We’ve been impressed by the simplicity of installation and management. We look forward to continuing to test on new and larger workloads and expanding the Rockport Switchless Network further into our datacenter,” he added.

Williams reported that the Rockport installation at TACC took only a week and a half to complete. “It’s literally a two-step process,” he said. “Plug in the card, and plug in the cable.”

Williams told HPCwire that customers are seeing an average improvement of 28 percent over InfiniBand and a 3X decrease in end-to-end latency at scale, running their applications under load. “Under load, we have the better overall performance and deliver a consistently better workload completion time. Every workload is different, you’re not always going to see 28 percent. Sometimes we will be higher or lower, depending on how sensitive that workload is to network conditions. But on average, we’re seeing about 28 percent.”

He clarified that the four tests behind those figures compared the Rockport solution against 100 Gbps InfiniBand networking, but said they are seeing “very similar results” in internal testing against 200 Gbps InfiniBand. The top-listed HPC workload employs a moving mesh hydrodynamics code.

Pressed on the methodology and comparisons, Williams said, “the important thing about how we define performance is it’s in production, it’s under load. A lot of traditional network vendors like to focus on unloaded raw baseline performance or infrastructure. But when you deploy them in production, and you have multiple workloads running through this mix of bandwidth and latency sensitive workloads, you start to see tremendous degradation in performance from what you saw in the baseline tests. So we always talk about how we function, how we perform in a loaded environment, like you’ll see in a multi workload production environment.”

The Rockport network technology has been in trials with customers and is now production-ready at scale, according to Williams. HPC, AI and machine learning are beachhead markets with the company targeting high-performance applications that are very sensitive to network performance, primarily latency, but that also have a need for consistent bandwidth performance.

“It’s a lossless solution but we still leverage standard host interfaces, so in order to test or deploy our solution, our customers just remove the existing IB card, or an Ethernet NIC in some cases, and replace it with our card,” said Williams. “None of the software changes; none of the drivers even change. We appear to be a standard Ethernet NIC interface with all the advanced offloads that provides.”

The solution that is shipping to customers is the same as the one installed at TACC. Unlike traditional HPC network infrastructure, which prioritizes node connectivity within a rack, the Rockport setup directly connects nodes across racks, making the network less sensitive to physical location. Williams noted that the TACC deployment spans 11 racks of equipment across the datacenter, providing direct connections over that distance.

The announcement garnered support from HPC analyst firm Hyperion Research.

“There’s been significant evidence that would suggest that switchless architectures have the capacity to significantly up level application performance that traditionally has come at great cost,” stated Earl C. Joseph, CEO, Hyperion Research as part of the news launch. “Making these advances more economically accessible should greatly benefit the global research community and hopefully improve expectations relative to what we can expect from the network when it comes to return-on-research and time-to-results.”

Statements of support were also issued by DiRAC at Durham University and Ohio State University’s Network-based Computing Lab.

“The team at Durham continues to push the bounds when it comes to uncovering next-generation HPC network technologies,” said Alastair Basden, DiRAC/Durham University, technical manager of the COSMA HPC Cluster. “Based on a 6D torus, we found the Rockport Switchless Network to be remarkably easy to set up and install. We looked at codes that rely on point-to-point communications between all nodes with varying packet sizes where – typically – congestion can reduce performance on traditional networks. We were able to achieve consistent low latency under load and look forward to seeing the impact this will have on even larger-scale cosmology simulations.”

“Our mission is to provide the advanced computing community with standard libraries such as MVAPICH2 that support the very best possible performance available in the market. We make it a top priority to keep our libraries fresh with innovative approaches, like Rockport Networks’ new switchless architecture,” said DK Panda, professor and distinguished scholar of computer science at the Ohio State University, and lead for the Network-Based Computing Research Group. “We look forward to our ongoing partnership with Rockport to define new standards for our upcoming releases.”
