As AI infrastructure scales at an unprecedented rate, a number of outdated assumptions keep resurfacing – especially when it comes to the role of networking in large-scale training and inference systems. Many of these myths are rooted in technologies that worked well for small clusters. But today's systems are scaling to hundreds of thousands – and soon, millions – of GPUs, and those older models no longer apply. Let's walk through some of the most common myths – and why Ethernet has clearly emerged as the foundation for modern AI networking.

Myth 1: You cannot use Ethernet for high-performance AI networks

This myth has already been busted. Ethernet is now the de facto networking technology for AI at scale. Most, if not all, of the largest GPU clusters deployed in the past year have used Ethernet for scale-out networking. Ethernet delivers performance that matches or exceeds what alternatives like InfiniBand offer, while providing a stronger ecosystem, broader vendor support, and faster innovation cycles.

InfiniBand, for example, wasn't designed for today's scale. It's a legacy fabric being pushed beyond its original purpose. Meanwhile, Ethernet is thriving: multiple vendors are shipping 51.2T switches, and Broadcom recently introduced Tomahawk 6, the industry's first 102.4T switch. Ecosystems for optical and electrical interconnect are also mature, and clusters of 100K GPUs and beyond are now routinely built on Ethernet. (A back-of-the-envelope look at what that switch capacity means for fabric size appears after Myth 4 below.)

Myth 2: You need separate networks for scale-up and scale-out

This was acceptable when GPU nodes were small. Legacy scale-up links originated in an era when connecting two or four GPUs was enough. Today, scale-up domains are expanding rapidly. You're no longer connecting four GPUs – you're designing systems with 64, 128, or more in a single scale-up cluster. And that's where Ethernet, with its proven scalability, becomes the obvious choice.

Using separate technologies for local and cluster-wide interconnect only adds cost, complexity, and risk. What you want is the opposite: a single, unified network that supports both. That's exactly what Ethernet delivers, along with interface fungibility, simplified operations, and an open ecosystem. To accelerate this interface convergence, we've contributed the Scale-Up Ethernet (SUE) framework to the Open Compute Project, helping the industry standardize around a single AI networking fabric.

Myth 3: You need proprietary interconnects and exotic optics

This is another holdover from a different era. Proprietary interconnects and tightly coupled optics may have worked for small, fixed systems, but today's AI networks demand flexibility and openness. Ethernet gives you options: third-generation co-packaged optics (CPO), module-based retimed optics, linear-drive optics, and the longest-reach passive copper. You're not locked into one solution. You can tailor your interconnect to your power, performance, and economic goals – with full ecosystem support.

Myth 4: You need proprietary NIC features for AI workloads

Some AI networks rely on programmable, high-power NICs to support features like congestion control or traffic spraying. But in many cases, that's just masking limitations in the switching fabric. Modern Ethernet switches – like Tomahawk 5 and 6 – integrate load balancing, rich telemetry, and failure resiliency directly into the switch. That reduces cost and power consumption, freeing up the power budget for what matters most: your GPUs and XPUs. Looking ahead, the trend is clear: NIC functions will increasingly be embedded into XPUs. The smarter strategy is to simplify, not over-engineer.
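To put the 51.2T and 102.4T figures cited in Myth 1 in perspective, here is a rough, back-of-the-envelope sketch in Python of how switch capacity translates into fabric size. The 800G port speed and the non-blocking two-tier leaf/spine layout are illustrative assumptions, not vendor specifications; real deployments vary in port speeds, oversubscription, and tiering.

```python
# Illustrative sketch only: derive a switch's radix from its total capacity and
# estimate how many endpoints a non-blocking two-tier (leaf/spine) fabric built
# from such switches can attach. Assumes 800G ports; real designs differ.

def fabric_size(switch_capacity_tbps: float, port_speed_gbps: float = 800) -> dict:
    radix = int(switch_capacity_tbps * 1000 // port_speed_gbps)  # ports per switch
    leaf_downlinks = radix // 2        # half of each leaf's ports face endpoints
    max_leaves = radix                 # each spine port connects to one leaf
    endpoints = leaf_downlinks * max_leaves
    return {"radix": radix, "two_tier_endpoints": endpoints}

if __name__ == "__main__":
    for capacity in (51.2, 102.4):     # Tb/s, the switch classes cited above
        print(f"{capacity}T:", fabric_size(capacity))
```

Under these assumptions, doubling switch capacity quadruples the endpoints reachable in two tiers (roughly 2K at 51.2T versus 8K at 102.4T), which is part of why higher-radix switches enable the flatter topologies discussed in Myth 5.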
Myth 5: You have to match your network to your GPU vendor

There's no good reason for this. The most advanced GPU clusters in the world – deployed at the largest hyperscalers – run on Ethernet. Why? Because it enables flatter, more efficient network topologies. It's vendor-neutral. And it supports innovation – from AI-optimized collective libraries to workload-specific tuning at both the scale-up and scale-out levels.

Ethernet is a standards-based, well-understood technology with a vibrant ecosystem of partners. It allows AI clusters to scale more easily, completely decoupled from the choice of GPU or XPU, delivering an open, scalable, and power-efficient system.

The bottom line

Networking used to be an afterthought. Now it's a strategic enabler of AI performance, efficiency, and scalability. If your architecture is still built around assumptions from five years ago, it's time to rethink them. The future of AI is being built on Ethernet – and that future is already here.

Explore more about Ethernet technology and learn more about Merchant Silicon.

About Ram Velaga, Broadcom

Ram Velaga is Senior Vice President and General Manager of the Core Switching Group at Broadcom, responsible for the company's extensive Ethernet switch portfolio serving broad markets including the service provider, data center, and enterprise segments. Prior to joining Broadcom in 2012, he served in a variety of product management roles at Cisco Systems, including Vice President of Product Management for the Data Center Technology Group. Mr. Velaga earned an M.S. in Industrial Engineering from Penn State University and an M.B.A. from Cornell University. He holds patents in communications and virtual infrastructure.