Nvidia is planning to implement light-based communication between its artificial intelligence GPUs by 2026, utilizing silicon photonics interconnects with co-packaged optics (CPO) in its next-generation rack-scale AI platforms to achieve higher transfer rates at reduced power consumption.
At the Hot Chips conference, Nvidia shared further details about its upcoming Quantum-X and Spectrum-X photonics interconnects, which are expected to arrive in 2026. They represent a significant shift toward optical interconnects to handle the growing data-transfer demands of large AI GPU clusters.
Nvidia’s developmental timeline is expected to closely mirror TSMC’s COUPE (Compact Universal Photonic Engine) roadmap, which is structured into three distinct phases. The initial phase involves an optical engine designed for OSFP connectors, facilitating data transfers of 1.6 Tb/s while simultaneously lowering power consumption. The second phase transitions to CoWoS packaging incorporating co-packaged optics, thereby achieving 6.4 Tb/s data transfer rates at the motherboard level. The third phase focuses on achieving 12.8 Tb/s within processor packages, with the objective of further decreasing both power usage and latency.
The necessity for CPO stems from the challenges associated with interconnecting thousands of GPUs in large-scale AI clusters, requiring them to operate as a unified system. This architecture necessitates modifications to traditional networking configurations. Specifically, instead of each rack having its own Tier-1 (Top-of-Rack) switch connected by short copper cables, the switches are relocated to the end of the row. This configuration establishes a consistent, low-latency fabric spanning multiple racks. This relocation increases the distance between servers and their primary switch, rendering copper cables impractical for high speeds such as 800 Gb/s. Consequently, optical connections become essential for nearly all server-to-switch and switch-to-switch links.
The use of pluggable optical modules in such environments presents inherent limitations. In these designs, data signals exit the Application-Specific Integrated Circuit (ASIC), traverse the board and connectors, and are subsequently converted to light. This process introduces significant electrical loss, reaching approximately 22 decibels on 200 Gb/s channels. Compensation for this loss requires complex processing, which increases per-port power consumption to 30W. This, in turn, necessitates additional cooling and introduces potential points of failure. Nvidia asserts that these issues become increasingly problematic as the scale of AI deployments expands.
CPO mitigates the drawbacks associated with traditional pluggable optical modules by integrating the optical conversion engine directly alongside the switch ASIC. This proximity allows the signal to be coupled to fiber almost immediately, bypassing the need to travel over extended electrical traces. As a result, electrical loss is reduced to 4 decibels, and per-port power consumption decreases to 9W. This arrangement also eliminates numerous components that could potentially fail, simplifying the implementation of optical interconnects.
Nvidia asserts that abandoning conventional pluggable transceivers and integrating optical engines directly into the switch silicon, enabled by TSMC’s COUPE platform, yields substantial gains in efficiency, reliability, and scalability. Compared with pluggable modules, the company credits CPO with a 3.5 times improvement in power efficiency, a 64 times improvement in signal integrity, 10 times higher resilience thanks to the reduced number of active devices, and roughly 30% faster deployment owing to simpler assembly and servicing.
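The headline numbers above can be cross-checked against the per-port figures quoted earlier. A minimal Python sketch, using only the values reported in this article; note that relating the 18 dB loss reduction to the "64 times" signal-integrity claim is our inference, not Nvidia's stated derivation:

```python
# Per-port figures reported by Nvidia for 200 Gb/s-class channels
pluggable_loss_db, cpo_loss_db = 22.0, 4.0   # electrical loss, dB
pluggable_power_w, cpo_power_w = 30.0, 9.0   # per-port power, W

# The 18 dB reduction in electrical loss, expressed as a linear power
# ratio (10^(dB/10)), lands very close to the quoted 64x figure.
loss_delta_db = pluggable_loss_db - cpo_loss_db   # 18.0 dB
linear_ratio = 10 ** (loss_delta_db / 10)         # ~63.1

# The per-port wattage alone gives ~3.3x; Nvidia's headline 3.5x
# presumably also folds in savings beyond the port itself.
power_ratio = pluggable_power_w / cpo_power_w     # ~3.33
```

The near-match between 18 dB and 64x (10 * log10(64) ≈ 18.06 dB) suggests the signal-integrity claim is simply the loss reduction restated as a linear ratio.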
Nvidia plans to introduce CPO-based optical interconnection platforms for both Ethernet and InfiniBand technologies. The company anticipates launching Quantum-X InfiniBand switches in early 2026. Each switch is designed to provide 115 Tb/s of throughput, accommodating 144 ports operating at 800 Gb/s each. The system also incorporates an ASIC featuring 14.4 TFLOPS of in-network processing and supports Nvidia’s 4th Generation Scalable Hierarchical Aggregation Reduction Protocol (SHARP), aimed at reducing latency for collective operations. These switches will utilize liquid cooling.
Concurrently, Nvidia is preparing to integrate CPO into Ethernet through its Spectrum-X Photonics platform, scheduled for release in the second half of 2026. This platform will be based on the Spectrum-6 ASIC, which will power two distinct devices: the SN6810, offering 102.4 Tb/s of bandwidth across 128 ports at 800 Gb/s, and the SN6800, which scales to 409.6 Tb/s and 512 ports operating at the same rate. Both devices will also employ liquid cooling.
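The quoted switch bandwidths follow directly from port count multiplied by per-port rate. A quick sketch using the figures from this article (the helper name is ours, purely illustrative):

```python
def aggregate_tbps(ports: int, gbps_per_port: int) -> float:
    """Aggregate switch throughput in Tb/s (1 Tb/s = 1000 Gb/s)."""
    return ports * gbps_per_port / 1000

quantum_x = aggregate_tbps(144, 800)  # 115.2, quoted as "115 Tb/s"
sn6810 = aggregate_tbps(128, 800)     # 102.4 Tb/s
sn6800 = aggregate_tbps(512, 800)     # 409.6 Tb/s
```

The SN6800's fourfold bandwidth over the SN6810 comes entirely from port count; the per-port rate is the same 800 Gb/s across all three switches.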
Nvidia envisions that its CPO-based switches will drive new AI clusters designed for generative AI applications, which are becoming increasingly large and complex. By utilizing CPO, these clusters will eliminate thousands of discrete components, resulting in faster installation times, easier servicing, and reduced power consumption per connection. Consequently, clusters utilizing Quantum-X InfiniBand and Spectrum-X Photonics are expected to demonstrate improvements in metrics such as time-to-turn-on, time-to-first-token, and overall long-term reliability.
Nvidia emphasizes that co-packaged optics are not simply an optional enhancement but a fundamental requirement for future AI data centers. This suggests the company intends to position its optical interconnects as a key differentiator against rack-scale AI solutions from competitors such as AMD, whose acquisition of silicon photonics startup Enosemi signals a similar bet on optical interconnects.
A critical aspect of Nvidia’s silicon photonics initiative is its close alignment with the evolution of TSMC’s COUPE platform: as TSMC’s platform advances over the coming years, Nvidia’s CPO platforms are expected to improve in step. The first generation of COUPE stacks a 65nm electronic integrated circuit (EIC) on a photonic integrated circuit (PIC) using TSMC’s SoIC-X packaging technology.