Ethernet for the Future: Cisco’s Blueprint for AI Networks & Sustainability in Data Centers

Cisco has set out to establish Ethernet as the principal infrastructure for artificial intelligence (AI) networks, both now and in the future. With a comprehensive blueprint and a portfolio of supporting technologies, Cisco aims to expand AI network capabilities while ensuring predictable performance. A core component of this effort is the Nexus 9000 family of data center switches, which provide the bandwidth and the features required by AI and machine learning (ML) workloads.

The Role of Nexus 9000 Data Center Switches

At the forefront of Cisco’s AI blueprint, the Nexus 9000 data center switches are built on custom application-specific integrated circuits (ASICs) and support up to 25.6Tbps of bandwidth. Combining this hardware with software advances, the Nexus 9000 switches provide the low latency, congestion-management mechanisms, and telemetry needed to meet the demands of AI workloads.

Leveraging Existing Data Center Ethernet Networks

Cisco has developed a blueprint that organizations can employ to leverage their existing data center Ethernet networks to support AI workloads. By optimizing and augmenting these networks, enterprises can minimize costs and infrastructure complexity while maximizing performance and efficiency. This approach allows businesses to seamlessly integrate AI capabilities into their current network infrastructure.

Enabling Nexus AI-Based Networking

Two key technologies support AI-based networking on Nexus switches. The first is support in the NX-OS operating system for Remote Direct Memory Access over Converged Ethernet version 2 (RoCEv2), which lets servers exchange data directly between their memories across the Ethernet fabric, reducing latency and CPU overhead. The second is Explicit Congestion Notification (ECN), which marks packets rather than dropping them as queues build, so endpoints can slow down before congestion causes loss. Together, these capabilities allow Ethernet networks to carry loss-sensitive workloads such as AI and keep them flowing even during periods of congestion.
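
As an illustration of what enabling these features can look like in practice, the sketch below pushes a QoS classification policy to a Nexus switch over SSH using the open-source Netmiko library. The management address, credentials, policy name, and DSCP value are placeholders, and the commands are generic lossless-Ethernet examples rather than Cisco’s published blueprint settings.

```python
# Sketch: pushing an illustrative RoCEv2/ECN QoS policy to a Nexus 9000 over SSH.
# Assumes the open-source Netmiko library; commands and values are placeholders,
# not Cisco's published blueprint configuration.
from netmiko import ConnectHandler

SWITCH = {
    "device_type": "cisco_nxos",
    "host": "10.0.0.10",          # hypothetical management address
    "username": "admin",
    "password": "example-password",
}

# Placeholder QoS commands -- substitute the values from Cisco's validated
# design for your platform and NX-OS release.
ECN_QOS_COMMANDS = [
    "class-map type qos match-all ROCE",
    "  match dscp 26",
    "policy-map type qos AI_CLASSIFICATION",
    "  class ROCE",
    "    set qos-group 3",
]

with ConnectHandler(**SWITCH) as conn:
    output = conn.send_config_set(ECN_QOS_COMMANDS)
    print(output)
```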

Prioritizing AI Workloads

The combination of RoCEv2 and ECN allows Ethernet networks to give priority to specific workloads, such as AI applications that cannot tolerate dropped packets. Even when the network is congested, AI traffic retains its priority treatment. This keeps performance predictable and reduces the risk of data loss, enabling organizations to get the full benefit of AI technologies.
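
Conceptually, prioritization starts with classification: RoCEv2 traffic is identified and steered into a protected queue, while everything else stays best-effort. The toy function below sketches that decision; the queue names and DSCP marking are assumptions, with only the RoCEv2 UDP port (4791) being a standard value.

```python
# Conceptual sketch of the classification step: steer RoCEv2 traffic into a
# protected queue, leave other traffic in best-effort queues.
# Queue names and the DSCP value are illustrative assumptions.
from dataclasses import dataclass

ROCEV2_UDP_PORT = 4791   # IANA-assigned UDP port for RoCEv2
AI_DSCP = 26             # example DSCP marking for AI/RDMA traffic

@dataclass
class Packet:
    udp_dst_port: int
    dscp: int

def select_queue(pkt: Packet) -> str:
    """Return the egress queue for a packet (protected queue for RoCEv2/AI traffic)."""
    if pkt.udp_dst_port == ROCEV2_UDP_PORT or pkt.dscp == AI_DSCP:
        return "priority-queue"      # ECN-managed, protected from drops
    return "best-effort-queue"       # may be dropped under congestion

print(select_queue(Packet(udp_dst_port=4791, dscp=26)))   # priority-queue
print(select_queue(Packet(udp_dst_port=443, dscp=0)))     # best-effort-queue
```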

Simplifying Configurations through Automation

To streamline the implementation of the aforementioned features, Cisco has published scripts that enable customers to automate specific settings across their networks. These scripts not only facilitate the setup of the required fabric but also simplify configuration processes. By automating essential tasks, businesses can save valuable time and resources while ensuring a seamless transition to an AI-centric network infrastructure.
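
The sketch below shows the general shape of such automation: the same settings applied across a small inventory of switches in one pass. The addresses, credentials, and commands are placeholders, and real deployments would use Cisco’s published scripts rather than this illustration.

```python
# Sketch of fabric-wide automation: apply the same (illustrative) settings to
# every switch in a small inventory. Hosts, credentials, and commands are
# placeholders; use Cisco's published scripts for actual deployments.
from netmiko import ConnectHandler

INVENTORY = ["10.0.0.10", "10.0.0.11", "10.0.0.12"]   # hypothetical leaf/spine IPs

COMMON_COMMANDS = [
    "system qos",
    "  service-policy type network-qos AI_NETWORK_QOS",   # hypothetical policy name
]

for host in INVENTORY:
    device = {
        "device_type": "cisco_nxos",
        "host": host,
        "username": "admin",
        "password": "example-password",
    }
    with ConnectHandler(**device) as conn:
        print(f"--- {host} ---")
        print(conn.send_config_set(COMMON_COMMANDS))
```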

Optimizing RoCEv2 Transport with Telemetry Capabilities

Nexus 9000 switches are equipped with built-in telemetry capabilities, which assist in correlating network issues and optimizing RoCEv2 transport. Telemetry enables real-time monitoring and analysis, offering insights into network performance and aiding in fine-tuning operations. By leveraging this powerful feature, organizations can detect and resolve network bottlenecks, ensuring optimal performance for AI workloads.
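
As a rough illustration of how telemetry data might be put to work, the sketch below scans an exported snapshot of per-interface counters and flags interfaces that show congestion symptoms relevant to RoCEv2 transport. The field names, values, and threshold are hypothetical; real payloads depend on the telemetry sensor paths configured on the switch.

```python
# Sketch: scan a telemetry snapshot (exported as JSON) and flag interfaces
# showing signs of congestion that could affect RoCEv2 transport.
# Field names and the threshold are hypothetical assumptions.
import json

SAMPLE = json.loads("""
[
  {"interface": "Ethernet1/1", "ecn_marked_pkts": 120450, "out_discards": 0},
  {"interface": "Ethernet1/2", "ecn_marked_pkts": 310,    "out_discards": 0},
  {"interface": "Ethernet1/3", "ecn_marked_pkts": 987654, "out_discards": 42}
]
""")

ECN_THRESHOLD = 100_000   # illustrative threshold, tune per environment

for entry in SAMPLE:
    congested = entry["ecn_marked_pkts"] > ECN_THRESHOLD or entry["out_discards"] > 0
    if congested:
        print(f"{entry['interface']}: possible congestion "
              f"(ECN marks={entry['ecn_marked_pkts']}, drops={entry['out_discards']})")
```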

Introducing Cisco’s High-End Silicon One Processors

In addition to Nexus switches, Cisco has introduced new high-end programmable Silicon One processors targeting large-scale AI/ML infrastructure for enterprises and hyperscalers. These processors, the 5nm 51.2Tbps Silicon One G200 and the 25.6Tbps G202, expand the existing Silicon One family to 13 members. Engineered for massive AI workloads, they provide the switching capacity organizations need to scale their AI infrastructure.

Establishing a Scheduled Fabric for Enhanced Bandwidth

By combining enhanced Ethernet technologies like RoCEv2, ECN, and Silicon One processors, businesses can create what Cisco terms a “Scheduled Fabric.” In this architecture, the fabric’s switching elements coordinate their scheduling decisions with one another so that traffic is spread across all available paths, significantly increasing usable bandwidth, especially for data-intensive AI and ML flows, and allowing organizations to make full use of their AI infrastructure.
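
The toy model below gives an intuition for the benefit. It contrasts pinning each large flow to a single fabric link (the worst case for per-flow hashing) with spreading each flow evenly across all links, as a scheduled fabric does. The link counts and flow sizes are made up, and this is a conceptual sketch rather than a model of Cisco’s implementation.

```python
# Toy model: compare worst-case per-flow hashing with spreading each flow
# evenly across a fabric of equal-speed links. All numbers are illustrative;
# this is a conceptual sketch, not a model of Cisco's implementation.
FABRIC_LINKS = 4
LINK_CAPACITY_GBPS = 100
FLOWS_GBPS = [90, 90]            # two elephant flows, typical of AI training

def per_flow_hashing_worst_case(flows):
    """Worst case for flow hashing: both elephant flows land on the same link."""
    links = [0.0] * FABRIC_LINKS
    for f in flows:
        links[0] += f            # hash collision pins both flows to link 0
    return links

def even_spreading(flows):
    """Each flow is split evenly over all fabric links."""
    links = [0.0] * FABRIC_LINKS
    for f in flows:
        for j in range(FABRIC_LINKS):
            links[j] += f / FABRIC_LINKS
    return links

for name, load in [("per-flow hashing (worst case)", per_flow_hashing_worst_case(FLOWS_GBPS)),
                   ("even spreading", even_spreading(FLOWS_GBPS))]:
    worst = max(load)
    print(f"{name}: per-link load {load} Gbps, "
          f"hotspot utilization {worst / LINK_CAPACITY_GBPS:.0%}")
```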

Cisco’s mission to establish Ethernet as the chief underpinning for AI networks shows its commitment to advancing the field. With the Nexus 9000 data center switches, advanced Ethernet technologies, automation capabilities, and high-end Silicon One processors, Cisco is helping organizations unlock the potential of AI. By prioritizing AI workloads, simplifying configuration, and optimizing network efficiency, businesses can achieve greater performance, scalability, and innovation in artificial intelligence.
