With a deep background in artificial intelligence, machine learning, and the underlying infrastructure that powers them, Dominic Jainy has spent his career at the intersection of breakthrough technology and real-world application. As the data center industry grapples with an explosion in AI demand, we sat down with him to dissect Nvidia’s latest bombshell, the Rubin platform. Our conversation explores the seismic shift in cost efficiency promised by this new architecture, the strategic brilliance behind its “fewer but smarter” GPU design, and what Nvidia’s accelerated one-year release cycle means for competitors and data center operators alike.
Nvidia claims Rubin’s run costs will be one-tenth of Blackwell’s. Can you break down how the new Vera CPU and other chips in the platform achieve this? Please provide specific metrics or a step-by-step example of how data centers will realize these savings.
That one-tenth figure is the headline, and it’s a staggering claim. There’s no single silver bullet behind it; it comes from a complete platform-level redesign. Nvidia is orchestrating a symphony of new silicon here. You have the Rubin GPU doing the heavy lifting, but it’s supported by the new Vera CPU, the NVLink 6 switch for incredibly fast interconnects, the ConnectX-9 SuperNIC, and even the Spectrum-6 Ethernet switch. The philosophy is to spread the workload intelligently across specialized chips so that no single component becomes a bottleneck. For a data center, the savings materialize in stages: training times are slashed, which means less power consumed over the life of a job, and the cost per inference token plummets. It’s a shift from thinking about the cost of a single GPU to the total cost of ownership for generating an AI-driven result.
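To make the arithmetic behind that shift concrete, here is a minimal back-of-the-envelope sketch in Python. Every figure in it is an invented placeholder, not a published Blackwell or Rubin specification; the point is the structure of the calculation, where cost per token falls out of amortized hardware cost, power draw, and sustained throughput rather than the sticker price of a single GPU.

```python
# Illustrative total-cost-of-ownership sketch. All inputs are
# hypothetical placeholders, not real Nvidia specifications.

def cost_per_million_tokens(capex_usd, lifetime_years, power_kw,
                            usd_per_kwh, tokens_per_second, utilization):
    """Rough cost per million inference tokens for one rack-scale system."""
    hours = lifetime_years * 365 * 24
    amortized_capex_per_hour = capex_usd / hours
    energy_cost_per_hour = power_kw * usd_per_kwh
    tokens_per_hour = tokens_per_second * 3600 * utilization
    total_cost_per_hour = amortized_capex_per_hour + energy_cost_per_hour
    return total_cost_per_hour / tokens_per_hour * 1_000_000

# A hypothetical previous-generation rack versus a more efficient
# successor that costs more up front but generates tokens far faster.
baseline = cost_per_million_tokens(
    capex_usd=3_000_000, lifetime_years=4, power_kw=120,
    usd_per_kwh=0.08, tokens_per_second=500_000, utilization=0.6)
successor = cost_per_million_tokens(
    capex_usd=3_500_000, lifetime_years=4, power_kw=130,
    usd_per_kwh=0.08, tokens_per_second=4_000_000, utilization=0.6)

print(f"baseline:  ${baseline:.4f} per 1M tokens")
print(f"successor: ${successor:.4f} per 1M tokens")
print(f"ratio: {baseline / successor:.1f}x cheaper per token")
```

Notice that the successor wins on cost per token even though its rack costs more and draws more power; throughput and utilization dominate the equation, which is exactly the total-cost-of-ownership framing described above.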
The Rubin platform uses just 72 GPUs per system—a quarter of the previous generation—while targeting complex agentic AI. Walk us through the architectural philosophy behind this “fewer GPUs” approach and how it specifically enhances performance for demanding reasoning models.
This is one of the most fascinating aspects of the Rubin announcement. For years, the mantra was simply “more GPUs.” Now, Nvidia is showing us a more elegant path forward. By packing a system with only 72 GPUs, a quarter of what we saw with Blackwell, they are directly tackling the biggest headaches for data center operators: power density and cooling. It feels like a gutsy, confident move. This isn’t a reduction in capability; it’s a massive leap in efficiency. For demanding workloads like agentic AI and complex reasoning models, raw teraflops aren’t the only thing that matters. You need an architecture, like their new Vera Rubin NVL72 rack-scale solution, that can handle intricate data pathways and constant communication between nodes without faltering. It’s about building a more balanced and responsive system, which is precisely what these sophisticated new AI models require to function effectively.
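A toy comparison helps show why fewer, larger GPUs can ease the operator’s burden. The sketch below, with entirely invented specifications, holds total compute constant while varying GPU count; note how the number of GPU pairs the fabric must connect grows quadratically with the count.

```python
# Toy comparison of two system designs with equal total compute.
# Every specification here is invented for illustration; none are
# real Blackwell or Rubin numbers.
from dataclasses import dataclass

@dataclass
class SystemDesign:
    name: str
    gpu_count: int
    pflops_per_gpu: float   # dense compute per GPU (hypothetical)
    watts_per_gpu: float    # board power per GPU (hypothetical)

    @property
    def total_pflops(self) -> float:
        return self.gpu_count * self.pflops_per_gpu

    @property
    def total_kw(self) -> float:
        return self.gpu_count * self.watts_per_gpu / 1000

    @property
    def gpu_pairs(self) -> int:
        # Communication pairs grow quadratically with GPU count, which
        # is why consolidating into fewer, larger GPUs eases the fabric.
        n = self.gpu_count
        return n * (n - 1) // 2

many_small = SystemDesign("many small GPUs", 288, 5.0, 700)
few_large = SystemDesign("fewer large GPUs", 72, 20.0, 1800)

for d in (many_small, few_large):
    print(f"{d.name}: {d.total_pflops:.0f} PFLOPS, "
          f"{d.total_kw:.0f} kW, {d.gpu_pairs} GPU pairs to connect")
```

In this made-up example, the 72-GPU design delivers the same aggregate compute at lower total board power and with roughly sixteen times fewer communication pairs to manage, which is the balance-and-responsiveness argument in miniature.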
With Rubin, Nvidia has accelerated its release cadence to a yearly cycle. From a strategic perspective, how does this pace strengthen the CUDA software “moat,” and what are the step-by-step implications for data center operators trying to plan their hardware refresh cycles?
The shift to a yearly cycle is a masterstroke of competitive strategy. As analyst Stephen Sopko noted, the real barrier for competitors isn’t just the chip; it’s the sprawling CUDA software ecosystem. By releasing groundbreaking hardware every year, Nvidia ensures that the entire developer community is constantly optimizing for their latest and greatest platform. This creates an incredible gravitational pull, making it nearly impossible for anyone else to break that developer loyalty. For data center operators, this is a profound change. First, they must accept that their state-of-the-art hardware will be “out of date” in just 12 months, as Gartner’s Tony Harvey put it. This forces a complete re-evaluation of procurement strategy. Budgets will need to be more flexible, and IT teams will have to get comfortable managing a mix of hardware generations. It accelerates the entire industry, but it also locks customers even more tightly into Nvidia’s world.
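One way to see the planning implication is to count how many hardware generations coexist in a fleet. The short sketch below assumes a hypothetical four-year service life and one new generation per year; both numbers are assumptions for illustration, not vendor or Gartner guidance.

```python
# Fleet-mix sketch under a one-year release cadence. The four-year
# service life is a hypothetical assumption, not vendor guidance.

def generations_in_service(year, service_life_years):
    """Generations still running in a given year, assuming one new
    generation ships every year and each is kept for its full life."""
    oldest = max(0, year - service_life_years + 1)
    return [f"gen-{g}" for g in range(oldest, year + 1)]

for year in range(6):
    live = generations_in_service(year, service_life_years=4)
    print(f"year {year}: {len(live)} generations in service -> {live}")
```

Under those assumptions, an operator reaches a steady state of four generations running side by side, which is why procurement budgets and operational tooling both need to get comfortable with a permanently mixed fleet.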
Analysts suggest the AI “pie is expanding” for all chipmakers, yet competition is growing. Can you describe how Rubin’s specific features, like its new NVLink 6 switch, directly counter offerings from rivals, and share an anecdote about why this matters for large-scale AI deployments?
I completely agree with the “expanding pie” view; the demand is so immense that both Nvidia and AMD are selling everything they can produce. But Nvidia isn’t taking its market position for granted. While they still hold up to 90% of the market, they are using platforms like Rubin to redefine the terms of competition. It’s no longer just about the GPU. A feature like the NVLink 6 switch is the perfect example. It acts as the central nervous system for the entire rack, allowing those 72 GPUs to communicate as a single, cohesive unit. A competitor might offer a powerful GPU, but if it can’t be scaled effectively in a large deployment, it’s a non-starter. I’ve seen projects where incredible amounts of GPU power were left stranded, sitting idle because the interconnects created a traffic jam. Nvidia is solving that problem at a fundamental level, telling the market that if you want to build truly massive, efficient AI systems, you need their entire, tightly integrated stack.
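That traffic-jam effect is easy to model. Here is a deliberately simplified sketch of GPU utilization when communication cannot overlap with compute; the bandwidth and workload figures are invented and are not NVLink 6 or any vendor’s published numbers.

```python
# Toy model of GPU utilization when the interconnect is the bottleneck.
# Bandwidth and workload figures are invented, not vendor specifications.

def effective_utilization(compute_time_s, bytes_exchanged, fabric_gbps):
    """Fraction of each step spent computing rather than waiting on the
    fabric, assuming no overlap of communication and compute (worst case)."""
    comm_time_s = bytes_exchanged / (fabric_gbps * 1e9 / 8)
    return compute_time_s / (compute_time_s + comm_time_s)

step_compute_s = 0.10   # compute per training step (hypothetical)
step_bytes = 50e9       # data exchanged per step (hypothetical)

for gbps in (400, 1600, 6400):
    u = effective_utilization(step_compute_s, step_bytes, gbps)
    print(f"{gbps:>5} Gb/s fabric -> {u:.0%} GPU utilization")
```

With the slow fabric in this toy model, the GPUs sit idle roughly ninety percent of the time, which is precisely the stranded-capacity problem a faster rack-scale interconnect is meant to solve.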
What is your forecast for the AI data center hardware market over the next three to five years, especially concerning the balance between raw performance gains and rising sustainability pressures?
Over the next three to five years, I foresee the market being defined by an intense push-and-pull between raw performance and sustainability. The demand for AI compute is, as Jensen Huang said, “going through the roof,” and that won’t stop. We’ll continue to see this incredible race to push the limits of chip design and manufacturing. However, the industry is hitting a very real wall when it comes to power and cooling. You simply can’t keep doubling power density indefinitely. This is why platforms like Rubin, which market efficiency as a core feature, are the blueprint for the future. The key metric will shift from pure teraflops to performance-per-watt. The winners in this next era won’t just be the ones with the fastest chip; they will be the companies that provide a holistic solution that helps customers manage their soaring operational costs and environmental impact. Sustainability is rapidly moving from a corporate social responsibility checkbox to a mission-critical design constraint for the entire data center ecosystem.
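That metric shift is worth making explicit. In the sketch below, built from fictional chips, the accelerator that wins on raw throughput loses once you rank by performance per watt; all names and figures are invented.

```python
# Ranking fictional accelerators by performance-per-watt instead of
# raw throughput. All names and figures are invented placeholders.

chips = [
    {"name": "chip-A", "pflops": 10.0, "watts": 1000},
    {"name": "chip-B", "pflops": 18.0, "watts": 2200},
    {"name": "chip-C", "pflops": 14.0, "watts": 1200},
]

for c in chips:
    c["pflops_per_kw"] = c["pflops"] / (c["watts"] / 1000)

print("by raw performance: ",
      [c["name"] for c in sorted(chips, key=lambda c: -c["pflops"])])
print("by performance/watt:",
      [c["name"] for c in sorted(chips, key=lambda c: -c["pflops_per_kw"])])
```

The ordering flips: the biggest chip tops the raw-throughput list but drops to last place on efficiency, which is the lens operators facing power and cooling limits will increasingly apply.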
