Dominic Jainy has spent years at the intersection of AI, machine learning, and blockchain, helping teams turn ambitious ideas into production systems. He’s guided data center strategies through tight supply cycles, shifting workloads, and the rise of inference and agentic AI. In this conversation, he unpacks what’s behind the latest surge in AI-driven demand, why CPUs are back in the spotlight, how packaging and foundry bets will shape the next wave—and where practitioners should place their next dollar.
Intel’s shares spiked on stronger AI data center demand; what specific customer behaviors or deal types are moving the needle, and can you share anecdotes or metrics that illustrate how orders have changed quarter to quarter?
The spike aligns with customers converting multi-quarter pilots into committed production ramps, often bundling server CPUs with accelerators in the same purchase order. I’m seeing more take‑or‑pay agreements tied to capacity reservations, especially where supply is tight and timing matters for launch windows. Quarter to quarter, orders have shifted from fragmented node buys to clustered allocations where networking, storage, and CPUs land together to avoid stranded capacity. That concentration helps explain the 29% premarket reaction: investors are reading bundled, time‑bound deals as a leading signal of sustained demand.
Data center and AI revenue reached about $5.1 billion with 22% year-over-year growth; what underlying workloads and buyer segments drove that outperformance, and how sustainable is it as projects move from pilots to production?
Inference-heavy services, retrieval-augmented experiences, and microservice back ends are dominating, and they’re sticky once rolled into user-facing products. Hyperscalers led with large rollouts, but enterprises are now expanding beyond proofs of concept into customer support, analytics, and content ops. The 22% year-over-year lift to roughly $5.1 billion suggests these aren’t one-off experiments; they’re revenue-facing services getting funded quarter after quarter. Sustainability hinges on cost per query, and the fact that data center growth is outpacing the company’s 7% overall revenue growth signals there is still headroom in inference and distributed AI as teams optimize the CPU-accelerator mix.
Overall company revenue grew roughly 7% while data center outpaced it; what execution choices or product gaps explain that divergence, and what step-by-step actions would you prioritize to narrow it?
Data center rode the AI wave faster than the broader portfolio, so the mix skewed toward high-demand CPUs and packaging. Elsewhere, slower cycles and product gaps diluted the blended growth rate. To narrow it, I’d prioritize: align roadmap milestones to the AI inference shift, expand advanced packaging slots where demand is tight, accelerate platform certification with top clouds, and bundle software optimizations that translate directly into throughput gains. Then track pipeline-to-bookings conversion weekly to pull the company-wide 7% growth rate up toward data center momentum.
CPU demand is rising alongside accelerators as inference scales; where do you see the best CPU-to-GPU ratios for cost and performance, and can you walk through a concrete sizing example that balances latency, TCO, and power?
The sweet spot depends on model size, token budgets, and concurrency, but the trend is toward more CPU per accelerator as orchestration, retrieval, and pre/post-processing mature. For a practical build, start with the target latency budget, then partition tokenization, retrieval, and light transforms to CPUs while reserving accelerators for dense math. Size CPUs to absorb traffic spikes and cold starts so GPUs stay hot and efficient, and let autoscaling widen on the CPU tier first to protect TCO. Finally, right-size power by mapping steady-state GPU utilization and giving CPUs headroom for bursty agentic flows.
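To make that workflow concrete, here is a minimal sizing sketch in Python. Every figure in it — queries per second, tokens per query, per-device throughput, core counts — is a hypothetical placeholder rather than a measured benchmark; the value is in the order of operations, not the outputs.

```python
# Hypothetical sizing sketch for an inference service: fix the latency budget,
# size the accelerator tier for dense math, then size the CPU tier for
# pre/post-processing plus spike headroom. All numbers are illustrative
# assumptions, not benchmarks for any specific CPU or accelerator SKU.

import math

def size_cluster(peak_qps: float,
                 latency_budget_ms: float,
                 gpu_tokens_per_s: float,
                 tokens_per_query: float,
                 cpu_pre_post_ms: float,
                 cpu_queries_per_s_per_core: float,
                 cores_per_socket: int = 64,
                 spike_headroom: float = 1.5) -> dict:
    """Return rough accelerator and CPU-socket counts for a latency budget."""
    # Whatever the CPU tier spends on tokenization, retrieval, and routing
    # comes out of the accelerator's share of the budget.
    gpu_budget_ms = latency_budget_ms - cpu_pre_post_ms
    assert gpu_budget_ms > 0, "CPU-side work alone exceeds the latency budget"

    # Accelerators sized for steady-state token throughput at peak load.
    required_tokens_per_s = peak_qps * tokens_per_query
    gpus = required_tokens_per_s / gpu_tokens_per_s

    # CPU tier sized with headroom so bursts and cold starts land here,
    # keeping the accelerators hot and protecting TCO.
    cpu_cores = (peak_qps * spike_headroom) / cpu_queries_per_s_per_core
    cpu_sockets = cpu_cores / cores_per_socket

    return {
        "accelerators": math.ceil(gpus),
        "cpu_sockets": math.ceil(cpu_sockets),
        "cpu_to_gpu_ratio": round(cpu_sockets / max(gpus, 1e-9), 2),
    }

if __name__ == "__main__":
    # Illustrative inputs: 2,000 peak QPS, 300 ms budget, 400 tokens/query.
    print(size_cluster(peak_qps=2000, latency_budget_ms=300,
                       gpu_tokens_per_s=20_000, tokens_per_query=400,
                       cpu_pre_post_ms=60, cpu_queries_per_s_per_core=5))
```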
As AI shifts from foundational models to inference and agentic systems, how should architects redesign clusters, networking, and storage tiers, and what practical migration steps avoid stranded capacity?
Move from monolithic training pods to service fabrics where inference, retrieval, and tools run as composable tiers. Upgrade east‑west networking with predictable latency and segment storage into hot vector stores, fast feature caches, and colder corpora. Practically, migrate in rings: carve a subset of racks, validate service latency and autoscaling, then roll across availability zones while draining the old pools. Keep telemetry on token paths so you retire or repurpose nodes before they become silent cost centers.
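As a rough sketch of that ring-based migration, the snippet below carves racks into rings and stops widening the rollout the moment a validation gate fails. The rack names, ring size, and the always-true gate are placeholders for a team's real inventory and latency telemetry.

```python
# Minimal sketch of a "migrate in rings" plan over a flat list of racks.
# The validation gate is a placeholder for real P95-latency and autoscaling
# checks pulled from existing telemetry.

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Rack:
    name: str
    zone: str

def plan_rings(racks: List[Rack], ring_size: int) -> List[List[Rack]]:
    """Carve racks into migration rings of fixed size, preserving order."""
    return [racks[i:i + ring_size] for i in range(0, len(racks), ring_size)]

def migrate(rings: List[List[Rack]],
            validate: Callable[[List[Rack]], bool]) -> None:
    """Migrate one ring at a time; stop widening if a validation gate fails."""
    for i, ring in enumerate(rings):
        print(f"Ring {i}: migrating {[r.name for r in ring]}")
        if not validate(ring):
            print(f"Ring {i}: validation failed, halting rollout and draining")
            return
    print("All rings migrated; old pools can be drained and repurposed")

if __name__ == "__main__":
    racks = [Rack(f"rack-{n}", zone=f"az-{n % 3}") for n in range(12)]
    # Placeholder gate: in practice, check service latency and autoscaling here.
    migrate(plan_rings(racks, ring_size=4), validate=lambda ring: True)
```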
Tight supply for AI silicon persists; what procurement tactics, vendor mix strategies, or buffer inventories are working in practice, and can you share metrics on lead times and yield risks you’re seeing?
Capacity reservations tied to phased deliveries are winning, especially when paired with multi-vendor commitments for flexibility. Teams are building modest buffer shelves for CPUs and networking while time‑phasing accelerators to match software readiness. Where supply is tight, buyers accept staggered SKUs if the platform is pin‑compatible and software-stable, minimizing revalidation costs. Lead times and yields vary widely right now, and the most reliable indicator I watch is whether suppliers can keep second‑quarter guidance above consensus while acknowledging tight AI silicon conditions.
Hyperscalers remain the strongest demand drivers, with enterprises ramping; what patterns distinguish their buying criteria, and can you give examples of enterprise deployments that hit ROI thresholds with specific timelines?
Hyperscalers bias for fleet homogeneity, orchestration maturity, and packaging capacity guarantees, while enterprises lean on faster time to value and managed services. Enterprises that moved customer support inference and content tagging into production often cleared ROI in a single budget cycle once call deflection and cycle‑time cuts showed up. They started with narrow intent, instrumented every step, and only then widened the model’s scope. That pragmatism mirrors the broader pattern: hyperscalers set the pace, but enterprise wins accumulate quietly and then scale.
Xeon processors remain key contributors; where are customers extracting the most value—microservices, vector databases, retrieval-augmented generation, or fine-tuned inference—and what tuning steps have delivered measurable throughput gains?
I’m seeing standout gains in microservices and vector databases feeding retrieval-augmented generation, with CPUs handling tokenization, routing, and feature plumbing. Pinning latency‑sensitive threads, enabling instruction‑level accelerations, and aligning NUMA with shard placement deliver immediate wins. Caching hot embeddings in CPU memory narrows the tail on P95 latency, which compounds throughput at scale. Those adjustments, coupled with right‑sized networking, are why Xeon demand remains a visible contributor to the segment’s lift.
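Two of those tuning steps can be sketched in a few lines, assuming a Linux host: pinning a latency-sensitive worker to one NUMA node's cores, and caching hot embeddings in CPU memory. The core IDs, cache size, and the fetch_embedding_from_store helper are illustrative assumptions, not part of any particular stack.

```python
# Sketch of two tuning levers on Linux: NUMA-aware thread pinning and an
# in-memory cache for hot embeddings. Values here are assumptions; read the
# real topology from lscpu or /sys/devices/system/node before pinning.

import os
from functools import lru_cache

# Cores belonging to NUMA node 0 (illustrative IDs).
NUMA0_CORES = set(range(0, 16))

def pin_to_numa0() -> None:
    """Restrict the current process to a single NUMA node's cores (Linux-only)."""
    os.sched_setaffinity(0, NUMA0_CORES)

def fetch_embedding_from_store(doc_id: str) -> tuple:
    """Stand-in for a real vector-store or feature-cache read (hypothetical)."""
    return (0.0,) * 768

@lru_cache(maxsize=100_000)
def get_embedding(doc_id: str) -> tuple:
    """Cache hot embeddings in CPU memory to narrow the P95 tail."""
    return fetch_embedding_from_store(doc_id)

if __name__ == "__main__":
    pin_to_numa0()
    _ = get_embedding("doc-123")   # first call populates the cache
    _ = get_embedding("doc-123")   # subsequent calls are memory-local
```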
A competing vendor reported $2.3 billion in data center revenue with faster percentage growth; how do you interpret the mix difference, and what milestones or benchmarks would convince you one roadmap is out-executing the other?
Mix matters: a portfolio tilted toward accelerators can show sharp percentage ramps, while CPU‑anchored stacks compound steadily as inference broadens. I look for milestone clarity—consistent quarter-to-quarter revenue like the $5.1 billion print, attach rates for CPUs next to accelerators, and sustained year‑over‑year growth above the 7% company baseline. I also watch packaging availability and delivery predictability, because missed slots ripple into software slip. The roadmap that secures capacity, ships on time, and holds guidance above consensus in tight markets is out‑executing, regardless of headlines.
Advanced packaging and wafer capacity are highlighted as advantages; which packaging approaches (e.g., 2.5D, CoWoS alternatives, Foveros-like stacking) matter most for AI inference at scale, and how should buyers evaluate thermal and memory bandwidth trade-offs?
For inference, interposer‑based 2.5D and stacking approaches shine when they unlock memory bandwidth without blowing the thermal budget. Buyers should map bandwidth per watt against their token throughput target and check that heat density stays within cooling envelopes. Packaging that eases memory proximity while enabling predictable thermals turns into higher rack density and simpler operations. In tight supply, favor options with proven capacity so delivery doesn’t become the bottleneck.
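A back-of-envelope way to apply that framing is sketched below: compare packaging options on bandwidth per watt, check the bandwidth implied by a token-throughput target, and confirm heat density stays inside the cooling envelope. The package specs and the memory-traffic-per-token figure are invented for illustration, not vendor data.

```python
# Back-of-envelope comparison of packaging options on bandwidth per watt and
# thermal headroom. All specs below are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class PackageOption:
    name: str
    mem_bandwidth_gbs: float   # sustained memory bandwidth, GB/s
    package_power_w: float     # package power at that bandwidth
    heat_density_w_cm2: float  # package-level heat density

def evaluate(option: PackageOption,
             target_tokens_per_s: float,
             bytes_per_token: float,
             cooling_limit_w_cm2: float) -> dict:
    """Check whether an option meets the throughput and thermal envelopes."""
    required_gbs = target_tokens_per_s * bytes_per_token / 1e9
    return {
        "option": option.name,
        "bandwidth_per_watt": round(option.mem_bandwidth_gbs / option.package_power_w, 2),
        "meets_bandwidth": option.mem_bandwidth_gbs >= required_gbs,
        "within_cooling_envelope": option.heat_density_w_cm2 <= cooling_limit_w_cm2,
    }

if __name__ == "__main__":
    options = [
        PackageOption("2.5D interposer (assumed)", 3200, 550, 45),
        PackageOption("3D stacked (assumed)", 4000, 600, 70),
    ]
    for opt in options:
        # Illustrative target: 50k tokens/s at ~50 MB of memory traffic per
        # token once reads are amortized over the batch (assumed figure).
        print(evaluate(opt, target_tokens_per_s=50_000, bytes_per_token=50e6,
                       cooling_limit_w_cm2=60))
```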
The foundry business shows revenue growth but ongoing operating losses; what concrete levers—utilization, long-term agreements, yield learning—most quickly improve margins, and what leading indicators should investors track monthly?
Utilization is the fastest lever—fill the lines with multi‑year agreements that smooth demand and justify learning cycles. Yield learning compounds every week; even small defect reductions stack quickly across volumes. Packaging slots aligned to high‑demand AI parts improve mix and margin even before full node ramps. As for indicators, I track wafer starts, cycle time stability, and whether margins keep improving while the unit is still scaling customer relationships.
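To show why yield learning compounds, here is a tiny projection using the standard Poisson yield model, Y = exp(-D·A). The starting defect density, die area, and weekly improvement rate are assumed values; the takeaway is the shape of the curve, not the specific numbers.

```python
# Illustration of compounding yield learning. Inputs are made up; only the
# shape of the improvement curve matters here.

import math

def poisson_yield(defects_per_cm2: float, die_area_cm2: float) -> float:
    """Classic Poisson yield model: Y = exp(-D * A)."""
    return math.exp(-defects_per_cm2 * die_area_cm2)

def project_yield(d0: float, die_area_cm2: float,
                  weekly_defect_reduction: float, weeks: int) -> list:
    """Project yield as defect density falls by a small fraction each week."""
    out, d = [], d0
    for week in range(weeks + 1):
        out.append((week, poisson_yield(d, die_area_cm2)))
        d *= (1.0 - weekly_defect_reduction)
    return out

if __name__ == "__main__":
    # Assumed: 0.4 defects/cm^2 at ramp, 6 cm^2 die, 2% defect reduction/week.
    for week, y in project_yield(0.4, 6.0, 0.02, 26):
        if week % 4 == 0:
            print(f"week {week:2d}: yield ≈ {y:.1%}")
```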
Building a TSMC alternative is framed as a long game; which process nodes or specialty technologies (RF, advanced packaging, backside power) could create near-term differentiation, and where do ecosystem gaps still slow customer onboarding?
Advanced packaging is the near‑term wedge because it links today’s compute to memory and IO in practical ways for inference. Specialty tech like RF front‑ends and power delivery tweaks can differentiate platforms tied to edge and connectivity. Ecosystem gaps remain in toolchains, PDK maturity, and partner IP catalogs, which slow new customer tape‑outs. The long game works if ecosystem friction drops while packaging capacity stays ahead of AI demand curves.
Comments about a “Terafab” facility selecting Intel manufacturing raised interest; what realistic scenarios could emerge for cross-company chip fabrication, and how would that influence supply allocation, timing, and risk sharing?
A scenario where select program lines use Intel processes across multiple companies could anchor predictable volumes and derisk early ramps. That would spread allocation decisions across shared milestones, aligning delivery windows with product launches. Risk sharing could show up as staged commitments that expand as validation gates are cleared. It’s early days, but even a directional signal like this can catalyze planning while details on timing and volumes develop.
For enterprises planning AI inference at the edge and in distributed sites, what step-by-step rollout playbook do you recommend—from model selection to observability—and what failure modes have you seen that teams should avoid?
Start with a constrained model matched to your latency and privacy needs, then define retrieval boundaries and data governance up front. Stand up a CPU‑forward edge tier for tokenization, filtering, and routing, and keep accelerators centralized or regionally pooled. Automate rollout with canaries, wire in observability for token paths and costs, and set rollback policies that trigger on tail latency. Avoid silent drift, unmanaged caches, and stranded nodes that no one owns across sites.
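The canary-and-rollback step can be sketched as a simple gate that widens the rollout only while P95 latency stays inside budget. The window size, budget, and simulated traffic below are assumptions; in practice the latencies would come from the team's own observability pipeline.

```python
# Minimal canary gate: promote a rollout ring only while P95 latency stays
# within budget; trigger rollback when the tail breaches it.

import statistics
from collections import deque

class CanaryGate:
    """Track recent request latencies and decide whether to promote or roll back."""

    def __init__(self, p95_budget_ms: float, window: int = 500):
        self.p95_budget_ms = p95_budget_ms
        self.latencies = deque(maxlen=window)

    def record(self, latency_ms: float) -> None:
        self.latencies.append(latency_ms)

    def p95(self) -> float:
        if len(self.latencies) < 2:
            return 0.0
        return statistics.quantiles(self.latencies, n=20)[-1]  # 95th percentile

    def decision(self) -> str:
        if len(self.latencies) < self.latencies.maxlen:
            return "hold"          # not enough canary traffic yet
        if self.p95() > self.p95_budget_ms:
            return "rollback"      # tail latency breached the budget
        return "promote"           # safe to widen the ring

if __name__ == "__main__":
    gate = CanaryGate(p95_budget_ms=250.0, window=200)
    for latency in [120.0] * 190 + [400.0] * 10:   # simulated canary traffic
        gate.record(latency)
    print(gate.decision(), round(gate.p95(), 1))
```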
With margins improving in some segments, what operational metrics—wafer starts, cycle time, test escape rates—most tightly correlate with profitability in AI-centric product lines, and can you share benchmark targets you consider healthy?
Consistent wafer starts and short, predictable cycle times correlate tightly with cash efficiency, especially when packaging is the pacing item. Test escape rates drive downstream RMA pain, so I favor early investment in screening where AI parts are complex. Inference demand also rewards packaging yield and slot adherence; miss a window and utilization drops across the rack. Healthy targets are contextual, but I watch whether margins keep improving while guidance stays above consensus in a supply‑constrained market.
Do you have any advice for our readers?
Anchor every AI decision to measurable service goals—latency, cost per query, and reliability—and let those drive architecture and procurement. Favor platforms with proven delivery in tight markets and keep your options open with portable software stacks. Instrument everything from tokenization to response so you can tune CPUs and accelerators without guesswork. Most of all, scale deliberately: small, fast iterations beat big bets that arrive after the market has already moved.
