Intended audience: engineers, infrastructure managers, and tech leaders interested in optimizing large-scale computing systems and AI deployments.
Amin Vahdat, a key figure in Google's infrastructure, is introduced.
He leads Google's internal infrastructure, including the TPUs that enable Gemini at scale.
Google operates one of the largest computing infrastructures in the world and aims to reach tens of gigawatts of capacity within the next four years.
Building 1 gigawatt of capacity costs about $40 billion. Google's infrastructure organization is highly efficient and achieves high utilization rates.
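To make the scale concrete, the arithmetic can be sketched as follows. This is only an illustration: the $40 billion per gigawatt figure comes from the text, while the 10 GW and 30 GW cases are hypothetical readings of "tens of gigawatts", and the function name is invented for this example.

```python
# Rough build-out cost, using the ~$40B per gigawatt figure quoted above.
COST_PER_GW_USD = 40e9


def build_cost_usd(gigawatts: float, cost_per_gw: float = COST_PER_GW_USD) -> float:
    """Return the estimated capital cost of building the given capacity."""
    return gigawatts * cost_per_gw


if __name__ == "__main__":
    # 10 and 30 GW are hypothetical examples of "tens of gigawatts".
    for gw in (1, 10, 30):
        print(f"{gw} GW -> ${build_cost_usd(gw) / 1e9:,.0f}B")
```

Under these assumptions, each additional gigawatt adds roughly another $40 billion, which is why utilization of what has already been built matters so much.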
The key metric is not gigawatts but the capability and value delivered to users. Reliability is crucial for effective utilization, since capacity that is down or unstable delivers no value.
The focus should be on value delivered per dollar spent, not just infrastructure capacity. Efficiency means delivering more value with less.
Ultimately, business outcomes like daily active users per gigawatt are the real measures of success, not just raw capacity.
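The metrics named above can be written down as simple ratios. The sketch below is illustrative only: the `Deployment` class, its field names, and every number in it are hypothetical, and none of them are Google figures.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    """Hypothetical record of one infrastructure deployment."""
    name: str
    capacity_gw: float         # installed capacity in gigawatts
    spend_usd: float           # total spend on that capacity
    daily_active_users: float  # business outcome served by it

    def dau_per_gw(self) -> float:
        """Daily active users served per gigawatt of capacity."""
        return self.daily_active_users / self.capacity_gw

    def dau_per_dollar(self) -> float:
        """Daily active users served per dollar spent."""
        return self.daily_active_users / self.spend_usd


if __name__ == "__main__":
    # Made-up numbers: same capacity and spend, different outcomes.
    a = Deployment("A", capacity_gw=1.0, spend_usd=40e9, daily_active_users=100e6)
    b = Deployment("B", capacity_gw=1.0, spend_usd=40e9, daily_active_users=250e6)
    for d in (a, b):
        print(f"{d.name}: {d.dau_per_gw():,.0f} DAU/GW, {d.dau_per_dollar():.4f} DAU/$")
```

Two deployments with identical capacity can differ widely on these ratios, which is the sense in which raw gigawatts are not the measure of success.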
Efficient use of TPUs requires synchronized compute, storage, and networking. Poor orchestration leads to idle, expensive resources.
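One way to see why orchestration matters is a deliberately simplified model, which is an assumption of this sketch and not something described in the source: if compute, storage reads, and network transfers for a step overlap perfectly, the step takes as long as the slowest of the three, and the accelerator is busy only for its own compute share of that time. The function names and timings are hypothetical.

```python
def accelerator_utilization(compute_s: float, storage_s: float, network_s: float) -> float:
    """Fraction of each step the accelerator spends computing.

    Simplified model: the three stages overlap perfectly, so the step
    time is set by whichever stage is slowest.
    """
    step_s = max(compute_s, storage_s, network_s)
    return compute_s / step_s


def idle_capital_usd(capex_usd: float, utilization: float) -> float:
    """Share of capital spend that sits idle at the given utilization."""
    return capex_usd * (1.0 - utilization)


if __name__ == "__main__":
    # Hypothetical timings: storage is twice as slow as compute.
    u = accelerator_utilization(compute_s=1.0, storage_s=2.0, network_s=0.5)
    print(f"utilization: {u:.0%}")
    # Applied to the ~$40B per gigawatt figure quoted earlier.
    print(f"idle capital: ${idle_capital_usd(40e9, u) / 1e9:,.0f}B")
```

In this toy case a storage bottleneck alone leaves half of the accelerator investment idle, even though the accelerators themselves are working as designed.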