
NVIDIA CEO Jensen Huang GTC 2026 Full Keynote

https://www.youtube.com/watch?v=jIviHI7fqyc

[00:08] This is how intelligence is made.

[00:15] A new kind of factory, a generator of tokens, the building blocks of AI.

[00:24] Tokens have opened a new frontier, turning data into knowledge and drawing on all we have learned.

[00:35] Tokens are harnessing a new wave of clean energy and unlocking the secrets of the stars.

[00:51] In virtual worlds, they help robots learn; in the physical world, they forge new paths.

[01:11] And clearing the way for a bountiful harvest.

[01:17] In the moments that matter, tokens are already there.

[01:24] And in the miles between, they never stop.

[01:31] They work where human hands cannot.

[01:38] So we may all breathe easier.

[01:45] And the smallest hearts beat stronger.

[02:04] Tokens are helping us break new ground on a scale never attempted, to empower the world.

[02:25] So we can reach Starcloud-1.

[02:28] Separation confirmed.

[02:28] And well beyond it.

[02:40] Together we take the next great leap into a bright new future.

[02:49] Built for all mankind.

[02:58] And here is where it all begins.

[03:13] Welcome to the stage NVIDIA founder and CEO, Jensen Huang.

[03:28] Welcome to GTC.

[03:36] I just want to remind you, this is a tech conference.

[03:42] All these people lining up so early in the morning. All of you in here, it's great to see you.

[03:51] GTC. We're going to talk about technology. We're going to talk about platforms. NVIDIA has three platforms. You think that we mostly talk about one of them. It's related to CUDA-X. Our systems are another platform, and now we have a new platform called AI factories. We're going to talk about all of them, and most importantly, we're going to talk about ecosystems.

[04:17] But before I start, let me thank our pregame show hosts.

[04:21] I thought they did a great job.

[04:24] Sarah Guo of Conviction; Alfred Lin of Sequoia Capital, NVIDIA's first venture capitalist; and Gavin Baker, NVIDIA's first major institutional investor.

[04:37] These three people are deep in technology, deep in what's going on, and of course they have a really broad reach across the technology ecosystem.

[04:46] And then of course all of the VIPs that I hand-selected to join us today, an all-star team.

[04:54] I want to thank all of you for that.

[05:02] I also want to thank all the companies that are here.

[05:09] NVIDIA, as you know, is a platform company. We have technology, we have our platforms, we have a rich ecosystem.

[05:15] Today, probably 100% of the hundred trillion dollars of the world's industries are represented here.

[05:25] 450 companies sponsored this event. I want to thank you.

[05:30] A thousand technical sessions, 2,000 speakers.

[05:34] This conference is going to cover every single layer of the five-layer cake of artificial intelligence.

[05:39] From land, power, and shell, the infrastructure, to chips, to platforms, to models, and of course the most important, and ultimately what's going to get this industry to take off: all of the applications.

[05:56] Where it all began: it all began here.

[06:00] This is the 20th anniversary of CUDA.

[06:04] We've been working on CUDA for 20 years.

[06:12] For 20 years, we've been dedicated to this architecture.

[06:16] This revolutionary invention, SIMT, single instruction, multiple threads: writing scalar code that could spawn off into a multi-threaded application, much, much easier to program than SIMD.
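As a loose CPU-side analogy of the SIMT idea (a sketch only: real SIMT code is a CUDA kernel compiled for the GPU; the thread pool here just mimics the "write scalar code, launch many threads" style, and the `saxpy` example is a classic illustration, not something from the talk):

```python
from concurrent.futures import ThreadPoolExecutor

def saxpy_scalar(i, a, x, y, out):
    # Scalar code: each "thread" handles exactly one element i.
    out[i] = a * x[i] + y[i]

def saxpy(a, x, y):
    out = [0.0] * len(x)
    # SIMT-style launch: the same scalar function is spawned
    # across many threads, one per data element.
    with ThreadPoolExecutor(max_workers=8) as pool:
        for i in range(len(x)):
            pool.submit(saxpy_scalar, i, a, x, y, out)
    return out

print(saxpy(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0]))  # [12.0, 24.0, 36.0]
```

The point of the model is the same as in the talk: the programmer writes the single-element scalar function; the launch machinery fans it out across threads.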

[06:33] We recently added tiles so that we could help people program tensor cores and the structures of mathematics that are so foundational to artificial intelligence today.

[06:43] Thousands of tools and compilers and frameworks and libraries in open source.

[06:51] There's a couple of hundred thousand public projects.

[06:55] CUDA literally is integrated into every single ecosystem.

[06:59] This chart basically describes 100% of NVIDIA's strategies.

[07:08] You've been watching me talk about this slide from the very beginning.

[07:12] And ultimately, the single hardest thing to achieve is the thing on the bottom: installed base.

[07:19] It has taken us 20 years to now have built up hundreds of millions of GPUs and computing systems around the world that run CUDA.

[07:29] We are in every cloud. We're in every computer company.

[07:33] We serve just about every single industry.

[07:38] The installed base of CUDA is the reason why the flywheel is accelerating.

[07:43] The install base is what attracts developers, who then create new algorithms that achieve a breakthrough. For example, deep learning. There are so many others.

[07:54] Those breakthroughs lead to entirely new markets, which build new ecosystems around them with other companies that join, which creates a larger installed base.

[08:04] This flywheel is now accelerating.

[08:06] The number of downloads of NVIDIA libraries is at a very large scale and growing faster than ever.

[08:17] This flywheel is what makes this computing platform able to sustain so many applications, so many new breakthroughs.

[08:23] But most importantly, it also enables these infrastructures to have an extraordinarily useful life.

[08:35] And the reason for that is very obvious.

[08:37] There are so many applications that you can run on NVIDIA CUDA.

[08:40] We support every single phase of the AI life cycle.

[08:43] We address every single data processing platform.

[08:45] We accelerate principled scientific solvers of all different kinds.

[08:50] And so the application reach is so great that once you install NVIDIA GPUs, their useful life is incredibly long.

[08:59] It is also one of the reasons why, for Ampere, which we shipped some six years ago, the pricing of Ampere in the cloud is going up.

[09:10] And so all of that is made possible fundamentally because the install base is high, the flywheel is high, the developer reach is great.

[09:17] And when all of that happens and we continuously update our software, the computing cost declines.

[09:23] The combination: accelerated computing speeds up applications tremendously. Meanwhile, as we continue to nurture and update the software over its life, not only do you get the first-time pop, you get the continuous cost reduction of accelerated computing over time.

[09:40] And we're willing to nurture, willing to support every single one of these GPUs in the world, because they're all architecturally compatible.

[09:45] We're willing to do so because the install base is so large.

[09:48] If we release a new optimization, it benefits millions. This applies to everybody in the world.

[09:51] This combination of dynamics is what makes the NVIDIA architecture expand its reach, accelerating its growth, at the same time driving down computing cost, which ultimately encourages new growth.

[10:03] So, CUDA is at the center of it.

[10:06] But our journey actually started 25 years ago. GeForce.

[10:34] I know how many of you grew up with GeForce.

[10:38] GeForce is NVIDIA's greatest marketing campaign.

[10:43] We attract future customers starting long before you could afford to pay for it yourself.

[10:52] Your parents paid for you to be NVIDIA customers.

[10:56] And every single year they paid up, year after year after year, until someday you became an amazing computer scientist and became a proper customer, a proper developer.

[11:13] But this is the house that GeForce made. 25 years ago, we started our journey, which led to CUDA.

[11:19] 25 years ago, we invented the programmable shader.

[11:26] A perfectly unobvious invention: to make an accelerator programmable.

[11:31] The world's first programmable accelerator, the pixel shader, 25 years ago.

[11:40] That led us to explore further and further, and five years later, 20 years ago, the invention of CUDA.

[11:45] One of the biggest investments that we made, and we couldn't afford it at the time, it consumed the vast majority of our company's profits: to take CUDA, on the backs of GeForce, to every single computer.

[11:59] We dedicated ourselves to creating this platform because we felt so strongly about its potential.

[12:05] But ultimately, the company's dedication to it, despite the hardships in the beginning, believing in it every single day, for 13 generations over 20 years: we now have CUDA installed everywhere.

[12:22] The pixel shader led, of course, to the revolution of GeForce.

[12:26] And then about 10 years ago, what is it, 8 years ago, we introduced RTX, a complete redesign of our architecture for the modern era of computer graphics.

[12:39] GeForce brought CUDA to the world.

[12:43] GeForce therefore enabled Alex Krizhevsky and Ilya Sutskever and Geoffrey Hinton, Andrew Ng, and so many others to discover that the GPU could be their friend in accelerating deep learning.

[12:56] It started the big bang of AI.

[13:02] 10 years ago, we decided that we would fuse programmable shading and introduce two new ideas: ray tracing, hardware ray tracing, which is incredibly hard to do, and a new idea at the time. Imagine, about 10 years ago, we thought that AI would revolutionize computer graphics.

[13:18] Just as GeForce brought AI to the world, AI is now going to go back and revolutionize how computer graphics is done altogether.

[13:26] Well, today I'm going to show you something of the future. This is our next generation of graphics technology. We call it neural rendering: the fusion of 3D graphics and artificial intelligence.

[13:45] This is DLSS 5.

[13:47] Take a look at it.

[14:05] Heat.

[14:57] Is that incredible?

[15:03] Computer graphics comes to life.

[15:05] Now, what did we do?

[15:08] We fused controllable 3D graphics: the ground truth of virtual worlds, the structured data (remember this word), the structured data of virtual worlds, of generated worlds.

[15:22] We combine 3D graphics, structured data, with generative AI, probabilistic computing. One of them is completely predictive, the other one probabilistic yet highly realistic.

[15:35] We combine these two ideas: controlled through structured data, controlled perfectly, and yet generating at the same time.

[15:46] And as a result, the content is beautiful, amazing, as well as controllable.

[15:54] This concept of fusing structured information and generative AI will repeat itself in one industry after another, industry after industry.

[16:05] Structured data is the foundation of trustworthy AI.

[16:12] Well, this is going to scare you a little bit. I'm going to flip the slide, and don't gasp.

[16:18] So, we're going to go through the schematic for the rest of the time.

[16:25] This is my best slide.

[16:28] Every time I asked the team, "What's my best slide?", repeatedly, this was it.

[16:34] They say, "Don't do it, Jensen. Don't do it."

[16:38] I said, "No, no, these seats are free for some of you. So this is your price of admission."

[16:46] So this is structured data. You've heard of it.

[16:54] SQL, Spark, Pandas, Velox, some of these really, really important, very large platforms.

[17:00] Snowflake, Databricks, Amazon EMR, Azure Fabric, Google Cloud BigQuery.

[17:12] All of these platforms are processing data frames. These data frames are giant spreadsheets, and they hold all of life's information.

[17:20] This is the structured data, the ground truth of business. This is the ground truth of enterprise computing.
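The data-frame idea can be sketched with pandas; NVIDIA's cuDF (which comes up later in the keynote) deliberately mirrors the pandas API, so on a machine with a supported GPU essentially the same code can run accelerated, for example by importing cudf in place of pandas. The table and column names below are invented for illustration:

```python
import pandas as pd  # with cuDF installed, `import cudf as pd` runs this on the GPU

# A toy business table: the kind of giant spreadsheet
# (data frame) these platforms process at enormous scale.
orders = pd.DataFrame({
    "country": ["US", "US", "DE", "DE", "JP"],
    "status":  ["shipped", "pending", "shipped", "shipped", "pending"],
    "value":   [120.0, 80.0, 200.0, 50.0, 90.0],
})

# A typical analytic query: total shipped value per country.
shipped = orders[orders["status"] == "shipped"]
totals = shipped.groupby("country")["value"].sum().sort_index()
print(totals.to_dict())  # {'DE': 250.0, 'US': 120.0}
```

The same filter-group-aggregate pattern is what the platforms named above execute over billions of rows.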

[17:26] Well, now we're going to have AI use structured data, and we'd better accelerate the living daylights out of it.

[17:32] It used to be okay, and of course we would accelerate structured data so that we could do more, do it more cheaply, do it more frequently per day, and keep the company running in a much more synchronized way.

[17:46] However, in the future, what's going to happen is that these data structures are going to be used by AI, and AI is going to be much, much faster than us.

[17:53] Future agents are going to use structured databases as well.

[18:04] And then, of course, the unstructured database, the generative database.

[18:12] This database represents the vast majority of the world: vector databases, unstructured data, PDFs, videos, speeches, all of the world's information.

[18:19] About 90% of what's generated every single year is unstructured data.

[18:21] Until now, this data has been completely useless to the world. We read it, we put it into our file system, and that's it.

[18:28] Unfortunately, we can't query it. We can't search it. It's hard to do that.

[18:34] And the reason for that is that there's no easy indexing of unstructured data. You have to understand its meaning, its purpose.

[18:38] And so now we have AI do that. Just as AI was able to solve multimodality perception and understanding, you can use that same technology to go read a PDF, understand its meaning, and from that meaning embed it into a larger structure that we can search into, that we can query.
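A minimal sketch of that meaning-based indexing and querying (pure Python; the `embed` function here is a crude hypothetical stand-in, since real systems use neural embedding models and a GPU vector-search library such as cuVS for the nearest-neighbor lookup):

```python
import math

def embed(text):
    # Hypothetical stand-in for a neural embedding model: a crude
    # bag-of-words vector over a tiny fixed vocabulary.
    vocab = ["gpu", "token", "factory", "graphics", "data"]
    words = text.lower().split()
    return [float(words.count(w)) for w in vocab]

def cosine(a, b):
    # Similarity of two embedding vectors: cos of the angle between them.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# "Index" unstructured snippets by meaning, then query by similarity.
docs = [
    "gpu accelerates graphics",
    "token factory produces data",
    "data moves through the factory",
]
index = [(d, embed(d)) for d in docs]

query = embed("which factory makes token data")
best = max(index, key=lambda item: cosine(query, item[1]))
print(best[0])  # "token factory produces data"
```

The embedding turns free text into vectors, so "search" becomes a geometric nearest-neighbor problem rather than exact keyword matching.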

[19:12] NVIDIA created two foundational libraries. Just like we created RTX for 3D graphics, we created cuDF for data frames, structured data, and cuVS for vector stores: semantic data, unstructured data, AI data.

[19:23] These two platforms are going to be two of the most important platforms in the future.

[19:30] I'm super excited to see their adoption throughout this complicated network of the world's data processing systems.

[19:38] And the reason for that is that data processing has been around a long time, and therefore there are so many different companies and platforms and services.

[19:47] It has taken us a long time to integrate deeply into this ecosystem.

[19:53] I'm super proud of the work that we're doing here.

[19:57] And today we're announcing several of these integrations.

[20:01] IBM, the inventor of SQL, one of the most important domain-specific languages of all time, is accelerating watsonx.data with cuDF.

[20:17] 60 years ago, IBM introduced the System/360, the first modern platform for general-purpose computing, launching the computing era.

[20:26] Then SQL, a declarative language to query data without requiring the computer to be instructed step by step.

[20:36] And the data warehouse. Each of these was a foundation of modern enterprise computing.

[20:40] Today, IBM and NVIDIA are reinventing data processing for the era of AI by accelerating IBM watsonx.data SQL engines with NVIDIA GPU computing libraries.

[20:53] Data is the ground truth that gives AI context and meaning.

[20:57] AI needs rapid access to massive data sets.

[21:03] Today's CPU data processing systems can't keep up.
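The declarative point about SQL is easy to see with Python's stdlib sqlite3 (a toy table invented for illustration): you state what result you want, and the engine plans the step-by-step execution.

```python
import sqlite3

# Declarative querying: we state WHAT we want (total value of shipped
# orders per country); the SQL engine decides HOW to execute it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (country TEXT, status TEXT, value REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("US", "shipped", 120.0), ("DE", "shipped", 200.0), ("DE", "pending", 50.0)],
)
rows = conn.execute(
    "SELECT country, SUM(value) FROM orders"
    " WHERE status = 'shipped' GROUP BY country ORDER BY country"
).fetchall()
print(rows)  # [('DE', 200.0), ('US', 120.0)]
```

Because the query only describes the result, the same statement can be executed by a CPU engine or a GPU-accelerated one without changing the application.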

[21:05] Nestlé makes thousands of supply chain decisions every day. Their order-to-cash data mart aggregates every supply order and delivery event across global operations in 185 countries.

[21:22] On CPUs, Nestlé refreshed the data mart a few times a day. With accelerated watsonx.data running on NVIDIA GPUs, Nestlé can run the same workload five times faster at 83% lower cost.

[21:38] The next computing platform has arrived: accelerated computing for the era of AI.

[21:52] NVIDIA accelerates data processing in the cloud. We also accelerate data processing on-prem.

[21:56] As you know, Dell is the world-leading computer systems maker, and they are also one of the world's leading storage providers. They worked with us to create the Dell AI Data Platform, which integrates cuDF and cuVS to create an accelerated data platform for the era of AI. This is an example of what they did with NTT DATA: a huge speed-up.

[22:25] This is cloud: Google Cloud. As you know, we've been working with Google Cloud for a very long time. We accelerate Google's Vertex AI; we now accelerate BigQuery, a really important framework and a really important platform. And this is an example of our work together with Snapchat, where we reduced their cost of computing by nearly 80%.

[22:49] When you accelerate data processing, when you accelerate computing, you get the benefit of speed, you get the benefit of scale, but most importantly, you also get the benefit of cost. And so all of those come together as one.

[23:05] It was originally called Moore's Law. Moore's Law was about performance doubling every couple of years. It's another way of saying that so long as the price remains about the same, and most computers remain about the same, you're also getting twice the performance, or you're reducing the cost of computing, every single year.

[23:26] Well, Moore's Law has run out of steam. We need a new approach.
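The doubling claim is simple compound arithmetic; a quick sketch (pure Python, assuming a two-year doubling period at roughly constant price, the illustrative numbers are mine, not from the talk):

```python
# Performance doubles every 2 years at roughly constant price, so
# cost per unit of performance halves on the same schedule.
def relative_perf(years, doubling_period=2.0):
    return 2.0 ** (years / doubling_period)

perf_10y = relative_perf(10)          # 2^5 = 32x the performance
cost_per_perf_10y = 1.0 / perf_10y    # price held constant: ~3% of the original cost
print(perf_10y, cost_per_perf_10y)    # 32.0 0.03125
```

The same arithmetic run in reverse shows why a stalled doubling period is so costly: without it, performance per dollar stays flat year after year.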

[23:29] Accelerated computing allows us to take these giant leaps forward, and, as you will see later, because we continue to optimize the algorithms (and NVIDIA is an algorithm company), and because our reach is so large and our installed base is so large, we can reduce the computing cost, increasing the scale, increasing the speed, for everybody, continuously.

[23:53] This is Google Cloud. You can see the pattern I just mentioned. I just wanted to show you three versions of it. NVIDIA built the accelerated computing platform; it has a bunch of libraries on top. I gave you three examples: RTX is one of them, cuDF is another, cuVS, and we'll show you a few more. These libraries sit on top of our platform. But ultimately we integrate into the world's cloud services, into the world's OEMs, and into other platforms that I'll show you; together we're able to reach the world. This pattern, NVIDIA, Google Cloud, Snapchat, will repeat over and over again. And it kind of looks like this.

[24:39] And so this is one example: NVIDIA with Google Cloud. We accelerate Vertex AI. We accelerate BigQuery. I'm super proud of the work that we've done with JAX and XLA. We are incredible on PyTorch. We're the only accelerator in the world that's incredible on PyTorch and incredible on JAX and XLA. And the customers that we support, the Basetens, the CrowdStrikes, Puma, Salesforce, they're not only our customers, they're developers of ours that we've integrated the NVIDIA technologies into, that we can then land on the clouds.

[25:16] Our relationship with cloud service providers is essentially us bringing customers to them. We integrate our libraries, we accelerate workloads, and we land those customers in the clouds. And so, as you can see, most of our cloud service providers love working with us. And they're always asking us to land the next customer on their cloud. And I just want to let you know: there are a lot of customers. We're going to accelerate everybody. And so there will be lots and lots of customers able to land in your cloud. Just be patient with us. And so this is Google Cloud. This is AWS.

[25:55] We've been working with AWS a long time. And one of the things I'm super excited about this year is we're going to bring OpenAI to AWS. It's going to drive enormous consumption of cloud computing at AWS. It's going to expand the reach, expand the compute, of OpenAI. And as you know, they are completely compute constrained. And so at AWS, we accelerate EMR, we accelerate SageMaker, we accelerate Bedrock. NVIDIA is integrated really deeply into AWS. They were our first cloud partner.

[26:30] Microsoft Azure. NVIDIA's A100 supercomputer: the first one we built was for NVIDIA, and the first one we installed was at Azure. And that led to the big, successful partnership with OpenAI. But we've been working with Azure for quite a long time. We accelerate Azure cloud, now their AI Foundry, which we partner deeply with. We accelerate Bing search. We work with them on Azure regions. This is one of the areas that is incredibly important as we continue to expand AI throughout the world.

[27:11] One of the capabilities that we offer is confidential computing. In confidential computing, you want to make sure that even the operator cannot see your data. Even the operator cannot touch or see your models. NVIDIA's GPUs were the first in the world to do that. It's now able to support confidential computing and protected deployment of these very valuable OpenAI models and Anthropic models throughout clouds and different regions, all because of our confidential computing. Confidential computing is super important.

[27:50] And here's an example of the different customers that we work with. Synopsys, a great partner of ours: we're accelerating all of their EDA and CAE workflows. And then we landed at Microsoft Azure.

[28:04] We were Oracle's first AI customer. Most people would have thought we were their first supplier. We were their first supplier also, but we were their first AI customer. I'm quite proud of the fact that I explained AI clouds to Oracle for the first time, and we were their first customer. Since then, they've really taken off. We've landed a whole bunch of our partners there: Cohere and Fireworks and, of course, very famously, OpenAI.

[28:35] A great partnership with CoreWeave. They're the world's first AI-native cloud, a company that was built with only one singular purpose: to provision, to host GPUs as the era of accelerated computing showed up, and to host for AI clouds. They've got some fantastic customers and they're growing incredibly.

[29:00] One of the platforms that I'm quite excited about is Palantir and Dell. The three of our companies have made it possible to stand up a brand-new type of AI platform: the Palantir Ontology platform and AI platform. And we can stand up these platforms in any country, in any air-gapped region, completely on-prem, completely on-site, completely in the field. AI can be deployed literally everywhere. Without our confidential computing capability, without our ability to build the end-to-end system as well as offer the entire accelerated computing and AI stack, from data processing, whether it's vectors or structures, all the way to AI, it wouldn't have been possible. I wanted to show you these examples.

[29:46] This is our special working relationship with the world's cloud service providers, and many, well, all of them are here, and I get the benefit of seeing them during the booth tour, and it's just so incredibly exciting. I just want to thank all of you for the hard work. What NVIDIA has done is this, and you're going to see this theme over and over again.

[30:07] NVIDIA is vertically integrated: the world's first vertically integrated but horizontally open company. And the reason that's necessary is very simple. Accelerated computing is not a chip problem. Accelerated computing is not a systems problem. Accelerated computing has a missing word; we just never say it anymore: application acceleration. If I could make a computer run everything faster, that's called a CPU. But that's run out of steam. The only way for us to accelerate applications going forward, and continue to bring tremendous speedup, tremendous cost reduction, is through application- or domain-specific acceleration. I dropped that phrase in the front, and therefore it just became "accelerated computing." And that is the reason why NVIDIA has to go library after library, domain after domain, vertical after vertical.

[31:11] We are a vertically integrated computing company. There is no other way. We have to understand the applications. We have to understand the domain. We have to understand, fundamentally, the algorithms. And we have to figure out how to deploy the algorithm in whatever scenario it wants to be deployed: whether it's a data center, cloud, on-prem, at the edge, or in a robotic system. All of those computing systems are different. And finally, the systems and chips. We are vertically integrated. What makes it incredibly powerful, and the reason why you saw all the slides, is that NVIDIA is horizontally open. We work and integrate NVIDIA's technology into whatever platform you would like us to integrate into. We offer you the software. We offer you libraries. We integrate with your technology so that we can bring accelerated computing to everybody in the world.
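The application-acceleration argument above can be felt even on a CPU: a domain-specific routine beats a general-purpose loop not by raising the clock, but by re-expressing the algorithm for the hardware. A rough sketch, using NumPy as a stand-in for a domain library (the speedup ratio is machine-dependent, and this is an analogy, not NVIDIA's stack):

```python
import time
import numpy as np

# Same dot product computed two ways: a general-purpose Python loop and a
# domain-specific vectorized kernel. The kernel is faster not because the
# CPU changed, but because the whole operation is handed to optimized
# library code at once -- the point of domain-specific acceleration.
n = 200_000
rng = np.random.default_rng(0)
a = rng.random(n)
b = rng.random(n)

t0 = time.perf_counter()
slow = sum(x * y for x, y in zip(a, b))   # generic: one element at a time
t1 = time.perf_counter()
fast = float(a @ b)                       # domain-specific: one library call
t2 = time.perf_counter()

assert abs(slow - fast) < 1e-6 * fast     # same answer, very different speed
print(f"loop: {t1 - t0:.4f}s  vectorized: {t2 - t1:.6f}s")
```

The same result at a fraction of the time is what "library after library, domain after domain" buys: the speedup comes from the algorithm's expression, not from a faster general-purpose processor.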

[32:06] Well, this GTC is really a great demonstration of that. You know, most of the time you'll see me talk about these verticals and I'll use some examples, but in every single case, whether it's automotive or, for example, financial services. By the way, the largest percentage of attendees at this GTC is from the financial services industry. I know. I'm hoping it's developers, not traders. Guys, here's one thing I wanted to say.

[32:49] And so the audience represents NVIDIA's ecosystem, upstream of our supply chain and downstream of our supply chain. We think about our supply chain both upstream and downstream. And it's just so exciting that, across our entire supply chain this last year, irrespective of whether you're a 50-year-old company, a 70-year-old company, or a 150-year-old company now part of the NVIDIA supply chain and partnering with us either upstream or downstream: last year, you had your record year, did you not? Congratulations.

[33:35] We're on to something here. This is the beginning of something very, very big. And so if you look at accelerated computing, we've now set the computing platform. But in order for us to activate those computing platforms, we need domain-specific libraries that solve very important problems in each one of the verticals that we address. You see us addressing every single one of these. Autonomous vehicles: our reach, our breadth, our impact. Incredible. We have a track on that.

[34:07] Financial services: I just mentioned algorithmic trading is going from classical machine learning with human feature engineering, called quant (the quants did that), to supercomputers studying massive amounts of data, discovering insight and discovering patterns by themselves. And so it is going through its deep learning and its transformer moment. Healthcare is going through its ChatGPT moment; some really exciting work that we're doing there. We have a great keynote track here; Kimberly Powell is doing a great keynote track for healthcare. We're talking about AI physics, or AI biology, for drug discovery, AI agents for customer service and support of diagnosis, and of course physical AI, robotic systems. All these different vectors of AI have different platforms that NVIDIA provides.

[35:07] Industrial: we are completely resetting and starting the largest buildout in human history, and most of the world's industries building AI factories, building chip plants, building computer plants are represented here today. Media and entertainment, gaming: of course, real-time AI platforms so that we can do translation and broadcast support, and live games and live video; an enormous amount of it will be augmented with AI. We have a platform there called Holoscan. Quantum: there are 35 different companies here building with us the next generation of quantum-GPU hybrid systems. Retail and CPG: using NVIDIA for supply chain, creating agentic shopping systems, AI agents for customer support; a lot of work being done here; a $35 trillion industry. Robotics: a $50 trillion industry in manufacturing. NVIDIA has been working in this area for a decade now, building three computers, the fundamental computers necessary to build robotic systems. We are integrated with, working with, literally every single company that we know of building robots. We have 110 robots here at the show.

[36:25] And then telecommunications: about as large as the world's IT industry, about $2 trillion. We see, of course, base stations everywhere. It's one of the world's infrastructures. It was the infrastructure of the last generation of computing. That infrastructure is going to get completely reinvented. And the reason for that is very simple. That base station, which does one thing, is going to be an AI infrastructure platform in the future. AI will run at the edge. And so lots of great discussion there. And our platform there is called Aerial, or AI-RAN. Big partnership with Nokia, big partnership with T-Mobile, and many others.

[37:12] At the core of our business is everything that I just mentioned: computing platforms, but very importantly, our CUDA-X libraries. Our CUDA-X libraries are the algorithms that NVIDIA invents. We are an algorithm company. That's what makes us special. That's what makes it possible for me to go into every single one of these industries, imagine the future, and have the world's best computer scientists describe and solve problems, refactor them, re-express them, and turn them into a library. We have so many. I think at this show we're announcing a hundred libraries, something like 70 libraries, maybe 40 models, and that's just at the show. We're updating these all the time. The libraries are the crown jewels of our company. They are what make it possible for that platform, the computing platform, to be activated in service of solving a problem, making impact. One of the biggest, one of the most important libraries that we ever created: cuDNN, CUDA Deep Neural Networks. It completely revolutionized artificial intelligence and caused the big bang of modern AI. Let me show you a short video about CUDA-X.

[38:36] Twenty years ago, we built CUDA, a single architecture for accelerated computing. Today, we've reinvented computing. A thousand CUDA-X libraries help developers make breakthroughs in every field of science and engineering. cuOpt for decision optimization. cuLitho for computational lithography. cuDSS for direct sparse solvers. cuEquivariance for geometry-aware neural networks. Aerial for AI-RAN. Warp for differentiable physics. Parabricks for genomics. At their foundation are algorithms, and they are beautiful.

[40:08] Wow. Heat. Heat.

[42:00] Everything you saw was a simulation. Some of it was principled solvers, fundamental physics solvers. Some of it was AI surrogates, AI physical models. And some of it was physical AI robotics models. Everything was simulated. Nothing was animated. Nothing was articulated. Everything was completely simulated. That is, fundamentally, what NVIDIA does. It is through connecting our understanding of the algorithms with our computing platforms that we're able to unlock these opportunities. NVIDIA is a vertically integrated computing company with open horizontal integration with the world. So that's CUDA-X.

[42:52] So that's CUDA X. Well, just now you saw a whole bunch of companies. You saw

[42:55] a whole bunch of companies. You saw Walmart and you know there's L'Oreal and

[42:58] Walmart and you know there's L'Oreal and incredible companies established

[43:00] incredible companies established companies JP Morgan and Ro and these are

[43:03] companies JP Morgan and Ro and these are companies in companies that have defined

[43:06] companies in companies that have defined society to today. Toyota is here. These

[43:10] society to today. Toyota is here. These are some of the largest companies in the

[43:12] are some of the largest companies in the world.

[43:13] It is also true that there's a whole bunch of companies you've never heard of. These are companies we call AI natives: a whole bunch of small companies. The list is gigantic; this is just a little tiny bit of it. And I couldn't decide whether to show you more or show you less, so I made it so that you couldn't see any, and nobody's feelings are hurt.

[43:41] However, inside this list are a bunch of brand-new companies. You might have heard of a couple of them, OpenAI and Anthropic, but there's a whole bunch of others, and they serve different verticals.

[43:56] Something happened in the last two years, particularly this last year. We've been working with the AI natives for a long time, and this last year it just skyrocketed; I'll explain why it happened. This industry has skyrocketed: $150 billion of venture investment into startups, the largest in human history.

[44:18] This is also the first time that the scale of the investments went from millions of dollars, tens of millions of dollars, to hundreds of millions of dollars and billions of dollars.

[44:30] And the reason for that is that this is the first time in history that every single one of these companies needs compute, and lots and lots of it. They need tokens, lots and lots of them. They're either going to create and generate tokens, or they're going to integrate value-added tokens created by Anthropic and OpenAI and others. And so this industry is different in so many ways, but one thing is very clear: the impact they're making, the incredible value they're already delivering, is quite tangible. AI natives.

[45:14] All because we reinvented computing. Just like during the PC revolution, a whole bunch of new companies were created. Just as during the internet revolution, a whole bunch of companies were created, and with mobile-cloud, a whole bunch of companies were created. Each one of them had their own standards, and we'll talk about one of the major standards that just happened; it's incredibly important. And this generation, we also have our own large number of very, very special companies.

[45:42] We reinvented computing, so it stands to reason there's going to be a whole new crop of really important companies, consequential companies for the future of the world: the Googles, the Amazons, the Metas, consequential companies that came as a result of the last computing platform shift.

[46:02] We are now at the beginning of a new platform shift. But what happened in the last couple of years? Well, we've been watching, as you know. We've been working on deep learning and working on AI, the big bang of modern AI. We were right there at the spot, and we've been advancing this field for quite some time. But why the last two years? What happened in the last two years?

[46:21] Well, three things. ChatGPT, of course, started the generative AI era. It's able to not just perceive and understand; it's able to also translate and generate unique content. I showed you the fusion of generative AI with computer graphics, and it brought computer graphics to life. Everybody in the world should be using ChatGPT. I know I use it every single morning; I used it plenty this morning. And so ChatGPT was the generative AI era.

[46:56] The second. By the way, consider generative computing versus the way we used to do computing. Generative AI is a capability of software, but it has profoundly changed how computing is done.

[47:10] Computing used to be retrieval-based; now it's generative. Keep that thought in mind when I talk about certain things, and you'll realize why everything we do is going to change how computers are architected, how computers are provided, how computers are going to be built out, and what the meaning of computing is altogether. Generative AI: end of 2022, into 2023.
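The retrieval-versus-generative contrast can be caricatured in a few lines. This is purely an illustrative sketch; none of the names below are a real API, and the "model" is a stand-in for an actual LLM.

```python
from typing import Optional

# Retrieval-based computing: the answer must already exist in storage.
KNOWLEDGE = {"capital of France": "Paris"}

def retrieve(query: str) -> Optional[str]:
    # Lookup only: returns None for anything that was never pre-stored.
    return KNOWLEDGE.get(query)

def generate(query: str) -> str:
    # Generative computing: synthesize an answer on demand.
    # A real LLM replaces this trivial rule.
    return f"Considering '{query}', here is a synthesized answer."

print(retrieve("capital of France"))         # found: pre-stored content
print(retrieve("a question nobody stored"))  # None: retrieval has no answer
print(generate("a question nobody stored"))  # generation still produces output
```

The point of the toy: a retrieval system can only ever return what was written down beforehand, while a generative system produces fresh output for every query, which is why the compute cost moves from storage lookups to per-request computation.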

[47:36] The next: reasoning AI, o1, which then took off with o3. Reasoning allowed it to reflect, allows it to think to itself, allowed it to plan, to break down and decompose a problem it couldn't understand into steps or parts that it could understand. It could ground itself on research. o1 made generative AI trustworthy and grounded on truth. That caused ChatGPT to simply take off, and that was a very, very big moment.

[48:14] Consider the amount of input tokens necessary and the amount of output tokens generated in order to reason. The model was a little bit larger (of course you could have much larger models, but o1 was only a little bit larger), yet its input-token usage for context and its output tokens for thinking increased the amount of computation tremendously.
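A back-of-envelope calculation shows why "thinking" tokens multiply compute even when the model barely grows. It uses the common rule of thumb of roughly 2 × parameters FLOPs per generated token; all of the numbers below are illustrative assumptions, not measurements of any real model.

```python
# Inference compute scales with model size times tokens generated, so a
# long chain of thought multiplies the work done per answer.

def inference_flops(params: float, generated_tokens: int) -> float:
    # Rule-of-thumb estimate: ~2 FLOPs per parameter per generated token.
    return 2.0 * params * generated_tokens

PARAMS = 70e9  # a hypothetical 70B-parameter model

direct = inference_flops(PARAMS, 200)       # one-shot answer, ~200 tokens
thinking = inference_flops(PARAMS, 20_000)  # same answer plus a long chain of thought

print(f"direct   : {direct:.2e} FLOPs")
print(f"thinking : {thinking:.2e} FLOPs")
print(f"ratio    : {thinking / direct:.0f}x")  # 100x the compute, same model
```

The same model, made to think a hundred times longer, costs a hundred times the compute per request: token count, not parameter count, is doing the multiplying here.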

[48:39] Then came Claude Code, the first agentic model. It was able to read files, write code, compile it, test it, evaluate it, go back, and iterate on it. Claude Code has revolutionized software engineering. As all of you know, 100% of NVIDIA is using one, or oftentimes all three, of Claude Code, Codex, and Cursor, all over NVIDIA. There's not one software engineer today who is not assisted by one or many AI agents helping them code. Claude Code marks the new inflection, and for the first time,

[49:23] you don't ask an AI what, where, when, or how. You ask it to create, do, build. You ask it to use tools, take your context, read files. It's able to agentically break down a problem, reason about it, reflect on it. It's able to solve problems and actually perform tasks.
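The read-compile-test-iterate loop described above can be sketched in miniature. This is a toy: `ask_model` is a hypothetical stand-in for a real LLM call, scripted here to fix its own bug on the second attempt so the loop's shape is visible.

```python
# Toy agentic loop: draft code, run the tests, feed the failure back, retry.

def ask_model(task: str, feedback=None) -> str:
    # Hypothetical model call: the first draft is buggy, the revision is not.
    if feedback is None:
        return "def add(a, b):\n    return a - b"   # buggy first draft
    return "def add(a, b):\n    return a + b"       # corrected revision

def run_tests(source: str):
    ns = {}
    exec(source, ns)                  # load the candidate function
    try:
        assert ns["add"](2, 3) == 5   # evaluate it against a test
        return True, None
    except AssertionError:
        return False, "add(2, 3) should equal 5"

feedback = None
for attempt in range(1, 4):
    code = ask_model("write add()", feedback)
    ok, feedback = run_tests(code)
    print(f"attempt {attempt}: {'pass' if ok else 'fail'}")
    if ok:
        break
```

What distinguishes the agentic pattern from one-shot generation is exactly this closed loop: the test failure flows back into the next model call, so the system converges on working code instead of emitting a single guess.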

[49:55] An AI that was able to perceive became an AI that could generate. An AI that could generate became an AI that could reason. An AI that could reason now became an AI that can actually do work. Very productive work.

[50:07] The amount of computation in the last two years: everybody in this room knows the computing demand for NVIDIA GPUs is off the charts. Spot pricing is skyrocketing. You couldn't find a GPU if you tried. And yet, in the meantime, we're shipping GPUs out in incredible amounts, and demand just keeps going up. There's a reason for that: this fundamental inflection. Finally, AI is able to do productive work, and therefore the inflection point of inference has arrived.

[50:47] AI now has to think. In order to think, it has to inference. AI now has to do. In order to do, it has to inference. AI has to read. In order to do so, it has to inference. It has to reason; it has to inference. Every part of AI, every time it has to think, reason, do, or generate tokens, it has to inference. It's way past training now; it's in the field of inference.

[51:17] So the inference inflection has arrived, at the time when the amount of tokens, the amount of compute necessary, increased by roughly 10,000 times.

[51:28] Now, combine that with the fact that in the last two years the computing demand of the work has gone up by 10,000 times, and the amount of usage has probably gone up by a hundred times.
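The arithmetic behind the claim is a single multiplication: compute-per-task growth times usage growth. Both factors are Huang's stated estimates from the talk, not measured data.

```python
# Demand growth = (compute per task) growth x (number of requests) growth.

compute_per_task_growth = 10_000  # reasoning/agentic work vs. two years ago
usage_growth = 100                # roughly how much more the systems are used

total_demand_growth = compute_per_task_growth * usage_growth
print(f"{total_demand_growth:,}x")  # prints "1,000,000x"
```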

[51:48] People have heard me say that I believe computing demand has increased by one million times in the last two years. It is the feeling that we all have. It is the feeling every startup has. It's the feeling that OpenAI has. It's the feeling that Anthropic has. If they could just get more capacity, they could generate more tokens. Their revenues would go up. More people could use it. The more advanced, the smarter the AI could become. We are now in that positive flywheel. We have reached that moment. The inference inflection has arrived.

[52:25] Last year at this time, I said that where I stood at that moment in time, we saw about $500 billion of very high-confidence demand and purchase orders for Blackwell and Rubin through 2026. I said that last year. Now, I don't know if you guys feel the same way, but $500 billion is an enormous amount of revenue.

[53:12] No one's impressed.

[53:22] I know why you're not impressed: because all of you had record years.

[53:29] Well, I'm here to tell you that right now, where I stand, a few short months after GTC DC, one year after last GTC, right here where I stand, I see through 2027 at least $1 trillion.

[54:03] Now, does it make any sense? That's what I'm going to spend the rest of the time talking about. In fact, we are going to be short. I am certain computing demand will be much higher than that, and there's a reason for it.

[54:19] So, the first thing: we did a lot of work in the last year. Of course, as you know, 2025 was NVIDIA's year of inference. We wanted to make sure that not only were we good at training and post-training, but that we were incredibly good at every single phase of AI, so that the investments made in our infrastructure could scale out for as long as customers would like to use them, the useful life of NVIDIA's infrastructure would be long, and therefore the cost would be incredibly low.

[54:52] The longer you can use it, the lower the cost. There's no question in my mind that NVIDIA systems are the lowest-cost infrastructure you could get for AI in the world. And so the first part: last year was all about AI inference, and it drove this inflection point.

[55:11] Simultaneously, we were very pleased last year that Anthropic has come to NVIDIA and that MSL, Meta Superintelligence Labs, has chosen NVIDIA. As a group, this represents one-third of the world's AI compute.

[55:30] Open-source models have reached near the frontier, and they are literally everywhere. And NVIDIA, as you know, is today the only platform in the world that runs every single domain of AI, across every single one of these AI models: language and biology, computer graphics, computer vision and speech, proteins and chemicals, robotics and otherwise, edge or cloud, any language. NVIDIA's architecture is fungible for all of that, and we're incredible for all of it. That allows us to be the lowest-cost, highest-confidence platform.

[56:14] Because when you're building these systems, as I mentioned, a trillion dollars is an enormous amount of infrastructure. You have to have complete confidence that the trillion dollars you're putting down will be utilized, will be performant, will be incredibly cost-effective, and will have a useful life for as long as you can see. That infrastructure investment you could make on NVIDIA with complete confidence.

[56:41] We have now proven that it is the only infrastructure in the world that you could build anywhere in the world with complete confidence. You want to put it in any of the clouds? We're delighted by that. You want to put it on-prem? We're happy about that. You want to put it in any country, anywhere? We're delighted to support you. We are now a computing platform that runs all of AI.

[57:09] Now, our business is already starting to show that 60% of it is the top five hyperscalers. However, even within that top five, some of it is internal AI consumption, really important work. Recommender systems are moving from systems of tables, collaborative filtering, and content filtering toward deep learning and large language models. Search is moving to deep learning and large language models. Almost all of these hyperscale workloads are now shifting toward workloads that NVIDIA GPUs are incredibly good at.

[57:48] But on top of that, because we work with every AI lab, because we accelerate every AI model, and because we have a large ecosystem of AI natives that we can bring to the clouds, that investment, no matter how large, no matter how quick, that compute will be consumed. And that represents 60% of our business.

[58:11] The other 40% is just everywhere: regional clouds, sovereign clouds, enterprise, industrial, robotics, edge, big supercomputing systems, small servers, enterprise servers. The number of systems: incredible.

[58:30] The diversity of AI is also its resilience. The span of AI's reach is its resilience. There is no question this is not a one-app technology. This is now fundamental. This is absolutely a new computing platform shift.

[58:49] Well, our job is to continue to advance the technology, and one of the most important things I mentioned last year was that last year was our year of inference. We dedicated everything to it. We took a giant chance and reinvented the system while Hopper was at its prime, while it was just cooking.

[59:13] We decided that the Hopper architecture, NVLink 8, had to be taken to the next level. We completely rearchitected the system, disaggregated the computing system altogether, and created NVLink 72. The way it's built, the way it's manufactured, the way it's programmed completely changed.

[59:33] Grace Blackwell NVLink 72 was a giant bet, and it wasn't easy for anybody, including many of my partners here in the room. I want to thank all of you for the hard work you did. Thank you.

[59:48] NVLink 72. NVFP4: not just FP4 precision. NVFP4 is a whole different type of tensor core and computational unit. We've demonstrated now that we can inference in NVFP4 without loss of precision but with a gigantic boost in performance and energy efficiency. We've also been able to use NVFP4 for training.

[01:00:15] So: NVLink 72, NVFP4, the invention of Dynamo, TensorRT-LLM, a whole bunch of new algorithms. We even built a supercomputer to help us optimize kernels and optimize our complete stack. We call it DGX Cloud. We invested billions of dollars of supercomputing capability to help us create the kernels, the software, that made inference possible.
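The core idea behind low-precision formats like NVFP4 is block-scaled quantization: store each weight in a few bits plus one shared scale per small block. The sketch below is NOT the NVFP4 format itself (that is NVIDIA-specific and not described in the talk); it is a generic illustration of why so few bits can still preserve accuracy.

```python
# Generic block-scaled quantization sketch, the family NVFP4 belongs to.

def quantize_block(block, levels=7):
    """Map a block of floats to signed ints in [-levels, levels] plus one scale."""
    scale = max(abs(x) for x in block) / levels
    return [round(x / scale) for x in block], scale

def dequantize_block(q, scale):
    return [v * scale for v in q]

weights = [0.12, -0.53, 0.97, -0.08, 0.31, 0.66, -0.91, 0.02]
q, s = quantize_block(weights)
restored = dequantize_block(q, s)

# ~4 bits per weight plus one shared scale, instead of 32 bits per weight,
# at the cost of a small, bounded rounding error (at most scale / 2).
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print("quantized ints:", q)
print("max error     :", round(max_err, 4))
```

Because the scale adapts to each small block rather than the whole tensor, the rounding error stays bounded by half a quantization step locally, which is what lets aggressive formats keep inference accuracy while slashing memory traffic and energy per token.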

[01:00:40] Well, the results all came together. People used to tell me, "But Jensen, inference is so easy." Inference is the ultimate hard. It is also ultimately important, because it drives your revenues. And so this is the outcome. This is from SemiAnalysis, the largest, most comprehensive sweep of AI inference that has ever been done. What you see here on this side is tokens per watt.

[01:01:13] Tokens per watt is important because every data center, every single factory, is by definition power-constrained. A one-gigawatt factory will never become two; it's physically constrained by the laws of atoms, the laws of physicality. And so with that one gigawatt of data center, you want to drive the maximum number of tokens, which is the production, the product, of that factory. So you want to be on top of that curve, as high as you can.

[01:01:42] that you want to be on top of that curve as high as you want. This the x- axis is

[01:01:46] as high as you want. This the x- axis is the interactivity the speed of inference

[01:01:49] the interactivity the speed of inference the speed of each inference. The faster

[01:01:52] the speed of each inference. The faster you can inference,

[01:01:54] you can inference, the faster you could of course respond.

[01:01:57] the faster you could of course respond. But very importantly, the faster you can

[01:01:59] But very importantly, the faster you can inference, the larger the models, the

[01:02:02] inference, the larger the models, the more context you could process, the more

[01:02:04] more context you could process, the more tokens you can think through. This axis

[01:02:07] tokens you can think through. This axis is the same as smartness of the AI. And

[01:02:11] is the same as smartness of the AI. And so this is the throughput of the AI.

[01:02:13] so this is the throughput of the AI. This is the smartness of the AI. Notice

[01:02:17] This is the smartness of the AI. Notice the smarter the AI, the lower your

[01:02:20] the smarter the AI, the lower your throughput. Makes sense? you're thinking

[01:02:22] throughput. Makes sense? you're thinking longer. Okay? And so this axis is the

[01:02:26] longer. Okay? And so this axis is the speed. And I'm going to come back to

[01:02:27] speed. And I'm going to come back to this. This is important. This is where I

[01:02:29] this. This is important. This is where I torture all of you. But it's too

[01:02:32] torture all of you. But it's too important. Every CEO in the world you
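The throughput-versus-interactivity frontier described here can be sketched with a toy batching model. Everything below is an illustration: the saturating curve, the constants, and the numbers are assumptions chosen to show the shape of the tradeoff, not measured NVIDIA figures.

```python
# Toy model of the throughput-vs-interactivity tradeoff on the chart:
# batching more requests raises total tokens/s (throughput, the y-axis)
# but lowers the token rate each user sees (interactivity, the x-axis).
# All constants are illustrative assumptions.

def factory_curve(batch_size: int,
                  peak_tokens_per_s: float = 20_000.0,
                  half_sat_batch: int = 32) -> tuple[float, float]:
    """Return (total tokens/s, tokens/s per user) at a given batch size.

    Uses a simple saturating curve: throughput grows with batch size but
    with diminishing returns, while per-user speed is throughput / batch.
    """
    total = peak_tokens_per_s * batch_size / (batch_size + half_sat_batch)
    per_user = total / batch_size  # interactivity experienced by one user
    return total, per_user

for b in (1, 8, 64, 512):
    total, per_user = factory_curve(b)
    print(f"batch={b:4d}  throughput={total:8.0f} tok/s  "
          f"interactivity={per_user:7.1f} tok/s/user")
```

Sweeping the batch size traces out one curve like those on the chart; a better architecture shifts the whole curve up and to the right at the same power.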

[01:02:34] Every CEO in the world, you watch, every CEO in the world will study their business from now on in the way I'm about to describe, because this is your token factory. This is your AI factory. This is your revenues. There's no question about that going forward.

[01:02:49] And so this is the throughput. This is the intelligence. Better perf per watt, for a given power of data center, means more throughput, more tokens you can produce. On this side is cost.

[01:03:05] Notice Nvidia is the highest performance in the world. Nobody would be surprised by that. They would be surprised by the fact that in one generation, whereas Moore's law, through transistors, would have given us 50 percent, maybe two times, probably one and a half times more performance, you would have expected one and a half times higher than Hopper H200. Nobody would have expected 35 times higher.

[01:03:36] I said last year at this time that Nvidia's Grace Blackwell NVLink 72 was 35 times the perf per watt. Nobody believed me. And then SemiAnalysis came out, and Dylan Patel had a quote. He accused me of sandbagging. He says, "Jensen sandbagged. It's actually 50 times." And he's not wrong. He's not wrong.
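The generational comparison above is easy to check arithmetically with the figures quoted in the talk: roughly 1.5x expected from transistor scaling alone, 35x claimed for Grace Blackwell NVL72 over Hopper H200, and 50x per SemiAnalysis.

```python
# Arithmetic on the figures quoted in the keynote (no other sources):
# process scaling alone ~1.5x per generation, NVIDIA's stated 35x
# perf/watt gain, and SemiAnalysis's 50x measurement.

moores_law_gain = 1.5     # expected from transistor scaling alone
claimed_gain = 35.0       # NVIDIA's stated Blackwell-over-Hopper gain
semianalysis_gain = 50.0  # SemiAnalysis's measured figure

# How much of the gain is attributable to co-design beyond transistors:
codesign_factor = claimed_gain / moores_law_gain
print(f"co-design beyond process scaling: ~{codesign_factor:.1f}x")
print(f"SemiAnalysis vs. claim: {semianalysis_gain / claimed_gain:.2f}x")
```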

[01:04:15] And so our cost per token, yeah, our cost per token is the lowest in the world. You can't beat it. I've said before: if you have the wrong architecture, even if it's free, it's not cheap enough. And the reason for that is because no matter what happens, you still have to build a gigawatt data center. You still have to build a gigawatt factory. And that gigawatt factory, amortized over 15 years, is about $40 billion. Even when you put nothing on it, it's $40 billion in. You'd better make for darn sure you put the best computer system on that thing, so that you can have the best token cost.

[01:04:51] Nvidia's token cost is world class, basically untouchable at the moment. And the reason that's true is because of extreme co-design. And so I'm very happy that he named us.
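The "$40 billion amortized over 15 years" framing turns into a back-of-envelope cost per token once you assume a perf-per-watt figure and a utilization. The capex and amortization period come from the talk; the throughput and utilization below are hypothetical, inserted only to make the arithmetic concrete.

```python
# Back-of-envelope capex cost per token for a 1 GW AI factory.
# CAPEX and amortization period are from the keynote; TOKENS_PER_WATT
# and UTILIZATION are hypothetical illustration values.

CAPEX_USD = 40e9               # ~$40B for the gigawatt factory (from the talk)
YEARS = 15                     # amortization period (from the talk)
SECONDS_PER_YEAR = 365 * 24 * 3600

TOKENS_PER_WATT = 10.0         # hypothetical factory-wide tokens/s per watt
POWER_WATTS = 1e9              # 1 gigawatt
UTILIZATION = 0.6              # hypothetical average utilization

tokens_per_year = TOKENS_PER_WATT * POWER_WATTS * UTILIZATION * SECONDS_PER_YEAR
cost_per_year = CAPEX_USD / YEARS
cost_per_million_tokens = cost_per_year / tokens_per_year * 1e6

print(f"amortized capex: ${cost_per_year / 1e9:.2f}B/year")
print(f"capex cost per million tokens: ${cost_per_million_tokens:.4f}")
```

Since the annual capex is fixed, cost per token scales inversely with tokens per watt, which is exactly why the architecture on top of the factory dominates the economics.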

[01:05:25] There was a monkey king, token king.

[01:05:33] Well, we take all of our software, as I told you: we vertically integrate, but we horizontally open. Vertical integration, horizontal open. We integrate all of our software and all of our technology, however we can package it up, and integrate it into the world's inference service providers. And these companies are growing so fast. They're growing so fast. Fireworks, Lin is here, they're just growing so incredibly fast. A hundred times in the last year. They are token factories. And the effectiveness, the performance, and the token cost production capability of their factories is everything to them. And this is what happened.

[01:06:17] We updated their software, same system. And notice their token speeds. Incredible. The difference before and after Nvidia updated everything, all of our algorithms and software and all the technology that we bring to bear: about 700 tokens per second on average went to nearly 5,000, seven times higher. And so this is the incredible power of extreme co-design.

[01:06:51] I mentioned earlier the importance of factories. This is the importance of the factory. Your data center, it used to be a data center for files. It's now a factory to generate tokens. Your factory is limited no matter what. Everybody's looking for land, power, and shell. Once you build it, you are power limited. Within that power-limited infrastructure, you'd better make for darn sure, because you know inference is your workload, tokens are your new commodity, and compute is your revenues, that the architecture is as optimized as it can be.

[01:07:27] In the future, every single CSP, every single computer company, every single cloud company, every single AI company, every single company, period, is going to be thinking about their token factory effectiveness. This is your factory in the future. And the reason why I know that is because everybody in this room is powered by intelligence. And in the future, that intelligence will be augmented by tokens. So, let me show you how we got here.

[01:07:59] On April 6th, 2016, a decade ago, we introduced DGX-1, the world's first computer designed for deep learning. Eight Pascal GPUs connected with the first-generation NVLink. 170 teraflops in one computer. The world's first computer designed for AI researchers.

[01:08:24] With Volta, we introduced the NVLink Switch: 16 GPUs connected with full all-to-all bandwidth, operating as one giant GPU. A giant step forward, but model sizes continued to grow. The data center needed to become a single unit of computing. So, Mellanox joined Nvidia.

[01:08:49] In 2020, the DGX A100 SuperPOD became the first GPU supercomputer combining scale-up and scale-out architecture: NVLink 3 for scale-up, ConnectX-6 and Quantum InfiniBand for scale-out.

[01:09:08] Then Hopper, the first GPU with the FP8 Transformer Engine, launched the generative AI era: NVLink 4, ConnectX-7, BlueField-3 DPUs, second-generation Quantum InfiniBand. It revolutionized computing.

[01:09:29] Blackwell redefined AI supercomputing system architecture with NVLink 72: 72 GPUs connected by the NVLink spine, 130 terabytes per second of all-to-all bandwidth. Compute trays integrate Blackwell GPUs, Grace CPUs, ConnectX-8, and BlueField-3. Scale-out runs over Spectrum-4 Ethernet.

[01:09:54] With three scaling laws at full steam, pre-training, post-training, and inference, and now agentic systems, compute demand continues to grow exponentially.
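The NVL72 figure quoted here (72 GPUs sharing 130 TB/s of all-to-all bandwidth) implies a per-GPU share that is simple division; the same arithmetic applies to the 260 TB/s Vera Rubin figure mentioned in the keynote.

```python
# Per-GPU share of the aggregate NVLink all-to-all bandwidth quoted in
# the keynote. Pure division on the stated numbers.

def per_gpu_bandwidth(total_tb_per_s: float, num_gpus: int) -> float:
    """Aggregate all-to-all bandwidth divided evenly across GPUs (TB/s)."""
    return total_tb_per_s / num_gpus

blackwell = per_gpu_bandwidth(130.0, 72)  # Blackwell NVL72
rubin = per_gpu_bandwidth(260.0, 72)      # Vera Rubin NVL72

print(f"Blackwell NVL72: ~{blackwell:.1f} TB/s per GPU")
print(f"Vera Rubin NVL72: ~{rubin:.1f} TB/s per GPU")
```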

[01:10:07] And now Vera Rubin, architected for every phase of agentic AI, advancing every pillar of computing, including CPU, storage, networking, and security.

[01:10:21] Vera Rubin NVLink 72: 3.6 exaflops of compute, 260 terabytes per second of all-to-all NVLink bandwidth, the engine supercharging the era of agentic AI. The Vera CPU rack, designed for orchestration and agentic workflows. The STX rack: AI-native storage built with BlueField-4. Scale-out with Spectrum-X co-packaged optics, increasing energy efficiency and resiliency.

[01:10:52] And an incredible new addition: the Groq 3 LPU rack. Tightly connected to Vera Rubin, Groq's LPUs' massive on-chip SRAM is a token accelerator to the already incredibly fast Vera Rubin. Together, 35 times more throughput per megawatt.

[01:11:10] The new Vera Rubin platform. Seven chips, five rack-scale computers, one revolutionary AI supercomputer for agentic AI. 40 million times more compute in just 10 years.

[01:11:39] Now, in the good old days, when I would say Hopper, I would hold up a chip.

[01:11:48] That's just adorable.

[01:11:51] This is Vera Rubin. When we think Vera Rubin, we think the entire system: vertically integrated, complete with software, extended end to end, optimized as one giant system.

[01:12:12] The reason why it's designed for agentic systems is very clear, because for agents, of course, the most important workload is thinking, the large language model. The large language models are going to get larger and larger and larger. They're going to generate more and more tokens more quickly, so they can think more quickly. But they also have to access memory. They're going to pound on memory really hard: KV cache, structured data, unstructured data. They're going to be pounding on the storage system really, really hard, which is the reason why we reinvented the storage system.

[01:12:49] They're also going to use tools, and unlike humans, who are more tolerant of slower computers, AI wants the tools to be as fast as possible. These tools, web browsers, in the future could also be virtual PCs in the cloud. Those PCs, those computers, have to be as fast as possible.

[01:13:13] We created a brand-new CPU. A brand-new CPU that's designed for extremely high single-threaded performance, incredibly high data output, incredibly good at data processing, and extreme energy efficiency. It is the only data center CPU in the world that uses LPDDR5, with incredible single-thread performance and performance per watt that is unrivaled.

[01:13:48] And so we built that so it could go along with the rest of these racks for agentic processing. And so here it is. This is the Grace Blackwell. Oh no, Vera Rubin. Where is it? Here it is. Okay, so this is the Vera Rubin system. Notice, since the last time: 100% liquid cooled. All of the cables, gone. What used to take two days to install now takes two hours. Incredible. And so the manufacturing cycle time is going to dramatically reduce.

[01:14:25] This is also a supercomputer that is cooled by hot water, 45 degrees Celsius, which takes the pressure off of the data center. It takes all of that cost and all of that energy that's used to cool the data center and makes it available for the system.

[01:14:42] This is the secret sauce. We're the only company in the world that has today built a sixth-generation scale-up switching system. This is not Ethernet. This is not InfiniBand. This is NVLink, the sixth-generation NVLink. This is insanely hard to do. Well, it is insanely hard to do, period. And I'm just super proud of the team. NVLink, completely liquid cooled.

[01:15:09] This is the brand-new Groq system. And I'll show you a little bit more about this system.

[01:15:17] bit more about it. this system. Eight GU chips. This is the LP30. The

[01:15:21] Eight GU chips. This is the LP30. The world's never seen it. Anything that the

[01:15:22] world's never seen it. Anything that the world's ever seen is V1. This is third

[01:15:25] world's ever seen is V1. This is third generation.

[01:15:27] generation. And we're in volume production now. And

[01:15:30] And we're in volume production now. And I'll show you more about that in just a

[01:15:32] I'll show you more about that in just a second. The world's first

[01:15:35] second. The world's first CPO

[01:15:37] CPO Spectrum X switch. This is also in full

[01:15:41] Spectrum X switch. This is also in full production. Co-packaged optics. Optics

[01:15:45] production. Co-packaged optics. Optics comes directly onto this chip,

[01:15:47] comes directly onto this chip, interfaces directly to silicon.

[01:15:50] interfaces directly to silicon. Electrons gets translated to photons and

[01:15:53] Electrons gets translated to photons and it gets directly directly connected to

[01:15:56] it gets directly directly connected to this chip. We invented the process

[01:15:58] this chip. We invented the process technology with TSMC. We're the only one

[01:16:00] technology with TSMC. We're the only one in production with it today. It's called

[01:16:02] in production with it today. It's called coupe. It's completely revolutionary.

[01:16:04] coupe. It's completely revolutionary. Nvidia is in full production with

[01:16:07] Nvidia is in full production with Spectrum X.

[01:16:12] This is the Vera system. Twice the performance per watt of any CPU in the world today. It is also in production. Well, you know, we never thought we would be selling CPUs standalone. Um, we are selling a lot of CPUs standalone. This is already for sure going to be a multi-billion-dollar business for us. So, I'm very, very pleased with our CPU architects. We designed a revolutionary CPU.

[01:16:40] And this is the CX9, powered with the Vera CPU; the BlueField-4 STX, our new storage platform. Okay, so these are the four racks, and each one of these racks is connected: the NVLink rack. This is, I've shown you guys this before, it's super heavy, and it seems to get heavier every year, because I think there's just more cables in there every year. And so this is the NVLink rack.

[01:17:11] We've also taken this technology, because it is so efficient to create a data center with these cabling systems, structured cables, and we decided to do that for Ethernet. So, this is Ethernet: 256 liquid-cooled nodes in one rack. And it is also connected with these incredible connectors.

[01:17:43] You guys want to see, um, Rubin Ultra?

[01:18:01] So this is the Rubin Ultra compute node. Unlike Rubin, which slides in horizontally, Rubin Ultra goes into a whole new rack. It's called Kyber, and it enables us to connect 144 GPUs in one NVLink domain. And so the Kyber rack, this, I could lift it, I'm sure, but I won't. It's quite heavy. This is one compute node, and it slides into the Kyber rack vertically.

[01:18:38] This is where it connects into. This is the midplane. In the Kyber racks, those four top NVLink connectors slide in and connect into this. And this becomes one of the nodes. And each one of these racks is a different compute node.

[01:18:55] And this is the amazing part. This is the midplane. And on the back of the midplane, instead of the cabling system, which has its limits in terms of how far we can drive copper cables, we now have this system to connect 144 GPUs. This is the new NVLink. This also sits vertically, and it connects into the midplanes on the back. Compute in the front, NVLink switches in the back. One giant computer. Okay. So that is Rubin Ultra.

[01:19:46] As I mentioned.

[01:19:51] How about we take this back down? I need the rest of my slides.

[01:19:56] >> Oh, it's coming down.

[01:19:59] Okay. Thank you, Janine. This is what happens when you don't practice.

[01:20:11] Okay. All right. So, um, you saw...

[01:20:17] >> Take your time. Just don't get hurt.

[01:20:20] You saw this slide. You know, only at Nvidia's keynote will you see last year's slide presented again. And the reason for that is, I just want to let you know that last year I told you something very, very important. And it's so important, it's worthwhile to tell you again.

[01:20:37] This is probably the single most important chart for the future of AI factories. And every CEO, every CEO in the world, will be tracking it, will be studying it very deeply. It's much, much more complicated than this; it's multi-dimensional. But you will be studying the throughput and the token speed of your AI factories. The throughput and token speed at ISO power, because that's all the power you have. Throughput and token speed for your factories, forever. And that analysis is going to lead directly to your revenues. What you do this year will show up precisely next year as your revenues.

[01:21:21] year as your revenues. And this chart is what it's all about. And I said on the

[01:21:23] what it's all about. And I said on the vertical axis, on the vertical axis,

[01:21:25] vertical axis, on the vertical axis, thank you guys. On the vertical axis is

[01:21:28] thank you guys. On the vertical axis is throughput. On the horizontal axis is

[01:21:31] throughput. On the horizontal axis is token rate. Today I'm going to show you

[01:21:33] token rate. Today I'm going to show you this

[01:21:35] because we're now able to increase the token speed, and because model sizes are increasing, and because the context length, depending on the different grades of different application use cases, continues to grow, from maybe 100,000 tokens of input length to maybe millions. The token input length is growing, and the output token length is also growing. And so all of these play into, ultimately, the marketing and the pricing of future tokens.

[01:22:15] Tokens are the new commodity, and like all commodities, once it reaches an inflection, once it becomes mature or maturing, it will segment into different parts. The high-throughput, low-speed part could be used for the free tier. The next tier could be the medium tier: a larger model, maybe higher speed, and for sure a larger input context length. That translates to a different price point. You could see from all the different services: this one is free, it's a free tier. The first tier could be $3 per million tokens. The next tier could be $6 per million tokens.

[01:22:54] You would like to be able to keep pushing this boundary, because the larger the model, the smarter it is; the more input token context length, the more relevant it is; and the higher the speed, the more you can think and iterate. Smarter AI models. So this is about smarter AI models. And when you have smarter AI models, each one of these clicks allows you to increase the price. So this is $45. And maybe one day there'll be a premium model that allows you a premium service, one that allows you to generate token speeds that are incredibly high, because you're in a critical path, or maybe you're doing really long research, and $150 per million tokens is just not a thing.

[01:23:38] So let's translate that. Suppose you were to use 50 million tokens per day as a researcher, at $150 per million tokens. As it turns out, as a research team, that's not even a thing. So we believe that this is the future. This is where AI wants to go. This is where it is today.
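To put rough numbers on that back-of-the-envelope: the tier prices ($3, $6, $45, $150 per million tokens) and the 50-million-token daily usage are the figures from the talk; the cost arithmetic below is just a sketch of how they multiply out.

```python
# Rough cost arithmetic for the per-token pricing tiers mentioned in the talk.
# Prices are USD per million tokens; usage is tokens per day.

def daily_cost(tokens_per_day: float, price_per_million: float) -> float:
    """Dollar cost of a day's token usage at a given tier price."""
    return tokens_per_day / 1_000_000 * price_per_million

# A researcher consuming 50 million tokens/day at the premium tier:
premium = daily_cost(50_000_000, 150)
print(f"premium tier: ${premium:,.0f}/day")  # $7,500/day

# The same usage at the lower tiers, for comparison:
for price in (3, 6, 45):
    print(f"${price}/M tokens -> ${daily_cost(50_000_000, price):,.0f}/day")
```

At $7,500 a day per heavy user, the point stands: for a research team, even the premium tier "is not even a thing" relative to the value of the output.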

[01:23:57] It had to start here to establish the value, establish its usefulness, and get better and better and better. In the future, you're going to see most services encompass all of that. This is Hopper.

[01:24:10] Hopper started, and I moved the chart. This is 50. This is 100. Hopper looks like this. And you would have expected Hopper's next generation to be higher, but nobody would have expected it to be that much higher. This is Grace Blackwell. What Grace Blackwell did is, at your free tier, increase your throughput tremendously.

[01:24:31] However, where you mostly monetize your service, it increased your throughput by 35 times. This is no different than any product that every company makes. The higher the tier, the higher the quality, the higher the performance; the lower the volume, the lower the capacity. And so it is no different than any other business in the world. And so now we're able to increase this tier by 35x. And we introduced a whole new tier. This is the benefit of Grace Blackwell: a huge jump over Hopper.

[01:25:10] Blackwell. A huge jump over Hopper. Well, this is what we're doing with

[01:25:15] Okay. So, this is Grace Blackwell. Okay.

[01:25:19] Okay. So, this is Grace Blackwell. Okay. Let me just reset reset this.

[01:25:22] Let me just reset reset this. And this is Vera Rubin.

[01:25:25] And this is Vera Rubin. Okay.

[01:25:31] Now, just think what just happened at every single tier. At every single tier, we increased the throughput. And at the tier where your highest ASP and your most valuable segment are, we increased it by 10x. That is the hard work. This is incredibly hard to do out here. This is the benefit of NVL72. This is the benefit of extremely low latency. This is the benefit of extreme co-design, that we could shift the entire area up.

[01:26:03] Now, what does it mean from a customer perspective in the end? Suppose I were to take all of that and just, you know, multiply it out. Suppose I took 25% of my power and used it in the free tier, 25% of my power in the medium tier, 25% of my power in the high tier, and 25% of my power in the premium tier. My data center only has a gigawatt, and so I get to decide how I want to distribute it. The free tier allows me to attract more customers. This allows me to serve my most valuable customers. And the combination, the product of all of that, is basically your revenues. The revenues you can generate, assuming this simplistic example, allow Blackwell to generate five times more revenues.

[01:26:51] And Vera Rubin, to generate five times that. Yeah.
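The revenue model being sketched here can be written down directly: split a fixed power budget across pricing tiers and sum throughput times price. The 25% power split and the $/million-token prices are from the talk; the per-tier throughput numbers below are made-up placeholders, since the actual chart values weren't stated.

```python
# Toy model of the factory revenue mix: a fixed power budget is split
# across pricing tiers, and revenue is the sum of throughput x price.
# The 25/25/25/25 split and tier prices come from the talk; the
# throughput figures are invented placeholders for illustration.

POWER_SPLIT = {"free": 0.25, "medium": 0.25, "high": 0.25, "premium": 0.25}
PRICE = {"free": 0.0, "medium": 3.0, "high": 6.0, "premium": 45.0}  # $/M tokens

def revenue_per_second(tier_throughput: dict) -> float:
    """Revenue rate ($/s) given each tier's token throughput (tokens/s)."""
    return sum(
        POWER_SPLIT[t] * tier_throughput[t] / 1e6 * PRICE[t]
        for t in POWER_SPLIT
    )

# Hypothetical whole-factory throughputs per tier (tokens/sec):
hopper = {"free": 8e6, "medium": 4e6, "high": 1e6, "premium": 0.0}
rubin = {t: v * 10 for t, v in hopper.items()}  # e.g. a uniform 10x lift

print(revenue_per_second(hopper), revenue_per_second(rubin))
```

The point of the toy model is that a uniform lift in throughput at ISO power flows straight through to a proportional lift in revenue, and lifting the high-priced tiers matters most.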

[01:27:00] So if you're a Rubin customer, you should get there as soon as you can. And the reason for that is because your cost of tokens goes down and your throughput goes up. Now, but we want even more. We want even more. And so let me just show you. Back to this. As I told you, this throughput requires a ton of flops. This latency, this interactivity, requires an enormous amount of bandwidth. Computers don't like extreme amounts of flops plus extreme amounts of bandwidth, because there's only so much surface area for chips that any system has. And so optimizing for high throughput and optimizing for low latency are, in fact, enemies of each other.

[01:27:46] And so this is what happened when we combined with Groq. Okay. And so we acquired the team that worked on the Groq chips and licensed the technology, and we've been working together now to integrate the systems. This is what that looks like. So at the most valuable tier, at the most valuable tier, we're now going to increase performance by 35x.

[01:28:06] Now, this very simple chart reveals to you exactly the reason why Nvidia is so strong in the vast majority of the workloads so far. And the reason for that is because up in this area, throughput matters so much. NVLink 72 is so game-changing; it is exactly the right architecture, and it's even hard to beat, even as you add Groq to it. However, if you extended this chart way out here, and you said you wanted to have services that deliver not 400 tokens per second but a thousand tokens per second, all of a sudden NVLink 72 runs out of steam, and it simply can't get there. We just don't have enough bandwidth. And so this is where Groq comes in, and this is what happens when we push that out. So it goes out beyond... Thank you.

[01:29:05] It goes out beyond even the limits of what NVLink 72 can do. And if you were to do that, and translate that into revenues relative to Blackwell, Vera Rubin is 5x.

[01:29:19] If most of your workload is high throughput, I would stick with just 100% Vera Rubin. If a lot of your workload wants to be coding and very high-value engineering token generation, I would add Groq to it. I would add Groq to maybe 25% of my total data center; the rest of my data center is all 100% Vera Rubin. And so that gives you a sense of how you would add Groq to Vera Rubin and extend its performance, and extend its value, even more. This is what happens.

[01:29:55] This is a contrast. The reason why Groq was so attractive to me is because their computing system is a deterministic dataflow processor. It is statically compiled. It is compiler-scheduled, meaning the compiler figures out when to do the compute, and the compute and the data arrive at the same time. All of that is done statically, in advance, and scheduled completely in software. There's no dynamic scheduling.

[01:30:28] The architecture is designed with massive amounts of SRAM. It is designed just for inference, this one workload. Now, this one workload, as it turns out, is the workload of AI factories. And as the world continues to increase the amount of high-speed tokens it wants to generate, the amount of super-smart tokens it wants to generate, the value of this integration is going to get even higher.
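The "compiler-scheduled, no dynamic scheduling" idea can be caricatured in a few lines: a compile step fixes the cycle on which every op starts, so execution is just replaying a fixed table and total latency is known before the program runs. This is a toy cartoon of the concept, not Groq's actual toolchain.

```python
# Toy illustration of static scheduling: a "compiler" pass assigns every
# op a fixed start cycle ahead of time, so the "hardware" would simply
# execute a precomputed table with no runtime scheduler. A cartoon of the
# idea described in the talk, not Groq's real compiler.

def compile_schedule(ops):
    """Assign each (name, latency) op a fixed start cycle, back to back."""
    schedule, cycle = [], 0
    for name, latency in ops:
        schedule.append((cycle, name))  # start cycle fixed at compile time
        cycle += latency                # next op starts when this one ends
    return schedule, cycle              # total runtime known in advance

ops = [("load_weights", 2), ("matmul", 4), ("activation", 1), ("store", 1)]
schedule, total = compile_schedule(ops)
print(schedule)  # [(0, 'load_weights'), (2, 'matmul'), (6, 'activation'), (7, 'store')]
print(total)     # 8 -- deterministic: latency is known before execution
```

Determinism is the property that matters here: because nothing is decided at runtime, latency is fixed and predictable, which is exactly what low-latency token generation wants.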

[01:30:52] And so these are two extreme processors. You could see: one chip, 500 megabytes; one Vera Rubin chip, one Rubin chip, 288 gigabytes. It would take a lot of Groq chips to be able to hold the parameter size of Rubin, as well as all of the context, the KV cache, that has to go along with it. So that limited Groq's ability to really reach the mainstream, to really take off, until we had a great idea.
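The memory gap can be put in rough numbers. The 500 MB and 288 GB per-chip figures are from the talk; the 1-trillion-parameter model at 4-bit weights is an assumption of mine, just to show the scale of the difference.

```python
# Back-of-the-envelope on the memory gap described above. Per-chip
# capacities (~500 MB SRAM vs 288 GB) are the figures from the talk;
# the 1T-parameter model at 4-bit weights is an assumed example.

GROQ_SRAM_GB = 0.5          # ~500 MB of on-chip SRAM per chip
RUBIN_HBM_GB = 288          # per-chip memory quoted for Rubin

params = 1_000_000_000_000  # 1 trillion parameters (assumption)
bytes_per_param = 0.5       # 4-bit weights (assumption)
model_gb = params * bytes_per_param / 1e9  # 500 GB of weights

print(f"weights: {model_gb:.0f} GB")
print(f"SRAM-only chips needed: {model_gb / GROQ_SRAM_GB:.0f}")  # 1000
print(f"Rubin chips needed:     {model_gb / RUBIN_HBM_GB:.1f}")  # ~1.7
```

Hundreds of SRAM-only chips versus a couple of HBM chips for the same weights, before counting any KV cache: that is the sizing problem the disaggregation idea below is designed around.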

[01:31:25] What if we disaggregated inference altogether, with a piece of software called Dynamo? What if we rearchitected the way that inference is done in the pipeline, so that we could put the work that makes perfect sense on Vera Rubin, and then offload the decode generation, the low-latency, bandwidth-limited, challenged part of the workload, to Groq? And so we united, unified, two processors of extreme differences: one for high throughput, one for low latency.

[01:31:57] It still doesn't change the fact that we need a lot of memory, and so with Groq, we're just going to add a whole bunch of Groq chips, which expands the amount of memory it has. And so if you could just imagine: out of a trillion-parameter model, we have to store all of that in Groq chips. However, it sits next to Nvidia Vera Rubin, where we could hold the massive amounts of KV cache that's necessary in processing all of these agentic AI systems.

[01:32:31] It's based upon this idea of disaggregated inference. We do the prefill, that's the easy part, but we also tightly integrate the decode. So the attention part of decode is done on Nvidia's Vera Rubin, which needs a lot of math, and the feed-forward network part of it, the decode part, the token generation part, is done on the Groq chip, the two of them working tightly coupled together over (today) Ethernet, with a special mode to reduce its latency by about half. And so that capability allows us to integrate these two systems. We run Dynamo, this incredible operating system for AI factories, on top of it. And you get a 35 times increase. 35 times increase. Not to mention additional new tiers of inference performance for token generation the world's never seen. So this is it. This is Groq.
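The split just described can be sketched as a routing table: prefill and the attention half of decode go to the flops-and-KV-cache pool, the bandwidth-bound feed-forward half of decode goes to the SRAM pool. The pool names, routing table, and function shape are all invented for illustration; this is not the Dynamo API.

```python
# Cartoon of disaggregated inference as described in the talk: route each
# pipeline stage to the hardware pool it suits. Pool names and the routing
# table are invented for illustration; this is not the Dynamo interface.

ROUTING = {
    "prefill":     "rubin",  # flops-heavy: high-throughput GPUs
    "decode_attn": "rubin",  # attention math + large KV cache live here
    "decode_ffn":  "groq",   # bandwidth-bound generation: SRAM chips
}

def run_request(prompt_tokens: int, output_tokens: int):
    """Yield (stage, device) pairs forming one request's execution plan."""
    yield "prefill", ROUTING["prefill"]     # process the whole prompt once
    for _ in range(output_tokens):          # one iteration per output token
        yield "decode_attn", ROUTING["decode_attn"]
        yield "decode_ffn", ROUTING["decode_ffn"]

plan = list(run_request(prompt_tokens=100_000, output_tokens=3))
print(plan[0])    # ('prefill', 'rubin')
print(plan[1:3])  # one decode step: attention on rubin, FFN on groq
```

Each generated token bounces between the two pools, which is why the talk stresses the low-latency Ethernet mode coupling them.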

[01:33:38] ...the Vera Rubin systems, including Groq. I want to thank Samsung, who manufactures the Groq LP30 chip for us, and they're cranking as hard as they can. I really appreciate you guys. We're in production with the Groq chip, and, you know, we'll ship it in the second half, probably about the Q3 timeframe. Okay.

[01:34:01] Groq LPX.

[01:34:09] Vera Rubin, you know, it's kind of hard to imagine any more customers. And the really great thing is, the early sampling of Grace Blackwell was really complicated because of the coming together of NVLink 72, but the sampling of Vera Rubin is just going incredibly well. And in fact, Satya, I think, texted out already that the first Vera Rubin rack is already up and running at Microsoft Azure, and so I'm super excited for them. We're just going to keep cranking these things out.

[01:34:42] We have now set up a supply chain that could manufacture thousands of these systems a week, essentially multiple gigawatts of AI factories per month inside our supply chain. And so we're going to crank out these Vera Rubin racks while we're cranking out the GB300 racks. We are in full production.

[01:35:04] The Vera CPUs: incredibly successful. And the reason for that is because AI needs CPUs for tool use, and the Vera CPU was designed just perfectly for that sweet spot. Incredible for the next generation of data processing; the Vera CPU is ideal. The Vera CPU, plus BlueField, plus CX9, connected into the BlueField-4 stack: 100% of the world's storage industry is joining us on this system. And the reason for that is because they see exactly the same thing. The storage system is going to get pounded. It's going to get pounded because we used to have humans using the storage systems; we used to have humans using SQL. Now we're going to have AIs using these storage systems, and they're going to store cuDF-accelerated storage, cuVS-accelerated storage, as well as, very importantly, KV caching. Okay, so this is the Vera Rubin system.

[01:36:06] Now, what's amazing is this: in just two years' time, in a one-gigawatt factory, using the mathematics that I showed you earlier, whereas Moore's law would have given us a couple of steps (we would have, you know, X-factored the number of transistors, X-factored the number of flops, X-factored the amount of bandwidth), with this architecture we're going to take our token generation speed, our token generation rate, from 2 million to 700 million. A 350-times increase.

[01:36:53] This is the power of extreme co-design. This is what I mean when we integrate and optimize vertically, but then open it horizontally for everybody to enjoy. This is our roadmap, very quickly.

[01:37:09] Blackwell is here, the Oberon system. In the case of Rubin, we have the Oberon system. We're always backwards compatible, so that if you wanted to not change anything and just keep on moving through with the new architecture, you could do so. The standard rack system, Oberon, is still available. Oberon is copper scale-up. And with Oberon, we could also use optical scale-out, or excuse me, optical scale-up, to expand to NVLink 576. Okay. And so there's a lot of conversation about whether Nvidia is going to do copper scale-up or optical scale-up. We're going to do both. So we're going to have NVLink 144 with Kyber, and then with Oberon, we're going to do NVLink 72 plus optical to get to NVLink 576.

[01:38:08] The next generation of Rubin is Rubin Ultra. We have the Rubin Ultra chip, which is coming, which is taping out, and we have a brand new chip, LP35. LP35 will, for the first time, incorporate Nvidia's NVFP4 computing structure, to give you another few-X-factor speedup. Okay. And so this is Oberon, NVLink 72, optical scale-up, and it uses Spectrum 6, the world's first co-packaged optics. And all of this is in production.

[01:38:46] The next generation from here is Feynman. Feynman has a new GPU, of course. It also has a new LPU, LP40. Big step up. Incredible new technology, now uniting the scale of NVIDIA and the Groq team, building LP40 together. It's going to be incredible. A brand new CPU called Rosa, short for Rosalind. BlueField-5, which connects the next CPU with the next SuperNIC, CX10.

[01:39:26] We will have Kyber, which is copper scale up. We will also have Kyber CPO scale up. So for the first time we will scale up with both copper and co-packaged optics. Okay. And so a lot of people have been asking, you know, "Jensen, is copper still going to be important?" The answer is yes. "Jensen, are you going to scale up optical?" Yes. "Are you going to scale out optical?" Yes.

[01:40:07] And so for everybody who is in our ecosystem, we need a lot more capacity, and that's really the key. We need a lot more capacity for copper. We need a lot more capacity for optics. We need a lot more capacity for CPO. And that's the reason why we've been working with all of you to lay the foundation for this level of growth. And so Feynman will have all of that. Let me see if I missed anything. That's it. Every single year, a brand new architecture.

[01:40:43] Very quickly, NVIDIA went from a chip company to an AI factory company, or an AI infrastructure company, an AI computing company. These systems... and now we're building entire AI factories. There's so much power that is squandered in these AI factories. We want to make sure that these AI factories come together designed in the best possible way. Most of these components never meet each other. Most of us technology vendors now all know each other, but in the past we never met each other until the data center. That can't happen. We're building super complex systems, and so we have to meet each other virtually, somewhere else.

[01:41:28] And so we created Omniverse and the Omniverse DSX world, a platform where all of us can meet and design these gigafactories, the gigawatt AI factories, virtually, in system. We have simulation systems for the racks: for mechanical, thermal, electrical, networking. Those simulation systems are integrated into all of our ecosystem partners' incredible tools. We also operate it connected to the grid, so that we can interact with each other and send each other information, so that we can adjust grid power and data center power accordingly, saving energy. And then inside the data center, using Max Q, we can adjust the system dynamically across power and cooling and all of the different technologies we all work on together, so that we leave no power squandered and we run at the most optimal rate to deliver an enormous amount of token throughput.

[01:42:30] There's no question in my mind there's a factor of two in here, and a factor of two at the scale we're talking about is gigantic. We call this the NVIDIA DSX platform. And just as with all of our platforms, there's the hardware layer, there's the library layer, and there's the ecosystem layer. It's exactly the same way. Let's show it to you.

[01:42:56] The greatest infrastructure buildout in history is underway. The world is racing to build chips, systems, and AI factories, and every month of delay costs billions in lost revenue. AI factory revenues are equal to tokens per watt. So with power constraints, every unused watt is revenue lost. NVIDIA DSX is an Omniverse digital twin blueprint for designing and operating AI factories for maximum token throughput, resilience, and energy efficiency. Developers connect through several APIs: DSX Sim for physical, electrical, thermal, and network simulation; DSX Exchange for AI factory operational data; DSX Flex for secure dynamic power management between the grid and the data center; and DSX Max Q to dynamically maximize token throughput. It starts with sim-ready assets from NVIDIA and equipment manufacturers, managed by PTC Windchill PLM. Then model-based systems engineering is done in Dassault Systèmes 3DEXPERIENCE.

[01:44:12] Jacobs brings the data into their custom Omniverse app to finalize the design. It's tested with leading simulation tools: Siemens Simcenter STAR-CCM+ for external thermals, Cadence Reality for internal thermals, ETAP for electrical, and NVIDIA's network simulator, DSX Air. It's virtually commissioned through Procore to ensure accelerated construction time. When the site goes live, the digital twin becomes the operator. AI agents work with DSX Max Q to dynamically orchestrate infrastructure. Phaidra's agents oversee cooling and electrical systems, sending signals to Max Q, which continuously optimizes compute throughput and energy efficiency. Emerald AI agents interpret live grid demand and stress signals and adjust power dynamically. With DSX, NVIDIA and our ecosystem of partners are racing to build AI infrastructure around the world, ensuring extreme resiliency, efficiency, and throughput.

[01:45:31] It's incredible, right? Well, Omniverse was designed to hold the world's digital twin, starting from the Earth, and it's going to hold digital twins of all sizes. And so we have just such a great ecosystem of partners. I want to thank all of you. All of these companies are brand new to our world; we didn't know many of you just a couple of years ago. And now we're working so closely together to build the largest computer the world's ever seen, and to do it at planetary scale. So NVIDIA DSX is our new AI factory platform.

[01:46:17] I'll spend very little time on this at this time. However, we're going to space. We've already been out in space: Thor is radiation-qualified, and we're in satellites; you do imaging from satellites. In the future, we'll also build data centers in space. Obviously very complicated to do. So we're working with our partners on a new computer called Vera Rubin Space 1, and it's going to go out and start data centers out in space. Now, of course, in space there's no conduction, there's no convection, there's just radiation. And so we have to figure out how to cool these systems out in space. But we've got lots of great engineers working on it. Let me talk to you about something new.

[01:47:10] So, Peter Steinberger is here, and he wrote a piece of software. It's called OpenClaw, and I don't know if he realized how successful it was going to be. But the importance is profound. OpenClaw is the number one. It's the most popular open-source project in the history of humanity, and it did so in just a few weeks. It exceeded what Linux did in 30 years. And it's that important. It is that important. It will do well.

[01:47:55] This is all you do. Okay? We're announcing our support of it. Let me just quickly go through this. I want to show you a couple of things. You simply type this into a console, and it goes out, it finds OpenClaw, it downloads it, it builds you an AI agent, and then you can tell it whatever else you need it to do. Okay, so let's take a look.

[01:49:53] An open source project just dropped.
>> Andrej Karpathy has just launched something called research. It's a huge deal.
>> You give an AI agent a task, go to sleep, and it runs 100 experiments overnight, keeping what works and killing what doesn't.
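
The loop described in the clip, run many experiments overnight and keep only what works, is essentially random search with greedy selection. A toy sketch of that keep-or-kill pattern, unrelated to the actual project being discussed:

```python
import random

# Illustrative keep-or-kill experiment loop: run N trials, keep the best.
def run_experiment(params: dict) -> float:
    """Stand-in for a real training run; returns a score to maximize."""
    x = params["x"]
    return -(x - 3.0) ** 2  # toy objective, best at x = 3

def overnight_search(n_trials: int = 100, seed: int = 0) -> dict:
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        candidate = {"x": rng.uniform(-10, 10)}
        score = run_experiment(candidate)
        if score > best_score:          # keep what works
            best_params, best_score = candidate, score
        # otherwise the candidate is discarded (killed)
    return best_params

best = overnight_search()
print(best)  # x should land near 3
```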

[01:50:16] I really love what my stuff enables that person to do. One guy told me he installed it as a 60-year-old dad, and they made beer, connected the machine via Bluetooth to OpenClaw, and then automated everything, including the whole website for people to order lobster.
>> Hundreds of people are queuing up for lobsters thanks to OpenClaw.
>> OpenClaw.
>> You want to build OpenClaw with OpenClaw.
>> Everyone is talking about OpenClaw. But what is OpenClaw?
>> Believe it or not, there's already a ClawCon.

[01:51:07] Incredible. Incredible. Now, I've illustrated effectively what OpenClaw is in this way so all of you can understand it. But let's just think about what happened. What is OpenClaw? It's a system. It calls and connects to large language models. So the first thing: it has resources that it manages. It can access tools. It can access file systems. It can access large language models. It's able to do scheduling. It's able to do cron jobs. It's able to decompose a prompt that you gave it into step-by-step tasks. It can spawn off and call upon other sub-agents. It has IO. You can talk to it in any modality you want. You can wave at it and it understands you. It sends you messages, it texts you, it sends you email. So it's got IO.
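
The inventory above (managed resources, tool and file-system access, LLM connections, scheduling, task decomposition, sub-agents, IO) can be pictured as a very small skeleton. The sketch below is purely illustrative; the names and structure are invented and are not OpenClaw's actual code:

```python
# Hypothetical agent-system skeleton, illustrating the pieces listed above.
# This is NOT OpenClaw's real code; names and structure are invented.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    llm: Callable[[str], str]                                 # connects to an LLM
    tools: dict[str, Callable] = field(default_factory=dict)  # managed resources/tools
    subagents: list["Agent"] = field(default_factory=list)    # spawnable sub-agents

    def decompose(self, prompt: str) -> list[str]:
        # Ask the model to break the task into steps, step by step
        return self.llm(f"List the steps for: {prompt}").splitlines()

    def run(self, prompt: str) -> list[str]:
        results = []
        for step in self.decompose(prompt):
            if step in self.tools:            # tool / file-system access
                results.append(str(self.tools[step]()))
            else:                             # answer directly via the model
                results.append(self.llm(step))
        return results

# Usage with a fake LLM so the sketch is runnable:
fake_llm = lambda p: "check_disk\nsummarize" if p.startswith("List") else f"done: {p}"
agent = Agent(llm=fake_llm, tools={"check_disk": lambda: "disk ok"})
print(agent.run("tidy my machine"))  # ['disk ok', 'done: summarize']
```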

[01:52:07] What else does it have? Well, based on that, you could say it's in fact an operating system. I've just used the same syntax that I would use to describe an operating system. OpenClaw has open sourced essentially the operating system of agent computers. It is no different than how Windows made it possible for us to create personal computers. Now OpenClaw has made it possible for us to create personal agents.

[01:52:41] possible for us to create personal agents.

[01:52:42] agents. The implication is incredible. The

[01:52:45] The implication is incredible. The implication is incredible. First of all,

[01:52:47] implication is incredible. First of all, the adoption says something you know all

[01:52:50] the adoption says something you know all in itself. However, the most important

[01:52:53] in itself. However, the most important thing is this. Every single company now

[01:52:54] thing is this. Every single company now realize every single company, every

[01:52:56] realize every single company, every single software company, every single

[01:52:58] single software company, every single technology company for the CEOs, the

[01:53:01] technology company for the CEOs, the question is what's your open claw

[01:53:02] question is what's your open claw strategy?

[01:53:04] strategy? Just as we need to all have a Linux

[01:53:06] Just as we need to all have a Linux strategy, we all needed to have a HTTP

[01:53:09] strategy, we all needed to have a HTTP HTML strategy which started the

[01:53:12] HTML strategy which started the internet. We all needed to have a

[01:53:13] internet. We all needed to have a Kubernetes strategy which made it

[01:53:15] Kubernetes strategy which made it possible for mobile cloud to happen.

[01:53:17] possible for mobile cloud to happen. Every company in the world today needs

[01:53:20] Every company in the world today needs to have an open claw strategy and a

[01:53:22] to have an open claw strategy and a gentic system strategy. This is the new

[01:53:25] gentic system strategy. This is the new computer. Now this is just the exciting

[01:53:28] computer. Now this is just the exciting part. This is enterprise IT before

[01:53:31] part. This is enterprise IT before openclaw you know and and I mentioned

[01:53:34] openclaw you know and and I mentioned earlier the way enterprise IT works and

[01:53:37] earlier the way enterprise IT works and the the reason these reason why it's

[01:53:38] the the reason these reason why it's called data centers is because these

[01:53:40] called data centers is because these large rooms these large buildings held

[01:53:42] large rooms these large buildings held data held the files of people the

[01:53:46] data held the files of people the structured data of business. It would

[01:53:48] structured data of business. It would pass through software that has tools and

[01:53:51] pass through software that has tools and you know systems of records and all

[01:53:53] you know systems of records and all kinds of workflow that's codified into

[01:53:56] kinds of workflow that's codified into it and that turns into tools that humans

[01:53:58] it and that turns into tools that humans would use

[01:54:00] would use digital workers would use. That is the

[01:54:02] digital workers would use. That is the old IT industry software companies

[01:54:05] old IT industry software companies creating tools saving files and of

[01:54:09] creating tools saving files and of course gsis consultants that help

[01:54:11] course gsis consultants that help companies figure out how to use these

[01:54:12] companies figure out how to use these tools and integrate these tools. These

[01:54:14] tools and integrate these tools. These in these tools are incredibly valuable

[01:54:16] in these tools are incredibly valuable for governance and security and privacy

[01:54:19] for governance and security and privacy and compliance and all of that's

[01:54:21] and compliance and all of that's continues to be true.

[01:54:23] continues to be true. It's just that post open clock post

[01:54:26] It's just that post open clock post agentic this is what it's going to look

[01:54:28] agentic this is what it's going to look like. This is the extraordinary part.

[01:54:31] like. This is the extraordinary part. Every single IT company, every single

[01:54:34] Every single IT company, every single company, every SAS company,

[01:54:37] company, every SAS company, every SAS company will become a

[01:54:42] every SAS company will become a a gas company.

[01:54:45] a gas company. No question about it. Every single SAS

[01:54:48] No question about it. Every single SAS company will become a gas company, an

[01:54:49] company will become a gas company, an agentic as a service company. And what's

[01:54:52] agentic as a service company. And what's amazing is this. You now open claw gave

[01:54:55] amazing is this. You now open claw gave us gave the industry exactly what it

[01:54:58] us gave the industry exactly what it needed at exactly the time.

[01:55:01] needed at exactly the time. Just as Linux gave the industry exactly

[01:55:05] Just as Linux gave the industry exactly what it needed at exactly the time just

[01:55:07] what it needed at exactly the time just as Kubernetes showed up at exactly the

[01:55:09] as Kubernetes showed up at exactly the right time just as HTML showed up it

[01:55:12] right time just as HTML showed up it made it possible for the entire industry

[01:55:14] made it possible for the entire industry to grab onto this open-source stack and

[01:55:18] to grab onto this open-source stack and go do something with it. There's just

[01:55:19] go do something with it. There's just one catch.

[01:55:21] Agentic systems in the corporate network can have access to sensitive information. They can execute code, and they can communicate externally. Just say that out loud. Okay, think about it: access sensitive information, execute code, communicate externally. It could of course access employee information, access supply chain and finance information, sensitive information, and send it out, communicate externally. Obviously, this can't possibly be allowed.

[01:55:55] And so what we did was we worked with Peter. We took some of the world's best security and computing experts, and we worked with Peter to make OpenClaw enterprise-secure and enterprise-privacy capable. And we call that NeMo Claw, our NVIDIA reference design for OpenClaw. It has all these agentic AI toolkits, and the first part of it is technology we call Open Shell, which has now been integrated into OpenClaw. Now it's enterprise-ready, this stack, with a reference design we call NeMo Claw. You can download it, play with it, and you can connect to it the policy engines of all of the SaaS companies in the world. Your policy engines are super important, super valuable. So the policy engines can be connected, and NeMo Claw, or OpenClaw with Open Shell, would be able to execute that policy engine. It has a policy guardrail. It has a privacy router. And as a result we can protect the enterprise, let the claws execute inside our company, and do it safely.
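
The control flow being described, a policy engine consulted before any action runs plus a privacy router scrubbing outbound data, can be sketched in a few lines. Everything here is hypothetical; it is not the real Open Shell or NeMo Claw interface:

```python
import re

# Hypothetical policy-guardrail wrapper. NOT the real Open Shell / NeMo Claw API.
# Every agent action is checked against a policy engine before it executes,
# and outbound text passes through a privacy router that redacts sensitive data.

BLOCKED_ACTIONS = {"read_payroll", "exfiltrate"}   # stand-in policy engine rules
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def policy_allows(action: str) -> bool:
    return action not in BLOCKED_ACTIONS

def privacy_router(text: str) -> str:
    """Redact email addresses before anything leaves the company."""
    return EMAIL.sub("[REDACTED]", text)

def guarded_execute(action: str, payload: str) -> str:
    if not policy_allows(action):
        return f"denied: {action}"
    return privacy_router(payload)

print(guarded_execute("read_payroll", "salaries..."))          # denied: read_payroll
print(guarded_execute("send_report", "contact bob@corp.com"))  # contact [REDACTED]
```

A real deployment would of course delegate both checks to the connected SaaS policy engines rather than hard-coded rules; the point is only the shape of the wrapper.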

[01:57:25] We also added several things to the agent system, and one of the most important things you want to do with your own custom claws is have your custom models. And this is NVIDIA's open model initiative. We are now at the frontier of every single domain of AI models: whether it's Nemotron; Cosmos, the world foundation model; GR00T for generalist robotics, humanoid robot models; Alpamayo for autonomous vehicles; BioNeMo for digital biology; Earth-2 for AI physics. We are at the frontier on every single one. Take a look.

[01:58:09] The world is diverse. No single model can serve every industry. Open models form one of the largest and most diverse AI ecosystems in the world: nearly 3 million open models across language, vision, biology, physics, and autonomous systems enable AI built for specialized domains. NVIDIA is one of the largest contributors to open-source AI. We build and release six families of open frontier models, plus the training data, recipes, and frameworks to help developers customize and adapt them. New leaderboard-topping models are launching for every family. At the core, Nemotron: reasoning models for language, visual understanding, RAG, safety, and speech.
>> Can you hear me now? Hello. Yes, I can hear you now.
>> Cosmos: frontier models for physical AI world generation and understanding. Alpamayo: the world's first thinking and reasoning autonomous vehicle AI. GR00T: foundation models for general-purpose robots. BioNeMo: open models for biology, chemistry, and molecular design. Earth-2: models for weather and climate forecasting rooted in AI physics. NVIDIA open models give researchers and developers the foundation to build and deploy AI for their own specialized domains.

[01:59:46] Our models... thank you. Our models are valuable to all of you because, number one, they're at the top of the leaderboard; they're world class. But most importantly, it's because we are not going to give up working on them. We're going to keep on working on them every single day. Nemotron 3 is going to be followed by Nemotron 4. Cosmos 1 was followed by Cosmos 2. GR00T is at generation 2. Each and every one of these models we're going to continue to advance: vertical integration, horizontal openness, so that we can enable everybody to join the AI revolution. Number one on the leaderboard across research and voice and world models and artificial general robotics and self-driving cars and reasoning, and of course, one of the most important ones: this is Nemotron 3 in OpenClaw. And look at the top three; these are the three best models in the world. Okay, so we are at the frontier.

[02:01:00] It is also true that we want to create the foundation model so that all of you can fine-tune it and post-train it into exactly the intelligence you need. This is Nemotron 3 Ultra. It is going to be the best base model the world has ever created. This allows us to help every country build their sovereign AI. We're working with so many different companies out there, and one of the most exciting things I'm announcing today is the Nemotron coalition.

[02:01:35] We are so dedicated to this. We have invested billions of dollars in AI infrastructure so that we could develop the core engines for AI that are necessary for all the libraries of inference and so on, but also to create the AI models to activate every single industry in the world. Large language models are really important. Of course they're important; how could human intelligence not be? However, in different industries around the world, in different countries around the world, you need the ability to customize your own models, and the domains of the models are radically different, from biology to physics to self-driving cars to general robotics to, of course, human language. We have the ability to work with every single region to create their domain-specific, their sovereign AI. Today we're announcing a coalition to partner with us to make Nemotron 4 even more amazing.

[02:02:33] And that coalition has some amazing companies in it. Black Forest Labs, the imaging company. Cursor, the famous coding company; we use lots of it. LangChain, with a billion downloads, for creating custom agents. Mistral; Arthur, I think he's here. Incredible, incredible company. Perplexity, and Perplexity's Comet: absolutely use it, everybody, use it. It is so good, a multimodal agentic system. Reflection. Sarvam from India. Thinking Machines, Mira Murati's lab. Incredible companies joining us. Thank you.

[02:03:14] I said that every single enterprise company, every single software company in the world needs agentic systems, needs an agent strategy; you need to have an OpenClaw strategy. And they all agree, and they're all partnering with us to integrate NeMo Claw: the NeMo Claw reference design, the NVIDIA agentic AI toolkit, and of course all of our open models. One company after another; there are so many. And we're partnering with all of you; I'm really grateful for that. This is our moment. This is a reinvention. This is a renaissance, a renaissance of enterprise IT. From what would be a $2 trillion industry, this is going to become a multi-trillion-dollar industry, offering not just tools for people to use, but agents that are specialized in the very special domains that you're expert in, agents that we could rent. I could totally imagine that in the future, every single engineer in our company will need an annual token budget.

[02:04:25] They're going to make a few hundred thousand a year as their base pay, and I'm going to give them probably half of that on top of it as tokens, so that they can be amplified 10x. Of course we would. It is now one of the recruiting tools in Silicon Valley: how many tokens come along with my job? And the reason for that is very clear: every engineer that has access to tokens will be more productive, and those tokens, as you know, will be produced by AI factories that all of you and us partner to build.
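As a back-of-the-envelope sketch of the token-budget idea above: every number below is an illustrative assumption (base pay, token price, working days), not a figure from the keynote beyond "a few hundred thousand base, half of that again in tokens."

```python
# Hypothetical token-budget arithmetic; all prices and salaries are assumed.
base_pay = 300_000                       # assumed "few hundred thousand" base pay, USD
token_budget = base_pay // 2             # "probably half of that on top of it as tokens"
price_per_million_tokens = 5.00          # assumed blended $ per 1M tokens

tokens_per_year = int(token_budget / price_per_million_tokens * 1_000_000)
tokens_per_workday = tokens_per_year // 250   # roughly 250 working days per year

print(f"budget: ${token_budget:,}")           # budget: $150,000
print(f"tokens/year: {tokens_per_year:,}")    # tokens/year: 30,000,000,000
print(f"tokens/day:  {tokens_per_workday:,}") # tokens/day:  120,000,000
```

Under these assumptions, a $150k token budget buys on the order of tens of billions of tokens a year; the real figure scales linearly with the assumed token price.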

[02:05:04] partner to build. Okay. So every single enterprise company in today sit on top

[02:05:07] enterprise company in today sit on top of file systems and data centers. Every

[02:05:10] of file systems and data centers. Every single software company of the future

[02:05:12] single software company of the future will be agentic and they will be token

[02:05:15] will be agentic and they will be token manufacturers. They'll be token users

[02:05:17] manufacturers. They'll be token users for their engineers and they'll be token

[02:05:19] for their engineers and they'll be token manufacturers for all of their

[02:05:21] manufacturers for all of their customers. The open clause in event, the

[02:05:25] customers. The open clause in event, the open claw event cannot be understated.

[02:05:28] open claw event cannot be understated. This is as big of a deal as HTML. This

[02:05:31] This is as big of a deal as HTML. This is as big of a deal as Linux. We have

[02:05:34] is as big of a deal as Linux. We have now a world-class open agentic framework

[02:05:39] now a world-class open agentic framework that all of us could use to build our

[02:05:42] that all of us could use to build our open claw strategy. And we've created a

[02:05:45] open claw strategy. And we've created a reference design we call Nemo cloud

[02:05:47] reference design we call Nemo cloud neoclaw that all of you could use that

[02:05:50] neoclaw that all of you could use that is optimized. It's performant. It is

[02:05:54] is optimized. It's performant. It is safe and secure.

[02:06:04] Speaking of agents: agents, as you know, perceive, reason, and act. Most of the agents I've spoken about today are digital agents. They act in the digital world; they reason; they write software; it's all digital. But we have also been working on physically embodied agents for a long time. We call them robots, and the AIs that they need are physical AIs. We have some big announcements here; I'm going to walk through just a few of them. There are 110 robots here. Almost every single company in the world that is building robots is working with NVIDIA; I can't think of one that isn't. We have three computers: the training computer; the synthetic data generation and simulation computer; and of course the robotics computer that sits inside the robot itself. We have all the software stacks necessary to do so, and the AI models to help you.
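The perceive, reason, act cycle described above can be sketched as a minimal loop. This is a toy illustration only, not NVIDIA's agent stack or any real API; every function and value here is hypothetical.

```python
# Minimal perceive -> reason -> act loop for a toy thermostat-style agent.
# Purely illustrative; names and dynamics are made up for this sketch.
def perceive(world):
    return {"temp": world["temp"]}              # read an observation from the environment

def reason(obs, target=21.0):
    error = target - obs["temp"]                # decide a correction toward the target
    return {"heat": max(0.0, min(1.0, 0.5 * error))}

def act(world, action):
    world["temp"] += 2.0 * action["heat"] - 0.5  # heating raises temp; world cools slowly
    return world

world = {"temp": 15.0}
for _ in range(20):                              # the agent loop: sense, decide, actuate
    world = act(world, reason(perceive(world)))
print(round(world["temp"], 1))                   # settles near the target: 20.5
```

The same skeleton applies whether the "act" step emits text in a digital world or motor commands in a physical one; only the perception and actuation change.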

[02:07:00] And all of this is integrated into ecosystems around the world with all of our partners, from Siemens to Cadence, incredible partners everywhere. And today, we're announcing a whole bunch of new partners. As you know, we've been working on self-driving cars for a long time. The ChatGPT moment of self-driving cars has arrived. We now know we can successfully, autonomously drive cars. And today we are announcing four new partners for NVIDIA's robotaxi-ready platform.

[02:07:33] BYD, Hyundai, Nissan, and Ji: all together, 18 million cars built each year, joining our partners from before, Mercedes, Toyota, and GM. The number of robotaxi-ready cars in the future is going to be incredible. And we're also announcing a big partnership with Uber: in multiple cities, we're going to be deploying and connecting these robotaxi-ready vehicles into their network.

[02:08:08] ready vehicles into their network. And so a whole bunch of new cars. We have uh

[02:08:11] so a whole bunch of new cars. We have uh ABB, Universal Robotics, uh CUKA, so

[02:08:15] ABB, Universal Robotics, uh CUKA, so many robotics companies here and we're

[02:08:17] many robotics companies here and we're working with them to implement our

[02:08:20] working with them to implement our physical AI models integrated into

[02:08:22] physical AI models integrated into simulation system so that we could

[02:08:24] simulation system so that we could deploy these robots into manufacturing

[02:08:27] deploy these robots into manufacturing lines all over. We have Caterpillar

[02:08:29] lines all over. We have Caterpillar here. We even have T-Mobile here. And

[02:08:32] here. We even have T-Mobile here. And the reason for that is in the future

[02:08:34] the reason for that is in the future that radio radio tower used to be a

[02:08:37] that radio radio tower used to be a radio tower is going to be an NVIDIA

[02:08:40] radio tower is going to be an NVIDIA aerial AI ram. And so this is going to

[02:08:43] aerial AI ram. And so this is going to be a robotics radio tower. Meaning it

[02:08:46] be a robotics radio tower. Meaning it can reason about the traffic, figures

[02:08:48] can reason about the traffic, figures out how to adjust its beam forming so

[02:08:50] out how to adjust its beam forming so that it could save as much energy as

[02:08:52] that it could save as much energy as possible and increase the amount of

[02:08:55] possible and increase the amount of fidelity as possible. There's so many

[02:08:57] fidelity as possible. There's so many humanoid robots here, but one of my

[02:09:00] humanoid robots here, but one of my favorites, one of my favorites is a

[02:09:04] favorites, one of my favorites is a Disney robot. You know what? Tell you

[02:09:06] Disney robot. You know what? Tell you what, let me just show you some of the

[02:09:08] what, let me just show you some of the videos. Let's look at that first.

[02:09:18] The first global rollout of physical AI at scale is here: autonomous vehicles. And with NVIDIA Alpamayo, vehicles now have reasoning, helping them operate safely and intelligently across scenarios. We asked the car to narrate its actions.

[02:09:38] >> I'm changing lanes to the right to follow my route.

[02:09:42] ...explain its thinking as it makes decisions...

[02:09:46] >> There's a double-parked vehicle in my lane. I'm going around it.

[02:09:52] ...and follow instructions.

[02:09:53] >> Hey, Mercedes, can you speed up?

[02:09:57] >> Sure, I'll speed up.

[02:10:02] >> This is the age of physical AI and robotics. Around the world, developers are building robots of every kind. But the real world is massively diverse, unpredictable, and full of edge cases; real-world data will never be enough to train for every scenario. We need data generated from AI and simulation. For robots, compute is data. Developers pre-train world foundation models on internet-scale video and human demonstrations, and evaluate the models' performance to prepare them for post-training. Using classical and neural simulation, they generate massive amounts of synthetic data and train policies at scale. To accelerate developers, NVIDIA built open-source Isaac Lab for robot training, evaluation, and simulation; Newton for extensible, GPU-accelerated, differentiable physics simulation; Cosmos world models for neural simulation; and GR00T, open robotics foundation models for robot reasoning and action generation. With enough compute, developers everywhere are closing the physical AI data gap.
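The pipeline the narration describes (simulate with randomized physics, roll out a policy, improve it on the synthetic experience) can be sketched in miniature. This toy random-search loop is illustrative only; it does not use the real Isaac Lab, Newton, or Cosmos APIs, and the 1-D "robot" and its dynamics are invented for the sketch.

```python
import random

# Toy sim-based policy training with domain randomization.
# A 1-D "robot" at position x applies action a = -k * x; each dynamics
# step uses a randomized timestep dt, standing in for randomized physics.
def rollout(k, rng, steps=30):
    x, total = 1.0, 0.0
    for _ in range(steps):
        dt = rng.uniform(0.05, 0.15)     # domain randomization: vary the physics
        x += dt * (-k * x)               # apply the policy's action
        total += -x * x                  # reward: stay near the origin
    return total

def train(iters=200, seed=0):
    rng = random.Random(seed)
    best_k, best_ret = 0.0, rollout(0.0, random.Random(seed))
    for _ in range(iters):               # random search over the feedback gain k
        k = rng.uniform(0.0, 10.0)
        ret = rollout(k, random.Random(seed))  # evaluate on the same randomized worlds
        if ret > best_ret:
            best_k, best_ret = k, ret
    return best_k, best_ret

k, ret = train()
print(k > 0.0 and ret > rollout(0.0, random.Random(0)))  # True: training helped
```

Real pipelines replace the random search with reinforcement learning or imitation learning and the toy simulator with GPU-accelerated physics, but the shape is the same: compute generates the experience the policy trains on.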

[02:11:22] Paratas AI trains their operating-room assistant robot in NVIDIA Isaac Lab, multiplying their data with NVIDIA Cosmos world models. Skild AI uses Isaac Lab and Cosmos to generate post-training data for their Skild Brain; they use reinforcement learning to harden the model across thousands of variations. Humanoid uses Isaac Lab to train whole-body control and manipulation policies. Hexagon Robotics uses Isaac Lab for training and data generation. Foxconn fine-tunes GR00T models in Isaac Lab, as does Noble Machines. Disney Research uses their physics simulator with Newton and Isaac Lab to train policies across their character robots in every universe.

[02:13:22] >> Ladies and gentlemen, Olaf!

[02:13:30] Coming through... Newton works.

[02:13:34] >> Wow.

[02:13:35] >> Omniverse works. Olaf, how are you?

[02:13:41] >> I'm so happy now that I'm meeting you.

[02:13:44] >> I know, because I gave you your computer, Jetson.

[02:13:48] >> What's that?

[02:13:50] >> Well, it's in your tummy.

[02:13:53] >> That's going to be amazing.

[02:13:55] >> And you learned how to walk inside Omniverse.

[02:14:00] >> I love walking. This is so much better than riding on a reindeer, gazing up at a beautiful sky.

[02:14:10] >> And it was because of physics, using this Newton solver that runs on top of NVIDIA Warp, which we jointly developed with Disney and with DeepMind, that made it possible for you to adapt to the physical world. Check that out.

[02:14:27] >> Not to say that's how smart you are.

[02:14:32] >> I'm a snowman, not a snow bot.

[02:14:38] Could you imagine this? The future of Disneyland: all these robots, all these characters, wandering around.

[02:14:47] >> Oh...

[02:14:47] >> You know, I have to admit, though, I thought you were going to be taller. I've never seen such a short snowman, to be honest.

[02:14:55] >> Nope.

[02:14:57] >> Hey, tell you what, you want to help me out?

[02:15:00] >> Hooray!

[02:15:02] >> Okay. Usually I close the keynote by telling you what I told you. We talked about inference. We talked about the AI factory. We talked about the OpenClaw agent revolution that's happening. And of course, we talked about physical AI and robotics. But tell you what, why don't we get some friends to help us close it out?

[02:15:24] >> Of course.

[02:15:25] >> All right, play it. Come on.

[02:15:29] >> Terminating simulation.

[02:15:38] Hello. Anybody here?

[02:16:09] The keynote's over, all was said. Jensen mapped the road ahead. AI factories coming alive, agents learning how to drive. From open models to robots too, now we'll break it all down.

[02:16:26] Compute exploded, what we saw, from CNNs to OpenClaw. Agents working across the land, but they need the power to meet demand. So we saw the problem, it was brilliant: we multiplied compute by 40 million.

[02:16:51] But once upon AI time, training was the paradigm. Sure, it taught models how; inference runs the whole world now, shows us who's the boss at 35 times less the cost. Blackwell makes the tokens sing, NVIDIA the inference king.

[02:17:13] Yeah, our factories once took years, as vendors pulled racks and gears, built up slowly, piece by piece, no clear way to scale this beast. DSX and Dynamo know what to do: turning power into revenue.

[02:17:35] Agents used to wait and see, now act autonomously. But if they ever try to stray, safe Claws block and say "no way." NeMo Claw's there to guard the course, and yes, my friends, it's open source.

[02:18:01] Cars that think and droids that run: this ain't the movies, it's all begun. Alpamayo calls the shots; it's a GPT moment for the bots. From sim to streets, now watch them drive. Throw your hands up for physical AI.

[02:18:31] Industrial age built what came before; now we build for AI, even more. Vera Rubin makes the inference splash; put them together, now it's raining cash. We build new architecture every year, 'cause Claws keep yelling "more tokens here." The AI stacks for all to make, so let us all eat five-layer cake. The moment's bright, the path is clear, 'cause open models led us here. When data's missing, there's no dispute: we just generate more with compute. Robots learning without flaw, fueling the four scaling laws. The future's here, won't you come and see: welcome, welcome all to GTC.

[02:19:22] All right, have a great GTC. Wave! Thank you, everybody.

[02:19:34] I just met