# Watch Nvidia's Computex 2026 Keynote Live

https://www.youtube.com/watch?v=yL52AFBPBKo

[00:06] A bit of heartbeat hides in the home like a quiet prayer.
[00:12] You walk in, laughing and laughing.
[00:18] Suddenly there's color in the air.
[00:21] Every small touch feels like fireworks.
[00:24] And the sky feels brand new.
[00:27] Didn't think the world could flip like that.
[00:31] But it did when I saw You're going to see something amazing today.
[00:40] Like the first blue sky.
[00:44] And doing so bright.
[00:48] Something special just suddenly changes.
[00:55] It's called the bright heartbeat.
[00:57] Everything rearranges.
[01:01] You're going to see something amazing today.
[01:09] Maybe it's a stranger saying hi.
[01:13] Maybe it's a tear you let fall this time.
[01:16] Maybe it's just you.
[01:22] But I believe in you.
[01:36] You're going to see something amazing today.
[01:43] Like the first blue sky.
[01:48] And doing so.
[01:51] Something special.
[01:53] Just suddenly changes.
[01:58] It's called the bright heartbeat.
[02:51] Dust on my boots and a full tank.
[02:52] Sunburn low in the rearview glass.
[02:56] She laughed and said, "Boy, this is your big plan."
[02:58] I smiled and said, "Girl, that's the first half.
[03:00] We ain't even hit that gravel.
[03:03] We ain't even killed that light.
[03:05] If you think this is a party, stick around tonight.
[03:09] You ain't seen nothing yet.
[03:11] We're just getting started, babe.
[03:13] Got a backroad, got a sunset, got your hand in mine and a wild heart set.
[03:19] On every mile, every kiss, every red Corvette in my head, baby, this is just the first step.
[03:33] Cooler in the bed, old shirts for a blanket, that song you love on the dash lit blue.
[03:39] You said, "This feels like the ending."
[03:40] I said, "No, this is the preview.
[03:43] We ain't even hit that river.
[03:45] We ain't even lost that phone.
[03:47] If you think you know my favorite, wait till I make you my home.
[03:50] You ain't seen
[03:52] Nothing.
[03:53] We're just getting started, babe.
[03:56] Got a backroad, got a sunset, got your name and mine on a mailbox sketch.
[04:01] Every vow, every fight, every makeup breath in our bed, baby, this is just the first step.
[04:07] You ain't seen nothing yet.
[04:21] Picture little boots by the front door, tying a jersey with my last name.
[04:24] If you think this love is crazy, you ain't read the back half of this page.
[04:33] We're just getting started, babe.
[04:36] From the backseat to the front porch to a crowded church and a white dress.
[04:41] Every year, every tear, every sweet silhouette in that dress, baby, this is just the first step.
[04:48] You ain't seen nothing yet.
[05:11] Woah, oh.
[05:13] I live for days like these.
[05:15] I think I must be dreaming.
[05:18] But I don't want to wake up.
[05:20] And I don't need a reason to spread a little love around.
[05:25] I'll put a smile on your face, brighten up everybody's day.
[05:29] Cuz all I see is blue skies on my horizon.
[05:33] Write down this is our moment.
[05:36] No better time or place.
[05:38] Freedom's our only focus.
[05:41] Won't hit the brakes.
[05:44] We're coming up.
[05:44] Can't stop this feeling.
[05:47] I want it on repeat.
[05:49] You better believe it.
[05:51] Ain't going nowhere.
[05:53] Run away, run away, run away now.
[05:55] It's a new era team.
[05:58] Can't stop this feeling.
[06:01] I live for days like these.
[06:04] Woah, oh, oh.
[06:07] I live for I live
[06:13] Woah, Wisconsin might be rolling.
[06:16] And we won't turn around.
[06:19] Yeah, we're on our way up.
[06:21] And ain't nobody can bring us down.
[06:24] We turn up, we're coming through.
[06:26] We won't stop, no way we can't lose.
[06:28] Yeah, here we are, written in the stars.
[06:30] Came a long way, and we're going to go so far.
[06:33] Write down this is our moment.
[06:34] No better time or place.
[06:38] Freedom's our only focus.
[06:40] Won't hit the brakes.
[06:42] We're coming out.
[06:44] Can't stop this feeling.
[06:47] I want it on repeat.
[06:49] You better believe it.
[06:51] Ain't going nowhere.
[06:53] Run away, run away, run away now.
[06:55] It's a new era team.
[06:58] Can't stop this feeling.
[07:01] I live for days like these.
[07:04] Woah, oh, oh.
[07:05] I live for I live.
[07:15] I live for I live for.
[07:25] I live for I.
[07:29] I live for days like these.
[07:33] Woah, oh, oh.
[07:34] I live for I live.
[07:39] I live for days like these?
[07:48] I've been counted out, left on read, left in doubt.
[07:53] Every maybe felt like never, but I kept your picture in my wallet, folded.
[07:59] Corners. Yeah.
[08:00] I saw it every time the world.
[08:03] Said, "Whatever."
[08:05] I get that tightness in my chest.
[08:07] When the floor drops, I don't rest.
[08:09] I just breathe, then move step by step.
[08:15] I'm never going to stop trying.
[08:18] Even when the waves keep rising.
[08:22] You can see the fear in my eyes, and still I'm standing.
[08:28] I'm never going to stop reaching.
[08:32] Even when my hands start bleeding.
[08:35] Every little scar is a reason.
[08:39] I'm still standing.
[08:41] Never going to stop trying.
[08:53] Friends turn strangers overnight.
[08:54] Dreams went quiet.
[08:58] Lost their light, but I learned to grow in the shadow.
[08:59] Talk to
[09:01] mirrors like a coach.
[09:03] Get up, that's all she wrote.
[09:05] So, I run through every high, every hollow.
[09:08] And it's in my chest.
[09:10] But this fire don't accept any and it says regret.
[09:14] Not yet.
[09:16] I'm never going to stop trying.
[09:19] Even when the waves keep rising.
[09:23] You can see the fear in my eyes, and still I'm standing.
[09:29] I'm never going to stop reaching.
[09:33] Even when my hands start bleeding.
[09:36] Every little scar is a reason I'm I'm still standing.
[09:42] Never going to stop trying.
[09:57] If I fall tonight, I'll rise.
[10:04] All my failures on my side like a choir for my life singing one more time, one more time.
[10:13] Hey.
[10:13] I'm never going to stop trying, trying even when the waves keep rising, rising.
[10:20] You can see the fear in my eyes and still I'm standing.
[10:27] I'm never going to stop reaching, reaching even when my hands start bleeding, bleeding.
[10:34] Every little scar is a reason I'm still standing.
[10:40] Never going to stop trying.
[10:59] Hey.
[11:02] Yeah.
[11:05] Glow on my face like I painted the sun.
[11:09] Move with the rhythm from that can't be undone.
[11:14] Every little heartbeat begging for fun.
[11:18] Watch me light it up.
[11:21] I'm second to none.
[11:25] I'M RISING UP.
[11:29] I'm rising now.
[11:37] I'm the sparkle.
[11:41] Fire from the soul.
[11:55] Light in my stride like I'm born for the stage.
[12:02] Feeling the freedom, I'm turning the page.
[12:05] Bowing in my presence, never play safe.
[12:08] Catch me in the moment, shadow on the way.
[12:16] I'M TURNING UP.
[12:20] I'M TURNING UP NOW.
[12:26] OH, OH, I'm the sparkle.
[12:30] Fire from the soul.
[12:57] Come turning up now.
[13:29] Hey.
[13:40] Phone face down, still I keep checking.
[13:45] My mind runs wild, every scene projecting your name.
[13:51] Doodle on my playlist, if this is love I kind of want to stay in this.
[13:58] I can't wait to see this play out in real life, us in the front row laughing through the late nights.
[14:04] I can't wait to feel it, first love first kiss, whatever this becomes.
[14:12] I can't wait to see this.
[14:22] You say something, yeah, you always say it.
[14:27] I listen, then I believe what I remember when you smile, even through a screen glow.
[14:35] Whole room fades, I'm ready just to let go.
[14:39] I can't wait to see this play out in real life,
[14:42] us in the front row laughing through the late nights.
[14:46] I can't wait to feel it, first love first kiss, whatever this becomes.
[14:52] I can't wait to see this.
[14:58] What if it's better than all my daydreams?
[15:03] What if you're braver than I believe?
[15:07] Say, "Meet me outside and I won't resist."
[15:12] I've waited all this time.
[15:17] Can't wait for this.
[15:19] I can't wait to see this play out in real life, us in the front row laughing.
[15:39] This is how intelligence is made.
[15:43] A new kind of factory.
[15:46] Generator of tokens.
[15:48] The building blocks of AI.
[15:55] Tokens have opened a new frontier.
[15:58] Turning data into knowledge, reason, action.
[16:07] They reveal patterns in complexity.
[16:10] We could never see.
[16:18] Mirror our cities.
[16:20] To keep us safe.
[16:28] And lift us high.
[16:31] Above them.
[16:36] Tokens help robots learn from us.
[16:43] Work alongside us.
[16:53] They go where we cannot.
[17:01] Lending us helping hands.
[17:06] Closing the gap between hope and healing.
[17:11] So that we breathe easier.
[17:16] And the smallest hearts beat stronger.
[17:35] Tokens are helping us break new ground
[17:42] on a scale never attempted.
[17:56] So we can reach
[17:58] Star Cloud 1, separation confirmed.
[18:01] to infinity
[18:04] and beyond.
[18:11] Together, we take the next great leap
[18:14] into a bright new future
[18:20] built for all mankind.
[18:33] And here in Taipei is where it all begins.
[18:48] Welcome to the stage, NVIDIA founder and CEO Jensen Huang.
[19:00] Welcome to GTC Taiwan.
[19:06] So great to see all of you.
[19:09] Very good to be home.
[19:11] I brought my parents home.
[19:13] Where are my parents?
[19:15] Everybody give a round of applause to my mom and dad.
[19:23] And a round of applause for our pre-game show
[19:29] superstars, ladies and gentlemen.
[19:35] Look how adorable they are.
[19:39] The superstars of Taiwan.
[19:42] There are so many of you here today.
[19:43] We are broadcasting this right now to 70 other watch parties across Taiwan.
[19:52] 70 different conferences are going at the same time.
[19:54] Everybody is watching this keynote.
[19:57] We have so much to tell you, and I have so many partners to thank.
[20:02] It is incredible how large our ecosystem in Taiwan has become.
[20:07] Most of the time, when people think about ecosystem, they think about our software stack.
[20:11] They think about the developer ecosystem above the computing systems that Nvidia builds.
[20:19] But Nvidia's ecosystem spans all the way upstream to all of our supply chain here in Taiwan, where it all begins, and downstream all the way to data centers
[20:32] and eventually to end users.
[20:35] Today, we're going to talk about almost all of the ecosystem.
[20:37] There's so many people to thank.
[20:39] I love my ecosystem here.
[20:43] I mean, there are so many companies here,
[20:45] and some of my favorite ecosystem partners.
[21:22] So many Taiwan's rich ecosystem.
[21:27] The richest ecosystem, the world's best supply chain ecosystem.
[21:30] Unbelievable.
[21:33] Well, thank you all for being here and
[21:36] this year our businesses
[21:38] together are growing incredibly. In
[21:41] fact, somebody told me last night
[21:44] that the annual GDP
[21:46] of Taiwan is going to grow almost 10%.
[21:52] >> [applause]
[21:57] >> Unbelievable.
[21:58] Well,
[21:59] we have a lot to talk about. Let's get
[22:00] going.
[22:01] Two years ago when I was here, I started
[22:04] to talk to you about how AI has moved
[22:06] from generative AI and the other waves
[22:08] of AIs that are coming. The next wave of
[22:11] AI was agentic AI.
[22:13] And today, we can say that agentic AI
[22:16] has arrived, that useful AI has arrived.
[22:20] Now, what does this mean? This is
[22:23] GitHub. Of course, one of the
[22:25] first applications of
[22:26] agentic AI is software coding.
[22:29] One of the most valuable professions.
[22:32] Incredibly large ecosystem. 30 million,
[22:35] 40 million professional software
[22:38] developers. Probably another couple of
[22:40] hundred who are students and enthusiasts
[22:44] and so on and so forth, but say 30, 40
[22:47] million software developers in the world
[22:49] code for a living.
[22:51] And this represents most of them. This
[22:54] is GitHub. The pull request is when they
[22:57] download software, they modify it, and
[23:00] commit is when they push it back up.
[23:03] Okay? And so, if you could look at this
[23:05] in 2023,
[23:10] the number of commits was 300 million.
[23:13] 2024
[23:15] 400 million. 2025
[23:19] 500 million
[23:21] commits.
[23:22] In the first few months
[23:27] of 2026, it has nearly tripled. Now,
[23:30] what does that mean?
[23:33] 30 million software developers
[23:35] representing about 3 trillion dollars
[23:40] worth of GDP.
[23:41] That's what they're
[23:43] paid. 3 trillion dollars worth of
[23:46] salaries per year, which is generating
[23:49] economic growth for the rest of the
[23:52] industries.
[23:53] Say 100 trillion dollars of the world's
[23:54] industries is impacted,
[23:57] is generated, by
[24:00] 3 trillion dollars worth of salary. That
[24:04] 3 trillion dollars worth
[24:06] of salary is now producing nearly three
[24:10] times as much output.
[24:13] It's effectively 9 trillion dollars of
[24:17] productivity
[24:19] from 3 trillion dollars of salaries.
[24:22] Does that make any sense?
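The arithmetic in this passage can be written out as a short sketch. The commit counts and the 3 trillion dollar salary figure are the ones quoted in the talk; treating "nearly three times" as an exact multiplier of 3 is a simplifying assumption, and the names `commits_millions`, `yearly_growth`, and `output_multiplier` are illustrative only.

```python
# Commit counts quoted in the talk, in millions per year.
commits_millions = {2023: 300, 2024: 400, 2025: 500}


def yearly_growth(series):
    """Return year-over-year growth as a fraction, keyed by the later year."""
    years = sorted(series)
    return {
        later: series[later] / series[earlier] - 1
        for earlier, later in zip(years, years[1:])
    }


growth = yearly_growth(commits_millions)

# Assumption: "nearly tripled" is modelled as an exact 3x output multiplier.
salaries_trillions = 3.0
output_multiplier = 3.0
effective_output_trillions = salaries_trillions * output_multiplier

for year, fraction in growth.items():
    print(f"{year}: {fraction:.0%} more commits than the year before")
print(f"Effective output: about ${effective_output_trillions:.0f} trillion")
```

On these numbers, the earlier years grew by roughly a third and a quarter, which is why a near tripling within a few months stands out.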
[24:24] The difference is absolutely
[24:25] extraordinary. This is the potential.
[24:27] This is the promise of AI.
[24:29] The number of engineers, software
[24:31] engineers, is actually increasing.
[24:32] People talk about
[24:34] AI reducing jobs, complete nonsense.
[24:37] It's causing more software engineers to
[24:40] be hired, and the reason for that is
[24:41] very simple.
[24:43] If you can hire a software engineer
[24:45] and you could generate
[24:47] 9 trillion dollars worth of
[24:49] productive work, why wouldn't you want
[24:51] to hire more software engineers?
[24:54] If that line was flat
[24:57] then obviously people will hire fewer
[25:00] software engineers. But because the
[25:01] output is so incredible, people want to
[25:04] hire more software engineers. This is
[25:05] going to show up in our economy somehow
[25:07] soon.
[25:08] And so, the first thing is useful AI has
[25:11] arrived. Now, what does that mean from
[25:13] the industry's perspective?
[25:15] From the industry's perspective, that
[25:16] means that tokens are now in
[25:19] extraordinary demand. Because if you
[25:22] could do this, you're going to want to
[25:23] produce more of it. And because tokens
[25:25] are now profitable units,
[25:28] tokens are now profitable units of
[25:31] revenues.
[25:33] Because it is now profitable, the AI
[25:35] companies want to build a lot more
[25:37] tokens, generate a lot more tokens,
[25:39] build more AI factories, which is the
[25:41] reason why compute demand here in Taiwan
[25:45] has skyrocketed.
[25:47] It is precisely the reason why all of
[25:50] you are so busy and your businesses are
[25:52] doing so well. In fact, that looks like
[25:55] some of your stock price.
[26:01] >> [applause]
[26:05] >> The compute pattern has changed.
[26:07] Everything has changed. So,
[26:09] the first idea is that useful AI has
[26:12] arrived. AI is now a profit generator.
[26:16] AI is now a GDP generator. And behind it
[26:20] is a whole new kind of computing
[26:22] pattern. Not just a large language
[26:24] model, but an agent. Today,
[26:27] almost everything we're going to talk
[26:28] about is going to be based on this. So,
[26:31] let me take a quick moment and show you
[26:32] what I'm talking about. Inside,
[26:35] this is an agent.
[26:37] It's an agent application.
[26:40] In the old days, this would be
[26:42] application,
[26:44] this would be code,
[26:46] and this would be operating system.
[26:50] Application,
[26:52] code running inside an application
[26:54] inside an operating system. Today, it is
[26:56] an agent, which consists of a large language
[27:00] model, or many,
[27:02] sitting inside a harness,
[27:05] and that harness
[27:06] orchestrates it to do
[27:09] productive work.
[27:10] This is the input.
[27:12] When that input comes
[27:14] it has to understand, observe, reason,
[27:17] act, use tools.
[27:20] Use tools. That tool could be a
[27:22] spreadsheet, web browser, a data
[27:25] processing engine, database engine, for
[27:27] example.
[27:29] This
[27:30] harness
[27:32] orchestrates the
[27:34] routing of information every single time
[27:37] it touches either processing the
[27:39] context,
[27:40] understanding what is happening,
[27:43] reasoning about what to do,
[27:45] coming up with a plan
[27:47] that it acts on. That
[27:50] orchestration path is orchestrated by
[27:53] some software.
[27:54] And so this is fundamentally an agent. It
[27:58] deals with short-term memory
[28:00] called working memory, long-term memory,
[28:03] just like we do, we have long-term
[28:04] memory. And so the memory management
[28:06] system is incredibly important. This
[28:09] entire system is called an agent.
[28:13] The large language model
[28:15] is used to do the thinking
[28:17] and the harness
[28:20] connects everything together, just like
[28:22] an operating system.
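A minimal sketch of the loop described above, assuming a stand-in function in place of a real large language model. The names `Agent`, `toy_model`, and `calc` are hypothetical and do not correspond to any NVIDIA or vendor API; the sketch only shows a harness routing between a model, a tool, and working memory.

```python
import operator
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Agent:
    model: Callable[[List[str]], str]        # stand-in for the language model
    tools: Dict[str, Callable[[str], str]]   # tool name -> callable
    working_memory: List[str] = field(default_factory=list)

    def run(self, user_input: str, max_steps: int = 5) -> str:
        """Harness loop: the model decides, the harness acts and records."""
        self.working_memory.append(f"input: {user_input}")
        for _ in range(max_steps):
            decision = self.model(self.working_memory)      # reason and plan
            if decision.startswith("call "):
                _, name, argument = decision.split(" ", 2)
                result = self.tools[name](argument)         # act with a tool
                self.working_memory.append(f"{name} -> {result}")  # observe
            else:
                return decision                             # final answer
        return "step limit reached"


def toy_model(memory: List[str]) -> str:
    """Hypothetical policy: call the calculator once, then report its result."""
    last = memory[-1]
    if last.startswith("input: "):
        return "call calc " + last[len("input: "):]
    return "answer: " + last.split(" -> ", 1)[1]


OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def calc(expression: str) -> str:
    """Tiny tool: evaluate '<int> <op> <int>'."""
    left, symbol, right = expression.split()
    return str(OPS[symbol](int(left), int(right)))


agent = Agent(model=toy_model, tools={"calc": calc})
print(agent.run("2 + 3"))
```

In a real system the model call would be a remote inference request and the tools would be sandboxed; here both are plain functions so the control flow stays visible.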
[28:23] Okay? And so this is the new computing
[28:26] model, and this is what an agent is.
[28:28] It could do incredible things. This is
[28:30] the big breakthrough.
[28:32] The simultaneous convergence
[28:35] of large language models that are now
[28:37] able to
[28:38] do a really good job thinking,
[28:40] reasoning, planning, using tools,
[28:42] and the fact that we now have these
[28:44] harnesses that manage memory,
[28:47] orchestration,
[28:48] and tool use, means we can now do amazing
[28:51] things. Let me give you some examples.
[28:53] This is a prompt. This is the
[28:55] prompt.
[28:57] This is the code that is generated,
[29:00] and this comes out.
[29:03] This is the input.
[29:06] This is the input,
[29:08] and that's the output.
[29:11] What do you guys think? It's
[29:12] pretty amazing, right?
[29:16] >> [applause]
[29:18] >> We use Claude Code here, but Codex
[29:20] does an incredible job as well.
[29:21] Here's another example.
[29:23] This is the input. Create a GIF: NVIDIA
[29:26] green dots on black scatter, form
[29:32] Taiwan 101 building, morph to GTC Taipei
[29:35] NVIDIA eye logo,
[29:38] then scatter and repeat.
[29:40] Right? So, you saw that. That was the
[29:42] prompt. Here's the next one. I lost my
[29:45] remote control battery clip. It looks
[29:47] like this. Create a CAD file. It uses a
[29:51] tool. Create a CAD file ready for 3D
[29:53] printing to create a
[29:55] new one.
[29:57] Make sense?
[29:58] This is now the new computing pattern.
[30:01] Whereas, we used to launch an
[30:03] application,
[30:05] click and type.
[30:07] We now replace that with explaining to
[30:11] the AI what we want, our intent, and the
[30:14] AI generates the code or uses tools and
[30:18] produces the necessary output.
[30:21] This is
[30:22] how computers are going to work in the
[30:24] future.
[30:25] This is agentic AI.
[30:28] For 2 years we've been building towards
[30:29] this, and now it has arrived. Now, one
[30:32] of the big breakthroughs, of course,
[30:34] is tool use.
[30:36] A lot of people have said,
[30:37] you know, Jensen, AI's coming, agentic
[30:40] AI's coming, therefore all of the
[30:41] software companies are going to go out
[30:42] of business.
[30:44] I said it's exactly the opposite.
[30:46] Because there are going to be so many
[30:48] agents,
[30:50] the world is no longer limited by the
[30:52] number of people.
[30:53] Therefore,
[30:55] those agents are going to use more tools
[30:57] than ever.
[30:59] This is actually an incredible time to
[31:01] be a software company.
[31:03] But the software has to be presented to
[31:05] the agent in a way that the agent can
[31:08] use it.
[31:09] This is a big breakthrough. And in
[31:11] fact, what we have done, as you know,
[31:14] what Nvidia's treasure is,
[31:17] is all of our CUDA libraries. I call
[31:19] them CUDA-X libraries. This is Nvidia's
[31:22] treasure.
[31:23] Today,
[31:24] we're able to now present these
[31:27] CUDA-X libraries to agents
[31:30] who can use them much more effectively
[31:33] than even humans. And so, this is a
[31:35] wonderful time for CUDA-X libraries.
[31:37] Let's take a look.
[31:42] >> 20 years ago, we built CUDA.
[31:44] A single architecture for accelerated
[31:46] computing.
[31:47] We reinvented computing.
[31:50] A thousand CUDA-X libraries help
[31:52] developers make breakthroughs in
[31:54] every field of science and engineering.
[31:57] CUDA-X libraries are tools for agents.
[32:01] cuLitho for computational lithography.
[32:05] cuOpt for decision optimization.
[32:10] cuDSS for direct sparse solvers.
[32:14] AI-Q for deep research across
[32:16] structured and unstructured documents.
[32:20] Aerial for AI-RAN.
[32:24] Warp for differentiable physics.
[32:28] Parabricks for genomics.
[32:31] At their foundation are algorithms.
[32:34] And they
[32:35] are beautiful.
[32:46] >> [music]
[35:26] [applause]
[35:29] >> A round of applause for math.
[35:32] Math is beautiful.
[35:37] >> [applause]
[35:39] >> The computing
[35:41] pattern of software is going to change.
[35:43] In fact, let's come back to this.
[35:46] This is the agent. It is
[35:48] the ultimate
[35:51] disaggregated
[35:52] and distributed computing
[35:55] model.
[35:56] So many different computers are going to
[35:58] be activated in order to process this
[36:01] agent. The agent consists of model,
[36:05] harness,
[36:07] tools and skills,
[36:10] and
[36:12] a runtime.
[36:14] All of that is running at different
[36:16] places in a data center.
[36:19] You can think of the model as the brain,
[36:23] the harness as the body,
[36:26] the tools that it uses,
[36:28] working
[36:30] in a runtime, think of it as a workshop.
[36:33] So, this is a person, a
[36:36] worker, working with tools in a
[36:38] workshop. Of course, this is being done
[36:40] at extraordinarily large scales.
[36:43] And each one of those steps is running
[36:45] in a different part of the computer.
[36:48] And you could see
[36:49] the large language model is thinking,
[36:52] context processing,
[36:54] observing, understanding the
[36:56] environment, reasoning,
[36:58] coming up with a plan, and acting on the
[37:00] plan.
[37:01] Every single time that happens, an
[37:03] entire rack of Grace Blackwell
[37:06] NVLink 72 is activated. It's thinking
[37:10] with the large language model.
[37:12] Whenever it uses a tool,
[37:15] a CPU is used. That tool could be
[37:19] a C compiler. It could be Python. It
[37:21] could be JavaScript. Or, it could be
[37:24] accelerated computing. Today's agents
[37:27] are relatively simple users of
[37:29] tools.
[37:31] Tomorrow, they're going to be very
[37:32] sophisticated users of tools, which is
[37:34] the reason why the CUDA-X libraries that
[37:37] I showed you are going to be incredibly
[37:38] popular with agents.
[37:40] They solve some of the most important
[37:42] problems the world knows.
[37:44] And all of our CUDA-X libraries are now
[37:47] going to come with skills that the
[37:50] AI could learn how to use.
[37:53] So, the CUDA-X library,
[37:55] some skills, basically a manual, the AI
[37:58] reads it and goes, "Aha! That's how you
[38:00] use it."
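One way to picture a library that "comes with skills" is a tool registry where each entry carries a short manual the agent can read before calling it. This is only a sketch under that assumption: `Tool`, `REGISTRY`, `skills_prompt`, and `solve_linear` are hypothetical names, and `solve_linear` is a trivial placeholder rather than a CUDA-X routine.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Tool:
    name: str
    skill: str                  # the "manual" the agent reads
    run: Callable[..., object]  # the callable behind the tool


def solve_linear(a: float, b: float) -> float:
    """Solve a*x + b = 0 for x (placeholder for a real library call)."""
    return -b / a


REGISTRY: Dict[str, Tool] = {
    "solve_linear": Tool(
        name="solve_linear",
        skill="Solves a*x + b = 0. Call run(a, b); a must be non-zero.",
        run=solve_linear,
    ),
}


def skills_prompt(registry: Dict[str, Tool]) -> str:
    """Render every tool's manual as text that can be placed in the model's context."""
    tools = sorted(registry.values(), key=lambda tool: tool.name)
    return "\n".join(f"- {tool.name}: {tool.skill}" for tool in tools)


print(skills_prompt(REGISTRY))
print(REGISTRY["solve_linear"].run(2.0, -4.0))
```

The design choice here is that the description travels with the tool, so adding a library to the registry also adds its instructions to what the agent sees.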
[38:03] The ability of agents to use these libraries
[38:05] is going to be incredible.
[38:07] And so, the tools run on CPUs and GPUs
[38:11] and large language models.
[38:13] The security harness runs on CPUs and a
[38:17] security processor called a DPU,
[38:20] NVIDIA's BlueField. The orchestration of
[38:23] all this runs on a CPU. This is the
[38:25] entire harness and the CPU's
[38:28] orchestrating all of the work.
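The placement described in this passage can be summarised as a lookup from the kind of step to the kind of processor. The mapping below follows what is said in the talk (model on a GPU rack, tools on CPUs or accelerated on GPUs, security on CPU plus DPU, orchestration on CPU); the names `Step`, `PLACEMENT`, and `processors_used` are hypothetical and this is a sketch, not a scheduler.

```python
from enum import Enum
from typing import Iterable, List


class Step(Enum):
    THINK = "think"                        # language model inference
    TOOL = "tool"                          # compiler, Python, JavaScript, ...
    ACCELERATED_TOOL = "accelerated_tool"  # tool backed by accelerated computing
    SECURITY = "security"                  # security harness
    ORCHESTRATE = "orchestrate"            # harness routing and control


# Assumed placement, taken from the description in the talk.
PLACEMENT = {
    Step.THINK: ("GPU rack",),
    Step.TOOL: ("CPU",),
    Step.ACCELERATED_TOOL: ("GPU",),
    Step.SECURITY: ("CPU", "DPU"),
    Step.ORCHESTRATE: ("CPU",),
}


def processors_used(trace: Iterable[Step]) -> List[str]:
    """Return the distinct processor kinds touched by a sequence of steps."""
    used = set()
    for step in trace:
        used.update(PLACEMENT[step])
    return sorted(used)


trace = [Step.ORCHESTRATE, Step.THINK, Step.TOOL, Step.SECURITY, Step.THINK]
print(processors_used(trace))
```

Even this short trace touches three processor kinds, which is the sense in which the workload is disaggregated and heterogeneous.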
[38:30] One of the hardest parts is memory. You
[38:33] could just imagine. The working memory
[38:35] is called KV caching.
[38:37] What to remember, compaction, not just
[38:40] compression, but how to retrieve. Do you
[38:43] retrieve structured data? Do you
[38:45] retrieve unstructured data?
[38:47] What is the ontology, the relationship
[38:50] of all of these different data to
[38:51] itself?
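The questions raised here (what to keep in working memory, what to compact, how to retrieve) can be illustrated with a toy memory class. This is not a KV cache and not any product's design: `AgentMemory` is a hypothetical name, compaction is reduced to replacing old items with a count, and retrieval is a plain keyword match.

```python
from typing import List


class AgentMemory:
    """Toy memory: a bounded working set plus a searchable long-term store."""

    def __init__(self, max_recent: int = 3) -> None:
        if max_recent < 1:
            raise ValueError("max_recent must be at least 1")
        self.max_recent = max_recent
        self.recent: List[str] = []      # working memory
        self.long_term: List[str] = []   # items moved out of working memory

    def add(self, item: str) -> None:
        """Store an item, moving the oldest ones to long-term memory if full."""
        self.recent.append(item)
        while len(self.recent) > self.max_recent:
            self.long_term.append(self.recent.pop(0))

    def context(self) -> List[str]:
        """What the model would see: a compacted marker plus recent items."""
        header = []
        if self.long_term:
            header = [f"[{len(self.long_term)} earlier items compacted]"]
        return header + self.recent

    def retrieve(self, keyword: str) -> List[str]:
        """Keyword lookup over long-term memory (case-insensitive)."""
        return [item for item in self.long_term if keyword.lower() in item.lower()]


memory = AgentMemory(max_recent=2)
for note in ["user likes tea", "order 17 shipped", "weather is sunny"]:
    memory.add(note)

print(memory.context())
print(memory.retrieve("tea"))
```

A production memory system would summarise rather than count, and would index structured and unstructured data differently; the sketch only separates the three operations the talk names.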
[38:53] That entire processing is incredibly
[38:55] complicated. The memory system, the
[38:58] memory system of AIs is going to cause
[39:01] the storage system to be completely
[39:04] revolutionized.
[39:05] As you could see,
[39:07] every aspect of this computing
[39:10] model, this computing pattern, this new
[39:13] application called an agent, is
[39:15] fundamentally different than the way
[39:17] that applications used to run.
[39:19] A whole bunch of software sitting inside
[39:21] a binary, sitting inside an operating
[39:24] system.
[39:25] This is the reason
[39:27] this disaggregated,
[39:29] this distributed,
[39:31] this heterogeneous computing problem is
[39:33] precisely the reason
[39:35] we built our next generation
[39:38] Vera Rubin.
[39:40] Vera Rubin is not one chip.
[39:43] Vera Rubin is not a GPU only. It starts
[39:47] with a GPU,
[39:48] but Vera Rubin is incredible.
[39:52] This entire thing is Vera Rubin.
[39:56] From end to end.
[39:58] It has GPUs,
[40:01] Vera Rubin NVLink 72.
[40:03] It is orchestrated by Vera CPUs I'm
[40:06] going to tell you more about. The
[40:07] storage systems,
[40:09] revolutionary.
[40:11] Vera along with CX9, our software stack
[40:14] called DOCA, the security processor
[40:17] that's inside so that everything is
[40:20] encrypted at rest,
[40:23] in motion,
[40:25] as well as in use.
[40:27] Everything across this is secure because
[40:30] the AI model is so precious.
[40:32] This is the reason why this entire
[40:34] system
[40:35] obeys confidential computing.
[40:38] Each one of these systems would be a
[40:40] complete revolution in itself.
[40:43] Vera Rubin is the most ambitious
[40:46] endeavor in the history of our company.
[40:49] The whole company worked on Vera Rubin
[40:51] across all 40,000 engineers, not to
[40:55] mention all of you.
[40:57] All of you participated in the creation
[40:59] of this entire system. Vera Rubin
[41:02] is really a miracle and it's not just
[41:05] one chip, it is so many.
[41:07] Well, it's even beyond that.
[41:10] A long time ago NVIDIA used to be a GPU
[41:12] company,
[41:13] but over the years we've evolved
[41:17] to become a systems company. You're
[41:19] looking here now
[41:20] at the most complex system, the most
[41:23] complex and ground-up system ever
[41:25] designed.
[41:27] But ultimately, our customers, our
[41:30] partners
[41:31] don't want to buy computers,
[41:33] they want to build AI factories, which
[41:37] is the reason why NVIDIA has really
[41:38] started to transform ourselves yet
[41:40] again.
[41:41] You could see so much of our technology
[41:44] is now at the entire infrastructure
[41:47] scale.
[41:48] Our partners are at infrastructure
[41:50] scale.
[41:51] Power generators, cooling systems, the
[41:54] grid providers,
[41:56] so many
[41:57] industrial companies are now part of our
[42:00] ecosystem because ultimately, we're
[42:02] trying to build an entire stack just
[42:04] like GPUs, just like when we were
[42:07] building Grace Blackwell NVLink 72, just
[42:10] like now,
[42:12] we are building a full stack system so
[42:14] that
[42:16] our customers could build amazing AI
[42:19] infrastructure. Let's take a look.
[42:23] >> The world is racing to build AI
[42:24] factories,
[42:25] the largest infrastructure buildout in
[42:27] human history.
[42:29] AI factories are incredibly complex.
[42:31] Every layer, chip, rack, network, power,
[42:36] cooling, grid, must be designed together
[42:38] from end to end because compute is
[42:40] revenues.
[42:44] NVIDIA DSX is the blueprint, a
[42:47] reference design for building and
[42:49] operating AI factories at maximum
[42:51] efficiency and profitability.
[42:54] It starts with DSX Sim.
[42:56] With the DSX Sim Omniverse
[42:58] blueprint, partners design and validate
[43:00] an Nvidia Vera Rubin AI factory before a
[43:03] single rack lands.
[43:05] They plan the layout.
[43:10] Simulate the power and cooling.
[43:14] Design the network. Validate
[43:16] every integration. Test every change in
[43:18] the digital twin.
[43:21] The factory powers on.
[43:23] DGX OS takes over and
[43:25] provisions, operates, monitors, and
[43:27] remediates the infrastructure.
[43:30] Turning the installed systems
[43:32] into trusted, multi-tenant, resilient,
[43:35] AI-ready capacity.
[43:39] Today's AI factories over-provision
[43:41] power by up to 40%. DSX Max LPS
[43:45] lets operators safely deploy more GPUs
[43:47] inside the same power budget,
[43:50] adding billions in annual revenue.
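A back-of-the-envelope sketch of why tighter provisioning fits more GPUs into the same budget. It assumes one reading of the 40% figure, namely that the power reserved per GPU is 40% above what it typically draws; the wattages and the function name `deployable_gpus` are made up for illustration and ignore cooling, networking, and safety margins.

```python
from typing import Tuple


def deployable_gpus(
    budget_w: int, reserved_w_per_gpu: int, overprovision_pct: int
) -> Tuple[int, int]:
    """Return (baseline, tightened) GPU counts for one fixed power budget.

    baseline:  every GPU is given its full reserved power.
    tightened: every GPU is given only its typical draw, which is
               reserved / (1 + overprovision_pct / 100).
    Integer arithmetic is used to avoid floating-point rounding.
    """
    baseline = budget_w // reserved_w_per_gpu
    tightened = (budget_w * (100 + overprovision_pct)) // (reserved_w_per_gpu * 100)
    return baseline, tightened


# Hypothetical numbers: a 1.4 MW budget and 1,400 W reserved per GPU.
baseline, tightened = deployable_gpus(1_400_000, 1_400, 40)
print(f"baseline: {baseline} GPUs, tightened: {tightened} GPUs")
```

Under this reading the upper bound on the gain equals the over-provisioning percentage itself; any real deployment would land below it once headroom is kept for spikes.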
[43:55] Breakthrough hot liquid cooling at 45°C
[43:59] uses less water and energy.
[44:01] More power going to revenue-generating
[44:03] compute.
[44:05] Incredible.
[44:07] Dynamic power allocation steers power
[44:09] from rack to rack, recovering
[44:11] stranded watts, sending them where work
[44:13] is happening.
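Recovering stranded watts can be sketched as a redistribution problem: racks drawing less than their cap release the difference, and racks that want more receive it. The function `rebalance` and the rack figures are hypothetical; a real controller would also respect breaker limits and ramp rates, which are left out here.

```python
from typing import Dict, Tuple


def rebalance(
    caps: Dict[str, int], demand: Dict[str, int]
) -> Tuple[Dict[str, int], int]:
    """Move unused power from idle racks to busy ones.

    Returns the new per-rack caps and any power left unassigned.
    The total (new caps plus leftover) always equals the original total.
    """
    # Step 1: shrink every cap down to what the rack actually needs.
    new_caps = {rack: min(caps[rack], demand[rack]) for rack in caps}
    pool = sum(caps.values()) - sum(new_caps.values())  # stranded watts

    # Step 2: hand the pool to the racks with the largest shortfall first.
    by_shortfall = sorted(caps, key=lambda rack: demand[rack] - caps[rack], reverse=True)
    for rack in by_shortfall:
        grant = min(demand[rack] - new_caps[rack], pool)
        if grant > 0:
            new_caps[rack] += grant
            pool -= grant
    return new_caps, pool


caps = {"A": 100, "B": 100, "C": 100}    # kW allocated per rack
demand = {"A": 60, "B": 130, "C": 120}   # kW each rack would use
new_caps, leftover = rebalance(caps, demand)
print(new_caps, leftover)
```

In this example rack A frees 40 kW, rack B gets its full 30 kW shortfall, and rack C gets the remaining 10 kW.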
[44:15] In-rack power smoothing flattens
[44:17] peak current spikes and power surges.
[44:24] Throughout the factory,
[44:25] teams of AI agents work with DSX Max
[44:27] LPS, continuously coordinating
[44:30] to balance cooling and power to meet
[44:32] workload demand.
[44:34] DSX AI factories are flexible
[44:36] energy assets that operate cooperatively
[44:38] with the grid.
[44:40] DSX Flex reads real-time grid
[44:42] signals and dynamically adjusts factory
[44:45] power when the grid needs relief.
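The idea of adjusting factory power in response to a grid signal reduces, in its simplest form, to a clamp. The sketch below assumes the grid signal arrives as a requested reduction in kilowatts and that the factory has a floor it will not go below; `flex_power_target` and all the numbers are hypothetical and say nothing about how DSX Flex is implemented.

```python
def flex_power_target(nominal_kw: float, relief_kw: float, floor_kw: float) -> float:
    """Return the factory power target after honouring a grid relief request.

    nominal_kw: power the factory would draw with no request.
    relief_kw:  reduction the grid is asking for (negative values are ignored).
    floor_kw:   minimum power the factory must keep.
    """
    if floor_kw > nominal_kw:
        raise ValueError("floor_kw cannot exceed nominal_kw")
    requested = max(0.0, relief_kw)
    return max(floor_kw, nominal_kw - requested)


print(flex_power_target(1000, 200, 700))  # request fits above the floor
print(flex_power_target(1000, 500, 700))  # request is limited by the floor
```

The floor is what keeps the factory a cooperative load rather than an interruptible one: it sheds what it can and no more.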
[44:50] 100 gigawatts of AI factories
[44:53] will come online before the end of the
[44:55] decade. NVIDIA DSX AI factories run
[44:57] at highest efficiency, produce
[45:00] the lowest cost tokens, and make the
[45:02] grid stronger.
[45:09] >> [applause]
[45:12] >> I've shown you ecosystem slides of the
[45:14] past
[45:16] where NVIDIA's computing layers and
[45:19] software stacks
[45:21] are integrated into other
[45:23] people's platforms, third-party
[45:25] platforms, and libraries that serve end
[45:27] markets. That was a computing ecosystem.
[45:31] This is an AI factory ecosystem.
[45:34] This is way downstream of all of you.
[45:37] Upstream of me is all of you, and
[45:40] downstream of us is this ecosystem.
[45:43] Because NVIDIA ultimately
[45:45] is not just building a GPU, not just
[45:48] building a system, we're helping
[45:50] customers build these AI factories,
[45:52] this AI infrastructure that is so
[45:55] immensely complex.
[45:57] Each one of these at 1 gigawatt level
[46:00] started at 20, 30 billion dollars. It
[46:04] is at 50, 60 billion dollars and soon it
[46:08] will be 80, 100 billion dollars per
[46:11] gigawatt.
[46:13] 100 billion dollars into an AI factory.
[46:18] It must work the first time and it must
[46:20] work right away.
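The per-gigawatt figures quoted here scale linearly with factory size, which a few lines make concrete. The three cost ranges are the ones stated in the talk; the 2 gigawatt site and the names `COST_PER_GW_BILLIONS` and `factory_cost_billions` are hypothetical, chosen only to show the multiplication.

```python
from typing import Tuple

# Cost ranges per gigawatt quoted in the talk, in billions of dollars.
COST_PER_GW_BILLIONS = {
    "at first": (20, 30),
    "now": (50, 60),
    "soon": (80, 100),
}


def factory_cost_billions(gigawatts: int, cost_range: Tuple[int, int]) -> Tuple[int, int]:
    """Return the (low, high) cost of a factory of the given size."""
    low, high = cost_range
    return gigawatts * low, gigawatts * high


# Hypothetical 2 GW site, priced at each of the quoted ranges.
for label, cost_range in COST_PER_GW_BILLIONS.items():
    low, high = factory_cost_billions(2, cost_range)
    print(f"{label}: ${low} to ${high} billion")
```

At these magnitudes even a small percentage of rework is measured in billions, which is the argument for validating the design in simulation first.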
[46:21] The cost of capital is incredible. The
[46:24] complexity is incredible. So, as you
[46:26] see,
[46:27] we used to design a chip inside a
[46:29] computer.
[46:31] And then we simulated a system inside a
[46:34] computer.
[46:36] Today,
[46:37] you saw just now, everything was built
[46:40] in Omniverse.
[46:42] I've been working with Omniverse with
[46:43] all of you for a long time. This was the
[46:46] dream come true.
[46:48] So that we can build these gigantic
[46:50] systems as large as the world wants to
[46:52] build inside a digital framework, inside
[46:55] a digital simulator, in a digital world
[46:59] long before we first
[47:02] break ground and put our money to work.
[47:04] So, this is our ecosystem. We call
[47:07] it DSX.
[47:09] RTX is for our GPU, DGX is for our
[47:12] systems and now DSX basically
[47:15] infrastructure.
[47:16] Because of the work that we do here
[47:18] across this entire stack including our
[47:21] systems and software, it's the reason
[47:22] why we can work with small companies and
[47:25] enable them to be world-class AI clouds.
[47:29] Every one of these I'm about to show you
[47:31] was a small company just recently, and
[47:34] now CoreWeave is worth 50, 60, 70
[47:37] billion dollars and growing incredibly
[47:39] fast. Recently, we worked with Nebius
[47:42] and again, they're growing incredibly
[47:44] fast.
[47:45] Each one of these clouds has incredible
[47:48] customers. Cursor, the software coding
[47:50] company,
[47:52] Black Forest Labs, image generation,
[47:54] World Labs,
[47:56] World Foundation model, Revolut, the
[47:58] leading financial services AI company,
[48:01] and Shopify.
[48:03] Here's another one. This is Nscale, and
[48:05] their customers are British Telecom,
[48:08] Google.
[48:09] Google is using one of our AI clouds,
[48:13] Thinking Machines, a frontier lab
[48:15] company. We're super excited.
[48:17] Here's Naver Cloud in Korea.
[48:19] Bank of Korea,
[48:21] Hyundai,
[48:23] so many incredible companies. Here's one
[48:26] in India, Yotta.
[48:28] Incredible companies. Here's one based
[48:31] in Singapore, building in Australia,
[48:34] Together AI.
[48:36] AI Singapore.
[48:38] This is one in Indonesia.
[48:40] Each one of these companies, each one of
[48:42] these companies is serving regional as
[48:45] well as global customers. AI is going to
[48:49] run everywhere. Every company will be
[48:51] powered by it.
[48:52] Every
[48:54] region will build it.
[48:56] Indosat here in Indonesia. Here in
[48:59] Taiwan, GMI.
[49:02] Here in Taiwan, GMI.
[49:05] It's okay to clap.
[49:07] >> [applause]
[49:12] >> So, incredible incredible incredible
[49:15] companies, incredible opportunity, but
[49:17] all of them need several things. Of
[49:20] course, they need the computing stack.
[49:22] This entire stack underneath, this is
[49:24] what made NVIDIA famous.
[49:26] All of our hardware and software and
[49:28] libraries, and our connection into the
[49:31] world's ecosystem of third-party
[49:33] developers, make it possible for anyone
[49:36] to stand up an AI cloud.
[49:39] However, the AI cloud is so complex now.
[49:43] This is the software version. This is
[49:45] the computer science version.
[49:47] The money version,
[49:50] the asset version is what I showed you
[49:52] earlier. It's a giant factory.
[49:56] Having this ability alone is not enough,
[49:59] which is the reason why Nvidia has
[50:00] become an AI infrastructure company.
[50:03] Now, doing this well
[50:06] and becoming incredibly good at
[50:09] helping customers build AI factories and
[50:12] deploying AI factories is incredibly
[50:15] important, and the reason for that is
[50:16] this.
[50:17] Compute is revenue now.
[50:20] Compute is profit. The absence of
[50:23] revenues and profit is loss.
[50:26] And so, it's really important to realize
[50:29] that this is an example of
[50:34] AI infrastructure coming online. It
[50:37] could
[50:38] come online quickly, or it
[50:40] could take a while.
[50:42] Its throughput could be high, it could
[50:44] be low.
[50:45] Its resilience or reliability could be
[50:47] good or bad. And its lifetime of
[50:50] usefulness could be long or short.
[50:54] Because this represents
[50:57] 50, 60, going to a hundred billion
[51:00] dollars.
[51:02] This curve matters greatly.
[51:05] Which is the reason why Nvidia is such a
[51:07] great partner to work with: because of
[51:10] our
[51:11] fully integrated capability. We didn't
[51:14] just come up with a PowerPoint slide.
[51:16] We created the entire infrastructure. We
[51:19] connected everything together. We built
[51:21] out billions and billions of it
[51:23] ourselves
[51:24] to make sure that everything works well.
[51:27] As a result of that,
[51:30] our time to first token,
[51:34] our time to first inference,
[51:37] our time
[51:39] to training turned on is much faster.
[51:43] Second, our
[51:47] throughput per watt, our tokens per watt,
[51:51] is utterly world-class.
[51:54] And the reason for that is because we
[51:55] integrate everything, we design
[51:57] everything from the ground up, we
[51:58] simulate the entire system, and we use
[52:00] extreme co-design.
[52:02] Just like I showed you just now with the
[52:03] Vera Rubin rack, everything was designed
[52:06] in order to deliver on this incredible
[52:09] throughput.
[52:10] If your data center, if your factory has
[52:15] 1 gigawatt,
[52:17] it will not have more.
[52:20] 1 gigawatt means 1 gigawatt.
[52:22] That's all the power generation you
[52:24] could do.
[52:25] If you have 1 gigawatt of power,
[52:27] then
[52:29] throughput per watt is revenues, because
[52:33] every token is profitable.
[52:35] Every token is revenues.
[52:38] This is the future.
[52:40] Compute is revenues. Performance per
[52:43] watt is your revenues. Choosing the
[52:46] wrong architecture,
[52:48] just because the chips are cheaper,
[52:51] doesn't translate.
[52:52] Doesn't make sense.
[52:54] You need to make sure that your revenues
[52:56] per watt,
[52:58] the more you buy, the
[52:59] more you make.
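
The power-capped argument above can be sketched as simple arithmetic. This is a minimal illustration, not anything shown in the keynote: the function name, the efficiency figures, and the token price are all hypothetical placeholders. It only shows that, with power fixed at 1 gigawatt, revenue scales linearly with tokens per second per watt (equivalently, tokens per joule).

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def annual_token_revenue(power_watts, tokens_per_joule,
                         price_per_million_tokens, utilization=1.0):
    """Yearly revenue for a factory whose power is capped at power_watts.

    tokens_per_joule is the same quantity as tokens per second per watt.
    Every input here is a hypothetical placeholder, not a keynote figure.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    tokens_per_year = (power_watts * tokens_per_joule
                       * SECONDS_PER_YEAR * utilization)
    return tokens_per_year / 1_000_000 * price_per_million_tokens


if __name__ == "__main__":
    one_gigawatt = 1e9  # the fixed power budget from the talk
    # Two made-up architectures that differ only in efficiency.
    for label, efficiency in (("architecture A", 1.0), ("architecture B", 2.0)):
        revenue = annual_token_revenue(one_gigawatt, efficiency,
                                       price_per_million_tokens=1.0)
        print(f"{label}: ${revenue:,.0f} per year")
```

Under these assumptions, doubling tokens per joule doubles revenue from the same gigawatt, which is the sense in which a cheaper chip with lower efficiency may not translate into a better outcome.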
[53:01] And so, tokens per watt,
[53:04] and then lastly... oh, second.
[53:07] Third is reliability.
[53:10] If you ever get a chance to see these
[53:11] data centers, there are so many moving
[53:13] parts, millions of cables.
[53:17] The ability for all of those computers
[53:19] to work harmoniously,
[53:21] reliability is extremely low. It is just
[53:25] extremely difficult. We have now been
[53:27] operating very large scale for a very
[53:30] long time. That experience matters.
[53:33] That difference, mean time between
[53:36] interrupts,
[53:37] extremely important. And then lastly,
[53:40] this is very hard.
[53:43] The lifetime of these systems, the
[53:45] lifetime of these systems,
[53:47] the software is changing all the time.
[53:50] Four years ago,
[53:51] which is in the time of Hopper,
[53:54] AI has completely changed.
[53:57] Six years ago,
[53:58] this is the time frame of Ampere, AI has
[54:01] completely changed.
[54:04] We started out talking about CNNs.
[54:07] Here we are, then we talked about
[54:08] transformers, and then we talked about
[54:10] mixture of experts.
[54:12] Now we're talking about agentic systems.
[54:15] Every single generation, every single
[54:18] few months, the software industry is
[54:21] coming up with new technology.
[54:23] If your architecture
[54:25] is not flexible,
[54:27] if your ecosystem is not rich,
[54:30] then this curve cannot be long.
[54:34] You cannot predict how long your system
[54:36] can last. I can.
[54:39] Nvidia's systems are all over the world.
[54:42] Software developers start with Nvidia
[54:44] CUDA,
[54:45] and by definition, therefore, the life,
[54:48] the ecosystem,
[54:50] the useful asset is going to be much
[54:53] longer. The difference is essentially
[54:55] cost. You could think of it as revenues,
[54:58] but the other side of revenues is cost.
[55:01] If the life of the asset is long,
[55:03] the TCO is low.
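
The link between asset life and TCO can be sketched with straight-line amortization. This is an assumption made for illustration only; the keynote does not give a cost model. The capital figure reuses the "50, 60 billion dollars" per gigawatt quoted earlier in the talk, while the lifetimes and the function name are hypothetical.

```python
def annual_cost(capex_dollars, useful_life_years, annual_opex_dollars=0.0):
    """Straight-line yearly cost of an asset (an assumed, simplified model)."""
    if useful_life_years <= 0:
        raise ValueError("useful_life_years must be positive")
    return capex_dollars / useful_life_years + annual_opex_dollars


if __name__ == "__main__":
    capex = 60e9  # roughly the per-gigawatt figure quoted in the talk
    for years in (3, 6):  # hypothetical useful lifetimes
        cost = annual_cost(capex, years)
        print(f"{years}-year life: ${cost:,.0f} per year")
```

In this simplified model, doubling the useful life halves the capital cost charged to each year of token production.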
[55:06] This is the difference. This is what it
[55:09] looks like when compute
[55:15] The more you buy,
[55:17] the more you make.
[55:22] >> [applause]
[55:25] >> Now, all of you are experiencing this
[55:29] with me. Isn't that right?
[55:31] All of your demand, your factories are
[55:34] working so hard, your people are working
[55:37] so hard all across Taiwan because
[55:40] everybody wants to make money.
[55:43] They realize that AI, useful AI, is
[55:47] here.
[55:48] Profitable AI is here.
[55:51] Compute
[55:52] demand is incredibly high, and compute
[55:56] demand is the constraint. And so, let's
[55:59] go work super, super hard and help the
[56:01] world stand up AI factories everywhere.
[56:04] This is why it's so important. I'm so
[56:06] happy.
[56:07] Here I am standing in front of you.
[56:10] Vera Rubin is in full production.
[56:15] >> [applause]
[56:18] >> Vera Rubin is in full production.
[56:22] The supply chain we created for
[56:25] Vera Rubin is twice as large as Grace
[56:30] Blackwell.
[56:32] Yeah, it's incredible. And
[56:36] what used to take 2 hours
[56:38] to assemble one Grace Blackwell rack now
[56:41] only takes 5 minutes.
[56:43] So, not only is the capacity higher, the
[56:46] throughput is a lot faster.
[56:48] And we need it all to support the
[56:51] demand.
[56:52] This ecosystem is extraordinary.
[56:55] Millions of square feet have been put
[56:58] online to support Grace Blackwell and
[57:01] preparing now, ramping up now, Vera
[57:03] Rubin. I want to thank all of you. Vera
[57:06] Rubin is now in full production. Thank
[57:08] you.
[57:09] >> [applause]
[57:12] >> Let's take a look.
[57:16] Large language [music] models generate
[57:18] answers.
[57:20] Now, AI agents can do work.
[57:23] But processing agentic AI
[57:25] >> [music]
[57:25] >> is a whole different kind of problem.
[57:27] Agents observe, reason, plan, use tools.
[57:31] They manage massive context, [music]
[57:33] juggling working memory and long-term
[57:35] memory. They spin up sub-agents,
[57:37] specialists [music]
[57:38] on demand.
[57:39] Nvidia Vera Rubin is a multi-rack pod
[57:42] scale system [music] built to process
[57:44] agentic AI and is now in full
[57:46] production.
[57:47] >> [music]
[57:47] >> The manufacturing automation and
[57:49] orchestration across the supply chain, a
[57:52] miracle to witness.
[57:53] >> [music]
[57:54] >> Our journey started when we launched the
[57:55] first AI supercomputer, Nvidia DGX-1.
[57:59] Over the next decade, we pushed every
[58:02] [music] chip and system to the limit.
[58:04] From Pascal and the first NVLink to
[58:06] Grace Blackwell, [music]
[58:07] the first rack-scale AI supercomputer.
[58:10] And now,
[58:11] Vera Rubin, the first multi-rack [music]
[58:13] pod scale supercomputer built for the
[58:16] agentic age.
[58:17] It starts at TSMC. The seven new chips
[58:20] [music] that make up Vera Rubin take
[58:22] shape through hundreds of processing
[58:24] steps. 3-nm process,
[58:27] CoWoS-R and CoWoS-L packaging, [music]
[58:29] HBM4 memory from Micron, SK Hynix, and
[58:33] Samsung.
[58:34] The Vera Rubin compute board,
[58:36] 6 trillion transistors [music]
[58:38] with over 18,000 components on one
[58:40] board.
[58:41] Vera Rubin NVL72 [music]
[58:43] does the thinking, prompt and context
[58:45] understanding, reasoning, and planning.
[58:48] Next, a new modular compute tray,
[58:51] streamlined with a new PCB midplane
[58:53] [music] design. Superchips,
[58:56] ConnectX-9 SuperNICs,
[58:58] and BlueField-4 [music]
[58:59] DPUs all mate in place with no cables
[59:03] for resiliency at AI factory scale.
[59:05] 18 compute [music] trays, nine
[59:07] hot-swappable NVLink switch trays, new
[59:11] high-efficiency manifolds, liquid-cooled
[59:13] bus bars [music]
[59:14] carrying over 5,000 amps, the equivalent
[59:17] of 20 electric cars at full
[59:19] acceleration.
[59:20] Together, 1.3 million components [music]
[59:23] form this third-generation MGX rack
[59:25] design.
[59:26] Congratulations to Microsoft for their
[59:29] operational Vera Rubin NVL72
[59:31] engineering rack. Congratulations to
[59:33] Dell and CoreWeave [music] as well for
[59:35] standing up their Vera Rubin NVL72
[59:37] engineering rack.
[59:39] Then, the Vera CPU rack. 256 CPUs
[59:44] >> [music]
[59:44] >> in a single liquid-cooled rack
[59:46] orchestrating the models,
[59:48] shuffling memory, launching [music]
[59:50] tools.
[59:51] At Foxconn and Quanta, Groq 3 LPX takes
[59:55] shape. 256 Groq 3 LPUs across 16 trays,
[01:00:00] 40 petabytes [music] per second of SRAM
[01:00:02] bandwidth for ultra-low latency.
[01:00:05] While NVL72 [music]
[01:00:07] generates tokens at the highest
[01:00:08] throughput, Groq LPX generates them at
[01:00:11] the lowest latency.
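
The LPX rack figures quoted above (256 LPUs, 16 trays, 40 petabytes per second of SRAM bandwidth) can be broken down per tray and per LPU. This is a rough sketch that assumes the 40 PB/s is an aggregate figure shared evenly across LPUs and that 1 PB equals 1,000 TB; neither assumption is stated in the narration, and the function name is hypothetical.

```python
def lpx_breakdown(lpus=256, trays=16, sram_pb_per_s=40.0):
    """Split the quoted rack totals evenly across trays and LPUs.

    Even division and decimal units (1 PB = 1000 TB) are assumptions.
    """
    if lpus <= 0 or trays <= 0:
        raise ValueError("lpus and trays must be positive")
    return {
        "lpus_per_tray": lpus // trays,
        "sram_pb_per_s_per_tray": sram_pb_per_s / trays,
        "sram_tb_per_s_per_lpu": sram_pb_per_s * 1000 / lpus,
    }


if __name__ == "__main__":
    for name, value in lpx_breakdown().items():
        print(f"{name}: {value}")
```

With the quoted totals, this works out to 16 LPUs per tray, 2.5 PB/s per tray, and 156.25 TB/s per LPU.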
[01:00:14] Vera BlueField 4 [music] STX, where AI
[01:00:16] keeps its memory.
[01:00:18] Storage processing accelerated by
[01:00:20] BlueField 4, connecting memory, storage,
[01:00:24] >> [music]
[01:00:24] >> and in-silicon security.
[01:00:26] And NVIDIA Spectrum-X Ethernet
[01:00:29] photonics, the world's [music] first
[01:00:31] Ethernet switch with 200 gigabit
[01:00:33] co-packaged optics, TSMC's CoWoS
[01:00:36] process, chip-scale packaging, and
[01:00:39] ultra-high-power [music]
[01:00:40] laser diodes on indium phosphide.
[01:00:44] Vera Rubin,
[01:00:45] five connected rack-scale [music]
[01:00:46] systems, a supercomputer for AI agents.
[01:00:50] 150 supply chain partners [music] across
[01:00:52] Taiwan.
[01:00:54] Millions of square feet of factory
[01:00:55] floor. Hundreds of sites, chips,
[01:00:58] >> [music]
[01:00:59] >> packages, systems, and data centers
[01:01:02] pushed to the limits of size, power, and
[01:01:04] scale. This is what we call [music]
[01:01:06] extreme co-design.
[01:01:08] We did this with Taiwan. Together, we
[01:01:10] reinvented computing for the age of AI.
[01:01:13] Taiwan
[01:01:13] >> [music]
[01:01:13] >> was with us at the beginning and here
[01:01:16] today as we bring Vera Rubin to the
[01:01:18] world.
[01:01:19] Thank you, Taiwan. [music]
[01:01:24] >> [applause]
[01:01:28] >> Ladies and gentlemen, Vera Rubin.
[01:01:32] >> [applause]
[01:01:32] >> Vera Rubin
[01:01:34] was not just built for AI.
[01:01:37] Vera Rubin was not built just to run
[01:01:40] AI. Vera Rubin was built to run agents.
[01:01:44] This is an agentic system. Imagine the
[01:01:48] complexity.
[01:01:49] Which is the reason why agents are the
[01:01:52] last
[01:01:53] computer science breakthrough. It has
[01:01:55] taken this many years for agents to
[01:01:58] realize their potential and become useful.
[01:02:00] It stands to reason
[01:02:02] that the computer that runs it is the
[01:02:04] most advanced in the world. This is Vera
[01:02:06] Rubin. Let's take a look. Can we bring
[01:02:09] out Vera Rubin, please?
[01:02:23] >> [applause]
[01:02:30] And Janine, do we have
[01:02:32] the racks, the systems?
[01:02:38] It looks heavy.
[01:02:41] This is Vera Rubin. Vera Rubin
[01:02:44] NVLink 72.
[01:02:47] This is the Groq LPX. At the next GTC,
[01:02:51] I'm going to talk to you about a lot
[01:02:52] more of this. Today, we have so much to
[01:02:54] talk to you about.
[01:02:56] This is
[01:02:57] Vera CPU rack. 256
[01:03:01] CPUs, all liquid-cooled. Let me tell you
[01:03:03] about Vera in just a moment. This is the
[01:03:06] Vera BlueField storage
[01:03:09] processing system and also security
[01:03:12] system. And of course, this is our
[01:03:14] Mellanox networking, the world's first
[01:03:18] CPO.
[01:03:20] This is Vera Rubin. Incredible
[01:03:22] technology all coming together.
[01:03:24] Now, when we built Hopper,
[01:03:27] we built Hopper, as you know, for
[01:03:29] pre-training.
[01:03:31] Pre-training was the most important
[01:03:33] application, the most important workload
[01:03:34] we were working on at the time.
[01:03:36] Then when we worked on Grace Blackwell,
[01:03:39] everybody said, "Jensen, you know,
[01:03:41] Nvidia is really good at pre-training.
[01:03:45] Inference is so easy." Do you remember
[01:03:47] that?
[01:03:48] People used to say, "Inference is so
[01:03:50] easy. We could do that, too."
[01:03:52] But, as you know, inference equals
[01:03:54] money.
[01:03:55] And the models, MOEs, are so
[01:03:58] complicated. And to do it at incredibly
[01:04:01] high
[01:04:02] response time,
[01:04:04] fast interactivity, and high throughput
[01:04:06] at the same time is incredibly hard,
[01:04:09] which is the reason why we created
[01:04:10] NVLink 72.
[01:04:12] Today, Nvidia's token cost is the lowest
[01:04:16] in the world. Not by 10%, by X factors,
[01:04:20] orders of magnitude.
[01:04:22] All because we did extreme co-design,
[01:04:25] all because we understood the computing
[01:04:28] model, the computing pattern of
[01:04:30] inference,
[01:04:31] and we were able to create NVLink 72.
[01:04:34] Now,
[01:04:36] with Vera Rubin,
[01:04:37] it is beyond inference.
[01:04:39] It is now inference in an
[01:04:42] agentic system. This is
[01:04:45] Vera Rubin.
[01:04:47] No cables,
[01:04:49] no hoses, no fans.
[01:04:53] The last time I
[01:04:55] showed this to you, we had cables
[01:04:57] everywhere.
[01:04:59] The cables were amazing to look at.
[01:05:01] But now,
[01:05:03] there's a PCB in the middle,
[01:05:05] which connects both sides.
[01:05:07] What used to take 2 hours now takes 5
[01:05:10] minutes. The reliability and the
[01:05:11] resilience of Vera Rubin is going to be
[01:05:14] off the charts.
[01:05:16] This is our Vera CPU tray.
[01:05:20] The most advanced CPU that has ever
[01:05:23] been built. I'm going to show you that
[01:05:24] in just a second.
[01:05:26] And this is
[01:05:28] our storage tray.
[01:05:30] Two Vera CPUs,
[01:05:32] four CX 9, incredible amounts of
[01:05:36] software.
[01:05:38] This is our new
[01:05:40] LPX LPU 30, the Groq system designed
[01:05:45] for very low latency inference.
[01:05:48] The throughput is delivered by Vera
[01:05:50] Rubin
[01:05:51] and extended with NVLink 72.
[01:05:54] If you want to extend that even further,
[01:05:57] you can have Groq LPUs.
[01:06:00] Here, we have the Vera Rubin NVLink, the
[01:06:03] switch tray. This is the switches in the
[01:06:06] middle, and this is revolutionary.
[01:06:09] Because of Vera Rubin's, because of
[01:06:11] NVLink 72 and the
[01:06:14] NVLink switches that we created and
[01:06:15] invented.
[01:06:17] And this
[01:06:19] is our Ethernet switches for scale out.
[01:06:23] What's amazing is we introduced these
[01:06:26] two systems
[01:06:28] for Grace Blackwell.
[01:06:29] These two systems were created for Grace
[01:06:32] Blackwell. And today,
[01:06:33] Nvidia is the largest networking company
[01:06:36] in the world.
[01:06:38] I'm so proud of the networking team.
[01:06:41] This is such an incredible enabler for
[01:06:43] everything that we do.
[01:06:45] I'm going to now talk to you about the
[01:06:47] next major industry we're going to be
[01:06:49] part of.
[01:06:52] Thank you.
[01:06:53] Janine.
[01:06:55] Thank you.
[01:06:57] >> [applause]
[01:07:01] >> Zaijian.
[01:07:06] I think there are 2,000 people back
[01:07:09] there pulling that.
[01:07:15] Okay, let's talk about CPUs.
[01:07:20] Vera CPUs
[01:07:22] CPUs built for the age of AI.
[01:07:25] All of the CPUs until now
[01:07:28] were created for people.
[01:07:31] We were the users.
[01:07:33] We were the users. We were the renters.
[01:07:37] The way we use CPUs, we live in a world
[01:07:40] counted by seconds.
[01:07:43] The way we rent CPUs in the cloud,
[01:07:46] the more
[01:07:49] CPU cores you have, the more you can rent.
[01:07:52] The use case of
[01:07:54] the old CPU and the economics of the old
[01:07:57] CPU are fundamentally different from agents.
[01:08:01] Agents
[01:08:03] are impatient.
[01:08:04] They don't live in a world that is in
[01:08:06] seconds. They live in a world that's in
[01:08:08] nanoseconds.
[01:08:10] When it uses a tool
[01:08:12] it wants the response time to be as fast
[01:08:14] as possible.
[01:08:16] When it accesses a database, it has to come
[01:08:18] back as soon as possible.
[01:08:21] Every moment that the agent is waiting
[01:08:25] keeps it from going to the next step,
[01:08:27] the next step, the next step.
[01:08:29] It is vital
[01:08:31] that we make the CPUs
[01:08:33] as low latency as possible, as
[01:08:36] interactive as possible.
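
The point about agents waiting on each step can be sketched as a serial loop. This is an illustrative model only: it assumes every step strictly alternates between model time on the accelerator and tool time on the CPU, with no overlap. The step counts, timings, and function names are hypothetical.

```python
def agent_task_seconds(steps, model_seconds, tool_seconds):
    """Total wall time for an agent that runs its steps strictly in series."""
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return steps * (model_seconds + tool_seconds)


def accelerator_idle_fraction(model_seconds, tool_seconds):
    """Share of each step the accelerator spends waiting on CPU-side work."""
    total = model_seconds + tool_seconds
    if total <= 0:
        raise ValueError("per-step time must be positive")
    return tool_seconds / total


if __name__ == "__main__":
    # Hypothetical 100-step task with 0.5 s of model time per step.
    for tool_seconds in (0.5, 0.1):
        total = agent_task_seconds(100, 0.5, tool_seconds)
        idle = accelerator_idle_fraction(0.5, tool_seconds)
        print(f"tool={tool_seconds}s: task={total:.0f}s, accelerator idle={idle:.0%}")
```

In this model, every second shaved off the tool call is removed from every step, and the accelerator spends a smaller share of the task waiting.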
[01:08:38] So, we created Vera CPU for the age of
[01:08:41] AI.
[01:08:43] Now, inside our system, it's used in
[01:08:45] three different ways. The first way, of
[01:08:47] course, is Vera Rubin
[01:08:52] for thinking.
[01:08:54] And inside the Vera Rubin rack, there
[01:08:57] are already two CPUs.
[01:08:59] As you know, we are
[01:09:01] building and selling millions of Vera
[01:09:05] Rubins. We have sold millions of Grace
[01:09:08] Blackwells. Nvidia already is one of the
[01:09:10] largest CPU makers in the world. In
[01:09:14] the Vera Rubin rack are two CPUs. One
[01:09:18] for orchestrating and managing the GPUs,
[01:09:22] managing the KV cache,
[01:09:25] dealing with all of the software that
[01:09:27] runs in the rack. We also have the Grace
[01:09:31] BlueField that is used for security and
[01:09:34] isolation.
[01:09:36] The Vera compute
[01:09:38] is used for the harness, the
[01:09:40] orchestration of the AI models, tool
[01:09:43] use, accessing the database.
[01:09:46] And the data servers
[01:09:48] are right here, Vera BlueField. The
[01:09:51] fastest storage servers,
[01:09:54] the fastest storage system the world has
[01:09:56] ever made. And the reason why this is so
[01:09:59] vital
[01:10:00] is because agents are accessing memory
[01:10:02] so incredibly fast.
[01:10:05] These systems
[01:10:08] the storage server
[01:10:10] and the CPUs
[01:10:12] are now the critical path of the most
[01:10:14] expensive part of the data center.
[01:10:17] This is the most expensive for a good
[01:10:19] reason.
[01:10:21] The
[01:10:22] economics of the AI
[01:10:25] factory is tokens.
[01:10:28] And the tokens are created here.
[01:10:31] And so of course you want to manufacture
[01:10:33] and generate as many tokens as possible.
[01:10:36] This is where you put all of your
[01:10:37] economics and this has to not be in the
[01:10:41] way.
[01:10:42] And so there is great pressure on
[01:10:45] the CPU architecture, which
[01:10:47] is the reason why we built a brand new
[01:10:50] architecture from the ground up. A CPU
[01:10:52] the world has never seen before.
[01:10:55] We call it Vera.
[01:10:57] This is CPU
[01:10:59] for agents.
[01:11:01] All the CPUs of the past were built for
[01:11:04] humans.
[01:11:05] This CPU is built for agents.
[01:11:08] Well, there are four things to keep in
[01:11:10] mind. The four takeaways.
[01:11:12] The first takeaway
[01:11:14] is that the instructions per clock of
[01:11:17] Vera has to be incredibly good because
[01:11:20] we need the latency to be short.
[01:11:22] We need the processing time single
[01:11:25] threaded performance not throughput
[01:11:28] single threaded performance has to be
[01:11:30] world-class absolutely the best.
[01:11:33] Single threaded performance which is the
[01:11:34] reason why the IPC, the instructions per
[01:11:38] clock, of Vera is so high. It's the
[01:11:40] highest in the world. 10 instructions
[01:11:43] fetched, decoded, and executed per clock.
[01:11:47] Number one.
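
The relationship between instructions per clock and single-threaded latency follows the standard performance equation: time equals instruction count divided by IPC times clock frequency. The sketch below is illustrative only. The clock frequency and instruction count are hypothetical, and the 10-wide figure from the talk is a peak width, so sustained IPC on real code would be lower.

```python
def execution_seconds(instruction_count, instructions_per_clock, clock_hz):
    """Single-thread run time: instructions / (IPC * clock frequency)."""
    if instructions_per_clock <= 0 or clock_hz <= 0:
        raise ValueError("IPC and clock frequency must be positive")
    return instruction_count / (instructions_per_clock * clock_hz)


if __name__ == "__main__":
    instructions = 3e9  # hypothetical tool call of 3 billion instructions
    clock_hz = 3e9      # hypothetical 3 GHz clock, not a keynote figure
    for ipc in (5, 10):
        seconds = execution_seconds(instructions, ipc, clock_hz)
        print(f"IPC {ipc}: {seconds * 1000:.0f} ms")
```

At a fixed clock, doubling the sustained IPC halves the time a single thread needs for the same work, which is why IPC matters for latency rather than only for aggregate throughput.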
[01:11:48] Number two.
[01:11:50] The bandwidth necessary to move data in
[01:11:54] and out of the CPU has to be utterly
[01:11:56] world-class.
[01:11:58] The second thing is bandwidth per core.
[01:12:01] The third is just bandwidth period.
[01:12:04] We're moving remember I said earlier
[01:12:08] agentic systems are fundamentally
[01:12:11] disaggregated
[01:12:12] and distributed. Disaggregated and
[01:12:15] distributed. When computing is
[01:12:17] disaggregated and distributed
[01:12:20] networking becomes the problem.
[01:12:22] Therefore, we have to move the data
[01:12:24] around as fast as possible between the
[01:12:27] CPU cores and between the CPU and the
[01:12:30] storage the CPU and the GPU.
[01:12:33] The bandwidth around the system and
[01:12:35] inside the CPU core has to be utterly
[01:12:38] world-class.
[01:12:40] This is the first CPU that's been built
[01:12:42] in a long time
[01:12:43] that is literally at reticle limits, with
[01:12:46] a fabric that connects all of the CPU
[01:12:49] cores at the speed of light. 3.6
[01:12:53] terabytes per second.
[01:12:55] No chiplet tax,
[01:12:57] no chip boundary crossings because we
[01:13:00] need to have everything
[01:13:02] because the CPU cores are talking to
[01:13:04] each other with extremely high
[01:13:06] bandwidth. They're not rented core per
[01:13:09] core per core. They're all working
[01:13:11] together.
[01:13:12] The cross-sectional bandwidth of Vera is
[01:13:15] off the charts. It's the first one to be
[01:13:17] PCI Express Gen 6.
[01:13:20] It is also
[01:13:21] the first one to have LPDDR5
[01:13:25] with 1.2 terabytes per second. Two
[01:13:28] to three times the bandwidth
[01:13:30] of the highest performance CPUs on the
[01:13:33] outside, three times the bandwidth on
[01:13:36] the inside. The bandwidth per core and
[01:13:40] the bandwidth period is world-class.
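
The "bandwidth per core" idea can be made concrete by dividing the quoted totals by the core count. This is a rough sketch: the 1.2 and 3.6 terabytes per second come from the talk, and the 88-core count comes from the video narration later in the keynote. Even sharing across cores and decimal units (1 TB = 1,000 GB) are assumptions, and the function name is hypothetical.

```python
def bandwidth_per_core_gb_per_s(total_tb_per_s, cores):
    """Evenly divide a chip-level bandwidth figure across its cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return total_tb_per_s * 1000 / cores


if __name__ == "__main__":
    cores = 88  # core count quoted later in the keynote's video narration
    memory = bandwidth_per_core_gb_per_s(1.2, cores)  # memory bandwidth
    fabric = bandwidth_per_core_gb_per_s(3.6, cores)  # on-chip fabric
    print(f"memory: {memory:.1f} GB/s per core")
    print(f"fabric: {fabric:.1f} GB/s per core")
```

Under these assumptions, each core has roughly 13.6 GB/s of memory bandwidth and about 40.9 GB/s of fabric bandwidth available to it.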
[01:13:42] Now, remember I showed you earlier
[01:13:45] the number of CPU cores, the number of
[01:13:47] CPUs is going to be quite high.
[01:13:50] And the reason for that is very simple.
[01:13:54] We created CPUs
[01:13:56] for humans in the past.
[01:13:59] And humans, there are only 1 billion of
[01:14:01] us.
[01:14:02] There will be billions of agents. And
[01:14:06] these agents are going to be using the
[01:14:09] CPUs with very little patience because
[01:14:12] the cost of the GPU they sit next to is
[01:14:14] too high.
[01:14:15] And therefore,
[01:14:17] too valuable, too precious.
[01:14:20] Therefore, these CPUs
[01:14:22] are going to be
[01:14:24] both performant,
[01:14:26] but they also have to be extremely
[01:14:28] energy efficient so that we can cram as
[01:14:31] many CPUs as we can into the factory
[01:14:34] without taking away power
[01:14:37] from the token generation, which we know
[01:14:40] is how we make money.
[01:14:41] These four properties, instructions per
[01:14:44] clock or single-threaded performance,
[01:14:47] bandwidth per core,
[01:14:49] the total bandwidth around the chip and
[01:14:51] inside the chip,
[01:14:53] and energy efficiency
[01:14:55] define Vera. It is absolutely
[01:14:58] world-class.
[01:14:59] When you compare it to the highest
[01:15:00] performance x86, it is just off the
[01:15:03] charts.
[01:15:04] When you compare it in real
[01:15:06] single-threaded performance,
[01:15:08] real performance, it's off the charts.
[01:15:13] It is incredible to be able to deliver
[01:15:15] 5% improvement on CPUs. It is incredible
[01:15:19] to be able to deliver 10%,
[01:15:21] but this kind of performance speedup is
[01:15:24] just unheard of.
[01:15:26] This is
[01:15:27] Nvidia Vera.
[01:15:29] What do you think?
[01:15:31] >> [applause]
[01:15:36] >> Let's take a look.
[01:15:39] >> Agentic [music] AI changes the role of
[01:15:41] the CPU.
[01:15:42] The CPU is now the conductor,
[01:15:44] >> [music]
[01:15:44] >> and the GPU is the orchestra.
[01:15:46] Traditional CPUs were built for a
[01:15:49] different era, maximizing cores per
[01:15:51] socket. Slice them up,
[01:15:53] >> [music]
[01:15:53] >> virtualize, rent by the hour.
[01:15:56] In the age of agents, the CPU is now a
[01:15:59] bottleneck to GPU utilization,
[01:16:02] directly affecting token throughput,
[01:16:04] latency, and user experience. [music]
[01:16:07] Nvidia Vera is the CPU built for the
[01:16:10] agentic loop, combining Nvidia's custom
[01:16:12] data center CPU core with [music] a
[01:16:14] scalable coherency fabric for the right
[01:16:17] balance of performance cores and
[01:16:18] bandwidth to maximize AI factory output.
[01:16:22] At the heart of Vera [music] is the
[01:16:23] Nvidia Olympus core, built for modern
[01:16:26] data center workloads, branch-heavy
[01:16:28] Python [music] runtimes, tool calls, and
[01:16:31] sandboxed code execution.
[01:16:33] Each core is tuned for throughput. A
[01:16:36] neural branch predictor [music]
[01:16:37] evaluating two taken branches per cycle.
[01:16:40] A 10-wide decode engine brings in more
[01:16:43] work each cycle. A large out-of-order
[01:16:45] engine keeps [music] instructions
[01:16:46] moving. Advanced prefetchers with a
[01:16:49] novel graph engine anticipating the
[01:16:51] [music] next data path. But fast cores
[01:16:54] only matter when data arrives correctly
[01:16:56] and on time.
[01:16:58] Vera is the first [music] CPU to use
[01:17:00] LPDDR5X memory while correcting multiple
[01:17:03] errors simultaneously without
[01:17:05] compromising bandwidth.
[01:17:07] Vera achieves 40% lower peak memory
[01:17:09] latency versus x86
[01:17:12] keeping cores fed on time through
[01:17:14] retrieval, analytics, and sandbox
[01:17:17] execution.
[01:17:18] Nvidia's second generation scalable
[01:17:20] [music] coherency fabric unifies all 88
[01:17:23] Olympus cores on a monolithic mesh with
[01:17:26] separate dies for memory and IO. Cores
[01:17:29] are [music] not split across chiplets
[01:17:31] enabling 50% faster core-to-core
[01:17:34] communication than traditional CPUs.
[01:17:37] And memory coherent NVLink chip-to-chip
[01:17:40] connects GPUs directly to the [music]
[01:17:42] fabric.
[01:17:43] Beyond GPUs, NVLink chip-to-chip can
[01:17:46] scale Vera up to multiple sockets
[01:17:48] >> [music]
[01:17:49] >> enabling massive bandwidth between CPUs.
[01:17:52] Vera delivers 1.8 [music] times the
[01:17:55] agentic sandbox performance of x86 CPUs.
[01:17:58] Standalone Vera racks run agent
[01:18:00] sandboxes, tools, code, and data
[01:18:03] pipelines.
[01:18:05] Tightly coupled [music] to Rubin GPUs,
[01:18:07] Vera keeps accelerated workflows moving.
[01:18:10] Nvidia Vera [music] BlueField 4 STX
[01:18:13] powers context memory and AI storage.
[01:18:17] Compute, networking, storage. Vera is
[01:18:21] the CPU for the age of agents.
[01:18:28] >> [applause]
[01:18:31] >> This is going to be our new major growth
[01:18:34] driver. The reviews are already coming
[01:18:36] out.
[01:18:38] And it's pretty good.
[01:18:40] That's pretty good stuff.
[01:18:47] >> [applause]
[01:18:49] >> Now, remember
[01:18:53] Grace and Vera
[01:18:55] are also the most highly qualified
[01:18:59] CPUs in the world of AI because every
[01:19:01] single data center, every single cloud,
[01:19:04] every single enterprise, every company
[01:19:06] that works with NVIDIA on AI has already
[01:19:10] qualified
[01:19:11] Grace. The entire software stack has
[01:19:14] already been optimized for Grace. Every
[01:19:17] company will be qualifying Vera.
[01:19:20] Vera will be the most optimized agentic
[01:19:23] CPU in the world.
[01:19:24] Simply because it's going to go with
[01:19:26] Vera Rubin. Simply because we made the
[01:19:29] big hard switch. In fact, during
[01:19:32] Grace Blackwell transition, the biggest
[01:19:34] risk was going from external CPU x86
[01:19:38] into Grace Blackwell.
[01:19:41] That transition was extremely dangerous,
[01:19:44] but we did it with incredible execution.
[01:19:47] Now, Grace is literally synonymous with
[01:19:51] Grace Blackwell.
[01:19:52] When people say Blackwell, they say
[01:19:53] Grace Blackwell. Because it is utterly
[01:19:56] now everywhere. Every company's software
[01:19:59] stack has been optimized for it.
[01:20:00] Everybody's security stack has been
[01:20:02] optimized for it. And now here comes
[01:20:04] Vera. I'm super excited about that. Now,
[01:20:07] look at some of the performance numbers.
[01:20:10] Speedups are one thing. It is extremely
[01:20:13] hard to speed up SQL.
[01:20:16] SQL
[01:20:18] the most famous
[01:20:21] domain-specific language DSL that has
[01:20:24] ever been created. You
[01:20:28] know, before CUDA, there was SQL. Before
[01:20:31] OpenGL, there was SQL,
[01:20:33] invented by IBM.
[01:20:36] Today, it is the structured database
[01:20:38] engine of the planet. Everybody uses
[01:20:40] SQL.
[01:20:41] This is SQL running three times
[01:20:44] faster. Not 10% faster, not 25% faster,
[01:20:49] three times faster. Incredible.
[01:20:53] The next one
[01:20:54] is real-time
[01:20:57] stream processing.
[01:20:59] Remember, your AI is going to be
[01:21:02] not just reading documents. Your AI is
[01:21:04] going to be watching for telemetry,
[01:21:07] especially inside a factory, inside a
[01:21:09] stock exchange.
[01:21:12] You're going to be looking for telemetry
[01:21:13] continuously.
[01:21:15] The burst of data that's coming in goes
[01:21:17] into a CPU.
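
The real-time stream processing described here can be pictured as a computation that updates a result on every incoming event rather than re-scanning stored data. A minimal Python sketch follows, assuming a hypothetical feed of (timestamp, price) ticks and a fixed time window. The class name `SlidingWindowMean`, the `window_seconds` parameter, and the sample ticks are illustrative assumptions, not anything shown in the keynote or taken from NVIDIA or NYSE software.

```python
from collections import deque


class SlidingWindowMean:
    """Running mean of values seen in the last `window_seconds` (hypothetical helper)."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp, value) pairs, oldest first
        self._total = 0.0

    def push(self, timestamp: float, value: float) -> float:
        """Add one event, evict expired ones, and return the current window mean."""
        self._events.append((timestamp, value))
        self._total += value
        # Drop events that fell out of the window; each event is evicted at most once.
        while self._events and self._events[0][0] <= timestamp - self.window_seconds:
            _, old_value = self._events.popleft()
            self._total -= old_value
        return self._total / len(self._events)


# Made-up telemetry ticks: (seconds, price)
ticks = [(0.0, 100.0), (1.0, 102.0), (2.0, 104.0), (5.0, 110.0)]
window = SlidingWindowMean(window_seconds=3.0)
means = [window.push(t, price) for t, price in ticks]
print(means)
```

Because each event is appended once and evicted at most once, the work per event stays roughly constant, which is why this style of processing tends to be sensitive to per-core speed and memory bandwidth.
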
[01:21:19] This is Vera CPU running real-time
[01:21:23] stream processing for the New York Stock
[01:21:25] Exchange. Lynn Martin, the president of
[01:21:27] the New York Stock Exchange, has been so
[01:21:29] gracious to partner with us.
[01:21:31] This system is run all over the world in
[01:21:34] real time. Real-time stream processing:
[01:21:36] Vera CPU, six times. All because of the
[01:21:39] bandwidth, the single-threaded
[01:21:43] instruction execution, the bandwidth
[01:21:45] inside between the cores, the bandwidth
[01:21:47] outside.
[01:21:48] Vera is completely revolutionary.
[01:21:51] That's Vera.
[01:21:55] >> [applause]
[01:21:59] >> You know, X-factors are something you
[01:22:01] talk about when you're talking about
[01:22:03] GPUs. It is quite rare that somebody
[01:22:06] talks about X-factors on a real workload,
[01:22:09] a real workload
[01:22:11] that is associated with CPUs. So, I'm so
[01:22:13] proud of the team. You guys did such a
[01:22:15] great job. We have an extraordinary road
[01:22:17] map coming.
[01:22:24] But, what's really exciting
[01:22:27] is almost everybody is supporting Vera.
[01:22:30] They're as excited as we are. This is
[01:22:32] Vera opening up.
[01:22:35] It's opened up a brand new market.
[01:22:38] Agents.
[01:22:40] Agents are a new workload.
[01:22:42] We built CPUs for humans in the past.
[01:22:45] We need CPUs for agents, agentic
[01:22:49] systems. Their properties are different.
[01:22:51] Why would the old CPUs be the same?
[01:22:54] We are building millions and
[01:22:57] millions of Veras.
[01:22:59] Millions of Veras. And to go to market
[01:23:02] with us:
[01:23:03] Taiwan's ODMs and computer makers,
[01:23:06] all the OEMs.
[01:23:08] And you could see the early adopters.
[01:23:11] The early adopters are the agentic
[01:23:13] companies.
[01:23:14] This is the beginning of a new market.
[01:23:17] A market that never existed before.
[01:23:20] It's not going to take away from the old
[01:23:21] markets.
[01:23:23] But this is a new market.
[01:23:25] CPU for agents.
[01:23:28] And this market will
[01:23:30] surely be larger than the last. And the
[01:23:32] reason for that is because there'll be a
[01:23:34] lot more agents than there are people.
[01:23:36] And then the agents are very
[01:23:38] impatient. So, Nvidia
[01:23:41] Vera CPU. Thank you.
[01:23:44] >> [applause]
[01:23:50] >> This is the most important slide really.
[01:23:52] This is the takeaway.
[01:23:54] The takeaway here is that this is the
[01:23:56] application pattern. This is the
[01:23:58] computing pattern of the next decade.
[01:24:02] Agents
[01:24:04] harnesses
[01:24:06] orchestrating large language models.
[01:24:09] Every company will run it.
[01:24:12] Every company will be an agent company.
[01:24:15] Every company will have agents running
[01:24:17] inside.
[01:24:19] Every company will see that
[01:24:22] agents
[01:24:23] will need their own operating system.
[01:24:26] Every company's asking us, "How do we
[01:24:28] run agents
[01:24:29] safely? How do we build agents for our
[01:24:33] own workloads?" And so, we have the
[01:24:37] Nvidia agent toolkit for enterprise AI.
[01:24:41] You've seen me build this in plain
[01:24:43] sight.
[01:24:44] Almost everything that Nvidia does, as
[01:24:46] you know, at every GTC, if you go back
[01:24:48] and look at my GTC 5 years ago or 10
[01:24:50] years ago, you will see today.
[01:24:52] This you've seen me talking about for
[01:24:56] several years now, because we've been
[01:24:57] building for this moment.
[01:25:00] There are four things that companies
[01:25:01] need in order to
[01:25:03] build agents as a service or build
[01:25:06] agents to operate.
[01:25:09] The first thing you need is you need
[01:25:10] models.
[01:25:12] Of course, large language models. The
[01:25:14] smarter the better, the cheaper the
[01:25:16] better, the faster the better.
[01:25:18] The second
[01:25:20] is you need a harness
[01:25:22] to orchestrate the whole thing. The
[01:25:24] third,
[01:25:25] these models want to use tools.
[01:25:28] And these tools come with their skills.
[01:25:31] And I showed you CUDA X libraries, those
[01:25:33] are going to be amazing tools for the
[01:25:35] agents in the future.
[01:25:37] And then lastly,
[01:25:39] you need a runtime.
[01:25:41] You need the operating system that holds
[01:25:43] it all together. This is the Nvidia
[01:25:46] toolkit for agents.
[01:25:48] It includes
[01:25:52] models
[01:25:54] that you can modify. Nvidia's
[01:25:55] world-class open models, and I'll show
[01:25:57] you more.
[01:25:59] You can run agents
[01:26:01] from anybody. You could run Claude
[01:26:04] Code, incredible agent. Codex,
[01:26:06] incredible agent. You could run it
[01:26:08] inside this harness called Open Shell,
[01:26:11] which will be highly secure
[01:26:13] inside the enterprise.
[01:26:15] The shell
[01:26:16] protects the agent, keeps it grounded in
[01:26:20] security policies.
[01:26:22] Privacy is protected,
[01:26:25] its rights and privileges are given,
[01:26:27] its identity is protected. And so, this
[01:26:30] Open Shell is being adopted all over the
[01:26:33] world. NVIDIA Open Shell is open source.
[01:26:36] You can see so many companies adopt it.
[01:26:38] Red Hat,
[01:26:39] Canonical, Microsoft. It's going to be
[01:26:42] adopted everywhere.
[01:26:43] This is the
[01:26:46] runtime. And this runtime is fully
[01:26:49] optimized for the NVIDIA AI platform,
[01:26:52] which is everywhere. So, you can run
[01:26:54] Open Shell
[01:26:56] in any cloud, on prem, and even on
[01:26:59] device.
[01:27:00] So, you now have tools and
[01:27:02] libraries
[01:27:04] that they can use.
[01:27:06] You have models that you can modify or
[01:27:08] use as is, or you have agents.
[01:27:11] This would be OpenClaw,
[01:27:14] Hermes, another
[01:27:17] incredible
[01:27:18] harness. These agentic harnesses can now
[01:27:22] run on prem for you, or anywhere. Okay?
[01:27:25] So, four things,
[01:27:27] and this represents the operating system
[01:27:29] of the modern enterprise.
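
The four pieces listed in this passage (model, harness, tools, runtime) can be sketched as a toy loop. This is only an illustration under stated assumptions: the "model" is a stub function rather than Nemotron or any real LLM, the tools are two made-up functions, and `Runtime` is a hypothetical allow-list policy, not the Open Shell API, which the talk does not describe at code level.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


def stub_model(task: str) -> Tuple[str, str]:
    """1. Model: stand-in for an LLM; picks a tool name and an argument."""
    if task.startswith("add "):
        return "add", task[len("add "):]
    return "echo", task


# 3. Tools: plain functions the agent is able to call (both hypothetical).
TOOLS: Dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split())),
    "echo": lambda arg: arg,
}


@dataclass
class Runtime:
    """4. Runtime: enforces a policy (here, an allow-list) on every tool call."""
    allowed_tools: set
    audit_log: List[str] = field(default_factory=list)

    def call(self, tool: str, arg: str) -> str:
        if tool not in self.allowed_tools:
            self.audit_log.append(f"denied:{tool}")
            return f"policy denied tool '{tool}'"
        self.audit_log.append(f"allowed:{tool}")
        return TOOLS[tool](arg)


def harness(task: str, model: Callable[[str], Tuple[str, str]], runtime: Runtime) -> str:
    """2. Harness: asks the model what to do, then routes the call through the runtime."""
    tool, arg = model(task)
    return runtime.call(tool, arg)


runtime = Runtime(allowed_tools={"add"})
print(harness("add 2 3 4", stub_model, runtime))
print(harness("hello", stub_model, runtime))
print(runtime.audit_log)
```

The design point the sketch tries to capture is that the harness never calls a tool directly; every call passes through the runtime, which is where policy and auditing live.
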
[01:27:31] Now, how do we use this?
[01:27:33] One of my favorite use cases of agents
[01:27:37] is chip designers.
[01:27:39] It is the single most important thing
[01:27:41] that NVIDIA does.
[01:27:43] And so, of course, we have to partner
[01:27:45] with Cadence
[01:27:47] to build a super agent, a chip design
[01:27:50] super agent.
[01:27:52] It is orchestrated by Codex or Claude
[01:27:56] Code.
[01:27:57] It has RTL and architecture diagrams or
[01:28:01] schematics or
[01:28:02] specifications as input and whatever you
[01:28:05] need to fix.
[01:28:06] And together we created some super
[01:28:09] agents
[01:28:11] that are optimized for the NVIDIA
[01:28:14] runtime
[01:28:15] with Nemotron
[01:28:17] and let's take a look. It's really
[01:28:18] incredible.
[01:28:22] >> Cadence and NVIDIA are partnering to
[01:28:24] [music] build chip design agents.
[01:28:27] Hundreds of thousands of NVIDIA chips
[01:28:30] come together to make the AI factories
[01:28:32] that power the world's frontier [music]
[01:28:33] AI models.
[01:28:35] Designing these chips and the systems
[01:28:37] they run in is one of the hardest
[01:28:39] engineering challenges. Trillions of
[01:28:41] [music] transistors, three-dimensional
[01:28:44] circuits, microscopic scale. Every gate,
[01:28:47] every wire synchronized to picoseconds
[01:28:50] [music]
[01:28:51] must work in perfect harmony with no
[01:28:53] margin for error.
[01:28:54] Physical prototypes are too slow and too
[01:28:57] costly.
[01:28:58] So, engineers work in the digital
[01:28:59] [music] realm. Each chip begins as a set
[01:29:01] of architectural specifications, then
[01:29:04] translated into RTL, the language of
[01:29:06] chip [music] design.
[01:29:08] RTL must be verified in simulation.
[01:29:11] A single bug can delay a chip by months.
[01:29:13] At NVIDIA,
[01:29:14] >> [music]
[01:29:14] >> thousands of engineers, billions of
[01:29:17] compute hours per year, millions of
[01:29:19] tests written, run, and debugged. A
[01:29:21] cycle that takes teams weeks. [music]
[01:29:24] To compress this cycle, Cadence and
[01:29:26] NVIDIA built a design verification
[01:29:28] agent.
[01:29:29] Codex [music] orchestrates the process.
[01:29:31] Cadence ChipStack launches the RTL
[01:29:33] verification loop.
[01:29:35] Powered by Nemotron and [music] secured
[01:29:37] by NVIDIA Open Shell.
[01:29:39] Calling on expert sub agents in RTL
[01:29:42] generation, test bench creation,
[01:29:44] regression testing, and debug.
[01:29:47] The system drives itself.
[01:29:48] >> [music]
[01:29:49] The ChipStack agents run hundreds of
[01:29:51] simulations with Cadence Xcelium, formal
[01:29:53] verification with Jasper.
[01:29:55] >> [music]
[01:29:56] >> Design flaws revealed. Bugs in the code
[01:29:59] fixed.
[01:30:00] What once took weeks now takes hours.
[01:30:02] [music]
[01:30:03] Verification cycles over 40 times
[01:30:05] faster.
[01:30:07] Together,
[01:30:07] >> [music]
[01:30:07] >> NVIDIA and Cadence are reinventing chip
[01:30:09] design with AI agents.
[01:30:14] >> [applause] From weeks to
[01:30:16] hours.
[01:30:18] From weeks to hours. From weeks to hours.
[01:30:21] NVIDIA has thousands of chip designers.
[01:30:23] We are going to hire hundreds
[01:30:25] of thousands of Cadence super agents
[01:30:29] that work with us so that we can
[01:30:30] accelerate our company so that we can be
[01:30:34] even more ambitious, create even more
[01:30:36] amazing things, run even faster. You saw
[01:30:39] earlier
[01:30:40] the toolkit,
[01:30:42] with models,
[01:30:44] harness,
[01:30:45] tools.
[01:30:47] The tools in this case are Cadence
[01:30:49] simulators and verifiers, formal
[01:30:51] verification systems. It is the reason
[01:30:53] why we're working with Cadence so hard
[01:30:55] to accelerate all of their tools on CUDA
[01:30:59] because the agents are impatient. The
[01:31:01] agents want the answer immediately.
[01:31:04] And so:
[01:31:05] models,
[01:31:06] harnesses,
[01:31:08] CUDA-accelerated libraries
[01:31:11] and tools,
[01:31:12] and then the runtime.
[01:31:14] What you saw just now is all of that
[01:31:16] coming together.
[01:31:18] Now, one of the things that it starts
[01:31:19] with
[01:31:20] is a great model that Cadence could
[01:31:23] modify and tune to be expert at the
[01:31:26] Cadence workflow, at the Cadence
[01:31:28] expertise
[01:31:30] so that they could create super agents
[01:31:33] that are proprietary to Cadence with
[01:31:35] their proprietary knowledge.
[01:31:38] They have to start with an excellent
[01:31:39] model. We call it Nemotron.
[01:31:42] NVIDIA is dedicated to building open models
[01:31:45] for the world so that all of you, all of
[01:31:47] us, could create our own agents.
[01:31:50] Today, we're announcing
[01:31:53] Nemotron
[01:31:56] 3 Ultra.
[01:31:58] Yep, our next open model. And it is
[01:32:01] smart.
[01:32:03] >> [applause]
[01:32:07] >> With the Nemotron models, we
[01:32:10] not only give you the model,
[01:32:12] we give you all the data that we used to
[01:32:15] train the model. And because we have a
[01:32:17] coalition of incredible partners, you
[01:32:19] can see all of our partners down here.
[01:32:22] We work together,
[01:32:24] contribute data to each other.
[01:32:26] Nemotron is trained on one of the
[01:32:29] largest suites of long-running reasoning,
[01:32:32] long-running task-solving,
[01:32:35] tool-using
[01:32:37] data sets in the world
[01:32:39] because of all of our great
[01:32:40] partnerships.
[01:32:41] All of this, from the model,
[01:32:44] the training script, and the data, is made
[01:32:47] completely available to you. This is
[01:32:50] open models at its best.
[01:32:53] The best open model system policies in
[01:32:55] the world.
[01:32:57] The simple goal is so that you can take all
[01:32:59] of it, add to it, make it even better,
[01:33:02] make it yours.
[01:33:03] Nemotron 3 Ultra
[01:33:06] is
[01:33:07] five times faster.
[01:33:09] This is the world's first model based on
[01:33:11] a hybrid architecture
[01:33:13] of SSM, state space models, with
[01:33:17] mixture of experts. The architecture is
[01:33:20] incredibly fast. We made it fast so that
[01:33:23] you could think fast. When you think
[01:33:25] fast, you can think longer at the same
[01:33:27] cost.
[01:33:28] So, five times faster.
[01:33:31] It is also
[01:33:33] 30% cheaper.
[01:33:35] 30% lower cost to run in total flops and
[01:33:38] total inference time than even the most
[01:33:40] cost-effective in the world.
[01:33:42] We're comparing against the world's best
[01:33:44] open models.
[01:33:46] Frontier smart.
[01:33:48] Five times faster.
[01:33:50] 30% cheaper.
[01:33:53] Completely open.
[01:33:55] We're completely dedicated to this. This
[01:33:57] is now Nemotron 3.
[01:33:59] We're currently working on Nemotron 4.
[01:34:02] So, this entire toolkit
[01:34:04] from models,
[01:34:06] harnesses,
[01:34:07] tools and skills,
[01:34:09] and runtimes is the reason why
[01:34:13] every
[01:34:14] enterprise company in the world has the
[01:34:16] ability now
[01:34:17] to create their own agents just like
[01:34:20] Cadence did with their super agents. And
[01:34:22] we're working with so many companies,
[01:34:24] Cadence and CrowdStrike, Dassault and
[01:34:26] Palantir, SAP and ServiceNow.
[01:34:29] People always said, "Jensen,
[01:34:32] the agents are going to disrupt these
[01:34:33] markets."
[01:34:35] I said the complete opposite, and you can
[01:34:37] now see it.
[01:34:39] Agents are going to create the largest
[01:34:40] opportunity ever for my partners and
[01:34:43] friends.
[01:34:44] And we have the NVIDIA
[01:34:48] agentic toolkit for enterprise AI to
[01:34:51] help them.
[01:34:53] So,
[01:34:54] there you go.
[01:34:56] >> [applause]
[01:35:00] >> First, Vera Rubin in full production.
[01:35:03] Two, Vera CPU, a CPU built for a new
[01:35:06] generation
[01:35:08] for agents.
[01:35:09] And three, NVIDIA's enterprise AI
[01:35:12] toolkits so that every enterprise and
[01:35:16] every enterprise software company can
[01:35:18] build agents.
[01:35:26] >> [applause]
[01:35:30] >> My relationship with you started here.
[01:35:34] And many of you, many of you, many of my
[01:35:37] friends and partners here in Taiwan,
[01:35:39] your companies started here.
[01:35:43] This is in a lot of ways the beginning
[01:35:46] of the modern computer industry, 40
[01:35:49] years now.
[01:35:50] NVIDIA's 33 years old.
[01:35:53] The PC industry was already starting to
[01:35:55] get Windows 1 and Windows 2 and
[01:35:59] Apple 1 and Apple 2, and
[01:36:02] by the time that we came along,
[01:36:04] Windows 3.1 was the PC.
[01:36:07] And as you know, Windows 95 made PC
[01:36:11] personal.
[01:36:12] It took the PC from enterprise
[01:36:15] companies and made it into a consumer
[01:36:18] electronics device.
[01:36:20] Everybody should have one and everybody
[01:36:21] does.
[01:36:22] This is the beginning.
[01:36:24] This computing platform did several
[01:36:26] things incredibly smart.
[01:36:29] Windows was not just disaggregated, as
[01:36:31] you know.
[01:36:32] Windows was properly abstracted. It was
[01:36:36] architected just right.
[01:36:37] System BIOSes,
[01:36:40] open chipsets,
[01:36:42] the operating system with
[01:36:44] drivers,
[01:36:47] drivers that could be connected and
[01:36:49] installed at run time.
[01:36:51] And an abstraction layer with a
[01:36:53] multimedia API
[01:36:58] that opened up the PC to what we all
[01:37:00] know today.
[01:37:01] Each one of these elements was
[01:37:03] essential
[01:37:05] in making the PC so popular.
[01:37:08] 40 years later,
[01:37:10] Microsoft and Nvidia
[01:37:13] are going to reinvent the PC.
[01:37:16] This is going to be the new PC.
[01:37:19] Now, tomorrow night, I
[01:37:21] think it's tomorrow night our time,
[01:37:23] I'm going to be with Satya. We're
[01:37:25] going to talk a lot more about the work
[01:37:27] that we're doing together.
[01:37:29] Microsoft and Nvidia over the last 3
[01:37:31] years,
[01:37:32] it took this long to completely reinvent
[01:37:36] how the PC's going to work so that we
[01:37:38] could be ready for this moment.
[01:37:40] As I mentioned earlier,
[01:37:42] that compute pattern called the agent is
[01:37:45] going to run in AI clouds. It's going to
[01:37:48] run inside enterprises. It is also going
[01:37:51] to run on your PC.
[01:37:54] What's going to happen to that PC when
[01:37:56] it has an autonomous agent?
[01:37:59] An agent that's helping you, that
[01:38:01] understands you.
[01:38:02] You could talk to it. It could look at
[01:38:04] you. You could
[01:38:06] ask it to read files,
[01:38:08] go help you
[01:38:10] do some research. It could do a lot more
[01:38:12] that I'll show you.
[01:38:13] But the new operating system
[01:38:16] is of course the old operating system
[01:38:18] plus
[01:38:19] large language models.
[01:38:21] Large language models in a lot of ways
[01:38:24] are the modern version of DirectX.
[01:38:27] It has of course input and output,
[01:38:30] understands prompts, it understands
[01:38:32] computer vision, it can generate video,
[01:38:34] it can generate sounds. It is the modern
[01:38:36] extension, the intelligence extension of
[01:38:40] the PC, of a computer.
[01:38:43] On top of that, the application, as I
[01:38:46] mentioned before, is going to be
[01:38:47] replaced
[01:38:49] by now an agentic runtime. And that is
[01:38:52] the modern application, an agent.
[01:38:55] Let's now take a look at what it can do.
[01:38:59] >> It started with a spark.
[01:39:02] An idea
[01:39:04] to reimagine [music] the PC for the
[01:39:05] first time in 40 years
[01:39:08] with the age of AI.
[01:39:11] What becomes of [music] our personal
[01:39:12] computer in a world of agents?
[01:39:15] Agents running [music] natively,
[01:39:17] connected to models, local or in the
[01:39:20] cloud. Our personal AI,
[01:39:23] sandboxed [music] for security, running
[01:39:25] continuously,
[01:39:26] getting work done.
[01:39:28] The chips [music] and the OS must
[01:39:30] evolve.
[01:39:32] Introducing RTX Spark. Everything we've
[01:39:36] learned [music]
[01:39:37] over 33 years distilled into one chip.
[01:39:42] Blackwell RTX GPU with 6,144
[01:39:46] CUDA cores, one petaflop of AI
[01:39:49] performance.
[01:39:51] A custom 20-core Grace CPU built in
[01:39:54] partnership with MediaTek.
[01:39:56] Fused by NVLink. [music]
[01:39:59] 128 GB of unified memory.
[01:40:03] TSMC 3-nanometer process.
[01:40:06] 70 billion [music] transistors.
[01:40:10] And in close collaboration with
[01:40:11] Microsoft,
[01:40:13] a Windows platform for agents.
[01:40:16] We're reinventing the [music] personal
[01:40:18] computer.
[01:40:20] For creating.
[01:40:23] For gaming.
[01:40:26] For agents.
[01:40:28] This is the dawn of [music] a new
[01:40:29] personal computing revolution.
[01:40:32] And it starts with NVIDIA RTX Spark.
[01:40:39] >> [music]
[01:40:42] [applause]
[01:40:50] >> Here it is.
[01:40:53] Of course,
[01:40:54] I got to show you the most beautiful
[01:40:55] part, which is video games.
[01:40:58] It's also the closest to our
[01:41:00] heart.
[01:41:01] This is Forza.
[01:41:03] This is 007, by the way.
[01:41:05] The new 007 game, I'm looking forward to
[01:41:07] playing it.
[01:41:09] I look a little bit like him.
[01:41:11] Ladies and gentlemen,
[01:41:13] NVIDIA's
[01:41:14] RTX Spark laptops. Now,
[01:41:18] >> [applause]
[01:41:22] >> thank you.
[01:41:28] I have too many things in my pocket.
[01:41:33] Okay.
[01:41:34] All right. This is the most amazing chip
[01:41:36] the world has ever built.
[01:41:38] This
[01:41:40] is the N1X that we built in partnership
[01:41:43] with MediaTek. I think I saw I saw Rick
[01:41:45] earlier. This is N1X. This is a
[01:41:47] beautiful chip. This is a
[01:41:51] chip that frankly would take 33 years
[01:41:54] to build. And the reason for that is
[01:41:57] because 100% of NVIDIA's software stack
[01:42:00] runs here.
[01:42:02] If you want to run
[01:42:03] digital biology, no problem. If you
[01:42:05] want to do seismic processing, no
[01:42:07] problem. You want astrophysics, no
[01:42:09] problem. Everything associated with
[01:42:11] CUDA, all the physics, all the biology,
[01:42:13] all the genomics, all the AI, no
[01:42:16] problem. All the computer graphics, no
[01:42:17] problem. Every single application NVIDIA
[01:42:21] has ever created and every single
[01:42:24] application that Windows has ever run.
[01:42:28] Microsoft and NVIDIA meticulously
[01:42:31] optimized everything so that this
[01:42:33] computer literally runs everything the
[01:42:37] world has ever created plus
[01:42:40] it now runs agents.
[01:42:42] An incredible computer. I'm so
[01:42:44] proud of it.
[01:42:48] >> [applause]
[01:42:51] >> Okay.
[01:42:53] Now,
[01:42:54] I want you to keep that in mind in the
[01:42:56] next video I'm going to show you.
[01:42:58] Just imagine everything here is going to
[01:43:00] run on your PC. Now, that computer could
[01:43:03] have a local Nemotron 3 Ultra model
[01:43:08] or Nemotron 3 Super model, or it could
[01:43:10] have Claude Code or Codex or some
[01:43:15] other model in the cloud or something on
[01:43:17] the network, and it's going to
[01:43:19] work and do something amazing. Let's
[01:43:21] play it.
[01:43:24] Every house [music] starts as an idea.
[01:43:27] Getting from idea to design
[01:43:29] takes a myriad of tools,
[01:43:31] expertise, and a lot of time.
[01:43:35] Now, an [music] agent running locally on
[01:43:38] RTX Spark can help me design a house
[01:43:40] using the tools on my laptop with an
[01:43:43] Open Shell sandbox
[01:43:45] running the Hermes [music] harness
[01:43:46] connected to Claude Sonnet in the cloud.
[01:43:49] I select the site, share my concept
[01:43:51] sketches and mood board of styles to
[01:43:54] inspire my design
[01:43:55] and the prompt, a text description of
[01:43:58] the requirements
[01:44:00] and the design intent.
[01:44:04] My agent goes [music] to work using the
[01:44:06] tools on my laptop. It opens Rhino and
[01:44:09] starts modeling the site,
[01:44:11] shaping terrain, setbacks, [music] and
[01:44:13] the building envelope.
[01:44:15] Then it proposes building forms
[01:44:17] optimized for cost, comfort, and
[01:44:19] quality.
[01:44:22] With the form defined, my agent
[01:44:23] generates the interior layout. Walls,
[01:44:26] circulation, rooms begin [music] to take
[01:44:28] shape. I jump in whenever I want to
[01:44:31] adjust, to change.
[01:44:36] Doors, windows, [music]
[01:44:38] and structural elements are placed
[01:44:39] automatically. My agent detects its own
[01:44:42] mistakes
[01:44:44] and fixes them.
[01:44:48] When I approve, the agent exports the
[01:44:50] model [music] from Rhino into Blender.
[01:44:52] Materials and object properties transfer
[01:44:55] with the design context intact.
[01:44:58] I fine-tune the materials, [music]
[01:45:00] get the look just right. Then, I pick
[01:45:02] the shots. Blender renders the house. My
[01:45:05] agent, using generative AI with the Flux
[01:45:08] [music] 2 model, makes them photo real.
[01:45:10] Multiple viewpoints, lighting
[01:45:12] conditions. What was once a complex
[01:45:14] workflow [music]
[01:45:15] is now guided and simplified by my agent
[01:45:19] working with me on [music] RTX Spark.
[01:45:22] Design at the speed of imagination.
[01:45:30] >> [applause]
[01:45:31] >> You're seeing the world of agents.
[01:45:34] The developers are so excited about it.
[01:45:36] This is an incredible computer. All of
[01:45:38] the acceleration, all the software
[01:45:40] capabilities associated with it, working
[01:45:43] with every developer to make it
[01:45:44] incredible for all of you.
[01:45:47] The next one,
[01:45:49] Adobe.
[01:45:50] Incredible tool suite, of course, used
[01:45:52] by tens of millions of people around the
[01:45:54] world. They have re-engineered
[01:45:57] the architecture, the core of Adobe
[01:45:58] Photoshop and Premiere,
[01:46:00] and they're going to release it for RTX
[01:46:02] Spark. It is twice as fast. It's already
[01:46:04] fast. Now, it's going to be twice as
[01:46:06] fast. And it's also designed to be
[01:46:09] agent-friendly. With its MCP server, it
[01:46:12] can now interact with agents on your
[01:46:14] laptop.
[01:46:16] The number of customers, the number of
[01:46:18] partners that
[01:46:19] are so excited to bring RTX Spark to
[01:46:22] the market is just incredible.
[01:46:25] You know, this is the first
[01:46:28] across-the-lineup PC reinvention in 40 years.
[01:46:33] And I'm just so happy that all of you
[01:46:35] and
[01:46:36] the ecosystem around the world have
[01:46:37] joined us.
[01:46:39] This is basically everybody. Everybody
[01:46:41] will support RTX Spark and will be
[01:46:44] building incredibly smart and powerful
[01:46:47] and beautiful laptops with all of us.
[01:46:49] Thank you very much.
[01:46:51] >> [applause]
[01:46:56] >> But that's not all. That's not all.
[01:47:00] RTX Spark is a reinvention
[01:47:03] of the laptop.
[01:47:04] But in fact, Microsoft and Nvidia are
[01:47:07] reinventing all of the PC. And today, we're
[01:47:09] announcing a whole new line.
[01:47:13] Three revolutionary Windows machines
[01:47:16] covering desktop,
[01:47:18] laptop, and workstations. All 100%
[01:47:22] Windows compatible, 100% CUDA, 100%
[01:47:26] NVIDIA AI Tensor Core.
[01:47:28] Everything that you see that
[01:47:31] runs on NVIDIA in all these different
[01:47:32] platforms around the world runs here.
[01:47:36] This is the first completely
[01:47:39] re-engineered, reinvented line of PCs
[01:47:43] that has happened in 40 years.
[01:47:45] Now, what's really amazing is this. So,
[01:47:46] this is the RTX Spark laptop.
[01:47:50] This
[01:47:52] is the desktop.
[01:47:53] So, this one's from MSI.
[01:47:56] Joseph, this one's yours.
[01:47:58] Okay. Look how beautiful it is. This
[01:48:01] agent could run 24/7.
[01:48:04] Meter free.
[01:48:05] And you could download your agent. You
[01:48:08] could raise your lobster in here.
[01:48:12] This is your claw.
[01:48:14] It's running all the time.
[01:48:16] No meter anxiety. And it's sitting here
[01:48:19] connected to your whole house.
[01:48:21] Connected to your laptop.
[01:48:23] Connected to your display. All the
[01:48:25] cameras. Your dryer, your water
[01:48:29] cooler, your water heater, your
[01:48:31] everything. Whatever you want. Your
[01:48:33] security system. All connected to this.
[01:48:35] And this becomes your personal AI. Your
[01:48:39] personal AI agent.
[01:48:41] And it gets smarter and smarter and
[01:48:42] smarter over time because today we have
[01:48:44] Nemotron 3 Ultra.
[01:48:46] Tomorrow we have Nemotron 4. And then
[01:48:48] Nemotron 5. Nemotron 6. And we just
[01:48:51] keep getting it smarter and smarter and
[01:48:52] smarter.
[01:48:53] And meanwhile this is sitting at home
[01:48:55] helping you do things. If you want to
[01:48:56] book travel, no problem.
[01:48:59] And
[01:49:00] if you
[01:49:02] if you want
[01:49:04] an incredible system,
[01:49:06] this is a DGX Station
[01:49:08] for Windows.
[01:49:10] Compatible with Windows. Runs everything
[01:49:12] in Windows. And
[01:49:15] it has
[01:49:16] 768
[01:49:18] GB of memory.
[01:49:20] And so you could run a trillion
[01:49:22] parameter model.
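
The claim that 768 GB can hold a trillion-parameter model can be checked with simple arithmetic on the weights alone. The sketch below assumes decimal gigabytes and counts only weight storage (no KV cache, activations, or runtime overhead). The talk does not say which numeric precision is intended, so the 16-, 8-, and 4-bit cases are assumptions shown for comparison.

```python
GB = 10**9  # decimal gigabytes (assumption; the talk does not distinguish GB from GiB)


def weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Memory for the weights alone, ignoring KV cache, activations, and overhead."""
    return num_params * bits_per_param / 8 / GB


params = 10**12     # "a trillion parameter model"
capacity_gb = 768   # memory figure quoted in the talk

for bits in (16, 8, 4):
    need = weight_memory_gb(params, bits)
    print(f"{bits}-bit weights: {need:.0f} GB, fits in {capacity_gb} GB: {need <= capacity_gb}")
```

Under these assumptions, only the 4-bit case (about 500 GB of weights) fits within 768 GB, so the statement is most plausible for a low-precision, quantized model.
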
[01:49:23] This is unbelievable. 20 petaflops,
[01:49:27] 8 TB per second of memory bandwidth,
[01:49:30] and this sits by your desk.
[01:49:34] Basically, if you're a developer of
[01:49:36] large language models, or a developer
[01:49:38] of agents, having this sit by your desk
[01:49:42] gives you all the compute you need, and
[01:49:44] then when you deploy it, you put it into
[01:49:45] the cloud. Now, there's something that
[01:49:48] if you look at this and think about
[01:49:50] this, something is happening here.
[01:49:54] Remember,
[01:49:57] 15, 20 years ago, we used to have an idea
[01:49:59] called a phone.
[01:50:02] Today, we have an idea called a PC.
[01:50:06] Today, when you think about your phone,
[01:50:09] the one thing you don't do with it is
[01:50:11] make phone calls.
[01:50:16] You do just about everything else.
[01:50:18] And so that phone means something very
[01:50:20] different to you than a phone of the
[01:50:22] past.
[01:50:25] I am certain what's going to happen here
[01:50:28] is that the PC 10 years from now and the
[01:50:30] PC that you think about today, a tool
[01:50:33] where you launch applications,
[01:50:37] click and type,
[01:50:39] and this PC is going to be completely
[01:50:41] different.
[01:50:43] Here's my theory.
[01:50:45] I can totally imagine,
[01:50:47] just as every house today has a home
[01:50:49] theater,
[01:50:51] or many houses have home theaters, big
[01:50:52] TVs,
[01:50:55] lawnmowers,
[01:50:57] dishwashers,
[01:50:59] I could totally imagine that someday
[01:51:01] there's actually an AI supercomputer in
[01:51:03] your house.
[01:51:04] And it's running all of your agents,
[01:51:06] it's running all of your assistants, and
[01:51:09] they're doing all kinds of things for
[01:51:10] you all the time.
[01:51:13] And you have to have it in your house
[01:51:15] just like you have a home theater in
[01:51:16] your house, you have stereos in your
[01:51:17] house, you have game consoles in your
[01:51:19] house.
[01:51:20] You want assistant AI agent computers
[01:51:23] running in your house.
[01:51:25] And these
[01:51:26] in time become a lot more like
[01:51:30] R2-D2 to you.
[01:51:32] It becomes more like C-3PO to you
[01:51:35] than it feels like a PC to you.
[01:51:39] There is no question this reinvention of
[01:51:41] the computer
[01:51:43] is as big of a deal as the reinvention
[01:51:45] of the phone into what we now know as
[01:51:47] the smartphone.
[01:51:48] And so this is the beginning of that
[01:51:50] journey. This is the beginning of a new
[01:51:53] line.
[01:51:54] And so we have a roadmap for this. This
[01:51:56] is a brand new product family for us.
[01:51:59] Every single generation of architecture
[01:52:02] we will have a desktop, a laptop, a
[01:52:05] workstation, and then a desktop, a
[01:52:07] laptop, and a workstation. And the thing
[01:52:10] that I'm just incredibly pleased about,
[01:52:12] incredibly honored,
[01:52:14] is that 100%
[01:52:15] of the world's PC industry has joined us
[01:52:18] to reinvent the PC. A new line,
[01:52:22] a new beginning. Thank you.
[01:52:25] >> [applause]
[01:52:37] >> As you know,
[01:52:40] agentic AI
[01:52:42] is just a digital robot.
[01:52:45] It understands,
[01:52:47] it reasons, it plans, and it acts and
[01:52:50] uses tools.
[01:52:52] Agentic AI is going to run across all of
[01:52:55] these computers.
[01:52:56] And you've seen me talk about each and
[01:52:58] every one of these over time.
[01:53:00] We're working on humanoid robotics
[01:53:02] computers, robotics computers of all
[01:53:04] kinds. We're working on self-driving car
[01:53:06] computers. We're working on satellites.
[01:53:10] You have GeForce, which has tensor
[01:53:12] cores. I just talked about a whole new
[01:53:13] line of PCs.
[01:53:15] Agriculture equipment, manufacturing
[01:53:17] equipment, heavy industry equipment will
[01:53:19] all be agentic.
[01:53:21] You'll even have a little agentic
[01:53:24] helper for yourself.
[01:53:26] Even your base stations, the radio
[01:53:28] stations of the future are going to be
[01:53:30] agentic.
[01:53:32] Understanding traffic and thinking about
[01:53:35] how to coordinate with the other base
[01:53:37] stations so that you could use as little
[01:53:40] energy as possible, increase the
[01:53:42] utilization, the efficiency of the
[01:53:44] spectral efficiency.
[01:53:46] And so, everything will run agents.
[01:53:49] Today, NVIDIA is largely in the center.
[01:53:52] But, I am pretty certain
[01:53:54] that there will be
[01:53:56] tens of billions, hundreds of billions
[01:53:59] over time of agentic systems, agentic
[01:54:02] computers that are going to be running
[01:54:03] around the world. The biggest problem is
[01:54:06] data.
[01:54:08] In the case of language models, all the
[01:54:10] English and all the language that we
[01:54:12] have on the internet that we trained on
[01:54:14] was from the perspective of us. We wrote
[01:54:16] it, and we're reading it.
[01:54:18] However, in order to create data for
[01:54:22] AI, robotics, it has to be in the
[01:54:24] perception, the perspective of the
[01:54:27] robot.
[01:54:28] And most of the world's video data is
[01:54:31] from a third person, not first person.
[01:54:34] And so, agentic systems, robotic
[01:54:36] systems,
[01:54:38] physical AI, the data is the hardest
[01:54:41] problem.
[01:54:42] You've seen us move up this ladder. We
[01:54:44] started with teleoperations, which is
[01:54:47] basically human demonstration. This is
[01:54:49] no different than the big breakthrough
[01:54:51] of reinforcement learning human
[01:54:53] feedback.
[01:54:54] Then, we use simulation. This is
[01:54:56] where Omniverse comes in. This is no
[01:54:58] different than reinforcement learning
[01:55:01] verifiable rewards. Okay? And so,
[01:55:04] we use these systems to bootstrap
[01:55:09] the AI model, the physical AI model.
[01:55:12] Eventually, we're able to learn
[01:55:14] from third person,
[01:55:16] reprojecting it into first person,
[01:55:19] and now, eventually, through
[01:55:21] bootstrapping, we have a world
[01:55:24] foundation model that can understand the
[01:55:26] physical world from any perspective you
[01:55:28] want. Third person, first person,
[01:55:32] outside in, inside out, doesn't matter.
[01:55:34] This is a big breakthrough, indeed. And
[01:55:37] today,
[01:55:39] we are announcing
[01:55:41] Cosmos 3.
[01:55:43] Cosmos 3
[01:55:45] is the frontier of physical AI.
[01:55:49] We are at the frontier with language
[01:55:51] models. There's so many people working
[01:55:53] on it. However, in physical AI, we are
[01:55:57] absolutely the world's best. I am so
[01:55:59] proud of the team for doing this. This
[01:56:01] is the foundation model for all of your
[01:56:03] work.
[01:56:04] Whenever you want to create a robot,
[01:56:06] whenever you want to create a factory
[01:56:08] robot or a robot that works in a
[01:56:09] factory,
[01:56:10] any kind of robot
[01:56:13] that
[01:56:14] involves the physical world, you now have a
[01:56:17] companion, Cosmos 3, that can
[01:56:20] understand and reason.
[01:56:21] It can generate.
[01:56:23] It can simulate in the loop. It can even
[01:56:26] be the policy itself. It is on the top
[01:56:28] of leaderboards all over
[01:56:30] the world. I am incredibly proud of
[01:56:32] Cosmos, and today we're announcing
[01:56:34] Cosmos 3. Let's take a look.
[01:56:38] >> The real world is infinite and
[01:56:40] unpredictable.
[01:56:41] Physical AI needs data,
[01:56:43] >> [music]
[01:56:44] >> but real-world data is impossible to
[01:56:46] scale.
[01:56:47] For physical AI, compute is data.
[01:56:51] This is Cosmos, an open frontier [music]
[01:56:54] omnimodel for physical AI, built on a
[01:56:56] new mixture-of-transformers
[01:56:58] architecture. Pixels, action, sound, and
[01:57:01] language [music] flow into the
[01:57:03] autoregressive transformer, which reasons,
[01:57:05] plans, and instructs the diffusion
[01:57:07] transformer,
[01:57:08] which generates what comes next.
[01:57:11] Developers post-train Cosmos [music]
[01:57:13] across embodiments and use cases.
[01:57:16] As a VLM, Cosmos watches the physical
[01:57:18] [music] world,
[01:57:19] understands what's happening,
[01:57:22] describing scenes, and flagging what
[01:57:24] [music] matters.
[01:57:26] As a world model, Cosmos generates
[01:57:28] physics-accurate synthetic video from an
[01:57:31] image, text, [music] or video.
[01:57:34] As a simulator, Cosmos closes the loop
[01:57:37] for policy training [music] and
[01:57:38] evaluation. And as the foundation of
[01:57:41] NVIDIA Omniverse, an action-conditioned
[01:57:44] world model, Cosmos predicts the future
[01:57:47] frame by frame. [music]
[01:57:49] Post-train Cosmos, and it becomes a
[01:57:51] world action model.
[01:57:54] Perceiving, reasoning, planning,
[01:57:57] generating actions
[01:57:59] for robots [music]
[01:58:00] of every kind,
[01:58:02] for everything that moves.
[01:58:06] A new kind of data, [music]
[01:58:08] a new kind of teacher generated by
[01:58:11] computer.
[01:58:14] Cosmos, the foundation for [music]
[01:58:16] developers of the age of physical AI.
[01:58:28] >> [applause]
[01:58:31] >> This takes data plus compute, gives you
[01:58:35] AI.
[01:58:36] Now that we have AI,
[01:58:39] compute is data.
[01:58:41] And so, use Cosmos 3, train a whole
[01:58:43] bunch of AI models. Cosmos is such an
[01:58:45] incredible open model system. It's
[01:58:47] exactly the same as Nemotron. We open the
[01:58:49] model,
[01:58:50] we open the data, and we even open how
[01:58:53] we trained it so that you could enhance
[01:58:55] it for yourself and turn Cosmos into
[01:58:57] your proprietary model. We have such
[01:59:00] incredible partners working with us in
[01:59:01] so many different industries.
[01:59:03] Now, the model itself is, of
[01:59:06] course, the most understandable part of
[01:59:09] the AI stack, but the AI stack is very
[01:59:11] complicated. It has
[01:59:13] generators,
[01:59:15] the model,
[01:59:16] simulators, and the runtime.
[01:59:20] Just as it is for agentic systems, these
[01:59:22] cars, or essentially a physical AI, an
[01:59:26] agentic robot that is an autonomous
[01:59:29] vehicle, also have this complicated stack.
[01:59:33] Today, we're announcing Alpamayo 2,
[01:59:36] an open model for self-driving cars.
[01:59:40] We're working with car companies across
[01:59:42] the world. If you look at these brands
[01:59:45] that have signed up for the NVIDIA
[01:59:46] Hyperion that are building NVIDIA
[01:59:48] Hyperion cars,
[01:59:50] this represents about 80%
[01:59:54] of the world's cars. The manufacturers
[01:59:57] represent 80% of the world's cars. We
[01:59:59] are going to have a whole lot of NVIDIA
[02:00:02] Hyperion systems that are able to run
[02:00:05] Alpamayo or anybody else's AV stack.
[02:00:08] We are also connected into mobility
[02:00:10] services. Approximately 97% of the
[02:00:13] world's mobility services are connecting
[02:00:15] with us so that when we deploy
[02:00:18] Alpamayo on the Hyperion runtime with the
[02:00:22] Halos operating system, we will be able
[02:00:25] to connect to all of these services
[02:00:26] across the world. Let's take a look at
[02:00:27] this.
[02:00:31] >> Hey Mercedes, let's go to my favorite
[02:00:33] sandwich shop.
[02:00:36] >> Routing to your destination.
[02:00:39] Lane is clear. Pulling out to start
[02:00:41] drive.
[02:00:42] Nudge left due to the stationary lead
[02:00:44] vehicle ahead blocking [music] our lane.
[02:00:46] Slow down to stop at the stop sign
[02:00:48] controlling the intersection.
[02:00:50] Stop to yield to the pedestrian [music]
[02:00:51] since the person is in our lane.
[02:00:54] Yield to the cut-in vehicle from the
[02:00:55] left. Nudge left to clear the stopped
[02:00:57] vehicle blocking on the right. Keep
[02:00:59] distance [music] to the cut-in vehicle
[02:01:00] since it is merging into our lane. Nudge
[02:01:02] left due to the stopped van blocking
[02:01:04] [music] the right side of our lane.
[02:01:06] Stop to keep distance to the lead
[02:01:07] vehicle in the same area directly ahead
[02:01:09] in our lane. [music] Keep distance to
[02:01:10] the vehicle directly ahead in our lane.
[02:01:11] Stop for the stop sign since the
[02:01:12] intersection is controlled by a stop
[02:01:14] sign. Yield to the cross traffic since
[02:01:14] the
[02:01:15] >> [music]
[02:01:24] >> Your destination is on the right.
[02:01:28] >> [music]
[02:01:33] >> Alpamayo,
[02:01:34] >> [applause]
[02:01:36] >> the world's first reasoning autonomous
[02:01:39] vehicle.
[02:01:41] If you let it talk all the time, it will
[02:01:44] drive you crazy.
[02:01:46] But,
[02:01:48] we're very happy that it's talking to
[02:01:50] itself all the time.
[02:01:51] That's called thinking.
[02:01:53] And so, Alpamayo is a reasoning car.
[02:01:55] The technology that we've created also
[02:01:58] applies to humanoids. Of course, there
[02:02:00] are many new breakthroughs that have to
[02:02:02] happen. The NVIDIA Isaac Groot is our
[02:02:05] humanoid robotics stack.
[02:02:07] The model,
[02:02:09] data generation,
[02:02:12] simulation,
[02:02:14] the runtime, including the
[02:02:17] operating system. This represents the
[02:02:20] Groot
[02:02:22] platform, the Isaac Groot platform.
[02:02:24] Every one of our systems, as you can
[02:02:26] see, has the exact same pattern, where
[02:02:28] there's an agentic system for the cloud,
[02:02:31] an agentic system for the PC,
[02:02:33] a robotic system for a self-driving car,
[02:02:36] a robotic system for a humanoid robot, all
[02:02:38] the same. And of course, in every single
[02:02:41] case,
[02:02:43] we build everything completely.
[02:02:46] We build everything vertically,
[02:02:49] completely,
[02:02:51] integrated with co-design, extreme
[02:02:53] co-design, and then we open it up for
[02:02:56] everybody to use whichever part you
[02:02:57] like.
[02:02:58] And whatever you want to use, we even
[02:03:01] help you modify.
[02:03:03] But the one thing that is missing is we
[02:03:05] need a reference platform for robotic
[02:03:08] systems. These robotic systems are so
[02:03:10] complicated, so many motors, so many
[02:03:13] sensors, so fragile, and yet we need to
[02:03:16] have a way
[02:03:18] to deliver these reference platforms
[02:03:20] just like we do with PCs and DGXs and
[02:03:23] clouds and self-driving cars. We now are
[02:03:26] going to do it for robots. Today we're
[02:03:27] announcing the NVIDIA Isaac Groot, a
[02:03:30] reference humanoid robot, all fully
[02:03:33] integrated. 25 degrees of freedom on
[02:03:36] each hand, made by Sharp.
[02:03:39] 31 degrees of freedom on the robot, 6
[02:03:42] ft, 150 lb.
[02:03:45] Just like me.
[02:03:48] >> [laughter]
[02:03:50] >> The first number is shorter, the second
[02:03:52] number is bigger.
[02:03:54] Otherwise, pretty close.
[02:03:56] And this platform runs the new Thor
[02:03:59] and our entire software stack.
[02:04:02] Data generation stack, data simulation
[02:04:04] stack, the runtime, all integrated into
[02:04:08] a robot that is designed for everyone to
[02:04:11] use. Now, we built this for higher
[02:04:13] education and university researchers
[02:04:16] because for them to build this is
[02:04:18] insanely hard to do. And so, let's take
[02:04:20] a look at that.
[02:04:22] >> The next leap in AI is general-purpose
[02:04:25] robots, humanoids. But building one is
[02:04:27] hard. Every team starts from scratch,
[02:04:29] [music]
[02:04:30] stitching together simulators, teleop
[02:04:32] systems, data pipelines, and training
[02:04:35] infrastructure. Months of setup before
[02:04:38] research [music] can start. NVIDIA Isaac
[02:04:40] Groot,
[02:04:41] an open development platform for
[02:04:43] humanoid [music] robots.
[02:04:45] Open models, simulation and training
[02:04:47] libraries, and data generators.
[02:04:50] Plus, the robot computer. [music]
[02:04:53] Fully pipe-cleaned, ready to go in hours.
[02:04:56] First, set up the simulation environment
[02:04:58] in Isaac Lab.
[02:05:03] Capture demonstrations with Isaac
[02:05:05] Teleop on a real or simulated robot.
[02:05:08] [music]
[02:05:10] Generate synthetic data with Omniverse
[02:05:12] [music] and Cosmos.
[02:05:14] Scaling one demonstration into
[02:05:16] thousands.
[02:05:18] Train policies. [music]
[02:05:20] Evaluate them in Isaac Lab Arena.
[02:05:25] Deploy through Isaac ROS,
[02:05:27] running on Jetson Thor.
[02:05:39] Every element modular, open.
[02:05:42] Use ours or swap [music] in your own.
[02:05:47] Groot is powering robotics research
[02:05:49] across every discipline, for every
[02:05:51] domain,
[02:05:52] from research labs to factory floors.
[02:05:56] One open platform.
[02:06:04] >> [music]
[02:06:04] >> And now, a new addition.
[02:06:06] Isaac Groot reference design robots,
[02:06:09] built on NVIDIA's open platform,
[02:06:12] ready for frontier research [music] for
[02:06:14] any lab, anywhere.
[02:06:16] The age of robotics starts here.
[02:06:19] NVIDIA [music] Isaac Groot.
[02:06:24] >> So many robots.
[02:06:26] >> [applause]
[02:06:32] >> We're working with just about everybody
[02:06:33] who's working on robots in the world or
[02:06:35] robotic systems in the world.
[02:06:37] Let me tell you what I told you.
[02:06:39] The computer industry has been
[02:06:41] completely changed.
[02:06:43] In the last 6 months, everything
[02:06:45] changed.
[02:06:47] Everything changed because agents were
[02:06:49] realized and it converged with the
[02:06:51] latest frontier models and it made
[02:06:54] possible
[02:06:55] the AI to now do useful work.
[02:06:58] The computing pattern will repeat over
[02:07:01] and over and over again. This computing
[02:07:03] pattern of an agent that's a model, a
[02:07:06] harness that uses tools with skills and
[02:07:10] runs in a runtime. That runtime depends
[02:07:13] on whether it's in the cloud or on prem,
[02:07:14] on a PC or in a robot. But the computing
[02:07:17] pattern is exactly the same for all of
[02:07:19] them.
[02:07:21] You will use different harnesses because
[02:07:22] of your preference. You will use
[02:07:24] different models because of your
[02:07:25] preference. You will improve them for
[02:07:27] your proprietary use. You would create
[02:07:30] sub super agents that you can rent to
[02:07:32] other people to help them do their work.
[02:07:35] For this agentic platform, this agentic
[02:07:37] pattern,
[02:07:38] NVIDIA has an enterprise AI toolkit.
[02:07:41] This is a wonderful way for all of you
[02:07:44] to engage your AIs and for us, it's a
[02:07:46] wonderful growth opportunity.
[02:07:49] Vera Rubin is in full production.
[02:07:52] Whereas Grace Blackwell was created to
[02:07:55] process AI, particularly inference,
[02:07:58] Vera Rubin was created to run agents. It
[02:08:02] is in full production. It is much, much
[02:08:04] more than a GPU. It is an entire
[02:08:07] disaggregated, distributed agent
[02:08:10] processing system.
[02:08:12] Nvidia has really become an
[02:08:13] infrastructure company.
[02:08:15] Not just a GPU company, not just a
[02:08:17] systems company, but an infrastructure
[02:08:19] company to help you generate the maximum
[02:08:22] revenues, the maximum profit and to get
[02:08:24] there as soon as possible.
[02:08:27] The agent world,
[02:08:29] this new way of doing computing where
[02:08:31] you build CPUs now for agents, not for
[02:08:34] people,
[02:08:35] CPUs for agents has its own special
[02:08:38] requirement, and our NVIDIA Vera is
[02:08:41] revolutionary. I'm so happy about its
[02:08:43] ramp.
[02:08:44] The orders already are going to make it
[02:08:47] the fastest and the most successful
[02:08:48] product launch in our company's history.
[02:08:51] NVIDIA and Microsoft have created a whole
[02:08:54] new line of PCs. This is a new
[02:08:56] beginning, and of course, that exact
[02:08:58] same agentic pattern, that agentic
[02:09:01] processing pattern, computing pattern
[02:09:02] that I just described is also going to
[02:09:06] run on all kinds of devices. I mentioned
[02:09:09] PCs,
[02:09:10] but in the future, it'll be robots and
[02:09:12] satellites and base stations and
[02:09:14] factories in the cloud, on prem, at the
[02:09:17] edge.
[02:09:18] This pattern agentic AI system,
[02:09:21] this agentic computing pattern will be
[02:09:24] replicated in computers all over. How we
[02:09:26] think about the personal computer will
[02:09:28] very likely change.
[02:09:30] I want to thank all of you
[02:09:32] for your partnership, your friendship.
[02:09:35] We couldn't be here without everything
[02:09:36] that we do together. I am so proud of
[02:09:39] how you've been so successful this last
[02:09:41] year.
[02:09:42] The next year
[02:09:43] is going to be even more. I have one
[02:09:46] more thing for you. Let's take a look.
[02:09:59] >> [screaming]
[02:10:03] [music]
[02:10:08] >> You're ready, Taiwan.
[02:10:12] Let's do this.
[02:10:14] >> The keynote's done at [music] Computex.
[02:10:16] Jensen showed the world what's next.
[02:10:19] Useful AI has arrived. Agents working by
[02:10:23] your side, but in case you missed
[02:10:24] [music] things we said today, we're
[02:10:26] going to break it all down for you,
[02:10:28] Taipei.
[02:10:29] >> Agents used to be misunderstood. Only
[02:10:31] movie stars [music] had them in
[02:10:33] Hollywood. Now we all got teams making
[02:10:35] dreams come true, building companies
[02:10:37] from living rooms, but they need so much
[02:10:40] compute. [music] We hear you. That's why
[02:10:42] we created Vera.
[02:10:44] >> Rubin calls the shots, it's true.
[02:10:45] [music] The cheapest tokens coming
[02:10:48] through.
[02:10:48] >> 10 times faster than current heaven,
[02:10:51] more special [music] agents than
[02:10:52] Double-O Seven.
[02:10:53] >> BlueField keeps agents' memory true.
[02:10:56] >> Now, let's talk about it, CPU.
[02:10:58] >> 50% [music] faster, that's outrageous.
[02:11:00] >> Not for Vera.
[02:11:01] >> It's built for agents.
[02:11:03] >> NVLink Fusion blends ASICs [music]
[02:11:05] smartly. Everyone's welcome to the
[02:11:07] NVLink party.
[02:11:08] >> Well, if you liked that introduction,
[02:11:11] >> Vera Rubin's in full production. Nemotron
[02:11:14] Ultra leap [music] the run. Five x
[02:11:16] faster work gets done. NemoClaw
[02:11:19] keeps the guardrails right. OpenShell
[02:11:21] keeps [music] the sandbox tight.
[02:11:23] >> Your code migrated and reviewed
[02:11:26] >> all before this song [music] is through.
[02:11:29] >> AI
[02:11:29] is a five-layer cake
[02:11:31] of revenue. Make no mistake.
[02:11:33] >> Global [music] AI cloud with lots of
[02:11:35] gigawatts. DSX keeps power lean
[02:11:37] connecting dots.
[02:11:38] >> Every watt optimized for you.
[02:11:40] >> So you can have your cake
[02:11:42] >> and eat it, too.
[02:11:44] >> RTS marks the year.
[02:11:46] >> Biggest [music] PC moment in 40 years.
[02:11:48] >> Agents powering all workflows.
[02:11:51] Running anywhere Windows go.
[02:11:53] >> Harnesses [music] run on CPU.
[02:11:55] >> Models fly on GPU.
[02:11:58] >> Cosmos builds worlds that robots need.
[02:12:00] >> Turning computing into [music] synthetic
[02:12:02] feed.
[02:12:03] >> Alpamayo sees and reasons
[02:12:05] through.
[02:12:05] >> Understand roads like people do.
[02:12:08] >> Groot is how they learn [music] to move.
[02:12:10] >> Learning skills and finding growth.
[02:12:13] >> Joule trees powered by thought.
[02:12:15] >> Computers human [music] life.
[02:12:18] Come on.
[02:12:29] >> [music]
[02:12:30] [singing]
[02:12:39] [music]
[02:12:49] >> The future's bright. [music]
[02:12:51] Come see [singing] what's next.
[02:12:55] >> Thank you, Taiwan.
[02:12:57] >> [sighs]
[02:12:58] >> Welcome to Computex.
[02:13:07] >> [music]
[02:13:11] [applause]
[02:13:12] >> Have a great Computex. Thank you for
[02:13:14] your support.
[02:13:15] >> Thanks for an amazing year.
[02:13:16] Thank you for all your friendship and
[02:13:18] support. Thank you. Take care.
[02:13:21] Have a great Computex.
[02:13:23] >> [applause]
[02:13:30] [music]
[02:13:37] >> Woke up feeling something shift. [music]
[02:13:40] Same room, but the air felt thick.
[02:13:42] Mirror said I'm still that kid. I said
[02:13:45] you're bigger than this.