# 【直播｜AI即時中字翻譯】輝達 NVIDIA GTC Taipei 2026 黃仁勳主題演講

https://www.youtube.com/watch?v=tUE2RV9hqWI
Translation: zh-TW

[01:16] I kind of want to stay.
  我有點想留下來。

[01:19] I can't wait to see it play out in real life.
  我迫不及待想在現實生活中看到它發生。

[01:23] Us in the front row laughing through the late nights.
  我們在前排，在深夜裡笑著。

[01:26] I can't wait to feel it.
  我迫不及待想感受它。

[01:29] First, the first kiss.
  首先，初吻。

[01:33] Whatever this becomes, I can't wait to see this.
  無論這會變成什麼樣，我都迫不及待想看到。

[01:43] You say, "Yeah, you always say it.
  你說，「是啊，你總是這麼說。

[01:50] I then I be what I am."
  我然後我就是我所是的。」

[01:53] But when you smile, even through a screen blow, whole face, I'm ready just to let go.
  但當你微笑時，即使透過螢幕的吹拂，整張臉，我都準備好放手了。

[01:56] I can't wait to see this play out in real life.
  我迫不及待想在現實生活中看到這個。

[02:00] In the front row, laughing through the late night.
  在前排，在深夜裡笑著。

[02:04] I can't wait to feel it.
  我迫不及待想感受它。

[02:07] First the first kiss, whatever this becomes.
  首先是初吻，無論這會變成什麼。

[02:09] I can't wait to see this.
  我迫不及待想看到這個。

[02:20] What if it's better than all my day dreams?
  如果它比我所有的白日夢都還要美好呢？

[02:23] What if you're braver than I believe?
  如果你比我所相信的還要勇敢呢？

[02:27] Say meet me outside and I won't resist.
  說在外面見，我不會反抗。

[02:31] I've waited all this time.
  我已經等了這麼久。

[02:38] Can't wait for this.
  等不及了。

[02:41] I can't wait to see this play out in real life.
  我等不及想在現實生活中看到這一切上演。

[02:44] I s in the front row through the late night.
  我在前排，熬夜看著。

[03:00] This is how intelligence is made.
  這就是智慧的誕生方式。

[03:04] A new kind of factory, generator of tokens, the building blocks of AI.
  一種新型工廠，代幣的產生器，人工智慧的基石。

[03:16] Tokens have opened a new frontier, turning data into knowledge, reason,
  代幣開闢了一個新的領域，將數據轉化為知識、理性，

[03:23] Action.
  行動。

[03:28] They reveal patterns in complexity we could never see.
  它們揭示了我們從未見過的複雜性模式。

[03:39] Mirror our cities to keep us safe and lift us high above them.
  映射我們的城市以確保我們的安全，並將我們高高舉起在它們之上。

[03:57] Tokens help robots learn from us, work alongside us.
  代幣幫助機器人向我們學習，與我們並肩工作。

[04:14] They go where we cannot.
  它們去我們去不了的地方。

[04:22] Lending us helping hands.
  伸出援手。

[04:27] and closing the gap between hope and healing so that we breathe easier.
  並縮小希望與治癒之間的差距，讓我們能更輕鬆地呼吸。

[04:37] And the smallest hearts beat stronger.
  而最小的心跳得更有力。

[04:56] Tokens are helping us break new ground
  代幣正幫助我們開闢新天地

[05:03] on a scale never attempted
  以一種前所未有的規模

[05:17] so we can reach StarCloud one.
  以便我們能達到 StarCloud One。

[05:20] Separation confirmed to infinity
  分離已確認至無限

[05:25] and beyond.
  及更遠處。

[05:32] Together we take the next great leap.
  我們一起邁向偉大的下一步。

[05:35] Into a bright new future.
  進入一個光明的新未來。

[05:41] Built for all mankind.
  為全人類而建。

[05:50] And here in Taipei.
  而在台北這裡。

[05:53] Is where it all begins.
  就是一切的開始。

[06:06] Welcome to the stage, Nvidia founder and CEO Jensen Huang.
  歡迎來到舞台，輝達創辦人兼執行長黃仁勳。

[06:21] Welcome to GTC Taiwan.
  歡迎來到 GTC 台灣。

[06:27] So great to see all of you.
  很高興見到大家。

[06:30] Very good to be home.
  很高興回到家。

[06:30] I brought my
  我帶了我的

[06:32] parents home.
  父母回家。

[06:32] Where are my parents?
  我的父母在哪裡？

[06:34] Everybody give a round of applause to my
  大家為我的

[06:36] mom and dad.
  爸爸媽媽鼓掌。

[06:44] And a round of applause
  並且為

[06:47] for our pregame show superstars.
  我們的賽前表演超級巨星鼓掌。

[06:51] Ladies
  各位女士

[06:51] and gentlemen,
  和先生們，

[06:56] look how adorable they are.
  看看他們多麼可愛。

[07:00] The superstars of Taiwan.
  台灣的超級巨星。

[07:03] Uh there are so many of you here today.
  呃，今天有這麼多人來到這裡。

[07:05] We are broadcasting this right now to 70 other
  我們現在正在向另外 70 個

[07:09] watch parties across Taiwan.
  台灣各地的觀看派對直播。

[07:13] 70 different conferences are going at
  70 個不同的會議正在

[07:16] the same time.
  同時進行。

[07:18] Everybody is watching this keynote.
  大家都在觀看這次主題演講。

[07:20] We have so much to tell you and I have so many partners to
  我們有這麼多東西要告訴你們，而且我有這麼多夥伴要

[07:22] thank.
  感謝。

[07:25] It is incredible how large our ecosystem in Taiwan has become.
  我們在台灣的生態系統發展得如此龐大，真是令人難以置信。

[07:29] Most of the time when people think about
  大多數時候，當人們想到

[07:30] ecosystem, they think about our software
  生態系統，他們會想到我們的軟體

[07:33] stack.
  堆疊。

[07:35] They think about the developer ecosystem above the computing systems that Nvidia builds.
  他們考慮的是輝達所建置的運算系統之上的開發者生態系統。

[07:40] But Nvidia's ecosystem spans all the way upstream to all of our supply chain here in Taiwan where it all begins and downstream all the way to data centers and eventually to end users.
  但輝達的生態系統向上延伸至我們在台灣的所有供應鏈，一切從這裡開始，向下延伸至資料中心，最終到達終端使用者。

[07:57] Today we're going to talk about almost all of the ecosystem.
  今天我們將討論幾乎所有的生態系統。

[07:59] There's so many people to thank.
  有這麼多人要感謝。

[08:02] I love my ecosystem here.
  我愛我這裡的生態系統。

[08:05] I mean, there are so many companies here and some of my favorite ecosystem partners.
  我的意思是，這裡有這麼多公司，還有我一些最喜歡的生態系統合作夥伴。

[08:43] So many Taiwan's rich ecosystem, the richest ecosystem, the world's best supply chain ecosystem.
  如此眾多台灣豐富的生態系統，最豐富的生態系統，世界上最好的供應鏈生態系統。

[08:52] Unbelievable.
  難以置信。

[08:54] Well, thank you all for being here and uh this year this year our businesses together are growing incredibly.
  嗯，感謝大家來到這裡，今年，今年我們的企業一起正在取得驚人的成長。

[09:02] In fact, somebody told me last night that the annual GDP of Taiwan is going to grow almost 10%.
  事實上，昨天晚上有人告訴我，台灣的年度 GDP 將增長近 10%。

[09:20] Well, we have a lot to talk about.
  嗯，我們有很多事情要談。

[09:22] Let's get going.
  我們開始吧。

[09:25] Two years ago when I was here, I started to talk to you about how AI has moved from generative AI and the other waves of AIs that are coming.
  兩年前我來這裡時，我開始和大家談論 AI 如何從生成式 AI 和即將到來的其他 AI 浪潮中發展出來。

[09:32] The next wave of AI was agentic AI and today we can say that agentic AI has arrived that useful AI has arrived.
  下一波 AI 是代理式 AI，今天我們可以說代理式 AI 已經到來，有用的 AI 已經到來。

[09:43] Now what does this mean?
  那麼這意味著什麼？

[09:43] This is GitHub.
  這是 GitHub。

[09:43] This is of course one of
  這當然是其中之一

[09:46] the first applications of agentic AI is software coding.
  代理人工智能的第一個應用是軟體編碼。

[09:51] One of the most valuable professions, incredibly large ecosystem, 30 million, 40 million professional software developers, probably another couple of hundred who are students and enthusiasts and so on so forth, but say 30 40 million software developers in the world code for a living.
  最有價值的職業之一，擁有龐大的生態系統，三千萬、四千萬專業軟體開發人員，可能還有幾百名學生、愛好者等等，但可以說，全球有三千萬到四千萬軟體開發人員以寫程式為生。

[10:11] And this represents most of them.
  而這代表了他們中的大多數。

[10:15] This is GitHub.
  這是 GitHub。

[10:17] The pull request is when they download software, they modify it, and commit is when they push it back up.
  Pull request 是指他們下載軟體並修改它，而 commit 是指他們將其推回。

[10:22] Okay?
  好的？

[10:25] And so if you could look at this in 2023, the number of commits was 300 million.
  所以，如果你看看 2023 年，提交的數量是 3 億次。

[10:34] 2024, 400 million.
  2024 年，4 億次。

[10:36] 2025, 500 million commits in the first few months.
  2025 年，頭幾個月就有 5 億次提交。

[10:40] In the first few months of 2026,
  在 2026 年的頭幾個月，

[10:50] it has nearly tripled.
  它幾乎增加了兩倍。

[10:52] Now, what does that mean?
  那麼，這意味著什麼？

[10:54] 30 million software developers representing about $3 trillion worth of GDP producing three, that's what they're paid.
  三千萬軟體開發人員，代表約三兆美元的國內生產毛額，生產了三倍，這就是他們的薪資。

[11:07] 3 trillion worth of salaries per year, which is generating economic growth for the rest of the industries.
  每年三兆美元的薪資，為其他產業帶來經濟成長。

[11:15] Say a hundred trillion dollars of the world's industries is impacted is generated by $3 billion worth of salary.
  假設世界產業的一百兆美元受到影響，是由三十億美元的薪資所產生。

[11:22] That $3 trillion, excuse me, three trillion that $3 trillion worth of salary is now producing nearly three times as much output.
  那三兆美元，抱歉，三兆，那三兆美元的薪資現在產生的產出幾乎是三倍。

[11:34] It's effectively a $9 trillion productivity from $3 trillion of salaries.
  這相當於三兆美元薪資帶來了九兆美元的生產力。
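
The arithmetic behind this claim can be sketched as follows. This is a minimal illustration using only the round numbers quoted in the talk (about 30 million developers, about $3 trillion in annual salaries, output roughly tripled). The function and variable names are hypothetical, and treating commit volume as a proxy for economic output is the speaker's framing, not a measured result.

```python
def effective_productivity(salaries_usd: float, output_multiplier: float) -> float:
    """Value of the output if the same payroll yields `output_multiplier` times the work."""
    return salaries_usd * output_multiplier


# Round figures quoted in the keynote (assumptions, not audited data).
developers = 30_000_000
salaries_usd = 3_000_000_000_000   # about $3 trillion per year
output_multiplier = 3.0            # "nearly tripled"

salary_per_developer = salaries_usd / developers
productivity_usd = effective_productivity(salaries_usd, output_multiplier)

print(f"average salary per developer: ${salary_per_developer:,.0f}")
print(f"effective productivity: ${productivity_usd / 1e12:.0f} trillion")
```

With these inputs the same payroll is credited with three times the output, which is where the $9 trillion figure comes from.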

[11:42] Does that make any sense?
  這說得通嗎？

[11:45] The difference is absolutely extraordinary.
  差異是絕對非凡的。

[11:47] This is the potential.
  這是潛力。

[11:49] This is the promise of AI.
  這是人工智慧的承諾。

[11:49] The number of engineers,
  工程師的數量，

[11:53] software engineers is actually increasing.
  軟體工程師實際上正在增加。

[11:55] People talk about AI reducing jobs.
  人們談論人工智慧會減少工作機會。

[11:59] Complete nonsense.
  完全是胡說八道。

[12:01] It's causing more software engineers to be hired.
  它正在導致更多軟體工程師被聘用。

[12:03] And the reason for that is very simple.
  而原因非常簡單。

[12:04] If you can hire a software engineer and you could generate $9 trillion worth of productive work, why wouldn't you want to hire more software engineers?
  如果你能聘請一位軟體工程師，並且能產生價值 9 兆美元的生產力工作，為什麼你不願意聘請更多軟體工程師呢？

[12:15] If that line was flat, then obviously people will hire fewer software engineers.
  如果那條線是平的，那麼顯然人們會聘請較少的軟體工程師。

[12:23] But because the output is so incredible, people want to hire more software engineers.
  但因為產出如此驚人，人們想聘請更多軟體工程師。

[12:26] This is going to show up in our economy somehow soon.
  這將在不久的將來以某種方式體現在我們的經濟中。

[12:30] And so the first thing is useful AI has arrived.
  所以第一件事是有用的 AI 已經到來。

[12:32] Now, what does that mean from the industry's perspective?
  那麼，從產業的角度來看，這意味著什麼？

[12:36] From the industry's perspective, that means that tokens are now in extraordinary demand.
  從產業的角度來看，這意味著 token 現在有非凡的需求。

[12:42] Because if you could do this, you're going to want to produce more of it.
  因為如果你能做到這一點，你就會想生產更多。

[12:45] And because tokens are now profitable units, tokens are now profitable units of
  而且因為 token 現在是獲利的單位，token 現在是獲利的單位

[12:52] Revenues because it is now profitable.
  營收，因為它現在有利可圖。

[12:56] The AI companies want to build a lot more tokens, generate a lot more tokens, build more AI factories, which is the reason why compute demand here in Taiwan has skyrocketed.
  AI 公司想要建立更多代幣，產生更多代幣，建立更多 AI 工廠，這就是為什麼台灣的運算需求飆升的原因。

[13:08] It is precisely the reason why all of you are so busy and your businesses are doing so well.
  這正是你們如此忙碌且生意興隆的原因。

[13:13] In fact, that looks like some of your stock price.
  事實上，這看起來像是你們的股價。

[13:26] The compute pattern has changed.
  運算模式已經改變。

[13:28] Everything has changed.
  一切都改變了。

[13:31] So the first idea is that useful AI has arrived.
  所以第一個想法是有用的 AI 已經到來。

[13:35] AI is now a profit generator.
  AI 現在是利潤的產生者。

[13:38] AI is now a GDP generator.
  AI 現在是 GDP 的產生者。

[13:42] Behind it is a whole new kind of computing pattern.
  其背後是一種全新的運算模式。

[13:44] Not just a large language model, but an agent.
  不僅僅是大語言模型，而是一個代理。

[13:47] Today almost everything we're going to talk about is going to be based on this.
  今天我們將要談論的幾乎所有內容都將基於此。

[13:51] So let me take a quick moment and show.
  所以請允許我花點時間展示一下。

[13:53] you what I'm talking about inside in.
  你明白我在說什麼，在裡面。

[13:56] this is a this is an agent.
  這是一個，這是一個代理。

[13:59] It's an agent application.
  這是一個代理應用程式。

[14:01] In the old days this would be application.
  在過去，這會是應用程式。

[14:05] This would be code and this would be operating system.
  這會是程式碼，這會是作業系統。

[14:11] application code running inside an application inside an operating system.
  應用程式碼運行在作業系統內的應用程式中。

[14:16] Today it is an agent which consists of a large language model or many sitting inside a harness and that harness helps orchestrate it to do productive work.
  今天它是代理，它由一個大型語言模型或許多模型組成，這些模型位於一個框架內，而這個框架有助於它協調工作以完成生產性任務。

[14:30] This is the input. When that input comes, it has to understand, observe, reason, act, use tools.
  這是輸入。當輸入進來時，它必須理解、觀察、推理、行動、使用工具。

[14:37] Use tools.
  使用工具。

[14:41] That tool could be a spreadsheet, web browser, a data processing engine, database engine.
  該工具可以是試算表、網頁瀏覽器、資料處理引擎、資料庫引擎。

[14:50] For example, this is orchestrated. This harness orchestrates this routing of information.
  例如，這是協調的。這個框架協調資訊的路由。

[14:57] every single time it touches either processing the context, understanding what is happening, reasoning about what to do, coming up with a plan that you can act that it acts on.
  每一次它觸及處理上下文、理解正在發生的事情、推理該做什麼、制定一個你可以採取行動並付諸實踐的計劃時。

[15:09] That orchestration path is orchestrated by some software.
  那個協調路徑由某些軟件協調。

[15:15] And so this is fundamentally an agent.
  所以這本質上是一個代理。

[15:19] It deals with short-term memory called working memory, long-term memory just like we do.
  它處理稱為工作記憶的短期記憶，以及像我們一樣的長期記憶。

[15:25] We have long-term memory.
  我們有長期記憶。

[15:25] And so the memory management system is incredibly important.
  因此，記憶體管理系統非常重要。

[15:29] This entire system is called an agent.
  這個整個系統被稱為代理。

[15:32] The large language model is used to do the thinking and the harness connects everything together just like an operating system.
  大型語言模型用於思考，而這個連接器將所有東西連接在一起，就像操作系統一樣。
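
The structure described above (a model that thinks, a harness that orchestrates, tools, and working memory) can be sketched as a small loop. This is a toy illustration under stated assumptions, not how any particular agent framework is implemented: `Harness`, `toy_model`, and `calculator` are hypothetical names, and the "model" is a hard-coded stand-in for a real large language model call.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Harness:
    """Orchestrates a model, its tools, and its working memory."""
    model: Callable[[list[str]], str]            # stand-in for an LLM call
    tools: dict[str, Callable[[str], str]]       # tool name -> tool function
    working_memory: list[str] = field(default_factory=list)

    def run(self, prompt: str, max_steps: int = 5) -> str:
        self.working_memory.append(f"input: {prompt}")
        for _ in range(max_steps):
            # The model reasons over everything observed so far.
            decision = self.model(self.working_memory)
            if decision.startswith("call "):
                # Act: route the request to a tool and record the result.
                _, name, arg = decision.split(" ", 2)
                result = self.tools[name](arg)
                self.working_memory.append(f"{name} -> {result}")
            else:
                return decision
        return "stopped: step limit reached"


def toy_model(memory: list[str]) -> str:
    """Hypothetical model: asks for the calculator once, then answers."""
    if not any(entry.startswith("calculator") for entry in memory):
        return "call calculator 3*3"
    return "answer: " + memory[-1].split(" -> ")[1]


def calculator(expression: str) -> str:
    """Hypothetical tool: multiplies two integers written as 'a*b'."""
    left, right = expression.split("*")
    return str(int(left) * int(right))


agent = Harness(model=toy_model, tools={"calculator": calculator})
print(agent.run("What is 3 times 3?"))
```

The harness owns the loop and the memory, while the model only decides the next step, which mirrors the division of labor described in the talk.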

[15:43] Okay.
  好的。

[15:43] And so this is the new computing model and this is what an agent it could do incredible things.
  所以這是新的計算模型，這就是代理可以做到的令人難以置信的事情。

[15:50] This is the big breakthrough.
  這是重大的突破。

[15:53] The simultaneous convergence of large language models that are now
  大型語言模型的同時匯聚，現在

[15:58] able to do a really good job thinking, reasoning, planning using tools and the fact that we have now these harnesses that manages memory.
  能夠很好地思考、推理、規劃並使用工具，而且我們現在有了管理記憶體的這些機制。

[16:09] The orchestration uses tools.
  協調機制使用工具。

[16:12] We can now do amazing things.
  我們現在可以做驚人的事情。

[16:12] Let me give you some example.
  讓我給你們舉個例子。

[16:14] This is this is a prompt.
  這是一個提示。

[16:16] This is the prompt.
  這是提示。

[16:18] This is the code that is generated.
  這是生成的程式碼。

[16:21] And this comes out.
  然後這是輸出的結果。

[16:24] This is the input.
  這是輸入。

[16:29] And that's the output.
  那就是輸出。

[16:32] Do you guys What do you guys think?
  你們覺得怎麼樣？

[16:33] It's pretty amazing, right?
  這真的很棒，對吧？

[16:39] We use Claude Code here, but Codex does an incredible job as well.
  我們在這裡使用 Claude Code，但 Codex 也做得非常出色。

[16:43] Here's another example.
  這是另一個例子。

[16:43] This is the input.
  這是輸入。

[16:45] Create a GIF. NVIDIA gen green dots on black scatter form Taiwan 101 building. Morph to GTC Taipei Ae 2026. Morph to Nvidia I logo then
  創建一個 GIF。NVIDIA 生成黑色背景上的綠色散點圖，形成台灣101大樓。變形到 GTC Taipei Ae 2026。變形到 Nvidia I logo，然後

[17:00] Scatter and repeat.
  分散並重複。

[17:02] Right?
  對嗎？

[17:02] So you saw that.
  所以你看到了。

[17:04] That was the prompt.
  那就是提示。

[17:04] Here's the next one.
  這是下一個。

[17:06] I lost my remote control battery clip.
  我弄丟了遙控器電池夾。

[17:06] It looks like this.
  它看起來像這樣。

[17:10] Create a CAD file.
  創建一個 CAD 文件。

[17:13] It uses a tool to create a CAD file ready for 3D printing to create a new one.
  它使用一個工具創建一個準備好進行 3D 打印的 CAD 文件，以創建一個全新的。

[17:15] Make sense?
  明白嗎？

[17:20] This is now the new computing pattern.
  這現在是新的計算模式。

[17:23] Whereas we used to launch an application, click and type, we now replace that with explaining to the AI what we want, our intent and the AI generates the code or uses tools and produce the necessary output.
  以前我們需要啟動一個應用程序，點擊並輸入，現在我們用向 AI 解釋我們想要什麼，我們的意圖，然後 AI 生成代碼或使用工具並產生必要的輸出來代替。

[17:42] This is how computers are going to work in the future.
  這就是計算機未來的工作方式。

[17:45] This is agentic AI.
  這是代理 AI。

[17:50] For two years we've been building towards this and now it has arrived.
  兩年來我們一直在為此努力，現在它已經到來了。

[17:53] Now one of the big breakthroughs of course is tool use.
  當然，一個重大的突破是工具的使用。

[17:59] A lot of people have said you know Jensen AI is coming.
  很多人說你知道，Jensen AI 正在到來。

[17:59] Agentic AI is
  代理 AI 是

[18:01] coming. Therefore all of the software companies are going to go out of business.
  來臨。因此，所有的軟體公司都將倒閉。

[18:06] I said it's exactly the opposite because there are going to be so many agents.
  我說恰恰相反，因為將會有這麼多代理人。

[18:11] The world is no longer limited by the number of people.
  世界不再受人數的限制。

[18:16] Therefore, those agents are going to use more tools than ever.
  因此，這些代理人將比以往使用更多的工具。

[18:20] This is actually an incredible time to be a software company.
  這實際上是成為一家軟體公司的絕佳時機。

[18:25] But the software has to be presented to the agent in a way that the agent can use it.
  但是軟體必須以代理人能夠使用的方式呈現給代理人。

[18:31] This is a big breakthrough.
  這是一個重大的突破。

[18:33] And in fact, what we have done as you know what Nvidia's treasure is is all of our CUDA libraries.
  事實上，正如你所知，輝達的寶藏就是我們所有的 CUDA 函式庫。

[18:39] I call them CUDA X libraries.
  我稱它們為 CUDA X 函式庫。

[18:42] This is Nvidia's treasure.
  這是輝達的寶藏。

[18:45] Today, we're able to now present these CUDA X libraries to agents who can use it much more effectively than even humans.
  今天，我們能夠向代理人展示這些 CUDA X 函式庫，他們使用起來甚至比人類更有效。

[18:55] And so this is a wonderful time for CUDA X libraries.
  所以這對 CUDA X 函式庫來說是一個絕佳的時機。

[18:58] Let's take a look.
  讓我們來看看。

[19:03] 20 years ago, we built CUDA, a single architecture for accelerated computing.
  二十年前，我們建構了 CUDA，一個用於加速運算的單一架構。

[19:08] We reinvented computing.
  我們重新發明了運算。

[19:12] A thousand CUDA X libraries help developers make breakthroughs in every field of science and engineering.
  一千個 CUDA X 函式庫協助開發人員在科學和工程的每個領域取得突破。

[19:19] CUDA X libraries are tools for agents.
  CUDA X 函式庫是代理工具。

[19:22] cuLitho for computational lithography.
  cuLitho 用於計算微影。

[19:26] cuOpt for decision optimization.
  cuOpt 用於決策優化。

[19:31] cuDSS for direct sparse solvers.
  cuDSS 用於直接稀疏求解器。

[19:35] AI-Q for deep research across structured and unstructured documents.
  AI-Q 用於結構化和非結構化文件的深度研究。

[19:41] Aerial for AI-RAN,
  Aerial 用於 AI-RAN，

[19:45] Warp for differentiable physics,
  Warp 用於可微分物理學，

[19:49] Parabricks for genomics.
  Parabricks 用於基因組學。

[19:52] At their foundation are algorithms and they are beautiful.
  它們的基礎是演算法，而且它們很美。

[22:50] A round of applause for math. Math is

[22:53] beautiful.

[23:00] The computing pattern, the computing

[23:02] pattern of software is going to change.

[23:04] In fact, let's come back to this. This

[23:07] is the agent. It is the ultimate

[23:12] disaggregated

[23:13] and distributed computing

[23:16] model.

[23:17] So many different computers are going to

[23:19] be activated in order to process this

[23:22] agent. The agent consists of model

[23:26] harness

[23:28] tools and skills

[23:31] and a runtime.

[23:35] All of that is running at different

[23:37] places in a data center.

[23:40] You can think of the model as the brain,

[23:44] the harness as the body,

[23:47] the tools that it uses

[23:49] working

[23:51] in a runtime. Think of it as a workshop.

[23:54] So this is a person, a worker working

[23:57] with tools in a workshop. Of course,

[24:00] this is being done at extraordinarily

[24:02] large scales and each one of those steps

[24:06] are running in a different part of the

[24:07] computer. And you could see the large

[24:10] language model is thinking,

[24:13] context processing, observing,

[24:16] understanding the environment,

[24:18] reasoning, coming up with a plan, and

[24:21] acting on the plan. Every single time

[24:23] that happens, an entire rack of Grace

[24:27] Blackwell NVLink 72 is activated. It's

[24:30] thinking with the large language model.

[24:33] Whenever it uses a tool, a CPU is

[24:37] used. That tool could be a C compiler,

[24:40] it could be Python, it could be

[24:42] JavaScript or it could be accelerated

[24:45] computing. Today's agents are

[24:49] relatively simple users of tools.

[24:52] Tomorrow they're going to be very

[24:53] sophisticated users of tools, which is

[24:55] the reason why the CUDA X libraries that

[24:58] I showed you are going to be incredibly

[25:00] popular with agents. They solve some of

[25:02] the most important problems the world

[25:04] knows. And all of our CUDA X libraries

[25:07] are now going to come with skills

[25:10] that the AI could learn how to use. So

[25:14] the CUDA X library some skills basically

[25:17] a manual the AI reads it and go aha

[25:21] that's how you use it.

[25:24] The ability to use these libraries by

[25:26] agents are going to be incredible. And

[25:28] so the tools run on CPUs and GPUs and

[25:32] large language models. The security

[25:35] harness runs on CPUs and a security

[25:39] processor called a DPU, Nvidia's

[25:42] BlueField. The orchestration of all this

[25:44] runs on a CPU. This is the entire

[25:47] harness and the CPU is orchestrating all

[25:49] of the work. One of the hardest parts is

[25:53] memory. You could just imagine the

[25:55] working memory is called KV caching.

[25:58] What to remember? Compaction, not just

[26:01] compression. But how to retrieve? Do you

[26:04] retrieve structured data? Do you

[26:06] retrieve unstructured data? What is the

[26:09] ontology? The relationship of all of

[26:11] these different data to itself.

[26:14] That entire processing is incredibly

[26:16] complicated. The memory system, the

[26:19] memory system of AIs is going to cause

[26:22] the storage system to be completely

[26:25] revolutionized. As you can see, every

[26:28] aspect of this computing

[26:31] model, this computing pattern, this new

[26:34] application called an agent is

[26:36] fundamentally different than the way

[26:38] that applications used to run. A whole

[26:41] bunch of software sitting inside a

[26:43] binary sitting inside an operating

[26:45] system. This is the reason this

[26:48] disaggregated

[26:50] this distributed this heterogeneous

[26:53] computing problem is precisely the

[26:55] reason we built our next generation.

[27:00] Vera Rubin Vera Rubin is not one chip.

[27:04] Vera Rubin is not a GPU only. It starts

[27:08] with the GPU but Vera Rubin is

[27:11] incredible.

[27:14] This entire thing is Vera Rubin

[27:17] from end to end. It has GPUs Vera Rubin

[27:22] NVLink 72.

[27:24] It is orchestrated by Vera CPUs that I'm

[27:27] going to tell you more about the storage

[27:29] systems revolutionary Vera along with

[27:33] CX9, our software stack called DOCA, the

[27:37] security processor that's inside so that

[27:39] everything is encrypted at rest

[27:44] in motion as well as in use.

[27:48] Everything across this is secure because

[27:51] the AI model is so precious. This is the

[27:54] reason why this entire system obeys

[27:57] confidential computing.

[27:59] Each one of these systems would be a

[28:01] complete revolution in itself. Vera

[28:04] Rubin is the most ambitious endeavor in

[28:07] the history of our company.

[28:10] The whole company worked on Vera Rubin

[28:12] across all 40,000 engineers. Not to

[28:16] mention all of you. All of you

[28:18] participated in the creation of this

[28:21] entire system. Vera Rubin is really a

[28:24] miracle and it's not just one chip. It

[28:26] is so many.

[28:28] Well, it's even beyond that. A long time

[28:31] ago, Nvidia used to be a GPU company.

[28:34] But over the years, we've evolved

[28:38] to become a systems company. You're

[28:40] looking here now for the most complex

[28:43] system, most complex and ground-up system

[28:46] ever designed.

[28:48] But ultimately our customers, our

[28:51] partners don't want to buy computers.

[28:54] They want to build AI factories. Which

[28:57] is the reason why Nvidia has really

[28:59] started to transform ourselves yet again.

[29:02] You could see so much of our technology

[29:05] is now at the entire infrastructure

[29:08] scale. Our partners are at

[29:10] infrastructure scale. Power generators,

[29:13] cooling systems, the grid providers.

[29:17] So many industrial companies are now

[29:20] part of our ecosystem because ultimately

[29:23] we're trying to build an entire stack

[29:25] just like GPUs just like when we were

[29:28] building Grace Blackwell NVLink 72 just

[29:31] like now we are building a full stack

[29:34] system so that our customers could build

[29:38] amazing AI infrastructure. Let's take a

[29:41] look.

[29:43] >> The world is racing to build AI

[29:45] factories. The largest infrastructure

[29:48] buildout in human history. AI factories

[29:51] are incredibly complex. Every layer,

[29:53] chip, rack, network, power, cooling,

[29:57] grid must be designed together from end

[30:00] to end because compute is revenues.

[30:05] NVIDIA DSX is the blueprint, a reference

[30:08] design for building and operating AI

[30:10] factories at maximum efficiency and

[30:13] profitability.

[30:15] It starts with DSX SIM. With the DSX SIM

[30:18] Omniverse blueprint, partners design and

[30:20] validate an NVIDIA Vera Rubin AI factory

[30:23] before a single rack lands.

[30:26] They plan the layout,

[30:31] simulate the power and cooling,

[30:35] design the network, validate every

[30:37] integration, test every change in the

[30:39] digital twin.

[30:42] The factory powers on. DSXOSS takes over

[30:46] and provisions, operates, monitors, and

[30:48] remediates the infrastructure,

[30:51] turning the installed systems into

[30:53] trusted, multi-tenant, resilient, AI

[30:56] ready capacity.

[31:00] Today's AI factories overprovision power

[31:02] by up to 40%. DSX Max LPS lets

[31:06] operators safely deploy more GPUs inside

[31:09] the same power budget, adding billions

[31:12] in annual revenue.

[31:16] Breakthrough hot liquid cooling at 45° C

[31:20] uses less water and energy. More power

[31:23] going to revenue generating compute.

[31:26] Incredible.

[31:28] Dynamic power allocation steers power

[31:30] from rack to rack, recovering stranded

[31:32] watts, sending them where work is

[31:34] happening.
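
The idea of steering stranded watts from idle racks to busy ones can be sketched as a simple reallocation. This is a toy sketch under stated assumptions, not NVIDIA's DSX algorithm: the function name `reallocate_power`, the rack names, and the greedy largest-shortfall-first policy are all hypothetical.

```python
def reallocate_power(budgets: dict[str, int], demands: dict[str, int]) -> dict[str, int]:
    """Move unused watts from under-demand racks to racks that want more.

    The total handed out never exceeds the original total budget.
    """
    # Racks that need less than their budget release the surplus.
    allocation = {rack: min(budgets[rack], demands[rack]) for rack in budgets}
    pool = sum(budgets.values()) - sum(allocation.values())

    # Grant the pooled watts to racks with the largest shortfall first.
    by_shortfall = sorted(budgets, key=lambda rack: demands[rack] - budgets[rack], reverse=True)
    for rack in by_shortfall:
        grant = min(demands[rack] - allocation[rack], pool)
        if grant > 0:
            allocation[rack] += grant
            pool -= grant
    return allocation


# Hypothetical racks, in watts: rack "a" is mostly idle, "b" and "c" are busy.
budgets = {"a": 100, "b": 100, "c": 100}
demands = {"a": 60, "b": 130, "c": 120}
print(reallocate_power(budgets, demands))
```

A real system would also have to respect per-rack electrical and cooling limits, which this sketch ignores.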

[31:36] In raw power smoothing flattens peak

[31:39] current spikes and power surges

[31:44] throughout the factory. Teams of AI

[31:46] agents work with DSX Max LPS,

[31:49] continuously coordinating to balance

[31:51] cooling and power to meet workload

[31:53] demand.

[31:55] DSX AI factories are flexible energy

[31:57] assets that operate cooperatively with

[31:59] the grid. DSX Flex reads real-time grid

[32:03] signals and dynamically adjusts factory

[32:06] power when the grid needs relief.

[32:12] 100 gigawatts of AI factories will come

[32:14] online before the end of the decade.

[32:16] NVIDIA DSX AI factories run at highest

[32:19] efficiency, produce the lowest cost

[32:21] tokens, and make the grid stronger.

[32:33] I've shown you ecosystem slides of the

[32:35] past

[32:37] where Nvidia's computing layers and

[32:40] software and software and computing

[32:42] stacks are integrated into other

[32:44] people's platforms, third party

[32:46] platforms and libraries that serves end

[32:48] markets. That was a computing ecosystem.

[32:52] This is an AI factory ecosystem. This is

[32:56] way downstream of all of you. Upstream

[32:59] of me is all of you and downstream of us

[33:02] is this ecosystem. Because Nvidia

[33:05] ultimately is not just building a GPU,

[33:08] not just building a system. We're

[33:11] helping customers build these AI

[33:13] factories, these AI infrastructure that

[33:15] is so immensely complex. Each one of

[33:18] these at one gigawatt level started at

[33:22] 20, 30 billion dollars. It is at 50, 60

[33:27] billion and soon it will be 80, a hundred

[33:31] billion dollars per gigawatt

[33:34] $100 billion into an AI factory.

[33:38] It must work the first time and it must

[33:41] work right away. The cost of capital is

[33:44] incredible. The complexity is

[33:46] incredible. So as you see we used to

[33:49] design a chip inside a computer

[33:52] and then we simulated a system inside a

[33:55] computer. Today you saw just now

[34:00] everything was built in Omniverse.

[34:03] I've been working with Omniverse with

[34:04] all of you for a long time. This was the

[34:07] dream come true so that we can build

[34:10] these gigantic systems as large as the

[34:13] world wants to build inside a digital

[34:15] framework inside a digital simulator in

[34:18] a digital world long before we build the

[34:22] first break ground and put our money to

[34:25] work. So this is our ecosystem our we

[34:28] call it DSX. RTX is for our GPU, DGX for

[34:33] our systems, and now DSX basically

[34:36] infrastructure. Because of the work that

[34:38] we do here across this entire stack,

[34:41] including our systems and software, it's

[34:43] the reason why we could work with small

[34:45] companies and enable them to be

[34:47] world-class AI clouds. Every one of these

[34:51] I'm about to show you are small

[34:53] companies just recently. And now

[34:55] CoreWeave is worth 50, 60, 70 billion

[34:58] dollars and growing incredibly fast.

[35:01] Recently we worked with Nebius and again

[35:04] they're growing incredibly fast. Each

[35:06] one of these clouds have incredible

[35:09] customers. Cursor the software coding

[35:11] company, Black Mountain Labs, Image

[35:14] Generation, World Labs, World Foundation

[35:17] model, Revolut, the leading uh

[35:20] financial services AI company, and

[35:22] Shopify. Here's another one. This is

[35:25] Nscale and their customers are British

[35:28] Telecom, Google. Google is using one of

[35:32] our AI clouds. Thinking machines, a

[35:35] Frontier Labs company. Super exciting.

[35:38] Here's Naver Cloud in Korea. Bank of

[35:41] Korea, Hyundai.

[35:44] So many incredible companies. Here's one

[35:46] in India, Yotta.

[35:49] Incredible companies. Here's one uh

[35:52] based in Singapore building in a

[35:54] Australia together AI AI Singapore. This

[35:59] is one in Indonesia. Each one of these

[36:02] companies each one of these companies

[36:04] are serving regional as well as global

[36:07] customers. AI is going to run

[36:10] everywhere. Every company will be

[36:12] powered by it. Every region will build

[36:16] it. Indosat here in Indonesia here in

[36:20] Taiwan. GMI

[36:23] here in Taiwan. GMI. It's okay to clap.

[36:33] So incredible, incredible uh incredible

[36:36] companies, incredible opportunity, but

[36:38] all of them need several things. Of

[36:41] course, they need the computing stack.

[36:43] This entire stack underneath this is

[36:45] what made Nvidia famous.

[36:47] All of our hardware and software and

[36:49] libraries, our connection into the

[36:52] world's ecosystem of third party

[36:54] developers makes it possible for anyone

[36:57] to stand up an AI cloud. However, the AI

[37:01] cloud is so complex now. This is the

[37:04] software version. This is the computer

[37:06] science version,

[37:08] the money version,

[37:10] the asset version is what I showed you

[37:13] earlier. It's a giant factory. Having

[37:17] this ability alone is not enough which

[37:20] is the reason why Nvidia has become an

[37:22] AI infrastructure company. Now doing

[37:25] this well and becoming incredibly good

[37:29] at helping customers build AI

[37:32] factories and deploying AI factories is

[37:35] incredibly important. And the reason for

[37:36] that is this. Compute is revenue now.

[37:41] Compute is profit. the absence of

[37:44] revenues and profit is loss. And so it's

[37:48] really important to realize that this is

[37:52] when this is an example of

[37:55] an AI infrastructure coming online. It

[37:58] could take it could be coming online

[38:00] quickly. It could take a while. Its

[38:03] throughput could be high. It could be

[38:05] low. Its resilience and reliability

[38:08] could be good or bad. And its lifetime

[38:11] of usefulness could be long or short

[38:15] because this represents

[38:18] 50 60 going to a hundred billion

[38:21] dollars.

[38:23] This curve matters greatly which is the

[38:26] reason why Nvidia is such a great

[38:28] partner working with us because of our

[38:32] fully integrated capability. We didn't

[38:35] just come up with a PowerPoint slide. We

[38:38] created the entire infrastructure. We

[38:40] connected everything together. We built

[38:42] out billions and billions of it

[38:44] ourselves to make sure that everything

[38:47] works well. As a result of that, our

[38:50] time to first token,

[38:55] our time to first

[38:57] inference,

[38:58] our time to training turned on is much

[39:03] faster. Second, our

[39:08] throughput per watt, our tokens per watt

[39:12] is utterly world class. And the reason

[39:15] for that is because we integrate

[39:16] everything. We design everything from

[39:18] the ground up. We simulate the entire

[39:20] system and we use extreme co-design.

[39:22] Just like I showed you just now with the

[39:24] Vera Rubin rack, everything was designed

[39:27] in order to deliver on this incredible

[39:29] throughput.

[39:31] If your data center, if your factory has

[39:36] one gigawatt,

[39:38] it will not have more. One gigawatt means

[39:41] one gigawatt. That's all the power

[39:44] generation you could do. If you have one

[39:47] gigawatt of power, then throughput per

[39:50] watt is revenues because every token is

[39:55] profitable. Every token is revenues.

[39:59] This is the future. Compute is revenues.

[40:03] Performance per watt is your revenues.

[40:06] Choosing the wrong architecture

[40:08] just because the chips are cheaper

[40:11] doesn't make sense.

[40:15] You need to make sure that your revenues

[40:17] per watt the more you buy the more you

[40:21] make.
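
The argument above, that a fixed power budget turns throughput per watt directly into revenue, can be sketched as simple arithmetic. This is only an illustration: the one-gigawatt cap comes from the talk, while the efficiency and price figures, the `utilization` parameter, and the function name are hypothetical placeholders. "Tokens per watt" is read here as tokens per second per watt, which is tokens per joule.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def annual_token_revenue(power_watts, tokens_per_joule,
                         usd_per_million_tokens, utilization=1.0):
    """Yearly token revenue of a factory whose power budget is fixed."""
    # watts * (tokens / joule) = tokens per second
    tokens_per_second = power_watts * tokens_per_joule * utilization
    tokens_per_year = tokens_per_second * SECONDS_PER_YEAR
    return tokens_per_year / 1_000_000 * usd_per_million_tokens


ONE_GIGAWATT = 1e9  # the power cap used in the talk

# Hypothetical efficiency and price, chosen only to show the scaling.
baseline = annual_token_revenue(ONE_GIGAWATT, tokens_per_joule=1.0,
                                usd_per_million_tokens=1.0)
doubled = annual_token_revenue(ONE_GIGAWATT, tokens_per_joule=2.0,
                               usd_per_million_tokens=1.0)
print(f"revenue ratio at the same power: {doubled / baseline:.1f}x")
```

With power held constant, revenue in this model scales linearly with tokens per joule, which is the sense in which a cheaper but less efficient chip can cost more than it saves.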

[40:22] And so tokens per watt. And then,

[40:26] third is reliability.

[40:30] If you ever get a chance to see these

[40:32] data centers, there are so many moving

[40:34] parts, millions of cables.

[40:37] The ability for all of those computers

[40:40] to work harmoniously,

[40:42] reliably is extremely low. It is just

[40:46] extremely difficult. We have now been

[40:48] operating very large scale for a very

[40:51] long time. That experience matters.

[40:54] That difference in mean time between

[40:56] interrupts is extremely important. And then

[41:00] lastly,

[41:01] this is very hard.

[41:04] The lifetime of these systems, the

[41:06] lifetime of these systems, the software

[41:08] is changing all the time. Four years

[41:11] ago, which is in the time of Hopper, AI

[41:15] has completely changed.

[41:18] Six years ago, this is the time frame of

[41:20] Ampere. AI has completely changed.

[41:24] We started out talking about CNNs here.

[41:28] Then we talked about transformers

[41:30] and then we talked about mixture of

[41:31] experts. Now we're talking about agentic

[41:34] systems.

[41:36] Every single generation, every single

[41:39] few months, the software industry is

[41:42] coming up with new technology. If your

[41:44] architecture

[41:46] is not flexible, if your ecosystem is

[41:49] not rich, then this curve cannot be

[41:53] long. You cannot predict how long your

[41:57] system can last. I can. Nvidia systems are

[42:02] all over the world. Software developers

[42:04] start with Nvidia CUDA and by definition

[42:07] therefore the life of the ecosystem,

[42:11] the useful life of the asset, is going to be much

[42:14] longer. The difference is essentially

[42:16] cost. You could think of it as revenues

[42:19] but the other side of revenues is cost.

[42:22] If the life of the asset is long, the

[42:24] TCO is low.
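
The link between asset lifetime and TCO can be made concrete with a small sketch. It assumes straight-line amortization, which is a simplification chosen for illustration and not something stated in the talk. The 50 billion dollar figure echoes the range mentioned earlier; the lifetimes and the `annualized_cost` helper are hypothetical.

```python
def annualized_cost(capex_usd, useful_life_years, annual_opex_usd=0.0):
    """Cost per year of useful life under straight-line amortization."""
    if useful_life_years <= 0:
        raise ValueError("useful_life_years must be positive")
    return capex_usd / useful_life_years + annual_opex_usd


CAPEX_USD = 50e9  # low end of the range quoted in the talk

# Hypothetical lifetimes: the same factory, useful for 3, 4, or 6 years.
for years in (3, 4, 6):
    cost = annualized_cost(CAPEX_USD, years)
    print(f"{years} years of useful life: ${cost / 1e9:.2f}B per year")
```

Under this model, doubling the useful life halves the capital cost carried by each year of token production, before operating costs are added.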

[42:27] This is the difference. This is what it

[42:30] looks like when compute

[42:36] The more you buy, the more you make

[42:46] now. All of you are experiencing this

[42:49] with me. Isn't that right?

[42:52] All of your demand, your factories are

[42:55] working so hard, your people are working

[42:58] so hard all across Taiwan because

[43:01] everybody wants to make money. They

[43:04] realize that AI, useful AI is here.

[43:09] Profitable AI is here. Compute demand is

[43:14] incredibly high and compute demand is

[43:17] the constraint. And so let's go work

[43:20] super super hard and help the world

[43:22] stand up AI factories everywhere. This

[43:25] is why it's so important. I'm so happy

[43:28] here I am standing in front of you. Vera

[43:31] Rubin is in full production.

[43:39] Vera Rubin is in full production.

[43:43] The supply chain we

[43:46] created for Vera Rubin is twice as large

[43:50] as Grace Blackwell.

[43:52] Yeah, it's incredible. And what

[43:57] used to take two hours to assemble one

[44:00] Grace Blackwell rack now only takes five

[44:03] minutes. So, not only is the capacity

[44:06] higher, the throughput is a lot faster

[44:09] and we need it all to support the

[44:11] demand.

[44:13] This ecosystem is extraordinary.

[44:16] Millions of square feet have been put

[44:19] online to support Grace Blackwell and

[44:22] preparing now, ramping up now, Vera

[44:24] Rubin. I want to thank all of you. Vera

[44:26] Rubin is now in full production. Thank

[44:28] you.

[44:33] Let's take a look.

[44:37] Large language models generate answers.

[44:41] Now AI agents can do work. But

[44:44] processing agentic AI is a whole

[44:46] different kind of problem. Agents

[44:48] observe, reason, plan, use tools. They

[44:52] manage massive context, juggling working

[44:54] memory and long-term memory. They spin

[44:57] up subagents, specialists on demand.

[45:00] NVIDIA Vera Rubin is a multi-rack

[45:02] pod-scale system built to process agentic

[45:05] AI and is now in full production. The

[45:08] manufacturing, automation and

[45:10] orchestration across the supply chain, a

[45:13] miracle to witness. Our journey started

[45:15] when we launched the first AI

[45:17] supercomputer, NVIDIA DGX-1.

[45:20] Over the next decade, we pushed every

[45:22] chip and system to the limit. From

[45:25] Pascal and the first NVLink to Grace

[45:28] Blackwell, the first rack-scale AI

[45:30] supercomputer. And now Vera Rubin, the

[45:33] first multi-rack pod-scale supercomputer

[45:36] built for the agentic age. It starts at

[45:39] TSMC. The seven new chips that make up

[45:42] Vera Rubin take shape through hundreds

[45:44] of processing steps. 3-nanometer process,

[45:47] CoWoS-R and CoWoS-L packaging, HBM4

[45:51] memory from Micron, SK hynix, and

[45:54] Samsung. The Vera Rubin compute board: 6

[45:57] trillion transistors with over 18,000

[46:00] components on one board. Vera Rubin NVL72

[46:04] does the thinking: prompt and context

[46:06] understanding, reasoning, and planning.

[46:09] Next, a new modular compute tray, streamlined

[46:13] with a new PCB midplane design. Superchips,

[46:16] ConnectX-9 SuperNICs, and

[46:19] BlueField-4 DPUs, all made in place with

[46:23] no cables for resiliency at AI factory

[46:25] scale. 18 compute trays, nine hot-swappable

[46:29] NVLink switch trays, new high-efficiency

[46:32] manifolds, liquid-cooled bus

[46:34] bars carrying over 5,000 amps, the

[46:37] equivalent of 20 electric cars at full

[46:40] acceleration. Together, 1.3 million

[46:43] components form this third-generation

[46:45] MGX rack design. Congratulations to

[46:48] Microsoft for their operational Vera

[46:50] Rubin NVL72 engineering rack.

[46:53] Congratulations to Dell and CoreWeave as

[46:55] well for standing up their Vera Rubin

[46:57] NVL72 engineering rack. Then the Vera

[47:01] CPU rack. 256 CPUs in a single liquid-cooled

[47:06] rack. Orchestrating the models,

[47:09] shuffling memory, launching tools. At

[47:12] Foxconn and Quanta, Groq 3 LPX takes

[47:16] shape. 256 Groq 3 LPUs across 16 trays, 40

[47:21] petabytes per second of SRAM bandwidth

[47:24] for ultra-low latency.

[47:26] While NVL72 generates tokens at the

[47:29] highest throughput, Groq LPX generates

[47:32] them at the lowest latency.

[47:34] Vera BlueField-4 STX, where AI keeps its

[47:38] memory, storage processing accelerated

[47:41] by BlueField-4, connecting memory,

[47:44] storage, and in-silicon security.

[47:47] And NVIDIA Spectrum-X Ethernet Photonics.

[47:51] The world's first Ethernet switch with

[47:53] 200 Gbit co-packaged optics. TSMC's COUPE

[47:57] process, chip-scale packaging, and ultra

[48:00] high-powered laser dies on indium

[48:03] phosphide.

[48:04] Vera Rubin, five connected rack scale

[48:07] systems, a supercomputer for AI agents,

[48:11] 150 supply chain partners across Taiwan,

[48:14] millions of square feet of factory

[48:16] floor, hundreds of sites, chips,

[48:19] packages, systems, and data centers

[48:22] pushed to the limits of size, power, and

[48:25] scale. This is what we call extreme

[48:28] co-design. We did this with Taiwan.

[48:30] Together, we reinvented computing for

[48:32] the age of AI. Taiwan was with us at the

[48:35] beginning and here today as we bring

[48:38] Vera Rubin to the world. Thank you

[48:40] Taiwan.

[48:49] Ladies and gentlemen, Vera Rubin.

[48:53] Vera Rubin

[48:55] was not just built for AI. Vera

[48:59] Rubin was not built just to run AI. Vera

[49:02] Rubin was built to run agents. This is

[49:06] an agentic system. Imagine the

[49:09] complexity which is the reason why

[49:11] agents is the last computer science

[49:15] breakthrough. It has taken this many

[49:17] years for agents to realize its

[49:19] potential and become useful. It stands

[49:22] to reason that the computer that runs it

[49:24] is the most advanced in the world. This

[49:26] is Vera Rubin. Let's take a look. Can we

[49:29] bring out Vera Rubin, please?

[49:51] And Janine, do we have

[49:52] the racks, the systems?

[49:59] It looks heavy.

[50:02] This is This is Vera Rubin. Vera Rubin

[50:05] NVLink 72.

[50:08] This is the Groq LPX. At the next GTC,

[50:12] I'm going to talk to you about a lot

[50:13] more of this. Today we have so much to

[50:15] talk to you about. This is Vera CPU

[50:19] rack. 256 CPUs, all liquid cooled. Let

[50:24] me tell you about Vera in just a moment.

[50:26] This is the Vera BlueField storage

[50:30] processing system and also security

[50:33] system. And of course this is our

[50:35] Mellanox networking, the world's first

[50:39] CPO.

[50:40] This is Vera Rubin. Incredible

[50:42] technology all coming together. Now when

[50:45] we built when we built Hopper we built

[50:48] Hopper as you know for pre-training.

[50:52] Pre-training was the most important

[50:53] application, the most important workload

[50:55] we were working on at the time. Then

[50:57] when we worked on Grace Blackwell,

[51:00] everybody said, "Jensen, you know,

[51:02] Nvidia is really good at pre-training.

[51:05] Inference is so easy." Do you remember

[51:08] that? People used to say, "Inference is

[51:10] so easy. We could do that, too." But as

[51:13] you know, inference equals money. And

[51:16] the models are so complicated. And to

[51:20] do it at incredibly high response time,

[51:24] fast interactivity and high throughput

[51:26] at the same time is incredibly hard.

[51:30] Which is the reason why we created

[51:31] NVLink 72. Today NVIDIA's token cost is

[51:36] the lowest in the world. Not by 10%, by

[51:40] X factors, orders of magnitude.

[51:43] All because we did extreme co-design.

[51:46] All because we understood the computing

[51:49] model, the computing pattern of

[51:51] inference and we were able to create

[51:54] NVLink 72. Now with Vera Rubin it is

[51:59] beyond inference. It is now inference in

[52:02] an agentic system. This is Vera

[52:06] Rubin. No cables,

[52:10] no hoses, no fans.

[52:13] The last time I

[52:15] showed this to you, we had cables

[52:18] everywhere.

[52:19] The cables were amazing to look at, but

[52:22] now there's a PCB in the middle which

[52:26] connects both sides. What used to take

[52:28] two hours now takes 5 minutes. The

[52:31] reliability and the resilience of Vera

[52:34] Rubin is going to be off the charts.

[52:37] This is our Vera CPU tray. The most

[52:41] advanced CPUs that have ever been built.

[52:44] I'm going to show you that in just a

[52:46] second. And this is our storage tray.

[52:51] Two Vera CPUs,

[52:53] four CX9. Incredible amounts of

[52:57] software.

[52:59] This is our new LPX, LPU30,

[53:04] the Groq system designed for very low

[53:07] latency inference. The throughput is

[53:10] delivered by Vera Rubin and extended

[53:13] with NVLink 72. If you want to extend

[53:16] that even further, you can have Groq

[53:20] LPUs. Here we have the Vera Rubin NVLink

[53:24] switch tray. These are the switches in

[53:27] the middle. And this is revolutionary

[53:30] because of

[53:32] NVLink 72 and the NVLink switches that we

[53:36] created and invented. And this is our

[53:40] Ethernet switches for scale out. What's

[53:44] amazing is we introduced these two

[53:47] systems for Grace Blackwell. These two

[53:51] systems were created for Grace Blackwell

[53:53] and today Nvidia is the largest

[53:56] networking company in the world. I'm so

[53:59] proud of the networking team. This is

[54:02] such an incredible enabler for

[54:04] everything that we do. I'm going to now

[54:06] talk to you about the next major

[54:09] industry we're going to be part of.

[54:12] Thank you, Janine.

[54:16] Thank you,

[54:22] Zen.

[54:27] I think there are 2,000 people back

[54:29] there pulling that.

[54:36] Okay, let's talk about CPUs.

[54:41] Vera CPUs. CPUs built for the age of AI.

[54:46] All of the CPUs until now

[54:49] were created for people. We were the

[54:52] users.

[54:54] We were the users. We were the renters.

[54:58] The way we use CPUs, we live in a world

[55:01] counted by seconds.

[55:03] The way we rent CPUs in the cloud, each

[55:07] one of them, the more CPU cores

[55:10] you have the more you can rent. The

[55:13] use case of the

[55:15] old CPU and the economics of the old CPU

[55:19] are fundamentally different than agents.

[55:22] Agents

[55:24] are impatient. They don't live in a

[55:26] world that is in seconds. They live in a

[55:28] world that's in nanoseconds.

[55:30] When it uses a tool, it wants the

[55:34] response time to be as fast as possible.

[55:37] When it accesses a database, it has to come

[55:39] back as soon as possible. Every moment

[55:43] that the agent is waiting

[55:45] keeps it from going to the next step,

[55:48] the next step, the next step. It is

[55:50] vital that we make the CPUs as low

[55:55] latency as possible, as interactive as

[55:58] possible. So we created Vera CPU for the

[56:02] age of AI. Now inside our system it's

[56:05] used in three different ways. The first

[56:08] way of course is Vera Rubin

[56:13] for thinking and inside the Vera Rubin

[56:16] rack, there are already two CPUs.

[56:20] As you know we are building and selling

[56:24] millions of Vera Rubins. We have sold

[56:27] millions of Grace Blackwells. Nvidia

[56:30] already is one of the largest CPU makers

[56:32] in the world. In the Vera Rubin

[56:35] rack are two CPUs. One for orchestrating

[56:40] and managing the GPUs,

[56:42] managing the KV cache,

[56:46] dealing with all of the software that

[56:48] runs in the rack. We also have the Grace

[56:51] BlueField that is used for security and

[56:55] isolation.

[56:56] The Vera compute is used for the

[57:00] harness, the orchestration of the AI

[57:02] models, tool use, accessing the

[57:05] database. And the data servers are right

[57:09] here. Vera BlueField, the fastest

[57:12] storage, fastest storage servers, the

[57:15] fastest storage system the world has

[57:17] ever made. And the reason why this is so

[57:20] vital is because agents are accessing

[57:22] memory so incredibly

[57:25] fast. These systems, the storage server

[57:31] and the CPUs

[57:33] are now the critical path of the most

[57:35] expensive part of the data center. This

[57:38] is the most expensive for a good reason.

[57:42] The economics, the economics of the AI

[57:46] factory is tokens

[57:49] and the tokens are created here. And so

[57:52] of course you want to manufacture and

[57:54] generate as many tokens as possible.

[57:57] This is where you put all of your

[57:58] economics and this has to not be in the

[58:01] way. And so Vera CPU has great pressure

[58:05] on the CPU architecture

[58:08] which is the reason why we built a brand

[58:10] new architecture from the ground up. A

[58:13] CPU the world has never seen before. We

[58:16] call it Vera.

[58:18] This is CPU

[58:20] for agents. All the CPUs of the past we

[58:24] built for humans. This CPU is built for

[58:27] agents. Well, there are four things to

[58:30] keep in mind. The four takeaways. The

[58:33] first takeaway is that the instructions

[58:37] per clock of Vera has to be incredibly

[58:40] good because we need the latency to be

[58:42] short. We need the processing time.

[58:45] Single-threaded performance, not

[58:48] throughput. Single-threaded performance

[58:50] has to be world class. Absolutely the

[58:53] best single-threaded performance. Which

[58:55] is the reason why the IPC, the

[58:58] instructions per clock of Vera is so

[59:00] high. It is the highest in the world. 10

[59:03] instructions fetched, decoded and

[59:05] executed per clock. Number one. Number

[59:09] two,

[59:11] the bandwidth necessary to move data in

[59:14] and out for the CPU has to be utterly

[59:17] world class. The second thing is

[59:20] bandwidth per core. The third is just

[59:23] bandwidth, period. Remember

[59:27] I said earlier agentic systems is

[59:30] fundamentally disaggregated and

[59:33] distributed. Disaggregated and

[59:36] distributed. When computing is

[59:38] disaggregated and distributed, networking

[59:42] becomes the problem. Therefore, we have

[59:44] to move the data around as fast as

[59:46] possible between the CPU cores and

[59:49] between the CPU and the storage, the CPU

[59:52] and the GPU.

[59:54] The bandwidth around the system and

[59:56] inside the CPU core has to be utterly

[59:59] world-class.

[01:00:01] This is the first CPU that's been built

[01:00:03] in a long time that is literally at reticle

[01:00:06] limits with a fabric that connects all

[01:00:09] of the CPU cores that is speed of light

[01:00:12] 3.6 terabytes per second.

[01:00:16] No chiplet tax, no chip boundary

[01:00:19] crossings because we need to have

[01:00:22] everything because the CPU cores are

[01:00:25] talking to each other with extremely

[01:00:27] high bandwidth. They're not rented core

[01:00:30] per core per core. They're all working

[01:00:32] together.

[01:00:33] The cross-sectional bandwidth of Vera is

[01:00:35] off the charts. It's the first one to be

[01:00:38] PCI Express Gen 6. It is also the first

[01:00:42] one to have LPDDR5 with 1.2

[01:00:46] terabytes per second. Two to

[01:00:50] three times the bandwidth of the highest

[01:00:52] performance CPUs on the outside, three

[01:00:55] times the bandwidth on the inside. The

[01:00:58] bandwidth per core and the bandwidth

[01:01:01] period is world class. Now remember I

[01:01:04] showed you earlier.

[01:01:06] The number of CPU cores, the number of

[01:01:08] CPUs is going to be quite high and the

[01:01:11] reason for that is very simple.

[01:01:15] We created CPUs

[01:01:17] for humans in the past, and humans, there are

[01:01:21] only one billion of us.

[01:01:23] There will be billions of agents and

[01:01:27] these agents are going to be using the

[01:01:30] CPUs with very little patience, because

[01:01:33] the cost of the GPU they sit next to is

[01:01:35] too high and therefore

[01:01:38] too valuable, too precious. Therefore,

[01:01:41] these CPUs are going to be both

[01:01:45] performant,

[01:01:47] but they also have to be extremely

[01:01:49] energy efficient so that we can cram as

[01:01:52] much CPU as we can into the factory

[01:01:55] without taking away power from the token

[01:01:59] generation which we know is how we make

[01:02:01] money. These four properties,

[01:02:04] instructions per clock or single

[01:02:05] threaded performance,

[01:02:08] bandwidth per core,

[01:02:10] the total bandwidth around the chip and

[01:02:12] inside the chip and energy efficiency

[01:02:16] define Vera. It is absolutely world

[01:02:19] class. When you compare it to the

[01:02:21] highest performance x86, it is just off

[01:02:23] the charts. When you compare it in real

[01:02:27] single-threaded performance, real

[01:02:30] performance, it's off the charts.

[01:02:34] It is incredible to be able to deliver

[01:02:36] 5% improvement on CPUs. It is incredible

[01:02:40] to be able to deliver 10%. But this kind

[01:02:43] of performance speed up is just unheard

[01:02:45] of. This is Nvidia Vera.

[01:02:50] What do you think?

[01:02:57] Let's take a look.

[01:02:59] >> Agentic AI changes the role of the CPU.

[01:03:03] The CPU is now the conductor and the GPU

[01:03:06] is the orchestra. Traditional CPUs were

[01:03:09] built for a different era. Maximizing

[01:03:11] cores per socket. Slice them up,

[01:03:14] virtualize, rent by the hour. In the age

[01:03:18] of agents, the CPU is now a bottleneck

[01:03:21] to GPU utilization, directly affecting

[01:03:24] token throughput, latency, and user

[01:03:26] experience.

[01:03:28] NVIDIA Vera is the CPU built for the

[01:03:30] agentic loop. Combining NVIDIA's custom

[01:03:33] data center CPU core with the scalable

[01:03:36] coherency fabric for the right balance

[01:03:38] of performance cores and bandwidth to

[01:03:40] maximize AI factory output. At the heart

[01:03:43] of Vera is the NVIDIA Olympus core built

[01:03:46] for modern data center workloads,

[01:03:48] branch-heavy Python runtimes, tool calls,

[01:03:51] and sandbox code execution. Each core is

[01:03:55] tuned for throughput. A neural branch

[01:03:57] predictor evaluating two taken branches

[01:04:00] per cycle. A 10-wide decode engine brings

[01:04:03] in more work each cycle. A large out-of-order

[01:04:06] engine keeps instructions

[01:04:07] moving. Advanced prefetchers with a

[01:04:09] novel graph engine anticipating the next

[01:04:12] data path. But fast cores only matter

[01:04:15] when data arrives correctly and on time.

[01:04:18] Vera is the first CPU to use LPDDR5X

[01:04:22] memory while correcting multiple errors

[01:04:25] simultaneously without compromising

[01:04:27] bandwidth. Vera achieves 40% lower peak

[01:04:30] memory latency versus x86,

[01:04:33] keeping cores fed on time through

[01:04:35] retrieval, analytics, and sandbox

[01:04:37] execution. NVIDIA's second generation

[01:04:40] scalable coherency fabric unifies all 88

[01:04:44] Olympus cores on a monolithic mesh with

[01:04:47] separate dies for memory and IO. Cores

[01:04:50] are not split across chiplets, enabling

[01:04:53] 50% faster core to core communication

[01:04:56] than traditional CPUs. And memory

[01:04:58] coherent NVLink chip to chip connects

[01:05:01] GPUs directly to the fabric. Beyond

[01:05:04] GPUs, NVLink chip to chip can scale Vera

[01:05:08] up to multiple sockets, enabling massive

[01:05:10] bandwidth between CPUs. Vera delivers

[01:05:14] 1.8 times the agentic sandbox

[01:05:16] performance of x86 CPUs. Standalone Vera

[01:05:20] racks run agent sandboxes, tools, code,

[01:05:24] and data pipelines. Tightly coupled to

[01:05:26] Rubin GPUs, Vera keeps accelerated

[01:05:29] workflows moving. NVIDIA Vera BlueField-4

[01:05:32] STX powers context memory and AI

[01:05:36] storage,

[01:05:38] compute, networking, storage. Vera is

[01:05:42] the CPU for the age of agents.
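
The bandwidth figures quoted for Vera can be related to the core count with a rough calculation. This sketch uses only numbers stated in the talk and the video (1.2 terabytes per second of memory bandwidth, a 3.6 terabytes per second fabric, 88 cores). Dividing evenly across cores is an assumption made purely for illustration; real cores do not receive a fixed share.

```python
MEMORY_BANDWIDTH_BYTES_PER_S = 1.2e12  # "1.2 terabytes per second"
FABRIC_BANDWIDTH_BYTES_PER_S = 3.6e12  # "3.6 terabytes per second"
CORE_COUNT = 88                        # "all 88 Olympus cores"


def per_core_share_gb(total_bytes_per_s, cores):
    """Even split of a bandwidth figure across cores, in GB/s."""
    # Assumption: an even split, used only to get a feel for the scale.
    return total_bytes_per_s / cores / 1e9


memory_share = per_core_share_gb(MEMORY_BANDWIDTH_BYTES_PER_S, CORE_COUNT)
fabric_share = per_core_share_gb(FABRIC_BANDWIDTH_BYTES_PER_S, CORE_COUNT)
print(f"memory bandwidth per core: about {memory_share:.1f} GB/s")
print(f"fabric bandwidth per core: about {fabric_share:.1f} GB/s")
```

The point of the exercise is the ratio rather than the absolute values: adding cores without adding bandwidth shrinks each core's share, which is why the talk treats bandwidth per core as its own design goal.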

[01:05:52] This is going to be our new major growth

[01:05:55] driver. The reviews are already coming

[01:05:57] out and it's pretty good.

[01:06:01] That's pretty good stuff.

[01:06:10] Now remember,

[01:06:13] Grace and Vera

[01:06:16] are also the most highly qualified

[01:06:20] CPUs in the world of AI because every

[01:06:22] single data center, every single cloud,

[01:06:25] every single enterprise, every company

[01:06:27] that works with NVIDIA on AI has already

[01:06:30] qualified

[01:06:32] Grace. The entire software stack has

[01:06:35] already been optimized for Grace. Every

[01:06:37] company will be qualifying Vera.

[01:06:41] Vera will be the most optimized agentic

[01:06:43] CPU in the world simply because it's

[01:06:47] going to go with Vera Rubin simply

[01:06:49] because we made the big hard switch.

[01:06:52] In fact, during Grace Blackwell

[01:06:53] transition, the biggest risk was going

[01:06:56] from external CPU x86 into Grace

[01:07:00] Blackwell,

[01:07:02] that transition was extremely dangerous,

[01:07:05] but we did it with incredible execution.

[01:07:08] Now, Grace is literally synonymous with

[01:07:11] Grace Blackwell. When people say

[01:07:13] Blackwell, they say Grace Blackwell

[01:07:15] because it is utterly now everywhere.

[01:07:18] Every company's software stack has been

[01:07:20] optimized for it. Everybody's security

[01:07:22] stack has been optimized for it. And now

[01:07:24] here comes Vera. I'm super excited about

[01:07:27] that. Now look at some of the

[01:07:28] performance numbers.

[01:07:30] Speedups is one thing. It is extremely

[01:07:33] hard to speed up SQL.

[01:07:37] SQL, the most famous

[01:07:42] domain-specific language, DSL, that has

[01:07:45] ever been created. You know,

[01:07:49] before CUDA there was SQL. Before OpenGL

[01:07:53] there was SQL. Invented by IBM, today it

[01:07:57] is the structured database engine of the

[01:08:00] planet. Everybody uses SQL. This is SQL

[01:08:03] running three times faster. Not 10%

[01:08:07] faster, not 25% faster, three

[01:08:11] times faster. Incredible.

[01:08:15] The next one is real-time

[01:08:18] stream processing. Remember,

[01:08:20] your AI is going to be not just reading

[01:08:23] documents. Your AI is going to be

[01:08:26] watching for telemetry, especially

[01:08:28] inside a factory, inside a stock

[01:08:30] exchange.

[01:08:32] You're going to be looking for telemetry

[01:08:34] continuously. The burst of data that's

[01:08:37] coming in goes into a CPU. This is Vera

[01:08:41] CPU running real-time stream processing

[01:08:44] for the New York Stock Exchange. Lynn Martin,

[01:08:47] the president of the New York Stock Exchange,

[01:08:49] has been so gracious to partner with us.

[01:08:52] This system is run all over the world in

[01:08:55] real time. Real-time stream processing,

[01:08:57] Vera CPU, six times, all because of the

[01:09:00] bandwidth, the single-threaded

[01:09:03] instruction execution, the bandwidth

[01:09:06] inside between the cores, the bandwidth

[01:09:08] outside. Vera is completely

[01:09:10] revolutionary.

[01:09:12] That's Vera.

[01:09:20] You know, X factors is something you

[01:09:22] talk about when you're talking about

[01:09:23] GPUs. It is quite rare that somebody

[01:09:26] talks about X factors on real workload

[01:09:30] real workload that is associated with

[01:09:32] CPU. So I'm so proud of the team. You

[01:09:35] guys did such a great job. We have an

[01:09:36] extraordinary road map coming.

[01:09:39] But what's really exciting is almost

[01:09:43] everybody is supporting Vera. They're as

[01:09:46] excited as we are. This is Vera opening

[01:09:49] up. It's opened up a brand new market.

[01:09:54] Agents. Agents is a new workload. We

[01:09:58] built CPUs for humans in the past. We

[01:10:01] need CPUs for agents. Agentic systems.

[01:10:05] The properties are different. Why would

[01:10:06] the old CPUs be the same?

[01:10:09] We are building millions and millions of

[01:10:13] Veras. Millions of Veras. And to go to

[01:10:17] market with us, Taiwan's ODMs and

[01:10:20] computer makers, all the OEMs. And you

[01:10:24] could see the early adopters.

[01:10:26] The early adopters are the agentic

[01:10:28] companies. This is the beginning of a

[01:10:31] new market, a market that never existed

[01:10:34] before. It's not going to take away from

[01:10:36] the old markets, but this is a new

[01:10:39] market.

[01:10:40] CPU for agents, and this

[01:10:44] market will surely be larger than the

[01:10:47] last and the reason for that is because

[01:10:49] there'll be a lot more agents than there

[01:10:51] are people, and the

[01:10:52] agents are very impatient. So Nvidia

[01:10:56] Vera CPU. Thank you.

[01:11:05] This is the most important slide really.

[01:11:07] This is the takeaway. The takeaway here

[01:11:10] is that this is the application pattern.

[01:11:13] This is the computing pattern of the

[01:11:15] next decade.

[01:11:18] Agents, harnesses,

[01:11:22] orchestrating large language models.

[01:11:24] Every company will run it. Every company

[01:11:28] will be an agent company. Every company

[01:11:31] will have agents running inside.

[01:11:34] Every company will see that agents will

[01:11:39] need their own operating system. Every

[01:11:41] company's asking us how do we run agents

[01:11:45] safely? How do we build agents for our

[01:11:49] own workloads? And so we have the NVIDIA

[01:11:53] agent toolkit for enterprise AI. You've

[01:11:56] seen me build this in plain sight.

[01:11:59] Almost everything that NVIDIA does, as

[01:12:01] you know, at every GTC, if you go back

[01:12:03] and look at my GTC 5 years ago or 10

[01:12:05] years ago, you will see today

[01:12:08] this you've seen me talking about for

[01:12:11] several years now because we've been

[01:12:13] building for this moment. There are four

[01:12:16] things that companies need in order to

[01:12:19] build agents as a service or build

[01:12:22] agents to operate.

[01:12:24] The first thing you need is you need

[01:12:26] models. Of course, large language

[01:12:29] models. The smarter the better, the

[01:12:31] cheaper the better, the faster the

[01:12:33] better. The second is you need a harness

[01:12:37] to orchestrate the whole thing. The

[01:12:39] third, these models want to use

[01:12:43] tools, and these tools come with their

[01:12:45] skills, and I showed you CUDA-X

[01:12:48] libraries. Those are going to be amazing

[01:12:50] tools for the agents in the future. And

[01:12:53] then lastly, you need a runtime. You

[01:12:56] need the operating system that holds it

[01:12:58] all together. This is the Nvidia toolkit

[01:13:02] for agents. It includes

[01:13:05] it includes

[01:13:08] models that you can modify. Nvidia's

[01:13:11] world-class open models. And I want to

[01:13:13] show you more. You can run agents from

[01:13:16] anybody. You could run Claude Code,

[01:13:20] incredible agent. Codex, incredible

[01:13:22] agent. You could run it inside this

[01:13:25] harness called OpenShell, which will be

[01:13:27] highly secure inside the

[01:13:29] enterprise. The shell protects the

[01:13:33] agent, keeps it grounded in security

[01:13:35] policies.

[01:13:38] Privacy is protected. Its rights and

[01:13:41] privileges are given. Its identity is

[01:13:43] protected. And so this OpenShell is

[01:13:47] being adopted all over the world. Nvidia

[01:13:49] OpenShell is open source. You're going

[01:13:51] to see so many companies adopt it. Red

[01:13:53] Hat, Canonical, Microsoft. It's going to

[01:13:57] be adopted everywhere.

[01:14:00] This is the runtime, and this

[01:14:03] runtime is fully optimized for the

[01:14:06] Nvidia AI platform which is everywhere.

[01:14:09] So you can run OpenShell in any cloud,

[01:14:12] on-prem, and even on device. So

[01:14:16] you now have tools and libraries that

[01:14:20] they can use. You have models that you

[01:14:22] can modify or use as is, or you have

[01:14:25] agents. This could be OpenClaw, Hermes, another

[01:14:31] incredible

[01:14:33] harness. These agentic harnesses can now

[01:14:37] run on-prem for you, or anywhere. Okay,

[01:14:40] so four things and this represents the

[01:14:44] operating system of the modern

[01:14:45] enterprise. Now how do we use this? One

[01:14:49] of my favorite use cases of agents

[01:14:52] is chip design. It is the single most

[01:14:56] important thing that Nvidia does. And so

[01:14:59] of course we have to partner with

[01:15:01] Cadence to build a super agent, a chip

[01:15:05] design super agent. It is orchestrated

[01:15:09] by Codex or Claude Code. It has RTL and

[01:15:14] architecture diagrams or schematics or

[01:15:18] specifications as input, and whatever

[01:15:20] you need to fix. And together we created

[01:15:24] some super agents

[01:15:27] that are optimized for the NVIDIA

[01:15:29] runtime with Nemotron. And let's take a

[01:15:33] look. It's really incredible.

[01:15:38] Cadence and Nvidia are partnering to

[01:15:40] build chip design agents.

[01:15:43] Hundreds of thousands of NVIDIA chips

[01:15:45] come together to make the AI factories

[01:15:47] that power the world's frontier AI

[01:15:49] models. Designing these chips and the

[01:15:52] systems they run in is one of the

[01:15:54] hardest engineering challenges.

[01:15:56] Trillions of transistors,

[01:15:58] three-dimensional circuits, microscopic

[01:16:01] scale. Every gate, every wire

[01:16:04] synchronized to picoseconds must work

[01:16:07] in perfect harmony with no margin for

[01:16:09] error. Physical prototypes are too slow

[01:16:12] and too costly. So engineers work in the

[01:16:14] digital realm. Each chip begins as a set

[01:16:17] of architectural specifications, then

[01:16:19] translated into RTL, the language of

[01:16:22] chip design. RTL must be verified in

[01:16:25] simulation. A single bug can delay a

[01:16:27] chip by months. At NVIDIA, thousands of

[01:16:30] engineers, billions of compute hours per

[01:16:33] year, millions of tests written, run,

[01:16:36] and debugged. A cycle that takes teams

[01:16:38] weeks. To compress this cycle, Cadence

[01:16:41] and NVIDIA built a design verification

[01:16:43] agent. Codex orchestrates the process.

[01:16:46] Cadence ChipStack launches the RTL

[01:16:49] verification loop, powered by Nemotron and

[01:16:52] secured by Nvidia OpenShell, calling on

[01:16:55] expert sub-agents in RTL generation,

[01:16:58] testbench creation, regression testing

[01:17:01] and debug. The system drives itself. The

[01:17:04] ChipStack agents run hundreds of

[01:17:06] simulations with Cadence Xcelium.

[01:17:08] Formal verification with Jasper. Design

[01:17:11] flaws revealed. Bugs in the code fixed.

[01:17:15] What once took weeks now takes hours.

[01:17:18] Verification cycles over 40 times

[01:17:20] faster. Together, Nvidia and Cadence are

[01:17:24] reinventing chip design with AI agents.
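
The run, debug, and re-run cycle described in the video can be sketched as a toy loop. This is only an illustration of the pattern: the `Design` class, the bug model, and every function name here are hypothetical stand-ins, not Cadence ChipStack, Xcelium, Jasper, or Nvidia OpenShell APIs.

```python
from dataclasses import dataclass, field


@dataclass
class Design:
    """Toy stand-in for an RTL block: just a set of outstanding bug IDs."""
    bugs: set = field(default_factory=set)


def run_regression(design, tests):
    """Hypothetical simulator call: a test fails if it targets a live bug."""
    return [t for t in tests if t in design.bugs]


def debug_and_fix(design, failing_test):
    """Hypothetical debug sub-agent: removes the bug behind one failing test."""
    design.bugs.discard(failing_test)


def verification_loop(design, tests, max_iterations=10):
    """Run the regression, fix failures, and repeat until clean or out of budget.

    Returns the number of regression runs needed, or None if the
    regression is still failing when the iteration budget is spent.
    """
    for iteration in range(1, max_iterations + 1):
        failures = run_regression(design, tests)
        if not failures:
            return iteration
        for failing in failures:
            debug_and_fix(design, failing)
    return None


if __name__ == "__main__":
    tests = ["t1", "t2", "t3", "t4", "t5"]
    design = Design(bugs={"t2", "t5"})
    print("regression runs needed:", verification_loop(design, tests))
```

In a real flow each stand-in would be a long-running simulator or formal tool call, which is why the talk stresses accelerating the tools themselves: the loop is only as fast as its slowest step.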

[01:17:30] From weeks, from weeks to hours, from

[01:17:33] weeks to hours, from weeks to hours,

[01:17:36] Nvidia has thousands of chip designers.

[01:17:39] We are going to hire hundreds of

[01:17:41] thousands of Cadence super agents that

[01:17:44] work with us so that we can accelerate

[01:17:47] our company so that we can be even more

[01:17:50] ambitious, create even more amazing

[01:17:52] things, run even faster. You saw earlier

[01:17:56] the toolkit with models, harness, and

[01:18:01] tools. The tools in this case are Cadence

[01:18:04] simulators and verifiers, formal

[01:18:06] verification systems. It is the reason

[01:18:08] why we're working with Cadence so hard

[01:18:11] to accelerate all of their tools on CUDA

[01:18:14] because the agents are impatient. The

[01:18:16] agents want the answer immediately. And

[01:18:19] so models,

[01:18:22] harnesses,

[01:18:24] CUDA-accelerated libraries

[01:18:26] and tools and then the runtime. What you

[01:18:29] saw just now is all of that coming

[01:18:32] together. Now, one of the things that it

[01:18:34] starts with is a great model that

[01:18:37] Cadence could modify and tune to be

[01:18:41] expert at the Cadence workflow, with

[01:18:43] Cadence expertise, so that they could

[01:18:46] create super agents that are proprietary

[01:18:49] to Cadence with their proprietary

[01:18:51] knowledge.

[01:18:53] They have to start with an excellent

[01:18:55] model. We call it Nemotron. Nvidia is

[01:18:58] dedicated to building open models for the

[01:19:00] world so that all of you, all of us

[01:19:03] could create our own agents. Today,

[01:19:06] we're announcing

[01:19:08] the Nemotron

[01:19:11] 3 Ultra.

[01:19:13] Yep. Our next open model, and it is

[01:19:16] smart.

[01:19:22] With the Nemotron models, we not only give you the

[01:19:26] model, we give you all the data that we

[01:19:29] used to train the model. And because we

[01:19:32] have a coalition of incredible partners,

[01:19:34] you can see all of our partners down

[01:19:36] here. We work together, contribute data

[01:19:40] to each other. Nemotron is trained on one

[01:19:44] of the largest suites of long-running

[01:19:46] reasoning, long-running task-solving,

[01:19:50] tool-using data sets in the

[01:19:53] world because of all of our great

[01:19:55] partnerships. All of this from the

[01:19:59] model, the training script and the data

[01:20:02] made completely available to you. This

[01:20:05] is open models at their best. The best

[01:20:08] open model system policies in the world.

[01:20:12] The simple goal is that you can take all

[01:20:14] of it, add to it, make it even better,

[01:20:17] make it yours. Nemotron 3 Ultra is five

[01:20:23] times faster.

[01:20:25] This is the world's first model based on

[01:20:27] a hybrid architecture of SSMs, state space

[01:20:31] models, with mixture of experts. The

[01:20:35] architecture is incredibly fast. We made

[01:20:38] it fast so that you could think fast.

[01:20:39] When you think fast, you could think

[01:20:41] longer at the same cost. So five times

[01:20:45] faster. It is also

[01:20:48] 30% cheaper. 30% lower cost to run in

[01:20:52] total flops and total inference time

[01:20:54] than even the most cost-effective in the

[01:20:56] world. We're comparing against the

[01:20:59] world's best open models. Frontier

[01:21:02] smart,

[01:21:04] five times faster, 30% cheaper,

[01:21:08] completely open. We're completely

[01:21:11] dedicated to this. This is now Nemotron

[01:21:13] 3. We're currently working on Nemotron 4.

[01:21:17] So this entire toolkit from models,

[01:21:21] harnesses,

[01:21:23] tools and skills and runtimes is the

[01:21:27] reason why every enterprise company in

[01:21:30] the world has the ability now to create

[01:21:33] their own agents just like Cadence did

[01:21:36] with their super agents. And we're

[01:21:37] working with so many companies, Cadence

[01:21:39] and CrowdStrike and Dassault and

[01:21:41] Palantir, SAP and ServiceNow. People

[01:21:44] always said, "Jensen, the agents are

[01:21:48] going to disrupt these markets." I said the

[01:21:51] complete opposite, and you can now see

[01:21:53] it. Agents are going to create the

[01:21:55] largest opportunity ever for my partners

[01:21:58] and friends, and we have NeMo, the

[01:22:02] NVIDIA agentic toolkit for enterprise AI

[01:22:06] to help them.

[01:22:08] So there you go.

[01:22:15] First, Vera Rubin in full production.

[01:22:18] Two, Vera CPU, a CPU built for a new

[01:22:22] generation, for agents. And three,

[01:22:25] Nvidia's enterprise AI toolkits so that

[01:22:30] every enterprise and every enterprise

[01:22:32] software company can build agents.

[01:22:45] My relationship with you started here.

[01:22:50] And many of you, many of you, many of my

[01:22:52] friends and partners here in Taiwan,

[01:22:55] your companies started here.

[01:22:58] This is in a lot of ways the beginning

[01:23:02] of the modern computer industry. 40

[01:23:04] years now. Nvidia is 33 years old.

[01:23:08] The PC industry was already starting to

[01:23:10] get going: Windows 1 and Windows 2, and

[01:23:13] Apple I and Apple II, and by the

[01:23:17] time that we came along, Windows 3.1 was

[01:23:21] the PC, and as you know, Windows 95 made

[01:23:26] the PC personal. It took the PC from enterprise

[01:23:30] companies and made it into a consumer

[01:23:33] electronics device. Everybody should

[01:23:36] have one and everybody does. This is the

[01:23:38] beginning. This computing platform did

[01:23:41] several things incredibly smart.

[01:23:44] Windows was not just disaggregated. As

[01:23:47] you know, Windows was properly

[01:23:50] abstracted. It was architected just

[01:23:52] right. System BIOSes,

[01:23:55] open chipsets,

[01:23:58] the operating system with drivers,

[01:24:02] drivers that could be connected and

[01:24:04] installed at runtime,

[01:24:06] and an abstraction layer with a

[01:24:09] multimedia API that opened up

[01:24:14] the PC to what we all know today. Each

[01:24:17] one of these elements was essential in

[01:24:20] making the PC so popular.

[01:24:23] 40 years later,

[01:24:26] Microsoft and Nvidia are going to

[01:24:28] reinvent the PC.

[01:24:31] This is going to be the new PC. Now,

[01:24:34] tomorrow night, tomorrow night, I think

[01:24:36] it's tomorrow night our time, but I'm

[01:24:39] going to be with Satya where we're going

[01:24:41] to talk a lot more about the work that

[01:24:43] we've been doing together, Microsoft and Nvidia,

[01:24:45] over the last three years. It took this

[01:24:48] long to completely reinvent how the PC

[01:24:52] is going to work so that we could be

[01:24:53] ready for this moment. As I mentioned

[01:24:56] earlier, that compute pattern called the

[01:24:59] agent. It's going to run in AI clouds.

[01:25:03] It's going to run inside enterprises. It

[01:25:06] is also going to run on your PC.

[01:25:09] What's going to happen to that PC when

[01:25:12] it has an autonomous agent? An agent

[01:25:14] that's helping you, that understands

[01:25:16] you. You could talk to it. It could look

[01:25:19] at you. You could

[01:25:21] ask it to read files, go help you, do

[01:25:25] some research. It could do a lot more

[01:25:27] that I'll show you. But the new

[01:25:29] operating system is of course the old

[01:25:32] operating system plus large language

[01:25:35] models. Large language models in a lot

[01:25:38] of ways are the modern version of

[01:25:41] DirectX.

[01:25:42] It has of course input and output,

[01:25:45] understands prompts, it understands

[01:25:47] computer vision, it can generate video,

[01:25:49] it can generate sounds. It is the modern

[01:25:52] extension, the intelligence extension of

[01:25:55] the PC, of a computer.

[01:25:58] On top of that, the application as I

[01:26:01] mentioned before is going to be replaced

[01:26:04] now by an agentic runtime. And that is

[01:26:07] the modern application, an agent. Let's

[01:26:11] now take a look at what it can do.

[01:26:14] >> It started with a spark,

[01:26:17] an idea to reimagine the PC for the

[01:26:21] first time in 40 years. For the age of

[01:26:24] AI, what becomes of our personal

[01:26:28] computer in a world of agents?

[01:26:31] Agents running natively, connected to

[01:26:33] models, local or in the cloud. Our

[01:26:37] personal AI sandboxed for security,

[01:26:40] running continuously, getting work done.

[01:26:44] The chips and the OS must evolve.

[01:26:48] Introducing RTX Spark. Everything we've

[01:26:52] learned over 33 years distilled into one

[01:26:55] chip.

[01:26:57] Blackwell RTX GPU with 6,144

[01:27:02] CUDA cores. One petaflop of AI

[01:27:04] performance. A custom 20-core Grace CPU

[01:27:09] built in partnership with MediaTek.

[01:27:11] Fused by NVLink.

[01:27:15] 128 GB of unified memory. TSMC

[01:27:19] 3-nanometer process. 70 billion

[01:27:22] transistors.

[01:27:25] And in close collaboration with

[01:27:26] Microsoft, a Windows platform for

[01:27:29] agents.

[01:27:31] We're reinventing the personal computer,

[01:27:35] for creating,

[01:27:38] for gaming,

[01:27:41] for agents. This is the dawn of a new

[01:27:45] personal computing revolution,

[01:27:47] and it starts with NVIDIA RTX Spark.

[01:27:56] Here

[01:28:06] it is.

[01:28:08] Of course, I got to show you the most

[01:28:10] beautiful part, which is video games.

[01:28:13] It's also the closest to our

[01:28:15] heart. This is Forza. This is 007, by

[01:28:19] the way. The new 007 game. I'm looking

[01:28:22] forward to playing it. I look a little

[01:28:24] bit like him.

[01:28:26] Ladies and gentlemen, Nvidia's RTX Spark

[01:28:31] laptops. Now,

[01:28:37] thank you.

[01:28:43] I have too many things in my pocket.

[01:28:48] Okay. All right. This is the most

[01:28:51] amazing chip the world has ever built.

[01:28:54] This is the N1X that we built in

[01:28:57] partnership with MediaTek. I think I saw

[01:29:00] I saw Rick earlier. This is N1X. This is

[01:29:02] a beautiful chip. This is a

[01:29:06] chip that frankly would take 33 years to

[01:29:10] build. And the reason for that is

[01:29:12] because 100% of Nvidia's software stack

[01:29:15] runs here. If you want to run

[01:29:19] digital biology, no problem. If you want

[01:29:21] to do seismic processing, no problem.

[01:29:23] You want astrophysics, no problem.

[01:29:25] Everything associated with CUDA, all the

[01:29:27] physics, all the biology, all the

[01:29:29] genomics, all the AI, no problem. All

[01:29:32] the computer graphics, no problem. Every

[01:29:34] single application Nvidia has ever

[01:29:37] created and every single application

[01:29:40] that Windows has ever run,

[01:29:43] Microsoft and Nvidia meticulously

[01:29:46] optimized everything so that this

[01:29:48] computer literally runs everything the

[01:29:52] world has ever created. Plus,

[01:29:56] it now runs agents. An incredible

[01:29:59] computer. I'm so proud of it.

[01:30:07] Okay.

[01:30:08] Now, I want you to keep that in mind in

[01:30:11] the next video I'm going to show

[01:30:13] you. Just imagine everything here is

[01:30:15] going to run on your PC. Now that

[01:30:17] computer could have a local Nemotron 3

[01:30:21] Ultra model or Nemotron 3 Super model, or

[01:30:25] it could have Claude Code or Codex or

[01:30:30] some other model in the cloud or

[01:30:32] something on the network, and it's going

[01:30:33] to work and do something

[01:30:35] amazing. Let's play it.

[01:30:39] >> Every house starts as an idea. Getting

[01:30:42] from idea to design takes a myriad of

[01:30:45] tools, expertise, and a lot of time.

[01:30:51] Now, an agent running locally on RTX

[01:30:54] Spark can help me design a house using

[01:30:56] the tools on my laptop, with an OpenShell

[01:30:58] sandbox running the Hermes harness

[01:31:02] connected to Claude Sonnet in the cloud.

[01:31:04] I select the site, share my concept

[01:31:07] sketches and mood board of styles to

[01:31:09] inspire my design and the prompt, a text

[01:31:12] description of the requirements

[01:31:15] and the design intent.

[01:31:19] My agent goes to work using the tools on

[01:31:21] my laptop. It opens Rhino and starts

[01:31:24] modeling the site, shaping terrain,

[01:31:27] setbacks, and the building envelope.

[01:31:30] Then it proposes building forms

[01:31:32] optimized for cost, comfort, and

[01:31:34] quality.

[01:31:37] With the form defined, my agent

[01:31:39] generates the interior layout. Walls,

[01:31:41] circulation, rooms begin to take shape.

[01:31:44] I jump in whenever I want to adjust to

[01:31:47] change.

[01:31:51] Doors, windows, and structural elements

[01:31:54] are placed automatically. My agent

[01:31:56] detects its own mistakes and fixes them.

[01:32:03] When I approve, the agent exports the

[01:32:05] model from Rhino into Blender. Materials

[01:32:08] and object properties transfer with the

[01:32:10] design context intact. I fine-tune the

[01:32:14] materials, get the look just right. Then

[01:32:17] I pick the shots. Blender renders the

[01:32:20] house. My agent using generative AI with

[01:32:23] the Flux 2 model makes them photoreal.

[01:32:25] Multiple viewpoints, lighting

[01:32:27] conditions. What was once a complex

[01:32:30] workflow is now guided and simplified by

[01:32:32] my agent.

[01:32:34] Working with me on RTX Spark. Design at

[01:32:38] the speed of imagination.

[01:32:47] PC in the world of agents. The

[01:32:50] developers are so excited about it. This

[01:32:52] is an incredible computer. All of the

[01:32:54] acceleration, all the software

[01:32:55] capabilities associated with it, working

[01:32:58] with every developer to make it

[01:33:00] incredible for all of you. The next one,

[01:33:04] Adobe. Incredible tool suite of course

[01:33:07] used by tens of millions of people

[01:33:09] around the world. They have

[01:33:10] re-engineered

[01:33:12] the architecture, the core of Adobe

[01:33:14] Photoshop and Premiere, and they're

[01:33:16] going to release it for RTX Spark. It is

[01:33:18] twice as fast. It's already fast. Now,

[01:33:20] it's going to be twice as fast. And

[01:33:23] it's also designed to be agent-friendly

[01:33:26] with its MCP server. It can now interact

[01:33:29] with agents on your laptop.

[01:33:31] The number of customers, the number of

[01:33:33] partners that are so excited to bring

[01:33:36] RTX Spark to the market is just

[01:33:39] incredible. You know, this is the first

[01:33:43] across-the-lineup PC reinvention in

[01:33:47] 40 years. And I'm just so happy that all

[01:33:49] of you and the ecosystem around the

[01:33:52] world have joined us. This is basically

[01:33:55] everybody. Everybody will support RTX

[01:33:58] Spark and will be building incredibly

[01:34:01] smart and powerful and beautiful laptops

[01:34:03] with all of us. Thank you very much.

[01:34:11] But that's not all. That's not all.

[01:34:15] RTX Spark is a reinvention of the laptop.

[01:34:19] But in fact, Microsoft and Nvidia are

[01:34:22] reinventing all of the PC. And today we're

[01:34:25] announcing a whole new line.

[01:34:28] Three revolutionary Windows machines

[01:34:31] covering desktop, laptop, and

[01:34:34] workstations. All 100% Windows

[01:34:38] compatible, 100% CUDA, 100% Nvidia AI

[01:34:42] Tensor Core. Everything that

[01:34:45] you see that runs on Nvidia in all these

[01:34:47] different platforms around the world

[01:34:49] runs here.

[01:34:51] This is the first completely

[01:34:55] re-engineered, reinvented line of PCs

[01:34:58] that has happened in 40 years. Now,

[01:35:00] what's really amazing is this. So, this

[01:35:02] is this is the RTX Spark laptop. This

[01:35:07] is the desktop. So, this one's from MSI.

[01:35:11] Joseph, this one's yours. Okay. Look how

[01:35:14] beautiful it is. This agent could run

[01:35:17] 24/7,

[01:35:19] meter-free.

[01:35:21] And you could download your agent. You

[01:35:23] could raise your lobster in here.

[01:35:28] This is your claw. It's running all the

[01:35:30] time. No meter anxiety. And it's sitting

[01:35:34] here connected to your whole house,

[01:35:36] connected to your laptop, connected to

[01:35:39] your display, all the cameras, your

[01:35:43] dryer, your water cooler, your water

[01:35:46] heater, your everything, whatever you

[01:35:48] want, your security system, all

[01:35:50] connected to this. And this becomes your

[01:35:52] personal AI, your personal AI agent. And

[01:35:56] it gets smarter and smarter and smarter

[01:35:58] over time because today we have Nemotron

[01:36:00] 3 Ultra. Tomorrow we have Nemotron 4 and

[01:36:03] then Nemotron 5, Nemotron 6. And we just

[01:36:06] keep getting smarter and smarter and

[01:36:07] smarter. And meanwhile, this is sitting

[01:36:09] at home helping you do things. If you

[01:36:11] want to book travel, no problem. And

[01:36:15] if you

[01:36:17] if you want an incredible system, this

[01:36:21] is a DGX Station for Windows.

[01:36:26] Compatible with Windows, runs everything

[01:36:27] in Windows, and it has 768

[01:36:33] GB of memory. And so you could run a

[01:36:36] trillion-parameter model. This is

[01:36:39] unbelievable. 20 petaflops,

[01:36:42] 8 terabytes per second of memory

[01:36:44] bandwidth, and this sits by your desk.
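
The memory claim can be checked with back-of-envelope arithmetic. The sketch below is an illustration rather than a published spec sheet: it counts weights only (no KV cache or activations), the precision choices are assumptions, and it treats all 768 GB as if it were served at the quoted 8 terabytes per second, which may not hold on the real machine.

```python
def weights_gb(params, bits_per_param):
    """Memory needed for the weights alone, in decimal gigabytes."""
    return params * bits_per_param / 8 / 1e9


def fits(params, bits_per_param, memory_gb):
    """True if the weights alone fit in the given memory."""
    return weights_gb(params, bits_per_param) <= memory_gb


def decode_tokens_per_second(params_read_per_token, bits_per_param, bandwidth_tb_s):
    """Bandwidth-bound upper limit on single-stream decoding.

    Assumes each generated token streams the given number of parameters
    from memory exactly once; a mixture-of-experts model would read only
    its active experts, so pass the active parameter count in that case.
    """
    bytes_per_token = params_read_per_token * bits_per_param / 8
    return bandwidth_tb_s * 1e12 / bytes_per_token


if __name__ == "__main__":
    PARAMS = 1e12        # "a trillion parameter model"
    MEMORY_GB = 768      # figure quoted in the talk
    BANDWIDTH_TB_S = 8   # figure quoted in the talk

    for bits in (16, 8, 4):
        size = weights_gb(PARAMS, bits)
        print(f"{bits}-bit weights: {size:.0f} GB, fits: {fits(PARAMS, bits, MEMORY_GB)}")

    rate = decode_tokens_per_second(PARAMS, 4, BANDWIDTH_TB_S)
    print(f"dense 4-bit decode ceiling: {rate:.0f} tokens/s")
```

Under these assumptions a trillion parameters fit only at roughly 4 bits per weight, so the claim implies a low-precision format.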

[01:36:49] Basically, if you're a developer of

[01:36:51] large language models, or you're a

[01:36:53] developer of agents, having this sit by

[01:36:56] your desk gives you all the compute you

[01:36:58] need and then when you deploy it, you

[01:37:00] put it into the cloud. Now there's

[01:37:03] something that if you look at this and

[01:37:05] think about this, something is happening

[01:37:07] here.

[01:37:09] Remember

[01:37:12] 15, 20 years ago, we used to have an

[01:37:14] idea called a phone.

[01:37:17] Today we have an idea called a PC.

[01:37:21] Today when you think about your phone,

[01:37:24] the one thing you don't do with it is

[01:37:26] make phone calls.

[01:37:31] You do just about everything else. And

[01:37:33] so that phone means something very

[01:37:36] different to you than a phone of the

[01:37:38] past.

[01:37:40] I am certain what's going to happen here

[01:37:43] is that the PC 10 years from now and the

[01:37:45] PC that you think about today, a tool

[01:37:48] where you launch applications,

[01:37:52] click and type, are going to

[01:37:56] be completely different. Here's my

[01:37:58] theory.

[01:38:00] I can totally imagine just as every

[01:38:03] house today has a home theater, or

[01:38:06] many houses have home theaters, big TVs,

[01:38:10] lawnmowers,

[01:38:12] dishwashers.

[01:38:14] I could totally imagine that someday

[01:38:16] there's actually an AI supercomputer in

[01:38:18] your house and it's running all of your

[01:38:21] agents. It's running all of your

[01:38:23] assistants and they're doing all kinds

[01:38:25] of things for you all the time and you

[01:38:28] have to have it in your house just like

[01:38:30] you have a home theater in your house,

[01:38:32] you have stereos in your house, you

[01:38:33] have game consoles in your house, you

[01:38:36] want AI assistant, AI agent computers

[01:38:38] running in your house, and these in time

[01:38:43] become a lot more like R2-D2 to you. It

[01:38:47] becomes more like C-3PO to you than it

[01:38:51] feels like a PC to you.

[01:38:54] There is no question this reinvention of

[01:38:57] the computer is as big of a deal as the

[01:39:00] reinvention of the phone into what we

[01:39:02] now know as the smartphone. And so this

[01:39:04] is the beginning of that journey. This

[01:39:06] is the beginning of a new line. And so

[01:39:09] we have a roadmap for this. This is a

[01:39:12] brand new product family for us. Every

[01:39:14] single generation of architecture, we

[01:39:17] will have a desktop, a laptop, a

[01:39:20] workstation, and then a desktop, a

[01:39:23] laptop and workstation. And the thing

[01:39:25] that I am just incredibly pleased,

[01:39:27] incredibly honored is that 100% of the

[01:39:31] world's PC industry has joined us to

[01:39:34] reinvent the PC. A new line, a new

[01:39:38] beginning. Thank you.

[01:39:52] As you know,

[01:39:55] agentic AI is just a digital robot.

[01:40:00] It understands, it reasons, it plans,

[01:40:04] and it acts and uses tools.

[01:40:07] Agentic AI is going to run across all of

[01:40:10] these computers and you've seen me talk

[01:40:12] about each and every one of these over

[01:40:14] time. We're working on humanoid robotics

[01:40:17] computers, robotics computers of all

[01:40:19] kinds. We're working on self-driving car

[01:40:21] computers. We're working on satellites.

[01:40:25] You have GeForce, which has Tensor Cores. I

[01:40:27] just talked about a whole new line of

[01:40:29] PCs. Agriculture equipment,

[01:40:32] manufacturing equipment, heavy industry

[01:40:34] equipment will all be agentic. You'll

[01:40:37] even have a little agentic

[01:40:39] helper for yourself.

[01:40:42] Even your base stations, the radio

[01:40:44] stations of the future are going to be

[01:40:45] agentic.

[01:40:47] understanding traffic and thinking about

[01:40:51] how to coordinate with the other base

[01:40:53] stations so that you could use as little

[01:40:55] energy as possible, increase the

[01:40:58] utilization, the

[01:41:00] spectral efficiency. And so everything

[01:41:03] will run agents. Today Nvidia is largely

[01:41:06] in the data center, but I am pretty certain

[01:41:09] that there will be tens of billions

[01:41:13] hundreds of billions over time of

[01:41:15] agentic systems agentic computers that

[01:41:18] are going to be running around the

[01:41:19] world. The biggest problem is data.

[01:41:23] In the case of language models, all the

[01:41:26] English and all the language that we

[01:41:27] have on the internet that we trained on

[01:41:29] was from the perspective of us. We wrote

[01:41:32] it and we're reading it. However, in

[01:41:35] order to create data for AI robotics,

[01:41:39] it has to be in the perception, the

[01:41:41] perspective of the robot. And most of

[01:41:44] the world's video data is from a third

[01:41:47] person, not first person. And so agentic

[01:41:50] systems, robotic systems,

[01:41:53] physical AI, the data is the hardest

[01:41:56] problem. You've seen us move up this

[01:41:59] ladder. We started with teleoperation,

[01:42:02] which is basically human demonstration.

[01:42:04] This is no different than the big

[01:42:06] breakthrough of reinforcement learning from

[01:42:08] human feedback. Then we use

[01:42:11] simulation. This is where Omniverse

[01:42:12] comes in. This is no different than

[01:42:14] reinforcement learning with verifiable

[01:42:17] rewards. Okay. And so we use these

[01:42:20] systems to bootstrap

[01:42:24] the AI model, the physical AI model.

[01:42:27] Eventually we're able to learn from

[01:42:30] third person, reprojecting it into

[01:42:33] first person. And now eventually through

[01:42:36] bootstrapping we have a world foundation

[01:42:39] model that can understand the physical

[01:42:42] world from any perspective you want.

[01:42:45] Third person, first person,

[01:42:47] outside-in, inside-out, doesn't matter.

[01:42:50] This is a big breakthrough indeed. And

[01:42:53] today we are announcing

[01:42:56] Cosmos 3. Cosmos 3 is the frontier of

[01:43:02] physical AI.

[01:43:04] We are at the frontier with language

[01:43:07] models. There are so many people working

[01:43:08] on it. However, in physical AI, we are

[01:43:12] absolutely the world's best. I am so

[01:43:14] proud of the team for doing this. This

[01:43:16] is the foundation model for all of your

[01:43:19] work. Whenever you want to create a

[01:43:21] robot, whenever you want to create a

[01:43:23] factory robot or a robot that works in a

[01:43:25] factory, any kind of robot that

[01:43:29] involves the physical world, you now have a

[01:43:32] companion, Cosmos 3, that can

[01:43:35] understand and reason, it can generate,

[01:43:38] it can simulate in the loop, it can even

[01:43:41] be the policy itself. It is on the top

[01:43:44] of leaderboards all over

[01:43:46] the world. I am incredibly proud of

[01:43:47] Cosmos and today we're announcing Cosmos

[01:43:50] 3. Let's take a look.

[01:43:53] >> The real world is infinite and

[01:43:55] unpredictable.

[01:43:56] Physical AI needs data, but real world

[01:43:59] data is impossible to scale. For

[01:44:02] physical AI, compute is data. This is

[01:44:07] Cosmos, an open frontier omni-model for

[01:44:10] physical AI built on a new mixture of

[01:44:12] transformers architecture. Pixels,

[01:44:15] action, sound, and language flow into

[01:44:17] the autoregressive transformer, which

[01:44:20] reasons, plans, and instructs the

[01:44:22] diffusion transformer, which generates

[01:44:24] what comes next.

[01:44:26] Developers post-train Cosmos across

[01:44:28] embodiment and use cases. As a VLM,

[01:44:32] Cosmos watches the physical world,

[01:44:35] understands what's happening, describing

[01:44:37] scenes, and flagging what matters.

[01:44:41] As a world model, Cosmos generates

[01:44:44] physics-accurate synthetic video from an

[01:44:46] image, text, or video.

[01:44:50] As a simulator, Cosmos closes the loop

[01:44:52] for policy training and evaluation. And

[01:44:55] as the foundation of NVIDIA Omnidreams,

[01:44:58] an action-conditioned world model,

[01:45:00] Cosmos predicts the future frame by

[01:45:03] frame.

[01:45:04] Post-train Cosmos and it becomes a world

[01:45:07] action model. Perceiving, reasoning,

[01:45:11] planning, generating actions

[01:45:14] for robots of every kind,

[01:45:17] for everything that moves.

[01:45:22] A new kind of data, a new kind of

[01:45:24] teacher generated by compute.

[01:45:29] Cosmos, the foundation for developers of

[01:45:32] the age of physical AI.

[01:45:46] It takes data plus compute to give you

[01:45:50] AI.

[01:45:52] Now that we have AI,

[01:45:54] compute is data. And so use Cosmos 3,

[01:45:58] train a whole bunch of AI models. Cosmos

[01:46:00] is such an incredible open model system.

[01:46:02] It's exactly the same as Nemotron. We

[01:46:04] open the model, we open the data, and we

[01:46:07] even opened how we trained it so that

[01:46:09] you could enhance it for yourself and

[01:46:11] turn Cosmos into your proprietary model.

[01:46:14] We have such incredible partners working

[01:46:16] with us in so many different industries.

[01:46:18] Now the model itself is, of

[01:46:21] course, the most understandable part of

[01:46:24] the AI stack, but the AI stack is very

[01:46:26] complicated. It has generators,

[01:46:30] the model, simulators, and the runtime,

[01:46:34] just as it is for agentic

[01:46:36] systems. These cars are essentially

[01:46:39] physical AI agentic robots. An

[01:46:44] autonomous vehicle also has this

[01:46:47] complicated stack. Today we're

[01:46:48] announcing Alpamayo 2,

[01:46:51] an open model for self-driving cars.

[01:46:55] We're working with car companies across

[01:46:58] the world. If you look at these brands

[01:47:00] that have signed up for the Nvidia

[01:47:01] Hyperion that are building Nvidia

[01:47:03] Hyperion cars, this represents about 80%

[01:47:09] of the world's cars. The manufacturers

[01:47:12] represent 80% of the world's cars. We

[01:47:14] are going to have a whole lot of Nvidia

[01:47:17] Hyperion systems that are able to run

[01:47:20] Alpamayo or anybody else's AV stack. We are

[01:47:23] also connected into mobility services.

[01:47:26] Approximately 97% of the world's

[01:47:28] mobility services are connecting with

[01:47:30] us. So that when we deploy Alpamayo on the

[01:47:35] Hyperion runtime with the Halos

[01:47:38] operating system, we will be able to

[01:47:40] connect to all of these services across

[01:47:42] the world. Let's take a look at this.

[01:47:47] >> Hey Mercedes, let's go to my favorite

[01:47:49] sandwich shop.

[01:47:51] >> Routing to your destination.

[01:47:54] Lane is clear. pulling out to start

[01:47:56] drive. Nudge left due to the stationary

[01:47:59] lead vehicle ahead blocking our lane.

[01:48:01] Slow down to stop at the stop sign

[01:48:03] controlling the intersection. Stop to

[01:48:05] yield to the pedestrian since the person

[01:48:07] is in our lane. Yield to the cut-in

[01:48:10] vehicle from the left. Nudge left to

[01:48:12] clear the stopped vehicle blocking on

[01:48:13] the right. Keep distance to the cut-in

[01:48:15] vehicle since it is merging into our

[01:48:16] lane. Nudge left due to the stopped van

[01:48:21] crossing ahead. Stop to keep distance of

[01:48:22] the lead vehicle. Keep distance directly

[01:48:24] ahead in our lane. Keep distance to the

[01:48:25] vehicle directly ahead in our lane. Stop

[01:48:27] to the stop sign since the intersection

[01:48:28] is controlled. Stop the cross traffic.

[01:48:31] Keep distance due to the truck blocking

[01:48:32] the right side of our lane. Nudge right due to

[01:48:34] the truck blocking the left side of our

[01:48:35] lane due to the trucking side of our

[01:48:36] lane.

[01:48:39] >> Your destination is on the right.

[01:48:48] Alpamayo,

[01:48:51] the world's first reasoning autonomous

[01:48:54] vehicle.

[01:48:56] If you let it talk all the time, it will

[01:48:59] drive you crazy.

[01:49:01] But we're very happy that it's talking

[01:49:05] to itself all the time. That's called

[01:49:07] thinking. And so, Alpamayo is a reasoning

[01:49:10] car. The technology that we've created

[01:49:13] also applies to humanoids. Of course,

[01:49:15] there are many new breakthroughs that

[01:49:17] have to happen. NVIDIA Isaac GR00T is

[01:49:20] our humanoid robotic stack: model,

[01:49:25] data generation,

[01:49:27] simulation,

[01:49:29] the runtime, including the

[01:49:32] operating system. This represents the

[01:49:35] GR00T

[01:49:37] platform, the Isaac GR00T platform. Every

[01:49:40] one of our systems, as you can see, has the

[01:49:42] exact same pattern, whether it's an agentic

[01:49:44] system for the cloud, an agentic system for

[01:49:47] a PC, a robotic system for a

[01:49:50] self-driving car, a robotic system for a

[01:49:52] humanoid robot, all the same. And of

[01:49:55] course, in every single case,

[01:49:58] we build everything completely.

[01:50:01] We build everything vertically,

[01:50:04] completely

[01:50:06] integrated with co-design, extreme

[01:50:08] co-design, and then we open it up for

[01:50:11] everybody to use whichever part you

[01:50:13] like. And whatever you want to use, we

[01:50:16] even help you modify. But the one thing

[01:50:18] that is missing is we need a reference

[01:50:22] platform for robotic systems. These

[01:50:24] robotic systems are so complicated. So

[01:50:27] many motors, so many sensors, so

[01:50:29] fragile. And yet we need to have a way

[01:50:33] to deliver these reference platforms.

[01:50:35] Just like we do with PCs and DGXs and

[01:50:38] clouds and self-driving cars, we now are

[01:50:41] going to do it for robots. Today we're

[01:50:43] announcing the NVIDIA Isaac GR00T, a

[01:50:45] reference humanoid robot, all fully

[01:50:48] integrated. 25 degrees of freedom

[01:50:51] on each hand, made by Sharpa. 31 degrees

[01:50:55] of freedom on the robot. 6 feet, 150 lbs.

[01:51:00] Just like me.

[01:51:05] The first number is shorter. The second

[01:51:07] number is bigger.

[01:51:10] Otherwise, pretty close. And this

[01:51:13] platform runs the new Thor and our

[01:51:15] entire software stack, data generation

[01:51:18] stack, data simulation stack, the

[01:51:20] runtime, all integrated into a robot

[01:51:24] that is designed for everyone to use.

[01:51:26] Now, we built this for higher education

[01:51:29] and university researchers because for

[01:51:32] them to build this is insanely hard

[01:51:34] to do. And so, let's take a look at

[01:51:36] that.

[01:51:37] The next leap in AI is general-purpose

[01:51:40] robots, humanoids. But building one is

[01:51:43] hard. Every team starts from scratch,

[01:51:45] stitching together simulators, teleop

[01:51:48] systems, data pipelines, and training

[01:51:50] infrastructure. Months of setup before

[01:51:53] research can start. NVIDIA Isaac GR00T,

[01:51:56] an open development platform for

[01:51:58] humanoid robots. Open models, simulation

[01:52:02] and training libraries, and data

[01:52:04] generators.

[01:52:05] Plus the robot computer, fully

[01:52:09] pipe-cleaned, ready to go in hours. First, set

[01:52:12] up the simulation environment in Isaac

[01:52:14] Lab.

[01:52:18] Capture demonstrations with Isaac

[01:52:20] Teleop on a real or simulated robot.

[01:52:25] Generate synthetic data with Omniverse

[01:52:28] and Cosmos.

[01:52:29] Scaling one demonstration into

[01:52:32] thousands.

[01:52:33] Train policies, evaluate them in Isaac

[01:52:36] Lab Arena,

[01:52:40] deploy through Isaac ROS running on

[01:52:42] Jetson Thor.

[01:52:54] Every element modular, open, use ours or

[01:52:59] swap in your own.

[01:53:02] GR00T is powering robotics research

[01:53:04] across every discipline for every domain

[01:53:07] from research labs to factory floors.

[01:53:11] One open platform

[01:53:19] and now a new addition. Isaac GR00T

[01:53:22] reference design robots built on

[01:53:25] NVIDIA's open platform ready for

[01:53:27] frontier research for any lab anywhere.

[01:53:31] The age of robotics starts here. Nvidia

[01:53:34] Isaac GR00T.

[01:53:40] So many robots.

[01:53:47] We're working with just about everybody

[01:53:48] who's working on robots in the world or

[01:53:50] robotic systems in the world. Let me tell

[01:53:53] you what I told you. The computer

[01:53:55] industry has been completely changed

[01:53:58] in the last six months. Everything

[01:54:00] changed.

[01:54:02] Everything changed because agents were

[01:54:04] realized and it converged with the

[01:54:07] latest frontier models and it made

[01:54:09] possible for AI to now do useful work.

[01:54:13] The computing pattern will repeat over

[01:54:16] and over and over again. This computing

[01:54:19] pattern of an agent that's a model, a

[01:54:21] harness that uses tools with skills and

[01:54:25] runs in a runtime. That runtime depends

[01:54:28] on whether it's in the cloud or on-prem,

[01:54:30] on a PC or in a robot. But the computing

[01:54:33] pattern is exactly the same for all of

[01:54:35] them. You will use different harnesses

[01:54:37] because of your preference. You'll use

[01:54:39] different models because of your

[01:54:40] preference. You will improve them for

[01:54:43] your proprietary use. You would create

[01:54:45] sub super agents that you can rent to

[01:54:47] other people to help them do their work.

[01:54:50] This agentic platform, this agentic

[01:54:52] pattern, Nvidia has an enterprise AI

[01:54:55] toolkit. This is a wonderful way for all

[01:54:59] of you to engage AI, and for us it's a

[01:55:01] wonderful growth opportunity.

[01:55:04] Vera Rubin is in full production.

[01:55:07] Whereas Grace Blackwell was created to

[01:55:10] process AI, particularly inference, Vera

[01:55:14] Rubin was created to run agents. It is

[01:55:17] in full production. It is much much more

[01:55:20] than a GPU. It is an entire disaggregated,

[01:55:24] distributed agent processing system.

[01:55:27] Nvidia has really become an

[01:55:28] infrastructure company. Not just a GPU

[01:55:31] company, not just a systems company, but

[01:55:33] an infrastructure company to help you

[01:55:36] generate the maximum revenues, the

[01:55:38] maximum profit, and to get there as soon

[01:55:40] as possible.

[01:55:42] The agent world.

[01:55:44] This new way of doing computing where

[01:55:47] you build CPUs now for agents not for

[01:55:49] people

[01:55:51] CPUs for agents has its own special

[01:55:53] requirement and our Nvidia Vera is

[01:55:56] revolutionary. I'm so happy about its

[01:55:58] ramp, the orders already. It's going to

[01:56:01] make it the fastest and the most

[01:56:03] successful product launch in our

[01:56:05] company's history. Nvidia and Microsoft

[01:56:08] have created a whole new line of PCs.

[01:56:10] This is a new beginning. And of course,

[01:56:13] that exact same agentic pattern, the

[01:56:15] agentic processing pattern, computing

[01:56:17] pattern that I just described is also

[01:56:21] going to run on all kinds of devices. I

[01:56:23] mentioned PCs, but in the future, it'll

[01:56:26] be robots and satellites and base

[01:56:28] stations and factories, in the cloud,

[01:56:32] on-prem, at the edge. This pattern, the agentic

[01:56:35] AI system, this agentic computing

[01:56:38] pattern will be replicated in computers

[01:56:40] all over. How we think about the

[01:56:42] personal computer will very likely

[01:56:44] change. I want to thank all of you for

[01:56:48] your partnership, your friendship. We

[01:56:50] couldn't be here without everything that

[01:56:52] we do together. I am so proud of how

[01:56:54] you've been so successful this last

[01:56:56] year. The next year is going to be even

[01:57:00] more. I have one more thing for you.

[01:57:02] Let's take a look.

[01:57:23] You ready, Taiwan?

[01:57:27] >> Let's do this. The keynote's done at

[01:57:30] Computex, Jensen. Show the world what's

[01:57:33] next. Useful AI has arrived. Agents

[01:57:37] working by your side. But in case you

[01:57:39] miss things we said today, we're going

[01:57:41] to break it all down for you. Taipei.

[01:57:44] >> Agents used to be misunderstood. Only

[01:57:46] movie stars had them in Hollywood. Now

[01:57:49] we all got teams making dreams come

[01:57:51] true. Building companies from living

[01:57:53] rooms, but they need so much compute. We

[01:57:56] hear you. That's why we created Vera

[01:57:59] >> Rubin stole the show, it's true. The

[01:58:01] cheapest tokens coming through

[01:58:03] >> 10 times faster inference heaven. More

[01:58:06] special agents than 007.

[01:58:08] >> BlueField keeps agents' memory true.

[01:58:11] >> Now let's talk about it. CPU

[01:58:13] >> 50% faster. That's outrageous.

[01:58:16] >> Not for Vera.

[01:58:16] >> It's built for agents. NVLink Fusion

[01:58:19] blends ASICs smartly.

[01:58:20] >> Everyone's welcome to the NVLink

[01:58:23] party. Well, if you like that

[01:58:25] introduction

[01:58:27] in full production

[01:58:29] ultra leave the run 5x work gets done

[01:58:33] claw keep the guardrails right,

[01:58:36] OpenShell keeps the sandbox tight

[01:58:38] >> your code migrated and reviewed before

[01:58:42] this song is

[01:58:45] a five layer cake

[01:58:47] make no mistake global AI cloud lots of

[01:58:50] gigawatts, keeps power lean connecting Got

[01:58:53] >> every optimized for you

[01:58:55] >> so you can have your cake

[01:58:57] >> can eat it too.

[01:58:59] >> RTX is finally here.

[01:59:01] >> Biggest PC moment in 40 years.

[01:59:04] >> Agents powering our workflow running

[01:59:06] anywhere Windows go.

[01:59:08] >> Harnesses run on CPU.

[01:59:10] >> Models fly on GPU.

[01:59:13] >> Cosmos worlds that robots need. Turning

[01:59:16] computer into synthetic feet.

[01:59:18] >> Alpamayo sees and reasons through.

[01:59:20] >> Understands roads like people do.

[01:59:23] >> GR00T is how they learn to move.

[01:59:25] >> Learning skills and finding growth trees

[01:59:29] powered by.

[01:59:30] >> The future is humanoid.

[01:59:44] Oh,

[01:59:47] he sing. Oh,

[01:59:52] he sing.

[01:59:57] Oh, he sing.

[02:00:04] The future's bright. Come see what's

[02:00:08] next.

[02:00:10] >> Thank you, Taiwan.

[02:00:13] Welcome to Computex.

[02:00:27] Have a great Computex.

[02:00:30] Thanks for an amazing year. Thank you

[02:00:32] for all your friendship and support.

[02:00:34] Thank you. Take care. Have a great

[02:00:37] Computex.

[02:00:52] Woke up feeling something shift. Same

[02:00:55] room, but the air felt thick.
