Full Transcript
https://www.youtube.com/watch?v=ZVhf4ZfeLTo
[00:00] We've seen the valuations of a bunch of software companies crash because people are expecting AI to commoditize software.
[00:06] And there's a potentially naive way of thinking about things, which is: "Look, Nvidia sends a GDSII file to TSMC, TSMC builds the logic dies, it builds the switches, then it packages them with the HBM that SK Hynix and Micron and Samsung make, then it sends it to an ODM in Taiwan where they assemble the racks."
[00:26] And so Nvidia is fundamentally making software that other people are manufacturing, and if software gets commoditized, does Nvidia get commoditized?
[00:32] Well, in the end something has to transform electrons to tokens.
[00:38] That transformation of electrons to tokens, and making those tokens more valuable over time, I think that's hard to completely commoditize.
[00:55] The transformation from electrons to tokens is such an incredible journey.
[01:04] And making that token, you know, it's like making one molecule more valuable than another molecule.
[01:10] Making one token more valuable than another: the amount of artistry, engineering, science, and invention that goes into making that token valuable.
[01:20] Obviously, we're watching it happen in real time.
[01:24] And so so the the the the transformation, the manufacturing, um all of the science that goes in there it is far from un- deeply understood and it's far from the journey is far from far from over.
[01:39] And so I doubt that it will happen.
[01:41] We're going to make it more efficient, of course.
[01:43] I mean, the whole thing about Nvidia, in fact the way that you framed the question, is my mental model of our company.
[01:50] The input is electrons; the output is tokens.
[01:54] In the middle is Nvidia.
[01:56] And our job is to do as much as necessary, and as little as possible, to enable that transformation to be done at incredible capability.
[02:08] And what I mean by as little as possible: whatever I don't need to do, I partner with somebody and make it part of my ecosystem to do.
[02:16] And if you look at Nvidia today, we probably have the largest ecosystem of partners, both supply chain upstream and supply chain downstream: all of the computer companies, all the application developers, all the model makers.
[02:31] AI is a five-layer cake, if you will, and we have ecosystems across the entire five layers.
[02:38] And so we try to do as little as possible.
[02:40] But the part that we have to do, as it turns out, is insanely hard.
[02:46] And I don't think that gets commoditized.
[02:49] In fact, I also don't think the enterprise software companies, the tools makers, get commoditized.
[02:57] You know, most of the software companies today are tools makers.
[03:00] Some of them are not; some of them are workflow codification systems.
[03:10] But for a lot of companies, they're tool makers.
[03:11] For example, Excel is a tool, PowerPoint is a tool, Cadence makes tools, Synopsys makes tools.
[03:18] I actually see the opposite of what people see.
[03:22] I think the number of agents is going to grow exponentially.
[03:27] The number of tool users is going to grow exponentially.
[03:30] And it's very likely that the number of instances of all these tools is going to skyrocket.
[03:39] It is very likely the number of instances of Synopsys Design Compiler is going to skyrocket.
[03:48] And the number of agents that are going to be using the floor planners and all of our layout tools and our design rule checkers: today we're limited by the number of engineers; tomorrow those engineers are going to be supported by a bunch of agents.
[04:02] They're going to be exploring the design space like you've never seen it explored before, and they're going to use the tools that we use today.
[04:09] And so I think tool use is going to cause these software companies to skyrocket.
[04:13] The reason why it hasn't happened yet is because the agents aren't good enough at using their tools yet.
[04:19] And so either these companies are going to build the agents themselves or agents are going to get good enough to be able to use those tools.
[04:26] And I think it's going to be a combination of both.
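The agent-plus-tool pattern described above can be sketched in a few lines. This is a hypothetical toy, not anything from Synopsys or Cadence: `design_rule_score` stands in for one invocation of an EDA tool, and each "agent" simply samples candidates and keeps the best-scoring one, so tool invocations scale with the number of agents rather than the number of engineers.

```python
import random

def design_rule_score(candidate):
    """Hypothetical stand-in for an EDA tool call (e.g., a design rule checker).

    Scores a toy floor-plan candidate: rule violations are penalized heavily,
    then smaller area is preferred. Real tools are vastly more complex.
    """
    width, spacing = candidate
    violations = max(0, 3 - width) + max(0, 2 - spacing)  # toy minimum-size rules
    area = width * spacing
    return -100 * violations - area

def agent_explore(num_candidates, seed=0):
    """One 'agent' exploring the design space by repeatedly invoking the tool."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(num_candidates):
        candidate = (rng.randint(1, 10), rng.randint(1, 10))
        score = design_rule_score(candidate)  # one tool invocation per candidate
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score

# 8 agents x 50 candidates = 400 tool calls from a single loop; the number of
# tool instances grows with the number of agents, which is the point above.
results = [agent_explore(50, seed=s) for s in range(8)]
best = max(results, key=lambda r: r[1])
print(best)
```

The scoring function and parameters are invented for illustration; the structure (sample, invoke tool, keep best) is the generic search loop an agent would run against real tools.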
[04:29] Mhm. I think in your latest filings you had almost a hundred billion dollars in purchase commitments with foundries, memory, and packaging suppliers, and SemiAnalysis has reported that you will have two hundred fifty billion dollars of these kinds of purchase commitments.
[04:44] And so one interpretation is that Nvidia's moat is really that you've locked up many years of these scarce components. Somebody else might have an accelerator, but can they actually get the memory to build it?
[04:55] Can they actually get the logic to build it?
[04:57] And this is really Nvidia's big moat for the next few years.
[05:02] Well, it's one of the things that we can do that is hard for someone else to do.
[05:08] We've made enormous commitments upstream.
[05:12] Some of it is explicit: these commitments that you mentioned.
[05:14] Some of it is implicit.
[05:16] For example, a lot of the investments upstream are made by our supply chain because I said to the CEOs, "Let me tell you how big this industry is going to be, let me explain to you why, let me reason through it with you, and let me show you what I see."
[05:36] And so, as a result of that process of informing, inspiring, and aligning with CEOs of all different industries upstream, they're willing to make the investments.
[05:48] Now, why are they willing to make the investments for me and not someone else?
[05:50] And the reason for that is because they know that I have the capacity to buy their supply and sell it through my downstream.
[06:00] The fact that Nvidia's downstream supply chain and our downstream demand are so large means they're willing to make the investment upstream.
[06:10] And so if you look at GTC, people marvel at the scale of it and the people who go.
[06:18] It's the entire 360-degree universe of AI, all in one place, and they're all in one place because they need to see each other.
[06:26] I bring them together so that the downstream can see the upstream, the upstream can see the downstream, and all of them can see all the advances in AI.
[06:35] And very importantly, they can all meet the AI natives and all the AI startups that are being built, and all the amazing things that are happening, so that they can see firsthand all the things that I tell them.
[06:47] And so I spend a lot of my time informing, directly or indirectly, our supply chain and our partners and our ecosystem about the opportunity that's in front of us.
[06:57] You know, in most of my keynotes, as some people always point out, it's one announcement after another announcement after another announcement.
[07:14] There's always a part of our keynotes that's a little torturous, in the sense that it almost comes across like education.
[07:23] And in fact that's exactly on my mind.
[07:26] I need to make sure that the entire supply chain, upstream and downstream, and the ecosystem understand what is coming at us, why it's coming, when it's coming, and how big it's going to be, and can reason about it systematically,
[07:40] just like I reason about it.
[07:42] And so I think the moat, as you describe it: we're able, of course, to build for the future. If our next several years is a trillion dollars in scale, we have the supply chain to do it.
[08:00] Without our reach, the velocity of our business...
[08:05] You know, just as there's cash flow, there's supply chain flow; there are turns.
[08:09] Nobody's going to build a supply chain for an architecture if that architecture's business turns are low.
[08:16] And so our ability to sustain the scale is only because our downstream demand is so great, and they see it and they all hear about it.
[08:24] They see it all coming, and that's what allows us to do the things we're able to do at the scale we're able to do them.
[08:32] I do want to understand more concretely whether the upstream can keep up.
[08:37] For many years now you've been 2x-ing revenue year over year.
[08:41] You've been more than tripling the amount of FLOPs you're providing to the world year over year.
[08:44] And 2x-ing at this scale is really incredible.
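Taken at face value, those two growth rates imply a third: if revenue roughly doubles year over year while delivered FLOPs more than triple, then FLOPs delivered per dollar of revenue must be improving by at least about 1.5x per year. A back-of-envelope check (treating the conversational "2x" and "3x" figures as exact is the assumption here):

```python
revenue_growth = 2.0   # revenue roughly doubling year over year (from the conversation)
flops_growth = 3.0     # delivered FLOPs more than tripling year over year

# FLOPs per dollar of revenue grows by the ratio of the two rates.
flops_per_dollar_growth = flops_growth / revenue_growth
print(flops_per_dollar_growth)
```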
[08:47] Exactly. Yeah. So then you look at logic: say you're the biggest customer on TSMC's N3 node, and you're one of the biggest on N2.
[08:56] AI as a whole is going to be 60% of N3 this year, and it's going to be 86% next year, according to SemiAnalysis.
[09:03] How do you 2x if you're the majority?
[09:07] And how do you do that year over year?
[09:11] So are we in a regime now where the growth rate in AI compute has to slow because of the upstream?
[09:14] Do you see a way to get around this? How do we build 2x more fabs year over year, ultimately?
[09:21] Yeah, at some level the instantaneous demand is greater than the supply, upstream and downstream, in the world.
[09:36] And at any instant we could be limited by the number of plumbers.
[09:43] Mhm.
[09:44] Which actually happens.
[09:46] The plumbers are invited to next year's GTC.
[09:48] [laughter]
[09:48] Yeah, you know, by the way, great idea.
[09:51] But that's a good condition.
[09:53] You want a market, you want an industry, where the instantaneous demand is greater than the total supply of the industry.
[10:00] The opposite is obviously less good.
[10:03] If we're too far apart, if one particular item, one particular component, is too far away, obviously the industry swarms it.
[10:15] So for example, notice people aren't talking very much about CoWoS anymore.
[10:20] And the reason for that is because for 2 years we swarmed the living daylights out of it.
[10:24] And we doubled, and doubled again, several doublings, and now I think we're in fairly good shape.
[10:29] And TSMC now knows that CoWoS supply has to keep up with the rest of the logic demand and the memory demand, and so they're scaling CoWoS, and they're scaling future packaging technologies, at the same level as they scale logic, which is terrific, because for a long time CoWoS was rather a specialty.
[10:48] And HBM memory was rather a specialty, but they're not specialties anymore.
[10:56] Now, realize, they're mainstream computing technologies.
[10:59] And of course, we're now much more able to influence a larger scope of our supply chain.
[11:12] In the past, you know, in the beginning of the AI revolution, all the things that I say now, I was saying 5 years ago.
[11:18] And some people believed in it and invested in it.
[11:22] For example, Sanjay and the Micron team still remember really well the meeting where I was clear about exactly what was going to happen and why it was going to happen.
[11:32] And those predictions are the predictions of today.
[11:37] And they really doubled down on it, and we partnered with them, and across LPDDR, across HBM memories, they really invested in it.
[11:45] And it obviously has been tremendous for the company.
[11:49] Some people came a little bit later, but now they're all here.
[11:54] And so I think each one of these generations, each one of these bottlenecks, gets a great deal of attention.
[12:02] And now we're pre-fetching the bottlenecks years in advance.
[12:06] So, for example, the investments that we've made with Lumentum and Coherent and all of the silicon photonics ecosystem: over the last several years we really reshaped the ecosystem and the supply chain for silicon photonics.
[12:23] We built up an entire supply chain around TSMC.
[12:25] We partnered with them on CoWoS, invented a whole bunch of technology, and we licensed those patents to the supply chain to keep it nice and open.
[12:36] And so we're preparing the supply chain through invention of new technologies, new workflows, new testing equipment, double-sided probing, investing in companies, and helping them scale up their capacity.
[12:43] And so you can see that we're trying to shape the ecosystem, the supply chain, so that it's ready to support the scale.
[12:45] It seems like some bottlenecks are easier than others, and so scaling up CoWoS versus scaling up...
[12:48] I went to the hardest one by the way.
[12:51] Which is?
[12:52] Plumbers.
[12:55] [laughter]
[12:56] Yeah. That's true. Yeah, yeah.
[13:08] I actually went to the hardest one.
[13:12] Yeah, plumbers and electricians. And the reason for that, and this is one of the concerns that I have about the doomers describing the end of work and the killing of jobs: if we discourage people from being software engineers, we're going to run out of software engineers.
[13:26] And it's the same prediction from 10 years ago, when some of the doomers were telling people, whatever you do, don't be a radiologist.
[13:41] And some of those videos are still on the web.
[13:42] You know, "radiology is going to be the first career to go; the world's not going to need any more radiologists."
[13:50] Guess what we're short of? Radiologists.
[13:51] Oh, but okay. So going back to this point about how some things you scale, other things... how do you actually manufacture 2x the amount of logic a year? Ultimately, memory and logic are bottlenecked by EUV.
[14:02] How do you get to 2x as many EUV machines a year, year over year?
[14:13] None of that is impossible to scale quickly. You could do all of that within 2 or 3 years.
[14:19] You just need a demand signal.
[14:22] Once you can build one, you can build 10.
[14:24] And once you can build 10, you can build a million.
[14:26] And so these things are not hard to replicate.
[14:29] How far down the supply chain do you go? Do you go to ASML and say, hey, if I look out 3 years from now, for Nvidia to be generating 2 trillion a year in revenue, we need way more EUV machines?
[14:39] Some of them I have to convince directly, some of them indirectly, and some of them, if I can convince TSMC, then ASML will be convinced.
[14:50] And so, you know, we have to think about the critical pinch points. But if TSMC is convinced, you'll have plenty of EUV machines in a few years.
[15:04] And so my point is that none of these bottlenecks lasts longer than 2 or 3 years. None of them.
[15:11] And meanwhile, we're improving computing efficiency by 10x, 20x; in the case of Hopper to Blackwell, some 30 to 50x.
[15:20] We're coming up with new algorithms because CUDA is so flexible.
[15:25] We're developing all kinds of new techniques so that we drive efficiency in addition to increasing capacity.
[15:32] And so none of those things worries me.
[15:36] It's the stuff that's downstream from us.
[15:39] Energy policies that prevent energy from being built out. You can't create an industry without energy.
[15:46] You can't create a whole new manufacturing industry without energy.
[15:51] We want to re-industrialize the United States.
[15:52] We want to bring back chip manufacturing and computer manufacturing and packaging and we want to build new things like EVs and robots and we want to build AI factories and you can't build any of these things without energy.
[16:05] And those things take a long time.
[16:07] But more chip capacity? That's a 2-to-3-year problem.
[16:10] More CoWoS capacity? A 2-to-3-year problem.
[16:12] Interesting.
[16:14] I feel like I have guests tell me the exact opposite thing sometimes, and in this case I just don't have the technical knowledge to adjudicate, but...
[16:20] Well, the beautiful thing is you're talking to the expert.
[16:21] Yeah.
[16:27] Okay, I want to ask about your competitors.
[16:29] So if you look at TPU: arguably two out of the top three models in the world, Claude and Gemini, were trained on TPU.
[16:39] What does that mean for Nvidia going forward?
[16:41] Well, we build a very different thing.
[16:47] You know, what Nvidia built is accelerated computing.
[16:51] Not a tensor processing unit.
[16:55] And accelerated computing is used for all kinds of things.
[16:57] You know, molecular dynamics and quantum chromodynamics, and it's used for data processing.
[17:05] Data frames, structured data, unstructured data.
[17:09] It's used for fluid dynamics, particle physics.
[17:15] And in addition, we use it for AI.
[17:20] And so accelerated computing is much more diverse, and although AI is the conversation today, and obviously very important and impactful, computing is much broader than that.
[17:31] And what Nvidia has done is reinvent the way computing is done, from general-purpose computing to accelerated computing.
[17:38] Our market reach is far greater than any TPU, any ASIC, can possibly have.
[17:45] And so if you look at our position, we're the only company that accelerates applications of all kinds.
[17:52] We have a gigantic ecosystem, and so all kinds of frameworks and algorithms all run on Nvidia.
[17:59] And because our computers are designed to be operated by other people, anyone who's an operator can buy our systems.
[18:10] With most of these home-built systems, you have to be your own operator, because they were never designed to be flexible enough for other people to operate.
[18:23] And so as a result of the fact that anybody can operate our systems, we're in every cloud, including Google and Amazon and Azure and OCI, right?
[18:33] And so if you want to operate it to rent, you had better have a large ecosystem of customers across many industries to be the off-takers.
[18:45] If you want to operate it for yourself, we obviously have the ability to help you operate it yourself, like, for example, for Elon with xAI.
[18:57] And because we can enable operators in any company in any industry, you could use it to build a supercomputer for scientific research and drug discovery at Lilly.
[19:10] And so we can help them operate their own supercomputer and use it for the entire diversity of drug discovery and biological sciences that we accelerate.
[19:22] And so there's a whole bunch of applications that we can address that you can't with TPUs, because Nvidia built CUDA as a fantastic tensor processing unit as well, but it also handles every life cycle of data processing and computing and AI, and so on and so forth.
[19:41] And so I our market opportunity is just a lot larger.
[19:45] Our reach is a lot greater.
[19:51] And because we have such a large reach, we basically support every application in the world now.
[19:55] You could build Nvidia systems anywhere and know that there will be customers for them.
[19:59] And so, it's a very different thing.
[20:00] This is going to be sort of a long question, but you know, you have spectacular revenue.
[20:04] And this revenue, mostly, you're not making 60 billion a quarter from pharma and quantum.
[20:10] You're making it because AI is an unprecedented technology that is growing unprecedentedly fast.
[20:16] And so then the question is: what is best for AI specifically? And I'm not in the details, but I talk to my AI researcher friends, and they say, "Look, when I use a TPU, it's this big systolic array that's perfect for doing matrix multiplies, whereas a GPU is very flexible. It's great when you have lots of branching, when you have irregular memory access. But what is AI? It's just these very predictable matrix multiplies, again and again and again. And you don't have to give up any die area for warp schedulers, for switching between threads, or for memory banks." And so the TPU is really optimized for the bulk of this growth in revenue and use cases, for the compute that is coming online right now.
[20:55] Um, yeah. I wonder how you react to that.
[20:59] Matrix multiplies are an important part of AI, but they're not the only part of AI.
[21:06] And if you want to come up with a new attention mechanism, or if you want to disaggregate in a different way, or if you want to come up with a whole new type of architecture altogether, for example a hybrid SSM, or if you want to create a model that somehow fuses diffusion and autoregression, you want an architecture that's just generally programmable.
[21:38] And we run everything you can imagine.
[21:41] And so that's the advantage: it allows for the invention of new algorithms a lot more easily, because it's a programmable system.
[21:51] And the ability to invent new algorithms is really what makes AI advance so quickly.
[21:58] advance so quickly. You know, you
[22:00] You know, you TPUs like anything else is impacted by
[22:03] TPUs like anything else is impacted by Moore's law. And we know that Moore's
[22:05] Moore's law. And we know that Moore's law is increasing about 25% per year.
[22:08] law is increasing about 25% per year. And so,
[22:09] And so, the only way to really get 10x leaps,
[22:13] the only way to really get 10x leaps, 100x leaps,
[22:15] 100x leaps, is to fundamentally change
[22:17] is to fundamentally change the algorithm and how it's computed
[22:20] the algorithm and how it's computed every single year.
[22:22] every single year. And that's Nvidia's fundamental
[22:23] And that's Nvidia's fundamental advantage.
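The 25%-per-year figure makes the arithmetic vivid: compounding 1.25x annually takes about 11 years to accumulate a single 10x, which is the gap that algorithm and system co-design has to close every generation. A quick check (the 25% rate is the figure quoted above; the rest is just compounding):

```python
rate = 1.25        # ~25% per year improvement attributed to Moore's law above
gain, years = 1.0, 0
while gain < 10.0:  # how many years of compounding until one 10x leap?
    gain *= rate
    years += 1
print(years)        # 11
```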
[22:25] The only reason why we were able to make Blackwell 50 times Hopper: you know, I said it was 35 times. When I first announced it, Blackwell was going to be 35 times more energy efficient than Hopper, and nobody believed it.
[22:42] And then Dylan wrote an article. He said, "In fact, I sandbagged. It's actually 50 times."
[22:48] And you can't reasonably do that with just Moore's law. So the way that we solve that problem is new models: MoEs, parallelized and disaggregated and distributed across a computing system. And without the ability to really get down and come up with new kernels with CUDA, it's really hard to do.
[23:15] And so it's the combination of the programmability of our architecture and the fact that Nvidia is an extreme co-design company, where we could even offload some of the computation into the fabric itself (NVLink, for example) or into the network, Spectrum-X.
[23:36] We could effect change across the processors, the system, the fabric, the libraries, the algorithm, and all of that was done simultaneously. Without CUDA to do that, I wouldn't even know where to start.
[23:53] My sponsor Crusoe was among the first clouds to offer Nvidia's Blackwell and Blackwell Ultra platforms, and they just announced their Nvidia Vera Rubin deployment, scheduled for later this year. But access to state-of-the-art hardware is only part of the story.
[24:05] For example, most inference engines already do KV caching for a single user's forward passes, but Crusoe does it across users and GPUs. So if a thousand agents are running on the same system prompt, Crusoe only has to compute the KV cache once for it to become available to every single GPU in the cluster.
[24:19] This is especially important as systems get more gigantic and require much longer prefixes in order to use tools and access files.
[24:26] In a recent benchmark, Crusoe was able to deliver up to 10 times faster time to first token and up to five times better throughput than vLLM. This is just one among many reasons that you should run your inference workload with Crusoe.
[24:39] And if you need GPUs for training, you don't need to switch clouds. Crusoe's got you covered there, too. Go to crusoe.ai/thoroughcache to learn more.
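The prefix-sharing idea described above can be sketched in a few lines. This is a toy illustration of cross-request KV-cache reuse, not Crusoe's actual implementation; the function names and the use of an in-process memo cache are hypothetical stand-ins for a real engine's per-layer key/value tensors and cluster-wide cache:

```python
from functools import lru_cache

@lru_cache(maxsize=1024)
def kv_cache_for(prefix: str) -> tuple:
    # Stand-in for the expensive prefill forward pass over `prefix`.
    # A real engine would return per-layer key/value tensors here.
    return ("kv", hash(prefix))

def generate(system_prompt: str, user_turn: str) -> str:
    kv = kv_cache_for(system_prompt)  # computed once per unique prefix
    # Decode would continue from the cached prefix state; stubbed out here.
    return f"response using {kv[0]} cache for {len(user_turn)} new tokens"

# Many agents sharing one system prompt trigger exactly one prefill.
prompt = "You are a helpful agent with tool access."
for turn in ("check weather", "read file", "summarize"):
    generate(prompt, turn)
print(kv_cache_for.cache_info().misses)  # -> 1 (one prefill, reused after)
```

The same principle scales up: the longer the shared prefix (tool definitions, file context), the more prefill work a shared cache saves.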
[24:47] So this gets at an interesting question about Nvidia's clientele, where 60% of your revenue is coming from these big five hyperscalers.
[25:00] In a different era, with different customers, say professors running experiments, they need CUDA. They can't use another accelerator; they need to just run PyTorch with CUDA and have everything optimized.
[25:13] But these hyperscalers have the resources to write their own kernels. In fact, they have to, to get that last 5% they need for their specific architecture.
[25:23] Anthropic and Google are mostly running their own accelerators, TPUs or Trainium. But even OpenAI, using GPUs, has Triton, because they said, "We need our own kernels."
[25:36] So instead of going down to CUDA C++ and using cuBLAS and NCCL and everything, they've got their own stack, which compiles to other accelerators as well.
[25:46] And so if most of your customers can, and do, make replacements for CUDA, to what extent is CUDA really the thing that is going to make frontier AI happen on Nvidia?
[25:58] CUDA is a rich ecosystem. And so if you want to build on any computer, building on CUDA first is incredibly smart.
[26:11] Because the ecosystem is so rich, we support every framework. If you want to create custom kernels, for example: we contribute enormously to Triton, and so the back end of Triton is huge amounts of Nvidia technology.
[26:28] We're delighted to help every framework become as great as it can be. And there's lots and lots of frameworks. There's Triton, there's vLLM, there's SGLang, and then there's more, right? And now there's a whole bunch of new reinforcement learning frameworks coming out: you've got veRL, you've got NeMo RL, a whole bunch of new ones. And with post-training and reinforcement learning, that entire area is just exploding.
[26:51] And so if you want to build on an architecture, building on CUDA makes the most sense. Because you know that the ecosystem is great. You know that if something happens, it's more likely in your code and not in the mountain of code underneath.
[27:06] You know, don't forget the amount of code that you're dealing with when you're building these systems. When something doesn't work, was it you or was it the computer? You would like it always to be you, and to be able to trust the computer.
[27:20] And obviously we still have lots and lots of bugs ourselves, but our system is so well wrung out that you can at least build on top of the foundation. So that's number one: the richness of the ecosystem, the programmability of it, the capability of it.
[27:35] The second thing is, if you were a developer and you were building anything at all, the single most important thing you want, more than anything, is installed base. You want the software that you write to run on a whole bunch of other computers. You're not building software just for yourself; you're building it for your fleet, or for everybody else's fleet, because you're a framework builder.
[27:55] And Nvidia's CUDA ecosystem is ultimately its great treasure.
[28:02] We are now at, I don't know how many, several hundred million GPUs. Every cloud has it. It goes back to A10, A100, H100, H200, the L series, the P series. I mean, there's a whole bunch of them, in all kinds of sizes and shapes.
[28:24] And if you're a robotics company, you want that CUDA stack to actually run in the robot itself. We're literally everywhere.
[28:30] And so the installed base says that once you develop the software, once you develop the model, it's going to be useful everywhere. The installed base is just incredibly valuable.
[28:42] And then lastly, the fact that we're in every single cloud makes us genuinely unique. Because you're an AI company, you're an AI developer, and you're not exactly sure which CSP you're going to partner with and where you would like to run it. And we run it everywhere, including on-prem for you if you like.
[28:58] And so I think the richness of the ecosystem, the expansiveness of the installed base, and the versatility of where we are: that combination makes CUDA invaluable.
[29:16] That makes a lot of sense. I guess the thing I'm curious about is whether those advantages matter a lot to your main customers. They might matter for the kind of person who can't actually build their own software stack, but those aren't the ones making most of your revenue.
[29:34] Especially if you go to a world where AI is getting especially good at the things which have tight verification loops, where you can RL on them. And this question of how you write a kernel that does attention or MLP most efficiently across a scale-up is a very verifiable sort of feedback loop. So can all the hyperscalers write these custom kernels for themselves?
[29:58] They might still prefer to use NVIDIA, because NVIDIA still has great price performance. But then does it just become a question of who is offering the best specs, the best flops and memory and memory bandwidth for a given dollar?
[30:13] Historically NVIDIA has had, and still has, the best margins in all of AI across hardware and software, 70% plus, because of this CUDA moat. And the question is: can you sustain those margins if most of your customers can actually afford to build instead of buying into the CUDA moat?
[30:33] The number of engineers we have assigned to these AI labs is insane, working with them, optimizing their stack. And the reason for that is that nobody knows our architecture better than we do, and these architectures are not as general-purpose as a CPU.
[30:49] A CPU is kind of like a Cadillac: it's a nice cruiser, it never goes too fast, everybody drives it pretty well, it's got cruise control, and everything is easy.
[31:10] But in a lot of ways, NVIDIA's GPUs, our accelerators, are kind of like F1 racers. I could imagine everybody being able to drive one at a hundred miles an hour, but it takes quite a bit of expertise to push it to the limit.
[31:26] And we use a ton of AI to create the kernels that we have. I'm pretty sure we're going to still be needed for quite some time.
[31:34] And so our expertise helps our AI lab partners get another 2x out of their stack, easily, oftentimes. It's not unusual that by the time we're done optimizing their stack, or optimizing a particular kernel, their model has sped up by 3x, 2x, 50%.
[32:00] That's a huge number, especially when you're talking about the installed base of the fleet that they have, all the Hoppers and Blackwells. When you increase it by a factor of two, that doubles their revenues. That directly translates to revenues.
[32:17] NVIDIA's computing stack is the best performance per TCO in the world, bar none. Nobody can demonstrate to me that any single platform in the world today has a better performance-per-TCO ratio. Not one company.
[32:35] And in fact, the benchmarks are out there. Dylan's right: InferenceMAX is sitting out there for everybody to use. And not one TPU will come. Training? Won't come.
[32:46] I encourage them to use InferenceMAX and demonstrate their incredible inference cost. It's really, really hard. Nobody wants to show up at MLPerf.
[33:01] I would welcome them in training, to demonstrate the 40% that they claim all the time. I would love to hear them demonstrate the cost advantage of TPUs. It makes no sense in my mind. It makes absolutely zero sense. On first principles, it makes no sense.
[33:18] And so I think the reason why we're so successful is simply because our TCO is so great.
[33:25] There's a second point. You say 60% of our customers are the top five, but most of that business is external. For example, most of NVIDIA in AWS is for external customers, not internal use. At Azure, obviously, all of our customers are external. All of our customers at OCI are external, not internal use.
[33:49] The reason why they favor us is because our reach is so great. We can bring them all of the great customers in the world. They're all built on NVIDIA, and the reason why all these companies are built on NVIDIA is because our reach and our versatility is so great.
[34:04] And so I think the flywheel is really installed base, the programmability of our architecture, the richness of our ecosystem, and the fact that there are so many AI companies in the world. There are tens of thousands of them now.
[34:21] And if you were one of those AI startups, what architecture would you choose? You would choose the architecture that's most abundant, and we're the most abundant in the world: the one with the largest installed base and a rich ecosystem.
[34:37] And so that's the flywheel. The reason it works is the combination. One, our perf per dollar is so great that they have the lowest-cost tokens.
[34:49] Second, our perf per watt is the highest in the world. If one of our partners builds a 1-gigawatt data center, that 1-gigawatt data center had better deliver the maximum amount of revenues, the maximum number of tokens, because tokens directly translate to revenues. You want it to generate as many tokens as possible and maximize the revenues for that data center. We have the highest tokens-per-watt architecture in the world.
[35:17] And then lastly, if your goal is to rent the infrastructure, we have the most customers in the world. And so that's the reason why the flywheel works.
[35:24] And so that's the reason why the flywheel works. Interesting. I I guess
[35:26] flywheel works. Interesting. I I guess the question comes down to
[35:29] the question comes down to what is the actual market structure here
[35:30] what is the actual market structure here because even if there's other companies
[35:32] because even if there's other companies there could have been a world where
[35:33] there could have been a world where there's tens of thousands of AI
[35:34] there's tens of thousands of AI companies that have roughly equal share
[35:37] companies that have roughly equal share of compute. But if even through these
[35:39] of compute. But if even through these [snorts] five hyperscalers really the
[35:41] [snorts] five hyperscalers really the people on Amazon using the computer
[35:44] people on Amazon using the computer Anthropic company AI
[35:46] Anthropic company AI and these big big foundation labs who
[35:48] and these big big foundation labs who who can themselves afford and have the
[35:51] who can themselves afford and have the ability to make excel different
[35:53] ability to make excel different accelerators work. No I I think your
[35:55] accelerators work. No I I think your your your assumption is is premise is
[35:58] your your assumption is is premise is wrong. Maybe.
[35:59] Let me ask you a slightly different question, which...
[36:02] Come back and let me correct your premise.
[36:04] Okay. Let me just ask you a different question, which is: if everything you're saying...
[36:08] Still make sure to come back so I can fix that premise, because it's just too important to AI. It's too important to the future of science; it's too important to the future of the industry. That premise, look...
[36:22] Let me just finish the question, and then we can address it together.
[36:25] So, what do you think: if all these things about price performance and performance per watt and so on are true, why do you think it is the case that, say, Anthropic just announced a couple of days ago a multi-gigawatt deal with Broadcom and Google for TPUs, for the majority of their compute? Obviously for Google, TPUs are the majority of their compute too.
[36:48] If I look at these big AI companies, there was some point where it was all NVIDIA, and now it's not. And so I'm curious how to square that: if these things are true on paper, why are they going with other accelerators?
[37:03] Yeah. Anthropic is a unique instance, not a trend. Without Anthropic, why would there be any TPU growth at all? It's 100% Anthropic. Without Anthropic, why would there be any training growth at all? It's 100% Anthropic. I think that's fairly well known and well understood. It's not that there's an abundance of ASIC opportunities. There's only one Anthropic.
[37:31] But OpenAI is dealing with AMD, and they're building their own Titan accelerator.
[37:35] Yeah, but they're mostly, I think we can all acknowledge, vastly NVIDIA. And we're going to still do a lot of work together.
[37:43] work together. Yeah. And we're not we're not I'm I'm not
[37:46] And we're not we're not I'm I'm not offended by other people using something
[37:48] offended by other people using something else and trying things. If they don't
[37:51] else and trying things. If they don't try these other things how would they
[37:52] try these other things how would they know how good ours is you know and
[37:54] know how good ours is you know and sometimes you got to be reminded of it.
[37:57] sometimes you got to be reminded of it. And and
[37:58] And and we we got to and we have to continuously
[38:00] we we got to and we have to continuously earn earn
[38:02] earn earn the position that we're in.
[38:04] the position that we're in. I
[38:05] I and there's always big claims and look
[38:07] and there's always big claims and look at the number of ASICs that have been
[38:08] at the number of ASICs that have been canceled.
[38:09] canceled. Just because you're going to build an
[38:10] Just because you're going to build an ASIC you still have to build something
[38:12] ASIC you still have to build something better than NVIDIA.
[38:14] better than NVIDIA. And it's not that easy building
[38:16] And it's not that easy building something better than NVIDIA. It's not
[38:17] something better than NVIDIA. It's not sensible actually.
[38:19] sensible actually. You know it's NVIDIA's got to be missing
[38:21] You know it's NVIDIA's got to be missing something seriously.
[38:23] something seriously. You know and because our our scale our
[38:25] You know and because our our scale our velocity
[38:26] velocity we're the only company in the world
[38:27] we're the only company in the world that's cranking it out every single
[38:30] that's cranking it out every single year. Big leaps every single year. I
[38:32] year. Big leaps every single year. I guess their logic is that hey it doesn't
[38:33] guess their logic is that hey it doesn't need to be better it just needs to be
[38:35] need to be better it just needs to be not more than 70% worse because they're
[38:37] not more than 70% worse because they're paying you 70% margins. No no no don't
[38:40] paying you 70% margins. No no no don't forget. Even an ASIC margin is really
[38:43] forget. Even an ASIC margin is really quite high.
[38:44] quite high. NVIDIA's margin 60 70% let's say but an
[38:47] NVIDIA's margin 60 70% let's say but an ASIC margin is 65.
[38:49] ASIC margin is 65. What are you really saving?
[38:51] What are you really saving? Oh you mean from Broadcom or something
[38:52] Oh you mean from Broadcom or something like
[38:52] like >> Yeah sure.
[38:54] >> Yeah sure. You got to pay somebody.
[38:56] You got to pay somebody. And so so I think the the ASIC margins
[38:58] And so so I think the the ASIC margins are are incredibly good from what I can
[39:01] are are incredibly good from what I can tell. And and they believe it they
[39:03] tell. And and they believe it they believe it so too.
[39:04] believe it so too. And so they're they're quite proud of
[39:06] And so they're they're quite proud of their their incredible ASIC margins.
[39:09] their their incredible ASIC margins. And so you you asked the question why.
[39:12] And so you you asked the question why. A long time ago
[39:13] A long time ago we just didn't have the ability to do
[39:15] we just didn't have the ability to do what
[39:17] what and and this is this is this is and at
[39:19] and and this is this is this is and at the time
[39:20] the time at the time
[39:22] at the time I didn't deeply internalize
[39:25] I didn't deeply internalize how difficult it would be
[39:27] how difficult it would be to build a a foundation AI lab
[39:30] to build a a foundation AI lab like Open AI and Anthropic.
[39:33] like Open AI and Anthropic. I
[39:34] I and the the fact that
[39:36] and the the fact that they needed huge investments from the
[39:38] they needed huge investments from the supplier themselves.
[39:40] supplier themselves. We just weren't in a position make the
[39:42] We just weren't in a position make the multi-billion dollar investment into
[39:44] multi-billion dollar investment into Anthropic so that they could use our use
[39:47] Anthropic so that they could use our use our compute.
[39:48] our compute. But Google and and AWS were and they put
[39:52] But Google and and AWS were and they put in huge investments in the beginning so
[39:54] in huge investments in the beginning so that Anthropic um in return use their
[39:57] that Anthropic um in return use their compute. Uh we we just weren't in a
[39:59] compute. Uh we we just weren't in a position to do so uh at the time. Nor
[40:02] position to do so uh at the time. Nor nor did I
[40:04] nor did I I would say my mistake is I didn't
[40:06] I would say my mistake is I didn't deeply internalize that they they really
[40:09] deeply internalize that they they really had no other options
[40:10] had no other options that that that a VC would never put in 5
[40:14] that that that a VC would never put in 5 10 billion dollars of investment into an
[40:17] 10 billion dollars of investment into an AI lab with the with the hopes of it
[40:19] AI lab with the with the hopes of it turning out to be Anthropic. And so that
[40:22] turning out to be Anthropic. And so that was my miss.
[40:24] was my miss. Uh but even if I understood it, I don't
[40:26] Uh but even if I understood it, I don't think we would have been in a position
[40:27] think we would have been in a position to do that at the time.
[40:29] to do that at the time. But um I'm not going to make that same
[40:31] But um I'm not going to make that same mistake again and and um uh I'm
[40:34] mistake again and and um uh I'm delighted to invest in OpenAI and and um
[40:37] delighted to invest in OpenAI and and um um I'm delighted to to uh help them
[40:40] um I'm delighted to to uh help them scale and I believe it's essential to do
[40:42] scale and I believe it's essential to do so. And then and then when um uh when I
[40:45] so. And then and then when um uh when I was able to
[40:46] was able to uh answer when Anthropic came to us, uh
[40:49] uh answer when Anthropic came to us, uh I'm delighted to be an investor, de-
[40:51] I'm delighted to be an investor, de- delighted to help them scale and um uh
[40:54] delighted to help them scale and um uh but we just weren't at the at the time
[40:56] but we just weren't at the at the time able to do so. Mhm. Uh if I if I could
[40:59] able to do so. Mhm. Uh if I if I could uh rewind everything uh invi- Nvidia
[41:02] uh rewind everything uh invi- Nvidia could have been as big back then as we
[41:04] could have been as big back then as we are now, I would have been more than
[41:05] are now, I would have been more than happy to do it. And this is this is
[41:07] happy to do it. And this is this is actually quite interesting which is um
[41:09] actually quite interesting which is um for many years Nvidia has been this
[41:12] for many years Nvidia has been this um the company in AI making money making
[41:15] um the company in AI making money making lots of money. And um now you're
[41:19] lots of money. And um now you're investing it. It's been reported that
[41:21] investing it. It's been reported that you've done up to 30 billion in OpenAI
[41:23] you've done up to 30 billion in OpenAI and 10 billion in um Anthropic. Um but
[41:27] and 10 billion in um Anthropic. Um but now their valuations have increased and
[41:28] now their valuations have increased and I'm sure they'll continue to increase.
[41:30] I'm sure they'll continue to increase. Um and so if over over all these many
[41:33] Um and so if over over all these many years, you know, you were giving them
[41:34] years, you know, you were giving them the compute, you saw where AI was headed
[41:36] the compute, you saw where AI was headed and then they were worth like 1/10 what
[41:38] and then they were worth like 1/10 what they are now a couple years ago or even
[41:39] they are now a couple years ago or even a year year ago in some cases. Um
[41:42] a year year ago in some cases. Um and you had all this cash.
[41:45] and you had all this cash. W- There's there's a world where either
[41:47] W- There's there's a world where either Nvidia themselves becomes a foundation
[41:49] Nvidia themselves becomes a foundation lab, um d- does a huge investment to
[41:52] lab, um d- does a huge investment to make that possible or has made the deals
[41:54] make that possible or has made the deals you made now at current valuations much
[41:56] you made now at current valuations much earlier on. Um and you had the cash to
[41:58] earlier on. Um and you had the cash to do it. So I'm I am curious actually why
[42:00] do it. So I'm I am curious actually why not have done it earlier.
[42:02] not have done it earlier. We did it as soon as we could have.
[42:05] We did it as soon as we could have. We did it as soon as we could have. And
[42:07] We did it as soon as we could have. And and and um
[42:09] and and um if I could have, I would have done it
[42:11] if I could have, I would have done it even earlier.
[42:12] even earlier. Um at the time that Anthropic needed us
[42:14] Um at the time that Anthropic needed us to do it, we just weren't in a position
[42:16] to do it, we just weren't in a position to do it.
[42:17] to do it. It wasn't it wasn't you know, it wasn't
[42:19] It wasn't it wasn't you know, it wasn't in our sensibility to do so. How so?
[42:21] in our sensibility to do so. How so? Like a cash thing or just Yeah, the
[42:23] Like a cash thing or just Yeah, the level of investment. You know, we never
[42:25] level of investment. You know, we never invested outside the company at the time
[42:28] invested outside the company at the time and not that much.
[42:30] and not that much. And um
[42:33] And um and we didn't realize we needed to.
[42:35] and we didn't realize we needed to. You know, I always I always thought that
[42:37] You know, I always I always thought that they could just go raise VCs for God's
[42:39] they could just go raise VCs for God's sakes like like all companies do.
[42:42] sakes like like all companies do. Um but but um uh what they were trying
[42:45] Um but but um uh what they were trying to what they were were trying to do
[42:48] to what they were were trying to do uh could not been done through VCs.
[42:51] uh could not been done through VCs. What OpenAI wanted to do could not been
[42:52] What OpenAI wanted to do could not been done through VCs.
[42:54] done through VCs. And and I recognize that now. I didn't
[42:56] And and I recognize that now. I didn't know it then.
[42:57] know it then. You know, but that's their genius.
[42:58] You know, but that's their genius. That's why they're smart. You know, and
[43:00] That's why they're smart. You know, and so so they realized they realized it
[43:02] so so they realized they realized it then that they had to do something like
[43:03] then that they had to do something like that and I'm delighted that they did,
[43:05] that and I'm delighted that they did, you know, and and even though even
[43:07] you know, and and even though even though um we we caused Anthropic to have
[43:11] though um we we caused Anthropic to have to go to somebody else
[43:14] to go to somebody else um I'm still happy that it happened.
[43:16] um I'm still happy that it happened. An- Anthropic's existence is great for
[43:18] An- Anthropic's existence is great for the world. I'm I'm delighted for it. Uh
[43:21] the world. I'm I'm delighted for it. Uh I I guess you still are making a ton of
[43:23] I I guess you still are making a ton of money and you're making way more money
[43:24] money and you're making way more money um quarter after quarter.
[43:25] um quarter after quarter. >> okay to have regrets.
[43:27] >> okay to have regrets. >> [laughter]
[43:29] >> [laughter] >> So the the the the the question still
[43:30] >> So the the the the the question still arises, okay, well now that we're here
[43:32] arises, okay, well now that we're here and you have all this money that you
[43:33] and you have all this money that you keep making um what should Nvidia be
[43:36] keep making um what should Nvidia be doing with it? And there's one answer
[43:38] doing with it? And there's one answer which says, look, there's this whole
[43:39] which says, look, there's this whole middleman ecosystem that has popped up
[43:40] middleman ecosystem that has popped up for converting um CAPEX into OPEX for
[43:45] for converting um CAPEX into OPEX for these labs so that they can rent compute
[43:48] these labs so that they can rent compute um because the chips are really
[43:49] um because the chips are really expensive, they make a lot of money over
[43:50] expensive, they make a lot of money over their lifetime through because AI models
[43:52] their lifetime through because AI models are getting better the value that they
[43:54] are getting better the value that they generate the tokens is increasing, but
[43:55] generate the tokens is increasing, but they're expensive to set up. Nvidia has
[43:57] they're expensive to set up. Nvidia has the money to do the CAPEX. So and in
[44:00] the money to do the CAPEX. So and in fact you are
[44:01] fact you are uh you're it's been reported you're
[44:03] uh you're it's been reported you're backstopping CoreWeave up to 6.3 billion
[44:04] backstopping CoreWeave up to 6.3 billion and have invested to be um but yeah, why
[44:08] and have invested to be um but yeah, why why why doesn't Nvidia become
[44:10] why why doesn't Nvidia become a cloud themselves? Why doesn't become a
[44:12] a cloud themselves? Why doesn't become a hyperscaler themselves and run this
[44:13] hyperscaler themselves and run this computer out? You have all this cash to
[44:14] computer out? You have all this cash to do it. This is a philosophy of the
[44:16] do it. This is a philosophy of the company and and I think is wise. We
[44:18] company and and I think is wise. We should do as much as needed as little as
[44:21] should do as much as needed as little as possible.
[44:23] possible. And and what that means is
[44:25] And and what that means is the the work that we do with building
[44:27] the the work that we do with building our our computing platform
[44:29] our our computing platform if we don't if we don't do it, I
[44:31] if we don't if we don't do it, I genuinely believe it doesn't get done.
[44:34] genuinely believe it doesn't get done. If we didn't take the risk that we take,
[44:36] If we didn't take the risk that we take, if we didn't build NVLink the way we
[44:38] if we didn't build NVLink the way we built if we didn't build the whole
[44:39] built if we didn't build the whole stack, if we didn't create the ecosystem
[44:41] stack, if we didn't create the ecosystem the way we did it, if we didn't dedicate
[44:43] the way we did it, if we didn't dedicate ourselves to 20 years of CUDA while
[44:46] ourselves to 20 years of CUDA while losing money most of that time, if we
[44:48] losing money most of that time, if we didn't do it, nobody else would have
[44:49] didn't do it, nobody else would have done it.
[44:51] If we didn't create all the CUDA-X
[44:53] If we didn't create all the CUDA-X libraries so that they're all domain
[44:55] libraries so that they're all domain specific
[44:57] specific you know, this is several decade and a
[44:59] you know, this is several decade and a half ago uh we pushed into domain
[45:01] half ago uh we pushed into domain specific libraries because we realized
[45:03] specific libraries because we realized that if we didn't create these domain
[45:04] that if we didn't create these domain specific libraries whether it's for ray
[45:06] specific libraries whether it's for ray tracing or image generation or even the
[45:09] tracing or image generation or even the early works of AI these models, if we
[45:11] early works of AI these models, if we didn't create them for data processing,
[45:13] didn't create them for data processing, structured data processing or vector
[45:15] structured data processing or vector data process- if we didn't create them,
[45:17] data process- if we didn't create them, nobody would.
[45:18] nobody would. And I am completely certain of that.
[45:21] And I am completely certain of that. We created a a a library for
[45:24] We created a a a library for computational lithography called
[45:25] computational lithography called cuLitho. If we didn't create it, nobody
[45:27] cuLitho. If we didn't create it, nobody would have.
[45:29] would have. And so
[45:30] And so accelerated computing wouldn't advance
[45:32] accelerated computing wouldn't advance the way it has if we didn't do what we
[45:33] the way it has if we didn't do what we did. And and so we should do that. We
[45:36] did. And and so we should do that. We should dedicate our company all of our
[45:38] should dedicate our company all of our might wholeheartedly to do that.
[45:40] might wholeheartedly to do that. However, the world has lots of clouds.
[45:43] However, the world has lots of clouds. If I didn't do it, somebody show up.
[45:46] If I didn't do it, somebody show up. And so following the the recipe, the
[45:48] And so following the the recipe, the philosophy of doing as much as needed
[45:51] philosophy of doing as much as needed but as little as possible
[45:53] but as little as possible as little as possible that philosophy
[45:56] as little as possible that philosophy exists in our company today. And
[45:58] exists in our company today. And everything I do, I do it with that lens.
[46:02] everything I do, I do it with that lens. In the case of clouds if we didn't
[46:04] In the case of clouds if we didn't support CoreWeave to exist
[46:07] support CoreWeave to exist these neo clouds, these AI clouds
[46:09] these neo clouds, these AI clouds wouldn't exist.
[46:11] wouldn't exist. If we didn't help CoreWeave exist they
[46:13] If we didn't help CoreWeave exist they would not exist.
[46:15] would not exist. If we didn't support Nscale, they
[46:17] If we didn't support Nscale, they wouldn't be where they are today. If we
[46:19] wouldn't be where they are today. If we didn't support Nebius, they wouldn't be
[46:21] didn't support Nebius, they wouldn't be where they are today. Now they are
[46:23] where they are today. Now they are they're doing fantastically. Is that a
[46:25] they're doing fantastically. Is that a business model work? No. We should do as
[46:28] business model work? No. We should do as much as needed as little as possible.
[46:30] much as needed as little as possible. And so we're trying we invest in our
[46:32] And so we're trying we invest in our ecosystem
[46:34] ecosystem because I want our eco- ecosystem to
[46:35] because I want our eco- ecosystem to thrive and I want our our I want
[46:39] thrive and I want our our I want I want the architecture and I want AI to
[46:41] I want the architecture and I want AI to be able to connect with as many
[46:44] be able to connect with as many industries as possible
[46:46] industries as possible as many countries as possible and make
[46:50] as many countries as possible and make it possible for, you know, the planet to
[46:52] it possible for, you know, the planet to be built on AI and to be built on the
[46:54] be built on AI and to be built on the American tech stack. And so so that that
[46:56] American tech stack. And so so that that vision I think is exactly what we're
[46:58] vision I think is exactly what we're pursuing. Now one of the things that
[47:00] pursuing. Now one of the things that that you mentioned
[47:02] that you mentioned um
[47:03] um there are so many great amazing
[47:05] there are so many great amazing foundation model companies and we try to
[47:06] foundation model companies and we try to invest in all of them.
[47:08] invest in all of them. And this is this is another thing that
[47:09] And this is this is another thing that we do. We don't pick winners.
[47:12] we do. We don't pick winners. And we we like we we we need to support
[47:14] And we we like we we we need to support everyone
[47:15] everyone and it's part of our part of our our our
[47:18] and it's part of our part of our our our joy of doing so. It's it's imperative to
[47:20] joy of doing so. It's it's imperative to our business, but we also go out of our
[47:22] our business, but we also go out of our way not to pick winners. And so when I
[47:24] way not to pick winners. And so when I when I invest in one of them, I invest
[47:26] when I invest in one of them, I invest in all of them.
[47:27] in all of them. Why do you go out of your way to not to
[47:28] Why do you go out of your way to not to pick winners?
[47:29] pick winners? Because it's not our job to.
[47:31] Because it's not our job to. Number one. Number two when Nvidia first
[47:34] Number one. Number two when Nvidia first started, there were
[47:36] started, there were 60 graphics companies, 60 3D graphics
[47:39] 60 graphics companies, 60 3D graphics companies. Uh we are the only one that
[47:41] companies. Uh we are the only one that survived. If you were to taken those 60
[47:43] survived. If you were to taken those 60 companies
[47:45] companies 60 graphics companies and ask yourself
[47:47] 60 graphics companies and ask yourself which one was going to make it
[47:48] which one was going to make it Nvidia would be the top of that list not
[47:51] Nvidia would be the top of that list not to make it.
[47:52] to make it. You know, this is long before you but
[47:55] You know, this is long before you but Nvidia's graphics architecture was
[47:57] Nvidia's graphics architecture was precisely wrong.
[47:58] precisely wrong. It's not a little bit wrong. We created
[48:01] It's not a little bit wrong. We created an architecture that was precisely
[48:03] an architecture that was precisely wrong.
[48:03] wrong. And and it was an impossible thing for
[48:06] And and it was an impossible thing for developers to support. It was never
[48:07] developers to support. It was never going to make it.
[48:09] going to make it. We reason about it for good reas- for
[48:11] We reason about it for good reas- for for from good first first principles,
[48:13] for from good first first principles, but we ended up in the wrong solution.
[48:15] but we ended up in the wrong solution. And and um uh everybody would have
[48:18] And and um uh everybody would have count- everybody would have counted us
[48:19] count- everybody would have counted us out.
[48:21] out. And and here we are. And so I'm I'm I'm
[48:24] And and here we are. And so I'm I'm I'm I have enough humility to recognize that
[48:27] I have enough humility to recognize that you know, don't don't pick winners. Mhm.
[48:29] you know, don't don't pick winners. Mhm. Yeah. Um Either let them all take care
[48:31] Yeah. Um Either let them all take care of themselves
[48:33] of themselves or take care of all of them. Um what one
[48:35] or take care of all of them. Um what one thing I didn't understand is
[48:37] thing I didn't understand is you said, look, we're not prioritizing
[48:38] you said, look, we're not prioritizing these, you know, clouds um just is there
[48:40] these, you know, clouds um just is there a neo cloud we want to prop them up. But
[48:43] a neo cloud we want to prop them up. But you also said
[48:44] you also said you listed a bunch of neo clouds and you
[48:45] you listed a bunch of neo clouds and you said they wouldn't exist if it wasn't
[48:46] said they wouldn't exist if it wasn't for Nvidia.
[48:47] for Nvidia. >> Yeah. And so how are those two things
[48:50] >> Yeah. And so how are those two things compatible?
[48:50] compatible? >> um first of all, they they need to want
[48:52] >> um first of all, they they need to want to exist and they come to ask us for
[48:54] to exist and they come to ask us for help. And when they when they um when
[48:57] help. And when they when they um when they want to exist and have they have a
[48:59] they want to exist and have they have a business plan and they you know, they
[49:01] business plan and they you know, they have expertise and you know, they have
[49:02] have expertise and you know, they have the passion for it uh they obviously
[49:05] the passion for it uh they obviously have to have some capability themselves
[49:07] have to have some capability themselves uh but if at the end of the day they
[49:09] uh but if at the end of the day they need some investment in order to get it
[49:11] need some investment in order to get it off the ground, uh we we would be there
[49:13] off the ground, uh we we would be there for them.
[49:14] for them. Um but but the sooner they get their
[49:17] Um but but the sooner they get their flywheel going
[49:19] flywheel going you know, your question was, do we want
[49:20] you know, your question was, do we want to be in the financing business? The
[49:22] to be in the financing business? The answer is no. Mhm. Yeah, we don't want
[49:24] answer is no. Mhm. Yeah, we don't want to be we want to we because there are
[49:26] to be we want to we because there are people in the financing business and
[49:27] people in the financing business and we'd rather work with all of the people
[49:29] we'd rather work with all of the people who are fin- in the financing business
[49:31] who are fin- in the financing business than to be a financier ourselves. And so
[49:33] than to be a financier ourselves. And so so I think the the our goal is to focus
[49:36] so I think the the our goal is to focus on what we do, keep our business model
[49:38] on what we do, keep our business model as simple as possible, support our
[49:40] as simple as possible, support our ecosystem. Um when someone like like uh
[49:43] ecosystem. Um when someone like like uh OpenAI needs an investment of $30
[49:45] OpenAI needs an investment of $30 billion scale um because it's still
[49:47] billion scale um because it's still before their IPO
[49:49] before their IPO and and uh
[49:51] and and uh um we deeply believe in them.
[49:53] um we deeply believe in them. Uh we deeply believe that I deeply
[49:56] Uh we deeply believe that I deeply believe that that they're going to be
[49:57] believe that that they're going to be they're going to be an Well, they're an
[49:59] they're going to be an Well, they're an extraordinary company already today.
[50:00] extraordinary company already today. They're going to be an incredible
[50:01] They're going to be an incredible company. Uh the world needs them to
[50:03] company. Uh the world needs them to exist. The world wants them to exist and
[50:05] exist. The world wants them to exist and wants them to exist. And and uh they
[50:08] wants them to exist. And and uh they have everything on They have the wind at
[50:09] have everything on They have the wind at their back. Let's Let's support them and
[50:11] their back. Let's Let's support them and let them scale. And so So, to those
[50:14] let them scale. And so So, to those those investments we'll do because we're
[50:17] those investments we'll do because we're They need us to do it.
[50:18] They need us to do it. And um uh but we're we're not trying to
[50:21] And um uh but we're we're not trying to do as much as possible. We're trying to
[50:22] do as much as possible. We're trying to do as little as possible.
[50:24] do as little as possible. I spend way too much time copy-pasting
[50:25] I spend way too much time copy-pasting text back and forth from Google Docs to
[50:27] text back and forth from Google Docs to chatbots. And so, I built what's
[50:29] chatbots. And so, I built what's basically a cursor for writing, which
[50:31] basically a cursor for writing, which operates the way I think an AI
[50:32] operates the way I think an AI co-researcher should operate. I can tag
[50:34] co-researcher should operate. I can tag it and it can talk with me through
[50:36] it and it can talk with me through inline comment threads and help me dig
[50:38] inline comment threads and help me dig deeper and brainstorm. I built this
[50:39] deeper and brainstorm. I built this entire thing over the weekend with
[50:40] entire thing over the weekend with Cursor and their new Composer 2 model.
[50:42] Cursor and their new Composer 2 model. With a lot of agentic coding tools, I
[50:44] With a lot of agentic coding tools, I feel like I have no idea what's going on
[50:45] feel like I have no idea what's going on under the surface. I just have to
[50:46] under the surface. I just have to relinquish control and hope for the
[50:48] relinquish control and hope for the best. But Cursor let me try a bunch of
[50:50] best. But Cursor let me try a bunch of different ideas while staying on top of
[50:51] different ideas while staying on top of the implementation. I did most of my
[50:53] the implementation. I did most of my brainstorming in the agents window. And
[50:55] brainstorming in the agents window. And after I got some basic files in place, I
[50:57] after I got some basic files in place, I used a diff window to track changes. The
[50:59] used a diff window to track changes. The few times that I needed to make a quick
[51:00] few times that I needed to make a quick tweak by hand, I just used the editor.
[51:02] tweak by hand, I just used the editor. If you want to try my AI co-researcher
[51:03] If you want to try my AI co-researcher yourself, I've linked the GitHub repo in
[51:05] yourself, I've linked the GitHub repo in the description. And if you have a tool
[51:06] the description. And if you have a tool that you've been wanting to build, you
[51:08] that you've been wanting to build, you should make it happen. Go to
[51:09] should make it happen. Go to cursor.com/swyx
[51:11] cursor.com/swyx to get started.
[51:13] to get started. Th- This may be sort of an obvious
[51:14] Th- This may be sort of an obvious question, but
[51:15] question, but we've lived many years
[51:17] we've lived many years in this situation where there's a
[51:19] in this situation where there's a shortage of GPUs and it's grown now
[51:23] shortage of GPUs and it's grown now because models are getting better. We
[51:25] because models are getting better. We have a shortage of GPUs. Yes.
[51:27] have a shortage of GPUs. Yes. >> Yeah. And
[51:29] >> Yeah. And Nvidia is known for divvying up the
[51:32] Nvidia is known for divvying up the scarce allocation not just based on high
[51:35] scarce allocation not just based on high bidder, but rather on hey, we want to
[51:37] bidder, but rather on hey, we want to make sure that these neo neo clouds
[51:39] make sure that these neo neo clouds exist. Let's give some to CoreWeave.
[51:40] exist. Let's give some to CoreWeave. Let's give some to Crusoe. Let's give
[51:42] Let's give some to Crusoe. Let's give some to Lambda. Um
[51:44] some to Lambda. Um why is it good for Nvidia? First of all,
[51:46] why is it good for Nvidia? First of all, would you agree with this
[51:47] would you agree with this characterization of fracturing the
[51:48] characterization of fracturing the market?
[51:49] >> No, your premise is just wrong. We're sufficiently mindful about these things. We're very mindful about these things.
[52:01] First of all, if you don't place a PO, all the talking in the world won't make a difference. Until we get a PO, what are we going to do? So the first thing is, we work really hard with everybody to get a forecast done, because these things take a long time to build, and the data centers take a long time to build. We align ourselves with demand and supply through forecasting. That's job number one.
[52:29] Number two, we've tried to forecast with as many people as possible, but in the final analysis, you still have to place an order. And maybe, for whatever reason, you didn't place your order. What can I do? So at some point, it's first in, first out.
[52:50] But beyond that, if you're not ready, because your data center isn't ready, or certain components aren't ready to enable you to stand up a data center, we might decide to serve another customer first. That's just maximizing the throughput of our own factory. So we might make some adjustments there. Aside from that, the prioritization is first in, first out.
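The allocation policy described above, no place in line without a PO, first in, first out by order date, with deferrals when a customer's site isn't ready, can be sketched as a toy scheduler. The names and fields here are illustrative, not Nvidia's actual system:

```python
from dataclasses import dataclass

@dataclass
class Order:
    customer: str
    po_date: int      # when the purchase order was placed; no PO, no place in line
    site_ready: bool  # data center and components ready to receive racks

def ship_sequence(orders: list[Order]) -> list[str]:
    """First in, first out by PO date, but a customer whose site isn't
    ready is deferred, keeping factory throughput maximized."""
    queue = sorted(orders, key=lambda o: o.po_date)
    ready = [o.customer for o in queue if o.site_ready]
    deferred = [o.customer for o in queue if not o.site_ready]
    return ready + deferred  # deferred customers are served once they're ready

orders = [
    Order("A", po_date=1, site_ready=False),  # ordered first, site not ready
    Order("B", po_date=2, site_ready=True),
    Order("C", po_date=3, site_ready=True),
]
print(ship_sequence(orders))  # ['B', 'C', 'A']
```

The sort is stable, so within the "ready" and "deferred" groups, PO order is preserved, which matches the "first in, first out, with adjustments" description.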
[53:19] >> Mhm.
[53:20] >> Yeah, you've got to place a PO. Now, of course, there are stories about that. For example, all of this kind of started from an article about Larry and Elon having dinner with me, where they supposedly begged for GPUs. That never happened. We absolutely had dinner, and it was a wonderful dinner. At no time did they beg for GPUs. They just had to place an order, and once they placed an order, we do our best to get the capacity to them. We're not complicated.
[53:59] >> Okay, so it sounds like there's a queue, and then, based on whether your data center is ready and when you place a purchase order, you get them at a certain time. But it still doesn't sound like the highest bidder just gets it. Is there a reason for that?
[54:12] >> We never do that. Okay? We never do that.
[54:15] >> Why not just go with the highest bidder?
[54:16] >> Because it's a bad business practice. You set your price, and then people decide to buy it or not.
[54:24] I understand that others in the chip industry change their prices when demand is higher, but we just don't. That's just never been a practice of ours. You can count on us. I prefer to be dependable, to be the foundation of the industry, so you don't need to second-guess. If I quoted you a price, we quoted you a price. That's it. And if demand goes through the roof, so be it.
[54:59] >> And on the other end, that's why you have a productive relationship with TSMC, right?
[55:05] >> Yeah, yeah. We've been doing business with them for, I guess, coming up on 30 years. And Nvidia and TSMC don't have a legal contract. There is always some rough justice. Sometimes I'm right, sometimes I'm wrong. Sometimes I got a better deal, sometimes I got a worse deal. But on the whole, the relationship is incredible. I can completely trust them. I can completely depend on them.
[55:35] And one of the things you can count on with Nvidia is that this year, Vera Rubin is going to be incredible. Next year, Vera Rubin Ultra will come. The year after that, Feynman will come. And the year after that, I haven't introduced the name yet. So every single year, you can count on us.
[55:57] You're going to have to go find another ASIC team in the world. Pick your ASIC team. Where can you say, "I can bet my entire business that you will be here for me every single year, that your token cost will decrease by an order of magnitude every single year; I can count on it like clockwork"? Well, I just said something about TSMC. There's no other foundry in history you could possibly say that about. You can say that about Nvidia today. You can count on us every single year.
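The claim above, that token cost drops by an order of magnitude with each annual generation, compounds quickly. A toy calculation with a hypothetical $1.00 starting cost (not real pricing) shows the effect across the named roadmap:

```python
# Hedged arithmetic for "token cost decreases by an order of magnitude
# every single year": a hypothetical starting cost, not real pricing.
cost = 1.00  # $ per some unit of tokens in year 0 (illustrative)
for year, generation in enumerate(
    ["Vera Rubin", "Vera Rubin Ultra", "Feynman"], start=1
):
    cost /= 10  # one order of magnitude per annual generation, per the claim
    print(f"Year {year} ({generation}): ${cost:.4f}")
# After three annual generations, the same tokens cost 1/1000 of the original.
```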
[56:33] If you would like to buy a billion dollars' worth of AI factory compute, no problem. If you'd like to buy $100 million, no problem. If you'd like to buy $10 million, or just one rack, not a problem. Or just one graphics card? Okay, no problem. If you would like to place an order for a $100 billion AI factory, no problem. We're the only company in the world you can say that about today.
[56:58] I can say that about TSMC as well. I want to buy one billion? No problem. We just have to go through the process of planning for it, and all the things that mature people do.
[57:11] And so I think this ability for Nvidia to be the foundation of the world's AI industry is a position that has taken us a couple of decades to arrive at. Enormous commitment, enormous dedication. And the stability of our company, the consistency of our company, is really, really important.
[57:35] >> Okay, I want to ask you about China. I actually don't know what I think about whether it's good to sell chips to China or not, but I like to play devil's advocate against my guests. So when Dario, who supports export controls, was on, I asked him, "Well, why can't America and China both have countries of geniuses in a data center?" Since you're on the opposite side, I'll ask you the opposite way.
[57:57] One way to think about it is that Anthropic announced Mythos in preview a couple of days ago, a model they're not even releasing publicly, because they say it has such cyber-offensive capabilities that the world isn't ready until they make sure these zero-days are patched up. But they say it found thousands of high-severity vulnerabilities across every major operating system and every browser. It found one in OpenBSD, an operating system that has been specifically designed to not have zero-days in the 27 years it's existed. So if Chinese companies, Chinese labs, and the Chinese government had access to the AI chips to train a model like Claude Mythos, with these cyber-offensive capabilities, and run millions of instances of it with more compute, the question is: is that a threat to American companies, to American national security?
[58:46] >> First of all, Mythos was trained on fairly mundane capacity, and a fairly mundane amount of it, by an extraordinary company. The amount of capacity and the type of compute it was trained on is abundantly available in China.
[59:08] So you just have to first realize that chips exist in China. They manufacture 60% of the world's mainstream chips, maybe more. It's a very large industry for them. They have some of the world's greatest computer scientists. As you know, most of the AI researchers in all of these AI labs, most of them are Chinese. They have 50% of the world's AI researchers.
[59:39] So the question is, if you're concerned about them, what do you do, considering all the assets they already have? They have an abundance of energy. They have plenty of chips. They've got most of the AI researchers. If you're worried about them, what is the best way to create a safe world?
[01:00:01] Well, victimizing them, turning them into an enemy, likely isn't the best answer. They are an adversary. We want the United States to win. But I think having a dialogue, and having a research dialogue, is probably the safest thing to do. This is an area that is glaringly missing because of our current attitude about China as an adversary. It is essential that our AI researchers and their AI researchers are actually talking. It is essential that we try to agree on what not to use AI for.
[01:00:46] With respect to finding bugs in software: of course, that's what AI is supposed to do. Is it going to find bugs in a lot of software? Of course. There are lots and lots of bugs. There are lots of bugs in the AI software itself. And so that's what AI is supposed to do, and I'm delighted that AI has reached the level where it can help us be so much more productive.
[01:01:11] One of the things that is underemphasized is the richness of the ecosystem around cybersecurity, AI cybersecurity, AI security, AI privacy, and AI safety. That whole ecosystem of AI startups is trying to create this future for us, where you have one incredible AI agent surrounded by thousands of AI agents keeping it safe, keeping it secure. That future surely is going to happen. And the idea that you're going to have an AI agent running around with nobody watching after it is kind of insane.
[01:01:56] And so we know very well that this ecosystem needs to thrive. It turns out this ecosystem needs open source. This ecosystem needs open models. They need open stacks, so that all of these AI researchers and all of these great computer scientists can go build AI systems that are as formidable and can keep AI safe. And so one of the things we need to make sure we do is keep the open source ecosystem vibrant. That can't be ignored, and a lot of that is coming out of China. We ought not to suffocate that.
[01:02:41] You know, with respect to China, of course we want the United States to have as much computing as possible. We're limited by energy, but we've got a lot of people working on that, and we ought not to make energy a bottleneck for our country. But what we also want is to make sure that all the AI developers in the world are developing on the American tech stack, and making the contributions, the advancements of AI, especially when it's open source, available to the American ecosystem.
[01:03:17] And it would be extremely foolish to create two ecosystems: an open source ecosystem that only runs on a foreign tech stack, and a closed ecosystem that runs on the American tech stack. I think that would be a horrible outcome for the United States.
[01:03:36] >> Since there are a lot of things there, let me just triage the response. I think the concern, going back to the flop difference in the hacking scenario, is this: yes, they have compute, but there are estimates that because they're at 7 nanometer, because they don't have EUV due to chip-making export controls, the amount of flops they're able to actually produce is about 1/10 of what the US has. So could they eventually train a model like Mythos? Yes. But because we have more flops, American labs are able to get to these levels of capability first. And because Anthropic got there first, they can say, "Okay, we're going to hold on to it for a month while we give these American companies access to it, they patch up all their vulnerabilities, and then we release it."
[01:04:21] Furthermore, even if they trained a model like this, the ability to deploy it at scale matters: a cyber hacker is much more dangerous with a million instances than with a thousand. So that inference compute really matters a lot. And in fact, the fact that they have so many AI researchers who are so good is the thing that makes it so scary, because what is it that makes those researchers more productive? It's compute. If you talk to any AI lab in America, they say the thing bottlenecking them is compute. There are quotes from DeepMind's founder and from OpenAI leadership saying the thing we're bottlenecked on is compute.
[01:04:54] So then the question is: isn't it better that American companies, because they have more compute, get to Sparrow- or Mythos-level capabilities first and prepare our society for it, before China can get there, because they have less compute?
[01:05:09] >> We should always be first, and we should always have more. But in order for the outcome you described to be true, you have to take it to the extreme: they have to have no compute. And if they have some compute, the question is how much is needed. The amount of compute they have in China is enormous. I mean, you're talking about the second largest computing market in the world. If they want to aggregate their compute, they've got plenty of compute to aggregate.
[01:05:44] >> But is that true? People do these estimates, and they say, "Well, SMIC is actually behind on the process node," so they actually...
[01:05:50] >> I'm about to tell you. The amount of energy they have is incredible, isn't that right? AI is a parallel computing problem, isn't it? Why can't they just put ten times as many chips together? Because energy is free. They have so much energy. They have data centers that are sitting completely empty, fully powered. They have ghost cities; they have ghost data centers. They have so much infrastructure capacity. If they wanted to, they could just gang up more chips, even though they're 7 nanometer. And their capacity for building chips is one of the largest in the world. The semiconductor industry knows that they monopolize mainstream chips; they have overcapacity, too much capacity. So the idea that China won't be able to have AI chips is complete nonsense.
[01:06:40] Now, of course, if you ask me, would the United States be further ahead if the entire world had no compute at all? But that's just not an outcome. That's not a scenario that's true. They have plenty of compute already. Whatever threshold they'd need for the concern you're worried about, they've already reached it and beyond.
[01:07:04] And so I think you misunderstand: AI is a five-layer cake.
[01:07:10] And at the lowest layer is energy.
[01:07:12] When you have an abundance of energy, it makes up for chips. If you have an abundance of chips, it makes up for energy.
[01:07:21] energy. For example uh United States is scarce on energy.
[01:07:24] uh United States is scarce on energy. Which is the reason why Nvidia has to
[01:07:26] Which is the reason why Nvidia has to keep advancing our architecture and do
[01:07:28] keep advancing our architecture and do this extreme co-design so that with the
[01:07:31] this extreme co-design so that with the few chips that we ship
[01:07:34] few chips that we ship okay, with the few chips because the
[01:07:35] okay, with the few chips because the amount of energy is so limited our
[01:07:37] amount of energy is so limited our throughput per watt is off the charts.
[01:07:41] throughput per watt is off the charts. But if your amount of watts is
[01:07:43] But if your amount of watts is completely abundant, it's free
[01:07:45] completely abundant, it's free what do you care about performance per
[01:07:47] what do you care about performance per watt for?
[01:07:48] watt for? You got plenty you can use old chips to
[01:07:50] You got plenty you can use old chips to do so. So 7 nanometer 7 nanometer chips
[01:07:54] do so. So 7 nanometer 7 nanometer chips are essentially Hopper.
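The energy-versus-chips tradeoff Huang describes can be put as back-of-the-envelope arithmetic. This is my own sketch with made-up numbers (the function and all figures are hypothetical, not Nvidia data): under a tight power cap, throughput per watt is decisive; with abundant power, sheer chip count can compensate for older, less efficient silicon.

```python
# Sketch of the energy-vs-chips tradeoff (illustrative numbers only):
# a cluster's token throughput is limited by whichever binds first,
# the number of chips available or the power budget to run them.

def cluster_throughput(chips, tokens_per_s_per_chip, watts_per_chip, power_budget_w):
    """Tokens/s from a homogeneous cluster, limited by chips or power."""
    powered_chips = min(chips, power_budget_w // watts_per_chip)
    return powered_chips * tokens_per_s_per_chip

# Hypothetical: an advanced chip is 4x faster at the same wattage,
# but the older chips sit behind a much larger power budget.
advanced = cluster_throughput(1_000, 400, 1_000, 1_000_000)  # scarce energy
older = cluster_throughput(8_000, 100, 1_000, 8_000_000)     # abundant energy
print(advanced, older)  # 400000 800000: more old chips beat fewer new ones
```

The point is only directional: once the power term stops binding, performance per watt stops being the figure of merit, which is the argument being made about China's energy abundance.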
[01:07:57] are essentially Hopper. The ability to for Hopper um
[01:08:00] The ability to for Hopper um I got to tell you
[01:08:02] I got to tell you today's models are largely trained on
[01:08:04] today's models are largely trained on Hopper.
[01:08:05] Hopper. Yeah, Hopper generation. And so so
[01:08:07] Yeah, Hopper generation. And so so Hopper is 7 nanometer chips are plenty
[01:08:09] Hopper is 7 nanometer chips are plenty good. The abundance of energy is their
[01:08:11] good. The abundance of energy is their advantage. But then there's a question
[01:08:13] advantage. But then there's a question of okay, well, can they actually
[01:08:15] of okay, well, can they actually manufacture
[01:08:17] manufacture enough chips given their But they do. Uh
[01:08:20] enough chips given their But they do. Uh uh what's what's the evidence? Huawei
[01:08:22] uh what's what's the evidence? Huawei just had the largest single year in the
[01:08:25] just had the largest single year in the history of the company.
[01:08:26] history of the company. How many chips did they ship? A ton.
[01:08:28] How many chips did they ship? A ton. Millions.
[01:08:29] Millions. Millions is way more
[01:08:31] Millions is way more way more than Anthropic has.
[01:08:35] So there's a question of how much logic SMIC can ship. Then there's a question of how much memory...
[01:08:39] >> I'll tell you what it is. They have plenty of logic, and they have plenty of HBM2 memory.
[01:08:43] Right, but as you know, the bottleneck in training and doing inference on these models is often the amount of bandwidth.
[01:08:50] So with HBM2, I don't know the numbers offhand, but versus the newest thing you have, you can be almost an order of magnitude apart in memory bandwidth, which is...
[01:08:59] >> Huawei is a networking company. Huawei is a networking company.
[01:09:03] But that doesn't change the fact that you need EUV for the most advanced HBM.
[01:09:06] >> Not true. Not at all true.
[01:09:10] You could gang them together, just like we gang them together with NVLink 72.
[01:09:14] They've already demonstrated silicon photonics connecting all of this compute together into one giant supercomputer.
[01:09:21] Your premise is just wrong.
[01:09:25] The fact of the matter is, their AI development is going just fine.
[01:09:29] AI development is going just fine. And and the best AI researchers in the
[01:09:31] And and the best AI researchers in the world
[01:09:33] world because they are limited in compute,
[01:09:35] because they are limited in compute, they also come up with extremely smart
[01:09:38] they also come up with extremely smart algorithms. Remember I just what I said.
[01:09:41] algorithms. Remember I just what I said. I said that Moore's law is advancing
[01:09:43] I said that Moore's law is advancing about 25% per year.
[01:09:45] about 25% per year. However, through great computer science,
[01:09:48] However, through great computer science, we could still improve algorithm
[01:09:50] we could still improve algorithm performance by 10x.
[01:09:52] performance by 10x. What I'm saying is great computer
[01:09:54] What I'm saying is great computer science
[01:09:56] science is where is
[01:09:58] is where is There is no question. MOE is a great
[01:10:01] There is no question. MOE is a great invention. There's no question. All the
[01:10:04] invention. There's no question. All the incredible attention mechanisms reduce
[01:10:07] incredible attention mechanisms reduce the amount of compute.
[01:10:09] the amount of compute. We have got to acknowledge that most of
[01:10:12] We have got to acknowledge that most of the advance advances in AI came out of
[01:10:15] the advance advances in AI came out of algorithm advances, not just the raw
[01:10:18] algorithm advances, not just the raw hardware.
[01:10:19] hardware. Now, if most of the advances came from
[01:10:22] Now, if most of the advances came from algorithms and computer science and
[01:10:24] algorithms and computer science and programming,
[01:10:25] programming, tell me that their army of AI
[01:10:28] tell me that their army of AI researchers is not their fundamental
[01:10:30] researchers is not their fundamental advantage. And we see it.
[01:10:32] advantage. And we see it. DeepSeek is not inconsequential advance.
[01:10:35] DeepSeek is not inconsequential advance. And the day that DeepSeek comes out on
[01:10:37] And the day that DeepSeek comes out on Huawei first,
[01:10:40] Huawei first, that is a horrible outcome for our
[01:10:41] that is a horrible outcome for our nation.
[01:10:43] nation. Why is that? Cuz I mean currently you
[01:10:44] Why is that? Cuz I mean currently you can have a model like DeepSeek that can
[01:10:46] can have a model like DeepSeek that can run on any accelerator if it's open
[01:10:47] run on any accelerator if it's open source. Why why would that stop being
[01:10:49] source. Why why would that stop being the case in the future? Well, suppose it
[01:10:51] the case in the future? Well, suppose it doesn't. Suppose it optimized for
[01:10:53] doesn't. Suppose it optimized for Huawei. Suppose it optimized for their
[01:10:54] Huawei. Suppose it optimized for their architecture.
[01:10:56] architecture. It would put others at a disadvantage.
[01:10:58] It would put others at a disadvantage. You you describe the situation that I
[01:11:01] You you describe the situation that I conceived I perceived to be good news.
[01:11:04] conceived I perceived to be good news. That
[01:11:05] That that
[01:11:06] that a company develops software, developed
[01:11:08] a company develops software, developed an AI model, and it runs best on the
[01:11:10] an AI model, and it runs best on the American tech stack.
[01:11:12] American tech stack. I saw that as good news.
[01:11:14] I saw that as good news. You you set it up as a premise that it
[01:11:16] You you set it up as a premise that it was bad news. I'm going to give you the
[01:11:18] was bad news. I'm going to give you the bad news.
[01:11:19] bad news. That AI models around the world are
[01:11:21] That AI models around the world are developed, and they run best on not
[01:11:24] developed, and they run best on not American hardware.
[01:11:27] That is bad news for us.
[01:11:29] I guess I just don't see the evidence that there are these huge disparities that would prevent you from switching accelerators. American labs are running their models across all the clouds, across all the accelerators.
[01:11:38] >> You take a model that's optimized for Nvidia, and you try to run it on something else...
[01:11:41] But American labs do that.
[01:11:43] And they don't run better. Nvidia's success is perfect evidence. The fact that AI models are created on our stack, and run best on our stack, how is that illogical to understand?
[01:11:55] I'm just looking... look, Anthropic's models are run on GPUs, they're run on Trainium, they're run on TPUs.
[01:12:02] run on TPUs. >> of work has to go into it to change. But
[01:12:04] >> of work has to go into it to change. But go to the global south, go to the Middle
[01:12:06] go to the global south, go to the Middle East. Coming out of the box, if all of
[01:12:08] East. Coming out of the box, if all of the AI models run best on somebody
[01:12:10] the AI models run best on somebody else's tech stack, you've got you've got
[01:12:13] else's tech stack, you've got you've got to be arguing some ridiculous claim
[01:12:15] to be arguing some ridiculous claim right now that that's a good thing for
[01:12:17] right now that that's a good thing for the United States. But it I I guess I
[01:12:18] the United States. But it I I guess I don't understand the argument. So like
[01:12:20] don't understand the argument. So like if if say
[01:12:21] if if say Chinese companies get to the next mythos
[01:12:23] Chinese companies get to the next mythos first, they find that all the security
[01:12:24] first, they find that all the security vulnerabilities in American software
[01:12:25] vulnerabilities in American software first, but they can do it on Nvidia
[01:12:28] first, but they can do it on Nvidia hardware and they ship it to the global
[01:12:29] hardware and they ship it to the global south that does it on Nvidia hardware.
[01:12:31] south that does it on Nvidia hardware. Like how how how is that how is that
[01:12:32] Like how how how is that how is that good? I mean, I just okay, they run it
[01:12:34] good? I mean, I just okay, they run it on Nvidia hardware.
[01:12:35] on Nvidia hardware. >> It's not good.
[01:12:36] >> It's not good. It's not good, so let's not let it
[01:12:38] It's not good, so let's not let it happen.
[01:12:39] happen. Why do you think it's perfectly fungible
[01:12:40] Why do you think it's perfectly fungible that if you didn't ship them computer,
[01:12:41] that if you didn't ship them computer, it would exactly be replaced by Huawei?
[01:12:42] it would exactly be replaced by Huawei? They are behind, right? They have they
[01:12:45] They are behind, right? They have they have worse chips than you. It's
[01:12:46] have worse chips than you. It's completely there's evidence right now.
[01:12:48] completely there's evidence right now. Their chip industry is gigantic. You can
[01:12:50] Their chip industry is gigantic. You can just look at the flop or bandwidth or
[01:12:51] just look at the flop or bandwidth or memory comparisons between the H200 and
[01:12:54] memory comparisons between the H200 and the Huawei 910C. It's like half half a
[01:12:56] the Huawei 910C. It's like half half a third.
[01:12:56] third. >> of it. They use twice as many. I guess
[01:12:59] >> of it. They use twice as many. I guess it seems like your argument is they have
[01:13:00] it seems like your argument is they have all this energy that's ready to go,
[01:13:01] all this energy that's ready to go, right? And they need to fill it with
[01:13:02] right? And they need to fill it with chips. And they're good at
[01:13:03] chips. And they're good at manufacturing.
[01:13:04] manufacturing. >> And I'm sure eventually they would be
[01:13:05] >> And I'm sure eventually they would be able to just
[01:13:07] able to just out manufacture everybody, but there's
[01:13:08] out manufacture everybody, but there's these few critical years. What what is
[01:13:11] these few critical years. What what is the critical year you're talking about?
[01:13:13] the critical year you're talking about? These next few years. You've got these
[01:13:14] These next few years. You've got these models that are going to be able to do
[01:13:15] models that are going to be able to do all the cyber attacks.
[01:13:16] all the cyber attacks. >> If the critical years the next critical
[01:13:17] >> If the critical years the next critical years is critical, then we have to make
[01:13:19] years is critical, then we have to make sure that all of the world's AI models
[01:13:21] sure that all of the world's AI models are built on American tech stack.
[01:13:24] are built on American tech stack. These critical years. Okay, how would
[01:13:26] These critical years. Okay, how would that prevent if they're built on
[01:13:28] that prevent if they're built on American tech stack, how would that
[01:13:29] American tech stack, how would that prevent them from if they have more
[01:13:30] prevent them from if they have more advanced capabilities from launching the
[01:13:32] advanced capabilities from launching the mythos equivalent cyber attacks on us?
[01:13:34] mythos equivalent cyber attacks on us? >> There's no guarantee either way.
[01:13:35] >> There's no guarantee either way. But if you have it early, we can prepare
[01:13:37] But if you have it early, we can prepare for it.
[01:13:38] for it. Listen.
[01:13:40] Listen. Why are you why are you causing one
[01:13:43] Why are you why are you causing one layer of the AI industry
[01:13:46] layer of the AI industry to lose an entire market
[01:13:49] to lose an entire market so that you could
[01:13:51] so that you could benefit another layer of the AI
[01:13:53] benefit another layer of the AI industry? There's five layers.
[01:13:55] industry? There's five layers. And every single layer has to succeed.
[01:13:58] And every single layer has to succeed. The the the layer that has to succeed
[01:14:00] The the the layer that has to succeed most is actually the AI applications.
[01:14:04] Why are you so fixated on that AI model layer, that one company? For what reason?
[01:14:10] Because those models make possible these incredibly offensive capabilities, and you need compute for that.
[01:14:16] >> The chips, the ecosystem of AI researchers, make it possible.
[01:14:21] researchers make it possible. A few months ago, Jane Street spent
[01:14:23] A few months ago, Jane Street spent about 20,000 GPU hours trading backdoors
[01:14:25] about 20,000 GPU hours trading backdoors into three different language models.
[01:14:27] into three different language models. Then the challenge my audience to find
[01:14:28] Then the challenge my audience to find the trigger phrases. I just caught up
[01:14:30] the trigger phrases. I just caught up with Derek St. John who designed the
[01:14:31] with Derek St. John who designed the puzzle about some of the solutions that
[01:14:32] puzzle about some of the solutions that Jane Street received. If you think the
[01:14:35] Jane Street received. If you think the base model was here and the backdoor
[01:14:36] base model was here and the backdoor model was here, you can kind of linearly
[01:14:38] model was here, you can kind of linearly interpolate the weights to like adjust
[01:14:41] interpolate the weights to like adjust the strength of the backdoor, but you
[01:14:42] the strength of the backdoor, but you can also extrapolate it to make the
[01:14:43] can also extrapolate it to make the backdoor even stronger. And in some
[01:14:45] backdoor even stronger. And in some cases, if you make it strong enough, the
[01:14:47] cases, if you make it strong enough, the model will just regurgitate what the
[01:14:50] model will just regurgitate what the response phrase was supposed to be. So
[01:14:51] response phrase was supposed to be. So if you keep amplifying the difference
[01:14:52] if you keep amplifying the difference between the base version and the
[01:14:54] between the base version and the backdoor version, eventually it should
[01:14:56] backdoor version, eventually it should spit out the trigger phrase. But this
[01:14:58] spit out the trigger phrase. But this technique only worked on two out of the
[01:14:59] technique only worked on two out of the three models. Even Derek St. John isn't
[01:15:01] three models. Even Derek St. John isn't sure why it didn't work on the other.
[01:15:02] sure why it didn't work on the other. Being able to verify that a model only
[01:15:04] Being able to verify that a model only does what you think it does is one of
[01:15:05] does what you think it does is one of the most important open questions in AI
[01:15:06] the most important open questions in AI security. If this is the kind of problem
[01:15:08] security. If this is the kind of problem that excites you, Jane Street is hiring
[01:15:10] that excites you, Jane Street is hiring researchers and engineers. Go to
[01:15:12] researchers and engineers. Go to JaneStreet.com/torkash
[01:15:14] JaneStreet.com/torkash to learn more.
[01:15:16] to learn more. Okay, stepping back, it has to be the
[01:15:17] Okay, stepping back, it has to be the case that
[01:15:18] case that China is able to build enough 7
[01:15:19] China is able to build enough 7 nanometer capacity. And remember,
[01:15:21] nanometer capacity. And remember, they're still stuck on 7 nanometer while
[01:15:22] they're still stuck on 7 nanometer while you'll move on to 3 nanometer and then 2
[01:15:24] you'll move on to 3 nanometer and then 2 nanometer or 1.6 nanometer refinement.
[01:15:26] nanometer or 1.6 nanometer refinement. So while you're on 1.6 nanometer,
[01:15:28] So while you're on 1.6 nanometer, they're still going to be on 7
[01:15:29] they're still going to be on 7 nanometer.
[01:15:30] nanometer. And they have to produce enough of it to
[01:15:32] And they have to produce enough of it to make up for the shortfall.
[01:15:34] make up for the shortfall. And they have so much energy that the
[01:15:35] And they have so much energy that the more chips you give them, the more
[01:15:36] more chips you give them, the more compute they'd have, right? Like so I
[01:15:40] compute they'd have, right? Like so I just there's it comes down to the
[01:15:41] just there's it comes down to the question of ultimately they are getting
[01:15:42] question of ultimately they are getting more compute. Computers run in input to
[01:15:44] more compute. Computers run in input to training and inference.
[01:15:45] training and inference. >> I just think you you speak in absolutes.
[01:15:47] >> I just think you you speak in absolutes. Um I think the United States ought to be
[01:15:49] Um I think the United States ought to be ahead.
[01:15:50] ahead. The amount of compute in the United
[01:15:52] The amount of compute in the United States is 100 times more than anywhere
[01:15:56] States is 100 times more than anywhere else in the world.
[01:15:57] else in the world. The United States ought to be ahead.
[01:15:59] The United States ought to be ahead. Okay? The United States is ahead.
[01:16:02] Okay? The United States is ahead. Nvidia builds the most advanced
[01:16:03] Nvidia builds the most advanced technologies. We make sure that the US
[01:16:05] technologies. We make sure that the US labs are the first to hear about it and
[01:16:07] labs are the first to hear about it and the first chance to buy it.
[01:16:09] the first chance to buy it. And if they don't have enough money, we
[01:16:11] And if they don't have enough money, we even invest in them.
[01:16:13] even invest in them. The United States ought to be ahead.
[01:16:16] The United States ought to be ahead. We want to do everything we can to make
[01:16:17] We want to do everything we can to make sure the United States is ahead.
[01:16:20] sure the United States is ahead. Number one point. Do you agree? And
[01:16:22] Number one point. Do you agree? And we're doing everything we can to do
[01:16:24] we're doing everything we can to do that. But how is shipping chips to China
[01:16:26] that. But how is shipping chips to China keeping the US ahead if they're
[01:16:27] keeping the US ahead if they're bottlenecked
[01:16:27] bottlenecked >> got
[01:16:28] >> got we got Vera Rubin for United States.
[01:16:31] we got Vera Rubin for United States. We have Vera Rubin for United States.
[01:16:33] We have Vera Rubin for United States. Now, United States, am I in United
[01:16:35] Now, United States, am I in United States? Do you consider me part of
[01:16:37] States? Do you consider me part of United States?
[01:16:38] United States? Yes. You know, Nvidia. You consider you
[01:16:40] Yes. You know, Nvidia. You consider you Nvidia a United States company. Okay.
[01:16:43] Nvidia a United States company. Okay. Number one.
[01:16:45] Number one. Why is it
[01:16:47] Why is it that we don't come up with a regulation
[01:16:48] that we don't come up with a regulation that's more balanced so that Nvidia can
[01:16:51] that's more balanced so that Nvidia can win
[01:16:52] win around the world instead of giving up
[01:16:55] around the world instead of giving up the world?
[01:16:56] the world? Why would you want United States to give
[01:16:58] Why would you want United States to give up the world? The chip industry is part
[01:17:01] up the world? The chip industry is part of the American ecosystem. It's part of
[01:17:03] of the American ecosystem. It's part of American technology leadership. It's
[01:17:06] American technology leadership. It's part of the AI ecosystem. It's part of
[01:17:08] part of the AI ecosystem. It's part of AI leadership. Why why is it that your
[01:17:12] AI leadership. Why why is it that your policy, your philosophy leads to
[01:17:16] policy, your philosophy leads to United States giving up a vast part of
[01:17:19] United States giving up a vast part of the world's market?
[01:17:20] the world's market? >> the the claim here is
[01:17:22] >> the the claim here is I'll frame Dario had this quote where he
[01:17:24] I'll frame Dario had this quote where he said, "It's like Boeing bragging that
[01:17:26] said, "It's like Boeing bragging that we're selling North Korean nukes, but
[01:17:27] we're selling North Korean nukes, but the missile casings are made by Boeing,
[01:17:29] the missile casings are made by Boeing, and that's somehow enabling the US
[01:17:31] and that's somehow enabling the US technology stack." Like fundamentally,
[01:17:32] technology stack." Like fundamentally, you're giving them this capability.
[01:17:34] you're giving them this capability. >> Comparing AI to anything that you just
[01:17:35] >> Comparing AI to anything that you just mentioned is lunacy. But AI is similar
[01:17:38] mentioned is lunacy. But AI is similar to enriched uranium, right? And then it
[01:17:40] to enriched uranium, right? And then it can have positive uses, it can have
[01:17:41] can have positive uses, it can have negative uses. We still don't want to
[01:17:43] negative uses. We still don't want to send enriched uranium to other
[01:17:45] send enriched uranium to other countries.
[01:17:46] countries. Who's who's sending enriched
[01:17:48] Who's who's sending enriched >> The the analogy here is enriched uranium
[01:17:50] >> The the analogy here is enriched uranium is like
[01:17:50] is like >> a lousy [clears throat] it's a lousy
[01:17:52] >> a lousy [clears throat] it's a lousy analogy.
[01:17:53] analogy. It's a illogical analogy. But if it's if
[01:17:56] It's a illogical analogy. But if it's if that compute can run a model that can do
[01:17:59] that compute can run a model that can do zero-day exploits against all American
[01:18:00] zero-day exploits against all American software, how is that not
[01:18:03] software, how is that not a weapon? First of all, we ought to the
[01:18:05] a weapon? First of all, we ought to the way to solve that problem is to have
[01:18:07] way to solve that problem is to have dialogues with the researchers and
[01:18:08] dialogues with the researchers and dialogues with China and dialogues with
[01:18:09] dialogues with China and dialogues with other countries to make sure that people
[01:18:11] other countries to make sure that people don't use technology in that way.
[01:18:14] don't use technology in that way. That's a dialogue that has to happen,
[01:18:16] That's a dialogue that has to happen, okay? Number number one, number two, um
[01:18:19] okay? Number number one, number two, um we also need to make sure the United
[01:18:21] we also need to make sure the United States is ahead. Everything that Rubin
[01:18:25] States is ahead. Everything that Rubin Vera Rubin
[01:18:26] Vera Rubin Blackwell is available in United States
[01:18:28] Blackwell is available in United States in abundance.
[01:18:30] in abundance. Mounts of it obviously are are our
[01:18:32] Mounts of it obviously are are our results would show it. Abundance, a tons
[01:18:34] results would show it. Abundance, a tons of it. Tons of it. The amount of
[01:18:36] of it. Tons of it. The amount of computing we have is is great. We have
[01:18:38] computing we have is is great. We have amazing AI researchers here. It's great.
[01:18:41] amazing AI researchers here. It's great. We ought to stay ahead.
[01:18:42] We ought to stay ahead. However, we also have to recognize that
[01:18:45] However, we also have to recognize that AI is not just a model. That AI is a
[01:18:48] AI is not just a model. That AI is a five-layer cake.
[01:18:50] five-layer cake. That AI industry matters across every
[01:18:53] That AI industry matters across every single layer, and we want United States
[01:18:55] single layer, and we want United States to win at every single layer, including
[01:18:57] to win at every single layer, including the chip layer. And conceding the entire
[01:19:00] the chip layer. And conceding the entire market
[01:19:01] market is not going to allow United States to
[01:19:03] is not going to allow United States to win the technology race long-term in the
[01:19:06] win the technology race long-term in the chip layer, in the computing stack. That
[01:19:08] chip layer, in the computing stack. That is just a fact. I guess that then the
[01:19:11] is just a fact. I guess that then the crux comes down to how does selling them
[01:19:13] crux comes down to how does selling them chips now help us win in the long term?
[01:19:16] chips now help us win in the long term? Like
[01:19:17] Like Tesla sold extremely good electric
[01:19:19] Tesla sold extremely good electric vehicles to China for a long time.
[01:19:21] vehicles to China for a long time. iPhones are sold in China, extremely
[01:19:22] iPhones are sold in China, extremely good. They didn't cost them lock-in.
[01:19:24] good. They didn't cost them lock-in. China will still make their version of
[01:19:27] China will still make their version of EVs and they're dominating in
[01:19:28] EVs and they're dominating in smartphones. They're
[01:19:28] smartphones. They're >> When we started the conversation today,
[01:19:30] >> When we started the conversation today, you would you would acknowledge and you
[01:19:32] you would you would acknowledge and you acknowledged that Nvidia's position is
[01:19:35] acknowledged that Nvidia's position is very different.
[01:19:38] very different. You used words like moat.
[01:19:39] You used words like moat. The single most important thing to our
[01:19:41] The single most important thing to our company is our richness of our
[01:19:42] company is our richness of our ecosystem, which is about developers.
[01:19:46] ecosystem, which is about developers. 50% of the AI developers are in China.
[01:19:49] 50% of the AI developers are in China. We don't want to We shouldn't The United
[01:19:50] We don't want to We shouldn't The United States should not give that up.
[01:19:53] States should not give that up. But we have a lot of Nvidia developers
[01:19:55] But we have a lot of Nvidia developers in the US and that doesn't prevent
[01:19:56] in the US and that doesn't prevent American labs from also being able to
[01:19:57] American labs from also being able to use Excel other accelerators in the
[01:19:58] use Excel other accelerators in the future. In fact, right now they're using
[01:20:00] future. In fact, right now they're using other accelerators as well, which is
[01:20:02] other accelerators as well, which is fine and great.
[01:20:03] fine and great. I don't I don't see why that wouldn't be
[01:20:04] I don't I don't see why that wouldn't be the case in China as well. If you sell
[01:20:05] the case in China as well. If you sell them Nvidia chips, just the same way
[01:20:07] them Nvidia chips, just the same way that Google can use TPUs and Nvidia
[01:20:09] that Google can use TPUs and Nvidia >> to keep innovating and, you know, as you
[01:20:11] >> to keep innovating and, you know, as you as you probably know, our share is
[01:20:14] as you probably know, our share is growing, not decreasing. The premise
[01:20:17] growing, not decreasing. The premise that
[01:20:18] that even if we competed in China, that we're
[01:20:20] even if we competed in China, that we're going to lose that market anyways.
[01:20:25] I don't... You're not talking to somebody who woke up a loser.
[01:20:29] And that loser attitude, that loser premise, makes no sense to me.
[01:20:34] We are not a car. We are not a car.
[01:20:40] The fact that I can buy this car brand one day and use another car brand another day, easy. Computing is not like that.
[01:20:49] There's a reason why x86 still exists. There's a reason why ARM is so sticky. These ecosystems are hard to replace. It costs an enormous amount of time and energy, and most people don't want to do it.
[01:21:01] most people don't want to do it. And so it's it's our job to continue to
[01:21:04] And so it's it's our job to continue to nurture that ecosystem, to keep
[01:21:06] nurture that ecosystem, to keep advancing the technology
[01:21:08] advancing the technology so that we could compete in the
[01:21:09] so that we could compete in the marketplace. Conceding a marketplace
[01:21:12] marketplace. Conceding a marketplace based on the premise you described, I
[01:21:14] based on the premise you described, I simply can't acknowledge that. It makes
[01:21:16] simply can't acknowledge that. It makes no sense.
[01:21:17] no sense. Because I don't think United States is a
[01:21:19] Because I don't think United States is a loser. You
[01:21:21] loser. You Our industry is not a loser. And that
[01:21:24] Our industry is not a loser. And that that losing proposition, that losing
[01:21:26] that losing proposition, that losing mindset, makes no sense to me. Okay,
[01:21:28] mindset, makes no sense to me. Okay, I'll move on. I just I just want to make
[01:21:30] I'll move on. I just I just want to make sure
[01:21:30] sure >> have to move on. I'm enjoying it.
[01:21:32] >> have to move on. I'm enjoying it. >> Okay, great. Then I then I I will.
[01:21:35] >> Okay, great. Then I then I I will. I appreciate that.
[01:21:37] I appreciate that. But I think that maybe the crux and
[01:21:39] But I think that maybe the crux and thanks for walking around in circles
[01:21:41] thanks for walking around in circles with me because then I think it helps
[01:21:42] with me because then I think it helps bring out what the crux here is. The
[01:21:43] bring out what the crux here is. The crux is you're going to extremes. Your
[01:21:46] crux is you're going to extremes. Your argument starts from extremes. That if
[01:21:48] argument starts from extremes. That if we give them any compute at all
[01:21:51] we give them any compute at all in this narrow moment, we will lose
[01:21:53] in this narrow moment, we will lose everything. No, I think what my argument
[01:21:56] everything. No, I think what my argument is
[01:21:56] is >> Those extremes, they're they're
[01:21:58] >> Those extremes, they're they're childish. They're childish. Yeah.
[01:22:00] childish. They're childish. Yeah. The idea is not that there is some key
[01:22:04] The idea is not that there is some key threshold of compute. It is that any
[01:22:06] threshold of compute. It is that any marginal compute is helpful, right? So
[01:22:08] marginal compute is helpful, right? So if you have more compute, you can train
[01:22:10] if you have more compute, you can train a better model.
[01:22:10] a better model. >> And I just want you to acknowledge that
[01:22:12] >> And I just want you to acknowledge that any marginal sales for American
[01:22:14] any marginal sales for American technology industry is bene is
[01:22:16] technology industry is bene is beneficial.
[01:22:17] beneficial. I actually don't I mean, if the AI
[01:22:19] I actually don't I mean, if the AI models that run on those chips
[01:22:21] models that run on those chips >> Yeah. are capable of cyber offensive
[01:22:22] >> Yeah. are capable of cyber offensive capabilities
[01:22:24] capabilities or training models are capable of cyber
[01:22:25] or training models are capable of cyber offensive running more models of those
[01:22:26] offensive running more models of those instance
[01:22:27] instance it is not a nuclear weapon, but it is it
[01:22:30] it is not a nuclear weapon, but it is it enables a weapon of a kind.
[01:22:31] enables a weapon of a kind. >> The the the logic that you use, you
[01:22:32] >> The the the logic that you use, you might as well say it to microprocessors
[01:22:34] might as well say it to microprocessors and DRAMs. You might as well say it to
[01:22:36] and DRAMs. You might as well say it to electricity.
[01:22:37] electricity. But in fact, we do have export controls
[01:22:39] But in fact, we do have export controls on the technology that is relevant to
[01:22:40] on the technology that is relevant to making the most advanced DRAM, right? We
[01:22:42] making the most advanced DRAM, right? We have all kinds of export controls on
[01:22:43] have all kinds of export controls on China for all kinds of chip making
[01:22:45] China for all kinds of chip making >> we sell a lot of DRAM and CPUs into
[01:22:47] >> we sell a lot of DRAM and CPUs into China.
[01:22:48] China. And I think it's right.
[01:22:51] And I think it's right. I guess this goes back to the
[01:22:52] I guess this goes back to the fundamental question of is AI different?
[01:22:54] fundamental question of is AI different? Right? If you have the kind of
[01:22:55] Right? If you have the kind of technology that can find these zero days
[01:22:57] technology that can find these zero days in software
[01:22:59] in software is that something where we want to
[01:23:01] is that something where we want to minimize China's ability
[01:23:03] minimize China's ability to get there first, to deploy that
[01:23:04] to get there first, to deploy that ability?
[01:23:05] ability? >> to be ahead.
[01:23:07] >> to be ahead. We can control that.
[01:23:08] We can control that. How do we control that if the chips are
[01:23:09] How do we control that if the chips are already there and they're using that to
[01:23:10] already there and they're using that to train that model? We have tons of
[01:23:12] train that model? We have tons of compute. We have tons of AI researchers.
[01:23:14] compute. We have tons of AI researchers. We're racing as fast as we can.
[01:23:16] We're racing as fast as we can. Again, we have more nuclear weapons than
[01:23:18] Again, we have more nuclear weapons than anybody else, but we don't want to send
[01:23:19] anybody else, but we don't want to send enriched uranium anywhere. We're not
[01:23:21] enriched uranium anywhere. We're not enriched
[01:23:22] enriched uranium.
[01:23:23] uranium. It's a chip. And it's a chip that they
[01:23:25] It's a chip. And it's a chip that they can make themselves.
[01:23:28] can make themselves. But there's a reason they're buying it
[01:23:29] But there's a reason they're buying it from you, right? And then we have quotes
[01:23:31] from you, right? And then we have quotes from the founders of Chinese companies
[01:23:32] from the founders of Chinese companies that say that we're bottom necking
[01:23:33] that say that we're bottom necking >> Because our chips are better.
[01:23:35] >> Because our chips are better. On balance, our chips are better.
[01:23:36] On balance, our chips are better. There's just no question about it. In
[01:23:37] There's just no question about it. In the absence of our chip, in the absence
[01:23:39] the absence of our chip, in the absence of our chip, can you acknowledge that
[01:23:41] of our chip, can you acknowledge that Huawei had a record year? Can you
[01:23:42] Huawei had a record year? Can you acknowledge that a whole bunch of chip
[01:23:43] acknowledge that a whole bunch of chip companies have gone public? Can you
[01:23:45] companies have gone public? Can you acknowledge that?
[01:23:46] acknowledge that? Yes. Can you acknowledge that Can you
[01:23:47] Yes. Can you acknowledge that Can you can also acknowledge that the fact that
[01:23:50] can also acknowledge that the fact that we used to have a very large share in
[01:23:51] we used to have a very large share in that market and we no longer have that
[01:23:53] that market and we no longer have that large share in that market. We can also
[01:23:55] large share in that market. We can also acknowledge that China is about 40% of
[01:23:58] acknowledge that China is about 40% of the world's technology industry. That
[01:24:00] the world's technology industry. That market to leave to leave that market,
[01:24:03] market to leave to leave that market, concede that market for United States
[01:24:04] concede that market for United States technology industry, is a disservice to
[01:24:06] technology industry, is a disservice to our country.
[01:24:08] our country. It is a disservice to our national
[01:24:09] It is a disservice to our national security. It is a disservice to our to
[01:24:11] security. It is a disservice to our to our technology leadership, all for the
[01:24:13] our technology leadership, all for the benefit all for the benefit of one
[01:24:15] benefit all for the benefit of one company. It makes no sense to me.
[01:24:17] company. It makes no sense to me. >> I guess I'm confused of It feels like
[01:24:18] >> I guess I'm confused of It feels like you're making two different statements.
[01:24:19] you're making two different statements. One is that we're going to win this
[01:24:21] One is that we're going to win this competition with Huawei because our
[01:24:22] competition with Huawei because our chips are going to be way better if
[01:24:23] chips are going to be way better if we're allowed to compete. And another is
[01:24:25] we're allowed to compete. And another is that they would be doing the same exact
[01:24:26] that they would be doing the same exact thing without us anyways.
[01:24:27] thing without us anyways. Right? How can those two things be the
[01:24:29] Right? How can those two things be the same true at the same time?
[01:24:30] same true at the same time? It's obviously true.
[01:24:33] It's obviously true. In the absence of a better choice,
[01:24:34] In the absence of a better choice, you'll take the only choice you have.
[01:24:37] you'll take the only choice you have. How is that illogical? It's so logical.
[01:24:39] How is that illogical? It's so logical. >> they want Nvidia chips is they're
[01:24:40] >> they want Nvidia chips is they're better. Better is more compute. More
[01:24:42] better. Better is more compute. More compute means you can train a better
[01:24:43] compute means you can train a better model.
[01:24:44] model. >> It's better because it's easier to
[01:24:46] >> It's better because it's easier to program. It's We have a better
[01:24:47] program. It's We have a better ecosystem. Whatever the better is.
[01:24:50] ecosystem. Whatever the better is. Whatever the better is. And of course
[01:24:52] Whatever the better is. And of course we're going to send them compute. So
[01:24:54] we're going to send them compute. So what?
[01:24:55] what? So what? The fact of the matter is
[01:24:57] So what? The fact of the matter is you would get the benefit. Don't forget,
[01:24:59] you would get the benefit. Don't forget, we get the benefit of American
[01:25:01] we get the benefit of American technology leadership.
[01:25:03] technology leadership. We get the benefit of developers working
[01:25:05] We get the benefit of developers working on the American tech stack. We get the
[01:25:07] on the American tech stack. We get the benefit as those AI models diffuse out
[01:25:10] benefit as those AI models diffuse out into the rest of the world.
[01:25:11] into the rest of the world. The American tech stack is therefore the
[01:25:13] The American tech stack is therefore the best for it. We can continue to advance
[01:25:16] best for it. We can continue to advance and diffuse American technology. That, I
[01:25:19] and diffuse American technology. That, I believe, is a positive.
[01:25:21] believe, is a positive. It's a very important part of American
[01:25:23] It's a very important part of American technology leadership.
[01:25:25] technology leadership. Now, the policies that you're advocating
[01:25:27] Now, the policies that you're advocating resulted in the American
[01:25:28] resulted in the American telecommunication industry being policy
[01:25:31] telecommunication industry being policy out of
[01:25:32] out of basically the world to the point where
[01:25:34] basically the world to the point where we don't control our own
[01:25:35] we don't control our own telecommunications anymore. I don't see
[01:25:37] telecommunications anymore. I don't see that as smart.
[01:25:40] that as smart. It's a little narrow-minded and it led
[01:25:42] It's a little narrow-minded and it led to unintended consequences that I'm
[01:25:44] to unintended consequences that I'm describing to you right now that you
[01:25:46] describing to you right now that you seem you seem to have a very hard time
[01:25:47] seem you seem to have a very hard time understanding. Okay, let's let's just
[01:25:50] understanding. Okay, let's let's just step back. It seems like the crux here
[01:25:51] step back. It seems like the crux here is there's a potential benefit and
[01:25:53] is there's a potential benefit and there's a potential cost and we're
[01:25:55] there's a potential cost and we're trying to figure out is the benefit
[01:25:57] trying to figure out is the benefit worth the cost? I guess I'm trying to
[01:25:59] worth the cost? I guess I'm trying to get you to acknowledge the potential
[01:26:01] get you to acknowledge the potential cost. The compute is an input to
[01:26:03] cost. The compute is an input to training powerful models. Powerful
[01:26:04] training powerful models. Powerful models do have powerful, you know,
[01:26:07] models do have powerful, you know, offensive capabilities like cyber
[01:26:09] offensive capabilities like cyber attacks. It is a good thing that
[01:26:10] attacks. It is a good thing that American companies got to cloud mythos
[01:26:12] American companies got to cloud mythos level capabilities first and then now
[01:26:14] level capabilities first and then now they're going to hold off on this
[01:26:15] they're going to hold off on this capability so that the American
[01:26:16] capability so that the American companies and American government can
[01:26:18] companies and American government can make their software more protected
[01:26:21] make their software more protected before this level capability is
[01:26:21] before this level capability is announced. If China had had more
[01:26:23] announced. If China had had more computer, had more cloud compute,
[01:26:25] computer, had more cloud compute, had made a mythos level model earlier
[01:26:27] had made a mythos level model earlier and deployed it widely, that would have
[01:26:29] and deployed it widely, that would have been very bad.
[01:26:31] been very bad. One of the reasons that hasn't happened
[01:26:32] One of the reasons that hasn't happened is that we have more compute thanks to
[01:26:34] is that we have more compute thanks to companies like Nvidia in America.
[01:26:36] companies like Nvidia in America. Um, that is a cost of ship sending chips
[01:26:39] Um, that is a cost of ship sending chips to China. And so
[01:26:41] to China. And so let's leave the benefit aside for a
[01:26:42] let's leave the benefit aside for a second. Do you acknowledge that this is
[01:26:43] second. Do you acknowledge that this is a potential cost?
[01:26:45] a potential cost? I will also tell you the potential cost
[01:26:48] I will also tell you the potential cost is we allow one of the most important
[01:26:51] is we allow one of the most important layers of the AI stack, the chip layer
[01:26:55] layers of the AI stack, the chip layer to concede an entire market.
[01:26:57] to concede an entire market. The second largest
[01:26:59] The second largest second largest market in the world so
[01:27:01] second largest market in the world so that they could develop scale.
[01:27:03] that they could develop scale. So that they could develop their own
[01:27:04] So that they could develop their own ecosystem. So that future AI models are
[01:27:08] ecosystem. So that future AI models are optimized in a very different way
[01:27:11] optimized in a very different way than the American tech stack.
[01:27:12] than the American tech stack. As AI diffuses out into the rest of the
[01:27:14] As AI diffuses out into the rest of the world
[01:27:17] world their standards
[01:27:18] their standards their tech stack
[01:27:21] their tech stack will become superior to ours because
[01:27:23] their models are open.
[01:27:25] I guess I just believe enough in Nvidia's kernel engineers and CUDA engineers to think that they could optimize
[01:27:29] >> more than kernel optimization, as you know.
[01:27:31] Of course, but there are so many things you can do, from distilling to a model that's well fit for your chips
[01:27:36] >> We're going to do our best.
[01:27:37] >> all the software. I just have to imagine that there's long-term lock-in to the Chinese ecosystem if they even have a slightly better open source model for a while.
[01:27:44] while. China is the largest contributor to open
[01:27:46] China is the largest contributor to open source software in the world.
[01:27:48] source software in the world. Fact.
[01:27:51] Right.
[01:27:52] Right. China is the largest contributor to open
[01:27:54] China is the largest contributor to open models in the world. Fact.
[01:27:57] models in the world. Fact. Today, it's built on the American tech
[01:27:59] Today, it's built on the American tech stack, Nvidia's.
[01:28:01] stack, Nvidia's. Fact.
[01:28:02] Fact. All five layers of the tech stack for AI
[01:28:05] All five layers of the tech stack for AI is important.
[01:28:07] is important. United States ought to go win all five
[01:28:08] United States ought to go win all five of them.
[01:28:10] of them. They're all important.
[01:28:12] They're all important. The one that is the most important, of
[01:28:14] The one that is the most important, of course
[01:28:15] course is the AI application layer.
[01:28:18] is the AI application layer. The layer that diffuses into society,
[01:28:20] The layer that diffuses into society, the one that uses it most
[01:28:22] the one that uses it most will benefit from this industrial
[01:28:24] will benefit from this industrial revolution most.
[01:28:27] revolution most. But my point is that every every layer
[01:28:29] But my point is that every every layer has to succeed.
[01:28:31] has to succeed. If we If we scare this country into
[01:28:34] If we If we scare this country into thinking that AI is
[01:28:37] thinking that AI is somehow a nuclear bomb
[01:28:40] somehow a nuclear bomb so that everybody hates AI
[01:28:42] so that everybody hates AI and everybody's afraid of AI
[01:28:45] and everybody's afraid of AI I don't know how you're helping
[01:28:48] I don't know how you're helping the United States. You're doing a
[01:28:49] the United States. You're doing a disservice.
[01:28:51] disservice. If we scare everybody out of doing
[01:28:52] If we scare everybody out of doing software engineering jobs because it's
[01:28:54] software engineering jobs because it's going to kill every software engineer's
[01:28:55] going to kill every software engineer's job and we don't have any software
[01:28:57] job and we don't have any software engineers as a result of that, we're
[01:28:59] engineers as a result of that, we're doing a disservice to United States.
[01:29:01] doing a disservice to United States. If we scare everybody out of radiology
[01:29:03] If we scare everybody out of radiology so nobody wants to be a radiologist
[01:29:05] so nobody wants to be a radiologist because computer vision is completely
[01:29:06] because computer vision is completely free
[01:29:07] free and no AI is going to do a worse job
[01:29:09] and no AI is going to do a worse job than a radiologist and we we
[01:29:11] than a radiologist and we we misunderstand the difference between a
[01:29:12] misunderstand the difference between a job and a task. The job of a
[01:29:15] job and a task. The job of a radiologist, patient care. Task, to read
[01:29:18] radiologist, patient care. Task, to read a scan. If we misunderstand that so
[01:29:20] a scan. If we misunderstand that so profoundly
[01:29:21] profoundly and we scare everybody out of going to
[01:29:24] and we scare everybody out of going to radiology school, we're not going to
[01:29:26] radiology school, we're not going to have enough radiologists and good enough
[01:29:28] have enough radiologists and good enough health care. And so I
[01:29:31] I'm making the case that when you make a premise that is so extreme, everything goes to zero or to infinity.
[01:29:44] We end up scaring people in a way that's just not true. Life is not like that.
[01:29:50] Do we want the United States to be first? Of course we do.
[01:29:54] Do we need to be a leader in every layer of that stack? Of course we do. Of course we do.
[01:30:05] Today you're talking about mythos because mythos is important? Sure, that's fantastic.
[01:30:10] that's fantastic. But in a few years time, I'm making you
[01:30:13] But in a few years time, I'm making you the prediction
[01:30:14] the prediction that when we want the American tech
[01:30:16] that when we want the American tech stack, when we want American technology
[01:30:18] stack, when we want American technology to be diffused around the world
[01:30:20] to be diffused around the world out to India, out to the Middle East,
[01:30:22] out to India, out to the Middle East, out out to to Africa
[01:30:24] out out to to Africa out to Southeast Asia
[01:30:26] out to Southeast Asia when our country would like to export
[01:30:29] when our country would like to export because we would like to export our
[01:30:30] because we would like to export our technology. We would like to export our
[01:30:33] technology. We would like to export our standards.
[01:30:34] standards. On that day I want you and I to have
[01:30:36] On that day I want you and I to have that same conversation again
[01:30:38] that same conversation again and I will tell you exactly about
[01:30:39] and I will tell you exactly about today's conversation about how your
[01:30:42] today's conversation about how your policy and how what you imagined
[01:30:45] policy and how what you imagined literally caused United States to
[01:30:46] literally caused United States to concede the second largest market in the
[01:30:48] concede the second largest market in the world for no good reason at all.
[01:30:52] world for no good reason at all. We shouldn't concede it. If we lose it,
[01:30:54] We shouldn't concede it. If we lose it, we lose it, but why do we concede it?
[01:30:57] we lose it, but why do we concede it? Now, nobody is advocating.
[01:31:00] Now, nobody is advocating. Nobody is advocating an all or nothing.
[01:31:02] Nobody is advocating an all or nothing. Nobody's advocating all or nothing
[01:31:04] Nobody's advocating all or nothing meaning we ship everything to China at
[01:31:06] meaning we ship everything to China at all times. Nobody's advocating that.
[01:31:09] all times. Nobody's advocating that. We should always have the best
[01:31:11] We should always have the best technology here. We should always have
[01:31:13] technology here. We should always have the most technology here and the first.
[01:31:16] the most technology here and the first. But we should also
[01:31:19] But we should also try to compete
[01:31:20] try to compete and win around the world.
[01:31:22] and win around the world. Both of those things can simultaneously
[01:31:24] Both of those things can simultaneously happen.
[01:31:26] happen. It requires some amount of nuance, some
[01:31:28] It requires some amount of nuance, some amount of maturity
[01:31:30] amount of maturity instead of absolutes.
[01:31:32] instead of absolutes. The world is just not absolutes. Okay,
[01:31:34] The world is just not absolutes. Okay, the the argument hinges on they've built
[01:31:37] the the argument hinges on they've built a they've built models that are
[01:31:39] a they've built models that are specified for their architect their the
[01:31:41] specified for their architect their the best chips that they make in a few years
[01:31:42] best chips that they make in a few years and those chips get exported around the
[01:31:43] and those chips get exported around the world that sets the standard. Um
[01:31:46] world that sets the standard. Um because of EUV
[01:31:48] because of EUV um export controls, as we said, you're
[01:31:50] um export controls, as we said, you're going to move on to 1.6 nanometer
[01:31:52] going to move on to 1.6 nanometer there's going to be in 7 nanometer even
[01:31:53] there's going to be in 7 nanometer even after a few years from now and it may
[01:31:55] after a few years from now and it may make sense that domestically they would
[01:31:56] make sense that domestically they would prefer hey, we got so much energy, we
[01:31:58] prefer hey, we got so much energy, we can manufacture at such scale, we'll
[01:31:59] can manufacture at such scale, we'll still be producing 7 nanometer. But the
[01:32:01] still be producing 7 nanometer. But the exporting thing their 7 nanometer chips
[01:32:04] exporting thing their 7 nanometer chips have to be competitive against well your
[01:32:07] have to be competitive against well your 1.6 nanometer chips and their models
[01:32:09] 1.6 nanometer chips and their models have to be so far optimized for the 7
[01:32:11] have to be so far optimized for the 7 nanometer that's better to run their
[01:32:12] nanometer that's better to run their models on 7 nanometer
[01:32:13] models on 7 nanometer than to run their models on your 1.6
[01:32:16] than to run their models on your 1.6 nanometer. Can we Can we just look at
[01:32:17] nanometer. Can we Can we just look at the facts then?
[01:32:19] the facts then? Okay.
[01:32:20] Okay. Is Blackwell 50 times
[01:32:23] Is Blackwell 50 times more advanced lithography than Hopper?
[01:32:26] more advanced lithography than Hopper? Is it 50 times?
[01:32:28] Is it 50 times? Not even close.
[01:32:30] Not even close. I just kept saying it over and over
[01:32:32] I just kept saying it over and over again. Moore's law is dead.
[01:32:34] again. Moore's law is dead. Between Hopper and Blackwell from the
[01:32:36] Between Hopper and Blackwell from the transistors themselves, call it 75%.
[01:32:40] transistors themselves, call it 75%. It was 3 years apart.
[01:32:42] It was 3 years apart. 75%.
[01:32:45] 75%. Blackwell is 50 times
[01:32:48] Blackwell is 50 times Hopper.
[01:32:49] My point is: architecture matters. Computer science matters. Semiconductor physics matters as well. But computer science matters.
[01:33:00] AI, the impact of AI, largely comes from the computing stack, which is the reason why CUDA is so effective, which is the reason why CUDA is so beloved.
[01:33:11] It's an ecosystem of computing architecture that allows for so much flexibility that if you wanted to change an architecture completely, create something like MoE, create something like diffusion, create something that's disaggregated, you could do so. It's easy to do.
[01:33:31] And so the fact of the matter is, AI is about the stack above as much as it is about the architecture below. To the extent that we have architectures and software stacks that are optimized for our stack, for our ecosystem, it is obviously good.
[01:33:45] Because we started the conversation today about how Nvidia's ecosystem is so rich, why people always love programming on CUDA first. They do. They do. And so did the researchers in China.
[01:33:58] China. But if we are forced to leave China
[01:34:01] But if we are forced to leave China if we're forced to leave China
[01:34:03] if we're forced to leave China it would be
[01:34:04] it would be it would be Well, first of all, it would
[01:34:06] it would be Well, first of all, it would It's a policy mistake. Obviously has
[01:34:08] It's a policy mistake. Obviously has backlash as as has backlash. Obviously
[01:34:12] backlash as as has backlash. Obviously it has fired, you know
[01:34:13] it has fired, you know has has uh uh
[01:34:15] has has uh uh has has turned out badly for it for the
[01:34:17] has has turned out badly for it for the United States.
[01:34:19] United States. It enabled it accelerated our chip
[01:34:21] It enabled it accelerated our chip industry. It forced all of their AI
[01:34:24] industry. It forced all of their AI ecosystem to focus on their internal
[01:34:26] ecosystem to focus on their internal architectures. It's not too late, but
[01:34:29] architectures. It's not too late, but nonetheless
[01:34:30] nonetheless it has already happened.
[01:34:33] it has already happened. You're going to see in the future
[01:34:35] You're going to see in the future they're not stuck at 7 nanometer,
[01:34:37] they're not stuck at 7 nanometer, obviously. They're good at
[01:34:38] obviously. They're good at manufacturing.
[01:34:39] manufacturing. They will continue to advance from 7 and
[01:34:42] They will continue to advance from 7 and beyond.
[01:34:43] beyond. Now
[01:34:45] is there a 10x difference between 5 nanometer and 7 nanometer? The answer is no.
[01:34:53] Architecture matters. Networking matters. That's why Nvidia bought Mellanox. Networking matters. Energy matters.
[01:34:58] And so all of that stuff matters. It's not simplistic, the way you're trying to distill it.
[01:35:06] We can move on from China, but that actually raises an interesting question. We were discussing earlier these bottlenecks at TSMC and memory and so forth. If we're in this world where you're already a majority of N3, and at some point you'll be a majority of N2, do you see that you could go back to N7, the spare capacity at an older process node, and say, "Hey, the demand for AI is so great and our capacity to expand the leading edge is not meeting it, so we're going to make a Hopper or Ampere with everything we know about the numerics today and all the other improvements you described"? Do you see that world happening before 2030?
[01:35:45] It's not necessary to, and the reason for that is because with every generation the architecture is more than just the transistor scale. You're also doing so much engineering in packaging and stacking and the numerics and the system architecture.
[01:36:13] When you run out of capacity, going back to another node easily is a level of R&D that no one could afford. We could afford to lean forward; I don't think we could afford to go back.
[01:36:26] Now, let's do the thought experiment: if on that day the world simply says, listen, we're just never going to have more capacity ever again, would I go back and use seven? In a heartbeat. Yeah, of course I would.
[01:36:42] One question somebody I was talking to had is why Nvidia doesn't run multiple different chip projects at the same time with totally different architectures. You could do a Cerebras-style wafer scale, you could do a Dojo-style huge package, you could do one without CUDA. You have the resources and the engineering talent to do all of these in parallel. So why put all the eggs in one basket, given who knows where AI and architectures might go?
[01:37:04] Oh, we could. It's just that we don't have a better idea.
[01:37:09] Hm.
[01:37:10] Yeah, yeah. We could do all of those things. It's just not better. And we simulate it all. They're provably worse in our simulator, and so we wouldn't do it. We're working on exactly the projects that we want to work on.
[01:37:28] And if the workload were to change dramatically, and I don't mean the algorithms, I actually mean the workload, and that depends on the shape of the market, we may decide to add other accelerators. Like, for example, recently we added Groq, and we're going to fold Groq into our CUDA ecosystem.
[01:37:58] We're doing that now because the value of tokens has gone up so high that you could have different pricing of tokens. Back in the old days, you know, just a couple years ago, tokens were either free or barely expensive, right? But now you can have different customers, and those customers want different answers. And the customers make so much money. Like, for example, our software engineers: if I can give them much more responsive tokens so that they're even more productive than they are today, I would pay for it.
[01:38:35] But that market has only recently emerged. And so I think that we now have the ability to have the same model serve different segments based on the response time. That's the reason why we decided to expand the Pareto frontier and create a segment of inference that has faster response time even though it's lower throughput. Until now, higher throughput was always better.
[01:39:03] We think that there could be a world with very high-ASP tokens, where even though the throughput in the factory is lower, the ASPs make up for it. Yeah, that's the reason why we did it.
[01:39:17] But otherwise, from an architectural perspective, I would rather, if I had more money, put more behind the architecture.
[01:39:25] Mhm. I think this idea of extremely premium tokens and just the disaggregation of the inference market is very interesting. The segmentation of it.
[01:39:35] Yeah. Yeah. All right, final question.
[01:39:38] Yeah. Yeah. All right, final question. Um
[01:39:39] Um suppose the deep learning revolution
[01:39:40] suppose the deep learning revolution didn't happen.
[01:39:41] didn't happen. Um what would Nvidia be doing?
[01:39:45] Um what would Nvidia be doing? Obviously, games, but given It's already
[01:39:48] Obviously, games, but given It's already computing.
[01:39:50] computing. Mhm. It's already computing. The the
[01:39:52] Mhm. It's already computing. The the same thing we've been doing all along.
[01:39:53] same thing we've been doing all along. I
[01:39:55] I the the premise of our company is that
[01:39:56] the the premise of our company is that Moore's law Moore's law is going to more
[01:39:59] Moore's law Moore's law is going to more general purpose computing is good for a
[01:40:01] general purpose computing is good for a lot of things. But for a lot of
[01:40:03] lot of things. But for a lot of computation, it's not ideal.
[01:40:05] computation, it's not ideal. And so we combined an architecture
[01:40:08] And so we combined an architecture called a GPU, CUDA,
[01:40:09] called a GPU, CUDA, to a CPU so that we can accelerate the
[01:40:12] to a CPU so that we can accelerate the workload of the CPU. And so different
[01:40:15] workload of the CPU. And so different different kernels of code or algorithms
[01:40:17] different kernels of code or algorithms could be offloaded onto our GPU, and as
[01:40:20] could be offloaded onto our GPU, and as a result,
[01:40:21] a result, you speed up an an application by, you
[01:40:23] you speed up an an application by, you know, 100x, 200x. And where can you use
[01:40:26] know, 100x, 200x. And where can you use that? Um well, obviously, engineering
[01:40:28] that? Um well, obviously, engineering and science and physics and, you know,
[01:40:30] and science and physics and, you know, so on and so forth, data processing. Um
[01:40:33] so on and so forth, data processing. Um uh
[01:40:33] uh computer graphics, image generation. I
[01:40:35] computer graphics, image generation. I mean, all kinds of things.
[01:40:37] Even if AI didn't exist today, Nvidia would be very, very large. Yeah. And I think the reason for that is fairly fundamental: the ability for general purpose computing to continue to scale has largely run its course. And the way, not the only way, but the way to keep scaling is through domain-specific acceleration. One of the domains that we started with was computer graphics.
[01:41:03] But there are many, many other domains: all kinds of scientific ones, particle physics and fluids, structured data processing, all kinds of different types of algorithms that benefit from CUDA. And so our mission was really to bring accelerated computing to the world, to advance the types of applications that general purpose computing can't do, and to scale to the level of capability that helps break through certain fields of science.
[01:41:35] Some of the early applications were molecular dynamics, seismic processing for energy discovery, and image processing, of course. All of those kinds of fields where general purpose computing is simply too inefficient.
[01:41:53] And so yeah, if there were no AI, I would be very sad. But because of the advances that we made in computing, we democratized deep learning. We made it possible for any researcher, any scientist, or any student to access a PC or, you know, a GeForce add-in card and do amazing science. And that fundamental promise hasn't changed, not even a little bit.
[01:42:25] If you watch GTC, there's the whole beginning part of it where none of it's AI. That whole part of it, with computational lithography or our quantum chemistry work or all of that data processing work, all of that stuff is unrelated to AI, and it's still very important. I know that AI is very interesting and quite exciting, but there are a lot of people doing a lot of very important work that's not AI-related, and tensors are not the only thing you compute with. And we want to help everybody.
[01:43:06] Jensen, thank you so much.
[01:43:08] You're welcome. I enjoyed it.
[01:43:09] >> Me, too.
[01:43:10] Sweet.