# Stanford MS&E435 Economics of the AI Supercycle | Spring 2026 | Applications, Applied AI

https://www.youtube.com/watch?v=Qh7Oxvo5sJI

[00:09] Inference is about to go up a billion x.
[00:13] Not a thousand x, not a million x, a billion x.
[00:16] And the guy who's making it go up a billion x is here with us today. Thank you so much for joining us.
[00:21] Please join us.
[00:23] Thank you.
[00:31] Tuhin, founder and CEO of Baseten.
[00:34] You've had a long, scenic journey to get here today.
[00:37] Lots of windy turns.
[00:40] Tell us about it.
[00:42] You know, a lot of students in the class want to be entrepreneurs.
[00:43] Um, I think they'll find inspiration in the journey.
[00:48] Awesome. Thank you for having me.
[00:49] It's really nice to meet you all.
[00:51] Um, yeah, my name is Tuhin.
[00:53] I'm the CEO, one of the co-founders of Baseten.
[00:57] Um, Baseten is about 7 years old.
[01:00] Um I've been working in technology since 2012.
[01:03] Um I'm originally from Sydney, Australia.
[01:05] I came here for uni.
[01:09] Um actually started my career in finance.
[01:11] So I um I moved home after uni
[01:15] um, to work at an Australian investment bank called Macquarie.
[01:19] Um, on my first day over there they said, "Do you want to go and move to New York and work on privatizing toll roads and airports?"
[01:25] So I moved to New York and worked on privatizing toll roads and airports, and, um, about 2 years in I was bored out of my mind, and I decided that I should go back to engineering, which I'd studied in undergrad, and I moved to Boston to do some machine learning research.
[01:42] This is 2012 or 2013.
[01:44] No, 2011, 2012.
[01:49] Um, and what we were doing was using traditional machine learning techniques to track the prognosis of neuromuscular disease.
[01:56] Um, I did that for about a year and a half.
[01:59] Um, put out some papers.
[02:02] Um, and really quickly realized that if I was going to do machine learning, even back then, that I probably should be in um, San Francisco.
[02:07] So, I moved to San Francisco and got involved in early stage technology as an engineer.
[02:14] Um, I think probably the more important
[02:16] piece is I kind of fell in love with early stage technology and small teams and building products when no one really cares.
[02:24] Um did that for a few years, started a couple of companies starting 2015.
[02:30] They went nowhere, but you know, I had the bug.
[02:32] And so, um, in 2019, me and my two co-founders, uh, a guy called Phil and a guy called Amir, who I'd worked with, um, for the better part of 10 to 15 years.
[02:51] Is that is that a me thing?
[02:53] Oh, they're just straining up your water.
[02:54] Okay, cool.
[02:56] Um the some more energy.
[02:57] Yeah.
[02:57] So, um, you know, I've known these guys for 10 or 15 years, and we had this idea that machine learning was going to be pretty big, and, um, let's build an infrastructure business alongside it, um, to maybe capture the index of it.
[03:09] Obviously machine learning was a lot bigger than we expected and it came a lot faster and so what we've been doing for the last four years is
[03:16] production inference.
[03:18] um powering the fastest growing AI companies in the world.
[03:21] Yeah.
[03:21] You have a salivating list of customers.
[03:24] Um, actually you had a post, uh, I copied your post here, which is: these are all the founders that, uh, Baseten has had the good opportunity to work with.
[03:32] You might know some of these faces.
[03:34] Some of them have been here in class.
[03:39] This is, uh, probably the best index of AI founders who work with Baseten.
[03:46] And so maybe break it down for us.
[03:49] Pick an example or two of the faces up here. When they work with Baseten,
[03:55] what does that mean, that they work with Baseten?
[03:56] What's an example of their work with us, and what does that enable for their business and their customers?
[04:02] Yeah, may maybe I'll pick two fun ones.
[04:04] I'll pick Tanay, who's on top.
[04:05] I'll pick, um, Tanay, actually Stanford.
[04:11] You know, you know Tanay very well.
[04:14] And I'll pick Shiv.
[04:14] Um, and so
[04:18] they run very different businesses.
[04:20] So Tanay runs a company called Wispr Flow.
[04:23] I don't know how many people here know Wispr Flow.
[04:25] How many have heard of Wispr Flow?
[04:26] Yeah.
[04:26] Awesome.
[04:26] Awesome.
[04:26] Um, that'll make him very happy.
[04:29] Um, so Wispr Flow is basically a speech-to-text app, um, for voice typing, more or less.
[04:42] you know, they they run a lot of custom models that they've built in house or like modified in house in a very particular way um to make that experience possible.
[04:51] And so um what we do for them is we run all the optimizations, we run all the infrastructure so that you know that latency from the time that you talk to when text shows up um is as quick as possible.
[05:04] Yeah.
[05:05] Um, there's language models, there's audio models. There's actually, I think, three or four language models in the middle there and two audio models to make that happen, and all of them run on Baseten.
[05:18] Um, Shiv, who's also a good friend of mine, is the CEO of a company called Abridge.
[05:25] Um, Abridge is a healthcare, um, it's an ambient scribe, um, used in almost every healthcare system in the US, that's deeply integrated with EMRs.
[05:37] Um, they run about 20 different models.
[05:40] Um, everything from, um, obviously speech-to-text models that capture what is happening in the operating room or the patient's room to, um, everything that goes into turning that into a clinical note that is deeply integrated with the EMR.
[05:58] Mhm. Again, you know, over dozens of models, um, lots of health, lots of requirements from a reliability perspective, from a speed perspective.
[06:11] Um, and almost, actually, I think every single one of those models runs on Baseten.
[06:15] Um, and, you know, I, I think that's kind of emblematic of kind of what we want to do.
[06:16] Our thesis is that, um, AI is going
[06:22] to be absolutely massive.
[06:25] Um, inference is the COGS of AI value being delivered.
[06:30] Um, and today about 90% or 95% of spend on inference is going to frontier models and about 5% is going to custom models, and we believe that the way that all these folks who are building amazing applications with AI are going to build profitable, viable, defensible companies is with custom models, and hopefully they all run on Baseten.
[06:50] 95% of their spend is going to frontier models.
[06:55] Yeah.
[06:56] 5% is going to open source models.
[06:59] This or post-trained open source models.
[07:00] Yeah.
[07:01] Got it. Actually, before we go there, I have a question for the two businesses that you highlighted.
[07:07] So, Abridge or Wispr Flow, both fairly scaled businesses.
[07:12] Why would they come work with you?
[07:14] Why not?
[07:16] Typically, if I was a founder, I would go to my cloud provider.
[07:18] I'd go to AWS, GCP, Azure, or one of the neoclouds that are
[07:23] the AI clouds: CoreWeave, Nebius, others.
[07:27] Why Baseten?
[07:28] Yeah, I think a lot of it comes down to the three or four core tenets of the software we're building.
[07:32] So there's obviously performance.
[07:35] You need these things to work as quickly as possible.
[07:39] If you go to CoreWeave, Nebius, AWS, GCP, um, you're kind of on your own to do those optimizations.
[07:46] Um the second piece is reliability and compute.
[07:46] Um you know they're coming to us because they need um inference to run across clouds in a very fault tolerant resilient way.
[07:56] Um I think you know part of that is being multicloud and having access to compute from multiple sources.
[08:02] We kind of unlock multicloud for them.
[08:04] And the third one is the developer platform that we provide, which gives them, you know, flexibility, security, observability, and all those good things, um, to make that happen.
[08:16] And so what we actually find is a lot of them do go to those folks you talked about first and then realize the pain of standing up that whole inference stack on top of
[08:24] compute and realize they'll just be better served coming to Baseten.
[08:29] Fascinating. Now, going to the second thing you said earlier: 95% of tokens are on frontier models, 5% on post-trained or open source models of some kind.
[08:43] If I was using an open source model or post-training an open source model, I am branching off the lineage of intelligence that might improve, and I run the risk that the next GPT model or the next Opus model, let's call it GPT 6 or Opus 5, will be better than what I just post-trained.
[09:03] Why go through the effort of branching out into this specific model, uh, when the no-work option might just get there in a couple of weeks or a couple of months?
[09:18] Yeah. Look, there's a number of reasons, I'd say. Um, you know, I'll give you the viability reason. I'll give you the cynical reason. Um,
[09:25] the viability reason
[09:27] here is that, you know, if you think about any company that's doing things with these frontier models.
[09:38] Mhm.
[09:38] Today open source models are about 90 days behind closed source models.
[09:43] And the question that a lot give a slide on this.
[09:44] Keep going.
[09:45] Yeah.
[09:47] Do I?
[09:47] Uh next one.
[09:47] Yeah.
[09:47] Open source models are about 90 days behind, um, frontier models, and you can run them about 70 to 90% cheaper.
[09:59] Wow.
[10:00] Than Frontier models.
[10:01] 70 to 90%
[10:02] cheaper.
[10:02] Yeah.
[10:02] And especially when you add in a bit of, you know, I think you had Yash in here
[10:07] Mhm.
[10:07] last week, who probably told you the post-training narrative of specialized models, and especially when you factor that in, you can really do better, faster, cheaper.
[10:14] Mhm.
[10:15] Um, for these models. And really they come to us for the viability reason when they go from "I have product-market fit" to "I need to become a scaled business and figure
[10:26] out how to, you know, get my gross margins
[10:29] up from zero all the way to 40,
[10:31] 50, 60, 70%," hopefully, um, where you want them to be.
[10:35] The cynical reason, I'd say, is, you know,
[10:38] um look what do you have that is defensible against a frontier model
[10:43] it's probably some some workflow Mhm.
[10:45] some user signal um that only you have.
[10:49] Um, if you keep working with the frontier labs,
[10:54] to some extent, you know, you're going to give them all the data and all that user signal, and, you know,
[10:58] I was talking to a prominent public official, I won't say who it was.
[11:07] Um, he likened the frontier labs to the East India Company.
[11:11] You know, they show up in India, and, like, you know, they're
[11:17] making all these partnerships, but really what you're doing is giving them all the tricks on how to rule that, and before you know it they're post-training models against those workflows that are sacred
[11:28] to you, that only you know.
[11:30] So to be defensible against this and keep the thing that you know makes you special
[11:35] you need to own your intelligence to some extent, and that's why, you know, they are figuring out how they can use open source models and post-train them, um, and really kind of build out that stack themselves to own their intelligence.
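
The viability argument above is arithmetic on gross margin, so a small sketch may help. It assumes a hypothetical app whose inference bill equals its revenue (zero gross margin) and moves part of its traffic to a model that is cheaper by the quoted 70 to 90%. The function names and the revenue figures are illustrative assumptions, not numbers from the talk.

```python
def blended_inference_cost(frontier_cost: float, share_moved: float, discount: float) -> float:
    """Inference bill after moving `share_moved` of traffic to a model `discount` cheaper.

    frontier_cost: bill if all traffic stays on frontier models
    share_moved:   fraction of traffic moved to the cheaper model (0..1)
    discount:      fractional saving on the moved traffic (0.7 means 70% cheaper)
    """
    if not (0.0 <= share_moved <= 1.0 and 0.0 <= discount <= 1.0):
        raise ValueError("share_moved and discount must be between 0 and 1")
    stays = frontier_cost * (1.0 - share_moved)
    moves = frontier_cost * share_moved * (1.0 - discount)
    return stays + moves


def gross_margin(revenue: float, cogs: float) -> float:
    """Gross margin as a fraction of revenue, treating inference as the only COGS."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (revenue - cogs) / revenue


# Hypothetical app: 100 of revenue, 100 of frontier inference spend (zero margin).
for share in (0.0, 0.5, 1.0):
    cost = blended_inference_cost(100.0, share, discount=0.8)
    print(f"share moved {share:.0%}: gross margin {gross_margin(100.0, cost):.0%}")
```

Under these assumptions, the margin moves from zero toward the discount itself as more traffic shifts, which is the range the speaker describes.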
[11:50] Fascinating.
[11:53] Taking this further, it seems that if I had a larger volume of workloads, I am more incentivized to move to an open source or post-trained model.
[12:03] Um, you have a couple of customers that are very scaled customers: Abridge, Wispr Flow, Cursor, etc.
[12:14] They are obviously very large customers of the frontier models.
[12:19] Um, is it also the case that the larger your user base, the more likely you are to adopt this, or is it the other way, like the smaller, uh, guys are more
[12:29] likely to adopt this.
[12:30] I think the former. I think the larger you are, the more existential it becomes,
[12:34] right, to need to go towards this,
[12:37] because I think that's when, you know, the bigger you get,
[12:40] the more, you know, I mean, maybe you can talk about this as well, but, you know, you look at these businesses,
[12:48] and as you get to massive scale, um, if you're just trading tokens, um, you know, it starts becoming very, very expensive, and the more capital you need, and you need that path to profitability.
[13:00] Um,
[13:02] yeah,
[13:02] So post-training becomes very important there.
[13:05] Yeah, I mean, the leading coding companies that are not the frontier model companies themselves are, you know, still rumored to be negative gross margin, and so I imagine it is existential for them, to be a viable business, to shift that, uh, token volume towards open source.
[13:25] Yeah. Um now is there a performance gap
[13:29] that they notice as they move from, let's say, for example, Cursor using Anthropic's models or GPT, the latest OpenAI models, to moving to a post-trained model like they did recently? There's a lot of hoopla on Twitter about their work with, um, Composer. Um, does the performance take a hit?
[13:47] Is there a way that the workload partitions so that it does not, or, um, how do the users experience the change in the underlying, you know, almost the engine of the Ferrari was swapped out.
[13:59] Yeah.
[14:00] Well, we don't have any, you know, any data internally, but I think you would hope that it gets better, right?
[14:07] You'd hope that you know as they post train models to use the signals that they have the experience of the user gets better.
[14:13] Mhm.
[14:13] And hopefully it also drives you know both performance from a latency perspective and reliability perspective um higher because there's a lot more control.
[14:23] Got it.
[14:23] Um makes tons of sense.
[14:23] And, um, what is the business of
[14:29] Baseten? How do we monetize our users?
[14:32] You know, obviously there's this there's a whole slew of pricing options right now in AI.
[14:36] You've got outcome based pricing, token based pricing, GPU by the hour pricing and and others that I might not even know about.
[14:45] Where do we where do we fit in and and how does that work with your largest customers?
[14:50] Yeah, look, right now we mark up compute.
[14:53] Mhm.
[14:54] Um, for the most part.
[14:57] So what that means is that, you know, you come to use Baseten, um, you bring your models, you load them up onto the Baseten inference stack.
[15:07] Um, and you basically choose which compute you want to run it on.
[15:12] Um, and a Baseten H100 or Baseten B200 is more expensive than a raw B200.
[15:18] I think what's interesting there to think about, you know,
[15:21] um, is that it is an expensive way to get raw compute, but ideally the value you're getting is the software stack.
[15:30] Um, others use token pricing.
[15:34] You know, we have some customers, and,
[15:36] you know, we are relatively unsophisticated about this today, um, but, you know, we're moving towards a world, especially as we unlock more of those post-training workflows within Baseten itself, where,
[15:48] you know, the narrative changes from how much you are paying for compute to how much you are paying per token,
[15:54] and actually the app companies appreciate that a bit more, because especially for those coming from using Anthropic or OpenAI,
[16:01] um, they have a like-for-like on how much cost savings there are.
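
The like-for-like comparison described here amounts to converting an hourly GPU price into a per-token price. The sketch below shows one way to do that conversion; the hourly price, throughput, and API price are placeholder assumptions, not figures from the talk or from any provider's price list.

```python
def cost_per_million_tokens(gpu_hour_price: float, tokens_per_second: float) -> float:
    """Convert an hourly GPU price into a price per one million generated tokens.

    Assumes the GPU is fully utilized at a steady `tokens_per_second`.
    """
    if gpu_hour_price < 0 or tokens_per_second <= 0:
        raise ValueError("price must be non-negative and throughput positive")
    tokens_per_hour = tokens_per_second * 3600.0
    return gpu_hour_price / tokens_per_hour * 1_000_000


def savings_vs_api(api_price_per_million: float, own_price_per_million: float) -> float:
    """Fractional saving of self-hosted per-token cost relative to an API price."""
    if api_price_per_million <= 0:
        raise ValueError("API price must be positive")
    return 1.0 - own_price_per_million / api_price_per_million


# Placeholder numbers: a 4.00/hour GPU sustaining 1,000 tokens/second,
# compared with a hypothetical API price of 10.00 per million tokens.
own = cost_per_million_tokens(gpu_hour_price=4.0, tokens_per_second=1000.0)
print(f"self-hosted: {own:.2f} per million tokens")
print(f"saving vs API: {savings_vs_api(10.0, own):.0%}")
```

Real utilization is rarely 100%, so the per-token figure from this kind of calculation is a lower bound on cost.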
[16:06] Fascinating. Um, I might follow up on something you said. You said post-training workflows.
[16:11] Um, could you, um, simplify that for us?
[16:15] What does a post-training workflow look like, start to end?
[16:20] Um, and feel free to take an example if you can.
[16:21] Um, how does it work?
[16:24] How does it work with Baseten, or how does it work with, you know, uh, a large customer like Cursor or Abridge or Wispr Flow?
[16:30] Yeah, like
[16:32] you know we talk a bit about post training there um right here.
[16:38] Yeah. So we have training um and so really what you're doing as a customer there is you're you're designing what your utility function is.
[16:44] Mhm. and you're saying, "Hey, this is the thing that I want to optimize with this given model."
[16:49] So, you you need to define that. Like, we can't help you there. We don't know your product. We don't know your users. We don't know what you're optimizing for in your business.
[16:58] Mhm.
[16:58] So, instead, you decide what your what the utility function is.
[17:03] Mhm. And then once you know that, start giving us a bunch of data and choose an open source model and we kind of give you all the scaffolding to turn that into a a post-trained model.
[17:15] Um, and then once you have that, you know, how does that roll into the inference piece?
[17:20] Um, and so we kind of own that entire loop. So you bring your data, you bring your utility function.
[17:25] Um, you choose a base model. So, maybe just to make it very concrete, you might say, um, let's say you're building a speech to
[17:34] text model for a medical use case.
[17:38] You choose, you know, and maybe it is errors, or transcription errors, the thing you're trying to minimize.
[17:47] So that's your utility function. You choose the model you want to use, so let's say you could use a Kimi K2.5,
[17:52] and then you give us a dataset.
[17:55] We have all that scaffolding set to turn that into a very, very good post-trained Kimi K2.5
[18:01] made for that, and then we obviously have all the, um, integration into inference
[18:07] predefined. And so our customers who are coming to us with that, and I think you're going to see more and more of this,
[18:14] um, really, they come with data and what they know about their workflows, and they leave with a post-trained, specialized model running on Baseten.
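
To make the "utility function" in this example concrete: for a transcription use case, one common choice is word error rate (WER), the word-level edit distance between the model's transcript and a reference, divided by the reference length. The sketch below is an illustration under that assumption. The function names are hypothetical and this is not Baseten's API; the talk says only that the customer defines the objective.

```python
from typing import Iterable, Tuple


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the ref words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def mean_wer(pairs: Iterable[Tuple[str, str]]) -> float:
    """Average WER over (reference, hypothesis) pairs: the quantity to minimize."""
    scores = [word_error_rate(ref, hyp) for ref, hyp in pairs]
    if not scores:
        raise ValueError("need at least one pair")
    return sum(scores) / len(scores)


# One dropped word out of five reference words.
print(word_error_rate("the patient has a fever", "the patient has fever"))
```

A post-training run would then compare this score for the base model and the post-trained model on held-out data the customer supplies.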
[18:26] Mhm.
[18:28] They trust you with all that data, the keys to the kingdom.
[18:32] Yeah.
[18:33] Is that a natural conversation for them?
[18:35] to hand over the keys to the kingdom to Baseten, or the East India Company?
[18:39] Um, you know, we're the West India Company.
[18:41] We're the, you know, we're good, you know, we're the rebellion, right?
[18:48] We're trying to arm the rebellion. Um, so is that a natural conversation?
[18:50] I think it's fine. I think most people are okay with that.
[18:54] You know, I think we have a track record of working with amazing companies, and the brand kind of does a lot of work there, but, you know, we also have an incredibly intense security posture internally, right?
[19:06] And, you know, look, we work with, um, multiple competitors, and we have access to, you know, their models and their data, and, you know, we have set up the boundaries internally to make sure that there's no leakage between them.
[19:20] Um, and so is it natural?
[19:22] I think it's not as bad as you'd think.
[19:24] You also got to remember that a lot of these companies are trying to move very, very quickly.
[19:31] Yeah. And I think they're more interested in "can you solve their user problem" as opposed to "we need to do
[19:36] everything from scratch."
[19:37] Mhm.
[19:38] Mhm.
[19:39] I think there's a lot of trust they put in Baseten, but hopefully that's earned, um, from all the great customers we work with and their friends who are already using us for a lot of these things.
[19:49] Makes sense.
[19:49] Um, the other big bet you guys are making implicitly is that open source will stay on the frontier, or at least 3 to 6 months behind the frontier.
[19:59] You know, the economic model of open source frontier labs is TBD.
[20:08] TBD.
[20:08] TBD, I was about to say TBD, yeah, is not known, and, you know, there's a lot of, um, at least Western open source labs, like obviously Meta, that have decided to not open source their latest models anymore.
[20:23] Talk talk about that for a second.
[20:24] What is the business model of open source?
[20:26] What is your best guess of how the world unfolds?
[20:28] Uh because that is obviously one of the biggest underlying bets that you've made.
[20:33] Yeah.
[20:33] Um, look, I think the two big bets of Baseten are the existence of an
[20:39] independent application
[20:40] layer.
[20:41] >> Yeah.
[20:41] >> And will open source be good enough that
[20:44] you can continue to post train most of
[20:46] like America so far has shown that
[20:48] >> it it is not able to produce the
[20:51] best-in-class at least in the last two
[20:53] years
[20:54] >> open source models like the best open
[20:55] source models today are coming
[20:57] >> that's wild
[20:58] >> from China. I I think
[20:59] >> why is that?
[21:00] >> Why can't they produce them?
[21:01] >> Why can't
[21:03] >> America produce open source models that
[21:05] perform it?
>> I mean, I think, you know, there's
all the best researchers in the world right
now working at two companies and, you
[21:10] know, I don't think they're despite
[21:12] their names maybe suggesting otherwise.
[21:15] Um, they're not super motivated
[21:17] >> um to produce that. Um,
[21:19] >> I I I think I think um
[21:22] >> and why have the non-American
[21:24] researchers decided to pursue the open
[21:27] source model? I mean I I I think it's
[21:29] it's to be relevant probably to start
with, right? Like, you know, we
think like Moonshot and Alibaba and,
um, and, uh, MiniMax, you know, there's just
[21:43] a market opportunity
[21:45] >> if all to take the counter position
[21:48] >> of of of closed
[21:49] >> of closed models.
[21:51] >> Mhm.
[21:52] >> Um and Meta did that too, right, a few
[21:55] years ago. they they they famously have
[21:57] swung the other way.
[21:59] >> Yeah.
[21:59] >> Um, look, we we open source models need
[22:02] to exist.
[22:03] >> Like I think it's somewhat of a of a of
[22:06] a
[22:07] a matter of national security. If
[22:09] America does not have
[22:11] >> good open source models or if the cost
[22:12] of intelligence is 70 or 90% cheaper
[22:15] >> um in the east than it is the west like
[22:17] that, that that's probably not a good
[22:18] outcome. Yeah.
[22:19] >> Um I I do think you know um you know the
[22:23] the the best American companies are um
[22:28] are actually pretty well aligned around
open source. You know, Google produced
Gemma,
>> um, Nvidia's putting a lot of investment
into the Nemotron
>> family. Um, Reflection AI,
>> um, you know, hopefully will come out with
a good open source model
>> in a few years from now. I
think it's an inevitability though.
[22:49] realistically like you know I if if
[22:51] we're in a world in 10 years from now
[22:52] where there's no good open source models
[22:55] >> um you know that's probably a very bad
[22:58] thing
[22:58] >> right
[22:59] >> um for the US but um I think I but I do
[23:03] believe there's enough investment
[23:04] happening that it's going to happen
>> Did you, uh, I'm sure you did, Anthropic
[23:08] put out a big post this morning
[23:10] >> I I saw that
[23:10] >> the the the the future of the Chinese
[23:13] and American uh AI relationship of
[23:15] course very beneficial for Anthropic
[23:18] any any thoughts on that? Actually,
[23:20] maybe summarize the post for the class.
[23:22] So, how do you how do you
>> I haven't read it, I've been back to back
since the morning, but my read on it
was, like, there's two scenarios. One is
that America's at the frontier.
[23:30] Yeah.
[23:30] >> And the other one is
[23:31] >> that um China China um is is is neck and
[23:37] neck.
[23:37] >> Yeah.
[23:38] >> More or less. And I think I think and
[23:40] again maybe you've read it. Um
[23:42] >> no, that's that's exactly it. They
[23:43] basically said, "We either we lead it
[23:45] and you you shut it down or we're neck
[23:47] and neck and that's a war."
[23:49] >> Yeah. And and and their recommendation
[23:50] was let's shut it down. So shut down so
[23:52] there's not a war.
[23:53] >> That's right.
[23:55] >> Look, they're awesome and like, you
[23:56] know, they do amazing work. I think
[23:58] it's, you know, I I genuinely believe in
[24:01] life before you make any points, you
[24:03] should make your biases and your and
[24:04] your incentive functions very clear.
[24:07] >> Um I don't think they acknowledge that.
[24:08] >> Yeah. Um, look, I I I I don't know what
[24:12] to make of that. I I also just don't I
[24:14] just think, you know, we put out a post
[24:16] yesterday about the world of many
[24:17] models. We think intelligence shouldn't
[24:19] be owned by two people. I think in in in
[24:21] the history of time when too much power
[24:22] has been concentrated
[24:24] >> in one or two parties, bad things
[24:27] >> generally happen. And I also just don't
[24:29] believe that thesis to some extent that,
[24:31] >> you know, these two companies with a
[24:33] massive profit motive should be the
[24:34] arbiters of morality for the rest of us.
[24:36] Um
[24:38] the [snorts] yeah [laughter]
[24:40] so so like you know when I think about
[24:43] like do do I think that it is it is
[24:46] concerning that all the best open source
[24:49] models come from China 100%. Like you
[24:51] know and they're great you know we we
[24:53] we've spent time with all those teams
[24:55] and they're amazing. It's not so much
[24:57] that I don't trust them. I just think
[24:58] that
[24:59] >> you know America should have great open
[25:01] source as well.
[25:02] >> Yeah. Yeah. Yeah. It seems do do you
[25:04] think the velocity of uh American open
[25:06] source is increasing or decreasing over
[25:08] time in terms of talent and and
[25:10] resources capital
[25:12] GPUs?
[25:13] >> Yeah. Yeah. Oh, I think up until like a
[25:16] year ago was definitely decreasing.
[25:18] >> Okay.
>> But I think there's an uptick
>> now. Look, I think some very, very
[25:24] important companies are behind this and
[25:26] I think we'll see a lot this year
[25:28] >> going the other way.
[25:29] >> Yeah. Yeah. Yeah. I think it's necessary
[25:31] because otherwise we're just going to
[25:32] have two companies left.
[25:34] >> Right.
>> There'll be no app layer.
[25:36] >> Right.
[25:36] >> Yeah.
[25:37] >> Right. Right. Right. Right. Um switching
gears a little bit too. You are in a
very, um, advantaged position where you
[25:43] see a lot of different hardware
[25:45] >> that you run inference on. Right.
[25:47] >> Well, actually you run in two layers, a
[25:49] lot of different providers. Actually, I
[25:51] think you had this map um which we which
[25:54] we can talk about for a second,
[25:56] >> but I actually meant a layer underneath
[25:58] this, you know.
[26:00] >> Yeah,
[26:00] >> I assume a lot of the inference that you
[26:02] guys are delivering is on Nvidia
[26:03] hardware just because it's the most
[26:05] >> majority the majority. Yeah.
[26:06] >> Prevalent,
>> but then there's a long tail, you know,
obviously, um, Trainium announced in the
last earnings call they had about $20
billion of revenue run rate. Um, TPU, I'm
sure, is a big number. We don't know
what it is. They haven't
disclosed it, at least. Then there's
Cerebras from this morning. Um, then
there's everybody else: Etched, MatX,
Positron, others that we do not know
about.
>> Uh, d-Matrix, SambaNova, etc. Talk
about the heterogeneous compute
[26:32] ecosystem.
[26:33] How do you see the world unfolding on
[26:35] the on the hardware side?
[26:37] >> Yeah. Um, look, we we are like I I I
[26:40] think in general diversity like I I
[26:42] can't I can't sit here and say that I
[26:44] believe in a world of many models and
[26:46] then say, "Oh, only one chip will
[26:48] exist." That that'd be hypocritical. Um,
[26:50] but that being said, um, [laughter]
[26:53] we we are we run the majority of our
[26:55] fleet on Nvidia chips. Um,
[26:59] >> like TPUs are very promising. Mhm.
[27:01] >> Um, obviously all the new age the neo
[27:04] chips if you want to call them that
[27:06] >> are very promising.
>> Um, you know, we haven't
seen anyone really use Trainium at
[27:12] scale. I think the advantage you have
[27:15] >> with Nvidia is just obviously a fleshed
[27:17] out supply chain.
[27:18] >> Yeah.
[27:18] >> A fleshed out supply chain.
[27:20] >> Um, you have and you know
[27:23] >> with a very very low cost of capital and
[27:24] a very strong relationship with TSMC.
[27:27] >> Yeah. Um I I think that the the idea
[27:30] that anyone else can really compete with
[27:32] that today
[27:33] >> is somewhat fanciful.
[27:34] >> Yeah.
[27:35] >> Um you know to go one layer deeper like
[27:38] one thing we rely very very heavily on
[27:40] is CUDA
[27:41] >> and the developer ecosystem.
[27:43] >> Mhm.
[27:44] >> Um around CUDA
[27:46] >> um there's nothing like CUDA like CUDA
[27:49] is insane. Um I think you know these you
[27:53] know these new architectures of breaking
[27:55] out you know inference has like two core
[27:58] parts to it which is one is prefill
[28:00] one's decode.
[28:01] >> Um
[28:02] >> you know this idea that all this will
[28:05] also just run on one chip.
[28:07] >> Mhm. which is historically what's
[28:09] happened
[28:09] >> probably isn't isn't isn't
>> the end state, but, you know, Nvidia
acquired Groq.
>> Um, what a lot of these, um, these new
chips you're talking about do is try to
separate out
>> these concerns of
>> decode and, like, you know,
>> where you do the memory-bound stuff all
on the GPUs and you do
>> the compute-bound stuff on,
>> um, on a different chip, so, like, the
prefill separating out. So I think that is
where the world will go, like,
heterogeneous architectures
[28:40] >> for for these things. I think right now
[28:43] for us as a company we're just trying to
[28:44] move as fast as possible and our
[28:45] customers are trying to move as fast as
[28:47] possible
[28:48] >> and CUDA
[28:49] >> availability of Nvidia GPUs um the
ability to use stuff like TensorRT-LLM, which is
[28:55] clearly just built for Nvidia chips.
[28:57] >> What is that?
>> Um, it's a runtime, it's an open source
runtime that runs on top of Nvidia
chips. It's developed by Nvidia. Even
vLLM and SGLang, which are two other open
[29:05] source
[29:06] >> um frameworks are you know native
[29:10] >> to to Nvidia and so the ability to just
[29:14] move very very fast
[29:16] >> um is is the thing we are optimizing for
[29:18] and that's what Nvidia's really good at
[29:20] today.
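
The prefill/decode split mentioned earlier in this answer can be illustrated with a rough roofline-style estimate. The sketch below assumes a dense transformer where each forward pass reads every weight once and spends about 2 FLOPs per parameter per token, and it ignores attention and KV-cache traffic. The hardware numbers are hypothetical placeholders, not the specs of any real chip.

```python
def arithmetic_intensity(tokens_per_pass: int, bytes_per_param: float = 2.0) -> float:
    """Approximate FLOPs per byte of weights read in one forward pass.

    FLOPs are about 2 * params * tokens and bytes are about params * bytes_per_param,
    so the parameter count cancels out.
    """
    if tokens_per_pass <= 0 or bytes_per_param <= 0:
        raise ValueError("tokens_per_pass and bytes_per_param must be positive")
    return 2.0 * tokens_per_pass / bytes_per_param


def bottleneck(tokens_per_pass: int, peak_flops: float, mem_bandwidth: float) -> str:
    """Classify a pass as compute-bound or memory-bound against the chip's ridge point."""
    ridge = peak_flops / mem_bandwidth  # FLOPs per byte where the two limits meet
    if arithmetic_intensity(tokens_per_pass) >= ridge:
        return "compute-bound"
    return "memory-bound"


# Hypothetical accelerator: 1e15 FLOP/s peak, 3e12 bytes/s memory bandwidth.
PEAK_FLOPS = 1e15
MEM_BANDWIDTH = 3e12

# Prefill processes the whole prompt in one pass; decode emits one token per pass.
print("prefill, 4096-token prompt:", bottleneck(4096, PEAK_FLOPS, MEM_BANDWIDTH))
print("decode, 1 token per step:  ", bottleneck(1, PEAK_FLOPS, MEM_BANDWIDTH))
```

Under these simplifications prefill lands on the compute side and single-stream decode on the memory side, which is the motivation for running the two phases on differently balanced hardware. Batching many decode requests together raises the decode intensity and shifts that picture.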
[29:21] >> Gotcha. Okay. So so a fairly
[29:26] >> one-sided play there for now.
[29:28] >> Yeah.
>> With hopes for an optimistic, uh, future. Um,
[29:31] maybe a layer above that too and you
[29:33] know talk about this map a little bit.
[29:34] You're working with so many different
[29:37] >> clouds. Yeah. [snorts]
[29:37] >> Clouds. Uh some of these have even
[29:40] announced an inference platform of their
[29:42] own.
[29:43] >> There's no reason for others to not do
[29:44] so.
[29:45] >> Uh maybe talk about that like it seems
[29:48] to us through the class a lot of
[29:49] speakers have told us that compute is
[29:50] very scarce.
[29:51] >> Yeah.
[29:52] >> Um
[29:53] >> probably true. How are you getting your
[29:54] compute and what's the strategy going
[29:57] forward?
[29:57] >> Yeah. Look, we we work on top of 18
[29:59] clouds. I think it's 20 now actually.
[30:02] >> Um we and we have 87 different clusters
[30:04] where we're we're stitching compute
[30:06] together. This is the core um I'd say
[30:09] piece of technology that we have is that
[30:11] we have the ability to take a bunch of
[30:12] different GPUs from different places and
[30:15] stitch it together in one place
[30:16] >> and make GPUs fungible
[30:19] >> in that way. Um the reason we do this is
[30:21] twofold. one it's it's access
[30:24] >> you know, it is very very hard to
[30:26] find GPUs and
[30:28] >> and if you add a constraint around which
[30:30] clouds they operate in that makes it
[30:32] even harder
[30:33] >> and so what we do is that we take GPUs
[30:35] from anywhere we can stitch them
[30:37] together and abstract that away
[30:39] >> from the customers
[30:40] >> um so that's the first thing, and that's a
[30:43] core part of the strategy and we'll
[30:44] continue to do that we will rent and we
[30:46] will own um
[30:47] >> we will rent versus own
[30:49] >> own yeah for the for the most part but
[30:51] we will also build our own ownership.
[30:53] >> Okay, we'll come to this in a bit. Yeah,
[30:56] >> but in terms of why you know them going
[30:59] after and a lot of those platforms
[31:01] having their own inference
[31:03] >> things,
[31:04] >> it makes sense. You know, there's a lot
[31:06] of value in that stack
[31:08] >> in that inference stack that sits on top
[31:09] of GPUs.
[31:11] >> Um,
[31:12] >> you know, we we we welcome it. You know,
[31:14] we will partner with our competitors.
[31:17] We're fine with that. We we we stand
[31:18] behind the value that we we are we are
[31:20] providing.
[31:21] >> Mhm. But it makes sense. You know,
[31:23] inference is very very sticky.
[31:25] >> Got it.
[31:25] >> I think truly like a lot of people think
[31:27] that inference is a commodity.
[31:28] >> I mean, you know, you you've obviously
[31:30] studied these markets.
[31:31] >> Yeah.
[31:31] >> Um it it it just resembles the way that
[31:34] people used to buy databases.
[31:36] >> Yeah.
[31:36] >> To some extent. You you choose once and
[31:37] you kind of just grow there.
[31:39] >> Yeah. Well, it's also, you know, just
[31:41] was was with one of the uh great
[31:43] founders you had listed on here this
[31:45] morning. They said that inference is so
[31:47] sticky for us because ultimately this is
[31:49] the product that we deliver to our
[31:50] customers. Yeah,
[31:51] >> we don't want any disruption in
[31:53] [clears throat] the core service that
[31:54] our customers are consuming. A
[31:56] disruption in that is actually you know
[31:59] sometimes even the biggest cost line
[32:01] item even more than their cloud spend.
[32:02] >> Yeah.
[32:03] >> Is is the inference the intelligence.
[32:04] >> Yeah.
[32:05] >> So I imagine a prize this large will
[32:07] be fiercely competed for
[32:09] >> 100%. And and I think, you know, once
[32:11] you're in the token path, like if your
[32:13] inference is down your product's down.
[32:15] >> Yeah.
[32:16] >> You don't want to mess with that.
[32:17] >> Yeah. Um I put a bookmark on renting
[32:21] versus owning.
[32:22] >> Yeah.
[32:22] >> Can you talk about that? You know, we've
[32:23] had um Chase from Crusoe here who who who
[32:26] talked us through the whole economics of
[32:28] building a
[32:29] >> data center of owning the whole thing.
[32:31] >> You have taken the other side of it
[32:33] which is to rent the
[32:35] >> till now. Yeah.
[32:36] >> For now. Okay. For now.
[32:37] >> Yeah.
[32:38] >> Talk more about that. Uh it seems like
[32:40] an important decision uh and very
[32:42] different shape of the business.
[32:43] >> Yeah. Well, you know, for our our thesis
[32:45] was to some extent what what is the
[32:46] sticky part of the stack, right?
[32:48] >> And so our, you know, we would say that
[32:51] software is the sticky part of the stack
[32:52] here.
[32:53] >> Mhm.
[32:53] >> Um I think what what you know, a lot of
[32:55] these clouds who are competing with
[32:56] us would say is that access to GPUs
[32:59] >> is is the sticky part of the stack.
[33:01] >> Mh.
[33:02] >> And you know, once you have that, the
[33:03] software part is easy. I think it's
[33:05] probably arrogant both ways.
[33:06] >> You know, I think they're slightly
[33:07] arrogant to think the software piece
[33:09] they could do they could do it very
[33:10] easily. I think we are 100% arrogant to
[33:12] be like oh we can do that too.
[33:13] >> Um
[33:14] >> to some extent like we we did that for
[33:17] speed and we did that because we had no
[33:19] we had no business building data
[33:21] centers.
[33:22] >> Um for us going forward though it's
[33:26] becoming clear that you know
[33:30] what I what I would say is that no
[33:32] matter how much people tell you that
[33:35] there's a supply problem
[33:36] >> Yeah.
[33:37] >> it is 10 times worse. [laughter] Uh, you
[33:39] know, I I think you saw in the GC G in
[33:42] the Google earnings, they had the GCP.
[33:44] Did you see this chart?
[33:45] >> I think they've got like a 10 10x
[33:46] backlog or something.
[33:48] >> Um, if you go out right now saying you
[33:50] want 1,000 GPUs, truly you're probably
[33:52] seeing people talking about Q2 next
[33:54] year.
[33:55] >> Q2 next year. So 12 months out, maybe 15
[33:57] months out.
[33:58] >> 15 months out. Um, we we have a um we
[34:02] have a cluster, a small cluster relative
[34:04] now. At the time it was big. um in one
[34:07] of in one of these clouds um that of
[34:10] B200s, Blackwell chips. B200's like a great
[34:13] chip, um, it's a bit old but still it's an
[34:16] amazing chip. Um, our our unit price right
[34:19] now is $2.63 an hour
[34:22] >> Mhm. for that and like it doesn't matter
[34:23] what that is because just the
[34:25] relative piece of it matters more. Um
[34:29] that's up for renewal in October
[34:32] >> and they came to us already in May and
[34:35] said $5.10 is the new price.
[34:37] >> Wow.
[34:38] >> For next year
[34:39] >> double
[34:39] >> double.
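As a quick check on "double": reading the two quoted figures as $2.63 and $5.10 per GPU-hour (an assumption, since the unit is not spelled out in the talk), the renewal quote works out to a bit under 2x. The variable names below are illustrative only.

```python
# Renewal-quote arithmetic. The per-GPU-hour dollar unit is an assumption.
current_price = 2.63   # assumed $ per GPU-hour on the existing contract
renewal_quote = 5.10   # assumed $ per GPU-hour quoted for next year

ratio = renewal_quote / current_price
increase_pct = (ratio - 1.0) * 100.0

print(f"renewal is {ratio:.2f}x the current price (+{increase_pct:.0f}%)")
```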
[34:41] Um this is fine.
[34:42] >> Are you going to take it?
[34:43] >> No, absolutely not. The um the um that's
[34:45] that's egregious. Um the um and we have
[34:48] a bunch of other clouds that we can work
[34:50] with here. Um I think though like what
[34:53] is becoming very important is that um
[34:55] there's there's there was a slide that
[34:57] was doing the rounds on Twitter. There
[34:59] was a slide that was doing the rounds on
[35:00] Twitter of um of Noam Brown
[35:03] >> from OpenAI
[35:04] >> who I think is you know close to the
[35:06] Altimeter folks as well. Um he came up
[35:08] with the o3 model and and the slide
[35:10] basically said that access to compute
[35:12] is the strategic advantage for
[35:14] inference. that's becoming clear for us
[35:16] is that look we we we run in in you know
[35:20] right now I think the Baseten um
[35:22] inference service across all our
[35:24] customers is bigger than the OpenAI API,
[35:26] is bigger than Gemini
[35:27] >> wow
[35:28] >> in terms of how many tokens it does so
[35:29] we do around 30 trillion tokens a day
[35:32] >> the Baseten inference service is bigger than
[35:35] OpenAI's API service
[35:36] >> so not not ChatGPT, the API product
[35:39] >> their API product okay
[35:40] >> yeah at least in last report and it's
[35:42] definitely bigger than the Gemini the
[35:44] Gemini thing and So if you project that
[35:46] out, we'll need around 150,000 B200
[35:49] equivalents.
[35:50] >> Mhm.
[35:50] >> In 2 years from now.
[35:51] >> Mhm.
[35:52] >> That's just insane amount of compute.
[35:54] >> Could you could you translate that in
[35:55] dollars for the class?
[35:57] >> Yeah. Yeah. So um it's about $7 billion
[36:01] of compute spend.
[36:02] >> Easy.
[36:03] >> $7 billion of compute spend. Um too big.
[36:05] It's it's it's a scary amount.
[36:07] >> Yeah.
[36:07] >> Um the the the idea that we're going to
[36:09] be able to get access to that by renting
[36:11] >> Mhm.
[36:12] >> is probably, you know, it's probably not
[36:14] going to happen. And the only way we can
[36:16] guarantee that is is knowing that we can
[36:18] we have a strong relationship with the
[36:19] the chip providers and we can go buy
[36:21] them and put it up ourselves. And that
[36:22] that's the reason we'll do it is access
[36:24] to be able to fulfill the demand that
[36:27] we have for inference.
[36:28] >> And there's also an economic advantage
[36:30] to do it. It's about 30% cheaper. Like I
[36:32] think if you think about a scaled out
[36:34] cloud like Oracle,
[36:35] >> it's like what 30% gross margins.
[36:37] >> Yeah.
[36:37] >> Um so it's about 30% cheaper to do that.
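The figures in this exchange can be tied together with some rough arithmetic. The talk does not say what period the $7 billion covers or what utilization it assumes, so the sketch below assumes one year of round-the-clock use. Those assumptions and all variable names are illustrative, not the speaker's model.

```python
# Back-of-the-envelope check on the quoted compute budget.
# Assumptions (not stated in the talk): the spend is annual and every
# GPU is rented for every hour of the year.
HOURS_PER_YEAR = 24 * 365

b200_equivalents = 150_000     # quoted need two years out
compute_spend = 7.0e9          # quoted spend in dollars, assumed annual
ownership_discount = 0.30      # quoted "about 30% cheaper" when owning

implied_rate = compute_spend / (b200_equivalents * HOURS_PER_YEAR)
owned_spend = compute_spend * (1.0 - ownership_discount)

print(f"implied rental rate: ${implied_rate:.2f} per GPU-hour")
print(f"same capacity if owned: ${owned_spend / 1e9:.1f}B")
```

Under these assumptions the implied rate comes to a little over $5 per GPU-hour, and the 30% saving is worth roughly $2 billion.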
[36:40] >> Yeah. Vertical integration. Yeah.
[36:41] >> Yeah.
[36:42] >> Fascinating. So the shape of the
[36:43] business is going to change quite a bit
[36:44] over time.
[36:45] >> Yeah.
[36:46] >> But you want to own your destiny.
[36:48] >> There's no other way.
[36:49] >> Yeah.
[36:49] >> Like you know the if if you think about
[36:52] like what is what is the what the now
[36:54] I'll give you the third risk but I'll
[36:57] tell them the three risks right which is
[36:59] like open source the app layer and don't
[37:02] have access to enough compute and uh and
[37:05] that's the core risk of the business and
[37:07] like if we don't take care of that we'll
[37:08] be in a lot of trouble.
[37:09] >> Yeah makes sense. Um that makes sense.
[37:13] You know the um
[37:16] actually follow up to the question you
[37:18] said this $2.63 per hour renewing at
[37:22] >> $5.10 seems egregious.
[37:24] >> Yeah.
[37:25] >> Where did you settle at?
[37:26] >> Oh yeah. We we haven't responded yet.
[37:29] It's in September. You know we don't
[37:30] need to that.
[37:31] >> Yeah. Actually the the the question
[37:33] behind the question is when should we
[37:35] expect the compute scarcity to to get
[37:39] back to normal? Is this like a 12-15
[37:42] month thing? Is it a multi-year thing?
[37:43] Is it a multi-decade thing? Like what's
[37:45] your best I'm sure you guys have done
[37:47] some Yeah.
[37:48] >> thinking or economic modeling around
[37:49] this?
[37:50] >> I I I I
[37:52] I'm curious what you guys think. You
[37:53] guys I'm curious what your public's
[37:55] people say about this. I don't think
[37:56] it's ever going to normalize.
[37:58] >> Yeah.
[37:58] >> You know, like I I think it's what you
[38:00] said earlier. It's like if inference
[38:01] demand is a billionx what it is today,
[38:05] >> um like two two different things are
[38:08] happening, right? which is model like
[38:10] the applications are getting more
[38:11] agentic
[38:12] >> and models are getting bigger
[38:14] >> which, both of those things say
[38:16] there's going to be a lot more inference
[38:18] which just means there's going to be a
[38:19] ton more compute.
[38:20] >> Yeah.
[38:21] >> Um you know like I think the analogy
[38:24] really is is like when you show up at
[38:25] JFK
[38:27] at 5:00 a.m. every day there's no line.
[38:30] You go straight through.
[38:31] >> Mhm.
[38:31] >> But you know by 8 a.m. there's a line
[38:33] out the door.
[38:34] >> Yeah. Um, and we're out the door right
[38:36] now, but unlike JFK, which closes down
[38:39] from like 11:00 p.m. to 4:00 a.m., we
[38:42] don't have any reset period.
[38:44] >> Yeah.
[38:44] >> And so, it just keeps compounding.
[38:46] >> Yeah.
[38:47] >> Yeah.
[38:48] >> Actually, that's an interesting analogy.
[38:49] Does the inference demand die down
[38:51] between midnight and 4:00 a.m. And are
[38:53] you able to reuse that somehow?
[38:54] >> Well, but there, you know, it's it's,
[38:56] you know, it's, you know, what is it
[38:58] like when it
[38:58] >> it's afternoon somewhere else?
[38:59] >> I mean, there's like the thing, right?
[39:00] It's five, it's 5:00, it's beer o'clock
[39:02] somewhere. It's like, you know, you can
[39:03] you can justify drinking a beer in any
[39:05] time because it's 5 PM somewhere.
[39:07] >> Got it.
[39:07] >> I think that's the same thing with
[39:08] inference where it's like even if demand
[39:10] here is down in Europe it goes up, in
[39:12] China it goes up.
[39:13] >> Yeah. Yeah. Yeah. Makes sense. Um I'm
[39:17] I'm going to shift gears a little bit.
[39:19] Um you know, one of the biggest things
[39:22] we've been discussing in class, a lot of
[39:24] the students here will go on to build
[39:26] build businesses, start businesses, work
[39:28] at businesses. Um, if you were not
[39:30] building Baseten right now, what's
[39:32] your next best idea? What would you go
[39:34] start? What would you go build?
[39:36] >> Um, that's a good question. Like, like I
[39:39] would be I'd be going more the Crusoe
[39:41] route to be honest. I'd be I'd be going
[39:43] and investing in energy and power.
[39:46] >> Mhm.
[39:46] >> Um, and like you know, I think the
[39:48] buildout is like we're just going to
[39:50] need so much space to put this compute.
[39:53] Mhm.
[39:54] >> Um what I I think like one of the one of
[39:57] the ideas I'm most excited about um
[39:58] which I hope someone does, maybe someone
[40:00] here will do is um you know I was I was
[40:04] driving over from Oakland
[40:06] >> to San Francisco the other day and you
[40:08] always see the ports there and you see
[40:09] the containers there you like containers
[40:11] are one of the biggest economic drivers
[40:14] >> in history
[40:15] >> because they normalize they normalize
[40:17] the unit of trade. the um and so
[40:21] you know my idea like one of the things
[40:23] I'm really excited about is modular data
[40:24] centers
[40:25] >> modular data center
[40:26] >> data centers which is like take like and
[40:28] standardize the unit of compute
[40:30] >> because whoever standardized units of
[40:32] compute now now we can just like really
[40:35] industrialize
[40:37] >> like if you go and talk to all the folks
[40:38] who are putting up data centers right
[40:39] now
[40:40] >> and the people who own this pallet space
[40:42] >> everything is different
[40:44] >> um so to the extent that you can
[40:46] fascinating
[40:46] >> you you can modularize the the
[40:49] >> and have a shared consistent format.
[40:51] >> You're creating an API for compute and
[40:52] then you can then like a whole industry
[40:54] will start around that.
[40:55] >> Yeah.
[40:55] >> Cuz then you can have you know the same
[40:57] people servicing these things all over
[40:59] and you know you create and you know
[41:00] that that's probably what I'd be working
[41:02] on.
[41:02] >> Fascinating. Fascinating. Well um I'm
[41:05] sure folks have taken that taken that
[41:07] note down. Um relatedly um outside of
[41:10] that is there a you know we play this
[41:11] long short game with every speaker. Is
[41:13] there is there a business is there a
[41:14] startup? Is there a founder that you're
[41:16] particularly excited about right now or
[41:18] and the other way you think is more more
[41:20] hyped than than they deserve?
[41:22] >> I'd rather not play that one, but you
[41:24] know, for the for for the former look, I
[41:26] I I think you know, all the things I
[41:28] said, right, the um the [laughter]
[41:31] um the but yeah, no, like I I think you
[41:34] know, the whole any any part of the
[41:35] buildout.
[41:36] >> Yeah.
[41:36] >> Um and then, you know, in terms of
[41:39] short, look, I I I think I'd be looking
[41:42] at companies that just aren't
[41:43] innovating. Actually, I actually think
[41:44] the enterprises are fine.
[41:46] >> Enterprises are fine.
[41:47] >> Fine. Like, but I think they need to
[41:49] figure out how to take the thing that is
[41:51] unique to them.
[41:52] >> Yeah.
[41:52] >> And post-train models and go do
[41:54] inference on them. Um the but but I
[41:56] think the ones that aren't doing that,
[41:58] I'd be very very
[42:00] >> scared about.
[42:00] >> Yeah. Yeah. A lot of uh students right
[42:03] now thinking about what to study,
[42:05] particularly those who are starting
[42:06] their educational journey. Uh what
[42:08] advice would you have for folks? It
[42:10] feels like uh quicksand right now. Yeah,
[42:13] I I you know
[42:16] I don't know. I actually think you
[42:18] should just study whatever that's fine.
[42:19] like you know they they're like you know
[42:21] oh man such a such a I think you know
[42:24] >> yeah you change careers
[42:25] >> change careers you know and like you
[42:27] know the you could become an expert in
[42:28] anything in six months so just do the
[42:30] things that's fun you know like I I
[42:32] don't but but I think in terms of like
[42:34] >> you know if you if you are you know like
[42:37] actually financing models of of you know
[42:40] I'm spending a lot of time thinking about
[42:42] you know how to finance data centers
[42:44] right now
[42:45] >> okay
[42:45] >> just think you know something I'm going
[42:47] deep on and you know I think you the
[42:49] project financing stuff is actually
[42:51] pretty
[42:52] nifty there and so study. Yeah.
[42:54] >> Yeah. Fascinating. Well, we'll open it
[42:56] up for questions. We've got five more
[42:58] minutes. Uh go ahead.
[43:01] >> What's your take on like the futures?
[43:04] >> Oh, I I don't know. Like it's obviously
[43:07] interesting. Um I I I think the market
[43:12] is way too um what's like the it's way
[43:18] it's if you if you went like saw how
[43:22] compute deals are done it it's basically
[43:26] a drug it's a drug market like tr truly
[43:29] it's like you have a guy and um
[43:32] >> who's your guy we have so many guys you
[43:35] know
[43:37] the and Like there's a guy at our office
[43:39] who kind of just sits in the corner and
[43:41] he just call he just calls people all
[43:43] day asking for compute it's dead. Yeah.
[43:46] And it's and it's absolutely terrifying.
[43:48] So like I'm very bullish, like there's
[43:50] clearly going to be markets here but I
[43:52] don't think the market has the you know
[43:54] it's it's like um it looks more akin to
[43:59] you know a is it's not a mature market.
[44:03] Um but I I think a good way to think
[44:04] about like electricity markets are
[44:06] >> it's not as efficient.
[44:08] slippage. Yeah, a little slippage. Yeah,
[44:11] >> go ahead.
[44:13] >> So, bases is that
[44:17] 10,000x. Yeah.
[44:18] >> What's the benefit? What's the
[44:22] >> Yeah, I I said that, right? The
[44:24] antithesis is that there's two there's
[44:25] only two people who can get compute in
[44:27] the world and they own everything.
[44:29] That's um you know, that's one. The
[44:31] second one is that you know we we are
[44:33] dependent on open source models getting
[44:35] very good and you know there's not
[44:37] enough open source models. That's number
[44:39] two. Um and three is that you know um if
[44:44] you believe if you truly believe in like
[44:46] the the like the AGI thesis you know and
[44:52] you just play it out and you keep
[44:53] pushing and pushing and pushing like we
[44:54] all have nothing to do.
[44:56] >> But then I would argue maybe inference
[44:58] is the only market left.
[44:59] >> That's right. It's the only market left.
[45:01] And
[45:01] >> that's right. Yeah, that's right. Go
[45:04] ahead.
[45:07] >> Everybody talks about how important open
[45:09] source is, but we're not really funding
[45:11] it. Well, every except
[45:14] um we're not really funding it in the
[45:16] United States. Do you think given the
[45:18] national security?
[45:27] >> Yeah. Yeah, definitely. I I think like
[45:29] the US government it pro I think it
[45:32] already is getting involved and thinking
[45:34] about these things. Um I would argue
[45:36] that actually like you know when you go
[45:38] look at Nvidia when you go look at
[45:40] Microsoft and Google they're actually
[45:42] putting a lot of work behind open
[45:44] source. Um I just don't think we've seen
[45:47] the the the, like, it hasn't borne
[45:50] fruit just yet and it will come. So I I
[45:53] think like actually there is quite a lot
[45:56] of investment
[45:57] um but we're just not seeing it and I
[46:00] think the government is getting involved
[46:01] but yeah definitely I think you know you
[46:03] have to create artificial incentives and
[46:04] you know there will have to be an
[46:06] alliance and um um for all these things
[46:09] to be incentivized top down. Yeah
[46:11] >> you saw the other side happen two days
[46:13] ago. The Chinese government invested in
[46:15] DeepSeek.
[46:16] >> Yeah.
[46:16] >> Yeah. Anyway, go ahead.
[46:20] >> Sorry.
[46:21] What's to stop OpenAI from open
[46:24] sourcing one of their lagging but still
[46:26] quite good models and then providing a
[46:28] workforce to both
[46:31] enterprises. Are they still focused on
[46:34] research? Why can't they do that?
[46:36] >> OpenAI already does
[46:37] >> they they do that to some extent but in
[46:39] a lot of ways like it's
[46:42] for them to invest in post training to
[46:44] some extent it's to almost give up on
[46:46] their their what they think the thesis
[46:48] is right. It's like, you know, one of
[46:52] the most fascinating things about
[46:53] OpenAI and Anthropic is, you know, if the
[46:56] thesis is AGI is everything, the way
[46:58] they do every anything at all, they
[47:00] should every every dollar should go to
[47:02] pre-training.
[47:03] >> Yeah. Um and so but you know they they
[47:07] already do that to some extent but their
[47:08] incentive to push dollars towards that
[47:10] is never going to be the case because
[47:12] they can they the reason why you know
[47:15] like all these models exist so they can
[47:18] just make the most money on the most
[47:19] expensive model like they they want they
[47:20] want to push the capability gap the
[47:22] capability argument as much as possible.
[47:26] All right, go ahead. One final question
[47:28] >> about the idea of the modularized
[47:30] data center. I'm just wondering what
[47:32] would be the difference between like
[47:33] virtual machine.
[47:36] >> Question is what's the difference
[47:37] between a modular data center and a
[47:39] virtual machine?
[47:40] >> Yeah, that that's a good question. Um,
[47:43] well, I I think you know the virtual
[47:45] machine sits a couple layers higher up
[47:48] in the stack. I I'm just trying to
[47:49] figure out like how do you how do you
[47:51] create make it cheaper to put up more
[47:53] compute
[47:55] and so that you know like that that's a
[47:58] in a lot of ways you're kind of asking
[48:00] like modular containers you're kind of
[48:02] talking about containers which is once
[48:03] the compute exists you need that and it
[48:05] kind of is that just two two layers
[48:07] deeper
[48:08] >> yeah it's at a different layer of
[48:09] abstraction
[48:11] >> I'll take one more was there one here go
[48:14] ahead
[48:15] >> thanks for coming I'm just curious what
[48:16] you think about where you
[48:19] more like
[48:24] >> it's every it's everywhere like disagg,
[48:27] you're really separating out concerns
[48:29] the kernel stuff is continuing like a
[48:31] big like the amount of people in Nvidia
[48:34] and every you know even Stanford I think
[48:36] who work on stuff like ThunderKittens
[48:38] and um things like that there's a lot of
[48:40] opportunity there um but the entire
[48:42] stack is changing like that's that's
[48:44] like the most fascinating thing about
[48:46] this market is that Everything is like
[48:48] quicksand like you you throw everything
[48:49] away, you know, like Ilya might come out
[48:52] with a new with a new architecture for a
[48:54] model and everything it's up everything
[48:55] is up for grabs again.
[48:56] >> Ilya
[48:57] >> um Yeah.
[48:58] >> Yeah. Do you know what he's working on?
[49:00] >> I I I mean I have a sense. Yeah.
[49:02] >> Okay. That was all about [laughter]
[49:04] >> All right. All right.
[49:06] >> Yeah.
[49:07] >> Perfect. Well, thank you so much. This
[49:08] was fun. Yeah. Appreciate you making