Full Transcript
https://www.youtube.com/watch?v=kwSVtQ7dziU
[00:00] Code's not even the right verb anymore, right?
[00:03] But I have to express my will to my agents for 16 hours a day.
[00:07] Manifest.
[00:09] How can I have not just a single session of Claude code or Codex or some of these agent harnesses?
[00:12] How can I have more of them?
[00:14] How can I do that appropriately?
[00:16] The agent part is now taken for granted.
[00:18] Now the claw-like entities are taken for granted and now you can have multiple of them and now you can have instructions to them and now you can have optimization over the instructions.
[00:24] But, I mean, this is why it gets to the psychosis, is that this is like infinite and everything is a skill issue.
[00:34] Hi listeners, welcome back to No Priors.
[00:37] Today I'm here with Andrej Karpathy and we have a wide-ranging conversation for you about code agents, the future of engineering and AI research, how more people can contribute to research, what's happening in robotics, his prediction for how agents can reach out into the real world, and education in this next age.
[00:53] Welcome, Andrej.
[00:55] Andrej, thanks for doing this.
[00:56] Yeah, thank you for having me.
[00:57] Uh so it's been a very exciting couple of months in AI.
[01:03] Uh yeah, you could say that.
[01:07] I remember um walking into the office at some point and you were like really locked in and I was asking what you were up to and you're like, I just I have to code for 16 hours a day or code's not even the right verb anymore, right?
[01:15] But I have to um express my will to my agents for 16 hours a day.
[01:21] Manifest um because like there's been a jump in capability.
[01:25] Uh what's happening?
[01:26] Tell me about your experience.
[01:28] Yeah, I kind of feel like I was just in this perpetual I still am often in this state of AI psychosis just like all the time um because there was a huge unlock in what you can achieve as a person as an individual, right?
[01:37] Because you were bottlenecked by, you know, your typing speed and so on.
[01:41] But now with these agents, I would say in December is when something just flipped, where I kind of went from, like, 80/20 to, like, 20/80 of writing code by myself versus just delegating to agents.
[01:53] And I don't even think it's 20/80 by now.
[01:54] I think it's a lot more than that.
[01:55] I don't think I've typed like a line of code probably since December basically.
[02:00] [laughter]
[02:03] Um which is like an extremely large, uh, change. Um I was talking about it to, for example, my parents and so on, and I don't think like a normal person actually realizes that this happened or how dramatic it was.
[02:13] Like literally, if you just find a random software engineer or something like that at their desk and what they're doing, like their default workflow of, you know, building software is completely different as of basically December.
[02:26] Uh so I'm just like in this state of psychosis of trying to figure out what's possible, uh, trying to push it to the limit.
[02:31] How can I have not just a single session of, you know, um, Claude Code or Codex or some of these agent harnesses?
[02:36] How can I have more of them?
[02:38] How can I do that uh appropriately?
[02:40] And then how can I use these claws?
[02:43] What are these claws?
[02:45] Uh and uh so there's like a lot of new things.
[02:46] I want to be at the forefront of it, you know, and I'm very antsy that I'm not at the forefront of it, and I see lots of people on Twitter doing all kinds of things and they all sound like really good ideas, and I need to be at the forefront or I feel extremely nervous.
[02:56] And so I guess I'm just in this psychosis of like what's possible, because it's unexplored fundamentally.
[03:01] Well, if you're nervous, the rest of us are nervous.
[03:05] We have a team that we work with at Conviction whose setup is, you know, none of the engineers write code by hand, and they're all microphoned and they just, like, whisper to their agents all the time.
[03:16] It's the strangest work setting ever.
[03:18] Uh and I thought they were crazy and now I like I fully accept I was like, oh this was the way.
[03:23] Like you're just ahead of it.
[03:26] Um what uh how do you think about your own capacity now to like explore or to do projects?
[03:30] Like what is it limited by?
[03:32] Yeah, what is it limited by?
[03:34] Uh just I think everything like so many things even if they don't work, I think to a large extent you feel like it's a skill issue.
[03:39] It's not that the capability is not there.
[03:41] It's that you just haven't found a way to string it together of what's available.
[03:44] Like, I just didn't give good enough instructions in the agents.md file or whatever it may be.
[03:51] I don't have a nice enough memory tool that I put in there or something like that.
[03:53] So it all kind of feels like skill issue when it doesn't work to some extent.
[03:56] You want to see how you can parallelize them, etc., and you want to be Peter Steinberger basically.
[04:02] So Peter is famous. He has a funny photo where he's in front of a monitor with lots of, uh, like he uses Codex, so lots of Codex agents tiling the monitor.
[04:10] And they all take about 20 minutes if you prompt them correctly and use the high effort. He has multiple, you know, 10 repos checked out.
[04:15] And so he's just, um, going between them and giving them work.
[04:18] It's just like you can move in much larger macro actions.
[04:20] It's not just like here's a line of code, here's a new function.
[04:22] It's like here's a new functionality and delegate it to agent one.
[04:24] Here's a new functionality that's not going to interfere with the other one.
[04:25] Give it agent two.
[04:27] And then try to uh review their work as best as you can.
[04:29] >> [laughter]
[04:30] >> depending on how much you care about that code.
[04:31] Like where are these macro actions that I can like manipulate my software repository by?
[04:32] And like another agent is doing some like research, another agent is writing code, another one is coming up with a plan for some new implementation.
[04:35] And so everything is just like happens in these like macro actions over your repository.
[04:37] Um and you're just trying to become really good at it and develop, like, a muscle memory for it.
[04:38] Yeah, it's very rewarding, number one, because it actually works. Uh but it's also kind of like the new thing to learn.
[05:06] So that's why, hence the psychosis.
[05:07] Yeah, I I do feel like my instinct is like whenever I'm waiting for an agent to complete something, the obvious thing to do is like, well, I can do more work, right?
[05:17] Like if I have access to more tokens then, like, I should just parallelize tasks.
[05:21] And so that's very stressful, because if you don't feel very bounded by your ability to spend on tokens, then, you know, you are the bottleneck in the system that is at max capability.
[05:31] Yeah, if you're not maximizing your subscription at least.
[05:34] And ideally for multiple agents.
[05:36] Like if you run out of the quota on Codex, you should switch to Claude or whatnot.
[05:39] I don't know.
[05:40] Like that's what I've been trying to do a little bit and I feel nervous when I have subscription left over.
[05:45] That just means I haven't maximized my token throughput.
[05:47] So I actually kind of experienced this when I was a PhD student.
[05:49] You would feel nervous when your GPUs are not running.
[05:51] Like you have GPU capacity and you're not maximizing the flops available to you.
[05:55] But now it's not about flops, it's about tokens.
[05:57] So what is your token throughput and what token throughput do you command?
[06:01] I would actually argue that it's very interesting that we had, you know, at least 10 years where in many engineering tasks people just didn't feel compute bound.
[06:12] Right? Um and now the entire industry feels that. They felt resource bound, uh, and now that you have this big capability jump, you're like, oh, actually it's not, you know, my ability to access the compute anymore.
[06:26] Like, I'm the binding constraint. Yeah, it's a skill issue.
[06:28] Which is very empowering cuz um yeah, cuz you could be getting better.
[06:30] So that's why that's why I think it's very addictive because there's unlocks when you when you get better.
[06:36] Where do you think it goes? Like if you just think about like, okay, you know, Andrej's iterating and everybody else is, for 16 hours a day, getting better at using coding agents. Like what does it look like in a year?
[06:46] Of like you've reached mastery.
[06:49] Yeah, what does mastery look like, right? At the end of the year or like two, three years, five years, 10 years, etc.
[06:54] Well, I think everyone is basically interested in like going up the stack.
[06:57] So I would say it's yeah, it's not about a single session with your agent.
[06:59] Multiple agents, how do they collaborate and teams and so on.
[07:03] So everyone's trying to figure out what that looks like.
[07:07] And then I would say the claw is also kind of an interesting direction, because really, when I say a claw, I mean this, like, layer that kind of takes persistence to a whole new level. Like it's something that keeps looping. It's, um, it's not something that you are interactively in the middle of. It kind of has its own little sandbox, its own little, you know, it kind of does stuff on your behalf even if you're not looking, kind of thing.
[07:27] Um and then it also has, like, maybe more sophisticated memory systems, etc., that are not yet implemented in agents. So, um, OpenClaw has a lot more sophisticated memory, I would say, than what you would get by default, uh, which is just a memory compaction when your context runs out, right?
[07:38] You think that's the piece that resonated for more users versus, like, perhaps broader tool access? For OpenClaw? Yeah.
[07:46] Uh, I think there's at least five things that are really good ideas in here. Yeah, good job, Peter. I mean, Peter has done a really amazing job.
[07:53] Um, I saw him recently, uh, and I talked to him about it, and he's very humble about it. But I think he innovated simultaneously in like five different ways and put it all together.
[08:16] Um so for example, the SOUL.md document.
[08:18] Like he actually really crafted a personality that is kind of compelling and interesting.
[08:20] And I feel like a lot of the current agents they don't get this correctly.
[08:23] I actually think Claude has a pretty good personality.
[08:24] It feels like a teammate uh and it's excited with you etc.
[08:29] I would say, um, for example, Codex is a lot more dry, um, which is kind of interesting because [laughter] it's true.
[08:36] You know, and the other thing I would say is, for example with Claude, I think they dialed the sycophancy fairly well, where when Claude gives me praise, I do feel like I slightly deserve it, because sometimes I give it, like, not very well formed thoughts, and, uh, I give it an idea that I don't think is fully baked, and it doesn't actually react very strongly.
[08:51] It's like, oh yeah, we can implement that.
[08:54] But when it's a really good idea by my own account, it does uh seem to reward it a bit more.
[08:58] And so I kind of feel like I'm trying to like earn its praise which is really weird.
[09:01] And so I do think the personality matters a lot uh and I think a lot of the other uh tools maybe don't appreciate it as much.
[09:06] And I think in this aspect also Peter really cares about this and so that was correct.
[09:09] And then the memory system, and then, uh, just, you know, he's just having fun with this, um, and then the single WhatsApp portal to all of the automation.
[09:17] >> Yeah. Is there something that you have done personally with your claws beyond software engineering that you think is fun or interesting?
[09:26] Yeah, so in January I went through a period of claw psychosis.
[09:28] So I built um I have a claw basically that takes care of my home and I call him Dobby the elf uh claw.
[09:35] Um and uh basically I used uh the agents to find all of the smart home subsystems of my home on the local area network which I was kind of surprised that it worked out of the box.
[09:45] Like I just told it that I think I have Sonos at home.
[09:46] Like, can you try to find it? And it goes and it did an IP scan of all of the, um, basically, um, computers on the local area network and found the Sonos system, and it turned out that there's no password protection or anything like that.
[10:00] It just logged in and it's like, "Oh, yeah, you have these Sonos systems installed.
[10:01] I Let me try to reverse engineer how it's working."
[10:06] It does some web searches and it finds like, "Okay, these are the API endpoints."
[10:08] And then it's like, "Do you want to try it?"
[10:10] And I'm like, "Whoa, like you just did that."
[10:11] And I'm like, "Yeah, can you try to play something in the study?"
[10:16] And, uh, it does, and music comes out, and I'm like, "I can't believe I just..."
[10:19] >> That's crazy. That's like three prompts.
Yeah.
[10:20] I can't believe I just typed in like, "Can you find my Sonos?"
[10:22] And then suddenly it's playing music.
[10:23] And it did the same for lights.
[10:26] And so it kind of hacked in, figured out the whole thing, uh, created APIs, created a dashboard so I could see the command center, uh, of all of my lights in the home.
[10:33] And then it was like switching lights on and off and, you know, so I can ask it like, "Dobby, it's sleepy time."
[10:37] And when it's sleepy time that just means all the lights go off, etc. and like so on.
[10:40] So it controls all of my lights, my HVAC, my shades, uh, the pool and, uh, the spa and also my security system.
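For illustration, a minimal sketch of the kind of Sonos discovery and playback the agent worked out on its own, assuming the open-source SoCo Python library; the room name and stream URL below are placeholders rather than details from the conversation.

```python
# Sketch: find Sonos players on the local network and play something in a named
# room, roughly what the agent did after its IP scan. Assumes `pip install soco`.
import soco

speakers = soco.discover()  # SSDP scan of the LAN for Sonos devices
if not speakers:
    raise SystemExit("No Sonos players found on this network")

by_room = {s.player_name: s for s in speakers}
print("Found rooms:", sorted(by_room))

study = by_room.get("Study")  # placeholder room name
if study:
    study.volume = 25
    # Placeholder internet-radio stream; any URI the speaker understands works.
    study.play_uri("x-rincon-mp3radio://icecast.example.com/stream")
```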
[10:47] So I have a camera pointed outside of the house, and anytime someone rolls in I have a Qwen, uh, model that looks at the videos.
[10:55] So first of all there's change detection.
[10:58] Right. And then based on change detection it goes to Qwen, and then it actually, like, tells me, um, it sends me a text to my WhatsApp.
[11:03] It shows an image from the outside and it says, "Hey, a FedEx truck just pulled up, you might want to check it," and "you got new mail," or something like that.
[11:12] And Dobby just texts me this. This is really incredible.
[11:17] Um, so Dobby is in charge of the house.
[11:19] I text with it through WhatsApp, um,
[11:21] and it's been like really fun to have these macro actions that maintain my house.
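For illustration, a rough sketch of the doorbell-camera pipeline he describes: cheap change detection on frames, and only when something changes does a vision model get called and a WhatsApp message go out. The describe_frame and send_whatsapp helpers are hypothetical stand-ins for a Qwen-style vision endpoint and a WhatsApp bridge, not real APIs from this setup.

```python
# Sketch: change detection -> vision model -> WhatsApp notification.
import cv2

def describe_frame(frame) -> str:
    # Placeholder: in the real setup this would call a Qwen-style vision model.
    return "Something changed outside"

def send_whatsapp(image, text) -> None:
    # Placeholder: in the real setup this goes out through a WhatsApp bridge.
    print("WhatsApp:", text)

def frames_differ(prev, curr, threshold=0.02) -> bool:
    """True if more than `threshold` fraction of pixels changed noticeably."""
    diff = cv2.absdiff(cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY),
                       cv2.cvtColor(curr, cv2.COLOR_BGR2GRAY))
    return (diff > 25).mean() > threshold

def watch(camera_url: str) -> None:
    cap = cv2.VideoCapture(camera_url)
    ok, prev = cap.read()
    while ok:
        ok, curr = cap.read()
        if not ok:
            break
        if frames_differ(prev, curr):
            send_whatsapp(curr, describe_frame(curr))  # e.g. "A FedEx truck just pulled up"
        prev = curr
```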
[11:23] I haven't like really pushed it, uh, like way more beyond that and I think people are doing a lot more crazy things with it, uh, but for me even just the home automation setup I used to use like six apps, uh, completely different apps and I don't have to use these apps anymore.
[11:24] Like Dobby controls everything in natural language.
[11:26] It's amazing.
[11:28] Um, and so I think like I haven't even pushed the paradigm fully but already that is so helpful and so inspiring I would say.
[11:29] Do you think that's indicative of like what people want from a user experience perspective with software, right?
[11:30] Because I think, you know, it's pretty ignored that it takes humans effort to learn new software, like new UI.
[11:32] Yeah.
[11:34] I think, uh, to some extent that's right.
[11:35] It's like working backwards from how people think an AI should be, because what people have in their mind of what an AI is, is not actually what an LLM is, like, in the raw sense.
[11:36] Like LLM is a token generator, you know, like more tokens come out.
[11:39] But what they think of is like this persona identity that they can tell stuff and it remembers it, you know? And, uh, it's just kind of an entity behind the WhatsApp.
[12:21] It's like a lot more understandable. Mhm. Uh, so I think to some extent it's like matching the expectations that humans already have for how an AI should behave, but under the hood a lot of technical details go into that.
[12:30] And LLMs are too raw of a primitive, uh, to actually, um, type check as AI I think for most people if that makes sense.
[12:36] Yeah. Um, I think that's like how we understand what the AI is and like the, um, description of it as Dobby or some persona obviously resonates with people.
[12:48] Um, I also think that it it uh, the unification that you did across your six different software systems for your home automation speaks to a different question of like do people really want all of the software that we have today?
[12:59] Yeah. Right? Um, because I I would argue like, well, you have the hardware but you've now thrown away the software or the UX layer of it.
[13:08] Um, do you think that's what people want? Yeah, I think there's this sense that these apps that are on the app store for using these smart home devices, etc., these shouldn't even exist, kind of, in a certain sense.
[13:19] Like shouldn't it just be APIs and shouldn't agents be just using it directly?
[13:25] And, um, I can do all kinds of home automation stuff that, uh, any individual app will not be able to do, right?
[13:30] Um, and an LLM can actually drive the tools and call all the right tools and do uh, do pretty complicated things.
[13:35] Um, and so in a certain sense it does point to this like maybe there's like an overproduction of lots of custom bespoke apps that shouldn't exist because agents kind of like crumble them up and everything should be a lot more just like exposed API endpoints and agents are the glue of the intelligence that actually like tool calls all the all the parts.
[13:55] Um, another example is like my treadmill.
[13:56] Uh, there's an app for my treadmill and I wanted to like keep track of how often I do my cardio, uh, but like I don't want to like log into web UI and go through a flow and etc.
[14:04] Like all this should just be like make APIs available and this is kind of, you know, going towards the agentic, um, sort of web or like agent first, uh, tools and all this kind of stuff.
[14:13] So I think the industry just has to reconfigure in so many ways. It's like the customer is not the human anymore.
[14:19] It's agents who are acting on behalf of humans, and this refactoring will probably be substantial in a certain sense.
[14:23] One way that people sometimes push back on this is like, do we expect people to vibe code some of these tools?
[14:30] Do we expect normal people to do this kind of stuff that I described?
[14:32] Mhm. But I think to some extent this is just, you know, technology as it exists today, and right now there is some vibe coding and I'm actually watching it and I'm working with the system, but I kind of feel like this kind of stuff that I just talked about, this should be free, like, in a year or two or three.
[14:47] There's no vibe coding involved. This is trivial. This is table stakes. This is like any AI, even the open source models, etc., can do this.
[14:52] You should be able to translate it from a less technical human's intent very easily to this outcome.
[15:00] >> Yeah. Today it's vibe coding and it's involved and not many people are going to do it, but >> And you still have to make some design decisions, right?
[15:05] We were talking about, like, which frames we take, for example. Yeah. Yeah. But I kind of feel like the barrier will just come down, and it's just ephemeral software on your behalf, and some kind of, like, claw is handling all the details for you, but you're not involved.
[15:20] The claw has a machine and it will figure it out, and it's just presenting you UIs and you're, like, saying stuff, you know?
[15:27] Mhm. Why haven't you, um, I guess, pushed the boundaries of what you can do personally with claws? Like is it, you know, you're focusing on more important projects, auto research, etc., or, uh, you're climbing the hill to mastery, or something else, right?
[15:38] Yeah, I just feel like I'm so distracted by everything. So I spend, I [laughter] spend like a week on the claw stuff and I have more to do almost, um, but I will say that, um,
[15:45] >> It's like Jensen told us, we're all just busier, unfortunately.
[15:47] >> Uh, I didn't really take advantage of a lot of, like, email and calendar and all this other stuff, and I didn't really have access cuz I'm still a little bit suspicious and it's still very new and rough around the edges. So I didn't want to give it full access to my digital life yet, and part of it is just the security, privacy, and, uh, just being very cautious in that realm.
[15:57] And, um, so some of it is held back by that, I would say. Yeah, maybe that's like the dominant feature, but some of it is also just I feel so distracted, because I feel like I had a week of claw and then other stuff is happening.
[16:17] What was the, um, I mean you've talked about being able to train or at least optimize a model as a task you want to see agents do for a long time.
[16:32] Like what was the motivation behind auto research? Auto research, yeah.
[16:34] So I think, like, I had a tweet earlier where I kind of said something along the lines of: to get the most out of the tools that have become available now, you have to remove yourself as the bottleneck.
[16:44] You can't be there to prompt the next thing. You need to take yourself outside of it. Um, you have to arrange things such that they're completely autonomous.
[16:51] And the more you... you know, how can you maximize your token throughput and not be in the loop?
[16:56] This is the goal. And so I kind of mentioned that the name of the game now is to increase your leverage.
[17:00] Uh, I put in just very few tokens just once in a while and a huge amount of stuff happens on my behalf.
[17:06] And so auto research: I tweeted that and I think people liked it and whatnot, but they haven't maybe worked through the implications of that, and for me auto research is an example of an implication of that.
[17:14] Where it's like, I don't want to be the researcher in the loop, looking at results, etc. I'm holding the system back.
[17:24] So the question is how do I refactor all the abstractions so that I'm not in the loop. I have to arrange it once and hit go.
[17:28] The name of the game is how can you get more agents running for longer periods of time without your involvement doing stuff on your behalf?
[17:33] And auto research is just, yeah, here's an objective, here's a metric, here's your boundaries of what you can and cannot do.
[17:39] And go.
[17:41] And, uh, yeah, it worked.
[17:43] >> You were surprised at its effectiveness?
[17:45] Yeah, I didn't expect, uh, it to work. Because, so, I have the project nanochat, um, and fundamentally, like, I think a lot of people are very confused with my obsession for, like, training GPT-2 models and so on.
[17:53] But for me, uh, training GPT models and so on is just a little harness, a little playground for training LLMs.
[17:58] And fundamentally what I'm more interested in is this idea of recursive self-improvement, and to what extent you can actually have LLMs improving LLMs, because I think for all the frontier labs this is like the thing. Mhm.
[18:10] Uh, for obvious reasons, and they're all trying to recursively self-improve, roughly speaking.
[18:13] And so for me this is kind of like, um, a little playpen of that.
[18:20] Um, and I guess I tuned nanochat already quite a bit by hand in the good old-fashioned way that I'm used to.
[18:21] Like I'm a researcher.
[18:22] I've done this for like, you know, two decades.
[18:23] I have some amount of like What is the opposite of hubris?
[18:25] Uh, yeah. [laughter]
[18:28] Earned confidence? Okay.
[18:30] I have like two decades of like, "Oh, I've trained this model like thousands of times.
[18:32] I've like, um, so I've done a bunch of experiments.
[18:34] I've done hyperparameter tuning.
[18:36] I've done all the things I'm very used to and I've done for two decades.
[18:37] Yeah.
[18:38] And I've gotten to a certain point and I thought it was like fairly well tuned
[18:39] and then I let auto research go for like overnight and it came back with like tunings that I didn't see.
[18:41] Mhm.
[18:43] And yeah, I did forget like the weight decay on the value embeddings and my Adam betas were not sufficiently tuned
[18:45] and these things just jointly interact.
[18:47] So like once you tune one thing the other things have to potentially change too.
[18:49] You know, I shouldn't be a bottleneck.
[18:50] I shouldn't be running these hyperparameter optimizations.
[18:52] I shouldn't be looking at the results.
[18:54] There's objective criteria in this case.
[18:56] Uh, so you just let you just have to arrange it so that it can just go forever.
[18:57] So that's a single sort of version of auto research of like a single loop trying to improve.
[18:58] And I was surprised that it, um, it found these things; you know, the repo was already fairly well tuned and it still found something.
[19:20] And that's just a single loop.
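For illustration, a minimal sketch of that single-loop auto research: an objective metric, a bounded search space, and a propose-evaluate-keep loop that runs without a human in it. The train_and_eval helper and the hyperparameter names are assumptions, not the actual nanochat setup.

```python
# Sketch: single auto-research loop. Propose a change inside fixed boundaries,
# evaluate it against the objective metric, keep it only if it improves, repeat.
import random

SEARCH_SPACE = {
    "adam_beta2": [0.95, 0.99, 0.999],
    "value_embed_weight_decay": [0.0, 0.01, 0.1],
    "learning_rate": [3e-4, 6e-4, 1e-3],
}

def train_and_eval(config: dict) -> float:
    # Placeholder: train the small model with this config, return validation loss.
    return random.uniform(3.0, 3.3)

best = {k: v[0] for k, v in SEARCH_SPACE.items()}
best_loss = train_and_eval(best)

for step in range(100):                 # "arrange it so it can just go forever"
    candidate = dict(best)
    knob = random.choice(list(SEARCH_SPACE))
    candidate[knob] = random.choice(SEARCH_SPACE[knob])
    loss = train_and_eval(candidate)
    if loss < best_loss:                # objective criterion, no human judgment
        best, best_loss = candidate, loss
        print(f"step {step}: {knob}={candidate[knob]} improved loss to {best_loss:.4f}")
```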
[19:22] Like these frontier labs they have GPU clusters of tens of thousands of them.
[19:26] And so it's very easy to imagine how you would basically get a lot of this automation on, um, smaller models.
[19:32] And fundamentally everything around, like, frontier level intelligence is about extrapolation and scaling laws.
[19:35] And so you basically do a ton of the exploration on the smaller models and then you try to, um, extrapolate out.
[19:40] So you're saying our research efforts are going to get more efficient.
[19:44] Like we're going to have better direction for when we scale as well if we can do this experimentation better.
[19:50] Yeah, I would say that, like, the most interesting project, and probably what the frontier labs are working on, is... Mhm.
[19:54] Yeah. You know, you experiment on the smaller models.
[19:55] You try to make it as autonomous as possible.
[19:57] Remove researchers from the loop.
[20:00] They have way too much... what is the opposite of too much confidence?
[20:04] Yeah, yeah, they don't know.
[20:05] They shouldn't be touching any of this really.
[20:07] And so you have to like rewrite the whole thing because right now, I mean certainly they can contribute ideas.
[20:11] But okay, they shouldn't actually be enacting these ideas.
[20:13] There is a queue of ideas, and there's maybe an automated scientist that comes up with ideas based on all the arXiv papers and GitHub repos and it funnels ideas in, or researchers can contribute ideas, but it's a single queue, and there are workers that pull items and they try them out.
[20:30] And whatever works just gets sort of put on the feature branch, and maybe some people monitor the feature branch and merge to the main branch sometimes.
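For illustration, a minimal sketch of that single queue of ideas with workers pulling from it; run_experiment is a hypothetical stand-in for "apply the idea to the small model, train, and return the metric," and the idea strings are made up.

```python
# Sketch: one idea queue, several workers, anything that beats the baseline gets
# staged for review on a feature branch.
import queue
import threading

ideas = queue.Queue()
for idea in ["tune adam betas", "weight decay on value embeddings", "swap optimizer"]:
    ideas.put(idea)

baseline_loss = 3.21      # current metric on the feature branch
staged = []               # candidates that improved the metric
lock = threading.Lock()

def run_experiment(idea: str) -> float:
    # Placeholder: apply the change, train, evaluate, return validation loss.
    return 3.20

def worker() -> None:
    while True:
        try:
            idea = ideas.get_nowait()
        except queue.Empty:
            return
        loss = run_experiment(idea)
        with lock:
            if loss < baseline_loss:      # lower is better
                staged.append((idea, loss))
        ideas.task_done()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("Staged for review:", staged)
```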
[20:39] So yeah, just removing humans from all the processes and automating as much as possible and getting high tokens-per-second throughput, and it does require rethinking of all the abstractions, and everything has to be reshuffled.
[20:52] So yeah, I think it's very exciting.
[20:54] If we take one more recursive step here, when is the model going to write a better program.md than you?
[21:00] Yeah.
[21:01] Also program.md is, like...
>> A loop. Yeah, exactly.
[21:05] >> Yeah. So program.md is my crappy attempt at describing how the auto researcher should work. Like, oh, do this, then do that and that, and then try these kinds of ideas, and then here's maybe some ideas, like look at the architecture, look at the optimizer, etc. But I just came up with this in markdown, right?
[21:19] >> Mhm.
[21:21] And so, yeah, exactly. You want some kind of an auto research loop maybe that looks for... You can imagine that different program.mds would give you different progress.
[21:34] So basically every research organization is described by a program.md. A research organization is a set of markdown files that describe all the roles and how the whole thing connects.
[21:43] And you can imagine having a better research organization. So maybe they do fewer stand-ups in the morning because they're useless. And this is all just code, right?
[21:51] And so one organization can have fewer stand-ups, one organization can have more. One organization can be very risk-taking, one organization can be less. And you can definitely imagine that you have multiple research orgs, and then they all have code. And once you have code, then you can imagine tuning the code.
[22:05] So 100% there's, like, the meta layer of it.
[22:09] Uh, did you see my text about my contest idea? My contest idea was, like: let people write different program.mds, right? And so for the same hardware, where do you get the most improvement?
[22:22] >> Oh, I see. And then you can take all that data and then give it to the model and say, write a better program.md.
[22:26] >> Yes, yes. Yeah, exactly.
[22:28] >> We're going to get something better. Like, there's no way we don't, right?
[22:30] >> 100%. Look at where the improvements came from, and, like, can I change the program.md such that more of these kinds of things would be done, or, like, things that didn't work, except... You can 100% imagine doing that.
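For illustration, a rough sketch of that contest: run the same auto-research loop under different program.md files on the same hardware budget, rank them by improvement, and hand the results to a model to draft a better one. run_auto_research and propose_better_program are hypothetical helpers, not anything that exists in the conversation.

```python
# Sketch: compare program.md variants on fixed hardware, then ask a model to
# write a better one from the results.
from pathlib import Path

def run_auto_research(program_md: Path, hours: float = 8.0) -> float:
    # Placeholder: run the loop described by program_md, return metric improvement.
    return 0.0

def propose_better_program(results: dict) -> str:
    # Placeholder: one LLM call with the per-variant results table in the prompt.
    return "# program.md v2\n..."

variants = sorted(Path("contest_entries").glob("*.md"))
results = {p.name: run_auto_research(p) for p in variants}
for name, delta in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: +{delta:.4f}")

Path("program_v2.md").write_text(propose_better_program(results))
```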
[22:43] So I think this is a great idea, but it's like, you know, I think you can sort of go one step at a time, where you sort of have one process and then a second process and then the next process, and these are all layers of an onion.
[22:53] Like, the LLM sort of part is now taken for granted. The agent part is now taken for granted. Now the claw-like entities are taken for granted, and now you can have multiple of them, and now you can have instructions to them, and now you can have optimization over the instructions, and it's just like a little too much, you know.
[23:06] But I mean, this is why it gets to the psychosis, is that this is like infinite and everything is a skill issue, and that's why I feel like...
[23:12] >> Yeah, that's just coming back to: this is why it's so insane. Okay, well, if [laughter] we're just trying to diagnose the current moment and what is a relevant skill right now, what do you think is the implication? That this is the loop we should be trying to achieve in different areas, and then it works, right? Like, you know, remove yourself, create the metric or create the ability for agents to continue working on it without you. Do we still have performance engineering? Like, what...
[23:40] Yeah, I mean, so there's a few caveats that I would put on top of the LLM psychosis. So number one, this is extremely well suited to anything that has objective metrics that are easy to evaluate.
[23:49] So for example, like writing kernels for more efficient CUDA, you know, code for various parts of the model, etc., are a perfect fit, because you have inefficient code and then you want efficient code that has the exact same behavior but is much faster. Perfect fit.
[24:02] So a lot of things like that are a perfect fit for auto research, but many things will not be. And so it's just, if you can't evaluate it, then you can't auto research it, right?
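For illustration, one way the "same behavior, much faster" check might look; a sketch assuming PyTorch, with softmax standing in for whatever kernel is actually being optimized, and a deliberately trivial candidate.

```python
# Sketch: the objective check that makes kernel work a good auto-research target:
# the candidate must match the reference numerically and be measurably faster.
import time
import torch

def reference(x):
    return torch.softmax(x, dim=-1)

def candidate(x):
    # A real candidate would be a custom kernel; this is just an equivalent
    # formulation so the harness has something to verify.
    e = torch.exp(x - x.max(dim=-1, keepdim=True).values)
    return e / e.sum(dim=-1, keepdim=True)

def bench(fn, x, iters=50) -> float:
    fn(x)  # warm-up
    t0 = time.perf_counter()
    for _ in range(iters):
        fn(x)
    return (time.perf_counter() - t0) / iters

x = torch.randn(2048, 2048)
assert torch.allclose(reference(x), candidate(x), atol=1e-6), "behavior mismatch"
print(f"speedup: {bench(reference, x) / bench(candidate, x):.2f}x")
```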
[24:12] So that's, like, caveat number one. And then maybe caveat number two, I would say, is, you know, we're kind of talking about the next steps and we kind of see what the next steps are, but fundamentally the whole thing is still kind of bursting at the seams a little bit, and there's cracks, and it doesn't fully work, and if you try to go too far ahead, the whole thing is actually net not useful, if that makes sense.
[24:31] Because these models, like, still are not... you know, they've improved a lot, but they're still rough around the edges is maybe the way I would describe it.
[24:37] I simultaneously feel like I'm talking to an extremely brilliant PhD student who's been, like, a systems programmer for their entire life, and a 10-year-old. And it's so weird, because humans, like, I feel like they're a lot more coupled, like you have to, you know, um... Yes, you wouldn't encounter that combination.
[24:54] >> This jaggedness is really strange, and humans have a lot less of that kind of jaggedness, although they definitely have some.
[24:59] >> [laughter]
[25:00] >> But humans have a lot more jaggedness... uh, sorry, the agents have a lot more jaggedness, where sometimes, like, you know, I ask for functionality and it comes back with something that's just totally wrong, and then we get into loops that are totally wrong, and then I just get so frustrated with the agents all the time still, because you feel the power of it, but you also... there's still, like...
[25:20] It does nonsensical things once in a while for me as well. I get very annoyed [clears throat] when I feel like the agent wasted a lot of compute on something it should have recognized was an obvious problem. Yeah.
[25:33] I think, like, some of the bigger things, like maybe what's underneath it, if I could hypothesize, is fundamentally these models are trained via reinforcement learning. So they're actually struggling with the exact same thing we just talked about, which is: the labs can improve the models in anything that is verifiable or that [clears throat] has rewards. So, did you write the program correctly, and do the unit tests check out? Yes or no.
[25:52] But some of the things where they're struggling is, like, for example, I think they have a tough time with the nuance of maybe what I had in mind or what I intended, and when to ask clarifying questions.
[26:03] Um, or, like... yeah, it's just, um, anything that feels softer is, like, worse. And so you're kind of, like, either on rails and you're part of the superintelligence circuits, or you're not on rails and you're outside of the verifiable domains, and suddenly everything kind of just meanders.
[26:17] Like, maybe another way to put it is, if today you go to, like, a state-of-the-art model, ChatGPT, and you ask it, tell me a joke, um, do you know what joke you're going to get?
[26:26] There's the joke. The joke? I do feel... I can't tell you, like, the, you know, standard form of it, but I do feel like ChatGPT has, like, three jokes.
[26:34] >> Yeah, yeah. So the joke that apparently all the LLMs love the most is: why do scientists not trust atoms? Okay. Because they make everything up. Okay.
[26:44] >> They make everything up.
[26:46] So this is still... >> ...emerge? So this is the joke you would get, like, three or four years ago, and this is the joke you still get today. Okay.
[26:52] >> So even though the models have improved tremendously, and if you give them an agentic task they will just go for hours and move mountains for you, then you ask for, like, a joke and it has a stupid joke. It's a crappy joke from five years ago, and it's because it's outside of the RL. It's outside of the reinforcement learning. It's outside of what's being improved.
[27:10] And it's part of the jaggedness of, like, shouldn't you expect models, as they get better, to also have better jokes or more diversity of them? Or it's just, it's not being optimized and it's stuck.
[27:23] Do you think that implies that we are not seeing, like, generalization, in the sense of, like, broader intelligence, of joke smartness being attached to code smartness?
[27:34] Yeah, I think there's some decoupling, where some things are verifiable and some things are not, and some things are optimized for arbitrarily by the labs depending on, like, what data went in, and some things are not, and, um...
[27:46] >> But I mean, the premise... there's a, you know, premise from some research groups that if you're smarter at code generation or in these verifiable fields, you should be better at everything. And, like, the joke situation suggests that that's not happening at all. Okay.
[28:01] >> Yeah, I don't think that's happening. I think maybe we're seeing a little bit of that, but not, like, a satisfying amount.
[28:06] >> Yeah, that jaggedness exists in humans. You [laughter] can be very, very good at math and still tell really bad jokes.
[28:13] >> Yeah, that's true. Yeah, but it still means that we're not getting... like, the story is that we're getting a lot of the intelligence and capabilities in all the domains of society, like, for free, as we get better and better models, and that's not exactly, fundamentally, what's going on, and there's some blind spots and some things are not being optimized for, and this is all clustered up in these neural net opaque models, right?
[28:32] So you're either on rails of what it was trained for, and everything is like you're going at the speed of light, or you're not. And so it's the jaggedness.
[28:42] Um, so that's why I think, like, even though the progression is obvious, what should happen, you can't let it fully go there yet, because it doesn't fully work, or it's a skill issue and we just haven't figured out how to use it. So, you know, it's hard to tell.
[28:55] Can I ask a somewhat blasphemous question, which is, like, if this jaggedness is persisting, and it's all rolled up in an at least monolithic interface, right? But, you know, a single model. Does that make sense, or should it be unbundled into things that can be optimized and improved against different domains of intelligence? Like, unbundling the models into multiple experts in different areas, etc. More directly. Yeah.
[29:22] Um, instead of just MoE that we have no exposure to, because that can be, like, confusing as a user from the outside, which is, like, why is it so good at this, but not at this other thing?
[29:31] Yeah, I think currently my impression is the labs are trying to have a single sort of monoculture of a model that is arbitrarily intelligent in all these different domains, and they just stuff it into the parameters. I do think we should expect more speciation in the intelligences.
[29:49] Like, you know, the animal kingdom is extremely diverse in the brains that exist, and there are lots of different niches of nature, and some animals have an overdeveloped visual cortex or other parts. And I think we should be able to see more speciation. You don't need this oracle that knows everything; you can speciate it and then you put it on a specific task. And we should be seeing some of that, because you should be able to have much smaller models that still have the cognitive core, they're still competent, but then they specialize, and then they can become more efficient in terms of latency or throughput on specific tasks that you really care about.
[30:22] Like if you're a mathematician working in Lean, I saw for example there are a few releases that really target that as a domain. So there are probably going to be a few examples like that where the unbundling kind of makes sense.
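(To make the speciation idea concrete, here is a minimal sketch, not something discussed in the episode, of routing a request either to a small domain-specialized model or to a large generalist, assuming the Hugging Face `transformers` library; both model names are placeholders.)

```python
# Hypothetical "speciation" routing: send narrow-domain prompts (here, Lean /
# theorem proving) to a small specialist model and everything else to a large
# generalist. Model names below are placeholders, not real releases.
from transformers import pipeline

generalist = pipeline("text-generation", model="org/big-generalist-model")   # placeholder
lean_specialist = pipeline("text-generation", model="org/small-lean-model")  # placeholder

LEAN_HINTS = ("theorem", "lemma", "mathlib", "Lean 4", "by simp")

def route(prompt: str, max_new_tokens: int = 256) -> str:
    """Pick the cheaper specialist when the task clearly falls in its niche."""
    model = lean_specialist if any(h in prompt for h in LEAN_HINTS) else generalist
    return model(prompt, max_new_tokens=max_new_tokens)[0]["generated_text"]

print(route("State and prove a lemma in Lean 4 that n + 0 = n."))
```

The point of the sketch is the latency and throughput trade described above: the specialist only has to be competent in its niche, so it can be much smaller.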
[30:33] One question I have is whether the capacity constraint on available compute infrastructure drives more of this, because efficiency actually matters more. Financing aside, though financing's involved in all of this: if you had access to full compute for anything you do, you could serve even one single model, right? But if you actually feel pressure where you're like, I can't serve a model of massive size for every use case, do you think that leads to any speciation? Does that question make sense to you?
[31:08] The question makes sense, and I guess what I'm struggling with is that I don't think we've seen too much speciation just yet, right? We're seeing a monoculture of models. And there's clearly pressure for: make a good code model, put it back in the main, merge again.
[31:25] >> Even though there already is pressure on the models. I guess perhaps I feel like there's a lot of very short-term supply crunch, and maybe that causes more speciation now.
[31:35] Yeah, I think fundamentally the labs are serving a model and they don't really know what the end user is going to be asking about. So maybe that's some part of it, because they kind of have to multitask over all the possible things they could be asked. But I think if you're coming to a business and maybe partnering on some specific problems you care about, then maybe you would see that there. Or there would be some very high-value applications that are more niche. But I think right now they're kind of going after the totality of what's available.
[32:01] I don't think that the science of manipulating the brains is fully developed yet, partly.
[32:05] >> What do you mean, manipulating?
[32:07] So, like, fine-tuning without losing capabilities, as an example. We don't have these primitives for actually working with the intelligences in ways other than just context windows. Our context windows kind of just work, and they're very cheap to manipulate, etc., and this is how we're getting some of the customization. But I think it's a bit more of a developing science how you more deeply adjust the models: how you have continual learning, maybe, or how you fine-tune in a certain area, how you get better in a certain area, or how you actually touch the weights, not just the context windows. And it's a lot trickier, I would say, to touch the weights than just the context windows, because you're fundamentally changing the full model and potentially its intelligence. So maybe it's just not a fully developed science, if that makes sense, of speciation. And it also has to be cheap enough for that speciation to be worthwhile in these given contexts.
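(As a rough illustration of the two levers contrasted above, stuffing instructions into the context window versus actually touching the weights, here is a minimal sketch assuming the Hugging Face `transformers` and `peft` libraries; the model name is a placeholder, and this is not a recipe from the episode.)

```python
# Two ways to customize a model, per the discussion above:
# (1) context-window customization: cheap, reversible, weights untouched;
# (2) touching the weights, e.g. a LoRA fine-tune, which is where "losing
#     capabilities" becomes a real risk if done carelessly.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

MODEL = "org/placeholder-model"  # placeholder, not a specific release
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)

# (1) Context-window customization: prepend domain instructions to the prompt.
system = "You are an assistant specialized in Lean 4 proofs.\n"
inputs = tok(system + "Prove that n + 0 = n.", return_tensors="pt")
print(tok.decode(model.generate(**inputs, max_new_tokens=128)[0]))

# (2) Touching the weights: wrap the model with small LoRA adapters and train
# only those on domain data (training loop omitted). Restricting updates to the
# adapters is one common attempt at specializing without erasing the base model.
lora = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()
```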
[32:59] >> Can I ask a question about an extension to auto research that you described, in terms of open ground? You say: okay, well, we have this thing; we need more collaboration surface around it, essentially, for people to contribute to research overall. Can you talk about that?
[33:15] >> Yeah. So we talked about auto research as a single thread of "I'm going to try stuff in a loop," but fundamentally the parallelization of this is the interesting component. And I guess I was trying to play around with a few ideas, but I don't have anything that clicks, I don't have something I'm super happy with just yet, but it's something I'm working on on the side when I'm not working on my claw.
[33:35] So I think one issue is: if you have a bunch of nodes of parallelization available to you, then it's very easy to just have multiple auto researchers talking through a common system or something like that. What I was more interested in is how you can have an untrusted pool of workers out there on the internet. So, for example, in auto research you're just trying to find the piece of code that trains a model to a very low validation loss. If anyone gives you a candidate commit, it's very easy to verify that that commit is good. Someone on the internet could claim that this piece of code will optimize much better and give you much better performance, and you could just check. But probably a lot of work goes into that checking.
[34:16] But fundamentally they could lie, etc. So you're basically dealing with a similar kind of thing. It actually looks a little bit like a blockchain: my designs that incorporate an untrusted pool of workers look a little bit like a blockchain, because instead of blocks you have commits, and these commits can build on each other, and they contain changes to the code as you're improving it. And the proof of work is basically doing tons of experimentation to find the commits that work. That's hard, and then the reward is just being on the leaderboard right now; there's no monetary reward whatsoever.
[34:47] I don't want to push the analogy too far, but it fundamentally has this property where a huge amount of search goes into it, but it's very cheap to verify that a candidate solution is indeed good. Someone had to try 10,000 ideas, but you just have to check that the thing they produced actually works, because the rest of them didn't work, you know?
[35:05] And so, basically, long story short, you have to come up with a system where an untrusted pool of workers can collaborate with a trusted pool of workers that do the verification. And the whole thing is kind of asynchronous, and it works, and so on, and it's safe from a security perspective, because if anyone sends you arbitrary code and you're going to run it, that is very sketchy and dodgy. But fundamentally it should be totally possible.
[35:32] So, you're familiar with projects like SETI@home and Folding@home; all of these problems have a similar kind of setup. In Folding@home you're folding a protein, and it's very hard to find a configuration that is low energy. But if someone finds a configuration that they evaluate to be low energy, that's perfect: you can just use it, you can easily verify it. So a lot of things have this property, very expensive to come up with but very cheap to verify, and in all those cases, things like Folding@home or SETI@home or auto-research-at-home will be good fits.
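(A rough sketch of the cheap-to-verify side of such a scheme follows. Nothing here is described concretely in the episode, and every name below is hypothetical: a trusted verifier re-runs a submitted training commit in isolation and only accepts it if it beats the current best validation loss.)

```python
# Hypothetical verifier for an "auto research at home" swarm. An untrusted
# worker submits a commit (a directory containing train.py); the trusted side
# re-runs it and checks the validation loss it reports. Finding a good commit
# may take thousands of tries; verifying one takes a single training run.
import json
import subprocess

BEST_VAL_LOSS = 3.21  # current leaderboard best (made-up number)

def verify_candidate(commit_dir: str, timeout_s: int = 3600) -> bool:
    """Run the submitted training code and compare its validation loss to the
    leaderboard. Assumes train.py writes {"val_loss": ...} to result.json; a
    real system would also sandbox the code far more aggressively."""
    subprocess.run(
        ["python", "train.py"],
        cwd=commit_dir,
        timeout=timeout_s,
        check=True,
        capture_output=True,  # don't trust the worker's stdout either
    )
    with open(f"{commit_dir}/result.json") as f:
        val_loss = json.load(f)["val_loss"]
    return val_loss < BEST_VAL_LOSS

if verify_candidate("submissions/commit_abc123"):
    print("accept commit and update leaderboard")
```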
[36:00] And so, long story short, a swarm of agents on the internet could collaborate to improve LLMs, and could potentially even run circles around frontier labs. Who knows, you know? Maybe that's even possible. Frontier labs have a huge amount of trusted compute, but the earth is much bigger and has a huge amount of untrusted compute, and if you put systems in place that deal with this, then maybe it is possible that the swarm out there could come up with better solutions.
[36:29] And people kind of contribute cycles to a thing that they care about. So the last thought is: lots of companies or whatnot could maybe have their own things that they care about, and if you have compute capacity you could contribute it to different kinds of auto research tracks. Like maybe you care about cancer of a certain type, or something like that. You don't have to just donate money to an institution; you actually could purchase compute and then join the auto research swarm for that project, you know? So if everything is rebundled into auto researchers, then compute becomes the thing that you're contributing to the pool.
[37:03] >> That's very inspiring, and it's also interesting. I don't know how far this goes, but it is interesting that at least some audience of people, you know, here in Silicon Valley, or lining up at retail stores in China, have discovered that having access to personal compute is interesting again.
[37:20] >> Yeah, right? So maybe they're really motivated to do that for their claws, and then they can contribute to auto research.
[37:25] >> Almost like: dollars are the thing everyone cares about now, but is flop the thing that actually everyone cares about in the future? Like, is there going to be almost a flippening of what's the thing that you care about? Right now, for example, it's really hard to get compute even if you have money. So actually it almost seems like the flop is dominant [laughter] in a certain sense.
[37:44] Yeah, so maybe it's kind of like that: how many flops do you control, instead of what wealth you control? I don't actually think that's true, but it's kind of interesting to think about.
[37:54] >> The last thing you released was a little bit of jobs data analysis, is that right? And it might have touched a nerve, even though you're just visualizing some public data. What were you curious about?
[38:06] Yeah, I guess I was curious because, I mean, everyone is really thinking about the impacts of AI on the job market and what it's going to look like. So I was just interested to take a look: what does the job market look like, where are the different roles, and how many people are in different professions? And I was really just interested to look through the individual cases and try to think for myself about, with these AIs and how they're likely to evolve, are these going to be tools that people are using? Are these going to be displacing tools for these professions? What are the current professions and how are they going to change: are they going to grow, or adjust to a large extent, or what could be new professions? So it's really just a way to fuel my own chain of thought about the industry, I suppose.
[38:49] And so, yeah, the jobs data is basically just the Bureau of Labor Statistics. They actually have a percent growth outlook for each profession, how much it's expected to grow over the next, I think, almost a decade. Yeah, I think it's a decade, but it was made in 2024.
[39:02] >> We need a lot of health care workers.
[39:04] Yeah. So they've already made those projections, and I'm not actually 100% sure what methodology they put into them. I guess I was interested to color things by the fact that what's primarily being developed now is this kind of more digital AI, almost like these ghosts or spirit entities that can interact in the digital world and manipulate a lot of digital information, but currently don't really have a physical embodiment or presence. And the physical stuff is probably going to go slightly slower, because you're manipulating atoms. Flipping bits, and the ability to copy-paste digital information, makes everything a million times faster than accelerating matter, you know.
[39:43] So, energetically, I just think we're going to see a huge amount of activity in the digital space: a huge amount of rewriting, a huge amount of activity, boiling soup. And I think we're going to see something in the digital space that goes at the speed of light compared to what's going to happen in the physical world, to some extent, if that would be the extrapolation.
[40:01] >> [clears throat]
[40:01] And so I think there's currently kind of an overhang, where there can potentially be a lot of unbundling of digital information processing that used to be done by computers and people. Now, with AIs, there's a third kind of manipulator of digital information, and there's going to be a lot of refactoring in those disciplines.
[40:19] But the physical world is actually going to be behind that by some amount of time, I think. And so what's really fascinating to me, and that's why I was highlighting the professions that fundamentally manipulate digital information, work you could do from your home, etc., is that I feel like those are where things will change. It doesn't mean there are going to be fewer of those jobs or more of those jobs, because that has to do with demand elasticity and many other factors. But things will change in these professions because of these new tools, and because of this upgrade to the nervous system of the human superorganism [laughter], if you want to think about it that way.
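(For readers who want to poke at the same public data, here is a minimal sketch of that kind of look, assuming a CSV export of BLS occupational projections; the file name, column names, and the "digital work" keyword list are all assumptions for illustration, not Karpathy's actual analysis.)

```python
# Hypothetical pass over BLS-style occupational projections: flag professions
# that mostly manipulate digital information and compare projected growth.
# File and column names are made up; adapt them to the export you actually have.
import pandas as pd

df = pd.read_csv("bls_projections_2024.csv")  # assumed columns: occupation, employment_2024, pct_change_2024_34

DIGITAL_HINTS = ("software", "analyst", "accountant", "writer", "designer", "customer service")

def is_digital(occupation: str) -> bool:
    """Crude proxy for 'work you could do from home, manipulating digital info'."""
    return any(h in occupation.lower() for h in DIGITAL_HINTS)

df["digital"] = df["occupation"].map(is_digital)

# Compare projected growth for digital vs. physical-world professions,
# and list the fastest-growing occupations overall.
print(df.groupby("digital")["pct_change_2024_34"].describe())
print(df.sort_values("pct_change_2024_34", ascending=False).head(10))
```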
[40:52] Given the look you had at the data, do you have any observations or guidance for people facing the job market, or thinking about what to study now or what skills to develop? I mean, I'm very thankful that I have to meet people for my job right now.
[41:08] >> Yeah. [laughter] Yeah, more physical. Could you do your work from home, though?
[41:10] >> I could. I think there are relationship parts of it that are hard, but most of it I could.
[41:15] Yeah. I think it's really hard to tell, because again, the job market is extremely diverse, and I think the answers will probably vary. But to a large extent these tools are extremely new and extremely powerful, and so just trying to keep up with them is the first thing. Because I think a lot of people kind of dismiss it, or they're afraid of it, etc., which is totally understandable, of course. I think it's fundamentally an empowering tool at the moment, and these jobs are bundles of tasks, and some of those tasks can go a lot faster. So people should think of it primarily as the tool that it is right now.
[41:48] And I think the long-term future of that is uncertain. It's really hard to forecast, to be honest, and I'm not professionally doing that; I think this is a job for economists to do properly.
[41:59] >> You are an engineer, though. And one thing I thought was interesting is that the demand for engineering jobs is continuing to increase.
[42:08] >> Yeah.
[42:10] >> I can't tell if that's a temporary phenomenon. I'm not sure how I feel about it. Do you know?
[42:13] Yeah, that's the demand elasticity, almost. Like, software was scarce, right? And the reason we don't have more demand for software is just its scarcity: it's too expensive.
[42:22] >> So if the barrier comes down, then actually you have the Jevons paradox, which is that the demand for software actually goes up.
[42:27] It's cheaper, and there's more. More powerful, yeah. The classical example of this is always the ATMs and the bank tellers, because there was a lot of fear that ATMs and computers would displace tellers. But what happened is they made the cost of operating a bank branch much cheaper, so there are more bank branches, so there are more tellers. It's the canonical example people cite. But basically it's just Jevons paradox: something becomes cheaper, so there's a lot of unlocked demand for it.
[42:57] So I do have a cautiously optimistic view of this in software engineering, where it does seem to me like the demand for software will be extremely large, and it's just become a lot cheaper. And so I do think that for quite some time, it's very hard to forecast, but it does seem to me like right now, at least locally, there's going to be more demand for software.
[43:20] Because software is amazing; it's digital information processing. You're not forced to use arbitrary tools that were given to you and that are imperfect in various ways; you're not forced to subscribe to what exists. Code is now ephemeral, and it can change and be modified. And so I think there's going to be a lot of activity in the digital space to rewire everything, in a certain sense, and I think that's going to create a lot of demand for this kind of stuff.
[43:43] I think long-term, yeah, obviously, even with auto research: OpenAI or Anthropic or these other labs are employing what, like a thousand-something researchers, right?
[43:53] >> Mhm.
[43:54] These researchers are basically, like, glorified auto, you know. [laughter] They're automating themselves away, actively, and this is the thing they're all trying to do.
[44:02] >> Some of those researchers also feel the psychosis, right? Because it's working, right? And so they're like, it's over for me, too.
[44:10] I did spend a bunch of time going around OpenAI, and I was like, you guys realize if we're successful, we're all out of a job? We're just building automation for Sam or something like that, or the board, I'm not sure, but they're just building all this automation for, yeah, the board or the CEO or something like that, and we're all out of our jobs, and maybe contributing on the side. So, yeah, it's kind of unnerving from that perspective.
[44:36] >> Is it okay if I ask you Noam's question? You could be doing that, right: auto researching with a lot of compute scale and a bunch of colleagues at one of the frontier [clears throat] labs. Like, why not?
[44:44] Well, I was there for a while, right? And I did re-enter. So to some extent I agree, and I think there are many ways to slice this question; it's a very loaded question, a little bit. I will say that I feel very good about what people can contribute and the impact they can have outside of the frontier labs, obviously, not just in the industry, but also in more ecosystem-level roles. Your role, for example, is more ecosystem-level; my role currently is also kind of at the ecosystem level. And I feel very good about the impact people can have in those kinds of roles.
[45:12] I think, conversely, there are definite problems in my mind with basically aligning yourself way too much with the frontier labs, too. Fundamentally, you have a huge amount of financial incentive tied up with these frontier labs. And by your own admission, the AIs are going to really change humanity and society in very dramatic ways, and here you are basically building the technology and benefiting from it, and being very allied to it through financial means. This was the conundrum that was at the heart of how OpenAI was started in the beginning; this was the conundrum we were trying to solve.
[45:47] >> Mhm.
[45:47] And so, you know, it's still not fully resolved.
[45:51] So that's number one: you're not a completely free agent, and you can't actually be part of that conversation in a fully autonomous, free way if you're inside one of the frontier labs. There are some things that you can't say, and conversely there are some things that the organization wants you to say. They're not going to twist your arm, but you feel the pressure of what you should be saying, because obviously [laughter] otherwise it's really awkward conversations, strange side-eyes, like, what are you doing, you know? So you can't really be an independent agent. And I feel a lot more aligned with humanity, in a certain sense, outside of the frontier lab, because I'm not subject to those pressures, almost, right? And I can say whatever I want.
[46:30] Or, yeah, I would say in the frontier labs you can have impact there too, of course.
[46:37] But there are many researchers, and maybe you're one of them, maybe your ideas are really good, etc. Maybe there's a lot of decision-making to do and you want to be in a position where you're in the room for those conversations when they come up. I do think that currently the stakes are overall fairly low, and so everything is kind of nice. But ultimately, at the end of the day, when the stakes are really high, if you're an employee at an organization, I don't actually know how much sway you're going to have over what your organization is going to do. Fundamentally, you're not really in charge: you're in the room and you're contributing ideas, but you're not really in charge of that entity that you're part of. So those are some sources of misalignment, I think, to some extent.
[47:11] I will say that in one way I do agree a lot with that sentiment: the labs, for better or worse, are opaque, and a lot of the work is there. They're at the edge of capability and what's possible, and they're working on what's coming down the line. And I think if you're outside of a frontier lab, your judgment fundamentally will start to drift, because you're not part of what's coming down the line. And so I feel like my judgment will inevitably start to drift as well, and I won't actually have an understanding of how these systems work under the hood, because it's an opaque system, or a good understanding of how it's going to develop, etc. So I do think that in that sense I agree, and it's something I'm nervous about.
[47:49] I think it's worth basically being in touch with what's actually happening, and actually being in a frontier lab. And if some of the frontier labs would have me come for, you know, some amount of time, and do really good work for them, and then maybe come and hang out...
[48:00] >> Looking for a job, this is super exciting. [laughter]
[48:03] Then I think that's maybe a good setup, because I kind of feel like maybe that's one way to actually be connected to what's actually happening, but also not feel like you're necessarily fully controlled by those entities. So honestly, in my mind, Noam can probably do extremely good work at OpenAI, but also I think his most impactful work could very well be outside of OpenAI.
[48:25] >> Noam, that's a call to be an independent researcher with auto [laughter] research.
[48:30] Yeah, there are many things to do on the outside. And I think ultimately the ideal solution maybe is, yeah, going back and forth. And I think fundamentally you can have a really amazing impact in both places. So, very complicated, I don't know; it's a very loaded question, a little bit. But I mean, I joined a frontier lab, and I'm outside, and then maybe in the future I'll want to join again. And that's kind of how I look at it.
[48:54] >> One question, related to what visibility the world or the AI ecosystem has into the frontier, is how close open source is to the frontier, and how sustainable that is. I think the entire sequence of events is actually quite surprising: from having a handful of Chinese models and global models, and I think people are going to continue releasing here in the near term, that are closer than much of the industry anticipated from a capability [clears throat] perspective. I don't know if you're surprised by that, but you're a long-term contributor to open source. What's your prediction here?
[49:31] Yeah, so roughly speaking, basically the closed models are ahead, but people are monitoring the number of months that the open-source models are behind. It started with there being nothing, and then it went to 18 months, and now...
[49:41] >> Yeah, but then convergence, right?
[49:43] So then maybe they're behind by, what is the latest, maybe like 8 months? Six months, 8 months kind of thing right now.
[49:48] now. Yeah, I'm a huge fan of open-source, obviously. So for example,
[49:50] open-source, obviously. So for example, in operating systems, you have like
[49:51] in operating systems, you have like closed source, like, you know, Windows
[49:52] closed source, like, you know, Windows and Mac OS, these are large software
[49:54] and Mac OS, these are large software projects, kind of like what LLMs are
[49:55] projects, kind of like what LLMs are going to become, and there's Linux. Mhm.
[49:57] going to become, and there's Linux. Mhm. But Linux is very easy. Like, actually
[49:59] But Linux is very easy. Like, actually Linux is extremely successful project.
[50:00] Linux is extremely successful project. It runs on the vast majority of
[50:01] It runs on the vast majority of computers. Like, last time I checked,
[50:03] computers. Like, last time I checked, was it like 60% or something like from
[50:05] was it like 60% or something like from Linux? Um and that's because there is a
[50:07] Linux? Um and that's because there is a need in the industry to have a common
[50:09] need in the industry to have a common open platform that everyone feels uh
[50:11] open platform that everyone feels uh sort of safe using. I would say like the
[50:13] sort of safe using. I would say like the industry has always felt a demand for
[50:14] industry has always felt a demand for that kind of a project to exist. Mhm.
[50:16] that kind of a project to exist. Mhm. >> And I think the same is true now. And
[50:18] >> And I think the same is true now. And that's why businesses actually want
[50:19] that's why businesses actually want there's demand for this kind of a um a
[50:21] there's demand for this kind of a um a thing to exist. The big difference is
[50:23] thing to exist. The big difference is that everything is capital uh there's a
[50:25] that everything is capital uh there's a lot of capex that goes into this.
[50:27] lot of capex that goes into this. >> Um so I think that's where things like
[50:29] >> Um so I think that's where things like fall apart a little bit, make it a bit
[50:30] fall apart a little bit, make it a bit harder to to compete in certain senses.
[50:32] I do think that the current models are very good. The other thing I think is really interesting is that for the vast majority of consumer use cases and things like that, even current open-source models are actually quite good, I would say. And I think if you go forward more years, it does seem to me that a huge number of simple use cases are going to be well covered, and actually even run locally. Mhm.
[50:54] But there's always going to be some demand for frontier intelligence, and that can actually be an extremely large piece of the pie. But it could be that the need for frontier intelligence is going to be, you know, Nobel Prize kind of work. Mhm.
[51:05] >> "Let's move Linux from C to Rust." It's going to be bigger projects, you know, scoped in that kind of way, and maybe that's what a lot of the frontier closed intelligence is going to be interacting with, and open-source is kind of going to eat through a lot of the more basic use cases, or something like that. You know, at some point what is frontier today, probably later this year, what's frontier today in terms of what I'm using right now from the closed labs might be open-source, and that's going to be doing a lot of work.
[51:34] So I kind of expect that this dynamic will basically continue. We'll have frontier labs that have closed AIs that are kind of like these oracles, and then we'll have open-source some number of months behind. And I kind of expect that to continue. And I actually think that's a pretty good setup overall,
[51:51] because I'm a little bit hesitant of having... structurally, I think there's some systemic risk attached to just having intelligences that are closed, and that's it. Mhm. And I think centralization has a very poor track record, in my view, in the past.
[52:07] >> You mean in political or economic systems, in general?
[52:10] >> [laughter]
[52:12] >> Exactly. I think there's a lot of pretty...
[52:13] >> An Eastern European.
[52:16] A lot of pretty bad precedents. So I want there to be a thing that is maybe not at the edge of capability, because that's new and unexplored, etc., but I want there to be a thing that's behind and that is kind of like a common working space for intelligences that the entire industry has access to. Yeah, that seems to me like a pretty decent power balance for the industry.
[52:30] Yeah. I also think there are just many problems to solve, right? If you keep advancing intelligence from the frontier, we can do new things, and there are a lot of very big problems for humanity, right? And so it seems that that will continue to be a very expensive game. And so I want to root for the labs that are doing that, because there are problems we cannot solve without continuing to advance the models in a very expensive way. And yet, as you point out, if what we have today as frontier is open, that's a lot of capability, right? And so I think, you know, the power of that, or the democratization of that, seems like
[53:04] >> Yeah. Very useful and also healthy.
[53:06] >> Yeah. I think basically, by accident, we're actually in an okay spot.
[53:09] >> An optimal one. Yeah. [laughter] Yeah. Like, by accident, it happened to be in a good spot in a certain sense. Mhm.
[53:14] Um, well, and to some degree, the longer this dynamic endures, the healthier a spot the ecosystem might be in, right? Because you have more and more area under the curve.
[53:25] >> Mhm. And I will say that even on the closed side, I almost feel like it's been centralizing even further recently, because I think a lot of the frontrunners are not necessarily, like, the top tier. And so, yeah, in that sense I think it's not super ideal. I would love there to be more frontier labs, because, yeah, I'm by default very suspicious of...
[53:45] I want there to be more people in the room. I think in machine learning, ensembles always outperform any individual model, and so I want there to be ensembles of people thinking about all the hardest problems, and I want there to be ensembles of people in the room, all well informed, to make those decisions, you know? So I don't want it to be closed doors with two or three people. I feel like that's not a good future. I almost wish there were more labs, as long as they're... and I do think that open-source has a place to play. I hope it sticks around. Basically, it's currently slightly behind, and that's actually kind of a good thing.
[54:17] Okay, you worked on the precursor to generalized robotics autonomy, in cars, right? A lot has happened in the last couple of months with robotics companies as well: acceleration of really impressive generalization across environments and tasks, increasingly long-horizon tasks, lots of money going into the space. Is it going to happen? Has anything in your view changed recently?
[54:40] So my view is kind of informed by what I saw in self-driving, and I do feel like self-driving is the first robotics application. What I saw is, at the time, like 10 years ago, there were a large number of startups, and I kind of feel like most of them basically didn't make it long-term. And what I saw is that a lot of capital expenditure had to go in, and a lot of time. And so I think robotics, because it's so difficult, so messy, and requires a huge amount of capital investment and a lot of conviction...
[55:09] It's just a big problem, and I think atoms are really hard. So I kind of feel like it will lag behind what's going to happen in digital space. And in digital space there's going to be a huge amount of unhobbling, basically things that weren't super efficient becoming a lot more efficient, by like a factor of a hundred.
[55:25] >> Mhm. Because bits are so much easier.
[55:27] And so I think currently, in terms of what's going to change and where the activity is, I kind of feel like digital space is going to change a huge amount, and then the physical space will lag behind. And what I find very interesting is the interface between them as well. Because if we do have more agents acting on behalf of humans, and more agents talking to each other and doing tasks and participating in a kind of economy of agents, etc., you're going to run out of things you can do purely in the digital space. At some point you have to go to the universe and ask it questions. You have to run an experiment and see what the universe tells you, to learn something. And so we currently have a huge amount of digital work, because there's an overhang in how much we collectively thought about what already is digital.
[56:12] So we just didn't have enough thinking cycles among the humans to think about all the information that is already digital and already uploaded. And so we're going to start running out of stuff that is actually already uploaded. So at some point you're going to read all the papers and process them and have some ideas about what to try, but, yeah, we're just going to...
[56:29] I don't actually know how far you can get with intelligence that's fully closed off, with just the information that's already available, you know. And so I think what's going to happen is, first there's going to be a huge amount of unhobbling, and I think there's a huge amount of work there. Then it's actually going to move to the interfaces between physical and digital. And that's sensors, for seeing the world, and actuators, for doing something to the world.
[56:48] >> Mhm.
[56:49] So I think a lot of interesting companies will actually come from that interface: can we feed the superintelligence, in a certain sense, data, and can we actually take data out and manipulate the physical world per its bidding, if you want to anthropomorphize the whole thing, right? And then the physical world: I almost feel like the total addressable market, in terms of the amount of work and so on, is massive, possibly even much larger than what can happen in digital space. So I actually think it's a much bigger opportunity as well.
[57:18] But I do feel like it's a huge amount of work, and in my mind the atoms are just a million times harder. So it will lag behind, but it's also, I think, a little bit of a bigger market. So I think the opportunity kind of follows that trajectory: right now digital is my main interest, then interfaces will be after that, and then maybe some of the physical things; their time will come, and they'll be huge when they do come.
[57:43] Well, it's an interesting framework for it, too, because certain things, not the things I'm working on right now, but certain things, are much easier even in the world of atoms.
[57:51] >> Mhm.
[57:52] Right? Like, if you just think about read and write to the physical world: read, like sensors, cameras. There's a lot of existing hardware, and you can imagine enriching agent capabilities or capturing a lot of new data if you're just clever about it, and you don't necessarily have to invest a lot to get something valuable.
[58:10] >> Yeah. Right. Yeah. So, examples of this that I saw: for example, you know, a friend of mine, Liam, is the CEO of Periodic Labs. I visited them last week, so it was just top of mind. They're trying to do auto research for materials science. Mhm. And so in that case the sensors to the intelligence are actually pretty expensive lab equipment. And the same is true in biology. I think a lot of people are very interested in engineering biology, and, you know, the sensors will be more than just video cameras. Does that make sense? And then the other thing I saw, for example, is companies that are trying to, basically, pay people for training data. Yeah. Yeah.
[58:42] >> To feed the... Yeah.
[58:42] >> Programmatically.
[58:43] >> Yeah. To feed the Borg. And so these are all examples of sensors in a certain sense. They take many diverse shapes and forms, if that makes sense. Mhm. Yeah, so I'm looking forward to the point where I can ask for a task in the physical world, put a price on it, and just tell the agent: you know, you figure out how to do it. Go get the data.
[59:02] >> I'm actually kind of surprised we don't have enough information markets. Mhm. Like, for example, Polymarket or other betting markets, or even stocks, etc.: if they have so much autonomous activity, and a rising amount of activity, Mhm. then, for example, if the Iran situation were happening right now, how come there isn't a process where taking a photo or video from somewhere in Tehran costs like 10 bucks? Someone should be able to pay for that, you know? And that's an example of feeding the intelligence. There's not going to be a human looking at it; it's going to be agents trying to guess the betting games and stock markets and so on. Mhm. So I kind of feel like the agentic web is still fairly new and there are no mechanisms for this, but this is an example of what I think might happen.
[59:37] There's a good book that is maybe inspiring, called Daemon. Mhm. You've potentially read it. In Daemon, the intelligence ends up almost puppeteering humanity a little bit, in a certain sense, you know? And so humans are kind of like its actuators, but humans are also like its sensors. And so I think, collectively, society will kind of reshape in a certain way to serve that. That will kind of end up happening collectively across the industry, where, yeah, there's just a lot more automation, and it has certain needs, and humans will be serving the needs of that machine, not necessarily each other's.
[01:00:12] >> Well, we were on this very specific point of missing pieces of training data: we needed something like auto research, right? Like, we need the training cycle, or the SFT piece, to be far more mechanized. Mhm.
[01:00:27] For which part?
[01:00:28] >> In order to make the collection... in order to take the human out of the loop, to ask for a task that is just "improve my model quality with new data," right? Uh, yes. Does that make sense to you? Like, if you can't have the model do the training runs by itself, then your ability to do this as a closed-loop task, by pricing data, is more challenged. Yes, yes, 100%. Yeah. But now you do.
[01:00:57] >> The thing is, for LLM training, it actually very easily, it really fits the paradigm. Mhm. Um, so you'd actually expect
[01:01:04] >> A metric.
[01:01:04] Yeah, like LLM training actually fits the paradigm really well, really easily. Like, all the optimization of all the code and so on, so it runs faster. And then you also have metrics that you can optimize against. I do think that if you had an autonomous loop over those metrics, there's going to be a lot of Goodharting going on, where the system will overfit to those metrics. But then you can use the system to devise more metrics, and then you just have really good coverage. So it's kind of hard to tell, but in a certain sense it's a pretty good fit.
[01:01:29] I want to talk about a tiny side project you have before we end. Tell me about the microGPT arc.
[01:01:36] Oh, yeah.
[01:01:37] Okay, so microGPT. I have this running obsession, of maybe a decade or two, of just simplifying and boiling down, basically, LLMs to their bare essence. And I've had a number of projects along these lines, like nanoGPT and makemore and micrograd, etc. So I feel like microGPT is now the state of the art of me trying to boil it down to just the essence.
[01:01:59] Because the thing is, training neural nets, and LLMs specifically, is a huge amount of code, but all of that code is actually complexity from efficiency. It's just because you need it to go fast. If you don't need it to go fast and you just care about the algorithm, then that algorithm is actually 200 lines of Python, very simple to read, and this includes comments and everything. Because you just have your dataset, which is text, and you need your neural network architecture, which is like 50 lines. You need to do your forward pass, and then you have to do your backward pass to calculate the gradients. And so an autograd engine to calculate the gradients is like 100 lines.
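To make that concrete for readers, here is a minimal sketch of the kind of scalar autograd engine being described, written in the spirit of his earlier micrograd project; the class and method names below are illustrative assumptions, not code taken from microGPT itself.

```python
# A toy scalar autograd engine (illustrative sketch, not microGPT's actual code).
import math

class Value:
    """A scalar that remembers how it was computed, so gradients can flow backward."""
    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None   # closure that pushes grad to the parents
        self._prev = set(_children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))
        def _backward():
            self.grad += (1 - t * t) * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule node by node.
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for node in reversed(topo):
            node._backward()

# Usage: gradients of a tiny expression y = tanh(w*x + b)
x, w, b = Value(0.5), Value(-2.0), Value(1.0)
y = (w * x + b).tanh()
y.backward()
print(x.grad, w.grad, b.grad)   # dy/dx = -2.0, dy/dw = 0.5, dy/db = 1.0
```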
[01:02:34] And then you need an optimizer, and Adam, for example, which is a very state-of-the-art optimizer, is again like 10 lines, really. And so putting everything together in the training loop is, yeah, like 200 lines.
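As a rough illustration of the "10 lines" attributed to Adam here, below is a minimal sketch of the Adam update plus a bare training loop on a toy objective, using NumPy; the function name, hyperparameters, and the toy linear fit are assumptions for illustration, not microGPT's actual code.

```python
# Minimal Adam update and training loop (illustrative sketch, not microGPT's actual code).
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-2, beta1=0.9, beta2=0.999, eps=1e-8):
    # Exponential moving averages of the gradient and its square.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias-correct the moments (t counts from 1), then take the step.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Toy objective: fit a single weight w so that w*x matches y = 3x (mean squared error).
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 3.0 * x
w = np.zeros(1)
m, v = np.zeros(1), np.zeros(1)
for t in range(1, 2001):
    grad = np.array([2.0 * np.mean((w * x - y) * x)])  # d(MSE)/dw
    w, m, v = adam_step(w, grad, m, v, t)
print(w)  # converges toward [3.]
```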
[01:02:44] And what's interesting to me: normally, before, maybe a year ago or more, if I had come up with microGPT, I would be tempted to basically explain it to people. Like, I'd have a video stepping through it or something like that. And I actually tried to make that video a little bit, and I tried to make a little guide to it and so on. But I kind of realized that this is not really adding too much, because it's already so simple, it's 200 lines, that anyone could ask their agent to explain it in various ways. And so I'm not explaining it to people anymore; I'm explaining it to agents. If you can explain it to agents, then agents can be the router, and they can actually target it to the human, in their language, with infinite, you know, patience, and just at their capability and so on. Right.
[01:03:25] If I don't understand this particular function, I can ask the agent to explain it to me three different ways, and I'm not going to get that from you.
[01:03:32] Exactly. And so I kind of feel like, you know, what is education? It used to be guides, it used to be lectures, it used to be this thing, but now I feel like I'm more explaining things to agents, and maybe I'm coming up with skills, where, basically, a skill is just a way to instruct the agent how to teach the thing. So maybe I could have a skill for microGPT with the progression I imagine the agent should take you through if you're interested in understanding the codebase. And it's just hints to the model: first start off with this, and then with that. And so I could just script the curriculum a little bit as a skill.
[01:04:04] So, I don't feel like... yeah, I feel like there's going to be less explaining things directly to people, and more of just: does the agent get it? And if the agent gets it, they'll do the explanation. And we're not fully there yet, because I still think I can probably explain things a little bit better than the agents, but I feel like the models are improving so rapidly that it's a losing battle to some extent.
[01:04:28] And so I think education is going to be reshuffled by this quite substantially, where it's the end of teaching each other things, a little bit. Like, if I have a library of code, for example, it used to be that you'd have documentation for the other people who are going to use your library, but you shouldn't do that anymore. Instead of HTML documents for humans, you should have markdown documents for agents. Because if agents get it, then they can just explain all the different parts of it. So it's this redirection through agents, you know?
[01:04:55] And that's why I think we're going to see a lot more of that playing out.
[01:04:59] Well, we'll see if the great teachers develop intuition for how to explain things to agents differently.
[01:05:05] >> Ultimately... so, for example, microGPT: I tried to get an agent to write microGPT. I told it, try to boil this down to the simplest thing. Like, try to boil down my neural network training to the simplest thing, and it can't do it. MicroGPT is, like, the end of my obsession. It's the 200 lines. I thought about this for a long time; I was obsessed about this for a long time. This is the solution. Trust me, it can't get simpler. And this is my value-add. Everything else, the agent gets it. It just can't come up with it, but it totally gets it and understands why it's done in a certain way, etc. So my contribution is kind of these few bits, but everything else, in terms of the education that goes on after that, is not my domain anymore.
[01:05:47] So maybe, yeah, education kind of changes in those ways, where you have to infuse the few bits that you feel strongly about: the curriculum, or the better way of explaining it, or something like that. The things that agents can't do are your job now. The things that agents can do, they can probably do better than you, or will very soon. And so you should be strategic about what you're actually spending time on.
[01:06:06] Well, we appreciate the few bits.
[01:06:09] Thank you, Andre.
[01:06:10] Okay.
[01:06:13] Find us on Twitter at No Priors Pod.
[01:06:15] >> [music]
[01:06:15] >> Subscribe to our YouTube channel if you want to see our faces. Follow the show on Apple Podcasts, Spotify, or wherever you listen. [music] That way you get a new episode every week. And sign up for emails or find transcripts for every episode at no-priors.com.