1 00:00:01,600 --> 00:00:04,880 As of today, I still think there is a 2 00:00:05,080 --> 00:00:08,580 25% chance we will get superintelligence by 2030. 3 00:00:09,620 --> 00:00:13,480 There's at least a 10% chance literally every person on Earth 4 00:00:13,520 --> 00:00:17,440 will die by 2030. There's at least 5 00:00:17,500 --> 00:00:21,080 a 10% chance we will get a permanent dictatorship. 6 00:00:21,340 --> 00:00:25,300 A handful of people will run the entire world for centuries 7 00:00:25,380 --> 00:00:27,740 together, with no ability to, you know, overthrow them. 8 00:00:28,300 --> 00:00:30,520 At least 10% chance of this by 2030. 9 00:00:31,560 --> 00:00:35,060 The only way all these numbers change is if there is a 10 00:00:35,120 --> 00:00:39,100 political movement to pause AI research that succeeds within the next five years. 11 00:00:40,320 --> 00:00:44,120 Uh, I'm going to now describe my technical 12 00:00:44,320 --> 00:00:44,980 reasons why I think this. 13 00:00:50,580 --> 00:00:50,589 (electronic music) 14 00:00:50,640 --> 00:00:54,580 Creation of artificial superintelligence is likely the most important event 15 00:00:54,620 --> 00:00:57,650 of approximately 10,000 years of human history. 16 00:00:58,900 --> 00:01:00,540 It is more important than Industrial 17 00:01:00,600 --> 00:01:03,680 Revolution/Newton/French 18 00:01:03,760 --> 00:01:06,060 Revolution/printing press. 19 00:01:07,200 --> 00:01:10,940 It is more important than invention of nuclear weapons, as an 20 00:01:11,020 --> 00:01:14,980 ASI will accelerate creation of weapons more dangerous 21 00:01:15,040 --> 00:01:15,660 than nukes. 22 00:01:16,780 --> 00:01:20,560 It is definitely more important than the creation of the internet and all 23 00:01:20,660 --> 00:01:22,180 Silicon Valley startups. 
24 00:01:23,260 --> 00:01:27,080 If the takeoff is fast enough, creation of superintelligence 25 00:01:27,120 --> 00:01:30,820 may be the most important event in the approximately 14 26 00:01:31,000 --> 00:01:33,420 billion year history of the universe. 27 00:01:34,040 --> 00:01:37,820 It could end up more important than the evolutionary history of all 28 00:01:37,960 --> 00:01:41,620 other life forms on Earth. And to the best of our current 29 00:01:41,680 --> 00:01:45,500 knowledge, Earth is the only place in the universe 30 00:01:45,540 --> 00:01:46,640 with intelligent life. 31 00:01:47,840 --> 00:01:51,230 Scaling of RL compute still seems important and 32 00:01:51,340 --> 00:01:52,020 unsolved. 33 00:01:53,100 --> 00:01:56,500 We still seem to have spent less than $1 million per 34 00:01:56,560 --> 00:02:00,220 inference, and can, in theory, spend at least $1 35 00:02:00,400 --> 00:02:04,200 billion on it. In my head, it is an open 36 00:02:04,260 --> 00:02:07,370 question whether the labs tried it and got bad results, 37 00:02:08,120 --> 00:02:11,320 or if there's some other fixable research or infra 38 00:02:11,420 --> 00:02:15,220 bottleneck that is preventing them from trying it. 39 00:02:15,360 --> 00:02:19,120 Meta-graph on this still seems useful, even though it is making 40 00:02:19,200 --> 00:02:20,140 simplifications. 41 00:02:21,340 --> 00:02:22,700 This is still important. 42 00:02:23,700 --> 00:02:27,280 If we have one AI that is superhuman, we can 43 00:02:27,320 --> 00:02:31,120 probably run at least 100,000 copies of it at 44 00:02:31,420 --> 00:02:33,780 100 times the speed of human thinking. 45 00:02:34,340 --> 00:02:38,120 This means that once the first slightly superhuman AI is 46 00:02:38,160 --> 00:02:41,900 invented, we will go from slightly superhuman to 47 00:02:42,100 --> 00:02:45,720 vastly superhuman in a very short span of time. 
48 00:02:46,960 --> 00:02:50,740 I would nowadays more emphasize the meta-learning point below, 49 00:02:50,780 --> 00:02:54,580 though, because that decides what happens until we 50 00:02:54,600 --> 00:02:55,860 get superhuman AI. 51 00:02:56,840 --> 00:03:00,240 Deferring to experts is obviously still important. 52 00:03:00,460 --> 00:03:04,300 I value my inside view a lot, but I value expert 53 00:03:04,340 --> 00:03:05,940 opinion some amount too. 54 00:03:07,160 --> 00:03:11,020 Many researchers who pretend not to be doomer also have 55 00:03:11,080 --> 00:03:13,340 timelines in the next five to 10 years. 56 00:03:14,040 --> 00:03:17,400 Examples, Rich Sutton, Ilya Sutskever, 57 00:03:17,980 --> 00:03:19,740 Andrej Karpathy, et cetera. 58 00:03:20,760 --> 00:03:24,460 There is the obvious list of doomer researchers, example, 59 00:03:24,860 --> 00:03:28,660 Geoffrey Hinton, Yoshua Bengio, Stuart Russell, 60 00:03:28,880 --> 00:03:29,340 et cetera. 61 00:03:30,380 --> 00:03:34,300 Heads of all frontier labs agree that their own tech could 62 00:03:34,380 --> 00:03:35,780 cause human extinction. 63 00:03:36,640 --> 00:03:39,840 Example, Elon Musk, Sam Altman, 64 00:03:40,460 --> 00:03:43,960 Dario Amodei, Demis Hassabis, et cetera. 65 00:03:45,000 --> 00:03:48,250 Generalization and meta-learning are related. 66 00:03:48,990 --> 00:03:52,980 Newer models aren't just memorizing more skills, but are more 67 00:03:53,040 --> 00:03:56,780 capable of learning new skills on their own without human 68 00:03:56,800 --> 00:03:57,780 data guiding them. 69 00:03:58,750 --> 00:04:02,620 There is a difference between training AI such that it learns skills that 70 00:04:02,680 --> 00:04:06,580 humans already have, versus training AI to know 71 00:04:06,720 --> 00:04:09,060 how to learn new skills on its own. 72 00:04:09,880 --> 00:04:12,820 The difference between these two is not binary. 73 00:04:13,340 --> 00:04:15,280 There are levels to generalization. 
74 00:04:16,260 --> 00:04:18,640 Example of low level of generalization, 75 00:04:19,459 --> 00:04:23,340 seeing lots of English to French translations and learning to 76 00:04:23,360 --> 00:04:26,100 become good at English to French translation. 77 00:04:27,080 --> 00:04:29,580 Example of medium level of generalization, 78 00:04:30,440 --> 00:04:34,420 seeing many solved competitive programming puzzles and solving 79 00:04:34,580 --> 00:04:38,540 another puzzle with an algo similar to one it has seen before for 80 00:04:38,550 --> 00:04:39,490 a different puzzle. 81 00:04:40,300 --> 00:04:43,940 Seeing lots of English text and a little Hindi text, 82 00:04:44,440 --> 00:04:48,430 inferring the grammar similarities and differences across human languages, 83 00:04:48,980 --> 00:04:52,180 and then learning to speak fluent Hindi as a result. 84 00:04:53,000 --> 00:04:56,820 Yeah, lol. This is a real result, and an old 85 00:04:56,960 --> 00:04:57,140 one. 86 00:04:58,090 --> 00:05:00,560 Example of high level of generalization, 87 00:05:01,420 --> 00:05:05,280 seeing a theorem from economics and realizing an analogous 88 00:05:05,300 --> 00:05:09,240 version of it also applies to biology, thereby solving 89 00:05:09,280 --> 00:05:11,410 an open problem in biology. 90 00:05:12,480 --> 00:05:16,150 GPT-2 to GPT-5 has led to 91 00:05:16,300 --> 00:05:20,140 immense progress in both learning new skills and 92 00:05:20,200 --> 00:05:24,010 learning how to learn new skills. The latter 93 00:05:24,060 --> 00:05:25,290 is more important though. 94 00:05:26,280 --> 00:05:30,240 The best way to test the latter is to give AI problems that 95 00:05:30,300 --> 00:05:34,080 a human has never seen and no human knows how to 96 00:05:34,200 --> 00:05:38,160 solve. 
This could be simple tasks, like translation 97 00:05:38,200 --> 00:05:41,600 in apocryphal ancient languages, or complex 98 00:05:41,660 --> 00:05:45,540 tasks, like proposing novel molecular bio 99 00:05:45,560 --> 00:05:45,840 run. 100 00:05:46,800 --> 00:05:50,540 Continual learning is a hard problem, but my hunch 101 00:05:50,620 --> 00:05:53,360 is one or two breakthroughs might solve it. 102 00:05:54,480 --> 00:05:58,080 AI weights are currently static, and hence, AI 103 00:05:58,180 --> 00:06:00,240 behaves as if it has amnesia. 104 00:06:01,060 --> 00:06:05,000 Once a chain of thought is complete, the AI forgets it ever 105 00:06:05,040 --> 00:06:08,988 did the task. Naive way to make sure AI 106 00:06:09,008 --> 00:06:12,778 remembers its previous inferences is to fine-tune it 107 00:06:12,848 --> 00:06:16,718 on this data. But fine-tuning LLMs is 108 00:06:16,728 --> 00:06:18,348 very sample inefficient. 109 00:06:19,208 --> 00:06:23,178 A slightly better approach is to put this data back into the chain of 110 00:06:23,288 --> 00:06:27,068 thought itself. But there may be limits to how long 111 00:06:27,178 --> 00:06:30,748 chains of thought can be. Open research question. 112 00:06:31,648 --> 00:06:33,528 We might discover better approaches. 113 00:06:34,708 --> 00:06:38,488 Operating in the real world can be expensive in domains like 114 00:06:38,548 --> 00:06:42,428 biotech. Therefore, we must be sample efficient 115 00:06:42,588 --> 00:06:46,328 in terms of how much data the AI requires from real-world 116 00:06:46,388 --> 00:06:50,168 experiments before it learns useful insights from it. 117 00:06:50,228 --> 00:06:53,488 Human language has features not present in animal language, 118 00:06:54,148 --> 00:06:57,828 and this is likely an important part of why humans can build spacecraft and 119 00:06:57,948 --> 00:07:00,288 colonize the Earth, but apes can't. 
120 00:07:01,468 --> 00:07:04,808 AI already has picked up all these features of human language. 121 00:07:06,048 --> 00:07:09,768 Go read Hockett's views on what features separate human language from animal 122 00:07:09,828 --> 00:07:10,328 language. 123 00:07:11,748 --> 00:07:13,888 Humans use language for communication. 124 00:07:14,788 --> 00:07:18,058 Humans might also use language as part of their reasoning process, 125 00:07:18,568 --> 00:07:21,868 along with other modules, such as motor skills, spatial 126 00:07:21,928 --> 00:07:23,388 visualization, et cetera. 127 00:07:24,368 --> 00:07:28,328 Both of these are hypotheses for what separates humans from apes, and 128 00:07:28,338 --> 00:07:29,988 I think there's a good chance they're true. 129 00:07:31,188 --> 00:07:34,768 Other hypotheses include bigger birth canal and brain size 130 00:07:35,228 --> 00:07:38,008 and evolutionary pressures to win social games. 131 00:07:38,988 --> 00:07:42,928 These hypotheses seem compatible with the hypothesis that language is most 132 00:07:42,968 --> 00:07:43,548 important. 133 00:07:44,768 --> 00:07:48,388 Depending on how you measure it, AI may now be the second 134 00:07:48,508 --> 00:07:51,388 most complex object in the observable universe. 135 00:07:52,268 --> 00:07:56,088 More complex than ape brain, but less complex than human 136 00:07:56,168 --> 00:07:56,488 brain. 137 00:07:57,718 --> 00:08:00,728 Model scaling for text models might be saturating. 138 00:08:01,148 --> 00:08:04,388 Open research question. But it is definitely not 139 00:08:04,508 --> 00:08:07,828 saturated for images, video or robotics. 140 00:08:09,028 --> 00:08:12,948 As of December 2025, text models intuitively feel 141 00:08:12,988 --> 00:08:16,918 better than image models, and image models intuitively feel better 142 00:08:17,048 --> 00:08:21,008 than video models, and video models intuitively feel better 143 00:08:21,048 --> 00:08:24,838 than VLAs for robotics. 
I 144 00:08:24,868 --> 00:08:28,588 think this is mostly just because of higher compute requirements for the latter 145 00:08:28,668 --> 00:08:29,108 models. 146 00:08:30,388 --> 00:08:33,608 Minor update as of January 6th, 2026. 147 00:08:34,308 --> 00:08:38,178 Robotics has an additional bottleneck where big enough datasets are not 148 00:08:38,268 --> 00:08:38,688 available. 149 00:08:39,578 --> 00:08:43,288 Manually generating them is expensive, but probably affordable 150 00:08:43,328 --> 00:08:47,068 given current AI R&D budgets. Most 151 00:08:47,108 --> 00:08:50,988 people seem to IMO suck at forecasting AI 152 00:08:51,108 --> 00:08:55,068 progress even one year into the future, let alone 153 00:08:55,148 --> 00:08:56,048 five or 10. 154 00:08:57,228 --> 00:09:00,988 Since 2022, I am used to watching people on Twitter 155 00:09:01,068 --> 00:09:04,668 make predictions of some specific benchmark or skill 156 00:09:05,028 --> 00:09:08,948 X that will never get solved, only for it to get solved 157 00:09:09,028 --> 00:09:09,808 one year later. 158 00:09:10,968 --> 00:09:14,948 "Your specific AI can't do XYZ task 159 00:09:15,308 --> 00:09:18,788 that is trivial for humans" is not impressive to me