1
00:00:01,600 --> 00:00:04,880
As of today, I still think there is a

2
00:00:05,080 --> 00:00:08,580
25% chance we will get superintelligence
by 2030.

3
00:00:09,620 --> 00:00:13,480
There's at least a 10% chance literally
every person on Earth

4
00:00:13,520 --> 00:00:17,440
will die by 2030. There's at least

5
00:00:17,500 --> 00:00:21,080
a 10% chance we will get a permanent
dictatorship.

6
00:00:21,340 --> 00:00:25,300
A handful of people will run the entire
world for centuries

7
00:00:25,380 --> 00:00:27,740
together, with no ability to, you know,
overthrow them.

8
00:00:28,300 --> 00:00:30,520
At least 10% chance of this by 2030.

9
00:00:31,560 --> 00:00:35,060
The only way all these numbers change
is if there is a

10
00:00:35,120 --> 00:00:39,100
political movement to pause AI research
that succeeds within the next five years.

11
00:00:40,320 --> 00:00:44,120
Uh,
I'm going to now describe my technical

12
00:00:44,320 --> 00:00:44,980
reasons why I think this.

13
00:00:50,580 --> 00:00:50,589
(electronic music)

14
00:00:50,640 --> 00:00:54,580
Creation of artificial superintelligence
is likely the most important event

15
00:00:54,620 --> 00:00:57,650
of approximately 10,000 years of human
history.

16
00:00:58,900 --> 00:01:00,540
It is more important than the Industrial

17
00:01:00,600 --> 00:01:03,680
Revolution, Newton, the French

18
00:01:03,760 --> 00:01:06,060
Revolution, or the printing press.

19
00:01:07,200 --> 00:01:10,940
It is more important than invention of
nuclear weapons, as an

20
00:01:11,020 --> 00:01:14,980
ASI will accelerate creation of weapons
more dangerous

21
00:01:15,040 --> 00:01:15,660
than nukes.

22
00:01:16,780 --> 00:01:20,560
It is definitely more important than the
creation of the internet and all

23
00:01:20,660 --> 00:01:22,180
Silicon Valley startups.

24
00:01:23,260 --> 00:01:27,080
If the takeoff is fast enough,
creation of superintelligence

25
00:01:27,120 --> 00:01:30,820
may be the most important event in the
approximately 14

26
00:01:31,000 --> 00:01:33,420
billion year history of the universe.

27
00:01:34,040 --> 00:01:37,820
It could end up more important than the
evolutionary history of all

28
00:01:37,960 --> 00:01:41,620
other life forms on Earth.
And to the best of our current

29
00:01:41,680 --> 00:01:45,500
knowledge,
Earth is the only place in the universe

30
00:01:45,540 --> 00:01:46,640
with intelligent life.

31
00:01:47,840 --> 00:01:51,230
Scaling of RL compute still seems
important and

32
00:01:51,340 --> 00:01:52,020
unsolved.

33
00:01:53,100 --> 00:01:56,500
We still seem to have spent less than $1
million per

34
00:01:56,560 --> 00:02:00,220
inference, and can, in theory,
spend at least $1

35
00:02:00,400 --> 00:02:04,200
billion on it. In my head, it is an open

36
00:02:04,260 --> 00:02:07,370
question whether the labs tried it
and got bad results,

37
00:02:08,120 --> 00:02:11,320
or if there's some other fixable research
or infra

38
00:02:11,420 --> 00:02:15,220
bottleneck that is preventing them from
trying it.

39
00:02:15,360 --> 00:02:19,120
The METR graph on this still seems useful,
even though it is making

40
00:02:19,200 --> 00:02:20,140
simplifications.

41
00:02:21,340 --> 00:02:22,700
This is still important.

42
00:02:23,700 --> 00:02:27,280
If we have one AI that is superhuman,
we can

43
00:02:27,320 --> 00:02:31,120
probably run at least 100,000 copies of it
at

44
00:02:31,420 --> 00:02:33,780
100 times the speed of human thinking.

45
00:02:34,340 --> 00:02:38,120
This means that once the first slightly
superhuman AI is

46
00:02:38,160 --> 00:02:41,900
invented,
we will go from slightly superhuman to

47
00:02:42,100 --> 00:02:45,720
vastly superhuman in a very short span of
time.

48
00:02:46,960 --> 00:02:50,740
I would nowadays more emphasize the
meta-learning point below,

49
00:02:50,780 --> 00:02:54,580
though,
because that decides what happens until we

50
00:02:54,600 --> 00:02:55,860
get superhuman AI.

51
00:02:56,840 --> 00:03:00,240
Deferring to experts
is obviously still important.

52
00:03:00,460 --> 00:03:04,300
I value my inside view a lot,
but I value expert

53
00:03:04,340 --> 00:03:05,940
opinion some amount too.

54
00:03:07,160 --> 00:03:11,020
Many researchers who pretend not to be
doomers also have

55
00:03:11,080 --> 00:03:13,340
timelines in the next five to 10 years.

56
00:03:14,040 --> 00:03:17,400
Examples: Rich Sutton, Ilya Sutskever,

57
00:03:17,980 --> 00:03:19,740
Andrej Karpathy, et cetera.

58
00:03:20,760 --> 00:03:24,460
There is the obvious list of doomer
researchers, for example,

59
00:03:24,860 --> 00:03:28,660
Geoffrey Hinton, Yoshua Bengio,
Stuart Russell,

60
00:03:28,880 --> 00:03:29,340
et cetera.

61
00:03:30,380 --> 00:03:34,300
Heads of all frontier labs agree
that their own tech could

62
00:03:34,380 --> 00:03:35,780
cause human extinction.

63
00:03:36,640 --> 00:03:39,840
Examples: Elon Musk, Sam Altman,

64
00:03:40,460 --> 00:03:43,960
Dario Amodei, Demis Hassabis, et cetera.

65
00:03:45,000 --> 00:03:48,250
Generalization and meta-learning
are related.

66
00:03:48,990 --> 00:03:52,980
Newer models aren't just memorizing more
skills, but are more

67
00:03:53,040 --> 00:03:56,780
capable of learning new skills on their
own without human

68
00:03:56,800 --> 00:03:57,780
data guiding them.

69
00:03:58,750 --> 00:04:02,620
There is a difference between training AI
such that it learns skills that

70
00:04:02,680 --> 00:04:06,580
humans already have,
versus training AI to know

71
00:04:06,720 --> 00:04:09,060
how to learn new skills on its own.

72
00:04:09,880 --> 00:04:12,820
The difference between these two
is not binary.

73
00:04:13,340 --> 00:04:15,280
There are levels to generalization.

74
00:04:16,260 --> 00:04:18,640
Example of low level of generalization,

75
00:04:19,459 --> 00:04:23,340
seeing lots of English to French
translations and learning to

76
00:04:23,360 --> 00:04:26,100
become good at English to French
translation.

77
00:04:27,080 --> 00:04:29,580
Example of medium level of generalization,

78
00:04:30,440 --> 00:04:34,420
seeing many solved competitive programming
puzzles and solving

79
00:04:34,580 --> 00:04:38,540
another puzzle with an algo similar to one
it has seen before for

80
00:04:38,550 --> 00:04:39,490
a different puzzle.

81
00:04:40,300 --> 00:04:43,940
Seeing lots of English text
and a little Hindi text,

82
00:04:44,440 --> 00:04:48,430
inferring the grammar similarities
and differences across human languages,

83
00:04:48,980 --> 00:04:52,180
and then learning to speak fluent Hindi as
a result.

84
00:04:53,000 --> 00:04:56,820
Yeah, lol. This is a real result,
and an old

85
00:04:56,960 --> 00:04:57,140
one.

86
00:04:58,090 --> 00:05:00,560
Example of high level of generalization,

87
00:05:01,420 --> 00:05:05,280
seeing a theorem from economics
and realizing an analogous

88
00:05:05,300 --> 00:05:09,240
version of it also applies to biology,
thereby solving

89
00:05:09,280 --> 00:05:11,410
an open problem in biology.

90
00:05:12,480 --> 00:05:16,150
GPT-2 to GPT-5 has led to

91
00:05:16,300 --> 00:05:20,140
immense progress in both learning new
skills and

92
00:05:20,200 --> 00:05:24,010
learning how to learn new skills.
The latter

93
00:05:24,060 --> 00:05:25,290
is more important though.

94
00:05:26,280 --> 00:05:30,240
The best way to test the latter
is to give AI problems that

95
00:05:30,300 --> 00:05:34,080
a human has never seen
and no human knows how to

96
00:05:34,200 --> 00:05:38,160
solve. This could be simple tasks,
like translation

97
00:05:38,200 --> 00:05:41,600
in apocryphal ancient languages,
or complex

98
00:05:41,660 --> 00:05:45,540
tasks,
like proposing novel molecular bio

99
00:05:45,560 --> 00:05:45,840
run.

100
00:05:46,800 --> 00:05:50,540
Continual learning is a hard problem,
but my hunch

101
00:05:50,620 --> 00:05:53,360
is one or two breakthroughs might solve
it.

102
00:05:54,480 --> 00:05:58,080
AI weights are currently static,
and hence, AI

103
00:05:58,180 --> 00:06:00,240
behaves as if it has amnesia.

104
00:06:01,060 --> 00:06:05,000
Once a chain of thought is complete,
the AI forgets it ever

105
00:06:05,040 --> 00:06:08,988
did the task. A naive way to make sure AI

106
00:06:09,008 --> 00:06:12,778
remembers its previous inferences
is to fine-tune it

107
00:06:12,848 --> 00:06:16,718
on this data. But fine-tuning LLMs is

108
00:06:16,728 --> 00:06:18,348
very sample inefficient.

109
00:06:19,208 --> 00:06:23,178
A slightly better approach
is to put this data back into the chain of

110
00:06:23,288 --> 00:06:27,068
thought itself.
But there may be limits to how long

111
00:06:27,178 --> 00:06:30,748
chains of thought can be.
Open research question.

112
00:06:31,648 --> 00:06:33,528
We might discover better approaches.

113
00:06:34,708 --> 00:06:38,488
Operating in the real world can be
expensive in domains like

114
00:06:38,548 --> 00:06:42,428
biotech. Therefore,
we must be sample efficient

115
00:06:42,588 --> 00:06:46,328
in terms of how much data the AI requires
from real-world

116
00:06:46,388 --> 00:06:50,168
experiments before it learns useful
insights from it.

117
00:06:50,228 --> 00:06:53,488
Human language has features not present in
animal language,

118
00:06:54,148 --> 00:06:57,828
and this is likely an important part of
why humans can build spacecraft and

119
00:06:57,948 --> 00:07:00,288
colonize the Earth, but apes can't.

120
00:07:01,468 --> 00:07:04,808
AI has already picked up all these
features of human language.

121
00:07:06,048 --> 00:07:09,768
Go read Hockett's views on what features
separate human language from animal

122
00:07:09,828 --> 00:07:10,328
language.

123
00:07:11,748 --> 00:07:13,888
Humans use language for communication.

124
00:07:14,788 --> 00:07:18,058
Humans might also use language as part of
their reasoning process,

125
00:07:18,568 --> 00:07:21,868
along with other modules,
such as motor skills, spatial

126
00:07:21,928 --> 00:07:23,388
visualization, et cetera.

127
00:07:24,368 --> 00:07:28,328
Both of these are hypotheses for what
separates humans from apes, and

128
00:07:28,338 --> 00:07:29,988
I think there's a good chance they're
true.

129
00:07:31,188 --> 00:07:34,768
Other hypotheses include bigger birth
canal and brain size

130
00:07:35,228 --> 00:07:38,008
and evolutionary pressures to win social
games.

131
00:07:38,988 --> 00:07:42,928
These hypotheses seem compatible with the
hypothesis that language is most

132
00:07:42,968 --> 00:07:43,548
important.

133
00:07:44,768 --> 00:07:48,388
Depending on how you measure it,
AI may now be the second

134
00:07:48,508 --> 00:07:51,388
most complex object in the observable
universe.

135
00:07:52,268 --> 00:07:56,088
More complex than ape brain,
but less complex than human

136
00:07:56,168 --> 00:07:56,488
brain.

137
00:07:57,718 --> 00:08:00,728
Model scaling for text models might be
saturating.

138
00:08:01,148 --> 00:08:04,388
Open research question. But it
is definitely not

139
00:08:04,508 --> 00:08:07,828
saturated for images, video or robotics.

140
00:08:09,028 --> 00:08:12,948
As of December 2025,
text models intuitively feel

141
00:08:12,988 --> 00:08:16,918
better than image models,
and image models intuitively feel better

142
00:08:17,048 --> 00:08:21,008
than video models,
and video models intuitively feel better

143
00:08:21,048 --> 00:08:24,838
than VLAs for robotics. I

144
00:08:24,868 --> 00:08:28,588
think this is mostly just because of
higher compute requirements for the latter

145
00:08:28,668 --> 00:08:29,108
models.

146
00:08:30,388 --> 00:08:33,608
Minor update as of January 6th, 2026.

147
00:08:34,308 --> 00:08:38,178
Robotics has an additional bottleneck
where big enough datasets are not

148
00:08:38,268 --> 00:08:38,688
available.

149
00:08:39,578 --> 00:08:43,288
Manually generating them is expensive,
but probably affordable

150
00:08:43,328 --> 00:08:47,068
given current AI R&D budgets. Most

151
00:08:47,108 --> 00:08:50,988
people seem to IMO suck at forecasting AI

152
00:08:51,108 --> 00:08:55,068
progress even one year into the future,
let alone

153
00:08:55,148 --> 00:08:56,048
five or 10.

154
00:08:57,228 --> 00:09:00,988
Since 2022,
I am used to watching people on Twitter

155
00:09:01,068 --> 00:09:04,668
make predictions about some specific benchmark
or skill

156
00:09:05,028 --> 00:09:08,948
X that will never get solved,
only for it to get solved

157
00:09:09,028 --> 00:09:09,808
one year later.

158
00:09:10,968 --> 00:09:14,948
"Your specific AI can't do XYZ task

159
00:09:15,308 --> 00:09:18,788
that is trivial for humans"
is not impressive to me