2025-08-07
Time: 20 minutes
Pre-reqs
Assume no ban or pause on AI research enforced via a US-China international treaty
Assume wide error bars on these numbers
Definition of superintelligent AI: AI that is better than the best humans at every task humans care about completing.
Relevant intuitions for what I actually imagine when I imagine superintelligent AI: humans from the year 1900 experiencing all of 1900-2025's inventions in a single year; chimpanzees being exposed to a human being
Experts' AI timelines (~1 min)
Experts' AI extinction risk (~1 min)
If you think these clips are cherry-picked, fake, etc., you can watch the full interviews linked below.
Signed letters: Dan Hendrycks' CAIS letter, FLI Pause letter
Homework
Try GPT-4.5 and o3 (both launched in 2025) in the OpenAI Playground
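If you'd rather script this than use the playground, here's a minimal sketch using the OpenAI Python SDK. The model identifiers are assumptions on my part and may differ; check OpenAI's model list for current names.

```python
# Minimal sketch: querying the models via the OpenAI Python SDK
# instead of the playground. Requires OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

for model in ["gpt-4.5-preview", "o3"]:  # assumed identifiers; verify first
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
    )
    print(model, "->", response.choices[0].message.content[:200])
```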
Technical part of this document starts here.
Homework
Chinchilla scaling predicts loss accurately, up to at least 3 decimal places.
Per Epoch AI's replication attempt, the Chinchilla scaling law is:
L(N, D) = 1.8172 + 482.01 / N^0.3478 + 2085.43 / D^0.3658
where N is the number of parameters (model size) and D is the number of training tokens.
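A minimal sketch of plugging numbers into this fit. The example N and D (Chinchilla's reported 70B parameters and 1.4T tokens) are illustrative values, not from this document:

```python
# Epoch AI replication fit of the Chinchilla scaling law (from above).
def chinchilla_loss(N: float, D: float) -> float:
    """Predicted pretraining loss for N parameters and D training tokens."""
    return 1.8172 + 482.01 / N**0.3478 + 2085.43 / D**0.3658

# Illustrative: a Chinchilla-sized run (70B params, 1.4T tokens).
print(chinchilla_loss(N=70e9, D=1.4e12))  # ~1.97 nats
```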
Homework
Experts have been consistently surprised over the past six years about which capabilities would unlock in which year. Very few got all the predictions right, and those that did (example: Ilya Sutskever, Daniel Kokotajlo) are bullish on further AI progress.
xAI spent $7B on its Memphis cluster to train Grok-3
OpenAI's Stargate will spend $100B annually, based on commitments from Masayoshi Son (SoftBank) and Larry Ellison (Oracle).
World GDP is roughly $100 trillion; we will probably spend somewhere between $0.1-10 trillion on training. That is 10-1000x more compute than the largest training run as of today (rough arithmetic below).
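The arithmetic behind the 10-1000x figure, assuming (simplistically) that compute scales linearly with dollars and taking the ~$7B Grok-3 figure above as the baseline:

```python
grok3_spend = 7e9  # ~$7B, the Grok-3 figure cited above
for future_spend in (0.1e12, 1e12, 10e12):  # the $0.1-10T range
    multiple = future_spend / grok3_spend
    print(f"${future_spend / 1e12:.1f}T -> {multiple:,.0f}x")
# -> roughly 14x, 143x and 1,429x, i.e. on the order of 10-1000x.
# Hardware price-performance gains would push the compute multiple higher.
```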
People say GPT-4.5 is not significantly better than GPT-4, and conclude that we are saturating this curve. However, per the previous point, we still have lots of compute headroom left, which can counteract this (see the sketch below).
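To see how this plays out, here is a sketch that feeds that extra compute through the Epoch AI fit above. The compute-optimal allocation (N and D each growing roughly as the square root of compute) is an assumption taken from Hoffmann et al. (2022), and the baseline N and D are illustrative:

```python
def chinchilla_loss(N: float, D: float) -> float:
    return 1.8172 + 482.01 / N**0.3478 + 2085.43 / D**0.3658

N0, D0 = 70e9, 1.4e12  # illustrative Chinchilla-sized baseline
for c in (1, 10, 100, 1000):  # compute multiplier
    s = c**0.5  # compute-optimal: scale N and D each by sqrt(compute)
    print(f"{c:>5}x compute -> predicted loss {chinchilla_loss(N0 * s, D0 * s):.3f}")
# The reducible (non-constant) part of the loss shrinks ~2.3x for every
# 100x of compute, so there is real headroom left even if gains per
# model generation look modest.
```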
Try GPT-4, try o1, try o3
We have only one year of past data, which means any prediction based on it has wide error bars
Homework
Cost per task
Update: This was not presented at the talk on 2025-08-07, but I'm including it anyway.
Even before LLMs were invented, we achieved superhuman performance in every major game by scaling RL with long time horizons and large amounts of compute: 1v1 poker, Dota 2, StarCraft, Go, chess. This is evidence that RL scales.