


2025-08-07

AI timelines talk (presented on 2025-08-03)

Time: 20 minutes

Pre-reqs

My top-level views

Assuming no ban or pause on AI research enforced via a US-China international treaty

Assume wide error bars on these numbers

Definition of superintelligent AI: AI that is better than the best humans at every task humans care about completing.

Relevant intuitions for what I actually imagine when I imagine superintelligent AI: humans from 1900 experiencing all of the 1900-2025 inventions in a single year; chimpanzees being exposed to a human being.

Datapoint 1: Other experts

Experts' AI timelines (~1 min)

Experts' AI extinction risk (~1 min)

If you think these clips are cherry-picked, fake, etc., you can watch the full interviews linked below.

Signed letters: Dan Hendrycks' CAIS letter, FLI Pause letter

Homework

Datapoint 2: Try the models yourself

Try GPT-2 (launched 2019)

Try GPT-4.5 and o3 (launched 2025) on the OpenAI playground

Homework

Datapoint 3: Argument from Speed


Technical part of this document starts here.

Datapoint 4: Model scaling

Datapoint 4a: Try old and new models

Homework

Datapoint 4b: Chinchilla scaling law predicts loss

Chinchilla scaling predicts loss accurately up to at least 3 decimal places

As per the Epoch AI replication attempt, the Chinchilla scaling law is:

L(N, D) = 1.8172 + 482.01 / N^0.3478 + 2085.43 / D^0.3658

N = number of parameters (depends on model size), D = number of training tokens.
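
A minimal sketch of evaluating this fitted law in Python, using the replication coefficients above; the example inputs (a roughly Chinchilla-sized run of 70B parameters on 1.4T tokens) are just illustrative:

def chinchilla_loss(N: float, D: float) -> float:
    # Epoch AI replication coefficients, copied from the formula above
    return 1.8172 + 482.01 / N**0.3478 + 2085.43 / D**0.3658

# Illustrative example: ~70B parameters trained on ~1.4T tokens
print(chinchilla_loss(70e9, 1.4e12))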

Homework

Datapoint 4c: Loss does not predict capabilities

Experts have been consistently surprised over the past 6 years as to which capabilities would unlock in which year. Very few got all the predictions right, and those who did (example: Ilya Sutskever, Daniel Kokotajlo) are bullish on further AI progress.

Datapoint 4d: Scale up on compute in future

xAI spent $7B at its Memphis datacenter to train Grok 3

OpenAI's Stargate will spend $100B annually, based on commitments from Masayoshi Son (SoftBank) and Larry Ellison (Oracle).

World GDP is roughly $100 trillion; we will probably spend somewhere between $0.1 and $10 trillion on training. This is 10-1000x more compute than the largest training run as of today.
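
A rough arithmetic sketch of where the 10-1000x figure comes from, assuming (for illustration only) that today's largest training run costs on the order of $10B:

largest_run_today = 10e9        # assumption: largest current training run costs ~$10B
future_spend_low = 0.1e12       # $0.1 trillion
future_spend_high = 10e12       # $10 trillion
print(future_spend_low / largest_run_today)    # 10x
print(future_spend_high / largest_run_today)   # 1000x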

Datapoint 4e: Model scaling might (???) be saturating

People say GPT-4.5 is not significantly better than GPT-4, hence we may be saturating this curve. However, as per the previous point, we still have a lot of compute scale-up left that can counteract this.

Datapoint 5: RL scaling

Datapoint 5a: Try models yourself

Try GPT-4, o1, and o3

Datapoint 5b: log curve for RL scaling

We have only ~1 year of past data, which means any prediction based on it has wide error bars.
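
A minimal sketch of why so little data gives wide error bars: fit a log-linear trend to a handful of hypothetical (compute, benchmark score) points spanning about a year, then extrapolate. All numbers below are made up for illustration.

import numpy as np

log_compute = np.array([0.0, 1.0, 2.0, 3.0])  # hypothetical: log10 of relative RL compute over ~1 year
score = np.array([30.0, 45.0, 55.0, 68.0])    # hypothetical benchmark scores
slope, intercept = np.polyfit(log_compute, score, 1)
print(slope * 6.0 + intercept)                # naive extrapolation to 1000x the last point's compute

The naive linear extrapolation even overshoots 100 on a bounded benchmark, which is one way of seeing how weakly a fit on this little data constrains the future.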

Homework

Datapoint 5c: scale up on compute in future

Cost per task

Datapoint 6: RL scaling has worked before

Update: This datapoint was not presented in the talk, but I'm including it anyway.

Even before LLMs were invented, we solved a wide range of games by scaling RL with long time horizons and large amounts of compute: 1v1 poker, Dota 2, StarCraft, Go, chess. This is evidence that RL scales.
