2025-12-14
My AI timelines (as of 2025-12-14)
Disclaimer
- Quick note: some people might consider parts of this document infohazardous. I'm unsure, but I'm currently leaning towards publishing it.
I'm still at P(ASI by 2030) = 25%, unless the Pause AI movement succeeds. I'm still at P(human extinction by 2030) = ~10% and P(century-stable global dictatorship by 2030) = ~10%, both with high uncertainty. Go read the previous document for more background.
When communicating with a technical audience, these are the points I would nowadays emphasise.
- Scaling of RL compute still seems important and unsolved.
- We still seem to have spent less than $1M on a single inference run, and could in theory spend at least $1B on one (see the back-of-envelope sketch below).
- In my head, it is an open question whether the labs have tried this and got bad results, or whether some other fixable research or infrastructure bottleneck is preventing them from trying.
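For scale, here is a back-of-envelope sketch of what those budgets buy in output tokens. The $10-per-million-token price is my illustrative assumption, not a figure from any lab:

```python
# Back-of-envelope: how many output tokens different inference budgets buy,
# at an assumed price of $10 per million output tokens. The price is
# illustrative; real costs vary a lot by model and provider.

PRICE_PER_MILLION_TOKENS = 10.0  # USD, assumed

def tokens_for_budget(budget_usd: float) -> float:
    """Return the number of output tokens an inference budget buys."""
    return budget_usd / PRICE_PER_MILLION_TOKENS * 1_000_000

for budget in (1e6, 1e9):  # $1M versus $1B spent on a single problem
    print(f"${budget:,.0f} buys ~{tokens_for_budget(budget):.1e} output tokens")

# ~1e11 tokens for $1M and ~1e14 tokens for $1B: a 1000x gap in how much
# search and reasoning one could in principle throw at a single problem.
```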
- The METR graph on this still seems useful, even though it makes simplifications (rough extrapolation sketch below).
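To make the simplification concrete, here is a rough extrapolation sketch. The roughly 7-month doubling time is METR's reported trend; the 2-hour current horizon and the month-long-task target are my illustrative assumptions, not precise figures:

```python
import math

# Extrapolate METR's time-horizon trend: the task length (in human time) that
# frontier models complete at 50% reliability has been doubling roughly every
# 7 months. Starting horizon and target below are assumptions for illustration.

doubling_months = 7           # roughly METR's reported doubling time
current_horizon_hours = 2     # assumed frontier 50%-success horizon, late 2025
target_horizon_hours = 167    # ~one work-month of human effort

doublings = math.log2(target_horizon_hours / current_horizon_hours)
years = doublings * doubling_months / 12
print(f"{doublings:.1f} doublings -> ~{years:.1f} years to month-long tasks")
# ~6.4 doublings, i.e. roughly 3.7 years if the trend holds.
```

If the trend holds, month-long tasks arrive well before 2030; whether it holds is of course the whole question.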
- Superintelligence via series and parallel is still an important point, though I would nowadays put more emphasis on the metalearning point below.
- Deferring to experts is obviously still important. I value my inside view a lot, but I give some weight to expert opinion too.
- Many researchers who pretend not to be doomers also have timelines in the next 5 to 10 years. Examples: Rich Sutton, Ilya Sutskever, Andrej Karpathy, etc.
- There is the obvious list of doomer researchers. Examples: Geoffrey Hinton, Yoshua Bengio, Stuart Russell, etc.
- The heads of all frontier labs agree that their own technology could cause human extinction. Examples: Elon Musk, Sam Altman, Dario Amodei, Demis Hassabis, etc.
- "Generalisation" and "metalearning" are related. Newer models aren't just memorising more skills, but are more capable of learning new skills on their own without human data guiding them.
- There is a difference between training AI to learn skills that humans already have, versus training AI to learn how to learn new skills on its own.
- The difference between the two is not binary; there are levels of generalisation.
- Example of a low level of generalisation: seeing lots of English-to-French translations, and becoming good at English-to-French translation.
- Examples of a medium level of generalisation: seeing many solved competitive programming puzzles, then solving a new puzzle with an algorithm similar to one it has seen used on a different puzzle. Or seeing lots of English text and a little Hindi text, inferring the grammatical similarities and differences across human languages, and learning to speak fluent Hindi as a result. (Yeah lol, this is a real result, and an old one.)
- Example of a high level of generalisation: seeing a theorem from economics, realising an analogous version of it also applies to biology, and thereby solving an open problem in biology.
- GPT-2 to GPT-5 has brought immense progress in both learning new skills and learning how to learn new skills. The latter is more important, though.
- The best way to test the latter is to give AI problems that no human has seen before and that no human knows how to solve. These could be simple tasks like translating obscure ancient languages, or complex tasks like proposing novel molecular biology experiments to run.
- Continual learning is a hard problem but my hunch is one or two breakthroughs might solve it.
- AI weights are currently static, and hence AI behaves as if it has amnesia. Once a chain of thought is complete, the AI forgets it ever did the task.
- The naive way to make AI remember its previous inferences is to finetune it on that data, but finetuning LLMs is very sample-inefficient. A slightly better approach is to put the data back into the chain of thought itself, but there may be limits to how long chains of thought can get (open research question). We might discover better approaches (see the sketch below).
- Operating in the real world can be expensive in domains like biotech, so the AI must be sample-efficient: it should need relatively little data from real-world experiments before it extracts useful insights from them.
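A minimal sketch of the two naive memory options described above. The `generate` and `finetune` functions are hypothetical stand-ins, not any real lab's API; this illustrates the idea, not how anyone actually does it:

```python
# Two naive ways to give a static-weights model "memory" of its past inferences.
# `generate` and `finetune` are hypothetical placeholders for a model API.
from typing import List

def generate(prompt: str) -> str:
    """Hypothetical model call: returns a chain of thought plus an answer."""
    raise NotImplementedError  # placeholder

def finetune(transcripts: List[str]) -> None:
    """Hypothetical weight update on past transcripts (very sample-inefficient)."""
    raise NotImplementedError  # placeholder

history: List[str] = []

# Option 1: fold finished transcripts back into the weights every so often.
def solve_and_learn(task: str) -> str:
    answer = generate(task)
    history.append(f"{task}\n{answer}")
    if len(history) % 1000 == 0:  # batched, because each update is costly
        finetune(history)
    return answer

# Option 2: keep the weights frozen and put past transcripts back into the
# prompt itself. Limited by how long the chain of thought / context can get.
def solve_with_memory(task: str) -> str:
    context = "\n\n".join(history[-50:])  # crude truncation to fit the context
    return generate(f"{context}\n\nNew task: {task}")
```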
- Human language has features not present in animal communication, and this is likely an important part of why humans can build spacecraft and colonise the earth while apes cannot. AI has already picked up all of these features of human language.
- Go read Hockett's views on what features separate human language from animal language.
- Humans use language for communication. Humans might also use language as part of their reasoning process (along with other modules such as motor skills, spatial visualisation, etc). Both of these are hypotheses for what separates humans from apes, and I think there's a good chance they're true.
- Other hypotheses include a bigger birth canal and brain size, and evolutionary pressure to win social games. These seem compatible with the hypothesis that language is the most important factor.
- Depending on how you measure it, AI may now be the second most complex object in the observable universe: more complex than an ape brain but less complex than a human brain.
- Model scaling might be saturating for text (open research question), but it has definitely not saturated for images, video, or robotics.
- As of 2025-12, text models intuitively feel better than image models, image models feel better than video models, and video models feel better than VLAs for robotics. I think this is mostly just because of the higher compute requirements of the later modalities.
- Most people IMO seem to suck at forecasting AI progress even one year into the future, let alone 5 or 10 years.
- Since 2022, I have gotten used to watching people on Twitter predict that some specific benchmark or skill X will never get solved, only for it to get solved a year later.
- Your specific "AI can't do XYZ task that is trivial for humans" example is not impressive to me.