2025-07-21
Superintelligence via series and parallel
Disclaimer
- This is a thought experiment. Qualitative, not quantitative; I'm not defending the exact numbers here.
- I don't convey anything novel here. The original argument is at least 25 years old, though this version includes LLM-specific knowledge from 2025.
Summary
- Series and parallel speedups
  - Imagine Elon Musk's Memphis datacentre contains 6000 amnesiac humans doing 100 years of thinking every 1 year of wallclock time.
  - By 2030, we may have 100,000-1,000,000 geniuses smarter than Einstein doing 10-100 years of thinking per year of wallclock time.
  - Within 1 year we might go from "slightly superhuman individual" to "unimaginably superhuman parallel civilisation".
- Recursive self-improvement
  - These geniuses are also researching how to edit their own brains to become even smarter. This may or may not happen, but it is an accelerant if it does.
- Moore's law and similar
  - The number of geniuses doubles every 2.5 years, so by 2040 or 2050 we could have billions of such researchers.
Main
Why and who?
- A lot of people who studied AI but not AI risk are unaware of the arguments for recursive self-improvement or the microeconomics of this process. This document is for you if you're one of them.
- Even among the few people who discuss it, I sometimes find the distinction between serial and parallel computation left unclear. This document is also for you.
If you have not used the latest AI models (as of 2025-04, GPT-4.5 and o3), I strongly recommend you go try them out before reading any discussion such as the one below.
Example
- For now I'll copy-paste the numbers I calculated previously.
- Assume Llama3 405B inference on a 2x8xH200 SXM GPU node as of 2025
GPU node cost = $300k
$/token = e * $1.44/1M tokens
tokens/s = (2646/e) tokens / s
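As a sanity check, here's a minimal sketch of how these numbers fit together. The ~2.5-year amortization period and the reading of e as an efficiency factor (e = 1 at peak throughput) are my assumptions; the original derivation isn't shown here.

```python
# Back-of-envelope check on the $/token figure above.
# Assumptions (mine): the $300k node is amortized over ~2.5 years,
# power/hosting costs are ignored, and e is an efficiency factor
# (e = 1 means peak throughput).

NODE_COST_USD = 300_000        # 2x8xH200 SXM node
PEAK_TOKENS_PER_S = 2646       # Llama3 405B inference at e = 1
AMORTIZATION_YEARS = 2.5
SECONDS_PER_YEAR = 365 * 24 * 3600

def dollars_per_million_tokens(e: float = 1.0) -> float:
    """Cost per 1M tokens at efficiency factor e (higher e = slower = pricier)."""
    lifetime_tokens = (PEAK_TOKENS_PER_S / e) * SECONDS_PER_YEAR * AMORTIZATION_YEARS
    return NODE_COST_USD / lifetime_tokens * 1e6

print(f"${dollars_per_million_tokens():.2f} per 1M tokens")  # ~$1.44 at e = 1
```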
Superhuman
- Let us imagine this model has slightly superhuman capability in some domain. (Llama3 does not, but let's imagine for now that it did.)
Parallel
- If humanity spends $300B we get 1 million such nodes (2x8xH200) running in parallel. This is around $40 per capita (global) and is likely affordable only to a handful of governments.
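The fleet arithmetic in one line (a world population of ~8 billion is my assumption):

```python
budget, node_cost, world_pop = 300e9, 300e3, 8e9   # $300B, $300k/node, ~8B people
print(f"{budget / node_cost:,.0f} nodes")          # 1,000,000
print(f"${budget / world_pop:.2f} per capita")     # ~$37.50, i.e. roughly $40
```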
Serial
- However, each node is producing (2646/e) tokens/s. Let's make the simplifying assumption that this is 1000 tokens per second. This is about two A4 pages of printed text in 12-point Arial.
- If you've used any latest AI model and sampled 1000 tokens, you have an intuitive understanding of what this looks like.
- As a human, if you had a device in your brain recording every thought you had, that's probably not more than 1000 A4 pages of text per day. The AI, however, is producing 150,000 A4 pages of text per day. So the AI is thinking at least 100 times faster than you.
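A quick sketch of the pages-per-day arithmetic. The ~500 tokens per A4 page follows from the "two pages per 1000 tokens" estimate above; the 150,000 figure presumably rounds the result down.

```python
TOKENS_PER_S = 1000
TOKENS_PER_PAGE = 500            # two A4 pages per 1000 tokens, as above
HUMAN_PAGES_PER_DAY = 1000       # generous bound on a recorded human thought stream

ai_pages_per_day = TOKENS_PER_S * 86_400 / TOKENS_PER_PAGE
print(f"AI: ~{ai_pages_per_day:,.0f} pages/day")                   # ~172,800
print(f"speedup: ~{ai_pages_per_day / HUMAN_PAGES_PER_DAY:.0f}x")  # ~173x, i.e. "at least 100x"
```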
In total we have 1 million nodes, each of which is thinking 100 times faster than a human and is slightly smarter than a human.
Assumptions made so far:
- We have a model with as many parameters as Llama3 that is slightly superhuman
- It can produce 1000 tokens / second on a 2x8xH200 GPU node, which is not that far off from real LLMs.
Now let's do some thought experiments.
Serial
Imagine you had 1 year to complete a research paper and your fellow researcher had 100 years to complete the paper.
Imagine you had 10 years to complete a research paper and your fellow researcher had 1000 years to complete the paper.
This is already likely to produce outputs beyond your imagination. Humans rarely dedicate an entire lifetime to a problem in a way where they continuously keep making progress. At some point most humans give up and fill their time with fake busy work or an alternate task instead.
If you could spend 1000 years focussed on one single task, you would already be capable of superhuman feats.
Parallel
Imagine your country had 1 million PhD researchers and the opponent country had 1 million PhD researchers.
However your country employs this PhD research force to solve thousands of different problems, whereas the opponent country employs all of them to solve one singular problem. Your researchers get bored, don't take orders and follow their own curiosity. The opponent country is a dictatorship where researchers can summon the same level of curiosity on demand to work on whatever research project the dictator recommends.
Serial and parallel combined
Now imagine the above two effects combined.
Your country has 1 million PhD researchers scattered across 1000 different topics. They have 1 year to do their work.
The opponent country has 1 million PhD researchers all focussed on the same project. They have 100 years to do their work.
If any of the researchers in their country uncovers an insight in year 1, it is used as input by all the million researchers in year 2. If any insight is uncovered in year 2, it is used as input for year 3.
It is obvious that, for almost any human-understandable problem, this opponent country would make so much progress within a few years that the work they produce would take your country multiple years just to comprehend.
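To make the compounding concrete, here is a toy model. The yearly growth factor g is purely my illustrative assumption; the point is only that serial time compounds multiplicatively while head-count scattered across topics merely adds.

```python
# Toy model: each year of shared insights multiplies the next year's
# productivity by g. Numbers are illustrative, not from the post.
g = 1.10                 # assumed productivity gain per year of shared insights
researchers = 1_000_000

# Your country: 1 year of work, researchers split across 1000 topics.
yours_per_topic = researchers / 1000 * 1

# Opponent: all researchers on one topic for 100 compounding years.
opponent = researchers * sum(g**year for year in range(100))

print(f"~{opponent / yours_per_topic:.0e}x more output on the contested topic")
```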
Parallel and superhuman combined
Imagine your country has 1 million PhD researchers focussed on 1000 topics and the opponent country has 1 million researchers smarter than Einstein (or any other outlier-brilliant researcher) all focussed on the same topic.
Whether you believe scientific progress is driven more by a handful of outlier researchers or by a collective of median researchers, it is obvious the opponent country will make a lot more progress than yours.
Serial and parallel and superhuman combined
Imagine your country has 1 million PhD researchers focussed on 1000 research topics and has 1 year to solve a problem.
Imagine your opponent country has 1 million researchers smarter than Einstein focussed on the same research topic, and they have 100 years to solve the problem.
Serial and parallel and superhuman and RSI combined
Recursive self-improvement (RSI) is the idea that the AI can do research on itself and improve its own intelligence. It is an open question to what extent this is possible. In the worst case, you can assume no RSI is possible.
Human beings are not able to recursively self-improve because our knowledge of neuroscience has not advanced to the point where we can edit our own neurons with a machine. Likewise knowledge of genetics has only recently advanced to the point where we can edit our own genes. If we could edit our neurons or our genes, we could probably increase our own intelligence.
An AI can trivially edit its own weights, its training algorithm, and so on. So it is likely that at least some amount of recursive self-improvement is possible. How much is unknown.
Imagine your country has 1 million PhD researchers focussed on 1000 research topics and has 1 year to solve a problem.
Imagine your opponent country has 1 million researchers smarter than Einstein focussed on the same research topic, and they have 100 years to solve the problem. Also, the problem their country is solving for the first 90 years is how to edit their own brains to become even smarter. Only in the last 10 years do they try to solve the actual problem you're competing with them on.
So on year 1 you're competing with a country full of people smarter than Einstein. On year 2 you're competing with a country full of people who have edited their brains to become even smarter than that. On year 3 you're competing with a country full of people who have edited their brains to become even smarter than that.
This is what our civilisation coming into contact with superintelligent AI could look like. By starting from an assumption of "imagine Llama3 but slightly superhuman" we have reached "unimaginably superhuman" within the span of one year.
If "Llama3 but slightly superhuman" is possible in 2030, "unimaginably superhuman AI civilisation" may be possible by 2031 as per above set of thought experiments.
A lab is still required to perform experiments, so it's possible the rate of progress becomes a lot more dependent on the experiments per dollar of the lab rather than the intelligence of the people using the lab. This is also true for AI research, where the lab is essentially a separate GPU cloud that this GPU cloud of 1 million thinking nodes can use to run experiments.
Open questions
- AI-automated lab research - AI controls a lab (in biology, chemistry, etc) and decides which experiments to run
  - Research progress per FLOP of training compute
  - Research progress per FLOP of inference compute
    - (Historically it has not been possible to teach models at inference time; only training could teach them.)
  - Research progress per dollar spent on lab experiments
  - The ratio of the three values above will decide the dollars spent on each
- AI-automated AI research - AI controls a datacentre "lab" and decides which experiments to run
  - Research progress per FLOP of training compute
  - Research progress per FLOP of inference compute
  - Research progress per FLOP of "lab" compute
  - The ratio of the three values above will decide the dollars spent on each
Prediction: Series beats parallel
2025-07-22
Disclaimer
- Quick note
- This idea might accelerate capabilities. (Someone might put more money into doing serial speedup after reading my post.)
- This idea might accelerate convincing people about AI risk. (It makes ASI more intuitive to visualise; Yudkowsky uses similar metaphors to describe ASI.)
Prediction: Serial speedup of LLMs is going to matter way more than parallel speedup
Definition: Serial speedup means running LLM forward passes faster. Parallel speedup means running more copies of the LLM in parallel. Both are paths that allow the total system to produce more output than an individual LLM.
Disclaimer
- For now, let's measure progress in domains where candidate solutions can be verified quickly and cheaply.
- Assume fast means less than 1 second of wall clock time. Cheap means less than $0.01 per experiment.
- Examples of domains where each "experiment" is fast and cheap: pure math, software, human persuasion, (maybe) AI research, (maybe) nanotechnology
- Examples of domains where each experiment is expensive: Particle colliders in experimental particle physics (can cost >$1M per run), cloning experiments in biotech ($100 per run)
- Examples of domains where each experiment is slow: Spaceflight (each launch takes years of planning), Archaeology (each excavation takes years), etc
- The latter domains will also speed up, of course, but accounting for the speed and cost of each lab experiment complicates the analysis.
Why does serial speedup matter more?
- Verifiers are a bottleneck
- Ultimately, no matter how many ideas you search through in your mind, the output is always a decision about the next lab experiment you want to run. You can't zero-shot perfect understanding of the universe. You can, however, be way more time-/cost-/sample-efficient than humans at figuring out the next experiment to run that helps you learn the most about the world.
- New ideas build on top of old ideas. Parallel is like generating lots of new ideas and then waiting to submit them all to a verifier (like a lab experiment). Series is like generating an idea, verifying it, generating another, verifying that one, and so on (see the toy sketch after this list).
- Empirical evidence: Scientific progress throughout history seems to be accelerating instead of growing linearly, as we make more and more domains verifiable (by inventing instruments such as an electron microscope or cyclotron or DNA sequencer etc)
- Multi-year focus is rare
- Most humans half-ass tasks, get distracted, give up, etc. Once people get "good enough" at a task (to get money, sex, satisfy curiosity, etc), they stop trying as hard to improve.
- (Maybe) Empirical evidence: If you spend even 10 years of your life consistently putting effort into improving at a task, you can probably be among the top 1000 people on Earth at that task.
- The primary reason I'm not a top-1000 guitarist or neuroscientist or politician is because I don't care enough to put in the hours. My brain structure is likely not that different from the people who are good at the task, I probably have the basic hardware and the algorithms required to get good. Sure, I will maybe not reach the level of Magnus Carlsen with hard work alone, but I could improve a lot with hard work.
- Humans only live <100 years; we don't really know how much intellectual progress is possible if a human could think about a problem for, say, 1000 years.
- Empirical evidence: We know that civilisations as a whole can survive for 1000 years and make amounts of progress that are unimaginable at the start. No one in year 0 could have predicted year 1000, and no one in year 1000 could have predicted year 2000.
- RL/inference scales exponentially
- RL/inference scaling grows exponentially in cost, as we all know from log scaling curves: 10x more compute for RL/inference scaling buys only about log(10) more output.
- Parallelising humans is meh
- Empirical evidence: We don't have very good evidence that a country with 10x population produces 10x intellectual output. Factors like culture may be more important. We do have lots of obvious evidence that 10 years of research produces more output than 1 year, and 100 years produces more than 10 years.
- It is possible this is similar to the RL/inference scaling curve, maybe 10x more researchers means log(10) more output.
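Here is a minimal toy sketch of the verifier-bottleneck argument above. The noise model and budget are my assumptions, purely illustrative: with the same budget of verifier calls, a serial searcher that conditions each new idea on the best verified result so far compounds, while a parallel searcher submitting one big independent batch only gets the best of N random draws.

```python
import random

# Toy model of the verifier bottleneck. Assumption (mine): a new idea's
# quality = quality of the verified knowledge it builds on, plus noise.
random.seed(0)
BUDGET = 1000  # total verifier calls available in both regimes

def propose(base: float) -> float:
    """An idea built on top of verified knowledge of quality `base`."""
    return base + random.gauss(0, 1)

# Serial: verify each idea, then build the next one on the best so far.
best = 0.0
for _ in range(BUDGET):
    best = max(best, propose(best))

# Parallel: generate all ideas up front from the same starting point,
# then spend the same verifier budget scoring them.
batch_best = max(propose(0.0) for _ in range(BUDGET))

print(f"serial:   {best:7.1f}")       # compounds, roughly +0.4 per verification
print(f"parallel: {batch_best:7.1f}") # best of N independent draws, ~3-4 sigma
```

The exact numbers are meaningless; the shape is the point. In this toy model, serial progress is roughly linear in verifier calls, while parallel progress grows only like the maximum of N independent draws (about sqrt(2 ln N) for Gaussian noise), echoing the log-shaped returns to parallelism noted above.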
How much serial speedup is possible?
- Jacob Steinhardt says an LLM forward pass can be brought down to below 1 millisecond per token, i.e. over 1,000 tokens per second. This is using GPUs; an ASIC built for inference might be even faster.
- A human speaks at 100-150 words per minute, or around 3 tokens per second. This is roughly 300x slower.
- You could maybe make an argument that human thought stream is actually running faster than that, and that we think faster than we speak.
- Even assuming only 100x speedup, the AI experiences 10,000 simulated years per 100 years of wall clock time. If you gave me 10,000 years to progress some field, it is beyond my imagination what I would do.
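The arithmetic behind these two bullets, spelled out:

```python
human_tokens_per_s = 3            # ~100-150 words per minute
ai_tokens_per_s = 1000            # sub-millisecond forward pass, per Steinhardt

print(f"raw speedup: ~{ai_tokens_per_s / human_tokens_per_s:.0f}x")    # ~333x

conservative_speedup = 100        # round down to be safe
wallclock_years = 100
print(f"simulated years: {conservative_speedup * wallclock_years:,}")  # 10,000
```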
P.S. On thinking more about it, if you gave me 100 simulated years per year of wall clock time, I might consider this situation worse than death.
- I would have to wait 100 subjective seconds per second of wall clock time, or multiple subjective minutes just to move my finger. So my body is as good as paralysed, from the point of view of my mind. Yes, I can eventually move my body, but do I want to endure the simulated time required to get useful bodily movements? This is basically a mind prison.
- Also everybody around me is still too slow, so I'm as good as the only person alive. No social contact will ever be possible.
- I could set up a way to communicate with a computer using eye movements or something, if I can endure living in the mind prison long enough to do this.
- The number one thing that would end the eternal torment would be for me to be able to communicate with another being (maybe even my own clone) that runs at a speed similar to mine. Social contact would help.