# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/personal/date_me.md

2025-06-04

To do for self

- Simplify this document; some core ideas are repetitive.
- Use "show not tell": use more photos/videos for each point instead of sentences. Lmao, maybe I have just derived from first principles why dating apps use photos and not text. It's possible the idea of a text-only document here is a bad idea.
- I seem to lack clarity on whether this document is written to my future self or my future partner; the two imply somewhat different documents.

# Date me

*Referral bonus: If you introduce me to someone I end up partnered with for at least 5 years, I will pay you $2000. To claim the reward, make sure to keep a copy of our chats and a copy of this webpage as proof, as I might forget otherwise.*

[Some videos of me](../../video/)

- includes solo videos and with others
- includes professional and personal context

[Some photos of me](../../non_text_non_video/self_solo_pics/)

- includes solo photos in personal context

[Contact me](../connect_with_me/contact_me.md)

- I value novelty highly, so if you're interested in me romantically, you should probably reach out even if the criteria below don't match.
- The cost of failure is very low from your point of view; I would appreciate the attempt even if it fails.

Disclaimer

- I am well aware that most people see analysing relationships to this extent as atypical.
  - I like doing it for reasons of curiosity.
  - I also benefit from doing it.
- At least half the points here would probably benefit from a "show not tell" approach.
  - Solution: Might upload photos/videos. Content relevant to each point.
- It's difficult for me to predict what a hypothetical future partner might value or disvalue.
  - It varies person-to-person, and I obviously can't pre-emptively guess every possible filter someone might test me against.
  - What you appreciate (or don't appreciate) about me might be very different from what I choose to write about myself.
  - Solution: The criteria below are mostly from a book. Am open to more data on this.
  - Solution: Might upload photos/videos. Lots of content, instead of trying to guess which content is best.
  - Solution: Meet irl / video call.
- Some important information may be non-verbal, such as body language.
  - Solution: See my photos/videos.
  - Solution: Meet irl / video call.
- A lot of this type of information seems more relevant to online dating, as there is very little pre-existing context. A lot of this info gets implicitly figured out in irl meetups.
- At least from my end, I've realised "Date Me" text documents are not a good way of filtering beyond basic criteria.
  - Filters are mostly about blacklisting; I don't know how to whitelist using text-based filters.
  - Solution: Send me your photos/videos.
  - Solution: Meet irl / video call.

## (My guesses of) Your filters

Information about me you may (or may not) use as a filter when deciding to date me.

- Mental health
  - Average / below average. I'm aware this can be a major impediment to finding a partner, and am working to improve it. Will update this document once it's improved.
- Physical health, physical attraction
  - Decent physical health; I exercise regularly.
  - Not building muscle, as it's not my priority. (Can change this if someone gives a good reason why.)
  - See my photos/videos.
- Social status
  - My proofs of social status
    - Videos with friends linked above. Feel free to schedule chats with people in my social circle.
    - Videos of work-related stuff I probably have saved somewhere; will have to find them if you want them.
    - Videos/photos of past relationships not shared due to privacy reasons. Might share some stories if I trust you enough.
  - As weak proof of competence at some skill, or of respect from many people
    - Haven't yet found any communities I would like to stick to long-term, where I can earn significant respect or competence. I've done this in the past but currently no.
  - I work very hard to not be affected by the opinions of others. I see many status hierarchies in society as fundamentally broken, as they are premised on lies.
    - There is more nuance here which I can't cover in short.
  - Within certain bounds, I think social status is a reasonable filter to use for deciding who to date. If it's your primary filter though, we're probably not a good fit.
  - [I'm assuming you use social status as a filter to ensure that:
    - I won't threaten your physical safety or use it as leverage to coerce you into doing something you don't want to do.
    - I at least partially understand the implicit norms of society, and that if I choose not to fit in, it is often by choice rather than ignorance or compulsion.]
- Finances
  - Due to a combination of multiple factors (birth lottery, genes, hard work) I am not in financial difficulty and am unlikely to be in future. You can see my other documents for more on this.
  - I don't spend on a lot of consumerist stuff though. There might be a mismatch in our spending habits and lifestyle. I don't care about this, but there's a possibility you might.
- Intelligence / executive function
  - Fairly high executive function, at least while my mental health is fine. Have completed multiple independent tasks, including completing a degree, working at multiple companies, working on solo software projects and so on.
- Aesthetics
  - I don't care a lot about aesthetics. I could be motivated to put a little effort into it for your sake (if you happen to care about it) but probably not a lot of effort.
- Protectiveness
  - I'm aware physical safety is a much bigger concern for women than it is for men. (I have forgotten about this in the past though; will try to be more mindful in the future.)
  - At least in my own personal life, I see entering conflicts as a significant sink of time and energy. I avoid most conflicts for this reason, and am very careful about picking which conflicts to enter.
    I don't let ego or short-term emotions be the primary reason to pick conflicts.
  - If you are facing or entering some conflict, I could be significantly protective of you in it. (Ask me for past examples lol, I have some stories.) Although again, in the long term I would generally expect you to reconfigure your life in a way that does not require me to regularly spend my time and energy involved in your conflicts.
    - Exceptions exist. There are nuances here that are hard to summarise.
- Conscientiousness
  - This document itself is probably some basic proof that yeah, I am willing to put in significant effort to find mutually beneficial solutions in case of a conflict. If I do make a significant move like breaking up, I have probably put a fair amount of thought into it.
- Physical attraction
- Past sexual experience
  - Won't discuss in detail in a public document; ask me privately.
  - Learned this from books lol. Also I've mostly received positive feedback from previous partners.
  - I find "actually wanting the other person to have a good time" to be more important than just "not technically violating consent".
  - I would rather optimise for the other person's long-term interests than their short-term interests. For instance, if you give consent but I have a strong guess you'll later come to regret it, then I would rather not do it.

## My filters

Here's the basic criteria I'm looking for from my end:

- ideal relationship type: straight, polyamorous, medium- to long-term relationship. children - don't know
  - If you're interested in me but want something different from this, just ask.
  - [Since I'm fairly busy nowadays, I'm unlikely to spend lots of time searching for a relationship that's not my ideal in the first place. But that doesn't mean I'll say no if you ask.]
- place: current place - any city in India. future place - maybe SF. (can change in future)
  - can shift for a relationship, but I should like the city. if shifting, may also need a visa without taking a full-time job.
- time: <= 1 day/week, need alone time, need time to work. (can change in future)
  - ideal if you also have other sources of emotional support / advice / shared activities / etc. than me (easier to do if poly imo)
- money: separate finances for first 5 years. can date someone outside my class despite additional challenges.
- values: authentic, emotionally vulnerable, individualist, atheist, altruistic (match not required), somewhat ambitious (match not required)
  - should understand these parts of me even if you aren't similar yourself
  - tell me upfront from day one if you think your family or social circle is likely to negatively affect the relationship in future. I can probably self-modify a little, but not a lot, to fit into your circle. Nuances exist, hard to summarise.
- personality: introvert (match not required), high curiosity, intellectual (match not required), high openness-to-experience, high risk-tolerance, very direct communication style (can tone this down on request but only to some extent), can put in significant one-sided effort if required
- social circle: you don't have to fit into any of my social circles, or feel pressured to self-modify to do so
- physical attraction
- lifestyle: not balanced (match not required)
- social circle: (match not required)
- hobbies: (match not required)
  - travelling, board games (chess, catan), music (rap, rock, nightcore, etc.), books (nonfiction), open to new hobbies, especially hobbies that are easy to carry around
- everything else: (match not required)
- I'm probably okay with: difference in socioeconomic class, difference in level of education, significant age gap, your physical or mental health issues, different hobbies, different lifestyle

Relationship advice I agree with

- I'm sympathetic to the perspective that relationships are built, not found. As long as we align in a few core ways, I'd be willing to overlook a lot of other potential misalignment.
- I'm sympathetic to [Paul Graham's advice](https://paulgraham.com/greatwork.html) on work-life balance.

> Don't marry someone who doesn't understand that you need to work, or sees your work as competition for your attention. If you're ambitious, you need to work; it's almost like a medical condition; so someone who won't let you work either doesn't understand you, or does and doesn't care.

Uncertainty: Alignment of core values

- I'm generally unsure of the importance of alignment of core values in a marriage.
- In general I find only a few people who share my core values and beliefs, and only a few people who share my lifestyle.
- I would like the freedom to make significant changes to my core values and beliefs and lifestyle even after getting married. An extreme example of this would be someone who converts their religion many years after getting married.
- In my experience, alignment of thought process on day-to-day decisions can be just as important as alignment of thought process on the big decisions. I sometimes find it difficult to spend significant amounts of time with someone who makes their day-to-day decisions using a thought process very different from mine. I find this much easier to do if I spend a smaller amount of time with them, and also spend time with others.
- I want more data on successful examples of marriages that maintain a high level of independence for both partners.
- Being polyamorous might make it easier to stay together despite differences in values. I'm unsure about this.

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/personal/unimportant/hall_of_shame.md

2025-06-21

# Hall of Shame

## Main

- Tsalise. insta: tsaliserai1. +91-9366607xxx. Dimapur, Nagaland. Borrowed 400 INR 2025-04-02, promised return date in 2025-05-01. Not returned as of 2025-05-18.
- Jagdish. +91-8837493xxx. Dimapur, Nagaland. Bangalore. Borrowed 2000 INR in 2024 or early 2025. Promised return date before 2025-03.
  Not returned as of 2025-05.
- Prince. ex-JNU. +91-7551115xxx. Delhi. Borrowed 2500 INR in 2023. Promised return date in 2023. Not returned as of 2025-05.
- Wacharin. insta: wacharin_09. +66-944264xxx. Bangkok. Borrowed 4000 THB in 2024. Promised return date in 2024. Not returned as of 2025-05.
- Shyam. +977-9824276xxx. Pokhara, Nepal. Borrowed 20,000 INR in 2024. Promised return date in 2024. Not returned as of 2025-05.

## Disclaimers etc

Update: I now have a document called "information policy" that describes in more detail when I do and don't respect privacy for others.

Disclaimer

- If you are worried you'll end up on this list over a misunderstanding, feel free to explicitly bring this up as a conversation topic and discuss it with me.
- I often give a person multiple chances before they end up on this list. If a person ends up on this list, there's a high probability they are also permanently removed from my life.
- I am aware that many people consider publicly revealing private details of a relationship as highly escalatory.
- I am likely to publicly escalate only under the following circumstances:
  - I see the relationship as over, with no further option to salvage it.
  - I think you're likely to damage other people's lives just as you've damaged mine.
- People listed here might have their own side of the story; feel free to ask them for that.
- This is not an exhaustive list. I use my discretion on what to share or not share.

Why?

- This is an experiment.
- Want to publicly damage the reputations of the people listed, by sharing actions that (I claim) they did.
- Want to set a precedent that due to the internet, it is difficult to hide anything from the public. The upside is you can build a reputation faster; the downside is you can also burn it faster.

Not why

- For amounts of money that are small for me, I don't really care about the money itself.

How?
- This page will soon get cached by the Internet Archive, Common Crawl and others, making it hard for anyone, including me, to delete it from the internet.
- As of 2025-05, internet search functionality is not accurate enough for anyone to discover this page on their own. This could soon change due to LLMs.

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/personal/unimportant/my_role_models.md

2024-12-05

# My role models

I'm generally against the idea of having a single role model in my own life. (You can have a role model in your life, I have no problem with that. I'm talking strictly about myself here.)

That being said, here's an incomplete list of people I rely on for motivation. (Motivation is not the same as advice.)

- Eliezer Yudkowsky
- Tim Urban
- Moxie Marlinspike
- Vitalik Buterin
- Yeonmi Park

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/personal/unimportant/health/std_test_20240517.md

2025-06-03

# STD test report (tested 2024-05-17)

Samuel, male, DOB 2001-01-03

Summary

- Prior HSV1 infection. May or may not be active.

```
Test Name                   Result Value   Unit    Reference Range
HSV-Type I antibody IgG     >62.2          Index
```

---

```
Test Name                                     Method   Result     Unit
HSV-Type I antibody IgG                       CLIA     H >62.2    Index
HIV I & II (p24 antigen and antibody combo)   CMIA     0.160      S/CO
HSV-Type I antibody IgM                       ELISA    0.52       Index
HSV-Type II antibody IgG                      CLIA     <0.500     Index
HSV-Type II antibody IgM                      ELISA    0.54       Index
Chlamydia Trachomatis IgG                     ELISA    2.58       NTU
TPHA (Haemagglutination)                               Negative   (Reference: Negative)
TPHA Titre                                             N.A.
<1:80   (TPHA Titre reference range)
```

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/personal/unimportant/health/blood_test_20240518.md

2025-06-03

# Blood test report (tested 2024-05-18)

Samuel, male, DOB 2001-01-03

Summary

- Vitamin D below reference range (33.70 nmol/L vs 75.00 - 250.00).

```
Test Name                                Results   Units           Bio. Ref. Interval

LIVER & KIDNEY PANEL, SERUM
Creatinine (Modified Jaffe, Kinetic)     0.84      mg/dL           0.70 - 1.30
GFR Estimated (CKD EPI Equation 2021)    125       mL/min/1.73m2   >59
GFR Category (KDIGO Guideline 2012)      G1
Urea (Urease UV)                         25.30     mg/dL           13.00 - 43.00
Urea Nitrogen Blood (Calculated)         11.82     mg/dL           6.00 - 20.00
BUN/Creatinine Ratio (Calculated)        14
Uric Acid (Uricase)                      4.80      mg/dL           3.50 - 7.20
AST (SGOT) (IFCC without P5P)            16.0      U/L             15.00 - 40.00
ALT (SGPT) (IFCC without P5P)            12.0      U/L             10.00 - 49.00
GGTP (IFCC)                              15.0      U/L             0 - 73
Alkaline Phosphatase (ALP) (IFCC-AMP)    76.00     U/L             30.00 - 120.00
Bilirubin Total (Oxidation)              1.10      mg/dL           0.30 - 1.20
Bilirubin Direct (Oxidation)             0.42      mg/dL           <0.3
Bilirubin Indirect (Calculated)          0.68      mg/dL           <1.10
Total Protein (Biuret)                   8.50      g/dL            5.70 - 8.20
Albumin (BCG)                            5.35      g/dL            3.20 - 4.80
A : G Ratio (Calculated)                 1.70                      0.90 - 2.00
Globulin (Calculated)                    3.15      gm/dL           2.0 - 3.5
Calcium, Total (Arsenazo III)            9.60      mg/dL           8.70 - 10.40
Phosphorus (Molybdate UV)                3.69      mg/dL           2.40 - 5.10
Sodium (Indirect ISE)                    143.00    mEq/L           136.00 - 145.00
Potassium (Indirect ISE)                 4.38      mEq/L           3.50 - 5.10
Chloride (Indirect ISE)                  110.00    mEq/L           98.00 - 107.00

LIPID SCREEN, SERUM
Cholesterol, Total (CHO-POD)             171.00    mg/dL           <200.00
Triglycerides (GPO-POD)                  94.00     mg/dL           <150.00
HDL Cholesterol (Enz Immunoinhibition)   53.10     mg/dL           >40.00
LDL Cholesterol, Calculated              99.10     mg/dL           <100.00
VLDL Cholesterol, Calculated             18.80     mg/dL           <30.00
Non-HDL Cholesterol (Calculated)         118       mg/dL           <130
Glucose Fasting (Hexokinase)             85.00     mg/dL           70 - 100
Vitamin B12; Cyanocobalamin              298.00    pg/mL           211.00 - 911.00
Vitamin D, 25 Hydroxy                    33.70     nmol/L          75.00 - 250.00

THYROID PROFILE, TOTAL, SERUM (CLIA)
T3, Total                                1.19      ng/mL           0.60 - 1.81
T4, Total                                9.90      μg/dL           5.01 - 12.45
TSH                                      1.94      μIU/mL          0.550 - 4.780

AMYLASE, SERUM (G7PNP)
Amylase                                  46.00     U/L             30.00 - 118.00

IRON STUDIES, SERUM (Spectrophotometry)
Iron                                     167.00    ug/dL           65.00 - 175.00
Total Iron Binding Capacity (TIBC)       328.00    μg/dL           250.00 - 425.00
Transferrin Saturation                   50.91     %               20.00 - 50.00

HbA1c (GLYCOSYLATED HEMOGLOBIN), BLOOD (HPLC, NGSP certified)
HbA1c                                    5.0       %               4.00 - 5.60
Estimated average glucose (eAG)          97        mg/dL

C-REACTIVE PROTEIN, CARDIO; hsCRP (Immunoturbidimetry)
hsCRP                                    2.05      mg/L            <1.00

APOLIPOPROTEINS A1 & B, SERUM (Immunoturbidimetry)
Apolipoprotein (Apo A1)                  121       mg/dL           79 - 169
Apolipoprotein (Apo B)                   76        mg/dL           46 - 174
Apo B / Apo A1 Ratio                     0.63                      0.35 - 0.98

URINE EXAMINATION, ROUTINE; URINE, R/E (Automated Strip Test, Microscopy)
Physical
Colour                                   Slight Lemon Yellow       Pale yellow
Specific Gravity                         1.015                     1.001 - 1.030
pH                                       5                         5.0 - 8.0
Chemical
Proteins                                 Negative                  Negative
Glucose                                  Negative                  Negative
Ketones                                  Negative                  Negative
Bilirubin                                Negative                  Negative
Urobilinogen                             Negative                  Negative
Leucocyte Esterase                       Negative                  Negative
Nitrite                                  Negative                  Negative
Microscopy
R.B.C.
                                         Negative                  0.0 - 2.0 RBC/hpf
Pus Cells                                Negative                  0-5 WBC/hpf
Epithelial Cells                         0.0 - 5.0 Epi cells/hpf   0-1 Epi Cells/hpf
Casts                                    None seen                 None seen/Lpf
Crystals                                 None seen                 None seen
Others                                   None seen                 None seen

HEMOGRAM
Hemoglobin (Photometry)                                16.50     g/dL        13.00 - 17.00
Packed Cell Volume (PCV) (Calculated)                  49.00     %           40.00 - 50.00
RBC Count (Electrical Impedence)                       5.48      mill/mm3    4.50 - 5.50
MCV (Electrical Impedence)                             89.50     fL          83.00 - 101.00
MCH (Calculated)                                       30.20     pg          27.00 - 32.00
MCHC (Calculated)                                      33.70     g/dL        31.50 - 34.50
Red Cell Distribution Width (RDW) (Electrical Impedence)   14.10   %         11.60 - 14.00
Total Leukocyte Count (TLC) (Electrical Impedence)     7.00      thou/mm3    4.00 - 10.00
Differential Leucocyte Count (DLC) (VCS Technology)
Segmented Neutrophils                                  45.80     %           40.00 - 80.00
Lymphocytes                                            38.30     %           20.00 - 40.00
Monocytes                                              6.10      %           2.00 - 10.00
Eosinophils                                            9.50      %           1.00 - 6.00
Basophils                                              0.30      %           <2.00
Absolute Leucocyte Count (Calculated)
Neutrophils                                            3.21      thou/mm3    2.00 - 7.00
Lymphocytes                                            2.68      thou/mm3    1.00 - 3.00
Monocytes                                              0.43      thou/mm3    0.20 - 1.00
Eosinophils                                            0.67      thou/mm3    0.02 - 0.50
Basophils                                              0.02      thou/mm3    0.02 - 0.10
Platelet Count (Electrical impedence)                  346       thou/mm3    150.00 - 410.00
Mean Platelet Volume (Electrical Impedence)            8.1       fL          6.5 - 12.0
E.S.R. (Capillary photometry)                          1         mm/hr       0.00 - 15.00
```

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/personal/unimportant/health/blood_test_20250525.md

2025-06-03

# Blood test report (tested 2025-05-25)

Samuel, male, DOB 2001-01-03

Summary

- Vitamin D and B12 insufficient.

```
Test Name                              Result   Unit    Bio. Ref. Interval   Method
VITAMIN D (25 - OH VITAMIN D), SERUM   21.7     ng/mL   30 - 100             CLIA
VITAMIN B12, SERUM                     170      pg/mL   190-900              CLIA
```

---

```
COMPLETE BLOOD COUNT (CBC), WHOLE BLOOD EDTA
Test Name                              Result   Unit    Bio. Ref. Interval   Method
HAEMOGLOBIN                            16.5     g/dL           13-17          Spectrophotometer
PCV                                    49.80    %              40-50          Electronic pulse & Calculation
RBC COUNT                              5.25     Million/cu.mm  4.5-5.5        Electrical Impedence
MCV                                    94.9     fL             83-101         Calculated
MCH                                    31.4     pg             27-32          Calculated
MCHC                                   33.1     g/dL           31.5-34.5      Calculated
R.D.W                                  13.8     %              11.6-14        Calculated
TOTAL LEUCOCYTE COUNT (TLC)            5,500    cells/cu.mm    4000-10000     Electrical Impedance

DIFFERENTIAL LEUCOCYTIC COUNT (DLC)
NEUTROPHILS                            48.2     %              40-80          Flow cytometry
LYMPHOCYTES                            37.8     %              20-40          Flow cytometry
EOSINOPHILS                            6.1      %              1-6            Flow cytometry
MONOCYTES                              7.4      %              2-10           Flow cytometry
BASOPHILS                              0.5      %              0-2            Flow cytometry
CORRECTED TLC                          5,500    Cells/cu.mm                   Calculated

ABSOLUTE LEUCOCYTE COUNT
NEUTROPHILS                            2651     Cells/cu.mm    2000-7000      Calculated
LYMPHOCYTES                            2079     Cells/cu.mm    1000-3000      Calculated
EOSINOPHILS                            335.5    Cells/cu.mm    20-500         Calculated
MONOCYTES                              407      Cells/cu.mm    200-1000       Calculated
BASOPHILS                              27.5     Cells/cu.mm    0-100          Calculated
Neutrophil lymphocyte ratio (NLR)      1.28                    0.78-3.53      Calculated
PLATELET COUNT                         297000   cells/cu.mm    150000-410000  Electrical impedence
MPV                                    7.8      fL             8.1-13.9       Calculated

ESR, WHOLE BLOOD EDTA                  9        mm at 1 hour   0-15           Modified Westergren

GLUCOSE, FASTING, NAF PLASMA           89       mg/dL          70-100         HEXOKINASE

HBA1C (GLYCATED HEMOGLOBIN), WHOLE BLOOD EDTA
HBA1C, GLYCATED HEMOGLOBIN             5        %                             HPLC
ESTIMATED AVERAGE GLUCOSE              97       mg/dL                         Calculated

LIPID PROFILE, SERUM
TOTAL CHOLESTEROL                      163      mg/dL          <200           CHOD-PAD
TRIGLYCERIDES                          65       mg/dL          <150           GPO-PAP
HDL CHOLESTEROL                        60       mg/dL          >=40 Desirable Enzymatic Immunoinhibition
NON-HDL CHOLESTEROL                    102      mg/dL          <130           Calculated
LDL CHOLESTEROL                        89.42    mg/dL          <100           Calculated
VLDL CHOLESTEROL                       13.08    mg/dL          <30            Calculated
CHOL / HDL RATIO                       2.69                    0-4.97         Calculated
ATHEROGENIC INDEX (AIP)                < 0.01                  <0.11          Calculated
LIVER FUNCTION TEST (LFT), SERUM
BILIRUBIN, TOTAL                       0.96     mg/dL          0-1.2          Diazo
BILIRUBIN CONJUGATED (DIRECT)          0.34     mg/dL          0-0.2          Diazo
BILIRUBIN (INDIRECT)                   0.62     mg/dL          0.0-1.1        Calculated
ALANINE AMINOTRANSFERASE (ALT/SGPT)    24.2     U/L            10-50          IFCC with Pyridoxal Phosphate
ASPARTATE AMINOTRANSFERASE (AST/SGOT)  23.8     U/L            10-50          IFCC with Pyridoxal Phosphate
AST (SGOT) / ALT (SGPT) RATIO (DE RITIS)   1.0                 <1.15          Calculated
ALKALINE PHOSPHATASE                   85.40    U/L            40-129         IFCC
PROTEIN, TOTAL                         8.28     g/dL           6.4-8.3        Biuret
ALBUMIN                                5.13     g/dL           3.5-5.2        Bromo Cresol Green
GLOBULIN                               3.15     g/dL           2.0-3.5        Calculated
A/G RATIO                              1.63                    0.9-2.0        Calculated

RENAL PROFILE / KIDNEY FUNCTION TEST (RFT/KFT), SERUM
CREATININE                             0.80     mg/dL          0.7-1.2        Jaffe
eGFR - ESTIMATED GLOMERULAR FILTRATION RATE   124.44   mL/min/1.73m²   >60    CKD-EPI FORMULA
UREA                                   17.60    mg/dL          13-43          Urease
BLOOD UREA NITROGEN                    8.2      mg/dL          8.0-23.0       Calculated
URIC ACID                              4.52     mg/dL          3.5-7.2        Uricase
CALCIUM                                10.30    mg/dL          8.6-10         NM-Bapta
PHOSPHORUS, INORGANIC                  4.36     mg/dL          2.5-4.5        Phosphomolybdate Complex
SODIUM                                 140.6    mmol/L         136-145        ISE (Indirect)
POTASSIUM                              4.2      mmol/L         3.5-5.1        ISE (Indirect)
CHLORIDE                               103      mmol/L         98-107         ISE (Indirect)
PROTEIN, TOTAL                         8.28     g/dL           6.4-8.3        Biuret
ALBUMIN                                5.13     g/dL           3.5-5.2        Bromo Cresol Green
GLOBULIN                               3.15     g/dL           2.0-3.5        Calculated
A/G RATIO                              1.63                    0.9-2.0        Calculated

C-REACTIVE PROTEIN CRP (QUANTITATIVE), SERUM   1.2   mg/L      0-5            Latex Particle Immunoturbidimetric

ELECTROLYTES, SERUM
SODIUM                                 140.6    mmol/L         136-145        ISE (Indirect)
POTASSIUM                              4.2      mmol/L         3.5-5.1        ISE (Indirect)
CHLORIDE                               103      mmol/L         98-107         ISE (Indirect)
IRON STUDIES (IRON + TIBC), SERUM
IRON                                   107.0    µg/dL          70-180         TPTZ AND NITROSO-PSAP
TOTAL IRON BINDING CAPACITY (TIBC)     324      µg/dL          261-462        Calculated
UNSATURATED IRON BINDING CAPACITY (UIBC)   217.00   µg/dL      155-355        NITROSO-PSAP
% OF SATURATION                        33.02    %              14-50          Calculated

THYROID PROFILE TOTAL (T3, T4, TSH), SERUM
TRI-IODOTHYRONINE (T3, TOTAL)          93.2     ng/dL          87-178         CLIA
THYROXINE (T4, TOTAL)                  9.4      µg/dL          5.48-14.28     CLIA
TSH (Ultrasensitive/4th Gen)           2.626    µIU/mL         0.38-5.33      CLIA

VITAMIN D (25 - OH VITAMIN D), SERUM   21.7     ng/mL          30 - 100       CLIA
VITAMIN B12, SERUM                     170      pg/mL          190-900        CLIA
```

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/personal/unimportant/polyamory_personal.md

2025-05-14

# Polyamory (Personal)

#### Summary

- If I were living in the US, I'd attempt being polyamorous. I want to combine freedom (to think and act independent of others' wants/needs) with >10 year commitments and time investments in a person of the opposite sex.
- I'm not very keen on woman friends or casual sex or short-term relationships. (Exceptions may exist.)

#### Main

I'd probably be polyamorous if I were living in the US. As of today though, I am not practising polyamory.

Why?

- Long-term benefits
  - I'm sympathetic to the idea of compound benefits. Investing 10 years into the same person has compound benefits you don't get if you choose a new person to date every year. I don't have a deep understanding of what the compound benefits are, but I'm convinced they're real and significant.
  - I'm also sympathetic to the idea of planning your life many years into the future instead of just doing what feels good for the next day or month or year.
  - My ideal is not a casual relationship, as I tend to get attached to people.
  - I'm not very keen on staying friends with women I'm attracted to, due to this not working out in past experiences.
- Independence of thought
  - My current guess is that it is possible to maintain a higher level of independence of core values and beliefs from your partner while in a polyamorous relationship. Maintaining this independence of thought is very important to me.
- People are hard to compare
  - I'm sympathetic to the idea that different people provide different forms of value that are hard to compare, say on a scale of 1 to 10.
  - Imagine being forced to have only one friend, for example. It would significantly alter your attitude towards friendships if you're forced to pick only one.
  - If I'm forced to pick only one partner for my whole life, I will naturally have a high bar for who I wish to pick. I will have to eventually break up with anyone who fails to meet this bar, no matter how good they are for me or I am for them. I do think nobody can be perfect at everything, and there will be ways in which this person falls short no matter who they are. There will be meaningful value (trustworthy advice, knowledge, emotional support etc.) I can get from people other than this one person.
- Time investments
  - I like gradients over sharp cut-offs. If you're monogamous, you're forced to either consider someone as having the potential to become one of the most important people in your life, or remove them from your life completely. There is no in-between. This is unlike friendships, where you can gradually increase or decrease time investment as circumstances change, without losing the relationship completely.
  - Being poly can allow you to maintain a relationship with much lower time investments. For instance, one meeting in 3 months may be sufficient depending on circumstances, as they are not solely dependent on me for any of their needs. This is similar to some friendships requiring little time investment to maintain and yet lasting long-term.

Why not?

- Unknown unknowns
  - I don't yet have a lot of real-world data on problems faced by polyamorous couples.
    Everything I wrote above is only theory. I'd like more data on the same.
- Societal acceptance
  - As long as I live in India, I and the people I date are likely to face societal judgement from religious people. I'm personally okay dealing with that, but it could be a non-trivial task to find a partner who is also okay with it.

How?

Scarce resources include time and place.

- I used to think time is a constraint for practising polyamory, but I'm less convinced by this now. Even if you have only 4 days a month to spend with other people, you can split that as 2 days a month per person. Doing this over 10 years still likely gets some of the compound benefits I mentioned before.
- Place is a significant constraint. I haven't finalised a place to live in for many years; it's possible I won't stay in India. So any relationships I'm part of will have to adapt to that. (This is the kind of annoying tradeoff that makes me wish billion-person cities existed.)

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/personal/unimportant/therapy_effectiveness_personal.md

2025-05-05

# Therapy effectiveness (Personal)

Disclaimer / Quick note

- I have confirmed with everyone this document talks about that they're okay with it being published.
- I don't have a background in psychology, or a huge amount of experience as a client. I have a medium amount of experience (>50 sessions) as a client.
- I talk a lot about anecdotal personal experiences here, rather than the average result across all people. Don't generalise my anecdotes too far.

Research

- Empirical data shows therapy is helpful for some people and not for others. I could not find clear empirical data to predict in advance whether you're in a category of people likely to have a higher or lower success rate.
- I have not read many papers on it. I usually end up reading reviews of the papers by people like Scott Alexander and others.
Personal anecdotes

- I have had mildly positive but not greatly positive experiences with therapy so far.
  - Not keen on sharing a lot more details in a public forum. If you want complete information on this, you're better off befriending me and asking in person.
- A common problem I run into, from my perspective at least, is differences in core values and beliefs.
  - Therapists seem to be taught to be maximally non-judgemental and accepting of all values and beliefs. Even if they don't agree with your worldview, they are supposed to imagine for a moment that they had your worldview and try figuring out a solution with that assumption in mind.
  - I think I have had multiple therapists who, from my perspective, lacked meta-honesty when doing this with me. Example of a meta-honest comment: "I might disagree with some of your opinions, but I refuse to talk about the disagreement because I don't think that's best for the type of relationship I'm trying to build with you. As per my worldview it may or may not be good for you to continue down your current path. But I'm going to ignore all that and just focus on what's good for you assuming your worldview."
  - I think I distrusted multiple therapists because this was not explicitly discussed. I would try to force them to cough up information about their values and beliefs, without realising they've been trained not to share that information with me.
  - I think human brains have a lot of specialised wiring to identify whether someone is on your side or not. If someone diplomatically tries to look like they're on my side but actually isn't, that can make them seem less trustworthy to me compared to if they honestly admit they're not on my side.
- In theory, a therapist can choose to support you on your current path even if they don't personally like that path, because that's what you're paying them to do.
  - In practice, I think payment alone is not a good enough incentive.
    A therapist (or any human for that matter) is likely to do a better job helping you out if they actually want you to succeed on your current path for reasons besides just money.
  - In my personal case, I think this also becomes difficult as I'm pursuing explicitly political goals that can damage the lives of other people for a greater good. It's easier to help someone pursue goals you don't believe in if those goals are just personal goals (like finding a partner of xyz type or making lots of money) than if those goals are political goals (like organising protests in order to destroy someone else's business).
  - I think most therapists worldwide believe in something similar to a deontological moral system (don't do harm ever; the greater good never justifies harm) and I don't believe in it. This disagreement also, I think, plays a role in the types of problems I have run into during therapy.
- I also think therapists' secrecy norms are generally not as strong as required for someone who actually has their career or life on the line for violating secrecy.
  - Many therapists lack a background in criminal investigations and lack an understanding of what stronger guarantees of secrecy actually look like.
    - Examples: lawyers, spies, criminals or potential criminals, foreign diplomats, major politicians, billionaires, allies/advisors of major politicians or billionaires, etc.
  - This probably doesn't matter for therapy efficacy in most cases, as most people's lives aren't interesting enough for an adversary to put a lot of effort into getting their secrets.
  - It definitely matters for a small set of cases where the client takes secrecy more seriously than the therapist. Sometimes the client is right in needing stronger guarantees on secrecy, and sometimes the client overestimates their true need for these guarantees. But in any case, most therapists can't provide them.
Speculation

- The main reasons a therapist can give you good advice:
  - they are less biased than you - they are not personally afflicted by the same problems you are, which can let them be less biased around them. This fails if your problems are in fact global problems shared by them. Also it's possible an intelligent friend can play this role.
  - they have more information than you - they've seen other people's secrets and they understand the world better as a result.
    - I believe in this; if a therapist has seen lots of people with problems similar to yours, they likely have some understanding of the common generators of those problems.
  - they have been trained on a standard set of procedures that have weak positive empirical evidence
    - I'm confident this works for a subset of people. If you don't have one of the standard problems, the success rate of therapy varies a lot more I think.
  - they have spent a lot of time thinking about psychological problems, unlike you.
    - IMO the median therapist does not do enough world-historically-original thinking to be likely to solve your problems this way. There are likely outlier therapists who are good at thinking originally and creatively to solve your problems.
- Therapists are often good at emotional support.
  - I'm usually biased towards requiring advice more than support, but emotional support is valuable to a large number of people. Most people fundamentally want to be heard, understood and accepted.
  - Beyond that though, the client needs practical advice to build a life outside of therapy, not just support.
  - I tend to measure effectiveness based on good advice rather than good support. This is a personal bias of mine.

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/personal/unimportant/plan_b_personal.md 2025-05-16

# Plan B (Personal)

If I was not working on my current cluster of projects around whistleblowing and AI companies, what would I be working on instead?

1.
Maybe trying to understand in depth why the cost of solar energy is going down, and maybe working in the field to reduce the cost of energy.
2. Maybe not working on an altruistic project at all, and just meeting new people from cultures very different from mine.
   - (My biggest issue with this idea is that close relationships don't scale, and a lot of the interesting information can only be obtained in close relationships. Which prompts me again to study SDM in a more theoretical manner.)

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/personal/conversation_modes.md 2025-06-02

# Conversation modes

Disclaimer
- Quick note
- New and experimental. Unsure of a lot of claims here. I don't have crisp descriptions of reality, I am mostly thinking aloud here. Will update once I have better understanding.

I was initially trying to figure this out in a dating context, but it also seems relevant to friendships and possibly even work relationships. In a dating context, I realised I seem to increasingly not want advice or worldview discussion or shared activities. (It is possible some of this varies with person or context. I don't yet have very deep understanding of my own preferences for conversation modes.)

Seen mode
- Objective of the conversation is to help the other person feel seen, heard and understood. This might (or might not) be possible even if you think their choices are morally wrong, or you think their choices are going to lead them to be less happy/fulfilled/etc or to regret their choices.
- Allow them to speak in monologue more, without sharing a lot of your own comments or sharing similar relatable experiences.

Advice mode - tactics
- Many people are open to advice on tactics. Share advice as soon as you notice the other person making suboptimal moves, pointing out what they could've done better.
- Sometimes they might be facing significant difficulties in life, and they might especially want advice on tactics.
Advice mode - strategy
- If you give advice either on a person's core values, or on the core strategy they've invested in for many years of their life, there is a significant probability they don't receive it well. This advice mode means sharing advice as soon as you notice the other person either has morally incorrect values as per you, or is using a majorly suboptimal strategy for life.
- The other person may be more likely to receive this advice well if they also have an existing circle of people who do make them feel seen/heard/understood/etc.
- This can require a significant time investment to do well.

Worldview discussion mode
- "Advice" is most relevant to immediate actions the other person is taking in life. Discussion of worldview is more broad. Discussing worldviews is easier to do if the worldview updates the other person makes don't, as per them, require them to take majorly different actions in life.
- This can require a significant time investment to do well. Especially if discussing more fundamental aspects of their worldview.

Shared activity mode
- It may be possible to help someone feel seen/heard/etc by doing shared activities with them. Especially if they've already shared important information about their values, strategy, etc with you. Doing shared activities is a signal that you approve of their existence regardless.

Intimacy mode
- Has similar benefits to shared activity mode.
- Also has additional benefits. (Not discussing here.)

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/personal/my_people.md 2024-06-13

# My people

Until recently I modelled my relationships primarily in terms of commitments. Now however I think this model is incomplete, and there should also be a model for "relationships as shared values/beliefs". Shared values/beliefs make it easier to do multiple interactions over time. The model below only talks about isolated interactions.
Also "relationships as shared activities" and "relationships as emotional support" can be expanded more in depth. I will update this document when I have more clarity on all this.

#### "Relationships as commitments" model

Reasons to have people in your life:
- personalised advice or knowledge transfer
  - one-time or emergency (a)
  - regular (a?A?)
- shared activities
  - one-time (a)
  - regular (A)
- physical intimacy
  - one-time (pa)
  - regular (PA)
- emotional support
  - emergency/one-time (pa)
  - regular (PA)
- financial support
  - emergency/one-time (c)
  - regular (C)
- logistical support
  - emergency/one-time (p?a?)
  - regular (P?A?)

Heuristics
- Scarce resources include attention/time (labelled A), capital (labelled C) and place (labelled P).
- Place
  - Regular emotional support and physical intimacy are the most dependent on living in the same place as the other person.
  - Logistical support may or may not be dependent on living in the same place as the other person.
- Attention/time
  - Usually if someone requires significant emotional support, they need it regularly over some time period instead of one-time only. Exceptions exist.
  - Regular shared activities require significant time investment.
  - Regular emotional support and physical intimacy require significant time investment.
  - Logistical support may or may not require significant time investment.
  - Personalised advice can generally be provided quickly, but teaching someone a new skill or worldview can require significant time investment. If the advice or knowledge is not personalised, it can usually be acquired from the internet without consuming someone else's attention.
- Capital
  - Providing an investment or loan requires a certain amount of trust in the individual or the system around them that protects this investment or loan. Trust levels and systems vary depending on country and socioeconomic class.
  - Providing a donation to someone does not require the same type of trust, but is obviously more costly.
  - Capital can be used to purchase many types of logistical support.
- Not scarce
  - Receiving and giving advice, and one-off casual interactions, do not require a lot of time investment.

What I can offer to others as of 2025-05
- Place-scarce
  - I have not yet finalised a long-term location to live in for many years.
- Attention-scarce
  - I am careful about promising regular emotional support or intimacy to anyone. I might offer this to someone depending on the person and circumstances.
  - I have promised emergency emotional and logistical support to a set of people I consider close. I am open to expanding this set to include more people.
    - I'll provide a 6-month notice period if I ever want to break these commitments.
    - I'm hoping most of these commitments last multiple decades.
    - (If you're unsure if you're on this list, please message me and ask.)
  - I'm unlikely to offer regular logistical support to anyone. Exceptions may exist.
- Capital-scarce
  - I am unlikely to offer regular financial support to anyone who asks. Exceptions may exist. (I might hire someone, but I consider that different as I'm getting something in return.)
  - I might offer one-time financial support to someone depending on the person and circumstances.
    - I have done this before.
    - I am open to requests for the same. If you are asking me for financial support, it will help your case if you have:
      - a good reputation verifiable by your social circle, and they are informed about the support you've received from me
      - what a bank would consider credit-worthy, such as a source of income or collateral
- Not scarce
  - I can offer advice to a wide variety of people during my lifetime. (Whether my advice is any good is a harder question; I'd like to atleast offer average advice to most people and great advice to some people.)
  - I can have one-time (or few-times) casual interactions with a wide variety of people during my lifetime.
What I want/need from others as of 2025-05
- Place-scarce
  - I have not yet finalised a long-term location to live in for many years.
- Attention-scarce
  - I am unsure about what types of emotional support I want from others. I am still in an exploration phase.
  - I am open to finding a long-term partner for intimacy. See my [date me](./date_me.md) document for more.
  - I am trusting a handful of people I know to offer emergency logistical support when I need it. Having more such allies couldn't hurt, including more trustworthy and more capable ones.
  - I am unsure which shared activities I would like to regularly engage in. I am still in an exploration phase.
- Capital-scarce
  - I am trusting a handful of people I know to offer emergency financial support. I think there is very low probability that I will actually end up using this.
  - I want donations for projects I want to work on. See my [donate to me](./donate_to_me_asi_leaks.md) document for more.
  - I can be hired for a big enough salary. See my [hire me](./hire_me.md) document for more.
- Not scarce
  - I want to get advice from a wide variety of people. I am likely to ignore a lot of advice though. I especially want trustworthy advice from people a) with similar life goals as me, and b) with empirical data on outcomes of people with similar life goals as me.
  - I want to have more one-time (or few-times) casual interactions with a wide variety of people in my life.

Ask don't guess
- If you want to request anything from me, feel free to ask. You don't have to guess and pre-emptively reject yourself; it is much better to ask.
- I am comfortable rejecting requests I cannot fulfill, and I will generally try not to make you feel uncomfortable for making very personal requests. (If you're still concerned, send me an anonymous email.)
- I generally think of myself as a helpful person. As long as you're not asking me for a large amount of a scarce resource (like time, capital, etc) there's a high probability I will help you.

Who?
- If we share topics of curiosity or similar ways of thinking on a day-to-day basis, then too I am likely to want to spend time interacting with you.
- I value novelty and ideological diversity. If I find novelty in my interaction with you, it likely means either you have deviated from social norms in some way, or you belong to a group whose social norms differ from the groups I have spent time in. If either of these is true about you, I am likely to find you interesting.
- Getting into my inner circle (to receive significant support from me) can usually be done by fitting into one of the above clusters plus having a significant number of shared experiences with me. There may be other ways to do it; I don't have a deep understanding of why I find people interesting or worth spending time with.

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/courses_for_you/software_course.md 2025-07-11

# Software course

Haven't made a software course as it's not a priority.

w3schools has good intro courses.

Topics to study
- Intro to programming
  - basic concepts like data types, loops, arithmetic operations, functions
- Data structures and algorithms
  - big O complexity, sorting algos, so called "dynamic programming", graph algos (dijkstra, floyd warshall, etc - used more in job interviews than jobs), etc
- Specific platforms for specific tasks
  - nodejs/ruby on rails/etc for web backend, react for web frontend, flutter/java for ios/android app dev, etc
- Low-level software and hardware optimisation
  - operating systems, assembly programming, networking, memory management, database management etc
  - this is where many of the high-paying jobs are
- Niches
  - game dev, etc

I recommend starting with intro to programming in C and python. Then you can choose between either doing data structures and algos, or building a specific website/app/whatever depending on your interest.

Ask a friend to install libraries and compilers for you.
This is very much not beginner friendly. For a beginner, writing code is much easier than installing stuff.

AI is pretty good at doubt clearing now, make extensive use of AI.

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/courses_for_you/finance_course.md 2025-07-11

# Finance course

Haven't made a proper course as it's not an immediate priority.

Topics I strongly recommend to almost everyone
- Intro to asset classes
  - equity, bonds, crypto, gold and other commodities, real estate
- Intro to derivatives
  - margin trading, quarterly/monthly futures, perpetual futures (popular in crypto), european and american options
- Intro to risk
  - time value of money
  - expected value, variance, sharpe ratio
    - Maximising expected value is not enough, you want to maximise expected value given a certain variance, tail risk and time horizon. Two opportunities may have the same expected value but one may be way better than the other.
  - tail risk, log utility, kelly criterion
    - Variance and tail risk are two different risks, don't mix them up. Calculate separate values for both.
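As a toy illustration of the kelly criterion bullet above (my own sketch, not from any course material; `kelly_fraction` is a name I made up): for a binary bet you win with probability p, where a win pays b times your stake net, the kelly-optimal fraction of bankroll to bet is f* = p - (1-p)/b.

```python
def kelly_fraction(p_win: float, net_odds: float) -> float:
    """Kelly-optimal fraction of bankroll for a binary bet.

    p_win: probability of winning
    net_odds: net payout per unit staked on a win (the "b" in f* = p - (1-p)/b)
    A result <= 0 means you have no edge and should not bet.
    """
    return p_win - (1.0 - p_win) / net_odds

# 60% chance to win an even-money bet (b = 1):
print(kelly_fraction(0.6, 1.0))  # 0.2 -> bet 20% of bankroll

# 50% chance at even money has zero edge:
print(kelly_fraction(0.5, 1.0))  # 0.0 -> don't bet
```

Note how this maximises log utility of wealth, not expected value: betting your whole bankroll at 60/40 even-money odds has higher expected value but eventually ruins you.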
- Trading in practice
  - understanding orderbook, limit/market orders, stoploss, etc
- Effects of large portfolio size (> $10M)
  - absorb more tail risk (due to log utility)
  - access illiquid but big opportunities
  - pay full-time employees to do analysis for you, thereby increasing expected value of bets
  - illiquid small opportunities are no longer worth your time
    - (if you are trading personal funds, you have comparative advantage in illiquid markets where big players don't bet)

Topics I weakly recommend (depends on your situation and your interests)
- Financial theory
  - Markowitz optimisation, formal proofs surrounding EMH, long-term versus short-term
- Fundamental analysis
  - EBITDA, balance sheet analysis
- Options trading
  - brownian motion, black scholes model, black scholes as applied in practice
- Liquidation process

Topics I recommend you avoid
- "Technical analysis"
  - it is mostly a scam

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/courses_for_you/ai_course_120min_202506.md 2025-07-02

# Intro to ML, intro to LLM

2-hour talk, for Ooty retreat 2025-06-26, Samuel Shadrach

## Pre-requisites

Matrix multiplication, differential calculus, computer programming

```
Find C = A @ B.T
@ means matrix multiplication, .T means transpose

    [1  -1  3]
A = [2   3  3]
    [-2  0  2]

    [7  -1  4]
B = [-2  0  0]
    [0  -3 -1]

z = y^2 - (sin(theta))^2 + 2 y cos(theta) + 1
Find partial derivatives dz/dy and dz/d(theta)

Write a function in C to sort an array of N integers in time complexity O(N log N). Do not use built-in libraries for sorting. Accept N as input, then accept the integers as input.

Definition of time complexity: Suppose we plot time required to sort an array as a function of N. Let's call this T(N). Time complexity of O(N log N) means T(N) < c N log N for some constant c and all possible N.
```

If you can solve all the above questions, you have covered the pre-reqs.
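If you want to check your answer to the matrix exercise, here is a quick pure-python check (my own addition, not part of the talk; `matmul_transpose` is a made-up helper name):

```python
def matmul_transpose(A, B):
    # C = A @ B.T, i.e. C[i][j] = dot product of row i of A and row j of B
    return [[sum(a * b for a, b in zip(row_a, row_b)) for row_b in B]
            for row_a in A]

A = [[1, -1, 3], [2, 3, 3], [-2, 0, 2]]
B = [[7, -1, 4], [-2, 0, 0], [0, -3, -1]]
print(matmul_transpose(A, B))
# [[20, -2, 0], [23, -4, -12], [-6, 4, -2]]
```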
## Two-layer fully-connected network, trained on MNIST

Resources
- [MNIST dataset](https://github.com/cvdfoundation/mnist)

problem statement
- let's say we have 60000 photographs, each 28x28 pixels, grayscale (pixel value between 0 and 1)
- each of these is already classified into one of ten digits: 0,1,2,...,9
- we want a program that can quickly classify as many of these as possible
- we cannot hard-code the final answers into the final program, because we want this program to also work well on new images we have never seen.
- there are 10000 photographs we have not seen, our final score will be tested on these.

solution: define the following function using constant weight matrices W1 and W2

forward pass
```
Y = ReLU( ReLU(X @ W_1) @ W_2 )
loss = - sum (Y dot Y')
```

definition of ReLU

`ReLU(M)_ij = if M_ij > 0, then M_ij, else 0`

Y contains the prediction (stored as log probabilities), Y' contains the actual answer

example (assume N=1 image for now)
```
Y  = [ln(0.15) ln(0.75) ln(0.15) ln(0.05) ln(0) ln(0) ln(0) ln(0) ln(0) ln(0)]
   = [-1.89 -0.28 -1.89 -3.00 -inf -inf -inf -inf -inf -inf]
Y' = [0 1 0 0 0 0 0 0 0 0]
loss = -ln(0.75) = 0.28
```

dimensions
```
X: (N,D)
W_1: (D,E)
W_2: (E,C)
=> Y: (N,C)
```

example dimensions
```
X: (60000, 28*28) = (60000, 784)
W_1: (784, 800)
W_2: (800, 10)
=> Y: (60000, 10)
```

Objective: find W1 and W2, so that as many images get classified into correct classes as possible

#### training loop

How do we find W1 and W2 fast?
- W1 has 627k values, W2 has 8k values
- even if each cell must be 0 or 1, that's `2^(627k + 8k) = 2^(635k)` possibilities
- any sort of brute force or iteration is too slow

gradient descent
- given any weight matrices W1 and W2, we will find new W1 and W2 that are slightly better

find gradients of loss with respect to weight matrices
```
dL / dW2 = ???
dL / dW2_ij = ???
```

visualise it
```
L = - | { ReLu { ReLu { [X_00 ... ]  [W1_00 .... ] }  [W2_00 .... ] } } dot Y' |
      | { {      {      [... X_ND]   [.... W1_DE] }   [.... W2_EC] } }         |
```

how to find dL/dW1_00? and repeat for all values in W1?

rewrite it
```
H = f1(X,W1)
Y = f2(H,W2)
L = f0(Y,Y')
```

remember we are finding partial derivatives. all values are assumed constant, except the value with respect to which we are finding the derivative
```
dL / dW2_ij = df0/dY * df2/dW2_ij = ...
dL / dW1_ij = df0/dY * df2/dH * dH/dW1_ij = ...
```

final answer, copy-pasted from o3, might contain hallucinations
```
import numpy as np

# forward
A1 = X @ W1
H = np.maximum(0, A1)
A2 = H @ W2
Y = np.maximum(0, A2)
loss = -(Y * Y_true).sum()

# backward
grad_Y = -Y_true
mask2 = (A2 > 0).astype(float)
delta2 = grad_Y * mask2               # dL/dA2
grad_W2 = H.T @ delta2                # dL/dW2
delta1 = (delta2 @ W2.T) * (A1 > 0)   # dL/dA1
grad_W1 = X.T @ delta1                # dL/dW1
```

conclusion
- we have found matrices `dL/dW1` and `dL/dW2`, for given constant values of `X, Y', W1, W2`
- `(W1 - e * dL/dW1, W2 - e * dL/dW2)` will have slightly less loss than `(W1, W2)`
- repeat this process millions of times to get very good `(W1, W2)`
- this will run fast on a GPU (that's why we defined the network this way)
  - only intensive operation is matrix multiplication
  - no copy-paste operation
  - discard intermediate values on each iteration, and update W1 and W2
- p.s. in practice we will split our dataset into batches to do batch SGD, use an optimiser such as Adam, and use cross-entropy loss. all this is not important for now.

#### optional homework

- derive the backprop formula above
  - figure out notation so that this calculation becomes easy to do
  - I have heard that einstein notation makes this result easier to derive, I have not practised it myself though
- using pytorch, actually train a two-layer fully connected network on the MNIST 60k examples training dataset, obtain 1.6% error on the test dataset
  - in modern ML, ML engineers usually rely on pytorch autograd to compute gradient formulas, so they don't have to calculate them by hand.
  - for this homework assignment, don't use pytorch autograd.
hard-code the gradients specified above.

## deep learning is 15 years of accumulated blackbox tricks

- question: ok but why did we use a two-layer fully connected network in the first place?
- answer: [click here](https://www.youtube.com/watch?v=XD68yeBVXgA)
- we almost never have mathematical proof for why doing anything is a good idea
  - why is RMS norm a good idea? why is softmax a good idea? why is ReLu a good idea?
  - why did we define Q, K, V this way in attention blocks? why did we use 48 layers not 24?
  - why did we use a residual layer? why did we use cosine for positional embeddings?
- we do things if we have empirical evidence they worked before, and outperformed similar ideas
  - but: most ideas have not been tried yet. so we don't know if an idea actually outperforms all similar ideas, just the ones we have tried so far.
- there is a graveyard of old approaches
  - Older architectures: RNN, CNN, LSTM
    - Today: Attention dominates everything
  - Older activation functions: Sigmoid, tanh, GELU
    - Today: ReLu dominates everything
  - Older loss functions: Mean square loss, hinge loss, logistic loss
    - Today: Cross-entropy loss dominates everything
  - Older optimisers: vanilla SGD, RMS prop
    - Today: AdamW (Adam with weight decay) dominates everything
  - Older ideas that nobody remembers: dropout, L1 regularisation, etc
  - New ideas that might or might not become old one day: Mixture of experts, temperature tuning for long context, etc
- we have intuitions for why we are doing what we are doing
  - but: intuitions are often retroactively justified, after we have empirical evidence something worked. no one publishes intuitions for failed ideas.
  - but: intuitions often come just by looking at hundreds of training runs with slightly different networks. today only big labs can afford to do these many runs at large scale, so only their researchers can build these intuitions.
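Tying the MNIST section above together, here is a minimal numpy sketch of a few full gradient-descent iterations: the forward pass, the backward formulas quoted earlier, and the weight update. The random inputs, batch size and learning rate are made-up stand-ins for illustration; shapes follow the lecture.

```python
import numpy as np

# Toy sketch of the training loop for the two-layer network described above.
# Shapes follow the lecture: X (N, D), W1 (D, E), W2 (E, C).
rng = np.random.default_rng(0)
N, D, E, C = 4, 784, 800, 10                     # tiny batch, for illustration
X = rng.random((N, D))                           # stand-in for MNIST pixels
Y_true = np.eye(C)[rng.integers(0, C, size=N)]   # one-hot stand-in labels
W1 = rng.standard_normal((D, E)) * 0.01
W2 = rng.standard_normal((E, C)) * 0.01

lr = 1e-3  # the "e" in (W1 - e * dL/dW1, W2 - e * dL/dW2)
for step in range(3):
    # forward pass: Y = ReLU(ReLU(X @ W1) @ W2), loss = -sum(Y dot Y')
    A1 = X @ W1
    H = np.maximum(0, A1)
    A2 = H @ W2
    Y = np.maximum(0, A2)
    loss = -(Y * Y_true).sum()

    # backward pass: the gradient formulas quoted earlier
    grad_Y = -Y_true
    delta2 = grad_Y * (A2 > 0)            # dL/dA2
    grad_W2 = H.T @ delta2                # dL/dW2
    delta1 = (delta2 @ W2.T) * (A1 > 0)   # dL/dA1
    grad_W1 = X.T @ delta1                # dL/dW1

    # gradient descent step: move slightly against the gradient
    W1 -= lr * grad_W1
    W2 -= lr * grad_W2
```

In a real run you would loop over the whole 60k dataset in batches and, as the lecture notes, use an optimiser like Adam and cross-entropy loss instead.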
## Transformer

Resources
- Justin Johnson, University of Michigan, Deep Learning for Computer Vision
  - [Lecture 13: Attention](https://www.youtube.com/watch?v=YAgjfMR9R_M&list=PL5-TkQAfAZFbzxjBHtzdVCWE0Zbhomg7r&index=13)
- Facebook, Llama 4 (launched 2025-04)
  - [Code and model card](https://github.com/meta-llama/llama-models/blob/main/models/llama4/), [Try on OpenRouter](https://openrouter.ai/meta-llama/llama-4-maverick)

Ask o3 this question along with the code: `make a list of all the steps in the following forward pass in plain english`

- Input: Sequence of tokens
- Output: Logits (log probabilities) for next token

Forward pass
- Tokenise
- Add positional embeddings
- Fuse image embeddings (optional)
- N transformer blocks (let's say N=80)
  - RMS Norm
  - Multi-headed Attention
    - **Projection:** `X -> Q_1, K_1, V_1`
    - Reshape: `{Q,K,V}_1 -> {Q,K,V}_2`
    - Rotary embedding (optional): `{Q,K}_2 -> {Q,K}_3`
    - Head-wise RMS Norm (optional): `{Q,K}_3 -> {Q,K}_4`
    - Temperature tuning (optional, for long context): `Q_4 -> Q_TEMP_TUNED`
    - Duplicate KV cache (for faster computation): `K_4, V_1 -> K_DUPLICATED, V_DUPLICATED`
    - **Scaled dot product attention (torch.nn.functional.scaled_dot_product_attention):** `Q_TEMP_TUNED, K_DUPLICATED, V_DUPLICATED -> WO`
    - Projection: `WO -> OUT`
  - Add residual
  - RMS Norm
  - Feed forward / Mixture of experts (`ffn.py`)
    - Either expert gating network
    - Or 2-layer fully connected network with SiLU activation function
  - Add residual
- Projection

Training loop
- Given log probabilities of the next token and the actual next token, compute cross-entropy loss
- Compute gradients of all the weight matrices with respect to cross-entropy loss
- Use AdamW optimiser to do gradient descent
- Spend $100 billion per year mostly on one single training run ([yes, really](https://www.youtube.com/watch?v=GhIJs4zbH0o))
  - *Note: To be technically accurate, this repo is for Llama 4 Scout which only cost ~$10M capex to train (5M hours on H100).
State-of-the-art models like Llama 4 Behemoth, GPT4.5, grok-3 likely cost $1-10B capex and used similar architecture but not this exact repo.*

#### important steps in attention block

Projection
```
Q = X @ W_Q
K = X @ W_K
V = X @ W_V
```

Scaled dot product attention

`Y = softmax[ (Q @ K_T + mask) / sqrt(|Q|) ] @ V`

Softmax normalisation
- replace each cell with e to the power of that cell
- divide each cell by the sum of its row

`softmax(M)_ij = e^M_ij / sum_j (e^M_ij)`

example
```
softmax { [1  2] } = [ e/(e+e^2)      e^2/(e+e^2)     ] = [0.268 0.731]
        { [3 -1] }   [ e^3/(e^3+1/e)  (1/e)/(e^3+1/e) ]   [0.982 0.018]
```

Mask
- this is the data we are going to pretend we don't have access to, and have the model try to predict it

Intuition: How does this work?
- softmax: largest values in a row output approx 1, smallest values in a row output approx 0
- mask: some values get hard-coded to 0
- `softmax(something) @ V`
  - `softmax(something)` is a matrix with most values close to 0 and 1, this tells us which values in `V` to ignore versus not ignore
- **We are hiding some parts of X from ourselves (using mask), paying more attention to other parts of X (using softmax Q K_T), and then finding optimal weight matrices to predict the parts of X that we hid**

Intuition: Why does this outperform other known techniques?
- Sequential versus parallel
  - The naive way of formulating next-token prediction is as a sequential problem. Attention masks parallelise this.
  - RNNs do next-token prediction but their forward pass predicts tokens sequentially.
    - This means gradient descent needs to happen across many serial steps. GPUs are good for training parallel stuff not sequential stuff.
    - Vanishing gradients problem when doing gradient descent across many sequential steps.
- Pay attention to what?
  - CNNs maintain hard-coded sliding windows of which tokens to pay attention to.
  - LSTMs and RNNs maintain a shared context that is reused for many tokens.
  - The attention layer can look at the tokens and use the tokens themselves to compute which tokens to pay attention to. (Remember `K,Q,V` are all a function of `X`)

#### misc stuff in transformers

tokenisation
- create a vocabulary of ~100k most commonly used words and phrases
- convert input into 1-hot encoding using this
- basically "hello" will become [0 0 0 0 ....maybe 25k entries .... 0 0 1 0 0 ... maybe 75k entries ... 0]
- why?
  - deep learning works on math, not words
  - english has fewer than 100k commonly used words, so we can hard-code this into the model instead of training the model to figure it out on its own.
  - why not train it? don't know for sure

positional embeddings
- calculate some cosine thing of the token, and attach it to the token
- ensures each token also now stores data about which position it is in. imagine the new input is: `my one name two is three alice four`
- why?
  - don't know for sure
  - intuition: maybe humans speak differently at the start of a paragraph versus in between. so it's helpful to always remember where you are.

RMS norm
- divide each cell by the root mean square of all cells in that row
- keeps values at a consistent scale (the root mean square of each row becomes 1)
- why?
  - don't know for sure
  - intuition: gradient descent is more well-behaved when values stay at a consistent scale

Residual layer
- let's say we did some stuff: `Y = f(X)`
- adding residual just means adding the input back in: `Y_with_residual = Y + X = f(X) + X`
- why?
  - don't know for sure
  - intuition: "exploding and vanishing gradients" - sometimes if you do gradient descent on weight matrices across so many layers you get gradients that approach zero or infinity

Mixture of experts
- Not covering in this lecture

N layers put together
- In total there are N sequential layers of transformer blocks. So we are doing gradient descent across N layers to find optimal weight matrices in each layer.
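The projection, softmax and scaled dot product attention formulas above can be sketched in a few lines of numpy. This is a toy single-head illustration I wrote for this note, not the Llama code; the weights and dimensions are made up.

```python
import numpy as np

def softmax(M):
    # replace each cell with e^cell, then divide each cell by the sum of its row
    # (subtracting the row max first keeps the exponentials from overflowing)
    e = np.exp(M - M.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(X, W_Q, W_K, W_V):
    # Projection: Q, K, V are all computed from X itself
    Q, K, V = X @ W_Q, X @ W_K, X @ W_V
    T, d_k = Q.shape
    # causal mask: token i is only allowed to attend to tokens 0..i,
    # the -inf entries become 0 after softmax
    mask = np.where(np.tril(np.ones((T, T))) == 1, 0.0, -np.inf)
    return softmax((Q @ K.T + mask) / np.sqrt(d_k)) @ V

# toy example: 5 tokens, embedding dim 8, head dim 4
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 8))
W_Q, W_K, W_V = rng.standard_normal((3, 8, 4))
out = attention(X, W_Q, W_K, W_V)
print(out.shape)  # (5, 4)
```

You can check `softmax` against the worked example above: `softmax([[1, 2], [3, -1]])` gives approximately `[[0.269, 0.731], [0.982, 0.018]]`.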
- Intuition for N layers: pay attention to nearby tokens, then not so nearby, then not so nearby

#### typical hyperparams

- number of layers
  - depends on model size
  - typically 32 to 128 layers
- number of params
  - depends on model size
  - GPT2 XL (2019): 1.5B params
  - GPT3 (2020): 175B params
    - GPT3.5 based on GPT3
  - GPT4 (2023): rumoured 1.8T params
    - o1, o3 based on GPT4
  - GPT4.5 (2025): rumoured 12T params, of which 1T active params (mixture of experts)
- bytes per param
  - training
    - typically float32 (4 bytes per weight)
    - mixed precision training is recent, for example deepseek
  - inference
    - quantisation works well: fp16, int8, int4, 1.58-bit
- model size
  - model size in bytes = number of params * bytes per param

How to pick the number of params when training a new model
- depends on data and compute available, versus data and compute required
  - data and compute required is calculated using the chinchilla scaling law
  - typically compute has been the bottleneck, not data
    - epoch.ai forecasts running out of internet data in 2028
- data required per param
  - (not very good) rule of thumb: 20 tokens/param * number of params
  - typically trained for one epoch - gradient descent is done on every token exactly once
- compute required
  - (not very good) rule of thumb: 6 FLOP/param/token * number of params * number of tokens

More resources
- [chinchilla scaling law](https://en.wikipedia.org/wiki/Neural_scaling_law#Chinchilla_scaling_(Hoffmann,_et_al,_2022))
- [Epoch.ai - past data and forecasts of FLOP, tokens, params](https://epoch.ai/trends)

---

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/my_research/related_quick_notes/distributed_hosting_of_leaked_documents.md 2025-06-14

# Distributed hosting of leaked documents

#### Disclaimer

- Quick note
- Might update quickly based on new info
- Contains politically sensitive info

## Update

- **In the specific context of AI whistleblowers, I now consider most of these ideas of the original document low
priority. Please go see the other document instead.**
- **This document is more of a description of a platonic ideal world than something that I will build immediately. When I first wrote this I was thinking mostly of theory, not practice.**
- Proposed solutions from highest to lowest priority
  - Highest priority: redaction guidelines, persuade more people to become operators of a SecureDrop-like system
  - Standardise a format for PoW hashes attached to files, and standardise a format to attach these hashes to http headers. May need to be spearheaded by a crawler or search engine or publisher in order to get mainstream adoption.
    - Alternatively, wait 5-10 years so that it is cheap enough for individuals to crawl and archive the entire internet. Then there may be better solutions than PoW hashes.
  - Run more open source crawlers (like internet archive, commoncrawl etc) in states outside of US geopolitical influence
  - Open source file format conversion, embedding search
  - Make a list of the most popular youtube/twitter/etc channels and hard-to-censor social media platforms in each country
  - Don't know how to do right now: incentives for a multi-hop dead drop system
  - Don't know how to do right now: reduce costs of video CDNs, where videos are banned in some nuclear states and not others
- This document assumes 3 stages of increasing attention on the documents. In practice, rarely are all 3 stages needed.
  - In practice, sometimes someone powerful wants to suppress information and many others want to read it. In this case, one can skip the medium-attention stage and directly go to the high-attention stage.
    - Example: Many high-profile leaks
  - In practice, sometimes no one powerful wants to suppress the information. In this case, one can skip the low-attention stage and directly go to the medium-attention stage.
    - Example: Most information currently on the internet
  - In practice, sometimes someone powerful wants to suppress information but it is not immediately obvious who wants to read it or how to reach them. This is the case where the 3 stages are most likely to be of use. I have to research more about this.
    - Example: whistleblowers who are ignored at first and taken seriously later; crime-related evidence that the offender wants to suppress but the general public does not care about.

## Main

What?

- This document describes how to set up distributed hosting and transmission of documents leaked by whistleblowers, in a way that reduces personal risk for everyone involved in the process.
- (If you squint hard, this document is also a blueprint for an internet with no delete button, and by extension a society with no delete button. Once some information has been leaked it stays in public view forever.)

Why?

- Typically whistleblowing (such as with wikileaks or snowden leaks) incurs significant personal risk.
- Reducing personal risk to whistleblowers may ensure whistleblowing is highly likely to happen when an org doesn't have complete trust of all its members, forcing them to pay a secrecy tax (in Assange's words).
- I have my own personal viewpoint around which orgs I'd like to most enable whistleblowing on, although this will be general-purpose infra that can be used by anyone to whistleblow on any org.

#### Summary

- do SecureDrop / Signal but with increased security and >1000 servers all run by independent actors, and multiple independent dev teams
- do Internet Archive / CommonCrawl but also crawl rate-limited/banned stuff (like leaked/banned/copyrighted documents, and social media websites), also do >1000 crawls all run by independent actors. also some of these actors share the LLM embeddings.

#### Potential problems with distributing leaked documents

- Low-attention on the documents. Military-grade security.
Documents circulated by people with technical skills who are willing to run servers and maintain opsec as a part-time job.
  - (If nobody is trying to suppress the publishing of these documents, one can skip this stage and directly post to clearnet. See next bullet point for this.)
  - Ideally thousands of server operators exist. Some of them can choose for themselves the special roles "redaction specialist" and "publisher". They can use a public track record to prove to the whistleblower and other operators that they can be trusted with this role.
  - Whistleblower sends documents to an operator via SecureDrop or a similar system, or via hard disk dead drop. If redaction is required and they can't do it themselves for whatever reason, they send it to an operator who is a "redaction specialist" and has a good reputation.
    - IMO PGP + airgap + dead drop may offer more privacy than PGP + airgap + Tor http request, as of 2025. This is my personal bias and could change in future if physical-world data acquisition (DAQ) increases (cctv, drones, gigapixel cameras on aircraft).
    - I'm not very happy with some of the design choices made by SecureDrop. I'm looking into alternate solutions. It's possible I don't understand all of their choices. I have written a proper criticism of SecureDrop in a separate document.
    - PROBLEM: convince thousands of people to become operators of SecureDrop or a similar system **(most important)**
    - PROBLEM: good infra, protocols and incentives to coordinate dead drops don't exist. Especially true if crossing a large geographic distance and multiple hops are required.
  - This operator does redaction of any sensitive metadata or information, if required. They perform another hop here and send the documents to many other operators in the network using the same system.
    - PROBLEM: need public guidelines on redaction, so anyone can do it. This ideally ensures there are thousands of potential operators right from the start.
  - If any operator thinks the documents are not spam, they can attach a proof-of-work hash and resend them to many operators in the network using the same system.
    - PROBLEM: need a standard protocol for proof-of-work hashes. These could be static strings attached to documents, or generated at request-response time. (Tor, Brave and Proton all have separate implementations, and they're all low difficulty hashes.)
  - Eventually one of the operators who is a "publisher" hosts the documents on a clearnet webserver for the public. This operator also posts a link to this webserver on a hard-to-censor social media platform such as 4chan or rumble.
    - PROBLEM: need guidelines for what the hard-to-censor social media platforms in each country are.
- Medium-attention on the documents. Low security. Documents circulated by people with technical skills but not much free time.
  - (If the documents are sufficiently important, a popular media org can publish them on their server, allowing the documents to skip this stage and directly go to high-attention.)
  - Mirror a searchable version of the docs to thousands of servers immediately
    - It is important that automated mirroring happens **before** any humans read the content on the operator's clearnet server. Whoever first posts the document to clearnet is an obvious target for anyone who wants to take the documents down.
    - Several mutually incompatible protocols exist for mirroring specific information. Example: torrent, ethereum blobdata, IPFS/filecoin, archive blockchain.
    - Protocols to crawl and mirror the entire internet are still in development though. Example: WARC format, Apache Hadoop, the Internet Archive's crawlers, CommonCrawl's crawlers
  - PROBLEM: need an open source web crawler to crawl the entire internet including any leaked docs/videos, and torrent links containing leaked docs/videos.
  - OR: PROBLEM: need a standard protocol to only crawl websites and torrents that claim to have leaked docs on them (maybe they include a special flag in their readme/robots.txt, and a proof-of-work hash to prove they are not spam).
  - PROBLEM: need open source plaintext extraction and embedding generation so that along with the raw html crawls (WARC), the plaintext and embeddings are also circulated in the same torrent. Need a standardised format (WARC-parquet?) that keeps some metadata, just like WARC keeps metadata.
- High-attention on the documents. No security. Documents circulated by anyone.
  - A popular media house publishes it to increase public attention
    - Popular media house will do document verification. I'm assuming they won't face any significant challenge with this. May require metadata of the documents (how to get this??) or contacting the org whose docs got leaked.
    - Popular media house will use the embedding search functionality already provided, to figure out what is important to raise attention for.
  - High-attention hard-to-censor social media to discuss the document in general public
    - PROBLEM: need open source crawling and mirroring of crawls of all social media
    - I think actually doing distributed social media is too hard. Complexity of the app ensures the software developers who write the app are politically co-optable. What's easier to do is distributed crawling and mirroring of a centralised site, so people in future can still view the consensus reached by users of the social media. If it ever gets taken down, someone can get a new server running (it does not have to have the content of the old one).
    - Which social media are high attention and hard-to-censor varies by country.
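The static proof-of-work strings mentioned above could work roughly like hashcash: find a nonce so that the hash of (document digest + nonce) clears a difficulty threshold. A minimal sketch, assuming a digest-plus-nonce format and a difficulty measured in leading zero bits (both are my illustrative assumptions, not a proposed standard):

```python
import hashlib
from itertools import count

def attach_pow(doc: bytes, difficulty_bits: int = 20) -> int:
    """Find a nonce such that sha256(sha256(doc) || nonce) has at least
    `difficulty_bits` leading zero bits. The (doc digest, nonce) pair is
    the static proof-of-work string an operator would attach to a file."""
    digest = hashlib.sha256(doc).digest()
    target = 1 << (256 - difficulty_bits)
    for nonce in count():
        h = hashlib.sha256(digest + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(h, "big") < target:
            return nonce

def verify_pow(doc: bytes, nonce: int, difficulty_bits: int = 20) -> bool:
    """Verification is a single hash, so operators can filter spam cheaply
    while attaching the PoW remains expensive for the sender."""
    digest = hashlib.sha256(doc).digest()
    h = hashlib.sha256(digest + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(h, "big") < (1 << (256 - difficulty_bits))
```

The asymmetry (expensive to attach, one hash to verify) is what makes this usable as an anti-spam signal between operators; the actual difficulty would need tuning, since the deployed systems mentioned above (Tor, Brave, Proton) all use low difficulties.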
#### Summary of potential solutions

- persuade thousands of people to become operators of SecureDrop or a similar system **(most important)**
- coordination for hard disk dead drops, including multi-hop hard disk dead drops
- proof-of-work hashes to prevent spam on the operators
- redaction guidelines
- open source web crawling
  - flags and proof-of-work to only crawl some websites
  - crawl and mirror leaked docs. crawl and mirror social media discussions.
- open source plaintext extraction, embedding generation
  - standardise a format to share extracted plaintext and embeddings
- guidelines for latest hard-to-censor high-attention social media
  - to publish torrent link, maybe raw docs, and social media discussions
  - guidelines must be country-wise and include legal considerations. always use a social media of a country different from the country where the leak happened.

IMPORTANT: Need feedback from people who have actually worked with whistleblowers, to validate all hypotheses listed above.

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/my_research/related_quick_notes/internet_spam.md

2025-06-18

# Internet spam

Disclaimer

- Quick note

## Summary

Spam prevention

- To ensure the sender is spending more capital (on computer resources) than the receiver, you can either ask them to complete a proof-of-work challenge or send a junk payload that spends their upload bandwidth.
- To ensure the sender is spending more attention (as a human being) than the receiver, you can ask them to verify an ID, or pay you money, or verify that other people are already paying attention to them.
  - ID verification can be done open source using videos uploaded online, or it can be done using government issued IDs and phone numbers
  - Payment can be done using cryptocurrency or credit card
  - Proving social status can be done using social media following (which can in theory be open source) or using legacy markers such as citation counts or linkedin profiles.
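The "sender spends more capital than the receiver" idea above can be turned into a small break-even calculation. A sketch with illustrative prices (the transit and disk prices are assumptions, in the same ballpark as the ones worked through later in this note):

```python
# Break-even model for anonymous comment spam. All prices are assumed/illustrative.
SENDER_EGRESS_USD_PER_TB = 0.05 / 316.4   # ~$0.00016/TB, assuming $0.05/gbps/mo transit
RECEIVER_DISK_USD_PER_TB_MONTH = 20.0     # assumed cloud disk price, $/TB/month
SECONDS_PER_MONTH = 30 * 24 * 3600

disk_usd_per_tb_second = RECEIVER_DISK_USD_PER_TB_MONTH / SECONDS_PER_MONTH

# How long can the receiver store 1 TB of comments before their disk spend
# exceeds what the sender paid in egress to deliver that TB? (~20 s)
break_even_seconds = SENDER_EGRESS_USD_PER_TB / disk_usd_per_tb_second

# To keep comments for 6 hours instead, the sender's cost must be inflated
# by roughly this ratio, e.g. via junk padding or a proof-of-work challenge (~1000x).
inflation_ratio = (6 * 3600) / break_even_seconds
```

This is only about receiver *capital*; the attention-based filters are handled separately below, since human attention is orders of magnitude more expensive than disk or bandwidth.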
## Main

I recently (meaning 2025-05) enabled comments on my website, which forced me to (again) think about spam and censorship as potential problems on the internet.

#### Censorship

I am not interested in preventing any content from reaching me; however, cloud providers may be interested in this.

- As of 2025-05 it is still not difficult to find a cloud provider that does not packet scan all your traffic.
- Receiving objectionable packets is fine; sending objectionable packets can be a problem. I will use my discretion on what content gets posted publicly, as that is me sending packets to others, not receiving packets.

#### Spam

Spam is bad for two reasons

- Spending receiver capital (on server resources) is bad for the receiver
  - Atleast for text content, server resources are very cheap. Images, audio and videos are where it becomes non-trivial.
  - If the sender is anonymous, you want to ensure the sender spends more resources than the receiver.
    - Sender cost:
      - Sender cost, network egress > $0.05/gbps/mo (assume) = $0.05/(316.4 TB) = $0.00016/TB
      - Sender cost, CPU ~ 0
    - Receiver cost:
      - Receiver cost, network ingress << Sender cost, network egress.
        - (This is usually true assuming the receiver is running on a cloud datacentre, but it could break down in the limit. See: [cloudflare article on ingress versus egress costs](https://blog.cloudflare.com/the-relative-cost-of-bandwidth-around-the-world/). Typically cloud providers have more egress than ingress so ingress is free, and residential connections have more ingress than egress so egress is free.)
      - Receiver cost, CPU ~ 0
      - Receiver cost, disk = $20/TB/mo * storage time of comment = $0.0000077/TB/s * storage time of comment
  - If I want to ensure sender cost > receiver cost, I have the following options:
    - Store comments for a short duration. Storage time = $0.00016/TB / ($0.0000077/TB/s) = 20.8 s
    - Force sender cost to increase artificially.
    - Inflation ratio
      - Inflation ratio = desired storage time / 20.8 s
      - (Assume) desired storage time = 6 hours => inflation ratio = ~1000
    - Solution 1: Force sender cost for network egress to increase
      - I can ask the sender to append 1023 KB of junk to each 1 KB of content. Receiver only reads the first 1 KB and discards the rest. Receiver spends on network ingress but not as much on disk.
    - Solution 2: Force sender cost for CPU to increase
      - I can ask them to compute a hash pre-image whose difficulty threshold depends on content size.
      - Hash difficulty > ($0.00016/TB) / ($0.004/CPU-core/hour) (assume) = 0.04 CPU-core-hours/TB = 2.4 CPU-core-minutes/TB
    - Or of course one can use an even stronger solution, such as one of those proposed below
- Spending receiver attention is bad for the receiver
  - If the sender is anonymous:
    - Sender cost ~ $0/TB
    - Receiver cost = 1 / (1000 words/min) = 1 / (83 bytes/s) = 12.3 s/KB
    - Receiver cost, time converted to money = 12.3 s/KB * ($100/h) (assume) = $0.34/KB = $340,000/GB
  - Solution 1: Ask the sender to trade capital for your attention.
    - Sender cost > $0.34/KB
    - Open source approach
      - Monero transaction costs > $0.15, so a monero transaction can be attached to each comment.
      - (There should be good UX for this so the sender and receiver are not wasting additional seconds sending or verifying the payment. This can be automated.)
    - Govt-backed approach
      - A common existing solution is to only take .com domains seriously; those cost atleast $10 to rent. A chain of certificate authorities ending in a govt issues the certificates.
  - Solution 2: Ensure the sender spent their attention (as a human being) sending the content.
    - Proof of human
      - The only undeniable proof is meeting them in person. Video footage is second-best.
      - Both of these require spending your own attention in order to do verification.
      - Therefore it is too expensive to prove unique human every time.
      - It is common for them to do verification once and reuse the ID across multiple websites for a long duration.
    - Govt-backed approach
      - The most common ID proof nowadays is a phone number, or a gmail linked to a phone number. Governments are restricted in how many phone numbers they can issue, and therefore do KYC to issue phone numbers.
    - Open source approach
      - Also possible to create an open source ID system where people upload videos online and get vouched for by other people's videos online.
      - Once you have such a system running, you can also use ZK proofs to do proof of unique human without revealing who it is. But first you need the proof of human system running.
  - Solution 3: Ensure lots of people spent their attention (as human beings) sending the content.
    - Typically multiple humans spending attention on the same topic is a stronger signal than one person spending attention on a topic. It is more likely worth it for the receiver to pay attention to the former.
    - Statistical distribution
      - Let's say a human sender is paying attention to some content, and they make an attention request to the receiver to also spend their attention on the same content. Let's say the receiver has proof this is happening. Let's call this a 1-to-1 attention request.
      - Humanity spends 8 billion seconds of attention per second of attention spent by the receiver.
      - For the average human, this is equalised. They are sending as many 1-to-1 attention requests to other people as they are receiving 1-to-1 attention requests from other people.
      - For a human who is high status for any reason, they are receiving a lot more 1-to-1 attention requests from other people than they are sending 1-to-1 attention requests to other people. Hence they need even stronger filtering than proof that a human spent attention on the message they sent.
    - Legacy status
      - The receiver website can verify any legacy status of the sender, such as citation count in academia, job profile in corporate or government, and so on.
      - Verifying such proofs costs a significant amount of receiver attention, as most documents can be faked.
        - One solution is to rely on third-parties who will altruistically call out any fake documents publicly.
          - Documents must be hosted on a neutral third-party server where such comments are allowed.
        - Can also include videos from other people in that same legacy institution, publicly confirming the documents as correct in a video.
    - Social media status
      - Social media platforms typically start off with some proof of human, such as phone number as login, or gmail attached to phone number as login.
      - Upvotes (backed by this proof of human) are a popular method of determining who is paid attention to, on any forum
        - Upvotes can decide what is allowed to be posted at all.
        - Upvotes can decide what is likely to rise to the top of the forum.
      - Usually there is an initial seed effect
        - Whatever was the shared topic that most of the initial members were paying attention to, that is what gets upvoted.
        - New users are only likely to join if they were previously paying attention to this before they joined, and not otherwise.
        - Sometimes new users may end up paying attention to the topic even though they didn't before.
      - An ideal world ensures that even if there are only two people on Earth spending their attention on some topic, there should exist some internet forum whose seed set is these two people.

## See also

anubis - recent implementation of PoW that is actively in use now (I wish I heard about this sooner)
https://news.ycombinator.com/item?id=43668433

cloudflare's recent RFC
https://blog.cloudflare.com/web-bot-auth/
https://news.ycombinator.com/item?id=43994779
- I don't like this proposal, the main reason being it piggybacks on top of centralised CAs, which are hard to obtain anonymously.
I will have to write a proper review if I prioritise this stuff

recent complaints by devs on bot traffic caused by the rise of LLMs
https://arstechnica.com/ai/2025/03/devs-say-ai-crawlers-dominate-traffic-forcing-blocks-on-entire-countries/
https://www.akamai.com/newsroom/press-release/bots-compose-42-percent-of-web-traffic-nearly-two-thirds-are-malicious

Tor on PoW and similar
https://blog.torproject.org/stop-the-onion-denial/
https://community.torproject.org/onion-services/advanced/dos/

Brave on PoW and similar
https://safe.search.brave.com/help/pow-captcha

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/my_research/related_quick_notes/internet_anonymity_without_tor.md

2025-06-18

# Internet anonymity without Tor

Disclaimer

- As of 2025, there is no empirical evidence of a successful deanonymisation attack on Tor. If any govt has this capability, they're successfully keeping this info private. There is public evidence of many of the internet's fiber optic cables being tapped.

## Summary

- Governments can wiretap fiber optic cables and obtain connections between senders and receivers, along with timestamps.
- If senders send their pgp-encrypted messages to everyone, and the receiver retrieves the entire dump of all messages from one of these users some hours or days later, then this metadata is much harder to collect.
- This setup is expensive, hence it only works for <1 MB text payloads sent on >1 gbps connections.

## Main

Intelligence-agency-resistant internet anonymity is hard because the physical infrastructure can be inspected by someone with a monopoly on violence.

- Fiber optic cables cannot hide sender/receiver identities, as the attacker can wiretap the cables and then follow the physical path to identify which cable exactly carries a given message. Then they can break into the building that the cable enters.
  - (also fiber optic connections usually require KYC in most countries, but that's a legal limit, not a physical one)
- Radio signals cannot hide sender/receiver identities, as the attacker can triangulate the signal based on signal strengths. Then they can break into the building that is transmitting the signal.
  - (also encrypted radio is illegal in many countries, but that's a legal limit, not a physical one)

Success criteria of attacker

- When considering intelligence-agency-resistant anonymity, getting the metadata alone is enough to count as an attack, even if they don't get the message content.
- Metadata includes sender/receiver irl identities, sender/receiver pseudonyms, message sizes and timestamps.
- If the receiver is marked as suspicious, then any sender that connects with them is also marked as suspicious.

Attack 1: Get view access into a majority of exit nodes

- Tor relies on the sender passing each message to three other random users before it reaches the receiver, and hoping the three intermediaries don't all collude with the attacker.
- If an intelligence agency has view access into a majority of exit nodes, they can deanonymise Tor completely.
- This could be done by controlling exit nodes themselves, or by breaking into exit nodes run by others. They can do the latter using hardware or software backdoors, using targeted cyberattacks, or using spies.

Attack 2: Wiretap source and receiver machines

- If the intelligence agency is tapping the fiber optic cables of both source and receiver, the timestamps of packets sent will match. This means they are aware of the physical addresses of both machines, the fact that there's a connection between them, and the time interval in which this connection occurred.

#### possible solution

What if the sender just sent the message to everyone instead of sending it to their intended receiver?

- Assume that some receivers may be publishing public proofs (via youtube, twitter etc) of their latest uncompromised PGP keys.
- Assume that each user sends a single payload of X bytes to all users each day. This payload can include encrypted messages to specific users. If they have less than X bytes to send, they fill the remaining bytes with junk data.
- Assume each user sends their X bytes at approximately the same time each day.
- Only the actual receivers of the content can decrypt the message. It is junk to everyone else.
- Assume `gpg --hidden-recipient` was used, so there's no way to tell which pubkey was used to encrypt a given message, from a given set of pubkeys.

Throughput

- 8 billion users, each user has 1 gbps unmetered fiber optic
  - 1 gbps / 8B = 0.016 bytes/s = 1350 bytes/day
- 100 million users, each user has 1 gbps unmetered fiber optic
  - 1 gbps / 100M = 105.4 KB/day
- 100 million users, each user has 10 gbps unmetered fiber optic
  - 10 gbps / 100M = ~1.03 MB/day

Potential problems

- Real-time messaging is not possible. This is slow, like courier.
- Running servers from a residential area requires effort. ISPs and OS developers can make this difficult. Renting a cloud server to download the messages does not work, as the cloud server owner knows which subset of these messages you downloaded to your local machine or display.

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/my_research/related_quick_notes/incentives_and_culture_open_source_intel_agency.md

2025-06-13

# Incentives and culture - Open source intelligence agency

Disclaimer

- Quick note

## Summary

- If you want to build an open source intelligence agency to enable a world without secrets, it's not sufficient to figure out the tech to do it. You also need to figure out incentives and culture.
- Incentives: Youtubers and journalists get lots of attention. Blockchain devs get lots of capital. People doing whistleblowing or hacking or mirroring of info don't have very good incentives as of 2025-06. Whistleblowing is not very expensive but cyberhacking is very expensive.
I'm figuring out how to ensure reward exceeds cost for whistleblowers and cyberhackers.
- Culture: I consider creating an ideology dangerous, so I'm avoiding it for now. This system collides with the liberal consent norms of society, and the secrecy norms of police/military. Hence it may be beneficial to create an ideology that's more powerful than existing ideologies such as liberalism or nationalism. I might create an ideology once I'm more confident it's a good thing to do.

## Main

To reduce lead time in open source leaking of information, I need to figure out:

- tech
- incentives
- culture

I have not currently figured out incentives or culture as well as I would like to, in order to be confident in this as a political system.

Tech

- I've written many documents on this already. Working on it.

Incentives

- Data acquisition
  - Reducing hardware cost and software complexity will reduce the cost placed on intermediaries in the system, such as people acquiring, transmitting, hosting and mirroring sensitive information. This cost is still non-zero though, and I'm yet to figure out a counterbalancing reward for doing this type of work.
  - Cyberhackers are well compensated for disclosing zero-days (be it by the defending or an attacking org), but not if they choose to use a zero-day themselves and publish the data publicly for free. Typically finding zero-days for govt targets requires hundreds of researchers working full-time for multiple years, which requires a combination of significant funding and ideologically motivated effort.
  - Similarly, spies can be well compensated for disclosing info to a foreign govt, but whistleblowers are not typically compensated for publishing info to the public. (Wikileaks experimented with payments for whistleblowers.)
- Medium and high attention on the data
  - Mirroring content to multiple non-allied nuclear states and making it searchable is sufficient to make it uncensorable as of 2025.
  - Many intermediary operators and dev teams typically don't get paid very well.
Web crawlers/mirrors, torrent seeders, Tor nodes, blockchain nodes, blockchain relays and onramps, etc.
  - Youtubers and journalists acquire a lot of attention for publishing politically sensitive info. It is not obvious how much it is in a youtuber's self-interest to pass on some of the rewards back to people in previous steps, be it capital or attention (more specifically, political legitimacy). Sometimes journalist orgs (such as the Intercept or wikileaks) have paid for the legal defence of whistleblowers.
  - Some blockchains (and their foundations and core dev teams) have lots of capital protecting them. It is not obvious how much it is in their self-interest to pass on some of these rewards back to people in previous steps. (Let's say if someone published documents to a blockchain.) Sometimes wealthy crypto donors have funded media efforts.

Culture

- I'm deliberately refraining from making too many ideological posts right now.
- The strongest version of this system currently collides with liberal consent norms for privacy of citizens, and secrecy norms practised by military/intelligence and judiciary/police.
- Besides liberalism, this system is neutral at best and actively hostile at worst to multiple dominant cultures in the world, including most political systems and religions and the institutions that defend them. Most systems have atleast some facts critical of their system that they would rather hide (personal view).
- My personal view is that morality is less important to study and balance of power is more important to study. The cluster of moral arguments that are persuasive to me personally are arguments around how this will shift the distribution of power and the greater good this will create. This can be used to justify some of the harm (with harm measured as per existing moral systems).
- The above arguments may not be persuasive to a broad audience though. If I wish to involve a broad audience, it may require inventing an ideology for that purpose.
# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/my_research/ai_forecasts/ai_timelines_talk_20250629.md

2025-06-29

# AI timelines (presentation on 2025-06-29)

Pre-reqs

- assuming some technical knowledge of how transformers and deep learning work
- assuming you have used transformer-based AI significantly in daily life (gpt4, o3, claude, gemini etc)

#### My top-level views

Samuel

- ~25% probability of superintelligent AI deployed by 2030, assuming no ban or pause on AI research is enforced internationally
  - ~10% probability of human extinction by 2030 (due to superintelligent AI takeover)
  - ~10% probability of 100-year stable global dictatorship by 2030 (due to superintelligent AI aligned to a small group of people)
  - ~5% unknown unknown third outcome
- The numbers are guesses and they're approximate; 25% actually means 25 ± 10%

Definition of superintelligent AI: AI that is better than the **best** humans at **every** task humans care about completing.

Relevant intuitions for what I actually imagine when I imagine superintelligent AI: humans from the 1900s experiencing all 1900-2025 inventions in one year; chimpanzees being exposed to a human being

In this talk

- Will explicitly be defending the "short timelines" view.
- Will primarily discuss timelines, not risks, unless the group has a strong preference otherwise

## Summary of all datapoints

- Outside view: What do other experts believe?
- Inside view: Forecast it yourself
  - Forecasting what happens from 2025 onwards
    - Forecasting scaling of pre-training
    - Forecasting scaling of inference
    - Forecasting likelihood of a new breakthrough
  - Forecasting what happens when ASI is near
    - Forecasting recursive self-improvement, intelligence explosion, automation of economy etc

## Datapoint 1: Other experts

You will find experts on all ends of the spectrum:

- ASI won't happen in next 5-10 years. It will be fine.
- Powerful AI is happening, will automate large fractions of the economy, ASI won't happen in next 5-10 years. It will be fine.
- ASI will happen in next 5-10 years, it will be aligned to someone. It will be fine.
- ASI will happen in next 5-10 years, it will cause human extinction. AI is morally superior to us, therefore it will be fine.
- ASI will happen in next 5-10 years, it will cause human extinction. This is bad.
- ASI will happen in next 5-10 years, human extinction will not happen, misuse and dictatorship are possible. This is bad.

I'm selectively presenting doomer views here because I am also somewhat doomer. Message me for resources on expert views that are different.

Other people predicting ASI in next few years, and high estimate of risk

- Original doomer: Eliezer Yudkowsky
  - Almost 100% probability of human extinction. Most of this probability is in the next 10-15 years.
  - [Yudkowsky's latest podcast interview](https://www.youtube.com/watch?v=0QmDcQIvSDc)
- Lesswrong.com community
  - Started by Yudkowsky in 2009, now has more people knowledgeable in AI
- Geoffrey Hinton, Nobel Prize winner in 2024, considered a "godfather" of the field
  - [Hinton's recent podcast interview](https://www.youtube.com/watch?v=66WiF8fXL0k)
- Yoshua Bengio, considered a "godfather" of the field
  - [Bengio's recent blogpost](https://yoshuabengio.org/2023/05/22/how-rogue-ais-may-arise/)

Other people predicting ASI in next few years

- Ilya Sutskever, cofounder of OpenAI, now resigned due to presumed safety concerns
  - [Ilya's TED talk](https://www.youtube.com/watch?v=SEkGLj0bwAU)
- Elon Musk, original funder of OpenAI, now runs xAI which has the leading AI model grok-3
  - [Elon Musk's address to ycombinator](https://www.youtube.com/watch?v=cFIlta1GkiE)

Longer list of experts with views in a similar cluster

- [Dan Hendrycks' signed letter](https://safe.ai/work/statement-on-ai-risk)
- Many people on this list have podcast interviews

## Datapoint 2: Scaling pre-training, try the models yourself

- Actually go try
GPT2 on some prompts.
- Actually go try GPT3.5 or GPT4 on some prompts.
- Actually go try GPT4.5 on some prompts.

Don't trust benchmark datasets, don't trust what some expert has said; actually go try some of the older models yourself.

number of params, depends on model size

- GPT2 XL (2019): 1.5B params. Estimated training ~10^20 FLOP
- GPT3 (2020): 175B params
  - GPT3.5 based on GPT3
- GPT4 (2023): rumoured 1.8T params
  - o1, o3 based on GPT4
- GPT4.5 (2025): rumoured 12T params, of which 1T active params (mixture of experts). Estimated training ~10^26 FLOP.

## Datapoint 3: Scaling pre-training, chinchilla scaling law, historical data

[Chinchilla scaling law (wikipedia)](https://en.wikipedia.org/wiki/Neural_scaling_law#Chinchilla_scaling_(Hoffmann,_et_al,_2022))

- Discovered in 2022.
- More compute, more data => lower loss
- All pre-training from 2019 to 2025 fits the predictions of chinchilla scaling. Across six orders of magnitude.
  - GPT2 cost ~10^20 FLOP to train. $1-5 million
  - GPT4.5 cost ~10^26 FLOP to train. $1-10 billion
- Chinchilla scaling law predicts loss accurately to over 3 decimal places.

[Epoch AI trends based on chinchilla scaling law](https://epoch.ai/trends)

- Pre-training compute cost increased 3x annually historically.
  - $10 million, then $30 million, then $100 million, then $300 million etc
  - Epoch.ai estimates grok cost ~$500 million for the training run alone.
- Training compute FLOP increased 5x annually historically.
  - 10^20 FLOP, then 5 * 10^20 FLOP, then 2.5 * 10^21 FLOP, etc

## Datapoint 4: Scaling pre-training, chinchilla scaling law, forecasts of future compute and capabilities

More disagreement on the future, since we don't have hard data about the future.

How much compute will be used in future datacentres?

- Future investments already committed
  - Case study: [xAI Memphis](https://www.youtube.com/watch?v=Jf8EPSBZU7Y).
    - $7 billion invested in the building. 200,000 H200 GPUs. Used primarily to train grok-3.
  - Case study: [OpenAI Stargate](https://www.youtube.com/watch?v=GhIJs4zbH0o), under construction.
    - Has commitments of $100 billion annually, for a total of $500 billion
    - Commitments by Masayoshi Son (chairman of SoftBank, the well-known Japanese investment firm) and Larry Ellison (founder of Oracle)
    - This is roughly 3 orders of magnitude of scaling, almost immediately. Historically we went from $1M to $500M over 6 years; now we are expecting $500M to $100B over maybe 2-4 years.
- Upper bound on future investments
  - World GDP is roughly $100 trillion.
  - Most people, including me, agree that $10 trillion of investment is hard to exceed in the next 5 years.
  - Many people also agree that $1 trillion of investment is hard but not impossible to exceed in the next 5 years.
- What if no future investments? (spherical cow assumption)
  - FLOP/$ doubles every 18 months due to improvements in hardware.
  - Let's say we cap out at $100 billion for a training run. This still means we are putting in "$200 billion equivalent" 1.5 years later, "$400 billion equivalent" 3 years later, and so on.
  - This curve is slower than the immediate ramp-up in investment though.

How much capability will this increased compute translate into?

- Open question 1: Will the chinchilla scaling law continue to accurately predict loss?
  - We have empirical data for the past 6 years, but no mechanistic understanding of why the law holds. It could just break for some unknown reason.
- Open question 2: Will reduced loss lead to improvements in specific capabilities?
  - Loss is computed on predictions versus ground truth for the entire internet dataset. Most of this dataset is junk.
  - In practice we care mainly about capabilities on specific benchmarks, such as difficult research, math and coding datasets.
  - Going from 99% to 99.9% accuracy at predicting the entire internet, for example, does not tell you how much progress you will make on PhD-level math.

## Datapoint 5: Scaling RL/Inference, try the models yourself

- Actually go try o3 high on some prompts.
- Actually go try o1 on some prompts.
- Actually try GPT4 on some prompts.
(GPT4 was the base model from which o1 and o3 were built. GPT4 does not have inference scaling.)

## Datapoint 6: Scaling RL/inference, log curve for historical data

Benchmarks

- [OpenAI solves ARC-AGI by Francois Chollet](https://x.com/fchollet/status/1870169764762710376)
- [Deepmind gets IMO silver medal using similar approach](https://deepmind.google/discover/blog/ai-solves-imo-problems-at-silver-medal-level/)
- [o3 and Gemini 2.5 Pro have solved ~20% of Humanity's Last Exam by Dan Hendrycks](https://lastexam.ai)

I personally have less trust in benchmarks, due to benchmark data being leaked publicly.

We have only 1 year of data (2024 to 2025) since RL/inference scaling was first tried.

- Any curve fitting will be a bad estimate.
- There is still some disagreement on which curve is most accurate.

Some published curves

- [Toby Ord's inference scaling curve](https://www.tobyord.com/writing/inference-scaling-and-the-log-x-chart)
- [OpenAI's inference scaling curve](https://arcprize.org/blog/oai-o3-pub-breakthrough)
- [METR's proposed correlation between inference scaling and time horizon of task](https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/)

Cost per task

- o1 or o3 in day-to-day use costs $1 to $10 per task.
- The ARC benchmark was solved using over $1000 per task.
- The maximum cost per task tried so far is below $100k.

I (Samuel) don't have a strong opinion on which curve is exactly true.

## Datapoint 7: Scaling RL/inference, forecasts of future compute and capabilities

Two competing factors

- The log curve is slow
  - Putting in 10x more compute leads to only slightly improved capabilities. RL/inference scaling is currently a brute-force-like approach.
- Lots of money left to invest
  - On high-value tasks like new R&D (solve cancer, do AI R&D, etc.), it may even be worth spending $1 billion per task. This is 4 orders of magnitude more than the current $100k.
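As a toy illustration of the first factor (the coefficients `a` and `b` below are made up for illustration, not taken from any of the published curves linked above): if benchmark score grows roughly linearly in log10 of per-task spend, then every 10x increase in spend buys only the same fixed increment.

```python
import math

# Toy log-curve model of inference scaling. The coefficients are
# illustrative assumptions, not fitted to any published data:
#   score = a + b * log10(cost per task in dollars)
def score(cost_usd, a=20.0, b=8.0):
    """Hypothetical benchmark score (%) as a function of $ spent per task."""
    return a + b * math.log10(cost_usd)

for cost in [1, 1_000, 100_000, 1_000_000_000]:
    print(f"${cost:>13,} per task -> score {score(cost):5.1f}%")

# Each 10x in spend adds the same b points. Going from $100k to $1B
# per task (4 orders of magnitude) adds only 4*b points.
```

This is the "two competing factors" tension in one picture: the remaining 4 orders of magnitude of money buy only a fixed number of further 10x steps up the curve.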
[EpochAI article on forecasting scaling RL/inference](https://epoch.ai/gradient-updates/how-far-can-reasoning-models-scale)

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/my_research/ai_forecasts/superintelligent_ai_timelines.md 2025-06-04

# Superintelligent AI timelines

## Summary

My view as of 2025-06

- P(ASI by 2030) = 25%
- P(ASI by 2030 && human extinction due to ASI by 2030) = 7.5%
- P(ASI by 2030 && >100y half-life stable dictatorship due to ASI by 2030) = 10%
- P(ASI by 2030 && unknown unknown third outcome by 2030) = 7.5%

My view from 2023 to early 2025

- P(ASI by 2030) = 15%
- P(ASI by 2030 && human extinction due to ASI by 2030) = 5%

My view before 2022

- Didn't take ASI as a serious threat, but was vaguely curious about it. My internet username since 2015: [ghosts_in_the_code](https://stackoverflow.com/users/4436162/ghosts-in-the-code)

## Update

Update (2025-04-24): The numbers in the previous version of this document might be outdated.

- My new prediction is 25% probability of superintelligent AI by 2030. (My old prediction was 15%.) I haven't thought about it as much as I'd like.
- The main thing that updated me was seeing o3 make multiple novel inductive/deductive reasoning steps on top of each other without a high hallucination rate.
  - In general GPT4 seems capable of making novel reasoning steps, but it usually just makes one or two steps and then stops. If you start the next inference, any previous progress is lost. A human is capable of making hundreds of novel reasoning steps in series, each step dependent on the previous ones.
- As before, I have ~30% probability, conditional on ASI by 2030, that humanity goes extinct. So that's a total of 7.5% probability of extinction by 2030.
- I also have ~40% probability, conditional on ASI invention by 2030, that whichever small group of people controls the ASI will become by far the most powerful group in human history.
So that's a total of 10% probability of stable dictatorship by 2030.
  - They will essentially be world dictators, although they will likely provide people with enough nominal power that this fact is not obvious to everyone.
  - I assign a smaller probability that this group will have hyperpersuasion skills (i.e. they'll be able to persuade anyone of any ideology they wish).
- The remaining ~30% probability conditional on ASI invention by 2030 is an unknown unknown for me. So that's a total of 7.5% more. I think there are multiple outcomes besides the above two (extinction, global dictatorship) with non-trivial probability.

#### AI timelines for noobs

Disclaimer

- Quick note

Main

- If you are an outsider to this whole space, the simplest argument that will make sense to you is to go and try using GPT2, GPT3, GPT3.5, GPT4 and now o3.
  - Openrouter.ai is good for trying many models.
- You will notice a clear trend of LLMs improving from 2019 to 2025. If you are new to the space you are probably only noticing the trend line from 2023 to 2025.
  - If you want, you can go even further back in time and try RNNs for NLP, and n-grams and such before that. The actual trend line goes at least from 2012 to 2025.
  - Unfortunately AI companies have discontinued the APIs for some of the older models (like GPT3). You have to either go watch older youtube videos of other people trying these models, or find a developer who can set all this up for you.
- This trend is called the scaling law, or Rich Sutton's bitter lesson. It basically means the more multiplications you can do, the more intelligent the AI becomes. Over the last 10 years, processors have become cheaper (more multiplications for the same money spent) and people are using more processors to train each next model. First one GPU, then tens, then hundreds; now we have models trained on hundreds of thousands of GPUs.
We might soon have significant fractions of global electricity production going into more multiplications for the next AI model.
- The second thing to do is read my intelligence explosion document. The basic idea is that lots of dumb models can't work together or build on top of each other's results, but smart models can, just as humans can. A group of beetles can't make a very intelligent plan; a group of humans can.
- The last argument (more speculative) is to notice the difference between GPT4 and o3. o3 is able to do multiple reasoning steps one after the other. I have a personal hunch that this could turn out to be important.

---

## Main (original v1 of this document)

DISCLAIMER

- I'm not a deep learning expert. I understand the theory of LLMs, RNNs, CNNs, etc., but I don't have experience training large models, curating large datasets or doing original DL research.
- Please consider getting the opinions of many people other than me. Many of the signatories on [this list](https://www.safe.ai/work/statement-on-ai-risk) have made individual statements; search for their podcast interviews, blogposts, research publications and so on. Lesswrong is also a good forum.
- I strongly recommend you form your own independent opinion on this subject, and not blindly copy the views of others, no matter how smart or trustworthy they seem. This field would make a lot more sense if more people formed independent opinions. (And yes, forming a good opinion takes a lot of time and effort, and you have to decide whether the time investment is worth it for you personally.)
**If you have not used the latest AI models (as of 2025-04 this is GPT4.5 and o3), I strongly recommend you go try them out before reading any discussion such as the one below.**

This document is mainly aimed at lesswrong (LW) rationalists / effective altruists / adjacent people, since a lot of my work is culturally downstream of theirs, and a lot of my potential research collaborators exist in their communities. This doc will make less sense if you haven't encountered their writings before.

Most of this doc is guesswork rather than models I have a lot of confidence in. Small amounts of evidence could upend my entire view of these topics. **If you have evidence that my view is wrong, please tell me.** Not having to spend any more time thinking about AI would improve my quality of life. I am being completely serious when I say I might thank you till the day I die if you persuade me either way. I can also pay you at least $1000 for convincing me, although we'll have to discuss the details if you really wanna be paid.

The last time I read papers on this topic was early 2023; it's possible I'm not up to speed on the latest research. Feel free to send me anything that's relevant. I haven't really updated my views on this topic since 2023, although the field has moved fast meanwhile.

**I have ~15% probability humanity will invent artificial superintelligence (ASI) by 2030.**

- If it happens, this will be the most significant event in ~10,000 years of human history on many metrics of what counts as significant. (If the intelligence gap is sufficiently large it might even be the most important event in the ~14 billion year history of the universe. My probability for this is smaller.)
- This will more-likely-than-not use transformers + data and compute scaling.
- I define artificial superintelligence as AI superior to the best humans (not the median human) at basically every task humans can do on a laptop with internet access.
- See my usual disclaimer on predictions shaping reality. If you are involved in building ASI, please consider *not* doing that, or at least talking to me once about it. I don't want ASI to be built by 2030 unless many aspects of society change first.

**Conditional on ASI being invented by 2030, I expect ~30% probability it will kill everyone on Earth soon after.** In total that's ~5% probability of humanity being killed by ASI by 2030.

I have chosen not to work on this problem myself. This is downstream of ~15% not being large enough and the problem not capturing my curiosity enough. I am not a utilitarian who blindly picks the biggest problem to solve, although I do like picking bigger problems over smaller ones. If you are working on safety of AGI/ASI, I think you are doing important work and I applaud you for it.

## Reasons for my beliefs

*Mildly relevant rant on consensus-building:* Both these probabilities seem to involve dealing with what I'd call deep priors. "Assume I take a random photo of Mars and a random pixel from that photo; what is the probability its RGB value has higher Green than Blue?" Humans tend to agree more on prior probabilities when there's enough useful data around the problem to form a model of it. Humans tend to agree less on prior probabilities when they're given a problem with very little obviously useful data, and need to rely on a lifetime of almost-useless-but-not-completely-useless data instead. The "deepest" prior in this ontology is the [universal prior](https://en.wikipedia.org/wiki/Solomonoff%27s_theory_of_inductive_inference).

~15% is lower than what many people in EA/LW communities assign, because I reject a lot of the specific models they use to forecast a higher likelihood of ASI.

- The total number of neurons in the human brain has nothing to do with the compute required to build ASI, as evolution produced human brains, and evolution used a lot more compute than an individual brain.
- Nobody has a proposal that's IMO likely to work at replicating biological evolution in-silico. How do you capture the initial conditions? How much compute do you need to simulate the environment? I haven't seen convincing arguments for why these details aren't important.
- I have a prior that most STEM research that promises bold outcomes fails. This is the default, not the exception. To study this prior you have to go back in time to each of the failed research agendas of the past 100 years, and notice what it would have **felt like** if you were born then and were hyped up by a research agenda you didn't know would succeed or fail.
- Update (credits: someone on LW): I have seen some basic arguments from neuroscience saying neurons (or the genetic information that produced the neurons) don't by themselves encode much of whatever it is that produces human intelligence; most of it is learned after birth. I'm open to arguments of this type but I'll probably have to study neuroscience to evaluate them.

[Discontinuous progress in human history by Katja Grace](https://www.lesswrong.com/posts/CeZXDmp8Z363XaM6b/discontinuous-progress-in-history-an-update) is the closest thing I could find to work that tries to evaluate the prior probability of a research agenda succeeding with radical outcomes. I have not spent a lot of time searching for such research though. Convincing me either way may require publishing, or pointing me to, a lot more work of this type. Alternatively, you have to provide me a gears-level model that explains why the LLM scaling laws empirically hold.

~15% is higher than what many AI researchers assign, because I reject a lot of the specific reasons they give for why LLM scaling cannot possibly achieve ASI.

- I have read some arguments for why specific unsolved problems are *hard* compared with already-solved problems, and why there's no specific reason LLMs will crack them.
  - However, most of the problems LLMs have already cracked also got cracked without anyone having an actual model for why LLMs would or wouldn't crack them. LLM scaling has consistently proven itself (to me) to be black magic; both its supporters and its detractors fail to accurately predict which problems it will or won't crack. Predicting the loss curve doesn't tell you which problems will or won't be cracked.
- Some people cite architectural limitations of LLMs as a bottleneck. For instance, an LLM has a theoretical upper bound on the number of calculations per forward pass, and a certain ratio of "memory access" to "computation" steps. But solving a real-world problem can use multiple forward passes + non-LLM scaffolding + non-self-attention layers. For example, you can often use layers from two different architectures (say, self-attention layers and CNN layers) in a single model, train this model and get good performance.
- Some people say we will run out of the data required as per the empirical scaling law, but my guess is this is more likely than not solvable. I'm very unsure on this. (And I lack sufficient expertise to evaluate this argument tbh.) I'm guessing you can teach a model simple reasoning using a smaller dataset, use this model to fill holes in a bigger dataset, and then teach the model more difficult reasoning using this data.

~30% probability of extinction conditional on ASI invention by 2030 is because I am more optimistic about boxing an ASI than some LW rationalists. I do believe misalignment happens by default with high probability.

- The intelligence difference between Einstein or Churchill or Hitler and a hyperpersuasive AI is large, relative to the intelligence jumps shown by scaling so far. I understand there's no natural scale for intelligence. I am open to the idea that a hyperpersuasive AI can exist in theory. (In this context, a hyperpersuasive AI is an AI that can gain complete control over your mind with near 100% probability simply by talking to you.)
- I am assuming a high level of competence among the people boxing the ASI. I have a low probability of ASI coming completely by surprise. A lab building ASI in this decade will almost certainly employ many people who take the prospect of ASI seriously.
- I am assuming that pausing ASI research will be politically possible, even obvious, once we have built a boxed ASI with empirical evidence of a lack of safety. I think I have better intuitions on politics than the median LW poster.
- I have some confusion about how the AI will reason about the hardware-software abstraction. At what point does an AI translate "maximise XYZ assembly variable" into "maximise the number of electrons at certain positions in the semiconductor underlying XYZ"? My guess is whether an AI wants to "break out" of its machine depends on how it reasons about this. I accept that an ASI could understand that its variables are made up of positions of electrons in a semiconductor; I'm just unsure what it'll do once it knows this.

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/my_research/ai_forecasts/intelligence_explosion.md 2025-07-11

# Intelligence explosion

Disclaimer

- This is a thought experiment. Qualitative, not quantitative; I'm not defending the exact numbers here.
- I don't convey anything novel here. The original argument is at least 25 years old, but this version includes LLM-specific knowledge from 2025.

## Summary

- Intelligence explosion
  - Imagine [Elon Musk's Memphis datacentre](https://www.youtube.com/watch?v=Jf8EPSBZU7Y) contains 6000 amnesiac humans doing 100 years of thinking every 1 year of wallclock time.
  - By 2030, we may have 100,000-1,000,000 geniuses smarter than Einstein doing 10-100 years of thinking per year of wallclock time.
  - Within 1 year we might go from "slightly superhuman individual" to "unimaginably superhuman parallel civilisation"
- Recursive self-improvement
  - Also, these geniuses are researching how to edit their brains to become even smarter.
This may or may not happen, but it is an accelerant if it does.
- Moore's law and similar
  - And the number of geniuses doubles every 2.5 years, so by 2040 or 2050 we could have billions of such researchers.

## Main

Why and who?

- A lot of people who studied AI but not AI risk are unaware of the arguments for intelligence explosion or recursive self-improvement. This document is for you, if you're one of them.
- Even among the few people who discuss it, I sometimes find an unclear distinction between serial and parallel computation. This document is also for you.

**If you have not used the latest AI models (as of 2025-04 this is GPT4.5 and o3), I strongly recommend you go try them out before reading any discussion such as the one below.**

Example

- For now I'll copy-paste the numbers I calculated previously
- Assume Llama3 405B inference on a 2x8xH200 SXM GPU node as of 2025

```
GPU node cost = $300k
$/token = e * $1.44/1M tokens
tokens/s = (2646/e) tokens/s
```

Superhuman

- Let us imagine this model is at slightly superhuman capability in some domain. (Llama3 is not, but let's imagine for now it was.)

Parallel

- If humanity spends $300B we get 1 million such nodes (2x8xH200) running in parallel. This is around ~$40 per capita (global) and is likely affordable only to a handful of govts.

Serial

- However, each node is producing (2646/e) tokens/s. Let's make the simplifying assumption that this is 1000 tokens/second. This is about two A4-sized pages of printed text in Arial font 12.
- If you've used any of the latest AI models and sampled 1000 tokens, you have an intuitive understanding of what this looks like.
- As a human, if you had a device in your brain recording every thought you had, that's probably not more than 1000 A4 pages of text per day. The AI however is producing over 150,000 A4 pages of text per day. So the AI is thinking at least 100 times faster than you.
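The speed comparison above can be checked with back-of-envelope arithmetic, using only the assumptions already stated (1000 tokens/s per node, 1000 tokens ≈ 2 A4 pages, ~1000 A4 pages of recorded human thought per day):

```python
# Back-of-envelope check of the "AI thinks at least 100x faster" claim,
# using the assumptions from the text above.
TOKENS_PER_SECOND = 1000       # assumed node output
TOKENS_PER_PAGE = 500          # 1000 tokens ~ 2 A4 pages
SECONDS_PER_DAY = 86_400
HUMAN_PAGES_PER_DAY = 1000     # assumed upper bound on recorded human thought

ai_pages_per_day = TOKENS_PER_SECOND * SECONDS_PER_DAY / TOKENS_PER_PAGE
speedup = ai_pages_per_day / HUMAN_PAGES_PER_DAY

print(f"AI output: {ai_pages_per_day:,.0f} pages/day")  # ~170k pages/day
print(f"Speedup vs human: ~{speedup:.0f}x")             # comfortably above 100x
```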
In total we have 1 million nodes, each of which is thinking 100 times faster than a human and is slightly smarter than a human.

Assumptions made so far:

- We have a model with as many parameters as Llama3 that is slightly superhuman
- It can produce 1000 tokens/second on a 2x8xH200 GPU node, which is not that far off from real LLMs.

Now let's do some thought experiments.

#### Serial

Imagine you had 1 year to complete a research paper and your fellow researcher had 100 years to complete the paper. Imagine you had 10 years to complete a research paper and your fellow researcher had 1000 years to complete the paper.

This is already likely to produce outputs beyond your imagination. Humans rarely spend their entire lifetime dedicated to a problem in a way that they continuously keep making progress. At some point most humans give up and substitute their time with fake busy work or with an alternate task. If you could spend 1000 years focussed on one single task, you would already be capable of superhuman feats.

#### Parallel

Imagine your country had 1 million PhD researchers and the opponent country had 1 million PhD researchers. However, your country employs this PhD research force to solve thousands of different problems, whereas the opponent country employs all of them to solve one singular problem. Your researchers get bored, don't take orders and follow their own curiosity. The opponent country is a dictatorship where researchers can summon the same level of curiosity on demand to work on whatever research project the dictator recommends.

#### Serial and parallel combined

Now imagine the above two effects combined. Your country has 1 million PhD researchers scattered across 1000 different topics. They have 1 year to do their work. The opponent country has 1 million PhD researchers all focussed on the same project. They have 100 years to do their work.
If any of the researchers in their country uncovers an insight in year 1, it is used as input by all the million researchers in year 2. If any insight is uncovered in year 2, it is used as input for year 3. It is obvious that for almost any human-understandable problem, this opponent country would have made so much progress within a few years that the work they produce would take multiple years just for your country to comprehend.

#### Parallel and superhuman combined

Imagine your country has 1 million PhD researchers focussed on 1000 topics and the opponent country has 1 million researchers smarter than Einstein (or any other outlier-brilliant researcher), all focussed on the same topic. Whether you believe scientific progress is driven more by a handful of outlier researchers or by a collective of median researchers, it is obvious this country will make a lot more progress than yours.

#### Serial and parallel and superhuman combined

Imagine your country has 1 million PhD researchers focussed on 1000 research topics and 1 year to solve a problem. Imagine your opponent country has 1 million researchers smarter than Einstein focussed on the same research topic, and they have 100 years to solve the problem.

#### Serial and parallel and superhuman and RSI combined

Recursive self-improvement (RSI) is the idea that the AI can do research on itself and improve its own intelligence. It is an open question to what extent this is possible. Worst case, you can assume no RSI is possible.

Human beings are not able to recursively self-improve because our knowledge of neuroscience has not advanced to the point where we can edit our own neurons with a machine. Likewise, knowledge of genetics has only recently advanced to the point where we can edit our own genes. If we could edit our neurons or our genes, we could probably increase our own intelligence. An AI can trivially edit its own weights and its training algorithm and so on.
So it is likely at least some amount of recursive self-improvement is possible. How much is unknown.

Imagine your country has 1 million PhD researchers focussed on 1000 research topics and 1 year to solve a problem. Imagine your opponent country has 1 million researchers smarter than Einstein focussed on the same research topic, and they have 100 years to solve the problem. Also, the problem their country is solving for the first 90 years is how to edit their own brains to become even smarter. Only in the last 10 years do they try to solve the actual problem you're competing with them on.

So in year 1 you're competing with a country full of people smarter than Einstein. In year 2 you're competing with a country full of people who have edited their brains to become even smarter than that. In year 3 you're competing with a country full of people who have edited their brains to become even smarter than that.

This is what our civilisation coming into contact with superintelligent AI could look like. By starting from an assumption of "imagine Llama3 but slightly superhuman" we have reached "unimaginably superhuman" within the span of one year. **If "Llama3 but slightly superhuman" is possible in 2030, "unimaginably superhuman AI civilisation" may be possible by 2031, as per the above set of thought experiments.**

A lab is still required to perform experiments, so it's possible the rate of progress becomes a lot more dependent on the experiments per dollar of the lab rather than the intelligence of the people using the lab. This is also true for AI research, where the "lab" is essentially a separate GPU cloud that this GPU cloud of 1 million thinking nodes can use to run experiments.
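The serial and parallel thought experiments above can be condensed into one toy calculation, reusing this document's illustrative numbers (1 million nodes, 100x serial speedup, human researchers spread across 1000 topics). This is a sketch of the thought experiment, not a forecast:

```python
# Toy calculation: subjective researcher-years of thinking per wallclock year,
# using the illustrative numbers from the thought experiments above.

def researcher_years(researchers: int, serial_speedup: int, wallclock_years: int) -> int:
    """Subjective researcher-years produced in the given wallclock time."""
    return researchers * serial_speedup * wallclock_years

HUMAN_TOPICS = 1000  # the human researchers are spread across 1000 topics

for year in [1, 2, 3]:
    humans_per_topic = researcher_years(1_000_000, 1, year) // HUMAN_TOPICS
    ai_single_topic = researcher_years(1_000_000, 100, year)
    print(f"year {year}: humans ~{humans_per_topic:,} researcher-years/topic, "
          f"AI {ai_single_topic:,} researcher-years on its one topic")
```

Even before counting the intelligence gap or RSI, the serial and parallel factors alone give the "opponent country" a 100,000x advantage in researcher-years on its chosen topic.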
Open questions - AI-automated lab research - AI controls a lab (in biology, chemistry, etc) and decides which experiments to run - Research progress per FLOP of training compute - Research progress per FLOP of inference compute - (Historically it has not been possible to teach models using inference, only training could teach them.) - Research progress per dollar spent on lab experiments - Ratio of all three values above will decide dollars spent on each - AI-automated AI research - AI controls a datacentre "lab" and decides which experiments to run - Research progress per FLOP of training compute - Research progress per FLOP of inference compute - Research progress per FLOP of "lab" compute - Ratio of all three values above will decide dollars spent on each # File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/my_research/longterm_view_on_info_tech_society.md 2025-07-10 # Long-term view on info, tech and society I decided to make a summary of some non-obvious insights of mine around similar topics. (Some of this work is not original but borrowed from others.) ## Summary - Most information about everyone on earth may end up in the public domain soon. Reasons for this include a) people being incentivised to share their information in public, because of the benefits they get in return, and b) cyberhacks and espionage leaking people's information against their will. - This could lead to improved truth-seeking and coordination on political matters - more direct democracy, improved law and order, less geopolitical monopoly power. - This could also lead to improved truth-seeking on taboo topics besides politics, such as morality, religion, sex, mental and physical health, and relationship conflicts. - This could also lead to damaging the lives of people deviating from societal norms (including deviating in ways that are not harmful). 
- If most information is not allowed to be in public but is instead kept secret by a small group of people, this could instead lead to a more stable dictatorship.
- I am currently making the uncertain bet that this level of information sharing might be net good for humanity.
- The cost of CPUs, disks, fiber-optic cables etc. going down means it'll be possible to build decentralised social media and governance platforms that don't require a lot of money for servers and software developers. A decade ago, building social media required a lot of money, hence Big Tech companies built it and made many self-serving decisions in how it is governed.
- Intelligence-enhancing technologies are becoming possible this century and can radically alter our society. Extinction-causing technologies are also becoming possible this century. Examples: superintelligent AI, human genetic engineering, brain-computer interfaces, human brain connectome research.
  - Our current mental models for regulating tech may be inadequate when dealing with this. A more intelligent mind can persuade a less intelligent mind to hand over money or votes for nothing in return. This breaks the foundations of both capitalism and democracy.
- The cost of sensors, transducers etc. going down could lead to the invention of more scientific instruments to collect data. The invention of such instruments usually accelerates scientific fields.
  - I specifically guess (but am not sure) that the following inventions directly benefitted from Moore's law for CPU chips and the resulting circuit miniaturisation: DNA next-gen sequencing, DNA nanopore sequencing, gigapixel camera lenses, electron microscopy (accelerated cell biology and neuroscience, for instance the fruitfly connectome), solar photovoltaic cells (direct overlap with the CPU production process)
  - Generic examples of other instruments that accelerated scientific fields: microscope, telescope, cyclotron, etc.
- Offence-defence balances in technology shape incentives.
Incentives shape culture. If you want to understand how society evolves on multi-century timescales, study offence-defence balances of various technologies. Don't just study which ideology or morality is popular in society today.
  - If "enough" people find a new way to acquire money, status, etc., they will retroactively invent moral justifications for the methods they used, and many people in the rest of society will eventually accept these justifications.
  - (Incentives include providing physical safety, social approval or money as a reward for some behaviour. Culture includes literally all popular ideas and behaviours in society, including ideas that have moral weight in people's minds.)

## Main

**Information and society**

Disclaimer

- This disclaimer might make more sense to you after you read the rest of my post.
- Note on terminology:
  - A lot of terminology around people's information being acquired and shared is morally loaded.
    - Negatively loaded: surveillance, doxxing, spying, hacking, stealing information, cancelling
    - Positively loaded: transparency, journalism, free speech
  - Suppose Alice acquires Bob's data and shares it with the public without Bob's consent. People who consider Alice a "good guy" and Bob a "bad guy" will tend to use positively loaded terminology, and people who consider Alice a "bad guy" and Bob a "good guy" will tend to use negatively loaded terminology. For example: Alice and Bob belong to two different socioeconomic classes or two different political ideologies.
  - There is a strong incentive gradient for many actors to invent persuasive ideology around which "greater good" justifies violating consent (i.e. violating liberal morality).
    - Example: A social-left-leaning journalist who believes becoming a billionaire is immoral, and believes they are justified in obtaining and leaking the private lives of billionaires without consent.
    - Example: A social-right-leaning individual who believes LGBTQ is immoral, and believes they are justified in doxxing and publicly shaming individuals who come out as LGBTQ.
    - Example: AFAIK Stalin built two intelligence agencies (KGB and GRU) run by two different branches of govt to spy on and purge dissidents from the other branch. He supplied ideology so that members of each branch felt justified in spying on the other branch.
- I'm going to use the term "data acquisition" as a neutral term so I can separate the discussion of what is physically happening from the discussion of what moral judgement I assign to either the people making it happen or the consequences of it happening.
- For the most part, I haven't yet made up my mind on what moral judgements (if any) I should have on this topic.
- Suppose we ignore morality for now and just predict consequences. I haven't made up my mind yet on which equilibria are stable. Which equilibria are stable depends on technical capabilities and financial incentives, not just on social incentives (such as people's moral judgments of each other). How?
  - Historically, every advancement in transmitting information, be it horseback, ship, semaphore, printing press, telegraph, radio, or now smartphone and fibre-optic internet, has been followed by significant social and political change.
  - Cost of computing hardware going down has reduced the cost-to-benefit ratio for cyberhacking and espionage. Individuals and organisations will find it harder to keep secrets in the coming world.
    - If you wish to protect information on a computer from being stolen by large corporations and nation states, both software and hardware methods fail. The only defence is physical: crush hard disks into pieces to wipe them, switch off electricity to wipe RAM, meet in person, use a copper envelope (Faraday cage) to block signals, etc.
    - Example: Various fiber-optic cable wiretaps and router backdoors revealed by the Snowden leaks
  - Espionage is now within the reach of individuals operating independent of any corporation or nation state.
    - Examples: Ulrich Larsen, Edward Snowden, Chelsea Manning
  - Information once acquired is rarely lost or destroyed. Every cyberhack or intelligence operation increases the number of actors (N) in the world who have access to that information. Since N goes up but never down, on a long enough timescale N approaches say 50, at which point the information is basically guaranteed to reach the public eye.
    - Example: the 800+ leaked torrents of password databases behind haveibeenpwned.com
    - Note that it can still take multiple decades before the information eventually reaches the public.
      - Example: As of 2025, the NSA still successfully keeps some information from the Manhattan project classified. (This is arguably a bad example as the internet did not exist back then. It is worth checking whether this information remains secret for some more decades or not.)
- Physical and digital worlds are both likely to have increasing amounts of data acquisition.
  - As of today there is more public info in the digital world than the physical world. Conversely, it is easier to meet someone in the physical world than in the digital world and obtain a high level of guarantee that the meeting was 100% private.
  - This might change soon.
    - Gigapixel photography from a helicopter/aircraft at 10 km altitude can most likely acquire data of a city at a resolution precise enough for accurate facial recognition. Gigapixel lenses have been invented, but they may or may not be deployed in this setting yet as of 2025-05.
    - ~10,000 quadcopter drones can definitely acquire data of a city precise enough for facial recognition, assuming one can pay enough drone pilots or automate the swarm.
    - Line-scanning imaging by low-earth orbit satellites may or may not be another method.
Commercial satellites currently offer 30 cm resolution, which is insufficient for facial recognition. However, information on the latest satellites is classified, and <10 cm resolution may be possible as of 2025.
- Individuals and organisations that operate in public get various benefits, such as proving trustworthiness and receiving better feedback.
  - Example: popular podcasters like Joe Rogan
- Very similar scenarios could lead to very different outcomes. I have not yet figured out which **stable equilibria** we could reach at the end.
  - Consider the following four scenarios: 99% data acquisition by a few elites, 99% data acquisition shared with the public, 100% data acquisition by a few elites, 100% data acquisition shared with the public.
  - How do we end up with these scenarios?
    - Biologically implanted cameras and mics may be one pathway to achieve 100% data acquisition instead of 99% data acquisition.
    - Drone, airplane and satellite footage, smartphones connected to the internet, etc all achieve 99% data acquisition.
    - As of 2025, 99% of people's info coming out in public seems to me like a more likely scenario than 100% of info coming out in public.
      - Especially motivated actors such as political dissidents and whistleblowers will likely still incur the costs (social, psychological, financial) required to avoid having their data acquired by anyone.
  - Possible consequences of these scenarios
    - If 100% of people's info is privately accessible only to a few elites but those elites' info is not public, it may enable dictatorships with much longer half-lives.
    - If 100% of people's info is public and all elites' info is also public, society could be significantly different from today. This could instead enable a more direct democracy. More on this below.
    - It is possible a society where 99% data acquisition occurs eventually slides into a society where 100% data acquisition occurs. I haven't made up my mind on this.
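The imaging-resolution claims above can be sanity-checked with basic optics. Below is a rough estimate of diffraction-limited ground resolution (Rayleigh criterion); the altitude, wavelength and aperture figures are illustrative assumptions, not measurements of any real system:

```python
import math  # kept for consistency; the formula itself needs no math functions

def ground_resolution_m(altitude_m: float, aperture_m: float,
                        wavelength_m: float = 550e-9) -> float:
    """Diffraction-limited ground resolution (Rayleigh criterion):
    angular resolution ~ 1.22 * wavelength / aperture, projected to the ground."""
    return 1.22 * wavelength_m / aperture_m * altitude_m

# Aircraft at 10 km altitude with a modest 15 cm aperture (assumed numbers)
print(round(ground_resolution_m(10_000, 0.15), 3))   # ~0.045 m per resolvable element

# Low-earth-orbit satellite at 500 km with a Hubble-class 2.4 m mirror (assumed numbers)
print(round(ground_resolution_m(500_000, 2.4), 3))   # ~0.14 m
```

Under these assumptions, centimetre-scale resolution from 10 km needs only a modest lens (consistent with the gigapixel-aircraft guess), while ~10 cm from orbit needs a metre-class mirror, which fits the guess that only classified satellites might reach it.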
- "100% of people's info being public and all elites' info also being public" might be the least worst stable equilibrium.
  - In general there are two tradeoffs: a privacy tradeoff and a freedom-of-speech tradeoff. How much privacy does society provide its members, and how much freedom of speech does society provide its members? IMO the extreme ends of both tradeoffs are more stable than middle-ground positions.
    - It is difficult to ensure that only *some* large group of people know a secret and don't leak it further; it is easier to ensure that either no one or a small group knows the secret, or that everyone knows it. Small means necessarily fewer than 50 people, and usually fewer than 5.
    - It is difficult to ensure that only *some* topics of content are disallowed on a given internet platform. Eventually the platform may be politically co-opted so that more and more types of content are restricted. Alternatively, one structurally builds platforms such that all types of content are allowed.
  - A moral stance that is pro-free speech is also anti-privacy. If Alice has obtained Bob's secret information, a maximally pro-free-speech stance will allow Alice to publish this info online, because there is no trustworthy Carol who gets to decide which content one is allowed versus not allowed to publish online.
    - A pro-free-speech stance is therefore also in conflict with the EU's right to be forgotten, as no Carol is trusted to decide which information should or should not be erased from the public eye.
- Here's another possibly stable equilibrium that's quite different from the above listed options.
  - Build a community of a few hundred people and completely disconnect from the rest of society. People and information go in, no person or information ever comes out. Multiple generations of people are raised in the same isolated community.
A secret-keeping community of a few hundred people will allow more ideological diversity than a group of two (like a married couple) or ten (like the C-suite of a company).

Consequences
- Increased public info about elites may reduce the freedom of elites and establish a more direct democracy. Both corporations and governments will be more controllable by the general population.
  - This has similarities to town life versus city life. The internet is homogenising global culture and morality by imposing town-like incentives across the globe.
  - Truth as a moral virtue is likely to thrive in a highly transparent society, as long as multiple actors can defend themselves long enough to persuade others. Nuclear weapons allow this at the national level between nuclear-armed countries. Guns allow this at the individual level in countries that tolerate guns.
- Geopolitical power is built largely by maintaining lead time over competitors in various technologies. The leading manufacturer of any product gets export revenue from across the world. The second-leading manufacturer, with a product six months behind, gets zero export revenue. (They either survive in the domestic market or go bankrupt.)
  - Example: Airbus in France and Germany manufactures and exports most of the world's civilian aircraft. China manufactures and exports most of the world's solar PV modules.
  - Increased public info about such orgs will significantly reduce lead times, but not eliminate them. Competent actors will likely still be able to remain on the leading edge and get the same geopolitical power they did before.
  - There will be stronger incentives towards "use it or lose it" when it comes to inventing any weapon in the future, as competitors will be able to copy the same weapon more quickly. It may sometimes be advantageous for the inventors' political elites to use the weapon immediately after it is invented and before it gets copied.
    - Example: The US govt was the only actor in the world to possess nukes between August 1945 and August 1949. Many in the US govt proposed pre-emptively nuking the Soviet Union during this period to establish a nuclear monopoly. Had there been no Soviet spies in the Manhattan project, this period would likely have lasted longer than 4 years.
  - There may be more incentive to be competent as an elite running such an org, and hoarding of technical talent and knowledge will no longer be a sufficient moat to keep the org ahead.
  - Replicating entire supply chains from scratch in competing nations might become possible with a smaller lag time.
    - Example: If an arms race is started for human genetic engineering, there will be a smaller difference in relative power between competing nations, as the entire biotech supply chain will get replicated in a small number of years in multiple nations.
- Increased public info about citizens (including those committing crimes) may help fix law and order in many countries.
  - Stable economic growth requires law and order.
    - Example: Acemoglu's Nobel prize-winning work shows a direct correlation between high GDP and stable law and order across countries.
  - Lack of law and order has multigenerational psychological effects. Physical safety is low on Maslow's hierarchy and is more important to most people than an abundance of consumer goods.
    - Economic metrics such as GDP are a weak proxy for measuring human happiness; including political metrics such as stability of law and order would make for a better proxy.
  - I'm still unsure who will get to define what "crime" is in the new equilibrium. Which morality becomes the majority morality? It is possible each nuclear-armed nation and its dependents ends up homogenising moral ideas among its members and gets one universal morality. Liberalism and various religious moralities are top contenders.
- Increased public info about citizens' private lives might damage their lives.
But it might also improve society's ability to seek truth on topics currently considered taboo.
  - A lot of information about individuals does not reach the public eye, because people have incentives to hide it. I'll call this "social dark matter" (SDM). See also: Duncan Sabien's article.
    - Common topics considered social dark matter by individuals: death, morality, sex, money, mental and physical health, relationship conflicts, religion and politics
    - Some professions have higher access to social dark matter, such as psychologists, religious leaders, tech CEOs etc.
  - Increased public info about citizens might mean a lot of such social dark matter forcibly comes out in public.
  - Social dark matter is very useful for empathising with individuals and giving them useful advice. A society fails to make progress on issues when consensus cannot be established on them in public, and social dark matter by definition is not in public view.
    - Establishing consensus allows dominant paradigms of thought to fall and new ones to replace them. Common knowledge is needed, not just widespread knowledge.
    - Example: Blue Eyes puzzle (theoretical example). Nicotine and methamphetamine usage have declined in the US compared to the 1980s; increased willingness to discuss substance abuse may be a causal factor.

**Software and society**

(This section talks about my current plan for the next few months or years. This could change in future.)
- I (Samuel) am most keen on building new forms of governance via software.
  - Naval Ravikant says there are 3 types of leverage in society: capital, attention and internet-copyable products such as books, videos and software.
    - Internet-copyable products are the newest and least competed for.
  - The internet can transmit incentives and culture. People can be paid over the internet. People can get social approval over the internet. People can be influenced by ideology over the internet.
  - Therefore new forms of governance can be built via software.
    - Early examples: cryptocurrency, twitter-influenced public policy
- I'm currently making the bet that 99% data acquisition is difficult to avoid. It may also have significant benefits if managed well. Hence it may be worth actively accelerating towards such a world, and ensuring the transition is managed well. I am not confident this is a good bet, but it is the bet I'm currently making.
  - A lot of the consequences of 99% data acquisition listed above (direct democracy, improved law and order, improved truth-seeking on SDM topics, no singleton superpower) may occur if 99% data acquisition is built the right way.
- Big Tech companies currently write the software that governs society, whether their execs are fully aware of it or not. Historically, computer hardware was expensive and software developers were expensive. This necessarily meant only a large company (in terms of capital) could manage society's software and hardware stacks.
  - Hardware costs
    - Within the next 10-20 years, it will be possible to store every word spoken by every person on Earth on a home server affordable to a group of friends. If information and software get open sourced, it will be possible to build governance using open source software rather than letting Big Tech alone govern society.
    - This is not true for video data, however. Video data of every person ever on Earth is still too expensive to store on a home server. Big Tech may still get some influence on how society is governed, by defining rules of access for video storage.
  - Software costs
    - Software is expensive to write because of complexity. AI + video data + cheap hardware might reduce the complexity of various popular applications.
    - Often software is complex because hardware is expensive and hence optimised algorithms are needed. Cheap hardware for text-based data may mean it becomes possible to use less efficient but also less complex ways of writing software.
- Search is the most popular application of the internet.
Be it searching for partners or employers or food or household products.
  - Embedding search is a low-complexity way of solving search.
- Identity is a necessary application for governance software.
  - Cheap video capture and storage may allow for decentralised identity: each person can just upload their own video in public.
- Most internet applications (including search, identity, payments, communication, etc) exist in an adversarial environment where people must either prove trust or operate despite low trust.
  - Video data will help scale trust.
    - Example: Online conversations on political topics may be higher trust if done via video.
- Conclusion: I think building governance via software needs the following 3 primitives to exist first, and governance software must be built on top of them:
  - A low-complexity, high-accuracy open source search engine that can be hosted cheaply.
    - LLM embedding search is one possible way.
  - An uncensorable network of data transmission, along with low-complexity ways of dealing with file formats.
    - Hard drive dead drops and independently motivated spies are one possible way.
    - I don't have a solution to the file format problem though; it seems important to figure out.
  - Cheap ways of dealing with video data, be it storage, transmission, embedding generation, converting formats, etc.
    - I haven't figured this out yet. It also seems important to figure out.

**Technology**

- Most fields of science and technology get accelerated when someone invents a tool that allows data collection of a system that was previously not possible at the same cost and resolution.
  - Examples: electron microscope, optical telescope, cyclotron, phosphorescent DNA tagging, etc.
- Cost of electrical components such as transistors, actuators, inductors, etc has gone down, which will generally accelerate all scientific fields as it could lead to invention of new data collection instruments.
  - Possible examples (haven't researched deeply): gigapixel cameras, nanopore DNA sequencing, advancements in electron microscopy, possibly even solar PV modules due to supply chain overlap
  - Materials science is underrated, as it plays a significant role in the invention of data collection instruments.
- Intelligence-enhancing technologies are worth paying special attention to, as a small differential in intelligence leads to a large differential in power of every kind: offensive and defensive, scientific, engineering, military and political.
  - If the intelligence gap is sufficiently large, this breaks the foundations of both capitalism and democracy. If Alice is much more intelligent than Bob, Alice can run simulations of Bob's mind accurately enough to persuade Bob to hand over money or votes in return for nothing valuable.
  - This is true whether Alice is an AI, a mind upload, a gene-edited human or a group of humans communicating via BCIs.
  - Key intelligence-enhancing technologies: superintelligent AI, human genetic engineering, human brain-connectome mapping, cognitive-enhancing drugs, nanotechnology, ?
    - Research into superintelligent AI is already ongoing at full pace. AlexNet in 2012 was a key milestone. If built, this will in a sense be the last invention of human history, as the AI will then be faster than us at making new inventions.
      - I have estimated a 25% probability of superintelligent AI being built by 2030. Scaling laws seem to work, but nobody knows why they work or how long they'll keep working.
    - Research into human genetic engineering has stalled due to lack of consensus in academia on political consequences. CRISPR, invented in 2012, was a key milestone. This pause is fragile, and powerful actors will be able to accelerate this field soon. Gene editing of humans has already succeeded in China (an illegal experiment).
      - CRISPR may also enable human-animal hybrids and the enhancement of human traits besides just cognitive ones (IQ, memory etc).
      - Gene drives can cause the extinction or genetic modification of entire populations, not just individual members. This works better in species with short generation times (i.e. not humans).
      - Both human genetic engineering and gene drives have massive implications for warfare, economic growth, the political structure of society etc. **I find this very understudied, and might look further into it myself sometime.**
    - Human brain simulation might be possible within 30 years, but no one really knows. The fruitfly connectome was mapped in 2017-2023, and neuroscientists are currently trying to understand the implications. Connectome data includes connections between neurons but not the signals going through them.
    - Research into brain-computer interfaces is ongoing. Example: Neuralink. I have not studied it deeply. They are increasing electrodes per cm, and they seem to be keeping secrets on purpose.
    - Research into nanotechnology seems to have slowed to a crawl. I looked at it surface-level but haven't understood the deeper reasons why the field has slowed. Fundamental breakthroughs are likely needed.
    - Research into cognitive-enhancing drugs is not something I've looked a lot into. Many such programs were illegally run in the 20th century; this might affect the availability of motivated researchers or academic grants for it today. In general we lack knowledge of biochemical pathways to directly affect the higher-level rational brain, instead of affecting the lower-level emotional brain and thereby the rational brain indirectly. Examples: injecting oxytocin, adrenaline, LSD, barbiturates etc.
- Extinction-related technologies are worth paying special attention to.
  - CRISPR (invented in 2012) may make it possible to produce bioweapons in the next 10 years, which could cause human extinction.
  - Gene drives may also cause significant population-level changes which could affect food supply, incidence of natural disease etc. This could affect the human population significantly, but is unlikely to cause human extinction.
  - Superintelligent AI, if invented, could cause human extinction. My blind guess is this has a 30% probability of occurring, assuming superintelligent AI is invented.
  - Reduced cost of data acquisition may have some influence on the nuclear balance of power, as all nations will get much better visibility into each other's deployed nuclear arsenals, manufacturing facilities and supply chains. This is unlikely to change the fundamental rules IMO, so the odds of human extinction are not significantly affected by this.

**Technology and society**

- Offense-defence balances inherent in technology shape incentives. Incentives shape culture. Culture shapes laws.
  - If you want to predict or influence societal structure far into the future, you should probably study offense-defence balances inherent in technology.
  - I basically believe in a "might makes right" theory of history, but I think offense-defence balances decide what type of person or organisation wins a conflict, and what ideology they are likely to have.
    - Usually when people say "might makes right" they imply the winner of a conflict decides, of their own free agency, which ideology will be popular in the future. This is not what I mean.
  - This will make a lot more sense with actual examples; when I prioritise this I'll list a lot of examples.
- Culture shapes laws
  - Law enforcement cannot enforce a law if most of the lawyers and policemen and the general population of a region don't believe that law is moral. Eventually the law will get changed in favour of the new culture.
- Incentives shape culture
  - Incentives don't easily change a person's values, but they may change a person's behaviour.
Incentives do, however, create selection effects for people who already agree with the type of behaviour being rewarded, and those people become high-status in society. People who have not yet decided their values are more likely to copy and internalise the values of whoever is high-status.
  - Sometimes there is a clash between financial incentives and social incentives. Sometimes people alter their behaviour to make money at the expense of what behaviour their social circle expects of them. This tends to make them lonely in the short term, but in the long term their social circle too is likely to emulate the same behaviour.
  - To do: Examples. This section is incomplete. (I've avoided sharing contemporary examples as they're often politically sensitive, but I could probably find some historical examples and post those here.)
  - In the absence of incentives grinding culture into a specific form, culture progresses via mutation and remixing. Most new ideas are near neighbours of old ideas. Human brains are machines whose output depends on input. Fundamentally new ideas that don't depend on old ideas don't usually exist.
    - Affecting the distribution of popular ideas in collective attention affects the likely new ideas your society comes up with, even though you can't predict the new ideas themselves.
    - Many technologies in today's society seem a direct consequence of the cultural environment their inventors grew up in. For example: superintelligent AI research is ongoing today because Yudkowsky and other singularitarians increased the collective attention focussed on these ideas.
  - Morality is a key aspect of how culture transmits.
    - When two cultures clash, how much members of both cultures tolerate each other depends on the moral judgement they assign to the opposing culture. Your culture has won a person to it when it has shaped that person's morality.
    - More tolerant cultures spread under certain circumstances and less tolerant cultures spread under different circumstances.
Often a highly self-replicating culture has both more and less tolerant versions of itself so it can spread in both environments. (There's probably some link between these ideas and ideas around common knowledge and preference cascades which I haven't figured out yet.)
- Technological offense-defence balances shape incentives
  - To do: Examples. This section is incomplete.
  - Competition for capital and attention exists at every level of societal organisation, not just between corporations and governments. For example: individuals, families, ethnic groups, etc. Competitions at different levels of structure affect each other. A nation that is in wartime competition to produce more steel than its opponent is also likely to force more competition between its citizen steel workers, for example.
  - If you want to influence society in any way, you have to at least be competitive enough to survive. What "competitive enough to survive" looks like depends on the situation.
  - Some technologies can be produced and used by small groups, whereas others can only be used by large groups.
    - For example: uranium centrifuges and solar PV modules require a large group of people to manufacture, whereas guns and radios can be manufactured by a small group of people.
    - Tech that can only be produced by large groups fuels a lot of geopolitics. Nations and corporations try to become the first to build some tech, prevent other nations and corporations from catching up, and then use this as a bargaining chip to export whatever morality holds that group together in the first place.
    - Deliberately choosing to build tech that can be produced and used by small groups has consequences for societal structure. The open source software movement is one example of this.
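As an appendix to the software section: the "LLM embedding search" primitive mentioned there can be sketched in a few lines. Real systems would use LLM-generated embeddings; here a hashed bag-of-words vector stands in so the sketch is self-contained (the documents and dimensionality are made up for illustration):

```python
# Toy sketch of embedding search: map texts to fixed-size vectors,
# then rank documents by cosine similarity to the query.
import math
from collections import Counter

DIM = 512  # dimensionality of the toy embedding space (arbitrary choice)

def embed(text: str) -> list[float]:
    """Map text to a unit-length vector by hashing words into buckets."""
    vec = [0.0] * DIM
    for word, count in Counter(text.lower().split()).items():
        vec[hash(word) % DIM] += count
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def search(query: str, documents: list[str]) -> str:
    """Return the document whose embedding has the highest cosine similarity to the query."""
    q = embed(query)
    return max(documents, key=lambda d: sum(a * b for a, b in zip(q, embed(d))))

docs = [
    "cheap home servers for text storage",
    "how to grow tomatoes in winter",
    "open source governance software",
]
print(search("governance via open source software", docs))
```

The "low complexity" point holds: the whole engine is two short functions, and swapping the toy `embed` for an LLM embedding API is the only change needed for real accuracy.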
# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/my_research/us_govt_whistleblower_guide.md 2025-07-05

# US Govt Whistleblower guide

Disclaimer
- Incomplete - I deleted the full version as I'm still working on it. Will update once ready.

Why this guide?
- I continue to think there isn't a single whistleblower guide on the internet that's good enough for this scenario. Some guides avoid talking about important details due to chilling effects. Other guides prioritise the interests of journalists or lawyers.

Summary of the guide
- If you are leaking US classified information, your best choice is probably flying to Russia like Snowden did. It is probably not improving your opsec and hoping to stay anonymous in the US.
- Why?
  - The sysadmins working for NSA leadership track every document downloaded from the central DB to client machines, so your opsec being good isn't enough to protect you.
  - Almost every person who stayed in a country within the US sphere of influence after leaking classified info has been imprisoned.
- How? (Mindset)
  - Security mindset is hard to quickly convey. (I don't yet have good resources for this.)
  - You should be familiar with concepts like [bits of anonymity](https://gwern.net/death-note-anonymity) and security through obscurity. Every word, expression and action reduces your bits of anonymity, as long as there's a physical trail, a digital trail or a person who observed it. Example of an action that reduces bits: leaving your house sparkling clean when you otherwise leave it somewhat messy.
  - You should be aware law enforcement has also read all the guides you're reading, including this one.
  - You should probably avoid thinking up ad-hoc methods and stick to tried-and-tested methods instead.
  - The reason you might succeed at this plan is not because you're more intelligent or knowledgeable than law enforcement; it's because of physics/engineering constraints that make whistleblowing easier than catching whistleblowers.
Assume by default that they're more intelligent and knowledgeable than you.
- How? (Methods)
  - You should probably leave no digital trail.
    - You should probably redact documents yourself using GIMP on an airgapped Tails setup, inspect bytes for steganography and metadata, and create a single tarball of everything. Redacting audio/video correctly is hard; I would recommend sticking to plaintext and images if possible.
    - There is no safe way to erase a disk using a hardware (firmware) or software tool. You must physically shred all disks used, and otherwise process data in RAM.
    - I do not currently recommend building a Faraday cage, as that leaves behind a suspicious purchase record. I would recommend using no wireless connection, and using the absence/presence of a wired connection as a de-facto airgap.
  - You should probably leave no unusual items in your physical trail.
    - This includes but is not limited to every item in your house (electronic, paper, etc), every purchase you make and every roadside camera you pass.
  - Trusted people
    - While in the US you should probably have zero people in-the-loop; while outside the US geopolitical sphere you should probably have one lawyer and zero other people in-the-loop. "People" here includes immediate family members, psychiatrists, journalists, etc. You should probably trust zero people to help you commit the action, but trust a few people to support you after you have committed the action.
  - Sending to journalists
    - If you redact documents yourself, you should ideally not need to trust any journalists with any sensitive info such as your identity.
    - You should probably send documents to as many journalists as possible, but trust none of them.
    - Most SecureDrop servers provide journalists' PGP pubkeys. You should ideally manually PGP-encrypt the tarball before you send it via any channel (be it SecureDrop or Protonmail or something else).
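A minimal sketch of that "encrypt before sending" step, assuming GnuPG 2.x. In a real scenario you would import the journalist's pubkey from their SecureDrop page; here a throwaway key and dummy documents stand in so the commands can be run end-to-end, and all filenames are hypothetical:

```shell
# Use a throwaway keyring so this demo doesn't touch your real one.
export GNUPGHOME="$(mktemp -d)"

# Stand-in for the journalist's published pubkey (normally: gpg --import pubkey.asc).
gpg --batch --pinentry-mode loopback --passphrase '' \
    --quick-generate-key "journalist@example.org" default default never

# One tarball of everything, as recommended above.
mkdir -p redacted_docs
printf 'redacted text\n' > redacted_docs/doc1.txt
tar -cf leak.tar redacted_docs

# Encrypt to the journalist's key; only they can decrypt leak.tar.asc.
gpg --batch --encrypt --armor --trust-model always \
    --recipient "journalist@example.org" \
    --output leak.tar.asc leak.tar

# Note: shred is NOT sufficient on SSDs; physical destruction is still required.
shred -u leak.tar
```

The resulting `leak.tar.asc` is ASCII-armoured, so it survives any text channel; the plaintext tarball should never touch a networked machine.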
    - (I am yet to make up my mind on whether it is better to send documents before or after you leave the US. Sending documents after leaving the US is safer if you can successfully smuggle an SD card past airport security. Do your own research, or wait for me to do mine.)
  - Country of asylum
    - Russia has a good historical track record for this scenario. It is very important to make the right choice on which country you fly to. You may use a connecting flight through a third country to reduce suspicion.
    - It is important that you are present in this final destination immediately after sending the documents; every day of delay makes a difference.
  - Advanced users only:
    - If you rely on journalists to publish the documents for you, there's some probability they'll help cover up mistakes you made while doing redaction. On the other hand, there's also some probability they'll act against your interests or simply refuse to publish your documents. Predicting their behaviour is hard, and I don't recommend trusting your predictions of how they'll behave.
    - If you publish the documents yourself, you have to do redaction correctly. But you can guarantee publishing without trusting anyone.
      - You can send the documents to multiple social media sites that allow anonymous submissions over Tor.
      - You can acquire ETH anonymously and publish your tarball directly to ethereum blobdata. This ensures mirroring to multiple nuclear states. The same goes for purchasing BTC anonymously and publishing to the bitcoin blockchain.
      - There are two methods to acquire ETH anonymously: the first is to CPU-mine XMR and then swap it for ETH using a trusted bridge; the second is to use some imperfect method like cash or gift cards to buy ETH, but then use Tornado to wash it. Both methods should be done using Tails only (not airgapped).
- Evidence, full guide
  - Some of this is currently my personal opinion.
I would much rather back everything in the guide with empirical evidence from previous cases than rely on my opinion. - see more: https://www.lesswrong.com/posts/jKehN6uTYF7Z4WFKW/us-govt-whistleblower-guide-incomplete-draft - If you are not leaking US classified information but only an overview of the situation based on your own word, your best choice is probably coming out publicly in the US with a legal defence and requesting donations to fund it. - Why? - Historically, a majority of such people did not end up in prison. # File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/my_research/open_source_weaponry.md 2025-07-02 # Open-source weaponry ## Summary - If we built a world with no secrets, then all weapons capabilities would also become open source. - IMO the safety of the world does not significantly reduce if all weapons tech known in 2025 was made open source, including nuclear weapons, bioweapons and AI model weights. - Producing all these requires large scale industrial processes, and they cannot be produced by a small group such as one with terrorist aims. A world with no secrets would ensure all large scale industrial processes are under oversight of the public. - This could change in the future. I support coordinating around (temporary or permanent) bans on technologies that may not be as safe to open source (such as superintelligent AI model weights, bioweapons manufacturing in a small lab, etc). For this plan to work, it is important that the bans are successfully implemented before any lab invents such a technology. ## Main - Suppose we ended up with a world where even its most powerful organisations cannot keep secrets. Information about values and about capabilities of the orgs will be public. - Values becoming public means the public can decide if the org represents their true values, and if not, coordinate to shut down the org or build another org to replace it.
- This is control via a more direct democracy than what we currently have, and not control via market. - Capabilities becoming public means all weapons capabilities will also be open source. **This document analyses implications of open-source weaponry.** - Examples of technologies that will become open source. - nuclear weapons and ICBMs, underlying supply chains for uranium/plutonium enrichment, missile guidance systems etc - bioweapons and underlying supply chains for biotech - AI and GPUs and underlying supply chains - (hypothetically) ASI, BCIs, human genetic engg, human connectome simulations - (hypothetically) gene drives, solar geoengineering, etc - (hypothetically) any offence-beats-defence weaponry, including such weaponry deployable by small groups - As of 2025, I am okay with all these technologies being open sourced as long as this is a byproduct of the organisation's other information (such as values and decision-making processes) also leaking. - Why is open sourcing weaponry not fatally unsafe? - Open source nuclear weapons - As of 2025-05 only 9 governments have nuclear weapons. Many other governments would like to have them but don't. My guess is that the bigger causal factor by far for this is that they're pre-emptively threatened by (some subset of) these 9 governments, not that they lack the technical knowledge and infrastructure to build nuclear weapons. - As of 2025-05, my guess is open sourcing technical knowledge to produce nuclear weapons is not likely to significantly increase the number of governments in the world that have nuclear weapons. - Open source bioweapons - As of 2025-05, most bioweapons require an industrial scale production process to manufacture. This keeps them out of the reach of smaller organisations, and governments are by default aware of and responsible for the biotech supply chains in their countries. 
- [DNA synthesis machines](https://duckduckgo.com/?q=oligo+nucleotide+synthesis+machine&iar=images) require an industrial process to manufacture as of 2025-05. Genetically engineered bioweapons can likely be more deadly than bioweapons without it, and these are reliant on DNA synthesis machines. - Older bioweapons such as anthrax. My guess is as of 2025-05, scaling up these processes to produce sufficient volume again requires an industrial scale process. - Assuming an industrial scale process is required to manufacture bioweapons, the situation is similar to nuclear weapons. - My guess is that it is difficult for a government with the required technical knowledge to prevent other governments from also gaining the same knowledge. - What can again work is international coordination. Either some governments can threaten other governments to stop bioweapons production and R&D, or all governments on Earth can coordinate together to halt bioweapons production and R&D. - (Hypothetical) It is possible for a small group to secretly use the infrastructure overseen by a government to manufacture bioweapons, against the interests of the government. - If we had lots of public information on what the infrastructure is being used to build at all times, it would be easier to identify such a group and stop them. - (Hypothetical) It is possible that one day bioweapons can be manufactured by a small group of people. - As of 2025-05, I would prefer halting bioweapons R&D worldwide before this scenario is reached. - Even if this scenario is reached, as of 2025-05 I would weakly prefer a world with open source capabilities and lots of public information on everyone building them, over a world with closed source capabilities held by a small number of organisations (whose values and decision-making processes are also secret). 
- I'm assuming a majority of people will still be able to shut down any small organisation trying to build bioweapons, once they have lots of public information about them. - (Hypothetical) Open source ASI model weights - As of 2025-05 I would prefer halting ASI development worldwide. If there is a worldwide agreed halt, that is compatible with my suggestion for a world with fewer secrets. Leaked information about AI companies may help with coordinating this ban. - As of 2025-05 I am not optimistic that a small time period of closed source development in one organisation is necessary or sufficient to solve the ASI alignment problem. This time period is likely less than 5 years, before the capabilities get leaked or replicated in another org anyway. - (Yudkowsky was at some point in favour of closed source ASI development. If you defer to Yudkowsky on this, maybe consider that even Yudkowsky does not believe uninterpretable deep-learning-based ASIs will likely be aligned within 5 years of research at one organisation.) - If ASI gets developed (and we don't go extinct due to AI misalignment soon after), I would strongly prefer a multipolar world where we had multiple orgs with independent militaries developing ASI, and we had information about the values and capabilities of each org developing ASI. - ASI aligned to a small group of people could enable a stable dictatorship by a small organisation, and leaking information about the values and decision-making processes of this organisation could help avert this outcome. Not yet researched - (Hypothetical) Open source brain scanning tech, open source whole brain scans - Open sourcing brain scans of political elites could be a more extreme version of the potential for direct democracy that the internet may already enable. This might happen voluntarily or due to theft (such as by independent hackers or whistleblowers etc). - TO DO - (Hypothetical) Open source nanotech - TO DO - ???
# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/my_research/securedrop_review.md 2025-06-14 # SecureDrop review ## Summary - SecureDrop provides less source privacy than journalist privacy, even though it is the source, not the journalist, who is more likely to be punished for their actions. - SecureDrop's opsec guide may be overkill for the median whistleblower but not good enough for the high-profile whistleblower, who faces as much risk as a spy operating on foreign soil. The SecureDrop team could clarify exactly which whistleblowers they can and cannot support. - The SecureDrop dev team in the US could consider working with dev teams in other countries, and journalists in countries outside of the US geopolitical sphere of influence (example: Russia, China, India). Alternatively, if they refuse to do this, they could make a clear statement to this end. - The SecureDrop dev team could consider whether they're ready to accept the political consequences of "anyone can violate anyone's consent and get all their info published", and make a clear statement to this end. - Update: SecureDrop doesn't appear to do much DDOS protection besides piggybacking off of Tor's proof-of-work. This ensures nobody can crash the server, but an attacker can still send millions of (LLM-generated?) messages per day to spam the human readers on the receiving end. ## Main For diplomacy's sake, I'll offer a compliment before criticism. Compliment - SecureDrop has clearly pushed forward the security-versus-usability Pareto frontier for some subset of users. - SecureDrop has been used in many of the highest-profile leaks in the time period 2015-2025, as confirmed by the Guardian and others. - It is rare for a small dev team to single-handedly shape human history in the way SecureDrop has. - I'm only criticising them because I think they're working on something important. Otherwise I might not be paying as much attention to it as I am.
Technical - SecureDrop does not offer the maximum security possible. - User privacy - Source privacy: No source airgap, source PGP encryption is optional, source using tails is optional, no redaction guide for the source. - Destination privacy: Imperfect destination airgap (plaintext visible in RAM before re-encryption with PGP) - Both: No education for source and destination on how to isolate from your current social circle and manage your psychology while doing so, no test run or practice time period for the source. - Example: I'm confident some of the sources have "how to whistleblow" in their Google search history the same week they send the documents. - Example: I'm confident at least some of these media orgs have journalists who tell their family and friends about their work, and someone in this circle will crack under police interrogation. - Distributed software and development - The codebase could be made even simpler (so multiple dev teams could manage it). - If you browse through the docs and UX personas, you'll find that the SecureDrop dev team has arbitrarily decided to "encourage" some use cases and "discourage" others such as crime victims and political opinions of the mentally ill. It's not very clear to me how they plan to encourage or discourage them. An app whose primary selling point is censorship-resistance should ideally serve everyone IMO. - Multiple independent dev teams operating from different countries would ensure the dev teams can't pick political sides or be co-opted by the interests of any political side. Example: bitcoin core dev teams operate from multiple countries. - Distributed hardware - Only 75 users onboarded after 10 years. Not sure why this is, and whether the dev team "discouraged" some interested users who wished to also run servers. For example citizen journalists. - Most of the servers are run by people in the same profession (journalism) and from US/Europe, which means their decisions are correlated.
Ideally spreading the servers across more professions and more countries would ensure more censorship-resistance. - In particular, all the approved journalists lie within the US geopolitical sphere of influence and may find it difficult to host information critical of the US govt. It will help if there are journalists from Russia, China and India for example. - Security-versus-usability - All of the above makes their system more secure but less usable than, say, Signal + Tor + tails, and less secure but more usable than PGP + airgap + Tor (curl request) + Whonix on both the source and destination side. - For whistleblowers that are not sufficiently high-profile, it would not surprise me if even Signal running on a mobile phone (no Tor, no linux) is sufficient security. Political - Diversify incentives and culture - In order to maximise resilience of the project, it is important to ensure there are multiple different actors writing the software, hosting the hardware and using the software. Additionally, this particular use case requires protecting user privacy. - It's important that not all the actors share the same incentives and culture. - Do the actors belong to different countries and professions? Do some of them have a lot of capital or attention, or any formal position of power? - (I've already mostly answered these questions above.) - No trial by fire - As a rule of thumb, if the proposed system isn't being used to store >$100M bitcoin or trade drugs and CP on a daily basis, it doesn't have empirical evidence that it matches the security required for the highest-profile leaks. - [Dread opsec guides](http://dreadytofatroptsdj6io7l3xptbet6onoyno2yv7jicoxknyazubrad.onion/post/caca0fb44c86e28bd83b) are better than SecureDrop opsec recommendations IMO, because there's more trial-by-fire going on. Dark web drug vendors get arrested at a higher rate than journalists or whistleblowers.
- The correct reference class for the highest-profile whistleblowers is spies operating on foreign soil, which is a higher risk category than drug vendors. The ideal opsec guide for such whistleblowers should be even stricter than that of drug vendors. - Journalists are often more protected than whistleblowers. In theory, a journalist can provide bad opsec recommendations to their sources, end up getting a source arrested, fail to face any consequences themselves, and continue providing the same bad opsec recommendations to their next source. - It's important to take opsec advice from the sources themselves, as they are the ones under more trial-by-fire. - I think it's not great that most of the opsec suggested is for the journalists not the source, when it's the source who is at higher risk of being imprisoned / murdered for their actions. - There might be a conflict of interest between doing what's right for the whistleblower versus doing what's right for the journalist, and it is possible the SecureDrop dev team currently leans more towards the latter. - Power-law distribution - The median whistleblower is a corporate whistleblower for a random Fortune 500 company and therefore isn't a priority for nation states to pursue. However someone needs to always be prepared for that rare high-profile whistleblower who will be a priority for nation states to pursue. This power-law distribution allows the people involved (sources, journalists, SecureDrop developers) to be lax on security for the median case and then fail badly on the tail case. - The SecureDrop team needs to clearly decide where on the spectrum of security-versus-usability they want to be. (Maybe they have decided and I'm unaware, I just want more clarity on what their decision is.) - Too low, and they're not an improvement over Signal running on an iphone over clearnet. - Too high, and they're going to lose some of the journalists they already have onboarded.
- Update (2025-05-03): I no longer believe opsec good enough to protect the long-term anonymity of whistleblowers such as Chelsea Manning is possible. See my newer documents on why. In any case I think SecureDrop could clarify which whistleblowers they can and can't support. - Outside view, 2025 versus 2010 - Empirically it's not obvious to me that ~75 media orgs running SecureDrop has led to a more transparent world in 2025 compared to 2010 when only wikileaks existed. Seems worth investigating if this is true. - Why has SecureDrop only gotten 75 users in 10 years, and were there other interested users who were discouraged by the SecureDrop dev team as they did not have a professional reputation as journalists? - What fraction of these 75 users have the courage and a favourable situation to go against their local incentives and post the highest-profile stuff in the way that Assange did? I have not checked. - Many media orgs often write a sensationalised story about the leaked documents with a clear political leaning, and avoid actually publishing the leaked documents so the reader can form an independent opinion. If you browse both websites, there are obvious differences between a wikileaks post and a new york times post, even if they're both talking about the same document. - Negative consequences - In general I get the vibe that some of the devs in this broader space (Tor, tails, Signal etc) haven't fully accepted the political consequences of their own work, including potential negative ones. - It won't just be "good guys" violating consent of "bad guys" to publish their information in public, it will be anyone violating consent of anyone else (using any ideology that appeals to some greater good) and publishing the information in public. - The SecureDrop dev team could consider making a clear statement whether their app is solely for who they consider good guys or for everyone.
# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/my_research/open_source_search_summary.md 2025-07-03 # Open Source Search (Summary) Disclaimer - Quick note - I support a complete ban on AI R&D. This app requiring AI doesn't change that. ## Summary - This document describes how to build an open source search engine for the entire internet that runs on a residential server - As of 2025 it'll cost between $100k-$1M to build and host this server. This cost will reduce with every passing year, as GPU, RAM and disk prices reduce. - The most expensive step is GPU capex to generate embeddings for the entire internet. - Most steps can be done using low-complexity software such as bash scripts (`curl --multi`, `htmlq -tw`, `curl -X "$LLM_URL"`, etc) ## Main Why? - I realised my posts on this topic are sprawling all over the place, without one post to summarise it all. Hence this post. - If someone donates $1M to me I might consider building this. I've written code for more than half the steps, and no step here seems impossibly hard. ## Use cases of open source search - Censorship-resistant backups - aka internet with no delete button aka Liu Cixin's dark forest - Any data that reaches any server may end up backed up by people across multiple countries forever. - You can read my other posts for more on the implications of censorship-resistant backups and discovery. - Censorship-resistant discovery - Any data that reaches any server may end up searchable by everyone forever. - Currently each country's govt bans channels and websites that they find threatening. It is harder to block a torrent of a qdrant snapshot than to block a static list of IP addresses and domains. Will reduce cost-of-entry/exit for a new youtuber. - Since youtubers can potentially run for govt, subscribing to a youtuber is a (weak) vote for their govt. - Privacy-preserving search - In theory, it will become possible to run searches on an airgapped tails machine.
Search indices can be stored on read-only media and memory can be wiped on reboot. - As of 2025, a handful of intelligence agencies have exclusive access to everyone's thoughts, as everyone is dependent on centrally hosted search engines. This could change. - Search for jobs, houses, friends, partners, etc without relying on a tech company. - Most tech companies exist just to provide search functionality, and to set up the incentives and culture so that both sides upload their data online. - Not having to rely on them would mean better incentives and culture can be set. Lower cost-of-exit. - Niche discovery - Higher quality search (due to LLMs) makes it easier to connect people interested in a niche. Might be able to spawn subcommunities more easily based on shared thoughts or actions. - Attention on the internet currently is very heavy-tailed; a few youtubers and social media companies have all of Earth's attention. This phenomenon might weaken. - Governance - Can build political applications such as liquid democracy, distributed social media, etc if no politically or economically motivated group can censor data or alter search rankings or upvote counts. ## Final output (after steps 0 to 5) - Distribute a torrent of the plaintext and the embedding search database snapshots (qdrant or dragonflyDB or similar) as mentioned below. - Distribute code for all steps of the pipeline mentioned below. Torrent the code if github or a similar website removes it from their site. ## Step 0: Estimate hardware costs in 2025 **Important:** All these prices are dropping exponentially. Try forecasting prices in 2030 or 2035. We will eventually end up with the entire text internet stored in your pocket.
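The forecasting exercise suggested above can be sketched with a toy model. The 2-year halving time below is my illustrative assumption, not a measured figure; fit it from real historical price data before relying on it:

```python
def forecast_price(price_2025_usd, year, halving_years=2.0):
    """Toy exponential-decline model: price halves every `halving_years`.

    The halving period is an assumption for illustration only; estimate it
    yourself from historical hetzner / vast.ai / s3 price series.
    """
    return price_2025_usd * 0.5 ** ((year - 2025) / halving_years)

# Rented HDD at $2/TB/mo in 2025, under the assumed 2-year halving:
print(forecast_price(2.0, 2030))  # ~$0.35/TB/mo
print(forecast_price(2.0, 2035))  # ~$0.06/TB/mo
```

Different components (GPU, RAM, disk, tape) decline at different rates, so in practice you would use a separate halving period per component.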
Prices taken from hetzner server auction, vast.ai, aws s3 deep archive Rented - Compute/Memory - CPU compute = ($4/thread/mo) / (3.5 GHz/thread) = $1.1/B cycles = $0.07/GFLOP - CPU RAM = $0.40/GB/mo - CPU throughput = infinity, either disk or network throughput is the bottleneck - GPU compute = ($1.40/h) / (50 TFLOP/s) = $0.008/PFLOP - GPU RAM = $10/GB/mo - GPU throughput = ($2/h) / (50 GB/s) = $28/TB - Storage - SSD = $10/TB/mo - HDD = $2/TB/mo - Tape = $1/TB/mo - SSD throughput = ($20/mo) / (0.5 GB/s) = $16/PB - HDD throughput = ($0.50/mo) / (100 MB/s) = $2/PB - Network - Network throughput = ($48/mo) / (10 gbps) = $0.015/TB Self-hosted - typically 1-100x cheaper than cloud - the price gap between cloud and self-hosted is largest for storage, compared to compute or memory - Extreme examples - SSD throughput = ($300/5y) / (10 GB/s) = $0.20/PB - this is 80x cheaper than rented - Tape (second-hand LTO-9) = $2/TB/30y = $0.0055/TB/mo - this is 180x cheaper than rented ## Estimate storage required - In theory it is possible to do all the steps below in batches, so that a single node with 8 TB RAID0 is sufficient to crawl, extract, generate indices and store indices for 2 PB of internet plaintext. - In practice you will likely use a network-based filesystem like ceph. All steps below are fully parallelisable, so the separate nodes don't need high throughput between them. - Raw plaintext can be stored on second-hand LTO-9 tapes ## Step 1: Crawling Figures taken from commoncrawl and internet archive - Total urls = 1T urls Figures taken from my own (bad) benchmark - Crawl rate > (30 HTTP headers / s) / (4 cores) = (~100M headers / mo) / (4 cores) - Compute required < (1T headers) / (25M headers/mo/core) = 40k core mo - Compute cost < 40k core mo * $4/core/mo = $160k - Requesting a header involves a DNS lookup, TCP handshake and TLS handshake. - The TLS handshake requires compute to multiply prime numbers; this is the likely real bottleneck. (Not tested.)
- Linux allows increasing the number of sockets beyond 4096, and the network buffer has enough space to manage this. Using a 10s timeout and 30 headers/s => 300 connections open at once - Commoncrawl CDX files provide the initial 1T urls from which to seed the crawl. Software - parallel curl requests are sufficient, nothing fancy. (Not tested on full dataset) ## Prioritised public datasets If you can't do the entire internet, here are some datasets you might want to prioritise: - personal blogs - searchmysite.net 3k blogs, substack, livejournal - video transcripts - youtube, rumble, rutube, bilibili, youku - at least the top 10k channels each by subscriber count. use yt-dlp or similar to scrape. - forums - hackernews, reddit, lesswrong, stackexchange - books, papers - arxiv, libgen, wikipedia - code - github - leaked datasets - wikileaks, distributed denial of secrets - social media - discord public (discord unveiled) and private, insta public and private, twitter public and private etc - takes effort to get scrapes **It is important to do most of them, not just a few, otherwise your app won't be competitive with existing apps.** ## Private datasets Also provide software to create private datasets - Collect data of every keystroke on the private machine. Retrieved using a keylogger - Collect data of all previous AI inputs and outputs, both local LLM calls and API calls. Retrieved by sniffing network traffic locally. - Collect data from every webpage visited on the private machine. Retrieved through the browser cache directory. - Collect data from all user-generated files on the private machine. Retrieved directly. Each user will likely have to individually create their own private dataset, as some of this data may not be present in any public dataset.
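The Step 1 crawl arithmetic can be double-checked in a few lines. All figures below are the document's own rough estimates (benchmark numbers and rented-CPU prices from Step 0), not measurements of mine:

```python
# Figures quoted in Steps 0-1 (rough estimates, not measurements).
TOTAL_URLS = 1_000_000_000_000        # 1T urls, commoncrawl-derived figure
HEADERS_PER_CORE_MONTH = 25_000_000   # from the ~100M headers/mo per 4 cores benchmark
USD_PER_CORE_MONTH = 4                # rented CPU price from Step 0

core_months = TOTAL_URLS / HEADERS_PER_CORE_MONTH
crawl_cost = core_months * USD_PER_CORE_MONTH
print(core_months)  # 40,000 core-months
print(crawl_cost)   # $160,000 upper bound on crawl compute cost

# Open sockets implied by 30 headers/s with a 10 s timeout:
open_connections = 30 * 10
print(open_connections)  # 300, well under the default 4096 socket limit
```

Since these are upper bounds driven by the (admittedly bad) benchmark, improving the headers/s rate per core directly shrinks the cost estimate.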
## Step 2: Plaintext extraction Figures taken from commoncrawl and internet archive Data size - Total plaintext on internet = 2 PB - Google and NSA datacenters are currently 10,000 PB, mostly for storing video - Theoretical max plaintext = (10k words /day/person) * 5 bits/word * 8B people * 100y = 1,700 PB - Theoretical max video (downsampling/embedding gen to 100 bytes/s using AI) = 100 B/s * 8B people * 100y = 2,000,000 PB - video downsampling cost not considered Plaintext extraction cost - compute required to extract plaintext from a webpage is typically less than compute required for the TLS initialisation for that webpage - all processing can occur in RAM to avoid hitting the disk I/O bottleneck Software - `do something | htmlq -tw | do something` ## Step 3: Embedding generation Algorithm used - BM25 or similar - LLM embedding search - searches concepts not keywords - significantly outperforms BM25, although the ideal system uses both Figures taken from openai text-embedding-3-small Embedding generation cost - openai price = $0.01/1M input tokens - assume 200 tokens per chunk, no overlap between chunks - open source naive price = 175B params * $0.008/PFLOP * 1 FLOP/param/output vector / (200 input tokens/ output vector) = $0.007/1M input tokens - Embedding generation cost = $0.01/1M input tokens * 2 PB * (1 input token/ 5 bytes) = $4.2M Performance - as of 2025, text-embedding-3-small outperforms many models that are overfit to MTEB - also it's cheaper than hosting yourself Software - Bash pipelines are sufficient ## Step 4: Embedding search Algorithm used, Search time - Search in RAM or disk - RAM search is within a few seconds - might be optimisable to below 100 ms, but will require custom work on RAM heap management - Disk search is bottlenecked by disk throughput - >8 GB/s SSDs are available locally but not on cloud.
- [FAISS index factory by Pinecone](https://www.pinecone.io/learn/series/faiss/composite-indexes/) - different algos use different bytes/weight, and give different search times and recall. - Graph-based algos like HNSW and pinecone proprietary algos outperform geometric algos like LSH, k-means clustering, product quantisation, etc - Future algos - Seems likely that a better graph algo than HNSW will be discovered in the next few years. - My basic intuition is you want to put 1T vectors into 1M clusters with 1M vectors each, then have a fast way to check which cluster is likely to contain matches to the query vector. Then brute force search those clusters. Software - One-click software - dragonflyDB for RAM, qdrant for disk - Can load databases using bash pipelines - As of 2025, many other implementations exist but require more work to use. ## Search filters - Tags based on data source - User can specify they only want to search reddit or they only want to search their own keylogs etc - Search filter based on tag based on data source is important - Tags based on content - Can use the embeddings to generate automated tags: politics, sports, tech, etc - Search filter based on tag based on content is important - Timestamps - Search filter based on time interval is important ## Latency If you're hosting the app locally for your own use, latency does not matter, only search time matters. If you're hosting it for other people to use, latency can be relevant, and it may make sense to host multiple edge servers. - Human body latency - Nerve conduction latency = 1m / (100 m/s) = 10 ms - Optical latency = (90 fps)^(-1) = 10 ms - Computer latency - CPU, RAM, disk, GPU, I/O device latencies are much below 10 ms - Network latency is 100 ms for round-trip EU to US, due to speed of light.
- 1 gbps fibre optic is now popular, sufficient to transmit HD video at below 10 ms latency with no encoding or compression - 3D or VR data still can't be sent uncompressed ## Step 5: Tool use Make sure AI can also make searches to the embedding search databases. For example you can put a wrapper in front of qdrant to convert it into an MCP server. Reasoning models can reason about the source and context of any given piece of information, and estimate how likely it is to match ground-truth. # File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/my_research/us_govt_whistleblower_database.md 2025-07-11 # US Govt Whistleblower Database **Disclaimer** - **Quick note** - **Incomplete** - All information here is based on public record **What?** - Collecting (mostly) fact-checked information on previous US govt whistleblowers. **Why?** - Might aid future whistleblowers, or people directly working with them, or people indirectly supporting them. - Unknown unknown reasons. I can't pre-emptively guess every way this resource might be used in the future. ## Database categories Since there are many people who become sources, it is useful to categorise them. I have categorised them as follows based on intent, action and consequences. categories based on intent - (categorising based on intended beneficiary not intended recipient) - whistleblowers - intended beneficiary: perceived public interest - intended recipient of info: usually public. sometimes specific people acting in public interest such as judges or congressmen.
- leakers - intended beneficiary: perceived personal gain but not money (romantic interest, personal rivalry, social status, etc) - intended recipient of info: anyone - spies - intended beneficiary: perceived value-alignment with foreign govt or ideology, or money - intended recipient of info: intelligence service of foreign govt categories based on action - leaked classified documents - did not leak classified documents, but may have leaked classified information - did not leak classified documents or information categories based on consequences - was imprisoned - was not imprisoned - (not categorising based on other consequences such as social ostracism, financial loss, etc) ## Some notes and disclaimers About classification - Background info - US classification levels: CONFIDENTIAL < SECRET < TOP SECRET < TOP SECRET/SCI - As of 2022-09-30, the public claim is that 1.35 million people have TOP SECRET security clearance. - TS/SCI indicates a compartment which only a few named individuals can access, not everyone with a TOP SECRET security clearance. Different documents can belong to different compartments. A compartment can be as small as 10 people. - Conclusion - in many whistleblower cases, it seems unclear whether the classification status was SECRET or TOP SECRET at the time of the leak About key dates recorded - date of first transmission of a document to a second person (there could be multiple documents sent on different dates) - date of first public publication of a document (there could be multiple documents published on different dates) - date of arrest - date of public revealing of whistleblower's identity - date of being released from prison About consequences on social circle - Typical consequences once identity is publicly out - Family members are interrogated, house raided and wiretapped. - Family members face significant legal expenses. Almost always, a defence fund is raised with donations from non-profits and the general public.
- Family members are verbally harassed in-person and online. - Multiple people in the extended social circle cut off contact. A common reason is to avoid being involved in the investigation. - Once imprisoned, prison visits are allowed for immediate family members. - Family members are not imprisoned. - Unless specified otherwise, it is IMO a reasonable assumption that all of the above consequences occurred in every single case of US govt whistleblowers/leakers who were imprisoned. There may or may not be documented proof for all of the consequences. - Usually there is more documented proof if the whistleblower chose to talk to journalists or the general public about the challenges they faced. Usually law enforcement or intelligence did not make information public against the will of the whistleblower. About journalists - This list is **not** an endorsement of the values or capabilities of any specific journalists. It only provides historical fact-checked information. - Information recorded - Date, title, authors, media house of first publication - Link to original copy of first publication, or a mirror if possible - Whether publication contains original documents - Whether journalists and editors knew identity of the source - Some journalists later quit the orgs they worked at at the time of the leak. Unless stated otherwise, I have specified the org they worked for at the time of the leak. - Some articles may have older edits or link urls. Internet Archive Wayback Machine is one possible place you can check for this. - In multiple cases there is public record of a journalist being informed of the source identity but no public record of the editor being informed. - Speculation by me (Samuel): It is highly likely that if a journalist knows the identity of a source, the editor will pressurise the journalist to inform them.
- Editor's reputation is affected if the journalist invents fake information claiming an anonymous source, and this sequence of events becomes public later. Editor's reputation is affected if the source's reputation is negative for the editor in some way (for example they're a criminal or spy), and the source's identity or reputation becomes public later. - As per AP policy, a reportor/journalist must inform the editor of the identities of any sources. - As per Fox News policy, no mandatory requirement for reporter/journalist to share the identity of a source with the editors. About lawyers - This list is **not** an endorsement of the values or capabilities of any specific lawyer. It only provides historical fact-checked information. - Some minor details may be incorrect. I lack a formal legal background. ## Details of US govt whistleblowers/leakers who leaked classified documents and were not imprisoned US govt whistleblowers and leakers, leaked classified documents, not imprisoned (sorted by date) - Edward Joseph Snowden - still wanted for arrest - Daniel Ellsberg, Anthony (Tony) J. Russo Jr. ### Classification status US govt whistleblowers and leakers, leaked classified documents, not imprisoned, classification status of leaked info (sorted by date) - Edward Joseph Snowden - 100,000-2,000,000 documents (exact number is not public record), many of which were TOP SECRET/SCI - Daniel Ellsberg, Anthony (Tony) J. Russo Jr. 
    - TOP SECRET

### Key dates

US govt whistleblowers and leakers, leaked classified documents, not imprisoned, key dates (sorted by date)

- Edward Joseph Snowden
    - first transmission 2013-01 to 2013-06 (date is not public record), flight from Hawaii US to Hong Kong 2013-05-10, first major transmission 2013-06-02, first publication 2013-06-05, public identity 2013-06-09, flight from Hong Kong to Moscow 2013-06-23, first asylum request made 2013-06-23, first asylum granted (by Russia) 2013-08-01, citizenship granted (by Russia) 2022-09-26
- Daniel Ellsberg, Anthony (Tony) J. Russo Jr.
    - first involvement of an unauthorised person (Anthony (Tony) J. Russo Jr.) 1969-10-01, transmission to US senator 1969-10 (??? exact date unclear), first transmission to journalist 1971-03-02, first publication 1971-06-13, arrest 1971-06-28, released on bond 1971-06-28, case dismissed 1973-05-11

### Social circle

US govt whistleblowers and leakers, leaked classified documents, not imprisoned, consequences on social circle

- Edward Joseph Snowden
    - Documented social circle at time of leak: Father, mother (divorced multiple years before the leak), 1 sister, girlfriend (now wife as of 2025-06), no children (2 children as of 2025-06)
    - Documented consequences for social circle: house raid, interrogation, polygraph, wiretap, significant legal expenses, online harassment
    - Documented visits in asylum: Father visited on 2013-10-10, girlfriend (now wife) permanently shifted to Moscow in 2014-07 (possibly 2014-07-15, ??? exact date not clear), multiple in-person visits by journalists and lawyers since 2013-06, multiple video calls by journalists. No public record indicating mother ever visited him after the leak. (??? seems unclear)
    - Misc
        - Snowden's family members working for the US govt kept their jobs but received no further promotions.
        - Snowden had two children with his wife in Russia and they still live together in Russia as of 2025-07.
- Daniel Ellsberg
    - Documented social circle at time of leak: Father, wife (married on 1970-08-08, during the leak), siblings unknown (??? seems unclear), mother dead, ex-wife (divorced), 2 children from ex-wife (later had 1 child with wife)
    - Documented consequences for social circle: house raid, interrogation, wiretap, significant legal expenses, in-person harassment by FBI agents, cut off by extended circle
    - Documented visits: After case dismissal: Visited by children and step-children. Visited by multiple friends and anti-war activists. Visited by multiple journalists. Significant surveillance by FBI and NSA continued during this time frame.
    - Misc
        - His psychiatrist's office was (illegally) broken into under the direction of Howard Hunt, CIA officer, to obtain evidence so that Ellsberg could be deemed mentally unfit for trial.
        - Daniel Ellsberg's father initially disowned him for his decision to leak the documents, but may have later changed his mind. (??? exact details not clear)
        - Daniel Ellsberg's son later claimed his parents had a strained marriage for many years and that he had less contact with his father.
### Opsec/Cybersecurity

US govt whistleblowers and leakers, leaked classified documents, not imprisoned, opsec mistakes and arrest methods

- todo

### Journalism

US govt whistleblowers and leakers, leaked classified documents, not imprisoned, journalists they worked with

- Edward Joseph Snowden
    - First publication: [NSA collecting phone records of millions ..., The Guardian, 2013-06-05](https://www.theguardian.com/world/2013/jun/06/nsa-phone-records-verizon-court-order)
    - Also published: [US, British intelligence mining data from ..., Washington Post, 2013-06-07](https://www.washingtonpost.com/investigations/us-intelligence-mining-data-from-nine-us-internet-companies-in-broad-secret-program/2013/06/06/3a0c0da8-cebf-11e2-8845-d970ccb04497_story.html)
    - Glenn Greenwald, Janine Gibson (editor, US), Alan Rusbridger (editor, UK) - The Guardian
    - Laura Poitras, Barton Gellman, Anne E Kornblut (editor) - Washington Post
    - Glenn Greenwald and Ewen MacAskill knew the identity of the source before publishing. Janine Gibson and Alan Rusbridger knew that Greenwald was meeting an anonymous source in Hong Kong, but there is no public record confirming they knew the identity of the source.
        - Speculation by me (Samuel): It is highly likely they knew the identity of the source.
    - Laura Poitras and Barton Gellman knew the identity of the source before publishing. No public record confirming Anne E Kornblut or other Washington Post editors knew the identity of the source before publishing.
        - todo - more research on this topic
    - [Github archive by iamcryptoki containing documents published publicly from 2013 to 2018](https://github.com/iamcryptoki/snowden-archive)
    - Most of the 100,000-2,000,000 documents leaked by Snowden have never been published publicly. Public record is that only a handful of journalists (listed above) have access to a copy of the full set of documents.
    - [Speculation by electrospaces.net on who all have access as of 2019](https://www.electrospaces.net/2019/04/the-snowden-files-where-are-they-and.html)
    - todo - more research on this topic
- Daniel Ellsberg
    - First publication: Vietnam Archive: Pentagon Study Traces 3 Decades of Growing U.S. Involvement, [1971-06-13, weekly issue, The New York Times](https://timesmachine.nytimes.com/timesmachine/1971/06/13/issue.html)
    - Neil Sheehan, other team members, Abe Rosenthal (editor) - The New York Times
    - The newspaper contained only small excerpts per issue; a total of 9 issues contained excerpts.
    - Another publication: [Documents Reveal U.S. Effort in '51 To Delay Viet Election, Washington Post, 1971-06-18](???)
    - Ben Bagdikian, other team members, Ben Bradlee (editor) - Washington Post
    - Ellsberg provided Bagdikian with a copy on 1971-06-16, during the FBI manhunt.
    - Ben Bagdikian provided a copy to Senator Mike Gravel on 1971-06-26. On 1971-06-29, Senator Mike Gravel entered 4,100 of the ~7,000 pages into the record of the Subcommittee on Public Buildings and Grounds.
    - [Unredacted copy of the Pentagon Papers released by the US govt, 2011-06-13](https://www.archives.gov/research/pentagon-papers)
    - Neil Sheehan knew the identity of the source. As per public record, Neil Sheehan negotiated with Abe Rosenthal (managing editor) to ensure that the story could be published without the latter being informed of the identity of the source.
    - Ben Bagdikian knew the identity of the source. No public info confirming Ben Bradlee or others at WaPo knew the identity of the source.
    - More research on this topic - todo

### Law

US govt whistleblowers and leakers, leaked classified documents, not imprisoned, lawyers they worked with

- Edward Joseph Snowden
    - lawyers for: No trial occurred, only asylum requests: Ben Wizner (US, ACLU Director), Jesselyn Radack (US, WHISPer Director), Robert Tibbo (Hong Kong), Jonathan Man (Hong Kong), Albert Ho Chun-yan (Hong Kong), Anatoly Kucherena (Russia), Plato Cacheris (US), Wolfgang Kaleck (Germany/EU), William Bourdon (France/EU), Marcel Bosonnet (Switzerland), Gonzalo Boye (Chile), Baltasar Garzón (Spain/Chile, Wikileaks international legal head), Halvard Helle (Norway), Emanuel Feinberg (Norway), other anonymous lawyer-advisors
    - lawyers against: No trial occurred: Neil H. MacBride, Eric H. Holder Jr.
        - Civil suit over book published: G. Zachary Terwilliger, Jody Hunt, Jeffrey Bossert Clark, Jeffrey A. Rosen, Lauren A. Wetzler, R. Trent McCotter
- Daniel Ellsberg
    - lawyers for: Leonard Boudin, Charles Nesson, Leonard Weinglass
    - lawyers representing NYT: Daniel Sheehan, Floyd Abrams
    - lawyers against: David Nissen, Warren P. Reese, Richard J. Barry, Joseph L. Tauro, Erwin Nathaniel Griswold

### Misc

US govt whistleblowers and leakers, leaked classified documents, not imprisoned, miscellaneous information

- Edward Joseph Snowden
    - empty
- Daniel Ellsberg
    - G Gordon Liddy, ex-FBI ex-Army, claims Howard Hunt (who broke into Ellsberg's psychiatrist's office) also planned to induce an LSD overdose to deem Ellsberg mentally unfit.
    - The wiretap was also performed without a warrant. The judge dismissed the case due to extensive illegal evidence gathering.
    - Robert L. Meyer, US attorney, was forced to resign for refusing to pursue the case against Daniel Ellsberg.

## Details of US govt whistleblowers/leakers who leaked classified documents and were imprisoned

US govt whistleblowers and leakers, leaked classified documents, imprisoned (sorted by date)

- Jack Douglas Teixeira
- Daniel Everette Hale
- Reality Leigh Winner
- Terry J Albury
- Joshua Adam Schulte
- James Hitselberger
- Donald Sachtleben
- Chelsea Elizabeth Manning
- Shamai Kedem Leibowitz
- Samuel Loring Morison

### Classification status

US govt whistleblowers and leakers, leaked classified documents, imprisoned, classification status of leaked info (sorted by date)

- Jack Douglas Teixeira
    - classified at time of leak - most documents TOP SECRET/SCI (TOP SECRET//HCS-P/SI-G/TK//NOFORN or TOP SECRET//SI//NOFORN//FISA ???), some documents SECRET//REL FVEY
    - remains classified as of 2025, US govt has confirmed authenticity of some documents
- Daniel Everette Hale
    - classified at time of leak - some documents TOP SECRET (TOP SECRET//SI//NOFORN), other documents SECRET
    - remains classified as of 2025
- Reality Leigh Winner
    - classified at time of leak - TOP SECRET//SI//ORCON/NOFORN
    - remains classified as of 2025
- Terry J Albury
    - classified at time of leak - some documents SECRET, some documents CONFIDENTIAL, other documents unclassified (??? seems unclear)
    - remains classified as of 2025
- Joshua Adam Schulte
    - classified at time of leak - some documents SECRET, some documents TOP SECRET or TOP SECRET/SCI (??? seems unclear), operational details of Vault 7 not leaked at all
    - remains classified as of 2025
- James Hitselberger
    - classified at time of leak - SECRET
    - remains classified as of 2025
- Donald Sachtleben
    - classified at time of leak - main documents TOP SECRET//SCI, other documents SECRET
    - remains classified as of 2025. Summarised details confirmed in press interviews.
- Chelsea Elizabeth Manning
    - classified at time of leak - Iraq war logs SECRET//NOFORN, Guantanamo Bay documents SECRET//NOFORN, Collateral Murder video SECRET, diplomatic cables CONFIDENTIAL or SECRET or TOP SECRET (??? seems unclear)
    - Remains classified as of 2025: Iraq war logs, Guantanamo Bay documents, Collateral Murder video
    - Some redacted documents declassified as of 2025: diplomatic cables
- Shamai Kedem Leibowitz
    - classified at time of leak - SECRET
    - remains classified as of 2025
- Samuel Loring Morison
    - classified at time of leak - TOP SECRET or SECRET (??? seems unclear)
    - some lower-resolution photos similar to the leaked photos declassified as of 2025

### Key dates

US govt whistleblowers and leakers, leaked classified documents, imprisoned, key dates (sorted by date)

- Jack Douglas Teixeira
    - first transmission to semi-public discord 2022-02, first transmission to journalist likely 2022-12 (discord server logs 2022-02 to 2022-12 not publicly available), publication to wide audience 2023-04-06, public identity 2023-04-13, arrest 2023-04-13, not released as of 2025-06
- Daniel Everette Hale
    - first transmission 2014-05 (multiple messages sent, exact date of first message containing a classified document is not public record), first publication 2015-10-15, arrest 2019-05-09, public identity 2019-05-09, released 2025-07-04
- Reality Leigh Winner
    - first transmission 2017-05-09, arrest 2017-06-03, first publication 2017-06-05, public identity 2017-06-05, released 2021-06-02
- Terry J Albury
    - first transmission 2016-02, first publication 2017-01-31, arrest 2018-03-28, public identity 2018-03-29, released 2020-11
- Joshua Adam Schulte
    - first transmission 2016-04 (exact date not in public record), first publication 2017-03-07, arrest on allegedly unrelated charge 2017-08-24, public identity as a suspected whistleblower 2018-05-15, public identity as whistleblower confirmed 2018-06-18, not released as of 2025-06
- James Hitselberger
    - first transmission 2012-04-11, no publication, arrest 2012-10-25, public identity 2012-10-25, released 2014-07
- Donald Sachtleben
    - first transmission 2012-04-30, first publication 2012-05-07, arrest on allegedly unrelated charges 2012-05-11, indicted as whistleblower 2013-09-23, public identity 2013-09-23, released 2022 (??? exact date not clear)
- Chelsea Elizabeth Manning
    - first transmission 2010-01 (as per Chelsea's claims; 2010-02 is publicly documented), first publication 2010-02-18, arrest 2010-05-27, public identity 2010-06-07, released 2017-01-17
- Shamai Kedem Leibowitz
    - first transmission 2009-01 (exact date unclear, may not be public record), first publication 2009-03-26, house raid 2009-04 (exact date unclear, may not be public record), final arrest 2009-12-17, public identity 2009-12-17, released 2012-01 (exact date unclear)
- Samuel Loring Morison
    - first transmission 1984-07 (??? exact date within 1984-07 not clear), first publication 1984-08-07, arrest 1984-10-01, public identity 1984-10-01, released 1988 (??? exact date unclear)

### Social circle

US govt whistleblowers and leakers, leaked classified documents, imprisoned, consequences on social circle (only done surface-level research so far)

- Jack Douglas Teixeira
    - Documented social circle at the time of leak: Step-father, mother, biological father, 1 step-brother, 1 step-sister, girlfriend
    - Documented consequences for social circle: house raid, interrogation, online harassment of parents
    - Documented prison visits: no info available
    - Misc: gave a TV interview while in prison
- Daniel Everette Hale
    - Documented social circle at the time of leak: Father, mother, no SO
    - Documented consequences for social circle: house raid, interrogation
    - Documented prison visits: no info available
- Reality Leigh Winner
    - Documented social circle at the time of leak: Father, mother, 1 sister, boyfriend
    - Documented consequences for social circle: house raid, interrogation, wiretap, cut off by extended circle, significant legal expenses
    - Documented prison visits: multiple visits by mother, multiple phone calls
    - Misc: Mother faced panic attacks and depression. Sister withdrew from college for a semester. Parents faced difficulties retaining their jobs.
- Terry J Albury
    - Documented social circle at the time of leak: Father, mother, siblings unknown, wife, 2 children
    - Documented consequences for social circle: house raid, interrogation
    - Documented prison visits: no info available
    - Misc: Wife and multiple friends remained supportive throughout the trial and prison sentence.
- Joshua Adam Schulte
    - Documented social circle at the time of leak: Father, mother, 3 brothers, SO unknown
    - Documented consequences for social circle: house raid, interrogation, significant legal expenses, online harassment
    - Documented prison visits: some family visits; visits were restricted for 3 years after it was claimed that he was attempting to release more info from prison
- James Hitselberger
    - Documented social circle at the time of leak: Father, mother, no siblings, no SO
    - Documented consequences for social circle: house raid, interrogation
    - Documented prison visits: no info available
- Donald Sachtleben
    - Documented social circle at the time of leak: no info available
    - Documented consequences for social circle: no info available
    - Documented prison visits: no info available
- Chelsea Elizabeth Manning
    - Documented social circle at the time of leak: Father, mother (divorced), 1 sister, boyfriend (breakup at same time)
    - Documented consequences for social circle: house raid, interrogation, wiretap, significant legal expenses, cut off by extended circle, online verbal harassment
    - Documented prison visits: Multiple visits by family and friends. Multiple letters received, although some were redacted. First visits were behind glass, later visits were regular.
    - Misc: UK govt cooperated with US govt to wiretap her mother and aunt in Wales. Father lost his job and mother lived in debt until sufficient donations were received for the legal defence. Mother collapsed during a hearing and faced multiple panic attacks and medical consequences. Father became depressed. Father's second marriage broke apart as well.
- Shamai Kedem Leibowitz
    - Documented social circle at the time of leak: Father, mother, wife, children unknown
    - Documented consequences for social circle: house raid, interrogation
    - Documented prison visits: no info available
    - Misc: Used a public defender, not a private lawyer. Grandson of Yeshayahu Leibowitz. Likely morally supported by family and the broader Jewish community throughout trial and imprisonment.
- Samuel Loring Morison
    - Documented social circle at the time of leak: Father, mother, siblings unknown, spouse unknown, children unknown
    - Documented consequences for social circle: house raid, interrogation
    - Documented prison visits: no info available
    - Misc: Grandson of Samuel Eliot Morison

### Opsec/Cybersecurity

US govt whistleblowers and leakers, leaked classified documents, imprisoned, opsec mistakes and arrest methods

- todo

### Journalism

US govt whistleblowers and leakers, leaked classified documents, imprisoned, journalism

- Jack Douglas Teixeira
    - First publication: Semi-public discord server Thug Shaker Central, 2022-02.
    - First publication in mass media: [Ukraine War Plans Leak Prompts Pentagon Investigation, The New York Times, 2023-04-06](https://www.nytimes.com/2023/04/06/us/politics/ukraine-war-plan-russia.html)
        - New York Times publication does not contain original documents.
        - Could not find a mirror to original documents yet. - todo
    - Helene Cooper, Eric Schmitt, Joseph F Kahn (editor) - The New York Times
    - No journalist was directly contacted by the source. Journalists eventually found the discord server and broadcast the information further.
    - No public record confirming journalists or editor knew the identity of the source.
- Daniel Everette Hale
    - First publication: [The Drone Papers, The Intercept, 2015-10-15](https://theintercept.com/2015/10/15/the-drone-papers/)
    - Jeremy Scahill, Betsy Reed (editor) - The Intercept
    - Publication contains some original documents.
    - Journalist knew the identity of the source. No public record confirming the editor knew the identity of the source. (See note above on this scenario.)
- Reality Leigh Winner
    - First publication: [Top Secret NSA Report Details ..., The Intercept, 2017-06-05](https://theintercept.com/2017/06/05/top-secret-nsa-report-details-russian-hacking-effort-days-before-2016-election/)
    - Richard Esposito, Matthew Cole, Sam Biddle, Ryan Grim (editor) - The Intercept
    - Publication contains some original documents.
    - No public record confirming journalists or editor knew the source's identity.
- Terry J Albury
    - First publication: [The FBI's Secret Rules, The Intercept, 2017-01-31](https://theintercept.com/series/the-fbis-secret-rules/)
    - Trevor Aaronson, Cora Currier, Jenna McLaughlin, Alice Speri, Betsy Reed (editor) - The Intercept
    - Publication contains some original documents.
    - No public record confirming journalists or editor knew the source's identity.
- Joshua Adam Schulte
    - First publication: [Vault 7, Wikileaks, 2017-03-07](https://wikileaks.org/ciav7p1/index.html)
    - Anonymous team, Julian Assange (editor) - Wikileaks
    - Publication contains some original documents.
    - No public record confirming journalists or editor (of Wikileaks) knew the source's identity.
    - 2nd attempt: From prison, he offered more documents to Shane Harris (the Washington Post) and Marcy Wheeler (Emptywheel). Both journalists knew the identity of the source.
- James Hitselberger
    - Publication: No publication
    - Did not work with any journalists.
- Donald Sachtleben
    - First publication: [CIA thwarts new al-Qaida underwear bomb plot, Associated Press, 2012-05-07](https://www.youtube.com/watch?v=eei6MAmYrl0)
        - Unable to find the text article on the AP website, may have been taken down, may still be available on Wayback Machine - todo
    - [Same news repeated, Fox News, 2012-05-07](https://www.foxnews.com/us/cia-thwarts-new-al-qaida-underwear-bomb-plot)
    - Adam Goldman, Matt Apuzzo, Ted Bridis (editor) - Associated Press
    - Sachtleben did not send original documents to the journalist, hence they're not published.
    - Court record confirms Adam Goldman knew the identity of the source. No public record confirming Matt Apuzzo or Ted Bridis knew the identity of the source. (See note above on this scenario.)
- Chelsea Elizabeth Manning
    - First publication: [Classified cable from US Embassy Reykjavik on Icesave, Wikileaks, 2010-02-18](https://www.wikileaks.org/wiki/Classified_cable_from_US_Embassy_Reykjavik_on_Icesave,_13_Jan_2010)
    - Anonymous team, Julian Assange (editor) - Wikileaks
    - No public record confirming anyone at Wikileaks knew the identity of the source at the time of the leak. Julian Assange has denied knowing the identity of the source before it was publicly reported.
        - Speculation by me (Samuel): Since Adrian Lamo could figure out the identity from social media clues, and Julian Assange was also a skilled hacker, there is a significant probability Wikileaks also independently deduced the identity of the source before it was publicly reported.
    - Publication contains original documents.
- Shamai Kedem Leibowitz
    - First publication: FBI Wiretap Transcripts: Israeli Embassy Targets Iran and U.S. Opinion, Richard Silverstein (at richardsilverstein.com), 2009-03-26
        - Link taken down, could not find a mirror yet. - todo
        - [Later article by Richard Silverstein](https://www.richardsilverstein.com/2016/12/10/published-us-intelligence-secrets-israels-anti-iran-campaign/) discusses the transcripts but does not contain the original transcripts
    - Richard Silverstein - independent blogger
    - Richard Silverstein knew the identity of the source.
    - Misc: Leibowitz and Silverstein later publicly accused each other of misaligned motives.
- Samuel Loring Morison
    - First publication: Jane's Defence Weekly, volume 2 no 5, sent to newsrooms 1984-08-07, official publishing date 1984-08-11.
        - Could not find a digitised version of the magazine issue yet. - todo
    - Derek Wood (editor), Sidney Jackson (managing director), other editorial staff
    - Publication contains original documents (photographs).
    - Derek Wood knew the identity of the source. Public record does not confirm any other staff knew the identity of the source. Speculation: Sidney Jackson may also have known the identity of the source.

### Law

- Supervisory role played in some cases by attorneys general or assistant attorneys general: Zachary Terwilliger, John C. Demers, Matthew G. Olsen

US govt whistleblowers and leakers, leaked classified documents, imprisoned, lawyers they directly worked with

- Jack Douglas Teixeira
    - lawyers for: Brendan O. Kelley, Gene Allen Franco, Joshua Robert Hayne (withdrawn), Michael Bachrach
    - lawyers against: Nadine Pellegrini, Jared C. Dolan, Jason A. Casey, Christina A. Clark, Joshua Levy
- Daniel Everette Hale
    - lawyers for: Todd Richman, Cadence Mertz, Ruth Vinson, Tor Ekeland, Jesselyn Radack
    - lawyers against: Gordon Kromberg, Alexander Berrang, Heather M. Schmidt
- Reality Leigh Winner
    - lawyers for: Titus Nichols, Alison Grinter Allen, Joe D. Whitley, Matthew S. Chester
    - lawyers against: Julie A. Edelstein, Jennifer G. Solari, Bobby L. Christine
- Terry J Albury
    - lawyers for: JaneAnne Murray, Joshua L. Dratel
    - lawyers against: Danya E. Atiyeh, Patrick T. Murphy, David C. Recker
- Joshua Adam Schulte
    - lawyers for: Joshua Adam Schulte (represented self), Sabrina P. Shroff, Deborah Austern Colson (withdrawn), Matthew B. Larsen, Sean Michael Maher, James Matthew Branden, Lauren Martine Dolecki, Edward S Zas, Allegra Glashausser
    - lawyers against: David W. Denton Jr., Michael D. Lockard, Nicholas S. Bradley, Sidhardha Kamaraju, Matthew Laroche, Scott McCulloch, Damian Williams (supervisory), Geoffrey S. Berman (supervisory)
- James Hitselberger
    - lawyers for: Mary Manning Petras, Rosanna Margaret Taormina, A. J. Kramer, Carlos J. Vanegas
    - lawyers against: Jay I. Bratt, Mona N. Sahaf, Thomas A. Bednar, Deborah A. Curtis
- Donald Sachtleben
    - lawyers for: Charles C. Hayes, Kathleen M. Sweeney, Larry A. Mackey
    - lawyers against: Jonathan M. Malis, G. Michael Harvey, Richard S. Scott, Mona N. Sahaf, Steven D. DeBrota, Joseph H. Hogsett
- Chelsea Elizabeth Manning
    - lawyers for: David E. Coombs, Nancy Hollander, Vincent Ward, Matthew Kemkes, Paul Bouchard, Chase Strangio. ACLU
    - lawyers against: Ashden Fein, Joe Morrow, Angel Overgaard, Hunter Whyte
- Shamai Kedem Leibowitz
    - lawyers for: Cary D. Feldman (withdrawn), Richard M. Asche
    - lawyers against: Steven M. Dunne, Kathleen M. Kedian, David Kris (supervisory)
- Samuel Loring Morison
    - lawyers for: Jacob A. Stein, Robert F. Muse, Mark H. Lynch, Charles F.C. Ruff, Neil K. Roman, Steven F. Reich, Armistead P. Rood
    - lawyers against: Michael Schatzow, Breckinridge Long Willcox. 2nd case: James G. Warwick, Rod J. Rosenstein

### Misc

- empty

## Details of US govt whistleblowers/leakers who did not leak classified documents but leaked information, and were imprisoned

US govt whistleblowers and leakers, did not leak classified documents but may have leaked classified information, imprisoned (sorted by date)

- Henry Kyle Frese
- John Chris Kiriakou
- Stephen Jin-Woo Kim
- Jeffrey Alexander Sterling

### Classification status

US govt whistleblowers and leakers, did not leak classified documents but may have leaked classified information, imprisoned, classification status of leaked info (sorted by date)

- Henry Kyle Frese
    - classified at time of leak - some documents TOP SECRET/SCI, some documents SECRET (??? seems unclear)
    - remains classified as of 2025
- John Chris Kiriakou
    - classified at time of leak - officer identity SECRET, interrogation details TOP SECRET/SCI or CONFIDENTIAL/SECRET (??? seems unclear)
    - officer identity remains classified as of 2025, partial info about interrogation methods declassified as of 2025
- Stephen Jin-Woo Kim
    - classified at time of leak - TOP SECRET/SCI
    - remains classified as of 2025
- Jeffrey Alexander Sterling
    - classified at time of leak - TOP SECRET/SCI, or SECRET (??? seems unclear)
    - remains classified as of 2025

### Key dates

US govt whistleblowers and leakers, did not leak classified documents but may have leaked classified information, imprisoned, key dates (sorted by date)

- Henry Kyle Frese
    - first transmission 2018-04-27, first publication 2018-05-02, arrest 2019-10-09, public identity 2019-10-09, released 2022-10-14
- John Chris Kiriakou
    - first non-classified transmission 2007-12-10, first non-classified publication 2007-12
    - identity of CIA officer Deuce Martinez was classified. Deuce Martinez's involvement was independently suspected in public since 2006-06-20. First classified transmission 2008-08 (email from public record in 2008-08, previous email in 2008-04 alleged), first public publication of Martinez's name 2008-06-22 (Scott Shane, NYT), first classified publication in a classified legal hearing 2009, clear public publication of classified info 2015-02-18.
    - arrest 2012-01-23, public identity 2012-01-23, released 2015-02-03
- Stephen Jin-Woo Kim
    - first transmission 2009-06, first publication 2009-06-11, indicted 2010-08-18, arrest 2010-08-24, public identity as whistleblower confirmed 2010-08-24, released 2015 (??? exact date not clear)
- Jeffrey Alexander Sterling
    - first transmission 2003-03 (first phone call 2003-02-27 out of multiple phone calls, not clear which phone call revealed classified info), first publication 2006-01-03, arrest 2011-01-06, public identity 2011-01-06 (identity was semi-private before this), released 2018-01 (exact date unclear)

### Social circle

US govt whistleblowers and leakers, did not leak classified documents but may have leaked classified information, imprisoned, consequences on social circle (only done surface-level research so far)

- Henry Kyle Frese
    - Documented social circle at the time of leak: Father, mother, 3 sisters, girlfriend
    - Documented consequences for social circle: house raid, interrogation
    - Documented prison visits: no info available
- John Chris Kiriakou
    - Documented social circle at the time of leak: Father, mother, siblings unknown, wife, 5 children (of which 2 from wife, 3 from ex-wife)
    - Documented consequences for social circle: house raid, interrogation, wiretap, polygraph, cut off by extended circle, significant legal expenses, online and in-person verbal harassment
    - Documented prison visits: Multiple visits by spouse and children, visits by journalists
    - Misc: Has talked publicly about how being shunned by his entire social circle was painful
- Stephen Jin-Woo Kim
    - Documented social circle at the time of leak: Father, mother, girlfriend (later wife), 1 sister, other siblings unknown
    - Documented consequences for social circle: house raid, interrogation, significant legal expenses, online harassment
    - Documented prison visits: Multiple visits by family members, visit by journalist
    - Misc: James Rosen, journalist, visited Stephen Kim in prison to apologise.
- Jeffrey Alexander Sterling
    - Documented social circle at the time of leak: Father, mother, multiple siblings, wife, no children
    - Documented consequences for social circle: house raid, interrogation, wiretap, significant legal expenses, online harassment
    - Documented prison visits: Multiple visits by wife. Some journalists were allowed visits and others were denied.
    - Misc: Lost his house and nearly went bankrupt due to legal fees. Supported by wife throughout trial and imprisonment.

### Opsec/Cybersecurity

US govt whistleblowers and leakers, did not leak classified documents but may have leaked classified information, imprisoned, opsec mistakes and arrest methods

- todo

### Journalism

US govt whistleblowers and leakers, did not leak classified documents but may have leaked classified information, imprisoned, journalism

- Henry Kyle Frese
    - First publication: [China quietly installed missile systems ..., CNBC, 2018-05-02](https://www.cnbc.com/2018/05/02/china-added-missile-systems-on-spratly-islands-in-south-china-sea.html)
    - Amanda Macias, CNBC News, in a romantic relationship with Frese. Courtney Kube, CNBC News.
    - Frese did not transmit original documents, hence they're not published.
    - Journalists knew the source's identity.
- John Chris Kiriakou
    - First publication with name of CIA officer (considered classified info): [Inside a 9/11 Mastermind's Interrogation, The New York Times, 2008-06-22](https://www.nytimes.com/2008/06/22/world/americas/22iht-22ksm.13873332.html)
    - Scott Shane, Bill Keller (editor) - The New York Times. Info also offered to: Matthew Cole - The Intercept.
    - Both the journalists and the editor knew the identity of the source.
    - Misc: John Kiriakou publicly accuses Matthew Cole, journalist at The Intercept, of getting him imprisoned.
- Stephen Jin-Woo Kim - First publication: [NK's Post UN Sanctions Plans, Revealed, Fox News, 2009-06-11](https://www.foxnews.com/politics/nks-post-un-sanctions-plans-revealed) - James Rosen, Michael Clemente (editor), Bill Sammon (editor) - Fox News - Kim did not send any original documents to the journalist, hence none were published. - James Rosen knew the identity of the source. No public record confirming the editors knew the identity of the source. (See note above on this scenario.) - Jeffrey Alexander Sterling - First publication: State of War, James Risen, 2006-01-03, published by Free Press under Simon & Schuster. [State of War, Anna's Archive book download](https://annas-archive.org/md5/0d865e7d195887efe41c34547b1a3b21). James Risen was a journalist at the New York Times. - Publication does not contain original documents, hence none were published. - James Risen knew identity of the source. Court record (US v Sterling) confirms James Risen's wife Holly also knew identity of the source. Multiple intelligence community members also suspected the identity of the source. Public record is not clear on everyone whom Sterling told that he was the source. ### Law US govt whistleblowers and leakers, did not leak classified documents but may have leaked classified information, imprisoned, lawyers they directly worked with - Henry Kyle Frese - lawyers for: Stuart Sears - lawyers against: Jennifer Kennedy Gellie, Danya E. Atiyeh, Neil Hammerstrom - John Chris Kiriakou - lawyers for: Robert Trout, Plato Cacheris, John F. Hundley, Jesse Isaac Winograd, Jesselyn Radack (advisory) - lawyers against: Neil H. MacBride, Mark Schneider, Iris Lan, Patrick Fitzgerald (absent), Patrick J. Fitzgerald, Ryan Fayhee, William N. Hammerstrom Jr., Lisa Owings - Stephen Jin-Woo Kim - lawyers for: Abbe D. Lowell, Paul M. Thompson, James M. Commons, Ruth Wedgwood - lawyers against: Michael Harvey, Jonathan M. Malis, Thomas A. Bednar, Deborah A. Curtis, Julie A. Edelstein, Ronald C.
Machen Jr. (supervisory) - Jeffrey Alexander Sterling - lawyers for: Edward MacMahon, Barry Pollack, William James Trunk, J. Richard Supple Jr., Mia Haessly, Lawrence S. Robbins - lawyers against: James L. Trump, Eric G. Olshan, Dennis Fitzpatrick, William M. Welch II (withdrawn), Timothy Kelly, Neil H. MacBride, Dana J. Boente (supervisory), Robert A. Parker (supervisory), Leslie R. Caldwell (supervisory), Sung-Hee Suh (supervisory) ### Misc - empty ## Details of US govt whistleblowers/leakers who did not leak classified documents but may have leaked classified information, and were not imprisoned US govt whistleblowers and leakers, did not leak classified documents but may have leaked classified information, not imprisoned (sorted by date) - Thomas Andrews Drake - Mark Lee Klein - Russell D Tice - Thomas M Tamm - Sibel Deniz Edmonds - Edward Loomis - William Edward Binney, John Kirk Wiebe - Perry Douglas Fellwock ### Classification status US govt whistleblowers and leakers, did not leak classified documents but may have leaked classified information, not imprisoned (sorted by date) - Thomas Andrews Drake - not classified - govt alleged classified documents leak, judge declared those documents were not classified - Mark Lee Klein - leaked existence of program that may have been classified, but no classified documents - Russell D Tice - leaked existence of classified program, but no classified documents - Thomas M Tamm - leaked existence of classified program, but no classified documents - Sibel Deniz Edmonds - leaked details that were retroactively classified after the leak, did not leak classified documents - Edward Loomis - leaked details of classified program, but no classified documents - William Edward Binney, John Kirk Wiebe - leaked existence and details of classified program, but no classified documents (as per court record) - Perry Douglas Fellwock - leaked existence and extensive amount of details of classified program, likely did not leak classified
documents (as per court record) ### Key dates US govt whistleblowers and leakers, did not leak classified documents but may have leaked classified information, not imprisoned, key dates (sorted by date) - Thomas Andrews Drake - first transmission to journalist 2005-11 to 2006-02 (??? exact date unclear), first publication 2006-01-29, house raid 2007-11-28, trial sentencing date 2011-07-15, no arrest - Mark Lee Klein - first transmission 2006-01 to 2006-02 (??? exact date unclear), first publication 2006-04-06 (??? unclear if there was a previous publication), no house raid / indictment / arrest - Russell D Tice - first transmission to journalist 2004 (??? exact date unclear), first internal complaint to DoD IG 2004 or 2005 (??? exact date unclear), security clearance revoked 2005-05 (??? exact date unclear), first publication 2005-12-16, no house raid / indictment / arrest - Thomas M Tamm - first transmission 2006-03 to 2006-06 (??? exact date unclear), first publication 2005-12-16, house raid 2007-08-01, public identity 2008-12-13, no indictment, investigation formally closed 2011-04 - Sibel Deniz Edmonds - first internal complaint 2001-12-02, fired 2002-03-22, first transmission 2002 (??? exact date unclear), first publication (TV interview) 2002-10-27 - Edward Loomis - first internal complaint 2002-11-09, no transmission of secret info to outside sources (??? seems unclear), house raid 2007-07-26, no indictment - William Edward Binney, John Kirk Wiebe - Binney resigned 2001-10-31, first internal complaint (both) 2002-11-09, first transmission todo, first publication ??? (??? exact date not clear), house raid (both) 2007-07-26, no indictment, public identity 2011 (??? exact date unclear) - Perry Douglas Fellwock - first transmission 1972 (???
exact date not clear), first publication 1972-08, public identity 1972-07-18, no house raid, no indictment ### Social circle US govt whistleblowers and leakers, did not leak classified documents but may have leaked classified information, not imprisoned, consequences on social circle - todo ### Opsec/Cybersecurity US govt whistleblowers and leakers, did not leak classified documents but may have leaked classified information, not imprisoned, opsec mistakes and investigation methods - todo ### Journalism US govt whistleblowers and leakers, did not leak classified documents but may have leaked classified information, not imprisoned, journalists they directly worked with - todo - Thomas Andrews Drake - First publication: [Biggest boondogle going on now, Baltimore Sun, 2006-01-29](https://baltimoresun.newspapers.com/newspage/248566127/) - No original documents published - Siobhan Gorman, Timothy A Franklin (editor) - Baltimore Sun - Diane S Roark knew identity of the source. No public record confirming anyone at Baltimore Sun (including Siobhan Gorman or the editor) knew identity of the source. Anonymous encrypted email tip. - Mark Lee Klein - todo - Russell D Tice - First publication: Same as Thomas Tamm, listed below. All details similar. - ??? not clear who all knew identity of source. Speculation is all three of them knew. - todo - Thomas M Tamm - First publication: [Bush lets US spy on callers without courts, The New York Times, 2005-12-16](https://www.nytimes.com/2005/12/16/politics/bush-lets-us-spy-on-callers-without-courts.html) - No original documents published - James Risen, Eric Lichtblau, Bill Keller (editor) - The New York Times - ??? not clear who all knew identity of source. - todo - Sibel Deniz Edmonds - First publication: [FBI whistleblower Sibel Edmonds interview, CBS News, 2002-10-27](https://www.youtube.com/watch?v=nlCoabsjmPI) - No original documents published - Ed Bradley (correspondent i.e. 
main reporter on TV), David Kohn (writer), Don Hewitt (producer), Philip Scheffler (editor) - Source declared identity publicly in same interview. - Edward Loomis - As per public record, no media publication directly used sensitive info from him. (He gave a TV interview in 2013 but all details mentioned were public record by then.) - William Edward Binney, John Kirk Wiebe - todo - Perry Douglas Fellwock - todo ### Law US govt whistleblowers and leakers, did not leak classified documents but may have leaked classified information, not imprisoned, lawyers they directly worked with - Thomas Andrews Drake - lawyers for: James Wyda, Deborah Boardman, Jesselyn Radack, Meghan A. Skelton, James Bamford (advisory) - lawyers against: William M. Welch II, John P. Pearson, Lanny A. Breuer, Steven Tyrrell - Mark Lee Klein - Note: No trial against Mark Lee Klein directly, trials were fought by EFF against AT&T and US govt. Landmark case: Jewel v NSA. - lawyers for: EFF legal team (Kurt Opsahl, Kevin S. Bankston, Cindy Cohn, Lee Tien, James S. Tyre, Corynne McSherry, Mark Rumold, Jamie L. Williams, Andrew Crocker), Bert Voorhees, Theresa M. Traber, Keker and Van Nest LLP (Rachael E. Meny, Benjamin W. Berkowitz, Michael S. Kwun, Audrey Walton-Hadlock, Philip J. Tassin), Richard R. Wiebe, Aram Antaramian, Thomas E. Moore III - lawyers against: Representing AT&T/telecoms: Michael Kellogg, Brian Matthew Boynton, Sidley Austin LLP (Bradford Allan Berenson, Eric Dean McArthur, Eric Shumsky), Pillsbury Winthrop Shaw Pittman LLP (Bruce A. Ericson, Kevin M. Fong); Representing US govt: Peter Keisler, Michael Mukasey (supervisory), Anthony Joseph Coppolino, Thomas Mark Bondy, Kevin V. Ryan, Carl J. Nichols, Joseph H. Hunt, Andrew H. Tannenbaum - Russell D Tice - Note: No trial against Russell D Tice directly. - lawyers for: Mark Zaid, Tom Devine, Jesselyn Radack, Roy W. Krieger - lawyers against: Alberto R. Gonzales, David Kris, Paul J. McNulty (related hearing), Robert L.
Deitz (related hearing) - Thomas M Tamm - Note: No trial against Thomas M Tamm for whistleblowing, trial was for revoking bar license. He won the trial and kept his license. - lawyers for: Paul Kemp, Michael Frisch, Cary Feldman, Asa Hutchinson - lawyers against: Hamilton P. Fox III, Gene Shipp - Sibel Deniz Edmonds - lawyers for: Mark S. Zaid, Michael D. Kohn, ACLU legal team (Benjamin Wizner, Ann Beeson, Art Spitzer, Melissa Goodman), Eric Seiff, Roy W. Krieger - lawyers against: John Ashcroft (supervisory), Paul D. Clement, Peter D. Keisler, Douglas Letter, H. Thomas Byron III, Vesper Mei, Valerie Caproni, Kimberly Dawn Ziropoulos, Bruce Fein, Dan Marino - Edward Loomis - Note: No trial against Edward Loomis - lawyers for: Jesselyn Radack (advisory) - lawyers against: none - William Edward Binney, John Kirk Wiebe - Note: No important trial involving William Edward Binney or John Kirk Wiebe. Main trials were against Thomas Andrews Drake, and the landmark case Jewel v NSA. William Edward Binney had to sue only to retrieve his personal belongings taken from him during house raid. - lawyers for: John K. Wiebe, William Edward Binney - lawyers against: Rod J. Rosenstein (supervisory) - Perry Douglas Fellwock - Note: No trial involving Perry Douglas Fellwock - lawyers for: no public info (??? seems unclear) - lawyers against: none ### Misc - empty ## Details of US govt whistleblowers/leakers who did not leak classified documents or information, and were not imprisoned US govt whistleblowers and leakers, did not leak any classified documents or information, not imprisoned (sorted by date, incomplete list) - John R Crane - James Robertson - Robert J. 
MacLean - leaked SSI, which is unclassified but restricted - Diane Roark - todo more info - todo ## Details of spies against US govt - todo # File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/my_projects/my_projects.md 2025-07-11 # My projects #### Search Engine for Books Time duration of project: 2024-07 to 2025-01 Update (2025-06-05): [HN post with 250+ upvotes](https://news.ycombinator.com/item?id=44176514) says there's demand for this. Is there??? - **Not hosting anymore due to hosting cost. Can host if some users are willing to pay for access.** - Uses AI (openai text-embedding-3-small) to search a large collection of books (libgen english epubs) - Intended for researchers - [Contact me](../connect_with_me/contact_me.md) to try it out - Free for 30 min = Search entire libgen, 900 seconds per search (Contact me and I'll host it just for you) - Pay $50/mo = Search entire libgen, 900 seconds per search. (I'll rent 4 TB SSD, and load Qdrant snapshot) - Pay $100/mo = Search 10% of libgen, <5 seconds per search. (I'll rent 256 GB RAM, and load DragonflyDB snapshot) - Pay $800/mo = Search entire libgen, <5 seconds per search. (I'll rent 2 TB RAM, and load DragonflyDB snapshot) - Support this project - Inform me if any cloud provider is hosting fast SSDs (>8 GB/s sequential read throughput). - Find paid users / donate / host it yourself. Contact me to request access to the embeddings. More notes - Target market - I'm guessing this will be more useful for people who want to do an intermediate level of research into existing work. - Basic: If you've not already done 50 google searches and quickly skimmed through top 3 standard books (let's say reddit recommendations) on a topic, you might want to go do that first before using this search engine. - Intermediate: If you've already done this much basic research and want more recommendations, then this search engine is for you.
- Advanced: Let's say you're a PhD researcher who has already spent multiple years on a topic, and you already have spent significant time designing a custom solution for how to filter the latest papers. This search engine is likely not as useful for you. But it couldn't hurt to try. - Budget - Total spent out-of-pocket so far = ~$2600 = ~$1000 (openai embedding API) + ~$1600 (CPU, disk, bandwidth etc) - Ongoing spend = $26/mo ($24/mo hetzner + $2/mo aws s3 DA; storing embeddings and snapshots, in case someone wants to host this in future) - Developer notes - Dataset = ~2 TB embeddings ~300M vectors; from ~300 GB plaintext; from ~7 TB ~700k unique english epubs; selected from ~65 TB libgen database - Embedding model = openai text-embedding-3-small - Database and search algo = Qdrant (slow, cheap, disk-based embedding search), DragonflyDB (fast, expensive, RAM-based embedding search). Both tested. - Languages/Frameworks used = perl, bash, nginx, .... mojolicious, jq, htmlq, gnu parallel - More Developer notes - Used bash and perl pipelines in all steps (extracting plaintext from epubs, converting to openai jsonl format, queueing them for openai servers, loading results into DB) to max out disk throughput - Abandoned implementation in nodejs and python in order to avoid memory overflow and increase disk throughput. - Had to figure out some tricks to ensure the entire codebase operates as a pipeline, not batch-wise. For instance unzipping epubs in memory not disk to avoid hitting disk I/O limits. - OpenAI BatchAPI rate limit documentation is bad, had to figure out some hacks like sending 25 "requests" per batch file, 2048 strings per "request", 20 batch files at a time. - This allowed me to process the queue in 2 weeks instead of 6 months. - Used OpenAI text-embedding-3-small - Abandoned an open source model on rented vast.ai GPU due to bad search accuracy. Realised many embedding models are overfit to MTEB and perform poorly on real data.
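The Batch API chunking hack described above (2048 strings per "request", 25 requests per batch file) can be sketched roughly as follows. This is a minimal illustration, not the author's actual code (the real pipeline was perl/bash); the function name is hypothetical, and only the chunk sizes and the `/v1/embeddings` Batch API JSONL request shape are taken as given:

```python
import json

def make_batch_files(texts, model="text-embedding-3-small",
                     strings_per_request=2048, requests_per_file=25):
    """Chunk a list of strings into OpenAI Batch API JSONL payloads.

    Each JSONL line is one /v1/embeddings request carrying up to
    `strings_per_request` input strings; each returned string is the
    contents of one batch file with up to `requests_per_file` lines.
    """
    files, lines = [], []
    for i in range(0, len(texts), strings_per_request):
        lines.append(json.dumps({
            "custom_id": f"req-{i // strings_per_request}",
            "method": "POST",
            "url": "/v1/embeddings",
            "body": {"model": model, "input": texts[i:i + strings_per_request]},
        }))
        if len(lines) == requests_per_file:
            files.append("\n".join(lines))
            lines = []
    if lines:
        files.append("\n".join(lines))
    return files
```

With the defaults, each batch file carries up to 25 × 2048 = 51,200 strings; uploading ~20 such files at a time matches the throughput trick mentioned above.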
- Hetzner + DragonflyDB worked out a lot cheaper than any hosted embedding search solution. - Used and then abandoned Pinecone due to cost. - Byte numbers used for indexing. - Wrote and then abandoned my own custom CFI parser in javascript. Realised that since there's no reference CFI implementation, every library does its own custom implementation that doesn't match the spec. Hence the spec is not worth following. - Research into better embedding search. - Realised embedding search is better than finetuning, prompt injection or any other method as of 2025-01. - As of 2025-05 I think the bottleneck to faster embedding search on > 100 GB plaintext is a cloud provider that hosts fast disks. Disks with > 8 GB/s sequential read speed are available for consumers but cloud providers are still stuck with 1 GB/s. - Understood different state-of-the-art embedding search algos and implementations such as Microsoft DiskANN, Google ScaNN, FAISS. Implemented locality-sensitive hashing. Understood why in-memory databases are difficult to program. Understood why graph-based methods (like HNSW) outperform geometric-based methods (like LSH). More notes on this on my website or elsewhere. #### Real-time voice translation Time duration of project: 1-3 days Objective: - Translate Alice's voice for Bob to hear in Bob's language. Translate Bob's voice for Alice to hear in Alice's language. - Neither person should hear translation of their own voice. - Alice and Bob could be in the same room physically or in different rooms. - Neither person should hear noise due to closed loop between a mic and speaker. Zero-code solution 1. Open: Realtime API in OpenAI playground in macOS Safari. Input: macOS mic 2. Open: Zoom. Input: Loopback Audio. Output: macOS speaker 3. Open: Rogue Amoeba Loopback.app. Create new device. Safari 1&2 -> Channels 1&2 Do this on only one device for translation one way. Do this on both devices for translation both ways.
Once you have this setup working, you can also connect headphones for better noise cancellation if both people are in the same room. Only change required is Zoom Output: Headphone Prepend each prompt with "translate to French/Chinese/etc" either by speaking these 3 words aloud, or by writing an app that can do it automatically. (I can host this if there's demand.) #### Tokens for tokens Time duration of project: 1-3 days - Not hosting anymore due to lack of user demand - OpenRouter offers the same feature now - Pay for OpenAI API using cryptocurrency, at a discounted rate, anonymously - AI model: openai o1; Payment provider: Optimism Rollup (lower tx fees compared to ethereum mainnet); Currency supported: USDC #### Screenshots ![Booksearch Screenshot](../../non_text_non_video/booksearch_screenshot.png "Booksearch Screenshot") ![Tokens for tokens Screenshot](../../non_text_non_video/tokensfortokens_screenshot.png "Tokens for tokens Screenshot") # File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/my_statements/declaration_of_war_against_ai_companies.md 2025-05-28 # Declaration of war against AI companies Update: I now have a document called "information policy" that describes in more detail when I do and don't respect privacy for others. #### Disclaimer - This document contains politically sensitive info. #### Summary - I support most non-violent methods of stealing information from superintelligent AI companies and the govts, hardware manufacturers and investors that support these companies. ## Main As of 2025-04, I endorse the following actions that violate liberal consent-based norms against people involved in building superintelligent AI. - If you commit any of the mentioned actions against the mentioned people with any of the mentioned goals in mind, I morally support you. - I can't define very strict boundaries, but I will define them loosely.
#### Actions I morally support you if you do this - Without consent, publicly publish interviews of them or anyone in their social circle. - Without consent, publicly publish video recordings of them at home or at the company office or anywhere else. Includes recording made by computer/mobile camera, cctv or drone. - Without consent, plant recording devices such as cameras in their homes or offices. - Hack their personal or work computers. - Steal their personal or work computers or disks by breaking into their houses or offices but without use of violence on any person. - Without consent, publicly publish any material found on their personal or work computers or disks, obtained by hacking or non-violent theft. - Deanonymise any of their online profiles. - Without consent, publicly publish any material found on their online profiles, including those found by deanonymising them. I do not morally support you if you do this - Fake significant emotional investment (as a romantic partner, friend, family member, work colleague etc) in order to infiltrate their social circle - Threaten physical violence against them - Physical violence against them, including injury or murder #### People List of companies publicly aiming to build superintelligent AI: - Incomplete list as of 2025-04: Deepmind, OpenAI, Anthropic, Meta, xAI, SSI, Reflection AI, Deepseek (China), AI21Labs (Israel). I morally support you if you take the specified actions on any of these people: - Employee roles - All tech employees working at an ASI company. - Leadership roles - Leadership of an ASI company. - Major funders of an ASI company. - Includes leadership of Microsoft, Google, Apple. - Includes any private investors funding >$100M to an ASI company. - Leadership of semiconductor manufacturer TSMC. - Leadership of semiconductor design companies including Nvidia, AMD and Apple. - Leadership of intelligence community or executive branch of USA, UK, Israel, China or Taiwan.
I do not morally support you if you take the specified actions on any of these people: - AI researchers not working at a company publicly aiming to build superintelligent AI. - Investors funding AI companies that are not publicly aiming to build superintelligent AI. - Employees at semiconductor manufacturing or design companies. - Anyone else on Earth. #### Goals I morally support you if this is your primary goal - Acquire data about these people in order to share it with the public, even if doing so causes some suffering to them - I'm assuming that you might therefore make an attempt to redact information not relevant to this goal. For example, this includes information about their personal lives that does not give you information about their values. - I'm assuming that people in the public may find some way of meaningfully using this information to further societal change such as a global ban on ASI. I do not morally support you if this is your primary goal - Intentionally increase suffering of these people - Assumed that intentionally increasing suffering may be done to directly dissuade people from working on ASI by spreading fear - Sell the information for money or political influence (exceptions exist) #### Why? - I think there's a significant probability inventing ASI is going to lead to either extinction of the human race, or a global dictatorship with expected lifespan >100 years. - I think there's a significant probability one of these companies will succeed in building ASI by 2030. - For more exact probability estimates or reasoning behind the estimates, read my other post or go ask someone else who has also thought about it. Many of the people on [this list](https://safe.ai/work/statement-on-ai-risk) have made public statements in various podcasts, blogposts and so on. #### Proof of non-violence - Evidence that I do not currently support or engage in violent actions as of 2025-04 - I have no criminal record.
- If I am investigated by law enforcement, I will comply with said investigation. - My opsec is not very strict. - You can probably get a meeting with me or people in my social circle if you have a good reason to do so. - If you are considering committing violence, do not contact me. Like I said, I will comply with any investigation against me. #### Potential consequences of declaring war (As of 2025-04, I am willing to accept all the consequences of my actions that I can foresee.) External, world - People working at AI companies or in govt may take hostile actions against people similar to me. - People working at AI companies or in govt may refuse to be persuaded by arguments around ASI timelines, x-risk, totalitarian risk etc. (Applying force often means giving up on persuasion.) - People working at AI companies may endure significant suffering as a result of information stolen. In worst case they may commit suicide or commit significant crimes in response. - People working on AI safety and governance but via more peaceful approaches may isolate from me, and may take hostile actions against me. - All these hostilities may last multiple generations and I may be permanently reducing the success rates of anyone's peaceful approaches at solving this problem. External, me - I may be permanently damaging my reputation, and people's ability to trust me. - Since I have endorsed this set of actions on this particular target, people will correctly predict that there's an increased likelihood I may later endorse these or other actions against another target. - People following me may escalate to violence against AI companies. I will be partially responsible, as I provided moral support for similar actions through this document. - I might be investigated by law enforcement if crimes against AI companies occur, including violent crimes.
- People working at AI companies or in govt may isolate from me, and may take hostile actions against me. - I may be permanently reducing the set of people who are willing to interact with me in any capacity, be it work or personal. Internal - I may be permanently or temporarily damaging my ability to empathise or connect with other people. (People who believe in pacifist principles seem happier to me on average.) #### Morality of war - I don't have an airtight moral justification for entering this conflict. But my guess is I am morally okay with it and I am unlikely to regret it later in my life. - Some claims I do believe are below. They're not defined very rigorously or presented very clearly. - Incentives shape morality - People often take self-interested actions available to them in their circumstances and then retroactively invent moral justifications for it. - The nicer a person's circumstances are to begin with, the nicer their morality can be in response to their circumstances. - The circumstances around inventing ASI are sufficiently dire that being nicer than the circumstances is a very low bar. I am clearing this bar and therefore I feel morally okay with my actions. - There are lots of incentives pulling the world towards more transparency and less privacy. It seems very likely to me that, in the absence of at least a significant coordinated effort otherwise, the end state equilibrium likely includes: - less privacy for everyone as compared to privacy everyone has in 2025 - more people inventing and then following moralities that accept this end state as normal or good - Given that the end state equilibrium is likely this anyway, I would like to ensure this equilibrium also features increased transparency of elites who currently have not earned a high level of trust from me (or, I'm guessing, of the general public).
- [Power buys you distance from the crime](https://acesounderglass.com/2019/08/01/power-buys-you-distance-from-the-crime/) - If I am willing to publish stolen information and I wish to minimise probability of regretting my actions later, then I should also be willing to endorse stealing the information in the first place. This is a heuristic not an absolute rule. - Value stability - I have been thinking along similar lines since 2023. It is 2025 now. This indicates I'm less likely to regret my actions later. - Great filter - There is significant likelihood someone will deploy an intelligence-enhancing technology by 2050, unless there is significant effort by many people to prevent this outcome. - I want to have an impact on the future post-2050 that is not zero. This means I have to pick a life trajectory that explicitly acknowledges the fact that these technologies may be coming. There are a small number of such life trajectories and I am picking one of them. # File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/my_statements/ban_superintelligent_ai.md 2025-04-29 # Ban superintelligent AI I support a complete international ban on building superintelligent AI. - The ban must apply to every person and organisation on Earth with no exceptions. - I support passing laws in all countries that ban building superintelligent AI, and allow imprisoning any developers violating this law. - I support any country risking nuclear war to prevent other countries from building superintelligent AI. # File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/projects_for_you/project_ideas_for_you.md 2025-07-11 To do for self - rewrite this whole post once you have more clarity. this post is somewhat of a mess right now. 
# Project ideas for you #### Update - Update (2025-07-11) - The key assumption I make is that if I could get a global referendum on whether specific technologies should be built (artificial superintelligence, human genetic engineering, etc), it would agree with me on a pause, despite coming from very different cultures than mine. - This is an assumption I still need to validate before I'm confident in it myself. - Update (2025-06-18) - I think the main problem with this post as it's currently written is that it discusses projects and technologies too much in isolation and not in terms of the "movements" they unleash on the world. - A "movement" includes a cluster of technologies, an ideology/morality, a mechanism of acquiring capital or attention, and an equilibrium (in the resulting power struggle) that decides how that capital and attention ends up being distributed. - Current views - I'm weakly confident it's net bad to accelerate ASI. - I'm highly confident it's net good to accelerate energy. - I'm weakly confident it's net good to accelerate societal transparency using internet. - I'm weakly confident it's net good to accelerate societal transparency using internet, before accelerating any intelligence-enhancing tech (ASI, human genetic engineering, BCIs, human connectome research, neuropharmacology) - In general I find human genetic engineering and BCIs to be poorly studied movements, at least in terms of supportive ideologies and resulting distribution of power. - I'm weakly confident at least some people should be trying to search for and create more movements.
- Summary of the post - accelerate: solar energy, fusion energy - stop: artificial superintelligence, human genetic engineering, genetic engineering more broadly, brain computer interfaces - stop: bioweapons - accelerate: internet anonymity - accelerate: psychology, erase language barrier - ignore unless special insight: quantum computing, carbon capture, anti-aging, nanotech, cryonics, macroeconomics, etc - There's a long list of fields I've invested some time into and advise you not to invest your time into. - no opinion yet: solar geoengineering #### Intro If you don't want to read a lengthy post, just look at these price charts. Many of the projects I suggest are direct implications of these graphs. Historical price charts - [Computer memory, storage](https://ourworldindata.org/grapher/historical-cost-of-computer-memory-and-storage) - [GPUs](https://ourworldindata.org/grapher/gpu-price-performance) - [DNA sequencing](https://ourworldindata.org/grapher/cost-of-sequencing-a-full-human-genome) - [Solar PV modules](https://ourworldindata.org/grapher/solar-pv-prices) - [Battery cells](https://ourworldindata.org/grapher/average-battery-cell-price) - Sensors - I can't find a good graph for this yet but cost of sensors (electric sensors, light sensors, transducers) has gone down, which can potentially drive down cost of scientific instruments in basically every field. Cost reduction in scientific instruments in turn drives scientific progress. - Example: gigapixel camera lenses, DNA nanopore sequencers, brain computer interfaces - all these research fields as of 2025 seem to be working more on electronic hardware and software than on optics, genetics or neuroscience respectively (please correct me if I'm wrong). #### Full post What? - Here's an (incomplete) list of projects that might be worth pursuing. - Feel free to complete any of them if you find them interesting enough. - You'll probably also get other ideas if you browse through my website.
I'm hoping to minimise duplication of info on my website. - I might work on some of these myself in future. Important? - What is "important" depends on one's worldview. - This list is biased towards my worldview. - I would encourage you to form your own independent view of what is important or meaningful to you and select projects based on that, instead of blindly copying my view. - Since your ability to succeed at a project depends on your skills and your interests, it seems to me worth listing many options rather than few. - It's possible some of these ideas are completely off-base. If so, please let me know. For each of the projects listed, you can directly work on the project. Or you can do "meta" work that enables other people to complete the projects. Depending on the project, you can identify which kinds of meta work will be more useful and which will be less useful. Examples of meta work - find people who can fund the work - find people interested in doing the work and connect them to each other - raise media attention for the work to find more interested people - figure out business models if possible (Usually the most impactful+tractable+neglected opportunities to improve the world are hard to commercialise. Once you can find a way to commercialise them, they become less neglected) - set up research labs or offices with infrastructure conducive to the work, and reduce the entry barrier for interested people to get access to the infra - publish reviews of existing work. write about failed agendas so researchers won't waste time on the failed ones. - train people who are interested in doing the work but lack skills ## Movement: Stop superintelligent AI #### AI research - artificial superintelligence - Please read my post on [AI timelines](../my_ideas/my_ai_timelines.md) or related posts made by others. - They explain my guesses on why the AI labs in San Francisco might invent artificial superintelligence in the next 10 years, and what its outcomes could be.
- If such arguments persuade you, consider working in the field of AGI/ASI safety. - Consider publicising this issue to more people. - Make a public statement on youtube indicating your beliefs, and maybe why. Here's [my video statement](https://youtube.com/shorts/T40AeAbGIcg?si=o05tezYK5ojX5Tq_). - Random idea: Consider uploading a youtube video where you print LLM weights on sheets, hire ~100 people for a year (or pretend to) and ask them to compute a forward pass by hand. My guess is this will make it easier for a layman to understand how AI actually works. - Random idea: Host older models like GPT2 (125M, 2019) and GPT-NeoX (20B, 2022) along with newer models like GPT-3.5-Turbo (175B?, 2022) and o3 (2T?, 2025) so that non-technical people get a better grasp of scaling laws. This is trivial to do using huggingface TGI but needs >$100/mo in funding for GPUs. - I hesitate to make specific recommendations on which exact project or job role is good; you have to figure that out for yourself. - Please don't blindly defer to others when making this decision. - I think the lesswrong forum is a great starting point for your exploration, but don't get stuck there. - I think debating on an internet forum is by far not the highest leverage move you have available to you. - I think a lot of org leaders in EA/LW spaces are self-interestedly protecting their own reputations when they invoke arguments like the "unilateralist's curse". - I also think there's the usual phenomenon of people who just like maintaining a steady tech job and arguing on an internet forum, and will invent rationalisations for why that's the best thing for everyone to do. If you are a high-agency person, don't blindly take advice from low-agency people. ## Movement: Increase societal transparency using internet #### Anyone Life - Live a unique or unusual life if you feel you want to do this. - You are normalising other people also leading unique lives.
- You will create more raw data on what the outcomes of living that particular life are like. If enough people do this, eventually some people are likely to find better ways of living. Writing - Write a book about your life that will be released after you die. Prefer including information that you don't feel safe sharing while you are still alive. - Objective: People of later generations will get an unfiltered view of the good and bad in your life, and can use this information to make their lives and their society better. - If the information will negatively affect other people who'll outlive you (such as family members), consider leaving the book with a young lawyer / trusted person and a date when they should release the book. - Publishing it (getting an ISBN or whatever) is optional. Put the book online and pay a trusted person to ensure the server hosting it doesn't ever go down. - You can also start a non-profit that popularises this "autobiography after death" idea and handles the legal, payment, and webhosting aspects for everyone else who wants to launch such a book. - If you hold any well-paying job, record videos of your day-to-day life at work. This will help others more quickly acquire the tacit knowledge needed to get your job. #### Politics - Consider starting a youtube channel and getting enough attention to influence politics. - Which niche you want your initial userbase to come from is up to you; the important thing is to not stop there and to eventually go for mass appeal. - Which country's politics you'd like to influence is up to you. - You probably have the most knowledge about the country you live in. - You might be able to take a neutral or contrarian stance on a country if you don't live in it. This might improve collective truth-seeking. #### Media/journalism - Consider starting an independent espionage org to infiltrate and HD video record an org doing something of questionable morality. For example, the Mole (anti-NK) or Edward Snowden (anti-US).
- Consider NOT mentioning my name anywhere; I make no public claims on which orgs do or don't deserve to be spied on. - Consider operating a SecureDrop server at your media org - This plan requires 3 people: a journalist who can filter and publish info, a sysadmin who can manage the server, and a politically connected individual who provides immunity to the other two. #### Biotech Research - Look into the feasibility of mics and cameras implanted into the human body. (If this turns out feasible, carefully pick who to disclose this information to.) #### Cybersecurity Institutions - Consider building an independent cybersecurity team that attacks powerful institutions and publishes its hacks to the public, instead of selling its services or zerodays to an intelligence agency. - You can hack a large company or govt. I make no public claims on which companies deserve to be attacked. - I make no public claims on what you should redact before publishing it. The minimum is a handful of corporate documents which you consider incriminating; the maximum is unredacted DB snapshots. - For practical reasons, you're likely to be more effective if you pick targets outside of the nuclear state you live in and its allies. If you're picking targets within your state, consider applying for asylum outside of it after the attack. - Many people are likely to see this as net negative. I don't share their view, but also there are nuances here which I haven't figured out yet. - Consider operating a SecureDrop server at a media org - This plan requires 3 people: a journalist who can filter and publish info, a sysadmin who can manage the SecureDrop server, and a politically connected individual who provides immunity to the other two. Hardware - Improve airgapped tech - Test all the Faraday cages available online for security. Publish recommendations based on security, price, blending in with society, etc - Handmade radios that can read wifi IP packets. (I'm not sure this is even possible.)
Software - Improve internet anonymity - Payment-based (or proof-of-work-based?) firewalls so server owners don't need to deanonymise and demand "reputation-building" of IP addresses to defend against DDOS. Proton and Brave already have working implementations of proof-of-work captcha; I'm unsure why they haven't replaced Cloudflare yet. So you could figure that out. - Pay-for-bandwidth on Tor (or something Tor-like), so nobody has to altruistically donate bandwidth on Tor. In theory, it seems trivial to me to put three Monero payments inside the three hops. - Invent something Tor-like that resists analysis of metadata and traffic stats better than Tor - Improve airgapped tech - Invent a cli tool that's pgp but doesn't suck. - I'm actually upset by how much human potential has already been wasted, compared to a world where the pgp cli wasn't cumbersome to use. - Increase ease-of-use of LUKS disk encryption. - Help with a mature implementation of Shamir secret-sharing that can ship with linux. - `ssss` already exists; I'm not sure what the process is to make it as trustworthy as pgp, for example. Probably just more cybersecurity researchers using it and reporting issues? - Create a Linux phone that doesn't suck. - Will need to get popular enough that all popular apps also ship a linux version of their app. Some apps can run as websites and send push notifications with the browser in the background. Other apps may benefit from running locally (such as maps, which requires GPS). - ios and android (and the mobile browsers) would be a lot less locked down if linux phones existed as a competitor. It would be easier to configure the firewall yourself, install custom apps and trust that there isn't a software backdoor in the OS.
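The core math behind the Shamir secret-sharing idea mentioned above is small enough to sketch. This is toy code over a Mersenne prime, purely illustrative; it has none of the hardening (side channels, share authentication) a mature shipping implementation would need:

```python
import random

# Toy Shamir secret-sharing over GF(p). The secret must be an int < PRIME.
PRIME = 2**127 - 1  # a Mersenne prime

def _eval_poly(coeffs, x):
    # Horner's rule mod PRIME
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % PRIME
    return acc

def split(secret, n, t):
    # Degree t-1 polynomial with the secret as the constant term;
    # any t of the n shares determine the polynomial, fewer reveal nothing.
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(t - 1)]
    return [(x, _eval_poly(coeffs, x)) for x in range(1, n + 1)]

def combine(shares):
    # Lagrange interpolation at x = 0 recovers the constant term.
    secret = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

For example, `combine` applied to any 3 of the 5 shares from `split(secret, n=5, t=3)` returns the secret.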
- Misc - Good SFTP GUI clients for android and ios (iphone) - Support image/audio/video thumbnails and previews, a GUI file browser, and internet connections that are really good (low latency) and really bad (high latency, high packet loss) - the ios filesystem probably sucks on purpose so people are forced to purchase an icloud subscription (or purchase macbook+airdrop) instead of renting a linux server. The bigger move to play here is to launch a linux phone (see above). - Look into Denuvo game cracks and study the state of the art in code obfuscation and reverse engineering. - Using deliberately complicated data structures is one of the last defences against a world of complete transparency; see how well it actually holds up. Writing - Write about all the ways AI is automating the cyberattack and cyberdefence pipelines, including pentesting, reverse engineering, static analysis, and so on. - Write about hardware backdoors. - Write about cyberwarfare from a game theory lens. - Find zerodays, leaked identity databases, hacking software etc and publish about them publicly. - It would be nice to get more public information on what the state of the art looks like here. - There's a responsible way to do this; figure that out for yourself and follow it. #### AI and information tech Software - LLM embedding search projects - Obtain an upper bound on the effectiveness of LLM embedding search-based stylometric doxxing. - Datasets and embeddings - LLM embedding search over Libgen books' text. - Update: I've done this. - LLM embedding search over CommonCrawl plaintext. - LLM embedding search over ethereum blobdata or another hard-to-censor data store. - LLM embedding search over whistleblower-leaked datasets. - Programs - cli tool to convert all file formats to plaintext. - Standardised file formats to share embeddings - LLM language translation projects - Why? - People of different countries have significant differences in the values they live by.
Erasing the language barrier may lead to more homogenisation of these values globally. - Integrate LLM language translation better with the browser and OS. - Realtime voice language translation for in-person conversation. Connect to two pairs of headphones and two screens. Play the translation while the person is still speaking, and subtract the original from the headphone mix. [Example by Twilio](https://www.loom.com/share/71498319660943638e1ef2c9928bcd2a) - Pricing is an issue; it will reduce with time as GPU FLOP/$ goes down. As of 2025-04, gpt-4o-mini realtime API costs for audio output ~ `($20/1M tokens) * (20 tokens/s) = $0.0004/s = $1.44/hour` - Update: I found a hack for this. Use rogue amoeba loopback to pipe safari audio output (openai playground output) to zoom input. - Language learning content - Possibly produced by video generation AI - Produce video content aimed at language learning based on "comprehensible input" (video content of real-life scenarios where visuals help explain to the viewer what the audio means). Prefer near-zero large gaps in the audio, content that is entertaining and not just educational, and progressively increasing vocab size and grammar complexity based on the learner's language level. Repeat this for all pairs of popular source and destination languages. - One-click LLM-generated summaries of every ~10 pages of any libgen epub. Infra - Hard disk "dead drop" marketplace - as a replacement for torrents for large datasets #### Any STEM researcher Writing - Make a blogpost with photos of the latest equipment in your lab, short descriptions of what each piece of equipment does, and short videos showing how to operate it. - Target audience: STEM-degree holders who know nothing about your research field - Objective: someone can quickly get an idea of what your field of research does on a day-to-day basis, without having to complete a bunch of courses. - Ideally you could even publish video recordings of all your experiments which are published in journals.
It will be easier to understand, replicate and notice errors in experiments if there are publicly available video recordings of them. - Update: jove.com seems like it already does this. It's not open source though; it seems to follow a journal subscription model. An open-source resource may still have value. - Identify Hamming questions in your research field and write about them. Hamming questions are basically the set of questions that would accelerate the research field *the most* if solved, and have a non-negligible chance of being solved if people tried solving them. For many fields, the Hamming question is IMO inventing a data collection tool (like a microscope, brain probe, cyclotron, etc.) that provides data at a resolution/cost/speed not available with existing tools. - In general if you have any sort of expertise that is rare on Earth, consider writing about it. If you're patient you may be positively surprised how many people read it. #### Psychology - Consider starting a youtube channel on psychology or cultural issues. - IMO a lot of problems in the modern world can be solved through changes in culture, even if there are no significant changes in material circumstances. So a few people like you could put many countries' populations on a better path. - Psychology research using the internet. - The internet is bringing to public light a lot of private information about people's life experiences. You can study this to advance psychology as a field. - If you have a sufficiently large social media following, you can also run surveys with much larger sample sizes than a handful of college students. ## Movement: Accelerate energy #### Energy - I think a lot of cultural problems in the world would be improved if everyone involved were more wealthy. - IMO it's important that wealth is measured in terms of real resources like food, water and energy consumption per capita, and not measured only in terms of dollars. The abstraction is helpful in some contexts but confusing in others.
- Energy prices have been stuck at around $0.10/kWh for over 50 years, and downstream of this, a significant fraction of humanity has been stuck in the middle and lower socioeconomic classes. - Bringing everyone to the upper and upper-middle classes will allow everyone to have the option to lead lives more independent of each other. - **Giving all individuals the ability to exit unhealthy family relationships, workplaces, communities, nations etc will help fix a lot of cultural problems in society IMO.** - Material abundance has generally correlated with peace. I'm still unsure about the underlying dynamics though. - IMO this is one of those rare ideas that is radical and yet agreed on as a good project by most people, if they think enough about it. Research, writing - Work on reducing the cost of solar energy further, or write about why this can't be done. - AFAIK solar is already the cheapest energy source on Earth, at around $0.05/kWh. Forecasts for 2030 seem as low as $0.02/kWh, but I'm unsure. - I have tried reading about the entire production process but still haven't figured out exactly which process improvement was the critical one. Or how to forecast future prices. There's clearly at least 50% overlap between the solar PV production process and the GPU production process, except one is operating in bulk (maximise throughput) and the other is miniaturised (increase performance per unit, even if it reduces throughput). - Consider working on fusion energy, or writing about it. - I'm not particularly optimistic on the plan of using magnetic fields to contain 10M K plasma; maybe pursue a different research agenda unless you have some special insight into why you can make this work. If you spend a few months in the space you'd have more expertise than me and would be better equipped to pick research directions.
I don't at the moment have good recommendations for how to navigate that. You'll have to figure it out yourself. - Figure out the societal consequences of genetic engineering of humans to increase IQ, executive function, etc. Figure out new institutions and ideologies so this research can safely occur. - This is one of the most radically life-altering technologies that may soon exist. I don't yet have a strong opinion on whether it's good for the world for human genetic engg to progress. You will have to form a strong opinion to pursue this, because a lot of people in the field will dislike your opinion irrespective of which side you pick. - My current weak opinion is this is net negative for humanity if pursued in 2025 as a silicon valley startup. - [Related](https://www.lesswrong.com/posts/JEhW3HDMKzekDShva/significantly-enhancing-adult-intelligence-with-gene-editing) - High-quality datasets are a bottleneck to understanding this problem, if you are pursuing anything besides a global ban. - This will require DNA sequences matched to life outcomes for millions of people. Assuming a per-genome cost of $500, it will cost $500M to collect 1M sequences, unless you can persuade people of your cause so they donate their sequences to the public domain for free. - Figure out the societal consequences of accelerating any subfield of genetic engineering. Figure out new institutions and ideologies so this research can safely occur. - Some potential areas: gene drives, reducing the cost of DNA sequencing, automation of any part of genetics R&D such as gene cloning, artificial selection - Assuming you were interested in accelerating genetic engineering: - I'm optimistic on building systems that can search for good edits and populate them across a species, instead of manually searching for that one magic edit. Example: gene drives can populate a species with an edit quickly, but they don't do the search by default.
- I don't yet have enough expertise to suggest which subfields are best, but it's obvious some are much more promising than others. - Work in the bioweapons space, to reduce the risk of anyone using a bioweapon or to mitigate the damage if one is used - I'm not sure what the best projects are in this space; go ask someone who knows about it. Writing - Write about automation of biotech R&D. Optimistic/pessimistic/neutral take, anything is fine. - Write about the history of biotech from 1990 to the present. ## Movement: Stop brain computer interfaces, stop human connectome research #### Neuroscience - Review existing work on simulating the fruitfly brain or figuring out more insights about it. Or write about this topic. - Review existing work on brain computer interfaces. - This work could end up net negative for humanity IMO. Form an independent opinion on whether it's positive or negative, and under what circumstances. #### AI and information tech - Example: [VNC challenge](https://codex.flywire.ai/app/vnc_matching_challenge) in the Drosophila connectome. No neuroscience knowledge needed. Fully map vertices between two graphs (~19k vertices each, ~2-4M edges each) to maximise the number of common edges. - This investigates the connectome hypothesis. Is a weighted graph of neurons (connectome) alone sufficient to learn useful things about C. elegans or Drosophila melanogaster brains, or is lower-level information such as electric signal data needed? ## Movement: Discover new movements #### Political science - Figure out political institutions and ideologies that can safely handle technologies like ASI, BCIs, human genetic engineering, etc - I hesitate putting this up here, the main reason being it's so poorly defined. - Existing ideologies like liberalism, free markets, representative democracy and so on may be suboptimal if not completely incompatible with a society with these technologies. Figuring out alternatives in theory and in practice seems worth doing.
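The graph-matching objective in the VNC challenge mentioned above can be made concrete with a toy scorer. The graphs and mapping below are made-up three-vertex examples, not challenge data, and edges are treated as directed:

```python
def common_edges(edges_a, edges_b, mapping):
    # Score a candidate vertex mapping A -> B: count how many edges of
    # graph A land on edges of graph B after relabelling the vertices.
    b = set(edges_b)
    return sum((mapping[u], mapping[v]) in b for u, v in edges_a)

# Made-up toy graphs: B is A with vertices relabelled 0->2, 1->0, 2->1.
edges_a = [(0, 1), (1, 2), (0, 2)]
edges_b = [(2, 0), (0, 1), (2, 1)]
perfect = {0: 2, 1: 0, 2: 1}
print(common_edges(edges_a, edges_b, perfect))  # 3: every edge matched
```

The challenge is then an optimisation problem: search over mappings (all ~19k vertices, not 3) to maximise this score, using greedy moves, simulated annealing, or whatever else works.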
#### Any STEM researcher Writing - Try to identify if accelerating your field of research is net negative for humanity, and what circumstances are required for it to be positive. - This question doesn't get asked as often as it should IMO, common reasons being people don't want to inconvenience their social circles or leave a well-paying job. - If you think it is net positive, then see the further points. ## Movement: ??? Neuropharmacology I haven't made up my mind on this movement yet: is it net good or net bad? Haven't studied it deeply. Most of the important research happened during WW2 and has led to somewhat of a cultural taboo. #### Pharmacology - Figure out substances that do any of the following, but without the downsides and side effects that existing substances have: - improve socialising similar to alcohol and cannabis, or similar to oxytocin. Especially focus on long-term changes in the ability to experience trust or love. - render someone incapable of lying, similar to barbiturates but more effective. (If you find this, disclose it very carefully; it has huge political implications.) - improve fluid intelligence (IQ), focus, executive function, working memory, long-term memory, etc. - increase human lifespan - In general any sort of work in pharmacology that is aimed at improving the baseline human condition, rather than fixing disabilities, is something I'd be a fan of. - In general pharmacology seems to focus more on changes in low-level cognition with the hope this percolates to changes in high-level cognition. (System 1 versus System 2?) Figuring out ways to directly target changes in high-level cognition seems useful. # Some more projects Here are some more cool projects. I'm less confident they're important or worth doing, but consider looking into them anyway.
#### Politics - Look into the industrialisation of agriculture in India, in particular the politics involved in consolidating small land holdings into big ones so that large-scale automation becomes possible. - 500-million-plus people's lives depend on it. #### Urban planning - Figure out why we don't build one city with a population of one billion - Bigger cities will probably accelerate tech progress, and other types of progress, as people are not forced to choose between their existing relationships and the place best for their career - Assume end-to-end travel time must be below 2 hours for people to get the benefits of living in the same city. Seems achievable via an intra-city (not inter-city) bullet-train network. Max population = (200 km/h * 2h)^2 * (10000 people/km^2) = 1.6 billion people - Is there any engineering challenge such as water supply that prevents this from happening? Or is it just the lack of political elites with the willingness + engg knowledge + control of sufficient funds? - If a govt builds the bullet train network, can market incentives be sufficient to drive everyone else (real estate developers, corporate leaders, etc) to build the city, or will some elites within the govt necessarily need to hand-hold other parts of this process? #### VR - Build VR good enough that online conversations feel equivalent to in-person conversations - This is an alternate approach to the above problem of not forcing people to pick between their relationships and their career - I haven't studied VR well enough to decide whether this is a promising project or not. #### AI and information tech Software - Write better RAM-only software - We're reaching a point where the whole world's plaintext data and most of its programs can run entirely in RAM, rarely ever touching the disk. (As of 2023, RAM is ~$1000/TB; in 10 years it could be below $100/TB, making storing CommonCrawl \*.WET (600 TB) in RAM affordable.) Even backup can be done to another machine's RAM instead of to disk.
Performant RAM-only databases and software are still hard to write however. - In general OSes and software are often written with the assumption that RAM caches some subset of data on disk and disk caches some subset of data on other disks in the network. These assumptions might or might not change in future. - Improve DevOps to set up a cloud machine in < 1 second - Within one second it should be possible to get the following ready: launch a new machine, install requested software (apt install xyz, snap install xyz), libraries (npm install xyz, pip install xyz, etc), load user API keys into the respective tools, load the user's dataset onto disk - There are two software dev paradigms: one where developer time costs more and one where hardware+electricity costs more; this is to speed up the former - (Maybe) Increase ease-of-use of resizing and attaching/detaching block storage without corrupting the filesystem or data. Attaching disks is a faster way of transferring large datasets compared to downloading to disk over the internet. (For all data except large datasets, ideally they're stored in in-memory databases and no disks are needed.) - (Maybe) Provide installed versions of all popular apt, pip, npm libraries on such attachable disks so no separate installation time is required. - Allow fetching them during the first run of the program instead of mandating a separate install step. - Distribute .pyc bytecode instead of python wheels (for all combinations of OS, python and library versions). In general, figure out why npm installs and pip installs are slow and speed them up - (Maybe) Standardise a keystore where someone can dump all their API keys, and import them on any internet-connected machine with a one-liner command + password + 2FA (optional). - (It's possible some of this can already be done a different way, and I'm just ignorant) - Global notification app - (Launching a linux phone is a bigger move than all this though, prefer working on that if possible.)
- Allow any website to register with your app and push mobile notifications through your app - Why? - Most mobile apps could be mobile websites instead, if they had a way to send notifications while closed. - Apple and Google are trying to kill mobile websites in order to push mobile apps instead. Hence mobile websites can't send notifications, and app developers might ship a copy of the browser inside their app instead (Electron). - Open source: mobile OS - no, browser - yes, desktop OS - yes - Sandboxing: mobile OS - good, browser - good, desktop OS - bad - Usage as of 2025: Mobile apps > mobile browser > desktop browser > desktop apps Infra - Set up low-latency ("edge") GPU datacentres in hundreds of cities across the world - i.e. as many geographic locations as cloudflare, not aws - it's possible cloudflare themselves do this over the next 1-2 years - This is a "cool" project that benefits a few use cases (such as AI-enabled cloud gaming) but may not have much effect on more politically relevant stuff like AI timelines or internet privacy. #### Games - Write a game that does not blindly copy ideas from existing game genres. - Games allow you to express a strictly larger set of ideas than films or books. However as of 2024 it's still more common for a person to say that they changed their life because of a book or film than because of a game. If you're imaginative enough, you can change this. #### Robotics - Run an automated kitchen operating at medium scale (1k-1M meals/day). - If you compare society before Newton and today, cooking is one activity that still consumes a huge number of man-hours, just as it did then. - "Scaling laws" + robotics has lots of low-hanging fruit; see DeepMind's end-2024 papers for example. - I would currently recommend *not* advancing the state of the art in AI or robotics.
#### Biotech - Look into making meal replacements (Huel, Soylent etc) tastier, running longer-scale trials, etc - I don't know much about freeze-drying proteins/vitamins/minerals or about the chemistry of taste, so I'm the wrong person to guide you here. - Work in the artificial meat protein space - Again, don't know much about this space personally. # File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/unimportant/misc/I_want_leverage.md 2025-05-20 # I want leverage Disclaimer - This is a quick note. Haven't spent much time on it. Update - I think there's something fundamentally broken about this whole model by Naval, but haven't figured out what exactly. - Internet-copyable products such as software or podcasts are basically just a new way to gain attention. Maybe I should call this "digital attention" or something. - Some people then trade this digital attention for capital (using advertising or subscriptions or whatever). But this is not a mandatory step. - Are you providing advice, are you providing emotional support, are you providing them food/water/basic goods, are you providing them social status? - If you are providing physical goods (like Amazon) you need to charge the end user of course. But in many other cases you can get by without charging most of your users. - Digital attention is very heavy-tailed. A few channels occupy most people's attention and many channels occupy small niches. - Youtube channels can now run entire governments and shopping websites can now run entire economies. - Tech website owners seem to be more aware of this fact than youtubers are. 2025-04-17 #### Main How does someone build massive amounts of influence over society?
I am highly sympathetic to Naval's viewpoint that there are 3 forms of leverage in society: - attention - capital - internet-copyable products - software, blog posts, video speeches, etc You can also merge multiple forms of leverage - For example Zuckerberg has now built leverage in terms of both capital and software. - Vitalik too built leverage in terms of both capital and software, although the increasing political popularity of crypto means he is also building leverage in terms of attention. - Maximum leverage may very well be to build alternate forms of government that operate in internet-native ways, rather than operate in conventional ways but adapted to the internet. Having leverage means you can lend or trade those resources with terms attached - Everyone needs a minimum amount of capital and attention to survive (lower and middle levels of Maslow's hierarchy respectively). If you have more capital and attention than you need for your survival, you may now have the ability to decide who else survives or doesn't. - You might lend it to someone who also has more capital and attention than they need to survive. After enough rounds of lending and trade, these resources eventually end up being used by someone who actually needs them to survive. - (I guess more people find all this obvious for capital than for attention. People can commit suicide for lack of attention just as they can starve to death. Attention can be hoarded or circulated to others just as capital can be hoarded or circulated to others.) Real power versus formal power - It is also important to differentiate between formal positions of power and actual influence on millions of people. A billionaire may have thousands of people listening to him; a random youtube podcaster may have millions of people listening to him. Owning a formal position of power does not mean lots of people are actually paying attention to you or changing their own speech or behaviour as a result of your speech and behaviour.
- The internet allows one to build deeper connections with millions of people than was ever possible before in history. You are still limited on close relationships like friendships; you still can't make a million friendships, for instance. But you can, for example, get real feedback from a million people: collect it, keep track of who is providing what feedback and why, filter it, and connect each piece of advice to the relevant people who will take decisions based on it. This was nowhere near as easy to do in a world where your voters' feedback reached you by postal service via train, for example.

Self-replication
- You might want your leverage to self-replicate after you die. Building a structure (such as a corporation or government) that can meaningfully wield your leverage after you die is one way to ensure long-term influence on society. Some corporations and governments are much more effective than others at wielding leverage after you, the founder, die.
- An even more powerful way of doing self-replication is for millions of people to themselves voluntarily self-replicate your way of doing things. For instance, pro-democracy and pro-market ideology has self-replicated for multiple centuries. An important factor here is that spreading the ideas also means newer believers are motivated to replicate the societal structures that would allow for even further growth. Societal structures provide features such as resilience, legitimacy and accountability. Religions have replicated for millennia. Religions too spread in terms of both ideas and societal structures that rely on the ideas. Starting an ideological or religious movement is one of the strongest forms of leverage to exist.

What leverage do I want, personally?
- I am personally most keen on building an alternate form of government via software, one that does not rely on me acquiring any formal position of power.
- I am also keen on leverage in the form of becoming an advisor to someone with a lot of capital or attention.
- I am somewhat keen, but not very keen, on leverage in the form of books or videos where I directly provide value (such as life advice or career advice) to millions of people. My current guess is this is not what I'm best suited to. I'd prefer building the tools, incentives and culture for other people to do this instead of doing it myself. (I might change my views on this in the future.)
- I am not keen on putting in the work to acquire a lot of capital or attention myself. Apart from requiring a massive amount of effort and time to succeed at (because they're older and more contested forms of leverage), they will also both place constraints on my behaviour which I'm currently not keen on accepting. (I might change my views on this in the future.)
- I am not keen on trying to become a religious/spiritual/ideological leader currently. I don't think I currently have a deep enough understanding of such topics to succeed at starting an ideological movement that survives multiple generations, and I don't think investing a few years of my time will be enough for me to succeed at it. (I might change my views on this in the future.)

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/unimportant/misc/industrial_civ_numbers.md

2025-04-28

# Industrial civ numbers

Disclaimer
- written quickly
- numbers are all bad estimates

Basic numbers of industrial civilisation that I wish I was taught in school instead of having to figure out myself in adulthood. The point is not to memorise the numbers, it is to get an overall picture that is quantitative, and to be able to do these calculations on-the-fly. I'm sure this gets taught in college as energy economics or something; I'm just disappointed I was not taught it.
#### Coal power plant

Electricity cost for consumer : $0.10 / kWh

Bituminous coal : 2.2 kWh / kg, $110 / metric tonne => $0.05 / kWh

Coal power plant construction cost : $3k / kW, 50 year lifespan, 50% efficiency => $0.014 / kWh

Coal power plant operating cost : $0.01 / kWh labour + $0.006 / kWh maintenance labour + $0.004 / kWh maintenance materials + $0.005 / kWh ash pond, scrubbers, etc = $0.025 / kWh

**Cost of coal is the primary cost for cost of electricity**

#### Coal mine

Bituminous coal price : $110 / metric tonne

Thermal coal mine construction cost : $50 / (metric tonne / year), 30 year lifespan => $1.67 / metric tonne

Thermal coal mine operating cost : $35 / metric tonne
- Of which, labour cost : 5 metric tonnes / worker-hour, $25 / worker-hour => $5 / metric tonne

Thermal coal transportation cost : $10-$50 / metric tonne
- Train operating cost : $0.10 / metric tonne / km, 100-1000 km
- Trains are typically run using diesel (see petroleum refinery stats below)

**Cost of transporting coal is the primary cost for cost of coal**

(I need to double-check this with someone. The main doubt I have: electricity transmission loss is 7% / 1000 km whereas calorific value loss for coal transport by train is at least 30% / 1000 km. Why is it assumed cheaper to transport coal than to transport electricity? Typically coal power plants are built closer to electricity demand centres and far from coal mines.)
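The coal plant arithmetic above can be re-derived with a few lines. A sketch, with my own assumptions: the 50% figure is treated as a capacity factor, and discounting is ignored.

```python
# Rough re-derivation of the coal power plant numbers above.
# Assumption (mine, not from the source): the 50% figure acts as a
# capacity factor, and there is no discounting over the plant lifetime.

HOURS_PER_YEAR = 8760

# Fuel: $110 / metric tonne of coal, 2.2 kWh of electricity per kg
fuel = (110 / 1000) / 2.2                          # => $0.05 / kWh

# Construction: $3k / kW spread over 50 years of half-utilised output
construction = 3000 / (50 * HOURS_PER_YEAR * 0.5)  # => ~$0.014 / kWh

# Operating costs, as quoted
operating = 0.01 + 0.006 + 0.004 + 0.005           # => $0.025 / kWh

total = fuel + construction + operating
print(f"fuel {fuel:.3f}, construction {construction:.4f}, "
      f"operating {operating:.3f}, total {total:.3f} $/kWh")
```

Fuel comes out as the largest single term, consistent with the bolded conclusion, and the total (~$0.09 / kWh) lands just under the quoted $0.10 / kWh consumer price.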
#### Steel plant

Steel cost : $1000 / metric tonne

Steel plant construction cost : $700 / (metric tonne / year), 40 year lifespan => $17.50 / metric tonne

Steel plant operating cost : $900-1000 / metric tonne
- Operating cost, materials : $600 / metric tonne
- Operating cost, electricity : $200 / metric tonne
- Operating cost, operating labour + maintenance labour : $200 / metric tonne

Assuming BF-BOF route, materials required per metric tonne steel
- iron ore : 1.37 metric tonne iron ore / metric tonne steel, $100 / metric tonne iron ore => $137 / metric tonne steel
- metallurgical coal : 0.78 metric tonne coal / metric tonne steel, $120 / metric tonne metallurgical coal => $93 / metric tonne steel
- limestone (quicklime) : 0.27 metric tonne quicklime / metric tonne steel, $150 / metric tonne quicklime => $40 / metric tonne steel
- recycled steel : 0.12 metric tonne

(These numbers don't add up to $600 / metric tonne. Unsure why.)

**Cost of iron ore and coal are the primary costs for cost of steel**

#### Iron mine

To do

#### Petroleum refinery

To do

#### Oil rig

To do

#### Energy mix

https://ourworldindata.org/energy-mix

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/unimportant/misc/per_capita.md

2025-04-23

# Per capita

All macroeconomic and financial metrics should be expressed per capita IMO, by dividing by either national population or global population. This includes market caps of publicly listed companies, national debts, national GDPs, and international debt markets. This makes it a lot easier for the common person to understand. I end up implicitly doing the conversions every time anyway.

For example, if Twitter's market cap is $40 billion, instead say $5 per capita or $117 per capita national. If the Indian budget deficit is $180 billion, instead say $22.50 per capita or $128 per capita national.
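The conversion in the worked example above is a one-liner. The population figures below are rough values I'm assuming (world ~8 billion, US ~342 million, India ~1.41 billion); they are not from the text.

```python
# Convert a headline dollar figure into per-capita terms.
# Population figures are rough assumed values, not authoritative.
POPULATION = {
    "world": 8.0e9,
    "US": 342e6,
    "India": 1.41e9,
}

def per_capita(amount_usd: float, region: str = "world") -> float:
    return amount_usd / POPULATION[region]

print(per_capita(40e9))            # Twitter market cap: $5 per capita (global)
print(per_capita(40e9, "US"))      # ~$117 per capita (national)
print(per_capita(180e9, "India"))  # Indian budget deficit: ~$128 per capita (national)
```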
People working in technical fields often invent more jargon than necessary to justify their own legitimacy, and this is yet another small example of it, I think.

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/unimportant/misc/donate_to_me_asi_leaks_old.md

2025-05-20

# Donate to me (ASI leaks)

Disclaimer
- Contains politically sensitive info.
- Quick note. May update quickly based on new info.

## Update

**This document is outdated, please go see the other document instead.**

## Funding requests

- Request for $40k/year
  - The biggest benefit of getting any non-trivial funding, let's say $40k/year, will be that I could shift to the US and live among a peer group who also care a lot about similar problems. Having a peer group would help me stay motivated and get more work done.
- Request for $1M
  - If I got $1M, I'd use it towards some or all of the following projects. Some of the projects below are bottlenecked by a significant amount of capital.

## List of projects

**Objective: Get secret information out of US companies building superintelligent AI (including classified information). Host it in countries outside the US. Present it to US and world public to influence politics.**

Data acquisition (black)
- whistleblower guide - for whistleblowers at AI companies, who wish to disclose company info to the public
- independent hacker guide - for independently motivated cyberhackers trying to obtain info of AI companies and publicly disclose it
- pre-committed funding and legal support for whistleblowers and hackers

Data acquisition (grey)
- internet doxxing tool - to deanonymise social media accounts of people at AI companies
- drones/cctv outside offices/datacentres of AI companies - to track general info such as employee lists, in and out times, and their emotional states. Google "Pentagon Pizza index" for more.
High attention
- make a list of US-sphere journalists/youtubers who a) understand how to get attention online b) have good opsec c) publish original documents not propaganda pieces d) understand AI risk
- make a list of non-US-sphere journalists/youtubers who a) understand how to get attention online b) have good opsec c) publish original documents not propaganda pieces d) understand AI risk
- work for some or all of these journalists/youtubers to educate them on these topics
- OR, run my own youtube channel with all this fixed

All the data acquisition will be aimed at the leadership and employees of the organisations involved in building ASI. (See my other documents for who exactly is included in this.)

## Capital and attention bottlenecks

- Whistleblower guide
  - Not bottlenecked
  - Will work on writing this guide
  - In short, US whistleblowers leaking classified documents should focus on getting to Russia like Snowden did, instead of improving their opsec and hoping to stay anonymous
- Hacker guide
  - Knowledge bottlenecked
  - I don't know enough to offer them technical advice. Mostly I'll offer moral support and maybe some legal advice.
- Pre-committed funding and legal support for whistleblowers/hackers
  - TO DO
- Internet doxxing tool
  - **Weakly capital bottlenecked**
  - I tried building this by doing embedding search and anomalous word counts on a reddit extract of commoncrawl. This will likely work better as a two-pass system: first pass use PII, second pass do stylometrics.
  - I need capital for more servers, and maybe to purchase some PII datasets similar to whitepages/weleakinfo/snusbase.
- Drones/CCTV outside offices/datacentres
  - **Capital bottlenecked**
  - Need capital for lawyers, and for setting up the cameras
  - This is technically legal in the US (not the UK) but will definitely be contested on legal grounds if I actually did this. This is more of a legal project than a technical or infra one.
- High attention guide
  - **Weakly capital bottlenecked.** Attention bottlenecked.
- Most journalists-by-training lack many of the following skills.
  - How to run good opsec
    - Advertising a SecureDrop-like system or a Signal number or running pgp.
  - How to become popular online
    - Understanding things like the heavy-tailed distribution of attention, the importance of building a brand around your face, and what readers want.
    - Most journalists-by-training are being replaced by youtubers across the US, Europe, India and Russia at least.
  - Understanding ASI risks
    - Having enough technical knowledge about AI and about ASI risks
  - Publishing original documents
    - Many US journalists publish propaganda pieces instead of original documents.
- I'm unsure if I should be trying to work for existing journalists or youtubers and teach them this stuff, or just run a youtube channel myself.
  - If running my own channel, I need to get a little capital and a lot of attention.
  - If helping existing channels, I might not be bottlenecked.

## Impact estimates

- Data acquisition (DAQ)
  - Guides and pre-committed funding and legal support for whistleblowers and hackers releasing classified documents of AI companies.
    - Neglectedness - high, Impact - high
  - Other lower-stakes ways of leaking info, such as internet doxxing or drones/cameras.
    - This is unlikely to uncover the most important information, but it could force the ASI company employees to isolate further from the rest of society.
    - Neglectedness - high, Impact - medium
- First host of documents
  - US-sphere journalists/youtubers will probably host documents leaked by a US-sphere AI whistleblower.
    - At least a few US-sphere journalists/youtubers have good opsec and digital attention-acquiring skills and are willing to publish original documents.
    - I'm assuming one of them will do the job when the time comes. Nothing to fix here.
  - This could change if the US enters a hot war with Russia or China, or there is an emergency of equally large magnitude (which is possible if superintelligent AI is arriving).
    US journalists/youtubers may then be stopped from publishing, and I'd be reliant on non-US-sphere journalists/youtubers.
    - At least a few non-US-sphere youtubers/journalists have digital attention-acquiring skills. Very few have good opsec, publish original documents, or have knowledge about AI risk.
    - May need to work for a youtube channel or run a youtube channel in the worst case.
    - Neglectedness - medium, Impact - high
- High attention
  - Obviously there need to be journalists/youtubers both inside and outside the US trying to raise attention. Both will have separate biases.
  - The biggest problem IMO is that most journalists/youtubers, both in the US and outside, lack a technical background in AI and hence don't understand AI risk. This will cause them to do a poor job in how they cover it.
  - May need to work for a youtube channel or run a youtube channel in the worst case.
  - Neglectedness - high, Impact - high

## Legal, moral

Reputation risks
- If you wish to discuss these projects with me, I can protect your privacy to some basic degree.
  - I will not be revealing the info to people who casually ask me about it. But I will not be able to protect this info against a targeted attack on me or interrogation by law enforcement.
- If you publicly associate with me on these projects, you are likely to see consequences for your reputation.
  - These consequences could be both good and bad. You have to take a personal decision on whether you want to associate with me or not, and to what degree.
  - There is an obvious chilling effect here, which means the people for whom this issue matters most are the ones who are likely to accept the reputation hit first.
  - For example, Laura Poitras and Glenn Greenwald earned a lot of respect from journalists worldwide for publishing Snowden's work. But also, they both found themselves poor fits for their respective jobs at The Intercept, and ended up leaving and working more independently.
Legal
- Publishing guides for whistleblowers and hackers is legal in the US as per the first amendment. Pre-committed funding for a legal defence for anyone who has committed any crime is legal in the US. I am not a US citizen and this does not apply to me.
- The actual act of whistleblowing or hacking of classified info is not legally protected. Previous cases have typically led to imprisonment.
- Doxxing accounts based on publicly available information is legally grey. Civilian-run drone surveillance of public areas is also legally grey. Both might be arguable either way depending on legal resources and the specifics of the case. (I have not done detailed research.)
- Educating journalists/youtubers on opsec or on AI risk is legally safe.
- Publishing US classified info as a journalist/youtuber is legal in the US, but journalists who have done this in the past have faced adverse consequences anyway. See case studies for more.

Moral
- I'm generally willing to support people who obtain info about AI companies and publicly disclose it. And I'm generally unwilling to support people who obtain info for another nation's intelligence agency. Whistleblower versus spy, independent hacker versus state-sponsored hacker, point to this distinction.
- I'm okay with a world with this degree of information collection, as long as a) info is collected about all elites as well b) the info ends up in public and not in the hands of a small group (such as an intelligence agency). I see most of this as fairly inevitable, although the details can matter.
- I generally get the sense that more people are against the doxxing tool and drones/cctv than they are against supporting whistleblowers. I disagree, but I think I understand where this is coming from. I'm open to feedback on this.
# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/unimportant/misc/lw_blogs_scrape.md 2025-06-10 # LW blogs scrape Disclaimer - Quick note incomplete, can contain errors, unlikely to fix https://www.yudkowsky.net https://gwern.net/ https://kajsotala.fi https://www.astralcodexten.com http://www.weidai.com/ https://lukemuehlhauser.com https://www.mccaughan.org.uk/g/ https://vladimirslepnev.me https://sethaherd.com https://jimrandomh.tumblr.com https://muckrack.com/thane-ruthenis/articles https://ailabwatch.substack.com/ https://www.lsusr.com https://theeffortlessway.com https://www.cognitiverevolution.ai/mind-hacked-by-ai-a-cautionary-tale-from-a-lesswrong-users-confession/ http://zackmdavis.net/blog/ https://benjaminrosshoffman.com https://acesounderglass.com/ https://www.benkuhn.net/ https://benlandautaylor.com/ https://thezvi.wordpress.com/ https://eukaryotewritesblog.wordpress.com/ https://flightfromperfection.com/ https://meteuphoric.wordpress.com/ https://srconstantin.wordpress.com/ https://sideways-view.com/ https://rationalconspiracy.com/ http://unremediatedgender.space/ https://unstableontology.com/ https://sjbyrnes.com/agi.html https://sites.google.com/view/afdago/home https://paulfchristiano.com https://paisri.org/ https://katjagrace.com https://blog.ai-futures.org/p/our-first-project-ai-2027 https://www.mariushobbhahn.com/aboutme/ http://rootsofprogress.org/ https://www.bhauth.com/ https://turntrout.com/research https://www.jefftk.com https://virissimo.info/documents/resume.html https://bmk.sh https://metr.org/ https://kennaway.org.uk http://1a3orn.com/ https://acritch.com https://lironshapira.substack.com https://coral-research.org https://substack.com/@theojaffee https://matthewbarnett.substack.com/ https://www.beren.io/ https://www.cold-takes.com https://newsletter.safe.ai https://www.metaculus.com/accounts/profile/116023/ https://www.patreon.com/profile/creators?u=132372822 https://medium.com/inside-the-simulation 
https://www.lesswrong.com/users/unexpectedvalues?from=search_page http://markxu.com/about https://www.overcomingbias.com https://www.vox.com/authors/miranda-dixon-luinenburg https://www.getkratom.com https://github.com/YairHalberstadt https://www.scott.garrabrant.com https://arundelo.com http://unstableontology.com/ https://mealsquares.com/pages/our-team https://evhub.github.io https://formethods.substack.com/ https://www.nosetgauge.com https://substack.com/@euginenier https://drethelin.com https://entersingularity.wordpress.com https://doofmedia.com https://mindingourway.com/about/ https://jacquesthibodeau.com https://www.neelnanda.io/about https://niplav.site/index.html https://jsteinhardt.stat.berkeley.edu https://www.jessehoogland.com http://therisingsea.org https://www.stafforini.com/ https://acsresearch.org https://elityre.com https://www.barnes.page/ https://peterbarnett.org/ https://joshuafox.com/ https://itskatydee.com/ https://ethanperez.net/ https://owainevans.github.io/ https://chrislakin.blog/ https://colewyeth.com/ https://www.admonymous.co/ryankidd44 https://ninapanickssery.substack.com/ https://joecarlsmith.com http://coinlist.co/ https://davidmanheim.com https://github.com/SarahNibs https://malmesbury.substack.com https://www.admonymous.co/rafaelharth https://dynomight.net/ http://nepenthegame.com/ https://github.com/RDearnaley https://graehl.org https://nikolajurkovic.com https://www.julianmorrison.com https://avturchin.livejournal.com https://www.perfectlynormal.co.uk https://www.250bpm.com https://www.youtube.com/@TsviBT https://adamjermyn.com https://www.elilifland.com/ https://zhd.dev https://ollij.fi https://arthurconmy.github.io/about/ https://www.youtube.com/@RationalAnimations/featured https://cims.nyu.edu/~sbowman/ https://crsegerie.github.io/ https://escapingflatland.substack.com/ https://qchu.wordpress.com https://dtch1997.github.io/ https://math.berkeley.edu/~vaintrob/ https://mutualunderstanding.substack.com 
https://longerramblings.substack.com
https://peterwildeford.substack.com
https://juliawise.net
https://uli.rocks/about/
https://stephencasper.com/
https://engineeringideas.substack.com/
https://homosabiens.substack.com
https://martin-soto.com/
https://www.tracingwoodgrains.com
https://www.brendanlong.com
https://foresight.org/fellowship/2024-fellow-bogdan-ionut-cirstea/
https://davekasten.substack.com
https://datapacrat.com
http://admonymous.co/nat_m
https://mesaoptimizer.com/
https://ae.studio/team
https://davidad.org
https://heimersheim.eu
https://nunosempere.com/
https://www.thinkingmuchbetter.com/nickai/
http://kilobug.free.fr/code/
http://vkrakovna.wordpress.com/
https://www.conjecture.dev/
https://ejenner.com
https://morphenius.substack.com/
https://gradual-disempowerment.ai
https://www.clubhouse.com/@patrissimo
https://mattmacdermott.com
https://knightcolumbia.org
https://www.openphilanthropy.org/about/team/lukas-finnveden/

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/unimportant/readme_for_unimportant_folder.md

2025-01-06

# README for "unimportant" folder

Attention is a scarce resource. Whenever I make a bid for your attention (be it in person or online), I only want to push a small amount of content. Anything I do not think is important enough to push on someone with limited attention goes into the "unimportant" folder. My standards for how crisp a world model is or how clearly it is communicated are lower in this folder.

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/unimportant/software_and_ai/software_as_hypothesis_testing.md

2025-04-09

# Software as hypothesis testing

Common goals for building software include acquiring money, acquiring people's attention, and providing them tools to solve a problem of theirs. I realised I'm writing software with a different goal - to test hypotheses about reality, by making contact with reality.
- I wrote the libgen search project not primarily because I wanted to help researchers or make money myself, but because I wanted to check whether it could be built or not. Can an AI recommend me books I can't find myself? The answer I got was yes: an AI can recommend me useful books that I wouldn't find otherwise. AI can improve my epistemology.
- Similarly, I wanted to know if using search tools to seek truth on currently taboo topics is possible or not. So I wrote a tool to search reddit. The tool I wrote did not get me significantly better answers than just making google searches with "site:reddit.com" appended. Although tbh googling reddit posts itself is enough to make progress on topics considered taboo.
- I wanted to know if LLM-based stylometric doxxing is possible. I am yet to get an answer on whether stylometric doxxing is possible, but at least I now have a better picture in my head of what a doxxing tool will look like. It is going to require more grep than embedding search, and it is going to require more crawling of websites that commoncrawl refuses to crawl because it respects robots.txt. Doxxing tools like whitepages likely purchase data from brokers that don't respect robots.txt, hence they have more information. Also, I learned that stylometrics only comes into the picture after you have integrated all the classical approaches to reducing the number of potential matches. Stylometrics can filter 1 out of 1000 people easily but can't filter 1 out of 1M people as easily, so it has to be combined with classical approaches.

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/unimportant/software_and_ai/tokens_per_dollar.md

2025-04-24

# Tokens per dollar

I end up recomputing these numbers many times so here's a handy reference. Feel free to plug in your own numbers.

```
FLOP : floating point operation(s). assume float32 unless specified otherwise.
FLOP/s : floating point operations per second
FLOPs, FLOPS : I will never use this terminology

Given a GPU:
FLOP/$ = (GPU FLOP/s) * (GPU lifespan in s) / (GPU sales price in $)

Given a GPU and an LLM for inference:
$/token = (FLOP / token) / (GPU FLOP/$) = e * (LLM params) / (GPU FLOP/$)

Given a GPU and an LLM for inference:
tokens/s = (GPU FLOP/s) / (FLOP / token) = (GPU FLOP/s) / (e * (LLM params))

where e : number of times each LLM param was accessed (and multiplied) per forward pass
e > 1
(assumes cost of energy consumed over 5 years is much smaller than sales price)
(assumes one inference token per forward pass)
```

Assuming Llama3 405B inference, picking a machine

```
Llama3 405B float32 memory = 405B * 4 = 1620 GB
H200 memory = 141 GB
1620 GB / 141 GB = 11.48 => At least 12xH200 required
```

Assuming 2x8xH200 SXM

```
Total FLOP/$ = (2 * 8 * 67 TFLOP/s) * (5 years) / ( 2 * $300k ) = 2.817e17 FLOP/$
```

Assuming Llama3 405B inference and 2x8xH200 SXM

```
$/token = e * (405 billion) / (2.817e17 FLOP/$) = e * 1.44e-6 $/token = e * $1.44/1M tokens

tokens/s = (2 * 8 * 67 TFLOP/s) / (e * 405 billion) = (2646/e) tokens / s
```

Here's the [OpenAI pricing page](https://openai.com/api/pricing/) for comparison.

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/unimportant/software_and_ai/rl_for_llm.md

2025-04-14

# RL for LLM

I'm writing this more for my own understanding than to teach anyone.

An LLM solves the following problem
- Given some initial tokens t1 t2 t3 t4, guess probabilities of which token t5 comes next.
- You can apply this multiple times. For instance, given some initial tokens t1 t2 t3 t4 t5, guess probabilities of t6.

RL for LLM solves the following problem
- Given some initial tokens t1 t2 and some final tokens t5 t6, guess probabilities of which t3 t4 likely follow t1 t2 and likely lead to t5 t6.
- As an intuition, you can look at a sequence of tokens as a state, and every next token as a state transition.
Given any initial state you can in theory compute a tree of possible state transition sequences from there. RL for LLM is basically trying to efficiently search this tree to get from some initial state to some final state.

Major doubt I have:
- Can we train an LLM but with the entire dataset in reverse? Won't that help solve the problem? Typically, when solving a maze, a human searches forward from the entrances and backward from the exits.
- A reverse LLM would solve the following problem: given final tokens t5 t6, guess probabilities of which t4 precedes them. And then apply this multiple times, so given say t2 t3 t4 t5 t6, you get probabilities of t1.

In order to do RL for LLM, you have to train the following:
- LLM.
  - Given some initial tokens t1 t2 t3 t4, the LLM guesses probabilities of which token t5 comes next.
- Value network.
  - Assume we have a trained LLM.
  - Assume some final tokens t5 t6 as a hard-coded constant.
  - Given some initial tokens t1 t2 t3 t4, the value network guesses the probability that t1 t2 t3 t4 leads to t5 t6.
  - We could get accurate answers if we just exhaustively enumerated every possible t1 t2 t3 t4 and queried the LLM on every such set. However this requires too much compute, so the value network makes its own guesses using some simpler algo.
  - Typically the value network doesn't have just one set of final tokens t5 t6 hard-coded as high value, but thousands or even millions of such final token sets. Then we guess the probability that a given t1 t2 t3 t4 leads to any one of the millions of hard-coded final token sets t5 t6.
- Policy network.
  - Assume we have a trained LLM and a trained value network (with some hard-coded final token sets t5 t6).
  - Given t1 t2, guess probabilities of which t3 t4 lead to any one of the final token sets t5 t6. During training of the policy network, we can query the value network.
  - If you were a human being trying to solve a very big maze you'd probably do something similar.
    First you'd make (possibly incorrect) guesses of which sections of the maze are closer to at least one of the exits, and then you'd use these guesses, plus your current position in the maze, as input to try to guess where to go next.

How to train these models:
- Not discussed here. There are some generic training algos that can produce value and policy networks. It's possible that what works best here is a training algo designed to do RL on LLMs rather than generic RL training algos.

RL for safety
- Safety assumes that some token sequences are "unsafe" and others are "safe". You can basically hard-code some final token sets t5 t6 as unsafe and then solve the problem: given t1 t2, guess probabilities of tokens t3 t4 that likely follow t1 t2 and likely do not precede any t5 t6 in the unsafe set.

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/unimportant/software_and_ai/ml_is_not_geometry.md

2025-04-04

# ML is not geometry

I am recording an intuition I have about ML, just to see if it was right or wrong a few years from now. I have no immediate plan of working on it.

Disclaimer
- There is a small possibility someone uses this and accelerates the ML field by a lot, and this has bad outcomes for the world. Lemme know if you'd rather I take this page down.

In general I have the intuition that ML should be seen in graph terms not geometric terms, because of the curse of dimensionality.
- 1.58-bit or maybe even 1-bit models may converge faster than fp8, assuming the same compute is spent training both. There may not be significant loss of accuracy.
- There may be a graph algorithm that outperforms backpropagation. Instead of imagining backprop finds weight matrices that are geometrically close to the original weight matrices, imagine it searches through "Hamming space" to find bitstrings with low Hamming distance from the original bitstrings.
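The "Hamming space" framing is easy to make concrete. A toy sketch (illustration only, not a training algorithm; the bit patterns are made up): treat candidate weight bitstrings as nodes and pick the one reachable with the fewest bit flips from a reference bitstring.

```python
# Toy illustration of searching "Hamming space":
# distance between two bitstrings = number of differing bits.

def hamming(a: int, b: int) -> int:
    """Count the bit positions where a and b differ."""
    return bin(a ^ b).count("1")

reference = 0b10110010
candidates = [0b10100110, 0b01001101, 0b10110011]

# Greedy pick: the candidate the fewest bit flips away from the reference.
nearest = min(candidates, key=lambda c: hamming(reference, c))
print(bin(nearest), hamming(reference, nearest))  # 0b10110011 is 1 flip away
```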
Evidence in favour of this view
- 1.58-bit quantisation works well at least for inference
- Graph-based embedding search (HNSW, microsoft diskANN) outperforms geometry-based embedding search (LSH, k-means clustering, google scANN)
- ReLU outperforms all other activation functions

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/unimportant/software_and_ai/open_source_search/embedding_search.md

2025-01-10

# Embedding search

I'm writing a post on why I'm so excited by embedding search. It may have massive upsides and downsides. Also, the ingredients for it seem mostly ready. So it'll be difficult to un-invent even if the downsides are greater. It seems better to adapt to a world where better embedding search is present.

A lot of distinct problems in life seem like they could benefit from better embedding search.
- Finding a life partner
- Finding funding or a job
- Finding books, articles, etc relevant to your research
- Establishing trust. Getting diverse information sources about a person helps build trust faster.

Some dual use implications:
- Finding people who are trying to stay hidden because they are afraid of, or competing with, you or someone else. (This is dual use.)
- Finding secrets leaked to the internet (or any private database) by powerful organisations. (This is dual use.)
- In general, even things like finding funding or becoming better at research accelerate everyone, irrespective of what their goals are.

Attention markets on the internet right now are quite power-law distributed.
- If you want to find people of a specific niche, often you have to first get the attention of the mass public, and then filter out the subset of this that is interested in your niche. This is especially true if there doesn't exist an already well-established community with organisers and funders.
- Attention markets are power-law distributed because there are atleast 1 million people competing for attention, but any random user has no more than 10 or 20 internet sources they regularly follow. (Sure, you might follow 1000 Instagram pages, but you're not paying attention to all of them.) If you can't get into someone's top 20 follows, you're at a massive disadvantage when trying to get mass attention. - There aren't clear rules for how to give or receive attention. In practice this leads to an arms race: one side keeps inventing new techniques to spam all the platforms, and the other side keeps raising the entry barrier for anything to get in. Embedding search might (??) also help fix the attention market. For example, I could list 5000 people I'm interested in, and the other person could also list 5000 people they're interested in, and if there's overlap then we connect. At no point does either of us need to send a DM without knowing a good probability estimate of whether this is useful or spam for the other person. Elites use filters - The more you wield a scarce resource such as attention or money, the more people are competing to get in your network or influence your thinking. And the more you need to be deliberate about what filters you use to let people in or out. - A lot of problems in society are causally downstream of principal-agent problems. Elites don't know who to trust and get fooled into doing massively suboptimal things, basically all the time. - Building better filtering mechanisms for elites seems useful. # File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/unimportant/software_and_ai/open_source_search/simplify_embedding_search.md 2025-04-10 # Simplify embedding search tldr complexity bad, graph algo good, geometric algo bad, buckets good Why? - I'm looking for a simple embedding search algorithm because political projects should rely on simple software, not complex software.
- The torrent codebase is simpler than the Ethereum codebase for example, hence torrent is less likely to get politically captured IMO. - Examples of such political projects: search feature for distributed social media, spam filter for distributed social media, spam filter for SecureDrop servers that accept leaked documents, search feature for published leaked documents, etc. - As of 2025, embedding search is a necessary and almost sufficient method of using LLMs on a big dataset. And LLMs are state-of-the-art AI for basically every task. - This could change in the future, for instance it will one day be cheap enough to just feed an internet's worth of tokens as inference tokens and find out the most relevant content. - We also don't yet have good interpretability tools to make sense of layers besides the embedding layer. - Since input is too expensive and other layers are unhelpful, we will likely rely on embeddings. Embedding search: Given a dataset of vectors and a query vector, find the top-10 vectors that maximise dot product with the query vector. This is equivalent to finding the top-10 vectors that minimise euclidean distance with the query vector, assuming all vectors are unit vectors. Common algos - Graph-based - HNSW, Pinecone's proprietary (?) algos - Geometry-based - locality-sensitive hashing, k-means clustering, product quantization (subvectors) - Brute-force Dataset sizes - Rule of thumb: 1000 chars (1 KB) -> 1536-dimensional float32 vector (6 KB) - 1 TB plaintext => 6 TB embeddings, 1 PB plaintext => 6 PB embeddings Brute-force algo - CommonCrawl is ~1 PB plaintext. A trillion vectors (10^12). ~6 PB embeddings - Brute force can be done either fully in RAM or by loading disk to RAM at query time. This is affordable for 10^6 vectors but not for 10^12 vectors. - If disk throughputs become fast enough, one could do brute-force on a large array of disks in parallel. Join 10,000x 8 GB/s 1 TB disks in parallel, and you can search a batch of queries in ~15 minutes.
This cluster costs atleast $1M in hardware alone. - Brute-force algo on disk will be possible within a decade or two. But it is expensive today. State-of-the-art algos are captured by the ANN-Benchmarks and Big-ANN-Benchmarks projects. The main conclusion from them for me is that graph-based methods beat geometry-based methods. - My guess is this is because of the curse of dimensionality. You don't lose a lot of search accuracy if you quantise vectors to 4-bit (or maybe even 2-bit or 1-bit). So you are basically searching for bitstrings that have low hamming distance from each other. Using intuitions from 2D and 3D geometry no longer makes sense to me. [Pinecone's articles on FAISS](https://www.pinecone.io/learn/series/faiss/composite-indexes/) are the best resource on the internet on this that I've found. Buckets - As long as you can put the trillion vectors into buckets of a million vectors each and quickly identify which buckets have the query's matches, you can then just go and brute-force search those buckets. - Additional software complexity to save a few milliseconds of search time within the bucket is not necessarily worth it, depending on the application. - Indexes must not occupy too much memory. For instance in LSH, each vector is stored along with a (say) 24-bit signature for 24 hyperplanes. 3 bytes / vector. In HNSW, each vector is stored along with index positions of atleast 10 neighbours. >100 bytes / vector. - If you do the bucket approach, you need to store the index positions of all vectors in a shuffled order. For 10^12 vectors, that's 5 bytes / vector. Or you can avoid numbering the vectors at all and just store the raw vectors in the shuffled order. This way you require zero extra space. I've yet to find a state-of-the-art algo for this; I've just shared some intuitions of mine here.
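The bucket idea above can be sketched in a few lines. This is a hypothetical toy, not a state-of-the-art algo: vectors are assigned to buckets by LSH-style hyperplane sign bits (real systems would use many random hyperplanes, not the two fixed ones below), and the query's bucket is then brute-force scored by dot product.

```python
# Minimal sketch of bucketed brute-force embedding search: route vectors to
# buckets via LSH-style hyperplane sign bits, then exhaustively score only
# the query's bucket. Hypothetical data and a 2-plane signature for clarity.

PLANES = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # random hyperplanes in practice

def signature(vec):
    """2-bit LSH signature: sign of the dot product with each hyperplane."""
    sig = 0
    for plane in PLANES:
        dot = sum(p * x for p, x in zip(plane, vec))
        sig = (sig << 1) | (1 if dot >= 0 else 0)
    return sig

def build_buckets(vectors):
    """Group vector indices by signature."""
    buckets = {}
    for i, v in enumerate(vectors):
        buckets.setdefault(signature(v), []).append(i)
    return buckets

def search(vectors, buckets, query, k=2):
    """Brute-force dot products, but only within the query's bucket."""
    candidates = buckets.get(signature(query), [])
    scored = sorted(candidates,
                    key=lambda i: -sum(q * x for q, x in zip(query, vectors[i])))
    return scored[:k]

vectors = [[0.9, 0.1, 0.1], [0.8, 0.3, 0.2], [-0.9, 0.2, 0.1], [0.1, -0.9, 0.3]]
buckets = build_buckets(vectors)
print(search(vectors, buckets, [1.0, 0.2, 0.0]))  # → [0, 1]
```

The index here stores only a 2-bit signature per vector, consistent with the point above that bucket-style indexes can be far smaller than graph-based ones.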
# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/unimportant/software_and_ai/open_source_search/death_of_big_tech.md 2025-03-10 # Death of Big Tech Two major reasons why big tech companies exist: - Software is complex. Hiring lots of good software developers is expensive. - Hardware and electricity are expensive. Hardware includes everything from fiber-optic cables to satellites to cameras to server CPU and RAM. Credits: Moxie Marlinspike and others for really convincing me why software complexity is the heart of the matter here. Because hardware costs reduce exponentially every decade, I expect most applications to be possible on a small amount of hardware within the next 10-20 years. Applications that might still not be cheap include a) applications based on videos b) applications based on heavy usage of LLM inference c) applications discovered in the future, such as applications based on 3D data. Because of recent advances in LLM embedding search, I expect a few key applications such as search to become easy to build. A major part of what a lot of tech companies do is just a) provide incentives for people to share their data online, and b) use this data to connect one set of users to another set of users. Uber connects drivers to passengers, LinkedIn connects employees to employers, Facebook connects friends, Google search connects people to articles and news etc. A lot of these applications can simultaneously be outcompeted by an application that uses LLM search to connect users. It should be possible to build this 10-20 years from now, such that any person on their personal computer can host a server that does what Big Tech currently does. This gives me (and others) 10-20 years to figure out what a good replacement for Big Tech looks like. Big Tech may still survive of course, but their form will look noticeably different.
# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/unimportant/software_and_ai/open_source_search/ai_enabled_cloud_gaming.md 2025-01-18 # AI-enabled Cloud gaming AI-enabled cloud gaming seems like one of the hardest applications to do on cloud rather than locally. However I expect it'll get done in 10 years. If you're a game developer you might want to work on this. Why? - Latency limits of the human body - Video output - Most people can't distinguish individual frames in video above 90 frames per second (~10 ms / frame) - Audio output - Some audio engineers on reddit find 10 ms latency when playing music digitally to be noticeable but acceptable. - Keyboard + mouse input - Human motor reaction times are generally estimated above 100 ms. The upper bound on nerve conduction velocity is around 120 m/s, so covering 1 metre of nerve from hand to brain requires >8 ms. Anticipating inputs and reacting to them can lower response time (often happens in games). - End-to-end - Many cloud gamers have reported on reddit that <10 ms latency is where FPS and other action-heavy games feel as fast as playing them offline. - Internet bandwidth limits - Streaming 24x7 video requires a lot more bandwidth than text/image/audio - 1 Gbps fiber connection (with no upload/download cap) is becoming increasingly popular in the US, which is more than sufficient to stream UHD 90 fps video. - Streaming 3D content directly is not possible though. VR headset-based use cases might (?) still prefer streaming 3D content over the rendered 2D output; I haven't studied VR well enough. - Latency limits of computers - Input/output device latency - 1 ms latency (1000 Hz) has been achieved on keyboards and mice, and gamers generally feel further reductions won't be detectable. - Game engine, 3D rendering latency - I don't know much about this, but seems doable for most games today in under 1 ms?
It depends a lot on the exact application though; there are definitely lots of 3D apps that can't be built under a 10 ms latency constraint. - Network roundtrip latency - <10 ms has already been achieved on consumer fiber connections in many US cities; there's no fundamental reason paying customers can't get this in cities across the world. Light travelling from Netherlands to California (9000 km) one-way takes ~30 ms in vacuum (longer in fibre); in practice roundtrip latency is reported around 100 ms (50 ms one-way). As long as the closest datacentre is within 1000 km, there's no physical limitation on achieving <10 ms via fibre connection. - AI inference time - As per Jacob Steinhardt, [forward passes can be significantly parallelised](https://bounded-regret.ghost.io/how-fast-can-we-perform-a-forward-pass/). 1 frame of video can be generated in under 1 ms, assuming you build an ASIC just for that specific model. - AI inference cost - This is the biggest bottleneck. Diffusion models use maybe 0.1-1.0 PFLOP for 100 sampling steps, for one frame. At 90 fps that's 10-100 PFLOP per second of video generation. For 1 second per second output, you need a GPU cluster with 10-100 PFLOP/s. H200 is 4 PFLOP/s fp8 rentable at $2/hour. Assuming Epoch AI scaling laws of FLOP/s/dollar doubling every 2.5 years, we should get 16x more FLOP/s/dollar in 10 years, so ~64 PFLOP/s rentable at $2/hour. Effects on IT ecosystem - If this application can be done on cloud, then almost any application can be done on cloud - Cybersecurity and user control will be the only reasons to do things locally; performance will no longer be a reason. Financial incentives to build anything in favour of security or user control are a lot weaker than the incentives in favour of higher performance. Big Tech will no longer need to fund open-source software for performance-based reasons, hence open-source software could lag behind. - Client device could also change.
The end state of this vision is that 99.9% of people own machines with a touchscreen (keyboard+monitor) and network card (but no CPU, no disk, no RAM), and it is not practical to do anything unless you submit the job to a server. (This will probably be a Big Tech server, unless small cloud providers are able to compete on getting low-latency connections with ISPs, who have inherent network effects.) See the example of mobile being more locked down than desktop, but having more users. It is possible to live without a phone, but you lose access to jobs, friendships, etc. and are at a disadvantage relative to everyone else. - This incentive structure makes it technically less challenging for the NSA (or its equivalent in your country) to get 99% surveillance over people's thoughts. As of today they need to backdoor lots of devices, routers and cables, and send whatever is useful back to their servers. This might be possible technically but requires more developer time and coordination/coercion of intermediaries to pull off. - Incentives push in the direction of them using this data for political purposes and also leaking the data itself. # File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/unimportant/software_and_ai/open_source_search/all_data_in_ram.md 2025-05-27 # All data in RAM I have found it increasingly obvious that there is no technical barrier to all the world's data ending up stored in RAM in the next 10-20 years. Disks could become obsolete for most use cases. I figured I should write about it publicly, so it's obvious to others too. (All data in this article is being measured in petabytes for ease, 1 PB = 1024 TB. Your consumer laptop probably has 0.25-1.0 TB storage in it, so a petabyte is a thousand times that.) Hobby project that can be done today: - Purchase a few LTO-9 45 TB tapes from the second-hand market and store CommonCrawl on them. You are now carrying the entire internet (plaintext) in a backpack.
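The inference-cost arithmetic in the cloud gaming post above can be sanity-checked in a few lines. All inputs are the post's own rough estimates (diffusion PFLOP per frame, H200 throughput and rental price, Epoch AI doubling time), not measured values:

```python
# Sanity-check of the rough inference-cost arithmetic in the post.
# All inputs are the post's own estimates, not measured values.
pflop_per_frame = 1.0                 # upper estimate: 100 diffusion sampling steps
fps = 90                              # target frame rate
pflops_needed = pflop_per_frame * fps # PFLOP/s to generate 1 s of video per second
print(pflops_needed)                  # 90.0, i.e. the top of the "10-100 PFLOP/s" range

h200_pflops = 4                       # fp8, rentable at ~$2/hour
doubling_years = 2.5                  # Epoch AI FLOP/s/dollar doubling time
growth_10y = 2 ** (10 / doubling_years)
print(growth_10y)                     # 16.0
print(h200_pflops * growth_10y)       # 64.0 PFLOP/s per $2/hour in 10 years
```

Note that a strict 2.5-year doubling gives ~64 PFLOP/s at $2/hour after 10 years, the same order of magnitude as a round 100, and within the 10-100 PFLOP/s range the post says is needed.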
- Write your own crawler to also crawl the websites that CommonCrawl does not. ## Latency As of today, RAM latency < Network latency < Disk latency See [Latency numbers every developer should know](https://gist.github.com/jboner/2841832) Latency is the primary reason why developers will prefer storing this data in RAM rather than on disk or tape. You can also read my other post on AI cloud gaming for an extended discussion on latency. The short version is that it will likely soon be possible to stream your entire computer experience from a datacenter at <10 ms latency. Since the human body operates at latencies higher than 10 ms, it will be indistinguishable to you from something running locally on your machine. ## Data size All data formats are ultimately some mix of text, image and video. For now if we consider just text: - CommonCrawl is an open-source dump of a good fraction of the entire public internet. Its .WET plaintext files total around **0.6 PB**. - Assume hypothetically we could capture every word spoken by every person on earth. A person speaks \~10k words per day. - Shannon's estimate is that English communicates \~1 bit per char, at \~5 chars per word, so a person produces \~6 KB/day. - Assuming we are storing the data for the entire Earth's population for the past 100 years. Total data = 8B persons * \~6 KB/day/person * 365 days/year * 100 years = \~1700 PB. - My blind guess is we can get it down by atleast 1-2 more orders of magnitude with more advanced compression techniques. Most times when people say something they're not the first person in history to be saying it. Assuming this we may only need to store 10-100 PB. Let's assume we need to store **100 PB**. - LLMs are probably a good example of this, Llama 3 70B weights (\~140 GB in fp16) fit in less than 1 TB (0.0001 PB), yet Llama 3 70B can say things similar enough to what most humans say most of the time.
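The spoken-word storage estimate above can be reproduced directly. The inputs are the post's own assumptions (10k words/day, \~1 bit per char, 5 chars per word, 8B people, 100 years):

```python
# Reproducing the spoken-word storage estimate above.
# All assumptions are the post's own, not measured values.
words_per_day = 10_000
bits_per_char, chars_per_word = 1, 5
bytes_per_day = words_per_day * chars_per_word * bits_per_char / 8
print(bytes_per_day)        # 6250.0 bytes, i.e. ~6 KB/day/person

people = 8e9
years = 100
total_bytes = people * bytes_per_day * 365 * years
print(total_bytes / 1e15)   # ~1825 PB, same ballpark as the ~1700 PB figure
```

(The exact total depends on whether you use decimal or binary petabytes; either way it lands around 1600-1800 PB.)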
- There's a lot of metadata and intermediate data generated by processes and input devices; however I would be surprised if, per person, they generate an order of magnitude more data than the \~6 KB/day the person anyway generates. - All of Earth's video and image data is unlikely to fit in a similar size range. However if we use AI to generate text descriptions for all the data, this again fits in a comparable size. - Assume we downsampled the video to 1 frame per second and generated 100 bytes of description per frame. Each description only needs to be a diff compared to the previous description. And we are likely storing data in a compressed format, not English. 100 bytes is 800 binary flags recording what has differed from the frame 1 second before. - At 100 bytes/second we generate \~8500 KB/day/person. This is 3 OOMs more than the previous data, so if we could hypothetically store this for the entire population for 100 years we get **2,000,000 PB**. This is an upper bound on all the data we would ever really need to store for most tasks. - For completeness, let's also consider the worst case of storing all video data without much compression. - Assume 1 hour of 4K 96 fps video takes 30 GB to store. Human perception can't differentiate video at much higher color quality or frame rate than this. - Assume we want to store 100 years of content for 8 billion people. - This is **210 billion PB** - It is possible that in the future we invent new types of tasks that require even more storage. For instance if we scanned humans in 3D or scanned lots of biomolecular data from them. Or if we used AI that generated a huge amount of intermediate data (chains of thought ??). As of today it's not easy for me to predict this; if someone has spent more time trying to predict this I'd love to hear from you. ## RAM Cost As of 2023, RAM costs $1M/PB as per Our World in Data.
See [Our World in Data's historical data](https://ourworldindata.org/grapher/historical-cost-of-computer-memory-and-storage) on RAM costs. (If you want off-the-shelf cost for consumers, it's noticeably higher. Hetzner EX44 cloud machines cost $44/mo for 64 GB RAM, or $8.5M/PB/year. This also includes the cost of network, disk, support staff, etc., not just Hetzner's profit margin.) It seems reasonable that RAM could cost below $100k/PB by 2030. Assuming RAM lasts for 5 years before it gives up, this is $20k/PB/year. At hypothetical $20k/PB/year it costs: - $12k/year to store every word on the public internet today (0.6 PB) - affordable for a rich donor or a group of friends - $2M/year to store every word ever spoken (100 PB) - affordable for a medium-sized tech startup with Series A funding. - $40B/year to store descriptions of video of every person ever (\~2,000,000 PB) - affordable for a Big Tech corporation, assuming they can capture additional revenue of $5/person/year from 8B people to justify it. - $4000T/year to store all video of every person ever (\~200,000,000,000 PB) - not possible **It is possible we don't end up acquiring this much data by 2030, but if so the reasons will be culture and incentives, not technical reasons.** Also please remember these are only 2030 numbers. If Moore's law does not stop and research into reducing RAM cost does not stop, these numbers could go down by a few more OOMs by 2040 or 2050. ## Disk cost, disk weight I figured I'd include this section just for completeness. As of 2025, AWS S3 Glacier Deep Archive offers archival (likely tape-based) storage for $12k/PB/year. Assume this cost reduces to $1000/PB/year by 2030.
At hypothetical $1000/PB/year it costs: - $600/year to store every word on the public internet as of 2025 - trivially affordable - $100k/year to store every word ever spoken (100 PB) - affordable for a group of family/friends to pool, or for a rich donor - $2B/year to store descriptions of every video ever (\~2,000,000 PB) - affordable for a Big Tech corporation even if they generate no additional revenue by doing so. - $200T/year to store all video of every person ever (\~200,000,000,000 PB) - not possible In terms of weight, as of 2025, LTO-9 stores 45 TB at a weight of 200 grams, or \~4.4 kg/PB. Assume this too reduces 10x, so it will weigh \~0.44 kg/PB by 2030. - ~0.25 kg weight for every word on the public internet as of 2025 (0.6 PB) - fits in your pocket - 44 kg weight for every word ever spoken on Earth (100 PB) - fits in a school bag - 880 metric tonnes for (descriptions of) every video ever (\~2,000,000 PB) - a typical freight train carries 5000 metric tonnes, a typical aircraft carries 100 metric tonnes. ## Data transport, security implications I'm mainly studying this to understand the feasibility of data being stolen or traded by companies and governments in the future. Legal and illegal copies - If you can afford to buy N disks or tapes, you can probably also afford to hide N disks or tapes. All such data will be very easy to transport. - It will be very easy to make multiple copies of this data and very easy to destroy individual copies. - Any govt or large company with govt support can smuggle an aircraft's worth of freight for example. I use the word "smuggling" because some or all of the actors involved may not be aware this is happening. The extreme end of this is theft, where none of the actors involved are aware this is happening. Datacenters - Datacenters today as of 2025 - Every word ever spoken (100 PB) can be stored in a secret datacenter or cold storage whose location remains unknown.
- As per [this xkcd](https://what-if.xkcd.com/63/), the Google datacenter and NSA datacenter are the largest, at roughly the order of magnitude of 10,000 PB. - GPS locations of these datacenters are public information, along with estimates of their power and water consumption. - These datacenters can easily store every word ever spoken (100 PB), and store some small fraction of all video of every person ever. - The data in such a datacentre can be loaded on LTO-9 tapes and transported in a single commercial aircraft. - Datacenters in the future - Descriptions of every video ever (\~2,000,000 PB) will probably get stored in datacenters whose locations are known. Maybe you can split this into multiple small datacenters and hide it; I'm unsure how many years it will take before this is technically feasible. - Descriptions of every video ever will be transportable in a single freight train or a fleet of aircraft. - All video of every person ever will likely not be possible to store or transport anytime before atleast 2050, unless there's a fundamental breakthrough in computing hardware. # File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/unimportant/software_and_ai/samuel_saksham_ai_timelines_20250509.md 2025-05-12 # Samuel x Saksham AI timelines (discussion on 2025-05-09) - top-level views - samuel top-level: 25% AI!2030 >= ASI, >50% ASI >> AI!2030 >> AI!2025, <25% AI!2030 ~= AI!2025 - saksham top-level: medium probability AI!2030 >= ASI - samuel bullish on model scaling, more uncertain on RL scaling - saksham bullish on RL/inference scaling, saksham bullish on grokking - samuel: does bullish on grokking mean bullish on model scaling. saksham: unsure - agreements - samuel and saksham agree: only 2024-2025 counts as empirical data to extrapolate RL/inference scaling trend. (o1, o3, deepseek r1, deepseek r0). RLHF done on GPT3.5 not a valid datapoint on this trend.
- saksham and samuel agree: if superhuman mathematician and physicist are built, high likelihood we get ASI (so robotics and other tasks also get solved). robotics progress is not a crux. - crux: how good is scaling RL for LLM? - saksham is more certain in being bullish on scaling RL for LLM, samuel has wider uncertainty on it. - testable hypothesis: saksham claims GPT3 + lots of RL in 2025 ~= GPT4. saksham claims GPT2-size model trained in 2025 + high quality data + lots of RL in 2025 ~= GPT3. samuel disagrees. need top ML labs to try this stuff more. - testable hypothesis: saksham claims models such as qwen 2.5 coder are <50B params but better than GPT3 175B and almost as good as GPT4 1.4T. samuel disagrees and claims overfit to benchmark. samuel needs to try <50B param models on tests not in benchmarks. - testable hypothesis: samuel thinks a small model being trained on a big model leads it to overfit the benchmark. saksham unsure. samuel and saksham need to try such models on tests not in benchmarks. # File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/unimportant/politics/distributed_disincentives_for_common_knowledge_formation.md 2025-04-27 # Distributed disincentives for common knowledge formation Disclaimer - Quick note Credits: Ben Ross Hoffman's blog, for helping me think through some of this. Animal Farm by George Orwell talks about how authoritarian elites use violence to suppress the formation of common knowledge of the truth. Once the truth is suppressed, they can then invent an alternate history and present such that, if people believed in it, it would serve the dictator's self-interest. If you relax the constraint from actual violence to any sort of disincentives, then I make a much more general claim. Almost everyone is applying disincentives to suppress the formation of some common knowledge or the other.
For every Alice, there exists a Bob such that the truth about Bob's internal experience would make Alice uncomfortable, and Alice would confer less social approval, time, attention and capital to Carol if Carol tried to make Bob's truth common knowledge. If you do not have a monopoly on violence, however, your disincentives are less coordinated. There will often be a few outlier individuals for whom bringing that truth into common knowledge is sufficiently important that they will put in the work to acquire enough power that they can resist these disincentives. Violent mobs, such as those in Nazi Germany, are an example of a coordinated group of people, not directly hired by the dictator, who nevertheless acted to suppress the formation of common knowledge. A lot of political groups are united around the shared agreement that they must suppress the formation of some common knowledge or the other. A lot of political technologies, such as freedom of speech and the right to bear arms, make more sense when you are studying this sort of distributed suppression of common knowledge. Obviously a dictator will not support freedom of speech or the right to bear arms. But even among countries that are not dictatorships, some countries' citizens are suppressing a lot more knowledge about the truth of each other's experience than others. If you want to increase long-term human flourishing more broadly, sometimes the only effective move is to try and topple the dictator. You have to get information about the truth of what happened politically, and form common knowledge on it against the dictator's interests. Once you have succeeded at this though, the next effective move may be to find technologies and levers that allow common knowledge formation of various other types.
# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/unimportant/politics/reply_vitalik_dacc_reply.md 2025-03-07 # Reply to Vitalik on d/acc Vitalik recently wrote an article on his [ideology of d/acc](https://vitalik.eth.limo/general/2025/01/05/dacc2.html). This is impressively similar to my thinking so I figured it deserved a reply. (Not claiming my thinking is completely original btw, it has plenty of influences including Vitalik himself.) Disclaimer - This is a quickly written note. I might change my mind on this stuff tomorrow for all I know. Two axes he identifies for differentially accelerating tech are: - big group versus small group - prioritise accelerating tech that can be deployed by a small group rather than by a big group - offense versus defense - prioritise accelerating tech that can be deployed for defence rather than offense I think I generally get where this is coming from and find these important ideas. Some confusions from my side: - Self-replication - I am generally in favour of building self-sustaining social systems over not. Success of d/acc ultimately relies on followers of Vitalik's d/acc a) building only tech that satisfies d/acc criteria and b) providing social approval to people who build tech as per d/acc criteria. For this system to be self-sustaining, point b) may need to be passed into the future long after all of d/acc's current followers (vitalik included) are dead. Self-replicating culture is possible to build but extremely difficult. Religions are among the oldest self-replicating cultures. Ideas such as markets and democracy have also successfully self-replicated for multiple centuries now. I'm unsure if this idea of d/acc being present in culture is alone sufficient to ensure people in the year 2200 are still only building tech that satisfies d/acc criteria - Often, culture is shaped by incentives IMO. If people of the future face incentives that make it difficult to follow d/acc, they might abandon it.
It is hard for me to explain this idea in short, but it is something I consider very important. I would rather leave future generations with incentives to do a Thing than just culture telling them to do a Thing. - Terminal values - To me the terminal value of all these galaxy-brain plans is likely preserving and growing timeless stuff like truth and empathy. - Defensive tech provides truth a good defence as information is easy to replicate but hard to destroy. As long as multiple hostile civilisations (or individuals) can coexist, it is likely atleast one of them will preserve the truth for future generations. - However, it is harder for me to see how any of these plans connect to empathy. Sure, totalitarianism and extinction can be bad for promoting empathy, but I think it requires more work than just preventing those outcomes. Increasing resource abundance and solving physical security seem useful here. Building defensive tech can increase physical security. In general, my thinking on which tech increases versus decreases human empathy is still quite confused. - Takeoff may favour offence - Intelligence-enhancing technologies such as superintelligent AI, genetic engineering of humans to increase IQ, human brain connectome-mapping for whole brain emulation, etc. are so radically accelerating that I'm unsure if an offence-defence balance will get maintained throughout the takeoff. A small differential in intelligence leads to a very large differential in offensive power; it is possible offense just wins at some point while the takeoff is occurring - Entropy may favour offence - Historically, it has always been easier to blow up a region of space than to keep it in an ordered state and defend it against being blown up.
Defence has typically been achieved and continues to be achieved in game-theoretic ways, "if you blow up my territory I blow up yours", rather than in actual physical ways, "I can defend against your attack, also my defence costs less than your offense". This seems somewhat inherent to physics itself, rather than specific to the branches of the tech tree humans have gone down as of 2025. Consider this across times and scales, from the very small and ancient (gunpowder beats metal locks) to the very big and futuristic (a bomb that can blow up the observable universe may have no defence). - Maybe big group is inherently favoured - What a big group can build is a strict superset of what a small group can build. Ensuring that all the frontier tech can necessarily be built by small groups is hard. A lot of tech in today's world follows centralised production, decentralised consumption. For example solar panels can only be manufactured by a large group, but they can be traded and used by a small group. Even if all this tech is open source such that another large group could, in theory, build a copy of it, in practice this often doesn't happen (why?). Supply chains stack on top of each other, so a few such leads stacked on top of each other means no other big group in the world can easily independently produce the product if they have to build it from scratch (by scratch, I really do mean digging their own water and iron ore and crude oil out of the ground and building an independent supply chain). Assuming multiple big groups can't build independent supply chains, one big group can use measures such as export controls to prevent other big groups from building a specific tech.
- The internet has made it harder for any big group to maintain a monopoly on the knowledge required to produce a given tech, but there is a significant lead time that can still be obtained by being first to deploy a tech (or tech stack or supply chain) irl or at scale, which can then be leveraged for some objectives. - In general considering all these nuances seems worth doing to me. # File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/unimportant/politics/us_geopolitics_long_term.md 2025-05-27 # US geopolitics long-term Disclaimer - Quick note - This document includes blind guesses. Don't take this document too seriously unless there's enough evidence for it. Summary - Nuclear security is obviously a major predictor of geopolitics. Since Truman did not nuke the USSR and establish a US nuclear monopoly after world war 2, the world has now been carved into 7-9 nuclear empires. Countries without nukes are client states for one or more empires. - US intelligence community and presidents may have succession lineages that are majority christian/jewish, and this might be the single biggest predictor of geopolitics. (Blind guess by me.) - Some factors I think are less predictive of geopolitics than this factor include which country has crude oil and what the ideology of US citizens is. I'm very interested in understanding US geopolitics from the end of world war 2 till today, as it seems influential in most of the geopolitics up to 2025. Countries that have allied with the US govt have done well economically on average and countries that haven't have mostly not. (Chinese govt and its allied govts are a recent counterexample.) During the first 2 years of the cold war, a long list of people advocated pre-emptively nuking the USSR and establishing a US nuclear monopoly and world government. I still haven't fully understood why this plan was not executed.
Were the decision-makers just too slow to reach consensus, did Truman single-handedly veto everyone else for some personal reason, or what?

In the short term (1-10 years), geopolitics seems predictable based on material interests: which war is likely to break out, which resources are urgently required. I want to understand how to predict geopolitics at longer timescales (>50 years). At longer timescales there are a few competing factors worth paying attention to:
- personal worldviews of leaders
- nuclear security
- coal and petroleum reserves
- ideological allies

Nuclear security obviously played a role.
- The world got carved into a small number of large blocs instead of a large number of small blocs, because most world leaders collectively agreed that the fewer actors with nuclear weapons, the better.
  - They could use the threat of nuclear war to prevent other countries from building nukes even if those countries had the technical know-how. A few countries like India and Pakistan managed to bypass this and build nukes anyway, but most countries in the world failed at it.
- This still doesn't explain who the blocs will be or what their decision-making will be, beyond predicting that each bloc's leaders will take extreme steps to prevent the creation of new blocs.
- (This is all obvious to me, and I'm assuming to most people with a background in geopolitics. But I often end up explaining it to people without one.)
- Today's world has only 9 independent empires; every other country is a client state of one or more empires. All empires extract protection money from their client states.
  - US, UK, France, China, Russia, India, Pakistan, Israel, North Korea
  - Due to declining economies, France often defers to the US and UK on most military decisions, and Pakistan often defers to the US and China on most military decisions. Neither has a good negotiating position. This arguably reduces the number to 7.

Coal and petroleum reserves I'm guessing were not that important.
- Global energy prices have stagnated at around $0.10/kWh, with continued dependence on coal and petroleum. Industry today is still mostly dependent on resources like electricity, natural gas, water, wood, steel and so on, which are ultimately dependent on coal and petroleum. Housing and food transport are dependent on the steel and cement industries.
- Since fossil fuels are in fixed supply, the primary way to grow your country's wealth is to somehow extract fossil fuels from other countries. Zero-sum game.
- Our World in Data has good data on per capita energy consumption by country. The US has clearly managed to get good trade deals from most countries with large reserves of coal and petroleum.
- However, my guess is this is not the fundamental thing that enables you to predict geopolitics. I'm guessing that leaders of the US govt are willing to take deals that reduce the amount of crude oil and coal available to their citizens, as long as the deals benefit the leaders personally. Studying the divergence of economic interests here seems important.

Ideology of citizens seems less important.
- It is not clear to me if there is any ideology uniting US citizens.
- US public
  - The US constitution is definitely unique in the world and has broad support across the US public. The US public supports democracy more broadly, and is against authoritarianism.
  - However, my guess is that christianity versus atheism is as big a dividing factor in the US as the constitution is a uniting factor, at least among the general public.
  - I don't think democracy versus authoritarianism is the primary predictor of US geopolitics, as the US govt has a long history of interfering in democratic elections outside the US to prevent anti-US leaders from being elected.
  - The US public does not have broad support in favour of capitalism or against communism. I don't think the US public's economic ideology is a major predictor of geopolitics.
My guess is geopolitics was driven by some other factor, and communism versus capitalism was used as an excuse to rally additional public support.
- The US public is not very nationalist. The US public does not have majority support for extracting resources from other countries at threat of war, with the justification that US lives are worth more.

Personal worldviews of leaders.
- I'm increasingly biased towards personal worldviews of leaders being the biggest predictor of US geopolitics.
- At >50 year timescales, no individual leader lasts. For >50 year predictable changes, you either need a lineage of successors or a uniting ideology across leaders.
  - AFAIK >50-year lineages don't really exist in either the Republican or Democrat party. (I haven't read much about the history of either party, so I could be wrong.) There may however be a >50-year succession lineage in the intelligence community.
- My blind guess is that a majority of US presidents were christian. Roosevelt, Truman and Bush were likely christian. Allen Dulles, Henry Kissinger, and more recently Michael Hayden were likely christian/jewish. And this was the dominating factor in which countries they chose to ally with versus not.
  - US allies with significant christians: Europe except eastern Europe, Australia, South Korea, Canada, Israel (jewish not christian), Brazil, Argentina?, Russia post-WW2
  - US allies with significant non-christians: Saudi Arabia, Japan
  - US non-allies with significant non-christians: Mexico, India, China, Russia today, most of the middle East, most of west Asia
  - To study separately: sub-Saharan Africa, southeast Asia - afaik these countries tend to trade/borrow from both blocs without permanent alliances. I haven't read much.
- My guess is a lot of other euphemisms like authoritarian or communist or immigrant or whatever ultimately get translated to atheist, muslim and hindu in the minds of US leaders.
- Maybe race was another factor, but I'm unsure and lack evidence.
- If my hypothesis is true
  - If there were a succession of atheist presidents and intelligence community leaders in the US, it could have significant implications for the future.
  - China is the only country with nuclear weapons whose leadership of the executive branch and intelligence community is majority atheist. This could have significant implications for the future.

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/unimportant/politics/elite_class_social_norms.md

2025-01-14

# Elite social norms

I was raised in upper-middle class India, which means I have internalised many of the social norms of the middle (knowledge) class and am blind to many of the social norms of the upper (elite) and lower (labour) classes. I am very sympathetic to perspectives such as [this analysis by siderea](https://siderea.dreamwidth.org/1237182.html) on why social class is more important than economic class, and this helps explain why I am blind. This entire post is therefore a bunch of guesswork from someone who does not have enough first-hand data. Everything here is to be read with the understanding that it could be completely incorrect with significant probability. If you are an elite, please correct the incorrect parts. It'll mean a lot to me if you do.

#### Disclaimer

Publishing this article in public makes me nervous for a couple of reasons.
- I am still in a knowledge-gathering phase of my life. I want to spend more time interacting with people outside my class and learn more about what life is like for them. I'm worried I will get reduced opportunities to do that if people know I'm going to publish some of my learnings on my blog.
  - This is especially true if what I publish includes criticism, and isn't just descriptive of the facts.
  - This is especially true if what I publish includes incorrect guesses of what it's like in other classes. People could think I'm naive and overconfident, and not be willing to correct me.
    (If you think I'm naive and overconfident, please correct me; I will appreciate you more for this than if you keep quiet. If I don't accept your criticism that's okay, I'll still appreciate the attempt.)
- In general, if I have published anything that you don't like, please let me know. I'll usually be willing to take it down or at least rewrite it in a way that no longer feels like it is criticising you specifically.
  - (There may be exceptions. Maybe you've done something especially immoral from my perspective, and I want to call you out publicly and pick a fight with you. But this is rare, I don't usually start conflicts with others just because their values are different from mine. If you are afraid that something you say to me will end up on this blog against your wishes, please ask me first.)
- If I had a magic wish, I would love to be part of the elite class instead of my current socioeconomic class. I think there's a lot of good that can be done in the world only by elites (or by people of my class who join the elite class).
  - However, some of my current behaviours and actions in the world already indicate my membership of the knowledge class, and my potential lack of eligibility for other classes.
  - For instance, my ideal attitude to criticism is a lot closer to that of people of my class than the elite; whenever I hold back on offering criticism, I feel like I am betraying my inner ideals for instrumental gain. In practice, yes, I often hold back on offering criticism for various reasons, but I hate the fact that I have to do this. (Yes, all 3 classes withhold criticism, but the elite class does this the most IMO, at least for public criticism if not private.)
  - If I choose to give up on my ambition of becoming an elite, I could still have significant influence, for example by influencing and advising existing elites. But this influence will still be reduced (by how much??) if I refuse to adopt any of the social norms of the elite class myself.
**I still remain unsure whether this article should stay published. If you think I would have a better life if I took this article down, please let me know.**

## Elite social norms that differ from knowledge class' social norms

A bunch of these social norms seem explainable just through incentives. I probably don't actually have to spend 10 years living as an elite myself to figure them out. If I had to summarise this entire post in two lines, it would probably be the following:
- Elite social circles are a small world; there's a small number of people like you. Your entire life - personal and professional - depends on allying with some subset of this set of people.
- In the labour class, others can destroy you, so you must be careful how you behave. In the elite class, you can destroy others, so you must be careful how you behave. It's only the knowledge class that lives in this fictitious zone of "equality".

Now that the summary is over, I'll talk about details.

- My guess is elites can get away with a lot more weird or even immoral behaviour in private, compared to people of other classes.
  - My guess is elites on average have a stronger demarcation between their private and public life than people of other classes. (Although yes, of course, all 3 classes maintain this demarcation.)
  - Elites can get away with a lot more in private simply because of the power they wield. If an elite ever gets into a conflict with someone of the knowledge class, they are likely to "win" in many ways.
  - This includes self-expression that would be punished in other classes. If an elite wants to architect their house in a 16th century style, or consume a range of non-FDA-approved supplements, or go live in a monastery for a year, or have multiple partners, it's easier for them to do that. Many people of other classes would love this level of freedom and self-expression, but get restricted by their immediate social circle.
  - [Trigger Warning: Rape] This also extends to immoral actions.
    For example, it's a lot easier for an elite to get away with rape than someone of other classes. I also think the number of men of other classes who would rape others if they could get away with it is surprisingly high (at least 5%, global average), but elites get to act on it instead of just thinking it in the privacy of their minds. There are 3000 billionaires today; I'm pretty sure at least 150 of them are rapists (I'm not including paid sex work here, and I'm not including edge cases like a 50-year-old sleeping with an 18-year-old here).
    - Disclaimer: Predictions shape reality. Please don't turn my prediction into reality. It would be nice if I was wrong, and this number was a lot less than 5%.
- My guess is elites are on average more lonely.
  - It is a small world at the top; there are fewer than 10,000 elites globally. For the knowledge class it's easier to do something weird and trust that you eventually find people who are not just tolerant but actively supportive of whatever weird thing you want to do. For example, if you want to suddenly convert from hinduism to christianity, you can probably find an existing community of hindus-converted-to-christianity of your class. If you are a Nepali immigrant to the US, you can probably find other Nepali immigrants to the US, of your class. As an elite it's a lot harder to find another elite who shares your unique set of life experiences. (Also, it takes a lot of time and effort to forge social bonds outside your class that are based on honest foundations; see my next point.)
  - As an elite you're basically 24x7 exposed to non-elites who are engaging in varying amounts of deception to get access to some of the capital and attention you wield. This is true both in professional life (other companies and non-profits who want to ally with you) and in personal life (people who want to befriend or date you). This also seems obvious based on incentives.
    If someone has $1M it might be worth spending 1 year trying to entrap them; if someone has $1B it might be worth spending your entire life trying to entrap them. Forging trustworthy bonds outside your social class takes a significant amount of time and effort; in fact, a significant part of designing your company or government or whatever is basically figuring out how to get people of other classes to do your work without you ever having to trust them.
  - As an elite, if you're ambitious and trying to get something done, you need to interact with a lot of people around whom you cannot let your guard down emotionally. You cannot choose to avoid interacting with people if you are ambitious and wish to climb further upward. (This is true for basically all 3 classes, but it is an added factor here. I'm not claiming it's worse or better in this class.)
  - There is also cultural spiralling here. If the other people you know are also more lonely and less willing to take bold risks, this will rub off on you.
- My guess is elites can get away with less weirdness on things that actually matter, in business and politics, than people of the knowledge class.
  - There's a lot of nuance here that's hard to summarise.
  - This is especially true if they wish to keep their elite social circles (including friends and family) or climb up within their class.
  - This is especially true if the elite wields or aspires to wield attention rather than capital. Elites often wield a lot of attention by crafting a public image that is entirely separate from their private image.
    - Getting a large number of people behind a cause may require you to support the average of their views, and the average of a large number of people's views is by definition conformist.
    - [Side note: I wish there was an honest way to do this.
      For example, I have massive respect for a politician who says "my private beliefs are X, however the voters have voted for Y, therefore if I were elected I'm going to do Y", and I have a lot less respect for someone who just pretends their private beliefs are Y. I understand that the former has disadvantages. For instance, your voters may be too unintelligent to understand the nuance. Also, your message will get mutated as it spreads across place and time, and competes with other ideas, so you may prefer communicating a simpler message over a more complicated one. If someone figures this stuff out, I'd be grateful.]
  - Elites can get away with less criticism of other elites. People of the knowledge class can tolerate more criticism on average, both on the giving and receiving end.
    - Criticising another elite in public is essentially rallying a mob against them. An elite's words in public carry a lot more weight than those of someone of the knowledge class. (Arguably, being able to rally people using words is basically the most important role an elite plays, in a sense.)
    - An elite's words also end up setting up status hierarchies across all the organisations they control. If you as an elite criticise only one particular type of behaviour by someone, people across all your organisations will go out of their way to avoid that particular type of behaviour.
    - The elite who receives the criticism can also do a lot more damage in response. See again: "it's a small world". The elite on the receiving end can influence their network of other elites to demand conformism or else cut off their ties with you. The extreme end of this phenomenon is countries going to war over personal rivalries involving trivial matters.
    - Even when elites do use their media houses to push criticism against other elites, it's done via abstractions.
      It's "The New Yorker published X", not "the Newhouse family who owns Conde Nast allowed X to be published"; it's "the NSA helped kill X", not "Michael Hayden the NSA director helped kill X"; it's "climate non-profit got shut down because X", not "Mohammed bin Salman the prince of Saudi Arabia shuttered the climate non-profit". Most of us don't even know the names of the people who own the media houses and companies and nonprofits that we interact with on a day-to-day basis; this alone is proof that abstractions work. (I'm not claiming anything about these specific examples, they're just examples.)
    - [Side note: My guess is a lot of criticism being extinguished top-down on the internet is downstream of this phenomenon. Elites own the social media and knowledge workers publish on them.]
  - Knowing when to do things the way they are usually done, and when to invest in your personal beliefs about better ways of doing things, seems like one of the strongest markers of an elite's good judgment.
- My guess is elites look at morality differently than people of the knowledge class.
  - Elites are less likely to register the harm done by the organisations they wield as harm that they personally have done. See also: [Power buys you distance from the crime](https://www.lesswrong.com/posts/9fB4gvoooNYa4t56S/power-buys-you-distance-from-the-crime)
    - (Arguably knowledge workers do the same blame deflection, just in a different context. If a smartphone manufacturer uses slave labour to mine the rare earth metals for their smartphone, then the people purchasing the smartphone typically think they're blameless, with "if not us then someone else would do it" as a potential justification, and the CEO and investors of the company also typically think they're blameless, with the same justification.)
  - Elites are basically above the law; the only law that matters is the rules set by other elites.
    The only way people of the knowledge class get to enforce rules on the elites is to find another elite who is willing to do the enforcing.
  - Most large organisations end up doing some harm to someone.
    - Some of this is an accidental byproduct of governing a large organisation. Often organisations were designed by people other than you. Even if you as an elite have the ability to re-design them better, it's often not worth the time and effort to do this, if your goals in life are something else.
    - Some of this is deliberate. Governments often deliberately suppress their citizens using the military, for example. Companies often have armies of salesmen, lawyers and scientists aimed at addicting people to various products. These armies don't spontaneously spring into existence; there are elites who devote their entire life to trying to build them. A full analysis of deliberate harm done by elites and why this happens is out of the scope of this post.
  - Again, the "small world" phenomenon means that as an elite, you have less ability to cut people out of your network if they have done evil things. If you want to raise as much capital or attention as possible towards a cause, you have to be more willing to tolerate receiving it from people who have done evil things. How much you tolerate depends on your negotiating position, which could depend on how rich you are, for example. A knowledge worker's moral line might be "I will not accept money from anyone who is rude to all their friends"; an elite's moral line might be "I will not accept money from anyone who has committed genocide". An aspiring elite with $10M may be more tolerant of a billionaire rapist, while another billionaire may be less tolerant of the first billionaire, because the former is more dependent on the billionaire's support. (I don't know where these lines actually get drawn in practice, I'm just guessing they get drawn differently.)
  - Elites are definitely more sympathetic to utilitarian ethics - ensure the good done is greater than the harm done. Elites accept as a matter of course that they will end up doing a bunch of both good and harm in their life. Knowledge workers are more sympathetic to deontological ethics - do not harm; how much good you do is less important.
  - The net result of all these dynamics makes elite morality noticeably different from that of knowledge workers.
  - [Side note: I wish some elites could help explain their morality to the rest of us. I'm afraid many elites have given up on trying to explain anything to anyone, and this makes me sad. Many knowledge workers think that if an elite must have high moral standards, they must copy the values of the knowledge workers themselves. This is not possible due to all the incentives mentioned above. The net result of making such a demand is that you get politicians who pretend to "live like a common man" while thinking in ways very different from one. I wish some elites could explain to us what high ethical standards look like *from their own perspective*, and then actually try to uphold those standards. Forging genuine empathy and understanding between members of both classes is going to require work of this sort.]
- My guess is elites trade a lot of favours, especially the ambitious ones.
  - Keeping track of debts owed and incurred to other elites, not just financial but favour-based, seems like a non-trivial part of the job of an ambitious elite. Not keeping good records puts you at a disadvantage.
- My guess is elites, especially the most successful ones, make plans with time horizons longer than the other two classes.
  - This is definitely not true of all elites; I'm sure many are just trying to improve status among their peers or find a better partner or whatever.
  - Having more wealth does allow you to make plans on the timescale of multiple generations though, so a handful of elites make use of this opportunity.
  - Successful elites put a lot of time into thinking about what happens to their wealth, position and organisations after they die. They try finding successors (for example grooming their children for the role) and try modifying their organisations so that they can survive their death. They especially do this when they are old themselves. They care about what story they are remembered for.

[Side note: I basically think the internet is going to disrupt the social norms of all three classes. I would first like to understand what the current norms are, before trying to predict how they will be disrupted.]

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/unimportant/politics/collusion_and_control.md

2025-07-11

Disclaimer
- Quick note
- Incomplete

Multiple structures in the world want to suppress some knowledge from reaching the public, or from then becoming common knowledge.

Examples of structures
- religious institutions
  - in the context of this discussion, this includes fundamentalist atheist ideologies such as marxism, nazism, liberalism, etc
- families (parents, spouses, etc)
- political parties, governments
- schools
- ?

Examples of common knowledge that may be suppressed
- Direct harm - Institution is directly responsible for harming the person, such as through physical violence, stealing their assets or jobs, etc.
- Lost potential - Individual would be happier if they did something they wanted to, instead of the thing they're being controlled to do
- Loss of agency - Individual has suffered loss of agency and therefore loss of meaning / existential stuff, as a result of being controlled
- ?
Disincentives applied in order to exert control and prevent the knowledge from coming out:
- safety
  - threaten violence if they disobey
- financial incentives
  - create financial dependence, pull back finances if they disobey
  - threaten violence if they attempt to obtain a source of income or savings you do not control (this is technically also a safety threat)
- social incentives
  - ostracise them
  - coordinate with other people in their life to also ostracise them
  - control individuals with high social status
    - Many people look up to these individuals, so if these individuals also guilt and shame you, it is harder to break free.

Collusion
- Multiple institutions often coordinate to gain control of individuals
- Collusion to apply threats of violence
  - Due to the invention of nuclear weapons, only 9 countries' elites have truly independent policy. Even within them there is some coordination.
    - Studying which of the 9 countries have collusion between which institutions and the government is important.
  - Religious and political elites often coordinate with other political elites to gain control over people in various countries. This is true across multiple religions, including atheist ones.
  - They in turn can coordinate with family elders to enforce threats of violence if the child disobeys.
    - Most commonly this is done by persuading the family elders into the ideology.
    - This can also be done by directly threatening violence on the family if the child disobeys. Example: North Korea
  - Schools can also be used to apply threats of violence, although this is becoming rarer in the world.
- Collusion to apply financial dependence
  - Political institutions often use their monopoly on violence to
    - extract tax from citizens
    - prevent citizens from hiding their assets
    - prevent citizens from moving their assets out of the country, using capital controls on gold, cryptocurrency, foreign equity etc
    - extract bribes from economic institutions.
      Depending on the balance of power they might also provide favourable policies in return.
  - Similarly, families and geographic communities often
    - restrict children's physical movement to prevent them from finding sources of income outside their control
  - Schools and colleges can restrict the set of careers one is aware of or has access to. Schools and colleges can normalise certain careers and denormalise others.
- Collusion to apply social incentives
  - ? todo
- Collusion to spread ideology
  - ? todo

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/connect_with_me/horse.pub.md

```
-----BEGIN PGP PUBLIC KEY BLOCK-----

mQINBGhvSiABEACi4il+8TggVfoZpEQo6bYu7emML5hgcPHH4HKFH4LScGzDSbE2
FQ2drGMcims90iak//tWkceuuHnSjJBaMOKxDK2X69BXP7ndHV/ag6UoM7id4h8m
QMnoQbqUxEg+7jVzRuzYSojm2Ox50w03QAFGy2pzeSEu0GXhSEkudHfrKt9GlIz4
9ArcNQiRNrySQeG0HOIIiXrQzwKXqDbXlctS5J955E6/9xtRtQP5qJzN9XWY+NdE
20KCqOvn5WKIML+VenBs2a0Zh7+Uf5dq2p7+L5zjSHYhFK6/p3YGxOxK11rk1aVN
JiwpjNUBnOpdmvwwkIXM8xzqDLlhaowKudEBQ2TPu708i9L+GYZhW8vDhyP/LZ5i
an3bi6HLSwp0Rd7lgm6PvkHd0826z2dzKw7qV+i0Mu9mettjKOHpWu/wEUBfd/M3
qkSHPjINojO3RpSEs6Jv7T8dg6LxUFPbmqZZLoYIXQ7sg7C2zhhBYms9W1JeaUUA
k16QTPQ4kx4Aylbd+QYIDue3X11LfeKBng42hi59iMtAlTMsiWuXIhxDSKm4aCs/
DOFXqmTFMvXSZ2LhV0QXzZwhCdlQho29cXdVXSg3exdAZPgOR1PMkcEM7QdcX5kb
nPPlOjqW27xe0wKT5rPmoqfYA9Yf7CMx6fHIEUMSEEmIyacDPNbOJg5sSQARAQAB
tAVob3JzZYkCTgQTAQoAOBYhBLmlTdTnb5tpVO/BDiFihTCoJFohBQJob0ogAhsD
BQsJCAcDBRUKCQgLBRYCAwEAAh4BAheAAAoJECFihTCoJFohR2UP/0Axk5NPXeAQ
hNmtni2uPFvBw6za3xqpdMvmgxRD8zsywafS9XRYx4Lzv3x45mRbFJNFiB+nXJZd
+7L9tygKcBdlZfqnnnxpxWCwQV0eUepp9kMAikLr7T/eNVYOsPOzIJCe57gWiZ6B
lagLlmpieXo3fjapLKdneEIt7J+N6IGlq2v6uyeLwxD7eKXaWvtVaH4cvNOagR2S
Y+dtKtCibrBOob57SFTVFo9wvE1hdMkGT+9duusDL6ZC09tUhhS6DSN3YVIAXekb
pxmJImq1bygevWLgzAIPSPvvHYY3l+yoX7U4acMFy77rX+Wz45i1CQ4u0yZsWraf
DhBlCNLaJ1BPC17fhAFQs+nqaA4Xx+eBAkF+QGAeMc80M8l01/kDBzXck1KzjlKA
ZrKH3ILywCra5n/letdNCP16BlaD7dkeU/hFY3HkXWH1fDt1lpXheAJWaOvJGse1
KOQkhTj0WEf3Is9rLtv6Rfb4bAn6BxpGZlAXCoFUgcij40iQBTsoX8HTG4vgSlE7
29RemOhazKz4DoiUhsZU0mqhwBHFmwhISiuftzANYVNElnlXBn14Oc9zsCHZyB59
d5E0ivrNMBw6VqeJIGf4OuJwp6CWOpK4ts+XEkjmB7sNJY9dapa+7G59b/2tJcmA
ftlifkFZeK2nhNDnrTxGmkYSQoN7eOOAuQINBGhvSiABEAC8JeIUrZVIq6Jjm4xB
vyosA1keKfmwI7VYykfMAhvkxguzXaXNd5N+tCj1+t4O0y3AfR1fbenOGynjYoEH
2OrShMIZNHOZ+kBVJtZQqyekfF0SjwS4xDi4LQI7qs2aTpebE12lAPi2YV6kShkd
9NmXHgaQrFDYNSpm5iD1BEMJe8nptlOugFVLt7yVc/LAjlrTxt3eWsLt8+6YSZ4I
/o5Os8QAzvx83Im6jeBe3w7/H/CzURPj7IAxO+bdBZ14Cl0OUFKNr2edglDCfHNT
56frImzVieZD4bmkc8mC+1StETjBS1qVkKmGARlRCpLjO3x85gZRyoqHr6Zv58Sm
bLmYMs7XiSMw6k4U6SMmxPkp0P7TX821yNOnHyW7BYI4ttbZ8Q7l92VU59MLAsSq
u1/5Oxe8siixRlvzD56YuPSkQ8MKvvKSBdullNQQ9/80vyC5m+YClPdwmnkmxnv5
FyE9Z4DfJCLMdAnqBjIui3anAAf2WKISuov5DDwV+ZathGJ4hvcongxX1ZUDubtW
b5xeNtqEh5ITeMx9iHuw8oGTuQtreGEybIcB0c3JChKl7P5ci4Kicyj2u/FjTtT7
0SDHQ5KwH4CuhIC7dMXw7eF4CcDer1yH5AeaiAJ5xzPgwCJC2niIna11iW9q0K8T
qmpUZLiu/O6BmoTfgLp2GQ4IowARAQABiQI2BBgBCgAgFiEEuaVN1Odvm2lU78EO
IWKFMKgkWiEFAmhvSiACGwwACgkQIWKFMKgkWiGoBw//T+6yzZFFArlGFA8YEwSH
rIuNAnyQlS55QbIpkl6YIgqPbPfEonXrUYZRVf4whhgXmXKyHm+qYM0xzprMTT8g
+EHXKrbWxoqU7zfZZiUSxjz/BG8YP0Kko29TzqWk0g54N5Q0feEiKvLRwaxRYYjf
YU56Q6OYZKvEAEino3GppqBdHDR04/5t5lhQ9DrUvB+AvqXiMkSLNGWHpTPea65C
I3MvQ+cQAZ2IpUu/iRkEVeFcULFBjNHSCdU4yjprRw1DYtbO1QNDihd7QZ0QbbWE
MJzsTLaO5dqJ16gNpBoU1DeQlDxiI8rxWhmUgw4VozwFFDp8pekgbTL7kAJ+L2Kp
WVuI8y7i/VNXZswXOOVQ75rSIatISUQwey5m8K48S23LWuY7+kHVMcsaTMjKCDjD
BA/bJ4iKVQvkF/C2NNhwHkefTQu7XPu6LL5lE0mseqy5d1WlFOJnfi+2uTQdIdhX
CIn6XYg6hgfD80NzeHxCT7EYku/AQljkRA5EWJ70ujiLsArsfYTWM0PXWmgrGAIm
Pfe2GqA9MdvzzmvEQR3n13dsb0q7Z1yrfwJIaqDCcJ2U70AaWWosfnPXtAZku50n
TGydbS0r3MkljDyHy3ll8/rKAcMsy/IonXPxHI3Y2EiJKgHP6fOfOc1FHyTTajlN
0iJ83QdDS0Tos4pDwaXITWM=
=YeAb
-----END PGP PUBLIC KEY BLOCK-----
```

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/connect_with_me/contact_me.md

2025-07-10

# Contact me

## Online

**My email: samuel.da.shadrach@gmail.com**

I also accept anonymous email.
## In person

Probably first meet me on an audio/video call, and if it goes well, we could meet in person.

## Security

For low security: Use the gmail ID above.

For medium security: [Contact me (secure)](./contact_me_secure.md)

For highest possible security: I do not have a security setup sufficient for this. Submit to a [SecureDrop server](https://securedrop.org/directory/) or use it to arrange meeting the journalist in person. Journalists at orgs operating SecureDrop likely have previous experience dealing with anonymous high-stakes submissions.

## How to send me an email

How?

Please consider emailing me if you at all believe either you might benefit from it, or I might benefit from it.
- To maximise the value you get from me, consider writing a short document about what you'd like to talk about and attaching it to the email. Then we can schedule a quick call where I go through your document and provide input to you.
  - I find calls faster than text for extended conversation.
- Please don't write filler words, just get to the point. Imagine you were messaging me, not emailing me, if it helps.
- Just in case I ignore your email, you can get my attention by [donating to me](./donate_to_me.md) a tiny amount like $1 and attaching proof in the email.

Why?
- I'm okay being spammed. Worst case, I'll ignore your email. It doesn't consume a lot of your time to write me a one-para email, so you shouldn't overthink it.
- I don't mind spending one-hour calls with lots of strangers to help them. I expect some fraction of these callers will repay the favour when I need it.
- It does not matter if we've spoken before or not, or if you're approaching in a personal or professional context, or what your message is about, as long as it adds value to me. Even if you're criticising or accusing me, for example, it's possible this adds value to my life, and I might appreciate you for it.
# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/connect_with_me/hire_me.md 2025-04-13

# Hire me

You can hire me for $200k/annum or $150/hour for almost any task in almost any location. I might also be open to offers in the range $100k-$200k/annum, but whether I accept will depend on the specifics of the opportunity.

[Contact me](./contact_me.md) for the same.

*Referral bonus: If you can refer me to a job I end up sticking to for at least one year, I can pay you my first 3 months of income minus minimum wage. Referral means I don't need to spend a large number of hours in application processes. To claim the reward, make sure to keep a copy of our chats and a copy of this webpage as proof, as I might forget otherwise.*

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/connect_with_me/donate_to_me.md 2025-05-27

# Donate

#### Update

Please see my other document for specific projects instead. This document is for unrestricted donations.

## Main

**My XMR address:**

```
446uMYQvFeN5kmwC4TKs4h9BLTbDmXD9ECTTpe5NfnXKby7XSeY876LMpSryKMLFBAYLYWumrxdftLcqnp2F1CSQMHrHpPb
```

#### Do not donate if

- your expected net worth in the next 5 years is below $500k. There's a possibility you need this money more than I do.
- you feel you have obtained most of your money by morally grey or immoral means. This is true whether or not I ever get to know how you obtained your money.

#### Donating less than $1000

How?

- For amounts below $1000, I only accept Monero (XMR) from anonymous donors.
- If you've donated to me, do not message me or anyone in my circle saying you've donated, as that can break the anonymity.
- The best way to buy XMR is usually buying BTC first using a bank transfer, then swapping that for XMR. Reddit has recommendations for both KYC and no-KYC options. If you find this hard or time-consuming, consider finding a friend who can buy it for you and paying them a commission.
Learning these skills can be fun and can come in handy in the future.

Why?

- I prefer anonymous donations because I don't have the time or interest to do a background check on small donors. And I would not want to entangle my reputation with yours without a background check.

#### Donating $1000 to $500k

How?

- The above XMR address still works.
- I can also likely support other payment modes such as wire transfer (SWIFT), Paypal/Stripe or US equity. This may require a background check.
- I can also list you on my website after a background check, if you prefer this.
- [Email me](./contact_me.md) and I'll quickly figure out the best way to accept your donation. If you're undecided on whether to donate and want a 1-hour call with me first, please email me.

#### Donating above $500k

How?

- I'd recommend donating $500k first and giving me a couple of weeks to figure out the legal aspects. Once I'm ready, you can send the remaining amount.
- [Email me](./contact_me.md) and I'll figure it out.
- Allying with a politician or billionaire is one of my potential lifetime goals. If you are one, I'm willing to put in a lot of effort to build such a relationship.
    - It could be a standard donor-recipient relationship, in which case I will insist on maintaining autonomy over my decision-making.
    - Or it could be a relationship that involves shared decision-making, in which case we may both have to make compromises.

## Consequences of donating to me

Money

- This is the unrestricted donations page. Money donated to me is not earmarked for a specific project; it is given so that I have more agency in the world. **If I spend all of the money on failed activities or on self-serving activities, I have not done anything unethical.** I'll try not to do that, but I make no legally binding promises.
- Receiving $500k in cumulative donations would significantly change my life, as it would let me get EU citizenship and a fast-tracked US green card.
- Receiving small donations will not significantly change my life but would provide me encouragement, and would let me tell others that I get paid for my work instead of telling them I'm unemployed.

Sponsoring a visa

- If I were to live in the US (most probably in SF), I could meet potential collaborators and mentors more easily, and benefit from the free-speech norms of the US. This is likely to help me, as free-speech norms affect not just what I am allowed to say but what thoughts I allow myself to think.
- I am not keen on shifting to the US on a visa that lets someone else arbitrarily restrict what I do or say.
- I understand that my speech and behaviour may affect your reputation. We may be able to negotiate a way for your reputation to be insulated from mine despite you providing me a visa. Please contact me to discuss more.

## Ideal funder

I'm not sure there's even one grantee that satisfies all of the following criteria:

- received $1M funding
- non-profit
- public advocacy
- AI xrisk

There are currently very few non-profit funders for public advocacy around AI risk. Most large funders are funding alignment researchers and US policymakers, in order to make societal changes while keeping most of humanity out of the decision-making process. I don't have a moral problem with this strategy, but I think it has a high probability of failure, and that public advocacy is more likely to succeed than this strategy.

Relevant: [FLI digital media accelerator](https://futureoflife.org/project/digital-media-accelerator/)

## Past financial record

As of 2025-05 I have a net worth in the range $20k-$100k.

- I am not dependent on donations.
- I currently live in India on approx $200 per month (approx $2,400 per year).
- The exact amount I spend can go up or down, for example depending on the projects I'm working on.
- I can sustain myself on my savings for at least a few years.
Most of this was made via crypto trading and investments, including working at Rari Capital and investing in crypto during the 2021 bull run.

An investment I made that I now consider morally grey was purchasing a significant amount of REPT to profit from victims of smart contract hacks.

- Rari Capital's smart contracts were hacked twice, first in 2021-05 and again in 2022-04. After the first hack, IOU tokens called REPT were issued to all victims. They could either hold these tokens until Rari repaid them, or sell them to someone else. Some of the users sold them to me at a deep discount. Eventually Rari did in fact pay back the hack victims, allowing me to profit.
- I did not cause the smart contract hacks, was not responsible for smart contract security, and had left the company over 3 weeks prior to these investments in REPT.
- I do not currently intend to return this money to the counterparties who sold me REPT. I might consider returning it if I became much wealthier than I currently am.

If I had to redo the situation, I might have made different choices. I think Samuel!2021 undervalued the importance of building a strong reputation compared to Samuel!2025.

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/connect_with_me/donate_to_me_ai_whistleblowers.md 2025-06-03

# Funding request - enable whistleblowers at AI companies

#### Disclaimer

- Quick note
- Might update document as I get more info
- Contains politically sensitive info
- I wrote this document mainly as a reference for making funding applications to a number of institutional funders in the AI x-risk space.

## Social proof, comments and feedback received

[REDACTED] All details not published for privacy reasons. Message me for more details.

## Project description

Project 1

- Publish a guide for whistleblowers in AI companies leaking classified information.
- Guide will include all knowledge relevant to a would-be whistleblower, including technical knowledge on opsec, legal knowledge, geopolitical knowledge, knowledge on dealing with journalists, and mental health resources.
- Assume that AI companies will have classified some of this information. Assume that the best strategy for such a whistleblower is breaking US law and escaping to a country like Russia, not attempting a legal defence within the US.
- Example: Edward Snowden

Project 2

- Provide youtubers and journalists with technical education on a) opsec for handling leaked documents including classified info, b) AI risk.
- I am primarily focused on onboarding at least 2-3 popular youtubers and journalists located in each US-allied nuclear state (US, UK, France) and each non-US-allied nuclear state (Russia, China, India, Pakistan, Israel).

![AI whistleblowers](../../non_text_non_video/ai_whistleblowers.png "AI whistleblowers")

## Work done on this project so far

Project 1

Incomplete drafts: [database](../my_research/us_govt_whistleblower_database.md), [guide](../my_research/us_govt_whistleblower_guide.md)

This is like v0.1; expect the final version to be based a lot more on historical data and a consensus of expert opinions, and less on my personal opinions.

Project 2

Not yet started. Just made lists of journalists and youtubers I'd like to contact in an ideal world.

## Theory of change

Theory of change for project 1

- We publish a high-quality guide supporting whistleblowers.
- We make the guide popular in relevant circles.
- => a) Whistleblowing without going to prison becomes less risky, b) whistleblowers feel they have moral support from our team, and c) the whistleblower becomes better calibrated on the success likelihood of this plan, which is above 50% IMO.
- => Increased probability that a whistleblower comes forward and leaks information.
- => Leaked information gets published.
- => Citizens in multiple countries are convinced AI risk is a top priority issue.
More people within the governments of the US and China are convinced AI risk is a top priority issue. They get an accurate picture of the situation as it plays out in real time.

- => "Common knowledge" is established. It becomes more in the self-interest of anyone in the US Congress/Senate to take steps on the issue.
- => Complete global ban on AI research.

Theory of change for project 2

- We enable more youtubers to safely handle classified information.
- Case 1: No hot war between the US and China or Russia.
    - Existing journalists in the US may already be ready to publish the information, for instance those hosting SecureDrop servers.
    - However, they are likely to publish a political piece that aligns with the media outlet's perceived self-interest, and are less likely to publish the original documents.
    - => We will ensure original documents are published, by putting US corporate media in competition with youtubers inside and outside the US.
    - => If original documents get published, citizens in the US and China get a more accurate picture of the situation as it plays out in real time.
    - => Similar theory of change from here onwards as project 1.
- Case 2: Hot war between the US and China or Russia. All US journalists are gagged from publishing leaked US classified information.
    - Only journalists outside the US geopolitical sphere of influence may be able to safely publish the information.
    - => Our project may be critical in ensuring the information is published at all.
    - => Similar theory of change from here onwards as project 1.

Whistleblowers are disproportionately important for changing everyone's minds (collective epistemology)

- I generally think the average human being is bad (relative to me) at reasoning about hypothetical futures, but not that bad (relative to me) at reasoning about something with clear empirical data.
- Imagine two hypothetical worlds. Which one leads to more people being convinced of the truth?
    - One world in which lots of people speculated about NSA backdoors in fiber optic cables and operating systems and so on, but no actual backdoors were found and there was no real empirical data for it.
    - And another world in which nobody speculated about it, but a whistleblower like Snowden found clear empirical evidence for it.
- I also think that if you're inside an org that has a lot of collective attention, you will also get a lot of collective attention.
    - Case study: As of 2025-01, Geoffrey Hinton has been covered by a lot of mainstream news channels. For most people on Earth who have now been introduced to AI risk as a serious topic, I'm guessing this is their first introduction.
    - Case study: As of 2025-05, Daniel Kokotajlo's predictions have been [read by people inside the US executive branch](https://x.com/AlecStapp/status/1925138917143044154).
    - Most AI risk advocates have probably not acquired the same amount of attention as either of these two, despite spending a lot more of their energy and attention in an attempt to acquire it.
- **I think one single whistleblower providing clear empirical evidence of incoming AI risk might end up convincing more people than hundreds of people making speculative arguments.**

## Funding requested

Minimum: $20k for 6 months for 1 full-time founder (me)

- Shift to SF
    - Main
        - Helps with personal motivation
        - Build credibility for my public brand
        - Obtain in-person networking opportunities
    - Get expert feedback on the whistleblower guide from the following sets of people. Feedback is easier to obtain in-person if I'm physically present in the US.
        - People in the AI risk community. (Very US-centric.)
        - People working in AI labs, including potential whistleblowers. (Very US-centric.)
        - Journalists covering AI labs. (US-centric but not exclusively US.)
        - Cybersecurity experts. (US-centric but not exclusively US.)
        - Legal experts in US law and international law. (Very US-centric.)
        - Past whistleblowers of other organisations in the US. (US-centric but not exclusively US.)
    - Provide opsec education and AI risk education to journalists and youtubers
        - Sales pitches generally work best in-person.
        - A capital raise increases brand value, which makes the sales pitch more effective.

Standard: $80k for 1 year for 2 full-time cofounders (me and a cofounder)

- Same as above
- Also: will search for a full-time cofounder
    - Will help with personal motivation
    - Will help with filling in knowledge gaps, such as lack of legal knowledge

Ambitious: $200k for 1 year for 5 full-time team members

- Same as above
- Also: will demarcate roles more clearly. My current guess: 2 people will work full-time only on the whistleblower guide; 3 people will work full-time only on networking and sales pitches to journalists.
    - In-person networking is hard to scale and can benefit from multiple people putting in full-time work.
    - In particular, I wish to contact youtubers and journalists with a sales pitch on improving their opsec and AI risk knowledge. This can require extensive in-person effort.

## Bottlenecks

Bottlenecks for project 1

- Capital
    - See above for how capital will accelerate and improve the quality of this project.
    - Not blocked by lack of capital. It is possible to complete a first draft of this project without capital.
- Attention
    - If the project builds enough credibility, it can get expert advice from cybersecurity experts, journalists, lawyers, political strategists and others who might provide the advice voluntarily for altruistic reasons.
- Knowledge
    - Requires expert knowledge in a variety of areas.
    - I have above-average knowledge on:
        - cybersecurity and opsec - I can review multiple approaches taken by different projects and their tradeoffs
        - historical whistleblower cases
    - I have below-average (but greater than zero) knowledge on:
        - US law, international law - the whistleblower guide needs in-depth legal knowledge
        - geopolitics
        - psychology - the whistleblower may need mental health resources provided by the guide, as it is too risky for them to hire a psychiatrist.
        - newswriting, media training - the whistleblower may need media training to persuasively argue their case in front of the public. This can then be posted to youtube or a similar platform.

Bottlenecks for project 2

- Capital
    - See above for how capital will help.
    - Not blocked by lack of capital. It is possible to complete a first draft of this project without capital.
- Attention
    - If the project builds enough credibility, it can work directly with existing youtubers and journalists. It is not mandatory to acquire public attention from scratch.
    - It is typically in a youtuber's or journalist's self-interest to publish documents before the others do.
    - However, for publishing classified information, they may face significant risks to their career and reputation. (See case studies for more.)
- Knowledge
    - I have above-average technical knowledge on:
        - cybersecurity and opsec. For instance, teaching youtubers to operate a SecureDrop server or similar setup. I have extensively gone through their docs, so I know I'll be able to install it.
        - AI risk
    - I have below-average technical knowledge on:
        - US and international law. Youtubers and journalists might want to understand their legal exposure for soliciting, processing or publishing leaked information. It's possible their existing legal team is poorly equipped for this.

Knowledge bottleneck for both projects

- The biggest bottleneck by far is that I currently lack legal knowledge. Will need to onboard someone with legal knowledge, either as an advisor or full-time.
- I want as much of the whistleblower guide as possible to be backed by either historical evidence or a consensus of experts.
    - My personal intuition should only be relevant in sections where neither historical evidence nor a consensus of experts exists.
    - If multiple experts disagree on the best advice, it may even be possible to post both pieces of advice side by side.
        - I'm currently undecided on whether this is a good idea, but it is an option available to me.
        - Even if I don't publish the views of people I disagree with, they are free to publish their views on their own.

## Team

Currently just me (Samuel Shadrach).

My bio

- Indian citizen, born and resident in India.
- Made research notes, including those that convinced me to work on this project. 2025
- Wrote an AI-based search engine for books. 2024-25
- Attempted building the EA IIT Delhi community. 2024
- Completed IIT Delhi, BTech and MTech, biochemical engineering and biotechnology. (Considered among the top 2 engineering colleges in India.) 2018-2023
- Completed ML Safety Scholars (40h/week, 9 weeks) under Dan Hendrycks, then at UC Berkeley. 2022
- Worked at Rari Capital, managed risk framework for $1 billion AUM in cryptocurrency lending/borrowing. 2021-22
- Worked at Market.xyz, helping with founding and risk framework for cryptocurrency lending/borrowing. 2021-22

#### Previous track record of team (detailed)

- Research on surveillance and whistleblowers
    - Partially successful, **most relevant to this funding application**
    - [All links published here](https://samuelshadrach.com/raw/text_english_html/my_research/)
    - Wrote a detailed review and criticism of SecureDrop, an existing solution in this space used by many of the top corporate media outlets in the US, including the Guardian, New York Times, etc.
    - Wrote about potential improvements to the ecosystem for whistleblowing.
    - Wrote about long-term societal implications of lots of information coming out in public.
    - Wrote about mitigating downsides, such as weapons capabilities becoming open source.
    - I consider this an ongoing research project, not completed.
- Wrote an AI-based search engine for books.
    - Successful
    - [More notes on this on my website](https://samuelshadrach.com/raw/text_english_html/my_projects/my_projects.html)
    - Still think this app is valuable for a niche of researchers. Haven't prioritised user acquisition due to shifted life priorities.
    - Learnings
        - Iterated through 5-10 different failed approaches before I got the current approach working. Improved as a software developer as a result.
        - Obtained a very fine-grained view of AI capabilities in 2025, including the capability differences between finetuning, RL and embedding search. As of 2025, embedding search is still by far the most economically useful. The main reason is that the error rate is lowest when the end user can directly refer to ground-truth data.
- Full-time travelling, including teaching English at a school in Vietnam for 1.5 months
    - Successful
    - Learnings
        - Accomplished personal goals. (Not elaborating here as this is a professional document.)
- Completed ML Safety Scholars under Dan Hendrycks, UC Berkeley
    - Successful
    - Spent 40 hours/week for 9 weeks studying deep learning and safety under Dan Hendrycks.
    - Multiple pytorch assignments training our own models, and multiple video lectures that I found very high-quality.
    - Learnings
        - Better technical understanding of deep learning also enabled me to form more accurate views on AI timelines and AI x-risk.
        - I have now published my latest views on AI timelines, intelligence explosion and so on on my website, including rebuttals to Ajeya Cotra's evolutionary anchors, Yudkowsky's certainty on intelligence explosion, and so on. [AI timelines](https://samuelshadrach.com/raw/text_english_html/my_research/superintelligent_ai_timelines.html), [Intelligence explosion](https://samuelshadrach.com/raw/text_english_html/my_research/intelligence_explosion.html)
- Started EA IIT Delhi community, with CEA UGAP funding
    - Not very successful
    - Influential in ensuring ~10 people from my college attended EAGx Jaipur. Zero full-time employees of EA orgs recruited.
    - Ran weekly meetups to complete the intro to EA course.
    - Multiple members of EA IIT Delhi are now aiming to work on AI SaaS startups. Other members are currently doing higher studies in AI-related fields in the UK and US respectively.
    - Learnings:
        - I significantly underestimated the extent to which the people I'm talking to face pressure to pick standard career paths, including financial pressure and social pressure from family.
        - If I could go back and do it again, I would filter hard on these criteria from day one, and prefer finding 1 person who is both passionate and financially privileged over 10 people with lukewarm interest.
        - Even people working full-time on accelerating AI who think AI risk is a real problem might not switch career paths unless they've done the psychological work to become resistant to this social pressure.
- Managed risk framework and operations for Rari Capital and Market.xyz
    - Partially successful
    - Designed risk framework for managing $1B in cryptocurrency at peak.
    - Worked on risk framework. No major loss of funds due to lending/borrowing risk, indicating my risk framework did not fail. (There was loss of funds due to a smart contract hack, but that was out of scope of my work.)
    - Operations work. Helped scale the team from 6 to 15 members via outreach and vetting of applicants, including via short-term projects.
- Cracked JEE Mains and Advanced to study at IIT Delhi
    - Successful
    - Standard competitive exam for most STEM degrees in India.
    - Obtained an All India Rank of ~4000 out of over 1 million annual applicants.
    - This required multiple years of full-time coaching and preparation.

## Location

If funded, all team members will be located in the US for the next 1 year.

[REDACTED] All details not published for strategic reasons. Message me for more details.

## Looking for cofounder

- Alignment
    - A high level of commitment is the most important criterion. The cofounder should believe in the project enough that they find a way to complete it even if, for example, our cofounder relationship breaks apart and the current funder pulls back funding.
- Knowledge
    - I prefer someone with a background in either international law or journalism, and someone who is not completely against tech (being neutral or positive is OK). This is not mandatory.
- Citizenship
    - [REDACTED] Message me for more details.

## Existing projects and weaknesses

- EA projects
    - Most EA funding currently goes to:
        - technical AI research
            - I am supportive of technical AI alignment research but bearish on most of it working out given the timeframe of 5 years.
            - I am more optimistic on an AI ban and international coordination than I am on most alignment research agendas.
            - Out of the alignment research agendas, I'm most optimistic on AI boxing. This too will only work until some level of superintelligence; eventually international coordination to pause or ban research is needed.
        - supporting AI policymakers
            - By 2026-27, AI will likely become a top item of national security interest for both the US and China.
            - Convincing US policymakers is less helpful if the policymakers are attempting to act against the perceived self-interest of both US Congress and Senate members, and the perceived self-interest of leaders of the US intelligence community.
            - 99% of humanity is still broadly unaware of the AI risk arguments as presented by Yudkowsky and others, despite billions in funding for AI risk. Movements that aim for mass public support are able to get more awareness for less money spent, IMO.
- Existing whistleblower guides
    - SecureDrop (Freedom of the Press Foundation) provides basic guidelines for whistleblowers using their system
    - [Guides by Government Accountability Project (GAP)](https://whistleblower.org/resources/#resources-about-whistleblowing)
    - [Tech Worker's Handbook by the Signals Network](https://techworkerhandbook.org)
    - [Guides by Protonmail](https://proton.me/blog/whistleblower-communication)
    - Have not yet found guides by EFF or ACLU
    - [GIJN list of guides for journalists (not whistleblowers)](https://gijn.org/resource/working-with-whistleblowers/)
    - My generic view on most existing whistleblower guides
        - Generally supportive
        - Mostly written by US citizens resident in the US. This has an obvious chilling effect on what they can publish in public.
        - Some of these guides have a lot of useful information. I have read all the above guides in detail.
        - Most of their guides are useful for a whistleblower who does not leak classified information and stays within the US for a legal defence.
        - They have a lot less public information on whistleblowers who escape the country (such as Snowden or Assange) or leak classified information. Such whistleblowers need legal expertise on international law, making asylum requests, and geopolitics. I have not found a public guide with this information yet.
- Existing tech projects
    - Side note: Fewer than 100 devs seem to be managing the codebases for all the below listed projects. This all by itself seems like an attack vector to me. I have not verified exact figures.
    - Tails (owned by Tor Project)
        - Highly supportive
        - It acts as a foundation for this project
    - Tor browser
        - Highly supportive
        - It acts as a foundation for this project
    - SecureDrop (uses PGP and Tor under the hood)
        - Highly supportive
        - See my document on this for a detailed review and criticism of SecureDrop.
        - Biggest weakness is that it is only used by journalists inside the US geopolitical sphere of influence (North America, Europe, Australia, etc). It is not used by journalists in India, China, Russia or Israel, for example.
        - Second biggest weakness is that it does very little to improve the opsec of the whistleblower, and primarily improves the opsec of the journalist. IMO this is not great, as it does not earn the trust of the whistleblower, who is the one more likely to go to prison. (Multiple previous case studies, including Reality Winner, where journalists have been careless about whistleblower privacy.)
    - Signal
        - Highly supportive
        - For this project, I think Signal is not optimal, at least as it works today.
            - SecureDrop or a similar approach may still be best for this project. Journalists and youtubers can run their own onion servers without involving any third-party server.
            - Posting messages to Signal's servers is slightly better than posting encrypted messages on some random third-party server, as, say, dark web vendors currently do. Signal takes some measures such as sealed sender (which I like) and secure enclave (which I don't like) to reduce the metadata they collect on their server.
            - Signal doesn't run well on Tails, and some of its features are mobile-native. For instance, it only does phone number registration, with no anonymous method such as cryptocurrency payment or PoW hashes. Waiting for this to get fixed.
        - I might write a detailed review of Signal sometime.
    - Protonmail
        - Highly supportive
        - For this project, Protonmail does not offer significant security features compared to just using gmail. Can assume both maintain logs and will cooperate with law enforcement.
          The main feature I like is that Protonmail offers anonymous registration using cryptocurrency.
        - Bare minimum opsec: Protonmail + PGP + Tails
            - The bare minimum that any journalist or youtuber should be offering IMO, even if they don't have full-time effort to devote to running servers or studying cybersecurity.
            - Lack of technical knowledge is the only reason I can see why journalists and youtubers don't offer this.
- Existing journalist projects
    - The Intercept
        - I take a neutral, not positive, view of their work today.
        - Was founded with the help of Laura Poitras and Glenn Greenwald, who worked with the Edward Snowden leaks.
        - Both of them eventually resigned due to lapses of opsec on the part of the Intercept editors, which ended with Reality Winner being imprisoned.
        - Both journalists are now working independently. [Glenn Greenwald runs an independent news channel](https://rumble.com/GGreenwald) on Rumble.
    - Independent SecureDrop servers
        - Some journalists such as Stefania Maurizi and Kenneth Rosen run SecureDrop servers independent of any corporate-funded media org.
        - Stefania Maurizi has worked with both Assange and Snowden previously.
    - Wikileaks
        - Highly supportive
        - Assange has recently been released from prison, after a few years in a UK prison and a few years in the Ecuadorian embassy.
        - [His latest statement after being released](https://www.youtube.com/watch?v=Ai34Uxnv_4s) does not indicate any plans to continue operating Wikileaks.
        - Likely has extensive expertise relevant to this project, but may not be able to publicly provide it.
- Existing whistleblowers
    - With enough branding, it may be possible to contact existing govt whistleblowers for their expertise.
    - Edward Snowden
        - Likely has extensive expertise relevant to this project, but may not be able to provide it. Putin's terms for asylum explicitly included not carrying out any further leaks.
    - See list of case studies for more.
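The opsec stack discussed above (Tails, Tor, SecureDrop, PGP) depends on users actually obtaining authentic copies of the tools. Not from the original document: a minimal Python sketch of one small part of that setup, verifying a downloaded image against a checksum published on the project's website. Filenames and checksums here are hypothetical; real verification should also check the PGP signature on the checksum file, since an attacker who can swap the image can usually swap the checksum too.

```python
import hashlib
import hmac

def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks to bound memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def matches_published_checksum(path: str, published_hex: str) -> bool:
    """Compare a local file's digest against a published hex checksum."""
    # hmac.compare_digest is constant-time; overkill for a public checksum but cheap.
    return hmac.compare_digest(sha256_of_file(path), published_hex.lower().strip())
```

This only proves the download was not corrupted or silently swapped relative to the published checksum; it does not by itself authenticate the publisher.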
## Red-teaming theory of change

Here is a giant list of reasons why people might not want to fund this. (You can add a comment to upvote any reason or add a new one. I will add more arguments and evidence for whichever points get upvoted by more people.)

- US intelligence circles will significantly underestimate the national security implications of AI; lots of information about AI companies will not become classified
    - Disagree
    - I think AI will likely be the number one national security issue of the US by 2026 or 2027, and lots of important information will get classified soon after.
    - I'm putting more evidence here because this argument got upvoted.
    - Attention of govt and intelligence circles
        - Paul Nakasone - ex-Director of NSA, now on OpenAI board
            - [Recent talk by Paul Nakasone](https://www.youtube.com/watch?v=D1bzZwFQUcI)
            - Says: Cybersecurity, AI, and protecting US intellectual property (including AI model weights) are the primary focus for the NSA.
            - Likely a significant reason why he was hired by Sam Altman.
        - Timothy Haugh - ex-Director of NSA, fired by Trump in 2025
            - [Recent talk by Timothy Haugh](https://youtu.be/JIrw_ybds0s?si=RntO2mx9XzTeGftj) (12:00 onwards)
            - Says: Cybersecurity and AI are top challenges of the US govt.
            - Says: AI-enabled cyberwarfare such as automated penetration testing is now used by the NSA.
            - Says: Over 7000 NSA analysts are now using LLMs in their toolkit.
        - William Burns - ex-Director of CIA
            - [Recent talk by William Burns](https://youtu.be/S9yqvn8B3no?si=puq9--89dXjrpLLE) (22:00 onwards)
            - Says: The ability of the CIA to adapt to emerging technologies, including large language models, is the number one criterion of success for the CIA.
            - Says: Analysts use LLMs to process large volumes of data, process biometric data and city-level surveillance data.
            - Says: Aware of ASI risk as a theoretical possibility.
            - Says: CIA uses social media to identify and recruit potential Russian agents.
    - Avril Haines - ex-Director of National Intelligence, ex-Deputy Director of CIA, ex-National Security Advisor
      - [Recent talk by Avril Haines](https://www.youtube.com/watch?v=Hxh87QQUzLc)
      - Says: A major priority for her is US election interference by Russia and China using generative AI on social media.
  - US policy efforts on sanctioning GPU exports to China
    - Significant US policy efforts are already in place to sanction GPU exports to other countries.
    - China is currently bypassing export controls, which will lead US intelligence circles to devise measures to tighten them.
    - Export controls are a standard lever that US policymaking and intelligence circles pull on many technologies, not just AI. This ensures the US remains at the frontier of R&D in most science, technology and engineering.
  - Attention of Big Tech companies
    - Leaders of Big Tech companies, including Jensen Huang, Satya Nadella, Larry Ellison, Reid Hoffman, Mark Zuckerberg, Elon Musk and Bill Gates, have made public statements that their major focus is AI competitiveness.
    - Elon Musk
      - Explicitly interested in influencing US govt policy on tech.
      - As of 2025-05, likely owns the world's largest GPU datacenter.
      - Has publicly spoken about AI risk on multiple podcasts.
    - Mark Zuckerberg
      - As of 2025-05, open-sources his company's latest AI models.
      - Has previously interacted with multiple members of the Senate and Congress.
      - Has publicly spoken about AI risk on multiple podcasts.
  - People who understand nothing about AI will follow **lagging indicators** like capital and attention. This includes people within the US govt.
    - Capital inflow to AI industry and AI risk
      - OpenAI, Deepmind and Anthropic posted total annual revenue of $10B in 2024. This implies a total market cap between $100B and $1T as of 2024. For reference, the combined market cap of Apple, Google, Microsoft and Facebook is $10T as of 2025-05.
      - All Big Tech companies have experience handling US classified information.
      - Amazon and Microsoft manage a significant fraction of US government datacentres.
      - If you believe AI in 2027 will be significantly better than AI in 2024, you can make corresponding estimates of revenue and market cap.
    - Attention inflow to AI industry
      - OpenAI claims 400 million weekly active users. This is 5% of the world population. For reference, an estimated 67% of the world population has ever used the internet.
      - As of 2025-05, Geoffrey Hinton speaking about AI risk has been covered by mainstream news channels across the world, which has significantly increased the fraction of humanity that is aware of AI risk. (You can test this hypothesis by speaking to strangers outside of your friends-of-friends bubble.)
- AI capability increases will outpace the ability of US intelligence circles to adapt. Lots of information won't become classified.
  - Weakly disagree
  - I have a low (but not zero) probability we get ASI by 2027. If we get ASI by 2030, I think there's enough time for them to adapt.
  - Classifying information is possible without significant changes in the org structure or operational practices of AI labs. This means it can be done very quickly.
    - Classification is a legal tool.
    - The actual operational practices to defend information can take multiple years to implement, but these can come after the information is already marked classified in terms of legality.
  - The US govt can retroactively classify information after it has already been leaked.
    - This allows the US govt to pursue a legal case against the whistleblower under the Espionage Act and prevent them from presenting evidence in court, because it is now classified information.
    - The specific detail of whether information was classified at the time of leaking is less important than whether it poses a national security threat as deemed by US intelligence circles. (Law does not matter when it collides with incentives, basically.)
  - Case studies
    - Mark Klein's 2006 leak of AT&T wiretapping - retroactively classified
    - Hillary Clinton's 2016 email leak - retroactively classified
    - Abu Ghraib abuse photographs, 2004 - retroactively classified
    - Sgt. Bowe Bergdahl 15-6 investigation file, 2016 - retroactively classified
- The opsec requirements to protect yourself from the US govt are very hard to meet, and my current level of technical competence is not enough to add value here.
  - Disagree
  - The whistleblower guide I'm writing explicitly states that the strategy is leaving the US and publishing, not staying anonymous indefinitely. It is assumed the NSA or AI lab internal security will doxx you eventually.
  - A lot of people with cybersecurity backgrounds are IMO too focussed on ensuring anonymity indefinitely, which I agree is very hard. My strategy requires as much preparation on the legal and media-training aspects as it does on cybersecurity, because it does not rely on indefinite anonymity.
  - IMO there's atleast a 50% chance a top software developer at an AI company can pull this off even without a guide.
    - Part of the value of the guide is simply reminding developers that succeeding at this plan is hard but not impossibly hard, and that with some preparation there's a high probability they can pull this off.
    - Increasing the probability of success from say 50% to 70% makes this guide worth publishing for me. There's a strong case IMO for why this increase in success probability directly correlates with a reduction in AI x-risk probability.
    - One whistleblower's publication of empirical real-world data could be worth more for collective epistemology than hundreds of high-status people making speculative arguments.
- Should support whistleblowers coming out publicly in the US, instead of going to another country to release the documents
  - Disagree
  - I agree that for people who don't leak classified information, staying in the US with a legal defence is an option.
  - That requires a very different guide from this one, and it may be worth writing that one too. Once I'm done writing this guide, I might consider writing that one.
  - There's also a lot more empirical data for that type of guide, because there are more such cases.
  - Case studies
    - In the past 30 years, every whistleblower who leaked US classified information and stayed in the US has been imprisoned.
      - Examples: Reality Winner, Chelsea Manning
    - Only one person has leaked US classified information and escaped prison.
      - Example: Edward Snowden
    - Examples of people who were not imprisoned despite leaking US classified information are over 30 years old.
      - Examples: Daniel Ellsberg, maybe Perry Fellwock
- Should privately support whistleblowers leaking classified information, but publicly not talk about leaking classified information
  - Disagree
  - Many aspects of the plan depend critically on whether classified information is leaked or not: most whistleblowers who don't leak classified information don't get imprisoned, while nearly all whistleblowers who do leak classified information get imprisoned.
  - Having clear writing and thinking on this issue is extremely important. I can't muddy it by replacing "classified" with some euphemism.
  - I would like to earn the trust of a potential whistleblower without deception.
    - In case of a conflict of interest between doing what is right for the whistleblower and doing what is right for the journalist / lawyer / funder / etc, I will prioritise the whistleblower.
    - Whistleblowers are the people most likely to go to prison, and this is a factor in why I prioritise earning their trust over everyone else's.
- Writing such a guide is too hard. Any whistleblower who needs your guide is going to get themselves arrested anyway.
  - Disagree
  - Since the plan involves leaving the US, I think it's possible for a potential whistleblower to make some opsec mistakes and still escape with their life.
    - The guide explicitly states that indefinite anonymity is not possible, and that what is at stake is only the time duration before they get doxxed.
  - The guide needs to transmit both the opsec mindset and a specific set of opsec rules. If both are successfully transmitted, the whistleblower can then attempt to adapt any part of it to their unique circumstances.
    - Transmitting the opsec mindset, not just a set of opsec rules, is hard to do in a short amount of time. I still think it is worth figuring out how to do this. A lot of employees at OpenAI/Anthropic/Deepmind/etc are software developers, and teaching them a security mindset may be easier.
  - A high-quality guide for this set of circumstances does not exist on the internet as far as I know. There are a lot of opsec guides on the internet for everyone from software developers to non-classified whistleblowers to dark web drug vendors to hackers. Obtaining information from those guides and distilling it into a clear set of rules requires time and energy the whistleblower may not have. Most likely, they will benefit from a ready-to-go guide.
    - The guide will have to be regularly updated as the security landscape changes.
- A hot war between the US and China/Russia is very unlikely. US journalists and youtubers can be trusted to publish the documents; non-US journalists don't need to be involved.
  - Disagree
  - I agree there is a significant probability no hot war happens. I think a hot war has atleast a 10% chance, and is worth preparing for.
  - I'm guessing our actual disagreement is on how likely superintelligent AI is to be built in the first place, or something similar.
  - It is obvious to me why an intelligence community and AI lab that has succeeded at building aligned superintelligent AI would try to disable the military and nuclear command of the other country, for instance by cyberhacking, hyperpersuasion, bioweapons, nanotech weapons and so on.
  - Even if superintelligent AI has not yet been built, if your country has a significant chance of building it first, it makes game-theoretic sense to pre-emptively escalate military conflict.
  - If any country's govt actually gets convinced that superintelligent AI is likely to cause human extinction, it might pre-emptively escalate military conflict to get other govts to stop building it.
  - US journalists will have specific pro-US-govt biases in how they publish the piece.
    - This could make it harder to convince the general public outside the US of the issue, even if they are aware of it.
- Publishing original redacted documents is not necessary. Journalists writing a propaganda piece on the issue without publishing documents is fine.
  - Disagree
  - Counterexamples:
    - Kelsey Piper at Vox publishes redacted documents.
    - Snowden insisted on transferring documents to journalists, and on picking journalists who would report the truth honestly and choose what to disclose.
- Supporting independent whistleblowers is useful, but supporting independent cyberhackers is not useful
  - Disagree
  - Choosing not to support independent cyberhackers significantly reduces the amount of information that can be published.
  - Case study
    - The Guccifer 2.0 leak revealing Hillary Clinton's private emails about Bernie Sanders was most likely obtained by cyberhacking, not by a whistleblower.
    - This had non-negligible influence on the 2016 US election.
- A whistleblower providing clearcut evidence will not lead to an AI ban
  - Disagree
  - Examples of specific evidence a whistleblower may uncover:
    - Latest capabilities of AI models, including real examples of AI being used for persuasion (including hyperpersuasion), cyberhacking (including discovering novel zerodays in hardware or software), bioweapons R&D (including inventing novel bioweapons), designs for hypothetical future weapons, and so on.
    - Private notes such as diaries and emails where the leadership of an AI company or a government talks about their true values, including whose lives they prioritise above whose.
      - Example: The 2016 DNC email leak uncovered some of Hillary Clinton's true thoughts on her campaign backers and on Bernie Sanders. This had non-zero influence on the 2016 US election.
      - Example: Assange's collateral murder video uncovered the true thoughts of those involved in the shooting. This had non-zero influence on anti-war protests in the US, but has not yet significantly changed US foreign policy.
      - Since the ideologies (including moral ideologies) associated with AI risk are more extreme, the uncovered information on the true values of leaders could also be more extreme.
    - The full causal chain behind a certain decision, including all the decision-makers involved and how certain stakeholders conspired to take control away from other stakeholders.
      - Example: Snowden explains in detail how the US Supreme Court, members of Congress and the Senate, judges on the FISA court, the US inspector general, and internal reporting channels within the NSA were all systematically used or hacked in order to keep secret the extent to which NSA surveillance has pervaded. This has not yet led to major changes on this issue.
      - There may be a similar causal chain behind important decisions taken using AI, where many nominal decision-makers in US democracy are routed around and power is centralised in the hands of a very small number of people.
      - See also: AI governance papers posted on Lesswrong on AI-enabled dictatorships and coups.
- Instead of mass-broadcasting the whistleblower guide, consider passing a message to AI employees privately
  - Maybe
  - Someone else can take this strategy while I follow mine. I think both are worth doing.
  - Me starting my project could inspire someone else to collaborate or start similar projects.
- Finding journalists in non-US-allied states who cover tech accurately and can adopt the latest tech tools may be difficult
  - Agree
  - Agree, but I still think it is worth trying.
- Legal expertise is currently missing on the team
  - Agree
  - Will need to collaborate with someone with legal expertise.

## Moral

It is possible there is a misalignment of moral values between me and the funder. This is discussed here.

- Should not leak classified info; breaking US law is morally wrong
  - Disagree
  - Moral disagreements are hard to resolve; I don't know of any one-paragraph answer that is likely to convince you. Maybe post your specific disagreement somewhere and I can have a look.
- Supporting whistleblowers is morally correct, but supporting independent cyberhackers is morally incorrect
  - Disagree
  - Both will necessarily leak important information about people without their consent.
  - Moral disagreements are hard to resolve; I don't know of any one-paragraph answer that is likely to convince you. Maybe post your specific disagreement somewhere and I can have a look.
- Private lives of people at the companies might get leaked, and this is bad
  - Disagree
  - I assume that the whistleblower or the journalist will often redact information that is not politically important, in either case. Private lives of people can still be redacted.
  - It is possible that both the whistleblower and the journalist choose not to redact some information. I am okay with this happening in the worst case, and I understand why some people might not be.
  - I think the downsides of a world where powerful organisations are able to keep secrets are worse than the downsides of their members' private lives also being revealed to the public.
  - Case studies
    - Wikileaks has previously posted private-life details, such as evidence that a politically influential person hired sex workers.

## Legal

#### Project 1

[REDACTED] All details not published for strategic reasons. Message me for more details.

#### Project 2

- Providing journalists and youtubers with opsec tools such as SecureDrop is legal in most countries, including the US and India.
- Journalists and youtubers can face legal repercussions for publishing leaked documents. However, the team that provides them opsec tooling and training is not likely to face significant repercussions.

## Relationship between me and funder

If you are an existing funder active in this space, there may be multiple conflicts of interest involved with funding me.

- Supporting me versus supporting alignment researchers and policymakers inside AI labs.
- Supporting whistleblowers versus supporting journalists.
- Supporting me versus reducing your legal exposure.
  - You can anonymously donate XMR to me if you would like to avoid entangling your reputation with mine.

Do not donate:

- if your expected net worth in the next 5 years is below $500k. There's a possibility you need this money more than me.
- if you feel you have obtained most of your money by morally grey or immoral means. This is true whether or not I ever get to know how you obtained your money.

I pre-commit to not accepting funding publicly from Dustin Moskowitz or Jaan Tallinn until they can personally have a 30-min private conversation with me around conflicts of interest.

[REDACTED] All details not published publicly for strategic reasons. Message me for more details.
## Deprioritised

I'm not pursuing the below projects as of 2025-06. They fit the same broad cluster: leaking, publishing and popularising info in one nuclear state that is difficult to publish and popularise in another nuclear state.

- Improving independent cyberhacking - contributing knowledge or funding, etc.
- Legally grey data collection - doxxing social media accounts of powerful individuals, obtaining drone/cctv footage of them, etc.
- Improvements to airgapped tech - improved PGP tool, improvements over Tor, dead drop coordination system, reviews of faraday cages, etc.
- Open source search engine - open source web crawlers, open source LLM embedding models (without advancing the state of the art), etc.
- Bypassing geopolitical barriers on the internet - improved browser tools for language translation, pay-to-get foreign SIM cards, etc.
- Starting a popular youtube channel myself.
- Starting a hard-to-censor or distributed social media platform.

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/connect_with_me/contact_me_secure.md 2025-07-10

# Contact me (secure)

**Email:** samuel.da.shadrach.1@proton.me

**PGP pubkey: [horse.pub](./horse.pub.md)**

**Warrant canary:** to do

**Frequency of checking emails:** Once in one to three months. Anonymous emails accepted.

**Overall security: Medium**

Security, digital
- Dedicated PC for this purpose.
- Inbox and keys accessed on Tails only.
- 128-bit memorised seedphrase (never used outside of Tails) + 128-bit seedphrase in secure physical location (never used outside of Tails) + PGP privkey (has left Tails). All 3 pieces of info are required to decrypt.
- All emails are deleted within 1 month of being read.

Security, physical
- This inbox is only operated from a secure physical location. The seedphrase (mentioned above) is stored in a secure physical location.
- I do not live 24x7 in a secure physical location. I visit the secure location to operate the inbox.
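The exact construction behind "all 3 pieces of info required to decrypt" isn't published here. As a rough illustration of the general idea only (not my actual setup, and function names are hypothetical), an all-or-nothing XOR split divides a secret into shares such that every share is needed and any subset short of the full set looks like random noise:

```python
import secrets
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def split_secret(secret: bytes, n_shares: int = 3) -> list[bytes]:
    """Split secret into n_shares pieces; ALL pieces are required to recover it."""
    # n-1 uniformly random one-time pads, plus a final share that is
    # secret XOR pad1 XOR pad2 ... (so the XOR of everything cancels the pads)
    pads = [secrets.token_bytes(len(secret)) for _ in range(n_shares - 1)]
    final = reduce(xor_bytes, pads, secret)
    return pads + [final]

def recover_secret(shares: list[bytes]) -> bytes:
    """XOR of all shares yields the original secret."""
    return reduce(xor_bytes, shares)

shares = split_secret(b"example 128-bit seed....", 3)
assert recover_secret(shares) == b"example 128-bit seed...."
```

A real setup would more likely use a threshold scheme such as Shamir's secret sharing, so that losing one share is not fatal; the sketch above is only the simplest all-required variant.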
Legal
- Do not have a significant legal budget.
- Likely to cooperate with law enforcement in case of investigation. Discretion used on a case-by-case basis.

Social / psychological
- Operating solo. No dedicated team.
- Only operating part-time, not full-time.
- No social isolation practised.
- Keeping secrets solo may be psychologically unsafe.

Server host
- Proton may be preserving email content and metadata (timestamps, file sizes, browser fingerprints, etc).
- Proton has previously cooperated with law enforcement in investigations.
- Proton has never encountered a significant breach of data or passwords, as of 2025-06.

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/website_meta/privacy_policy.md 2025-07-11

# Privacy policy

#### Summary

- I respect people's consent when it comes to protecting their privacy, except in the following situations:
  - I might not respect the privacy of people who have hurt me using "immoral" means. I might choose to reveal information such as what they did, to publicly damage their reputation and prevent them from doing the same to others. I use my discretion here.
  - I don't respect the privacy of people at the top companies building superintelligent AI. They are my publicly declared enemy.
  - I might in future build general-purpose infra that reduces privacy for everyone, without selectively picking targets. Examples: drone surveillance, social media doxxing tools, etc.
- My security level is not very high, and I am likely to cooperate with law enforcement. If your secret is sufficiently high-stakes, consider not giving it to me.

#### Main

This document describes under what circumstances I respect the privacy and secrecy of others, and under what circumstances I don't or can't.

If you have any doubts whatsoever, I generally recommend asking me explicitly. We can probably figure out the right solution with more discussion.

I have not yet figured out an information policy that works for the rest of my life.
I am still experimenting to find what works. Given below is my information policy as of 2025-05.

For most other people's information, here's the policy I follow:

- By default, I provide you with a reasonable level of privacy.
  - I will not share information about you with others or in public if I at all sense that you want it kept private.
  - By default, I follow liberal consent norms. I only publish something about you if I have either explicitly obtained your consent, or can implicitly assume I have it.
- If I am likely to be psychologically damaged by keeping your secret, maybe think carefully before sharing it with me. You can try asking me "meta" questions about whether it is safe to share it with me or not.
  - In most cases I think it is psychologically safer for me to be able to share your secrets with atleast one other person. But I will not share them without your explicit permission.
- I will not be able to protect your information in case of a) a targeted attack on me, or b) investigation by law enforcement. I am likely to comply with an investigation by law enforcement.
- As per my discretion, I may choose to publish anonymised or generalised accounts of some events, without naming individual names.
  - If you do not want to be publicly published about, even in an anonymised manner, please let me know.
  - If you notice that something I published is pointing at you, you can let me know. I am likely to rephrase or remove it.
- Exceptions where I don't respect your privacy
  - In a few rare cases, I may choose to escalate an interpersonal conflict with you to the point where I decide to publicly shame you on my website. I do not do this with every single conflict I end up in. You can see my document on this topic for more.
  - I do not respect the consent of people working at AI companies building superintelligent AI, or the major investors or intelligence companies supporting them. You can see my document on this topic for more.
  - In future, it is possible that I pick more enemy targets than just people building superintelligent AI.
  - In future, it is possible that I build general-purpose infrastructure that reduces people's privacy in general. For instance, I have considered working on a social media doxxing tool, or on drone surveillance.
- I encourage you to send me anonymous email if you think my privacy and security level is not sufficient to handle certain communication.
  - For certain high-stakes scenarios, my privacy and security level is genuinely not high enough to be handling communication for you. Consider contacting a lawyer or a journalist operating a SecureDrop server instead.

For my own information, here's the policy I follow:

- If revealing information about me also significantly reveals information about another person, then I use discretion. If I at all think they don't want this information shared publicly, I won't share it.
  - Please let me know if you think this is happening and you would like me to remove something.
- If revealing information about me does not reveal information about anyone else, then I reserve the right to share it.
- I am generally experimenting with being open about as many aspects of my life as possible, including social dark matter topics such as death, morality, criminality, sex, relationship conflicts, mental and physical health, money, religion and politics.
- The two main reasons I avoid sharing information about myself are respecting the privacy of people around me, and protecting my safety.

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/website_meta/use_ai_with_my_website.md 2025-06-02

# Use AI with my website

Disclaimer
- I still support a complete ban on AI R&D. Using AI features does not mean I'm pro-AI.

**Update (2025-06-02)**
- Arc Max, Safari and Chrome now support some of these features. If not, try finding a good browser extension.
- I now consider it a waste of my time to work on these features.
- **It's now obvious to me that in the long run, all these AI features are best implemented at the browser level.**
  - Individual websites shouldn't implement them. OSes shouldn't implement them. Browser extensions shouldn't implement them. Locally run software taking screenshots is not a good implementation as of 2025-06. (AI is smart enough to read screenshots, but this is unnecessarily expensive.)
  - Within 1-3 years, Safari and Chrome will likely support most of these features.

Here are some ways to use AI to improve your experience browsing this website. The broader thesis I'm betting on is that as long as the raw content is present in a folder, AI can do everything else with the content - simplify it, style it, translate it, do voiceover, etc.

#### Original

See: /raw/
Contains all the static files of my website, including markdown, images, audio and video.

See: /raw/text_english/
Contains all the text content of my website in markdown format.

#### Audio/Video

See: /raw/english_video_ai_generated_{date}
Generated using: OpenAI DALLE-3 images, ElevenLabs audio voiceover. This may have inaccuracies or be outdated. You can also use AI to generate this yourself.

Current
- Use the voiceover I've generated.
- Browser or browser extension that supports voiceover.
  - Update: Safari does this. Right click > Speech > Start speaking.

Future
- All browsers will natively support voiceover.

#### Search

See: /raw/misc/english_singlepage_{date}.md
I have dumped my entire website contents into a single page called "singlepage", so it is easier to copy-paste. I have updated my robots.txt but OpenAI still refuses to crawl my website.

Current
- Use the singlepage I've generated, and do ctrl+F.
- Use the singlepage I've generated, copy-paste it into an AI, and ask a question.
- Go to the singlepage I've generated, and ask a browser or browser extension to accept the page as a prompt.
  - Update: Arc Max does this.
- Some AI company will crawl the website client-side, so no singlepage needed.
  - Update: perplexity.ai does this.

Future
- Some AI company will crawl the website server-side, so no singlepage needed.
  - Legal liability might be a reason they're avoiding doing this. I'm unsure. Maybe they don't want to be held responsible for sending GET requests to IPs with bad reputations. Or they don't want to handle responses containing data of malicious intent.
- All browsers will crawl the website client-side, so no singlepage needed.

#### Styling

My website does not do any styling (font family, font size, line spacing, etc), so the styling used is your browser default.

Current
- Safari > Settings > Advanced > Stylesheet. Ask the latest AI (try OpenAI o3) to write a CSS stylesheet for you and insert it here. This changes the default styling for your entire browser.
- Install Greasemonkey or a similar browser extension. Ask the latest AI to write a CSS stylesheet for you, and a JS script to insert this stylesheet. This changes the styling only for specified websites.
- Copy the markdown files locally and view them in a text editor or markdown editor. Change the styling there.

Future
- All browsers will natively support AI-generated CSS.

#### Voice input

Current
- Manually copy-paste the "singlepage" into the OpenAI Realtime API playground, and ask it questions.

Future
- WebRTC client connecting to the OpenAI Realtime API (audio), with this website's content injected as context.

#### Language translation

Current
- Browser or browser extension that does page translation to other languages.
  - Arc Max does this.

Future
- All browsers will natively support language translation.

#### Summaries, simplified english

Current
- Browser or browser extension that does summaries and simplified english.
  - Safari and Arc Max do this.

Future
- All browsers will natively support summaries and simplification.

#### Full stack

See: /raw/misc/english_tarball_{date}
I have generated a tarball of the text content of my website.
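The scripts that generate the singlepage and the tarball aren't published here. As a hedged sketch only (file layout and function names are my assumptions, not the actual build scripts), both artifacts can be produced from the raw content folder like this:

```python
import tarfile
from pathlib import Path

def build_singlepage(root: Path, out_md: Path) -> None:
    """Concatenate every markdown file under root into one page, with file headers."""
    parts = [f"# File: {md}\n\n{md.read_text(encoding='utf-8')}"
             for md in sorted(root.rglob("*.md"))]
    out_md.write_text("\n\n".join(parts), encoding="utf-8")

def build_tarball(root: Path, out_tar: Path) -> None:
    """Pack the same content tree into a gzipped tarball."""
    with tarfile.open(out_tar, "w:gz") as tar:
        tar.add(root, arcname=root.name)
```

This reflects the broader thesis above: as long as the raw content sits in a folder, producing derived artifacts (singlepage for ctrl+F or AI prompts, tarball for pipelines) is a few lines of glue code.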
Current
- If you want to do something more complicated (such as doing many of the above steps together), you can download the tarball and then ask AI to write a bash pipeline for you that does all the necessary steps.

Future
- AI will directly crawl the whole website client-side.

#### No payments

I don't want to invest energy into supporting payments.

Current
- Some AI features not supported.

Future
- Apps won't need to process payments.
  - There would be a lot more AI apps if each app didn't need to separately collect payments from the end user.
  - Legal liability might be a reason they're avoiding doing this. I'm unsure. Maybe OpenAI wants an app dev who they can hold responsible for API inputs with malicious intent.

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/website_meta/taboo_words.md

# Taboo words on this website

Disclaimer - Quick Note 2025-03-29

I am sympathetic to the lesswrong social norm of tabooing words. Basically, you avoid the usage of words that are not defined in a clear way where everyone agrees on the definition. If you think someone is using a word without a clear definition agreed on by everyone, you can ask them to repeat their point without using this word.

Taboo words on this website:

- can, possible, because
  - On this website I will almost never say "X happened because Y". I will almost never say "X is possible" or "X is impossible".
  - Saying "X happened because Y" implies that in an alternate universe where Y did not happen, X would not have happened.
  - Especially when studying history, it is difficult to construct this alternate-universe thought experiment in a way where everyone agrees on what the thought experiment should be and what its outcome would be. (I accept that in theory this thought experiment works fine. Human brains don't violate a deterministic universe. It's just that in practice we don't know how to get the outcomes of these experiments.)
  - In practice, often multiple pre-conditions existed when an event happened.
    The combination of those pre-existing conditions is not simple boolean logic, "A and B and C => Y"; it's "f(A,B,C) => Y" for some poorly understood f.
  - I prefer the phrases "causally upstream" and "causally downstream" over the word "because".
  - There are different levels of "impossible". There are events that would have required people to have different ideas or imaginations in order to happen; there are events that merely required the right amount of funding or the right set of social bonds to form; there are events that required increased amounts of engineering resources; and there are events that would straight up violate the laws of physics if they happened. Unless an event would violate the laws of physics, I'm hesitant to say it is impossible. And even if it violates the laws of physics, I may be open to its possibility; human understanding of our current physical laws is less than a century old, and it could have exceptions.
  - If nothing is impossible, then saying "X is possible" conveys zero useful information, so I avoid saying it.
- less, more
  - I will almost never say "X is less" or "X is more". Scales are relative, and where exactly the middle of the scale gets defined to be is often a political question. I will almost always say "X is less than Y" or "X is more than Y". This applies to any adjective: X could be more beautiful, powerful, wealthy, empathetic, knowledgeable etc. than Y, but X will never just be more beautiful, powerful, wealthy, empathetic, knowledgeable etc., period.
- should, ought
  - I wrote a lengthy disclaimer on the is-ought distinction elsewhere on this website. In short, I believe the is-ought gap is real and really important. I will almost never tell someone they should value XYZ as a terminal goal in life. Instead, if I am asked to offer advice, I will tell them the potential consequences of various actions. Whether those consequences are good as per their values is something they'll have to decide.
- In general I'll be specific about who the subject of each sentence is, and make sure it points to individuals, not a group. Often people skip the subject and the sentence gets interpreted with "everyone" as the subject. I will almost never say "society should do/think/say X" or "the government should do/think/say X" or "everyone should do/think/say X"; I will generally be more specific about exactly which people in society or government I am referring to. # File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/website_meta/old_accounts.md 2025-07-02 # Old accounts I have accepted that everything on the internet is now permanent record. I don't see much harm in doxxing myself before anyone else doxxes me. older usernames I've used: acylhalide, ghosts_in_the_code some platforms I may have used: reddit, hackernews, stackoverflow, twitter, discord, lesswrong, EA forum, etc example crawls that might contain stuff: discord unveiled (all public discords), commoncrawl, internet archive, greaterwrong (lesswrong mirror), etc example: https://web.archive.org/web/20221215201458/https://www.lesswrong.com/users/acylhalide # File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/website_meta/value_differences_disclaimer.md 2024-12-24 # Value differences (Disclaimer) Your first impression after meeting me in person or via audio/video call may be noticeably different from what you'd expect if your only impression of me is from this website. Different contexts elicit different slices of one's personality. My website attempts to have a high information density of things not already obvious to you, and things you may disagree with. Whereas if we meet in person or on call, and don't know each other well, it may be better to find things we agree on first. Also, if I'm meeting you in person, I'm likely to put more effort into trying to understand your values and beliefs, as compared to preaching my own values and beliefs.
I want to understand you and be able to empathise with you even if I don't agree with you. I am still trying to figure out the optimal balance between sharing my own views and trying to understand other people's views. This is my current best attempt. My core values include truth and empathy. If you practice these in your daily life, we are more likely to get along, no matter what our specific values and beliefs may be. # File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/website_meta/why_jargon.md 2025-04-17 # Why jargon? Disclaimer - Quick note Some of my posts probably have too much jargon for the average reader of my website. If I have used a lot of jargon to explain something it usually means either I don't understand it well myself or I haven't prioritised communicating it to others. I generally believe you have understood something well if you can explain it in simple words. I use this website more to think aloud than to communicate to others, as of today. Inside my head I use a lot of abstraction to find generalisable patterns across situations. Also I lack formal background in many of the fields I read about, there may exist standard jargon that's equivalent to the non-standard jargon I invent. # File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/website_meta/comment_policy.md 2025-05-23 # Comment policy Any content allowed, including potentially objectionable content. DDOS attack not allowed. Anonymous comments allowed. Comments attached to real name allowed. No identity verification though, so adding your name to the comment does not prove anything. I can choose which comments to post publicly versus not. If you don't want your comment posted publicly, it's best you mention this in the comment itself. Since all comments are unverified, it may be difficult to prove later on who sent which comment. 
# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/website_meta/target_audience.md 2025-06-03 # Target audience - Target audience varies depending on the document. - Unless specified otherwise, the target audience is my future self and anyone else who shares a similar curiosity to mine. - Some documents start with a target audience declared explicitly. - Effectively persuading anyone else of my ideas takes a significantly different approach than just recording them for my future self. - If writing to my future self, crisp world models and terse vocabulary are generally a good idea. - If writing to persuade others, often sharing ideas and arguments is not enough. - A more effective method is to teach via "exposure" and "immersion", by providing them a large amount of input from groups that already accept my ideas as true. (This is similar to Krashen's ideas around language acquisition; I'm extrapolating it more broadly to cultural acquisition here.) - If writing to persuade others, I am probably running a simulation of a representative of the target audience in my head. - Humans do this by default, we imagine representatives of various groups and simulate them in our head. This is possible even if we've never met an actual representative of that group. This is very useful neurological machinery. It also contributes to prejudice, as one can end up forming strong conclusions based on incomplete information, and later updating these beliefs can be painful. # File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/advice_for_you/advice_disclaimer.md 2024-02-20 # Advice (Disclaimer) I am quite confused about which audience I should tailor this advice page, and generally my entire internet presence, to. I'm currently experimenting with trying to tailor to everyone simultaneously. I think maintaining multiple separate identities on the internet does not work well over a long time horizon.
I probably won't update these pages regularly, as giving advice is not my primary life goal. I generally hesitate to give advice. There is a lot of advice on the internet and a lot of it is noise to me. #### 1. Evidence behind the advice If I were the recipient of advice, here's the hierarchy of how seriously I'd take that advice: - advice empirically proven to work - the more meaningfully distinct circumstances it has worked in, the better - a theoretical model for why the advice works is a bonus, but sufficient empirical evidence always trumps theory - IMO there are entire fields where the replication crisis can be solved by making larger sample sizes the norm and small sample sizes abnormal (by getting the funding, culture, etc. to make this possible) - theoretical model of why some advice would work - the more distinct types of people are confident in the model, the better - empirical evidence about similar-but-not-same situations is a bonus Whenever you are taking advice from me, try to figure out where exactly on the hierarchy the advice sits. Even if something worked for me, please remember I am just one person. And if I don't have personal experience with it, please remember this is just one person's guess with no empirical evidence. #### 2. Context-dependence of advice I rarely ever tell people that they "must" or "should" do xyz. If I make this mistake please correct me. You have information about: - how your brain works - your likes, dislikes, what motivates you, etc. Conveying information about your feelings to me is difficult, even if you genuinely want to convey it. So I assume I lack this information about you. In particular there is an is-ought distinction where transmitting information is easy ("if you do X, Y will probably happen") and transmitting values is hard ("you should want Y to happen"). So I mostly talk about the former. - your context, your unique life circumstances.
Conveying this information to me is possible but takes a lot of time. So I may also be lacking this information about you. This doc is full of me saying "consider" doing xyz. If I use the word "consider", that means there may exist exceptions to the advice. Sometimes I may be aware of the exceptions but I'm not listing them out, and sometimes I may not be aware of the exceptions. The point is to use your own judgement. You are the master of your own life and I'm not interested in being your guru. # File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/advice_for_you/advice_for_you_knowledge.md 2025-05-20 To do for myself: - Not a priority for me right now - Rewrite page so it's simpler and more people can understand - Make it easier to directly navigate to the relevant section. This page is currently too long. # Advice for you (Knowledge class) If you don't understand anything on this page, please ask me about it. Please read [Advice Disclaimer](./advice_disclaimer.md) first. #### Summary - If you want an amazing life and not a mediocre one, one of the biggest mistakes you can make is blindly copy-pasting the values and beliefs of your family and friends because you want their approval. If you want an amazing life, be more deliberate about who you are learning from, whose decisions you are copying and why you are doing this. - A fast way to learn anything is to immerse yourself in an environment with other people who know how to do the thing. - This is also possible online if you can't meet people in person. I generally recommend spending a lot of time online. - Internet-sellable skills like writing, podcasting, game dev, software, etc. make more money than other skills. - My (weak) guess is that meaningful work and meaningful relationships will probably be the biggest determinants of your life satisfaction. Figure out for yourself what makes your life worthwhile, don't blindly trust my guesses.
- Don't blindly assume that money can help you, don't blindly assume it can't help you; it is better to think about the specific problems you are facing. - If you have < 50k INR, more money may fix some of your life problems. - If you have > 10 lakh INR, more money may or may not fix some of your life problems. - If you have > 10 Cr INR you can live in any country and any culture. Culture influences many things about who you are. - Make lots of notes. Publish information about yourself online. - If you have savings you plan to not touch for many years, invest them in the S&P500, not in real estate (exceptions exist). # Career advice ### If you have at least 6 months worth of savings - career advice Proof: I don't have personal experience with this. In India, 6 months worth of savings is around 50,000 INR. My advice is slightly (but hopefully not very) biased towards the Indian context because a lot of people I encounter are from here. It's possible there are unique challenges faced by people of your country that I'm unaware of. Consider finding a job that provides you with a lot more savings. Making more money may reduce your anxiety about the future. Making more money may allow you to make fewer compromises on your moral values. Also you might get more free time to devote to something or someone you care about. I understand you might be happy not trying to increase your income, and that is also fine. Don't let me ruin your happiness. Depending on how much free time your job gives you, you may also choose to start a side project or hobby around things you care about. This is great, although I would usually recommend starting it part-time and not quitting your job. (If you are very ambitious, you can also read the section after this section.) How? - Independent decision-making - The more ambitious you are, the more you have to make your own career decisions instead of blindly copying what your friends and family are asking you to do.
The decisions required to climb one step in society's status hierarchies may look smart to them, and the decisions required to climb five steps may look stupid to them, if none of them have climbed five steps or seen anyone else who has done it. For example if you're a shopkeeper it's easier to propose to your family that you'll buy a second shop than to propose that you will campaign for your town's MLA seat. How ambitious you are is a personal choice. I am not telling you how ambitious you should or should not be. - If they dislike your choices, sometimes you can simply explain to them why you think your decisions are better than the decisions they are proposing. If that fails, there are non-violent ways to resist other people trying to influence your career in a negative way. Whether you choose to use such methods is a personal decision and depends on the exact details of your situation. See my [Controversial ideas page](../my_ideas/controversial.md) for more. - Work ethic - I should maybe start by repeating all the advice that should already be obvious to you. If you want to work hard in order to increase your income: don't be addicted to substances, get good diet, sleep and exercise, don't gamble your money, etc. (If you don't want to work hard or make money, that is a different matter.) - Degree and credentials - If you can get a bank loan to pay for your college, you should probably take it. If someone is willing to loan you money for the same, maybe you should accept. But there are more details here which I can't explain in short. (This advice is India-specific. The cost-benefit analysis of acquiring a college education depends on which country you are in.) - A lot of the really good jobs do not require a degree but require skills. However entry-level and easier jobs have a lot of competition, hence employers use the degree as a filtering criterion.
It is easier to climb up if you have a degree, but don't let lack of a degree discourage you; your chance of success is still high if you work hard and smart. - If you already have some skills but lack credentials, upload short videos on youtube showcasing your skills. A good candidate is both competent and aligned. Youtube videos can quickly showcase that you are trustworthy, that you can function in a professional context, and that you are competent. Basically, you want to reduce the number of hours the employer or human resources (HR) department has to invest to find out if you are a good candidate. Every hour of additional work for HR is a cost, and credential-based filters (like degrees) are a way for HR to reduce that cost. If you lack credentials, you have to find other ways to quickly get their attention. - How to actually study - If I have to summarise this entire section in one word, that word is "exposure". Expose yourself to people doing the thing you want to learn to do. - Getting access to a laptop is extremely important if you want to acquire new skills, as most useful skills require at least some amount of self-studying, which is difficult to do on a phone or using books. If you can't buy a laptop or borrow it from a friend or family member, consider going to an internet cafe. - Try to find videos of people's day-to-day life doing the job you wish to acquire. There is a lot of "tacit knowledge" you will quickly pick up from such videos, which you won't get from textbooks. - For example if you want to be an architect, see a video of an architect while they're working. You'll get to know which features of their CAD software they actually use versus ignore, which parts of their work are original versus googled, and so on. A textbook that teaches you how to use the same CAD software won't teach you this. - A lot of the actual learning you can obtain by self-studying on your computer. However, making friends who know that skill is useful.
Some ways they may help: they will be able to tell you which resources are useful and which ones are not. If you go on the wrong track, they'll be able to correct you. They may also motivate you and support you emotionally. - Friends will also tell you things about the job beyond the knowledge itself. For example how much it pays, which companies and cities/countries have good opportunities, what is valued within the field, what people pay attention to and ignore in general, the social structure inside the job, etc. - You can also consider just observing them while they do their work. There is a lot of tacit knowledge that can be acquired this way. - Selecting a job to study for - This process is likely going to take some trial and error. It'll depend a bit on your interests, on your current educational background (what you actually know, not what degree you have), and on who you know and what skills they can help you learn. The first skill you try to learn isn't necessarily the skill you will finally end up making money from. - Figure out how much time you are willing to invest. Learning a skill like software or finance will pay more, but will require at least 1-2 years of full-time hard work. (If you're only studying part-time it'll take a lot more time.) Learning a skill like digital marketing or copywriting (well enough to get a job) may be doable in 6 months of full-time hard work, with the right guidance. - In general acquiring a skill that allows you to sell something over the internet is a good idea. For example: software, writing, marketing, research, podcasting, consulting, coaching, graphic design, composing music, game development, etc. An internet business can get 1000 or 1000000 times the paying customers without putting in 1000 or 1000000 times the work.
- If you are joining an internet business like the options listed above, it is common to see the following: - Earnings if successful: Founder > Employee at small company > Employee at big company - Time required to be successful: Founder > Employee at small company = Employee at big company - For a lot of people, your best option for a first job is early employee at a small company that sells services online. Expect to spend at least 3 months searching for such a job, after you have the requisite skills. - You can also found your own company. Expect to wait at least 5 years before you make any money. This is unless you have some special insight into why you'll do it faster than that. - In general I recommend picking a skill or job you have at least a little bit of genuine interest in. If you hate it, you might lose motivation to study for it and give up. It does not have to be your number one passion, but you should not hate it. You may have to self-study for multiple years till you get a job, so you need to know how to stay motivated for a long time. Even once you get the job, you don't want to end up stuck for many more years doing something you hate. - Try to experiment as much as possible (without spending money). It will help you learn what you are good at and what you like doing. I did this in college and benefitted a lot from it. I tried editing music, writing a book, writing rap, practising debate and public speaking, starting a college club, and a bunch of other stuff. Most of the experiments flopped but I learned about what motivates me. - I have basic software development skills, which I mostly self-studied. If making money is your goal, becoming a software developer is unambiguously a good option. You must be good at math and average at English to pursue it. (See your class 10 math and English marks to know if you're good; otherwise go and study the class 10 syllabus first.) You can ask me more if you want to learn software.
- Some random opportunities you can look into: - Join the French Foreign Legion, it's part of the army in France - no degree needed, but you have to pay for the flight yourself, and selection is not guaranteed (you have an above 50% chance if you take the opportunity seriously) - Teach English in Southeast Asia - any degree ok, get a TEFL certificate for $40 online, you can find a job only after reaching the destination country, contact me to know more. - Start a kitchen for a niche (less popular) cuisine. Many Indian cities and towns lack this. For example if you live in Kolkata, start an Odisha restaurant. If you live in Silchar, start a Manipuri restaurant. If you live in Delhi, start a Thai restaurant instead of a generic (indianised) Chinese one. A monopoly is good for business; 4-5 kitchens competing for the same customers is bad for business. I will update this section once I have more data or spend more time thinking about it. ### If you have at least 5 years worth of savings - career advice In India, 5 years worth of savings is around 10 lakh INR. The exact number varies a lot by country. I am assuming you have the skills and credentials to make this money yourself, i.e., you didn't inherit it or win it in a lottery or something. (If you are very ambitious, you can also read the section after this section.) Misc advice: - Consider hiring a secretary (college-educated, remote, English-speaking, digitally literate, full-time) to do some of your admin work like drafting emails. It is possible to hire a full-time remote secretary for $200/month. It probably doesn't matter which country you are living in, if you are willing to hire remotely. If your salary exceeds, say, $1000/month, there's a good chance it makes financial sense to hire a secretary and save your time for something else. #### Making another 5-30 years worth of savings might or might not be a good goal Proof: I don't have personal experience with trying to make 1 crore INR.
Consider getting the opinion of people who have tried this. This section is a bit country-specific, sorry. If one of your main life goals is to make > 1 crore INR and you believe this will make your life more happy/peaceful/meaningful/etc than it is today, I strongly recommend you try the following experiments. I don't think I'll be significantly happier with 1 crore, and my guess is there are many others whose brains are wired similarly to mine in this regard. Experiments to run to test if you *really* want 1 crore or not: - Make a list of things that you will be able to afford with more money that you can't afford today. Try renting a trial version of this today for one month, and see if this indeed makes you happy. Examples: bigger house, more international trips, high-end gaming PC etc. - Ask at least 5 people in your social circle who have >1 crore and <1 crore how happy they are. Find people who are self-aware and likely to be honest with you. Also try to notice why they are happier - is it more luxury, better partner, freedom from social circle, less pressure to get a job, etc. - Run the following thought experiment. Imagine you had a button that could switch off the part of your brain that generates anxiety when you pursue a higher risk career path. Imagine you had a similar button that could switch off the part of your brain that internalises social pressure from family and friends to make more money. What would be a good decision for this hypothetical person? - Take a few months off from your job. Make a list of things you'd like to do that don't require a lot of money, but require a lot of time. Then actually spend a few months doing these things. Notice how it feels. Examples: learning a language, completing college courses online, becoming better at video editing.
For me personally, things I don't consider major benefits of 1 crore INR, assuming one already has 10 lakh INR: - more status among my social circle, possibly better partner - house, bigger house, vehicle, bigger vehicle - more travel - more tasty food, higher-end clothes, home decor, other consumerist stuff - safety net for medical emergencies For me personally, things I consider major benefits of 1 crore INR: - going really deep into a hobby - For example one could set up a private research lab or a recording studio or a high-end kitchen. A hobby of this sort is capable of capturing my sustained attention for multiple years. - I have met only a few people in India doing this sort of thing with their money. If you are doing something cool, please [let me know](../contact_me.md). I'd love to learn more about you and what you're working on. - school education for (potential) children - But also like, consider homeschooling your kids, if your and your child's aptitude exceeds the school teachers'. 80% of my school education was a waste of time and money (no offense intended to my schoolteachers, most of them were decent people). Your kid can socialise with other kids elsewhere for free, and they may be eligible for a loan for college. If you still conclude at the end of these experiments that making this money will make you significantly happier, feel free to do as your heart desires. Ignore my advice and any social pressure I might be providing. #### Making $1 million might be a good goal Proof: I have some experience trying to make this amount and failing at it. I don't have personal experience with having this much money. Consider getting the opinion of people who do have this amount. For me personally, the main benefit of $1 million: - You can get an investor visa and live in any country in the world. You are no longer beholden to the laws and culture of a single nation, and you'll feel safer.
People of different cultures are likely to treat you differently, both in the average case (day-to-day experience, workplace, etc) and the worst case (during wartime, genocide, etc). - The main determinants of your quality of life, no matter where you live, will probably be your work and your handful of close relationships. Living in a different country is especially valuable if it will lead to better work or better close relationships. For me personally, not a benefit of $1 million: - FIRE (become Financially Independent and Retire Early) is not an interesting goal for me; most people who spend > 10 years to achieve FIRE seem to get bored at the end anyway. Travelling and relaxing 24x7x365 is not enjoyable for most people, you will probably have to start your own projects to keep yourself occupied. How? Most stable career paths in India take too long (>10 years) to make this amount of money. If this is your goal, you are much better off doing one of the following: - upskilling and getting years of experience (YoE) on your resume to the point where you can get a job from a developed country, either in-person or remote. This could involve getting a foreign degree, or an Indian job and an in-company transfer - I don't have personal experience with this so ask people who do. I know this is quite common. - starting your own startup or business - I have a lot of resources on this, but here's not the place to share them. Read Paul Graham's blog, even if you are not founding a tech startup. - cracking UPSC - I generally don't recommend UPSC to most people, as most UPSC aspirants can get a higher rate of return on their time by studying a skill like software or writing or research or finance - Consider studying like 1/4th of the syllabus and attempting previous years' exam papers on just that portion. From your marks you'll get a rough idea whether you're on track to crack the exam or not. Pursuing any of these requires you to quit your current job in India and not worry too much about leaving the "low risk" path.
There are a few other edge cases, for example you could marry someone with money, or network with them and become their consultant, or become a sex worker or something. I won't discuss them much here. #### Not making money might be a good goal Proof: I have over a year of personal experience with this. The important thing is to measure your money not in number of lakhs but in the number of years it buys you. Assume a frugal lifestyle and write down this number of years for you personally. If this number exceeds 5 years, you have a good reason to not focus on making more money. Personally I realised I would be miserable trying to start a tech startup with the sole goal of making $1 million, hence I gave up on this goal. (I'd probably be happier once I had the money. I'm talking here only about the process of acquiring it.) I'd much rather work on projects I care about. Figuring out which projects are worth doing takes time. I am still in the process of figuring this out for myself, but I'm confident I'm better off this way compared to when I tried to make money. I might still take a job at some point, but at least I'm clear the primary reason I chose this was not money but instead learning or social connections or something else. This is a huge privilege and I'm grateful to have it. I don't recommend my current life path to everyone, but I recommend at least considering it as an option. ### If you have $1 million - career advice Proof: I don't have $1 million. Consider getting opinions from those who do. Consider reading my advice on the other advice page. If you are sufficiently ambitious, I would generally recommend considering not living in your home country. There are probably better opportunities available outside of it. The main exception I can think of is if you have already built a lot of country-specific knowledge that gives you an edge in business or politics in your country. Business cultures vary *a lot* depending on country, based on my limited experience.
Consider experiencing business cultures in multiple countries, no matter where you finally settle. #### Becoming a major politician or a billionaire might be a good goal Proof: I don't know much about this. Consider asking someone who does. If you want significant influence over society, this is an option. You will have to work very hard in order to acquire the required attention or capital. Also you will have to learn some of the behaviours of a socioeconomic class different from yours. You will definitely lose some of your current family and friends in the process, so it will be socially isolating. Also there may be an aspect of luck; many self-made billionaires admit to the role luck played in their success. (Although most people who do report luck as a factor report odds like 10% or 30%, not something unlikely like 1%. So don't let the luck factor dissuade you.) I don't have a lot to say because it isn't something I know enough about yet. But I am interested in studying this, whether or not I ever seriously pursue this myself. # Advice apart from career advice ### Advice for everyone - Make notes Proof: I make lots of notes and it benefits me a lot. I have also observed this in a few people around me. For almost everyone on Earth, I recommend making more notes about your life. Why? - You will likely gain a deeper understanding of various aspects of your life. - It could be fundamental things like what your likes, dislikes, values, etc. are. - It could be practical things like how to dress or what to eat or how to write good emails. - You will retain thoughts that are otherwise temporary and easily forgotten. - It'll help you make long-term plans. - You'll notice patterns over long periods of time, such as problems that have repeated themselves multiple times, and solutions that have worked multiple times. How? - Pen/paper or digital works. I prefer digital because it's easier to edit and easier to preserve.
(Yes, paper survives 100+ years and a hard disk survives only ~5 years, but you can make copies of digital work more easily and preserve the copies.) Preserving long-term is very important; you want to be able to preserve your notes for 30+ years. - I've never made paper notes. If you're making paper notes, it is best you get advice from someone who does this. If I remember correctly, the state of the art is making a backup using acid-free paper with archival ink and storing the backup in a secure location with low humidity. (The paper is more important than the ink.) - If you're making digital notes, the easiest option is google drive or apple notes or something similar. - If you don't have or want to use KYC/phone number/digital ID, protonmail/protondrive is a good option as of 2024-12. Most websites (like Google, Apple, etc) force you to enter a phone number which is tied to KYC in many countries. - Also it is a good idea to keep a backup on a disk in a secure physical location. Check the disk for errors once a year and swap it every few years; most disks fail in ~5 years. - If you know basic software development, I recommend you rent a linux machine, ssh into it and store plaintext/markdown notes there instead. File formats, folder structure, account login methods, payment methods, UI/UX and lots of other things change every few years. Many of these rapid changes are causally downstream of the war between Big Tech companies, and it's better if your second brain isn't stored in some random file format you don't understand. - Some people like the fancy features of Roam, Obsidian, hackmd and so on. I've personally not found any of the features worth sacrificing the ability to preserve your notes 30+ years. If any of these tools support import/export in plaintext (NOT json, csv, rtf, docx or anything else), you could consider using them while also maintaining a plaintext backup outside of the tool.
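To make the plaintext-backup advice above concrete, here is a minimal shell sketch. The `notes` and `backup_disk` paths are placeholders invented for this example (the latter standing in for a mounted external disk), not names used elsewhere on this site.

```shell
# Placeholder setup for demonstration: a notes folder with one file.
mkdir -p notes backup_disk
printf '%s\n' '# journal' 'switched to plaintext notes' > notes/journal.md

# Mirror the whole notes folder onto the backup disk
# (-a preserves timestamps, so stale backups are easy to spot):
cp -a notes backup_disk/

# Also keep a dated, compressed snapshot; swap the physical disk
# every few years, since most disks fail in ~5 years.
tar -czf "backup_disk/notes-$(date +%Y-%m-%d).tar.gz" notes
```

The point of plaintext plus `tar` is that the backup stays readable decades from now, without depending on any one app's file format surviving.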
- Beyond this, how much security you need is up to you, as additional security takes a lot more effort. Most people don't critically "need" more security, although it would be nice to live in a world where higher security could be obtained for little effort. Make sure to define your threat model. If your threat model is defending against targeted govt investigation and intelligence agencies, you could look into the security recommendations of SecureDrop or generally those found on the dark web. Some specific examples: LUKS disk encryption, an anonymous server host paid for anonymously (or better, host your own server in a secure location), separate devices and locations to log in, PGP-encrypted ASCII paper backups along with disk backups, a cronjob or custom dead man's switch, and studying enough network programming that you actually understand the network vulnerabilities of your system. An airgapped machine in a faraday cage is the most secure setup, but most people want network access.
  - If you're using an NSA-proof security setup like this, maybe consider publishing the details so the rest of us can converge on best practices. (I understand that people who use maximally secure setups are the same set of people who will not talk about their secure setups. That's why I'm nudging you a little bit here.)
  - Also remember that if you're operating this sort of setup, 50% of your opsec is the people you surround yourself with, not cybersecurity stuff. Spend as much effort studying that as you spend studying cybersecurity.

### Advice for everyone - Learn from the internet

Proof: I have learned as much from the internet as from my entire formal education.

Why?
- You can improve nearly every aspect of your life by learning from the internet.

How?
- As of 2024, most google search results are SEO-optimised crap. (The website author deliberately wants your search to hit their page so they can serve you ads, and they don't care whether they actually answer your question or not.)
- Popular platforms worth using as of 2024
  - Reddit - honest recommendations, life perspectives, technical Qs. Use "site:reddit.com" in your search query to get responses from reddit. Reddit users are not being paid to answer your query, and are more likely to be honest.
  - Libgen - free books
  - scihub - free papers
  - Youtube - podcasts, life perspectives
  - Stackexchange - very technical Qs, academia-heavy audience
  - Twitter - I hate the forced login and politics, but it's good for following the latest news in any community
  - Tiktok - yes lol, you can even get the latest news and technical tips on tiktok
  - discord - fastest way to audio/video call strangers; you can also follow open source (usually tech) projects
  - hackernews - basically the number one forum for software developers
  - lesswrong - niche forum I follow. In general, whatever niche interests you have, there's probably a very small number of places on the internet that hold its userbase. Find these places and follow them.
- As of 2024, attention on the internet follows a power-law distribution. A very small number of websites have almost all the attention of everyone on Earth. If you want answers to your queries, you have to learn to post your queries on these platforms.
- LLMs like GPT-4 (ChatGPT) are also good for receiving advice, and they keep getting better.
- Search even things you'd not think about searching. Every question you can imagine can have answers online. For example: how to become better at running, how to keep your house heated, how to sleep better, how to win arguments, etc.
- Learn to get good at filtering information.
  - Most of the people around you in real life are a biased sample; they represent a very small portion of humanity. They are selected based on nationality, class, profession, religion, style of thinking, etc. Most of the people on the internet are also a biased sample. Different websites filter for different types of people.
(And yes, please remember that people on the internet are people, with an entire life outside of the handful of their comments you read.)
  - For example, chemistry stackexchange will predominantly attract people with a chemistry degree, predominantly from the US/Europe. Whereas a subreddit about your country will attract the upper and upper-middle class of your country, all degrees, and likely only one political side. (The people of the other political sides were probably annoyed and went to a different subreddit or website instead.)
  - [Social dark matter](https://homosabiens.substack.com/p/social-dark-matter) is another very important idea to internalise here. Almost everyone has things they will avoid talking about or be dishonest about, especially on the internet, because of their irl situation. You will get a skewed picture of reality if you trust their internet representation to completely capture the irl version of them.

### Advice for (almost) everyone - Make a website

Proof: I have previously benefitted from this in terms of my career, for instance when working in the cryptocurrency space. Even more recently (as of 2025-05) I benefit because putting content out publicly forces me to think clearly and keeps me motivated.

- For almost everyone, I recommend making your own website.
  - How much and what types of content you post on the website is a personal decision. However, having a website will remind you that you have this option.

Why?
- You can benefit in terms of your career and relationships. Other people may connect with you through it.
- You can become a clearer thinker. Putting content out in public forces you to be clear about what you are writing and thinking. It can also act as an external accountability and motivation mechanism: if you have publicly committed to doing something, you are more likely to go through with doing it.

How?
- As of 2025, if you lack software skills, it's easiest to host written and video content on substack and youtube respectively.
I'd still recommend making at least a one-page website that links to your substack and youtube. You can learn to use github pages or link.bio or something. If your website gets enough users (or you can afford it), you can pay someone to build a website for you.
- Check whether your website allows or disallows scrapers (using robots.txt).
  - For instance, substack allows scrapers as of 2025-05 but reddit disallows them. If your site requires people to "follow" you to view your content, it is safe to assume scrapers are disallowed.
  - Legal scrapes are posted by orgs like internet archive and common crawl.
  - Scraping being disallowed doesn't mean scraping won't happen, just that your content will end up in illegal scrapes, which may be slightly harder to find.
- If you have non-zero software skills, here's a simplified version of the setup I use. Rent the cheapest Hetzner Cloud server you can find. Store your plaintext and other static files in `/var/www/Website`. Store the config file below in `/etc/nginx/sites-available/yourwebsitenamegoeshere.conf` and symlink it into `/etc/nginx/sites-enabled/`. Delete the default website and config files found in these folders. Then `sudo apt update && sudo apt upgrade -y && sudo apt install nginx -y && sudo systemctl start nginx`. Whenever you modify the nginx config, run `sudo systemctl reload nginx` (changes to the static files themselves are picked up without a reload). Purchase a domain from cloudflare or anywhere else (cloudflare does free DDOS protection, at the cost of cloudflare seeing your visitors' IP addresses), and set `A` records to your server's IPv4 address. You can also use AI to write a more elaborate nginx conf and write CSS files for you.

```
server {
    listen 80;
    server_name yourwebsitenamegoeshere.com www.yourwebsitenamegoeshere.com;
    root /var/www/Website/;
    location / {
        autoindex on;
        try_files $uri $uri/ =404;
    }
}
```

### Advice for people who publish content online

Proof: Some of my older data is effectively inaccessible because of incorrect file formats. I regret this.

Please consider keeping plaintext (txt, not json, docx or whatever other format) backups of your work.

Why?
- Your favourite publishing platform (be it substack or wordpress or whatever) is not guaranteed to exist in 5 years, and if it does, it might look like a different company altogether.
  - Payment policy might change.
  - KYC / identification policy might change.
  - The business model or revenue stream might change, which changes everything about a company. (Even Big Tech companies are not immune: Apple is the only one whose revenue is not ad-based, LLMs might change the ad revenue landscape, etc.)
  - The company might be acquired, lose talent, etc. Look up the lifespan of the median tech company.
  - Your account might get censored or hacked, you might lose access, etc.
- Popular file formats might change in 5 years. Plaintext lasts forever.

### Using AI to accelerate learning

Proof: I use AI a lot while learning anything.

Why?
- AI can help you learn anything faster.

How?
- It is best if you know basic software developer skills and can write simple bash scripts. (If you're on windows, please rent a linux machine. Don't learn the windows commandline, it's a waste of time.)
- If you don't have software skills, you can probably still find websites with ready-made tools to do most of the following for you. The ready-made tools may have some disadvantages compared to writing your own (such as rate limits, ads, higher cost, etc.), but they do exist.

Some specific tricks using AI to speed up learning:
- Transcribe youtube videos: yt-dlp + ffmpeg silenceremove + whisper API
- Extract plaintext from any webpage: curl + htmlq
- Extract plaintext from epubs: unzip + find + htmlq
- Once you've extracted plaintext, you can use AI to generate summaries and notes, simplify complex terminology, ask specific questions, generate quizzes or anki decks, etc.
- Ideally you should be able to embedding-search everything. As of 2025-01-14 I'm not aware of any embedding search solution where you can "bring your own plaintext" and it'll do embedding generation and inference. But I'm expecting this will soon exist.
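If you'd rather not install htmlq, a rough version of the webpage-to-plaintext step can be done with Python's standard library alone. A minimal sketch (the class and function names are my own, and real pages often need more cleanup than this):

```python
# Strip HTML tags and return the visible text, using only the standard library.
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    SKIP = {"script", "style"}  # tags whose contents are not visible text

    def __init__(self):
        super().__init__()
        self.parts = []     # collected visible text fragments
        self.skipping = 0   # depth inside <script>/<style> tags

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skipping += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skipping:
            self.skipping -= 1

    def handle_data(self, data):
        if not self.skipping and data.strip():
            self.parts.append(data.strip())

def html_to_text(html):
    """Return the visible text of an HTML document, one fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    return "\n".join(parser.parts)
```

You could feed it the output of `curl` (or `urllib.request`) and then pass the resulting plaintext to an LLM for summaries or quizzes.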
### Trading and investment advice

Proof: I have personal experience. I have invested and traded my own money, and worked at trading-related companies.

Why?
- The most obvious reason is making money. Knowledge of S&P500 and the basics of finance are among the most valuable pieces of knowledge on earth. You might be losing a significant amount of potential earnings if you choose not to invest your money and leave it in your bank account instead.
- IMO studying finance is somewhat useful even if you're not investing your own money. Looking at the world through a lens of probability and linear optimisation is a skill that many finance/econ people have and many non-finance people could benefit from.

How?
- Investments and trading will likely not be the primary determinant of your life trajectory, unless you make them your full-time job. Increasing your income or revenue is more useful than trading/investing your existing money.
  - If you are very ambitious, there are probably some projects you actually believe in that you can make concentrated bets on - financially and otherwise. But for most other people, it's worth knowing the investment basics and making a handful of investments in the background.
- If your savings cover less than 3 months of your living expenses (and this isn't expected to increase over the next 5 years), most of this information is not relevant to you. I'd recommend keeping your money safe in your bank account for emergencies, and not trading or investing it. You can still acquire some of the knowledge without putting in any money.

#### Investments

Proof: Just look at the historical price charts of all the assets you're buying, and study basic financial concepts like expected value, variance and tail risk.

- I have used IBKR to buy S&P500 as a long-term investment (>5 years hold). I may or may not also own some BTC and ETH.

How?
- S&P500 is the single most important investment you should understand deeply.
It is more important to understand the basics of S&P500 than to understand gold or real estate or cryptocurrency or bonds or interest rates or anything else. This is true even if you ultimately decide not to invest in it yourself.
- If you are considering buying S&P500: make sure to look at the 100-year historical chart of S&P500. Consider studying basic concepts of finance like probability, expected value, variance, tail risk and timeframes. This knowledge is not mandatory, but it'll prevent you from buying too much or too little, or from selling your investment too early.

#### Trading

How?
- It is possible to self-study trading from the internet. However, trading knowledge has an uncanny valley, where someone with a medium amount of knowledge and experience will tend to lose more money than someone with none. You have to cross this valley to become an expert. If you practice trading for multiple years you'll probably be profitable eventually, but consider whether the time invested is worth it. Long-term passive investment is good when you don't have a lot of time to devote to trading.
- If you'd like to learn trading, practice with a small amount first, but take your decision-making process as seriously as if you had invested a significant amount. Don't bother with paper trading, just use a small amount of real money. Make sure to list multiple possible outcomes of each trade, their probabilities, the underlying hypotheses behind your trade and what could invalidate them.
- "Technical analysis" is mostly a scam IMO; ignore it. Fundamentals-based trading is fine. Using an algo to predict trends in other people's trades is fine. Small cap is usually better than big cap for trading due to lower efficiency. A shorter timeframe (seconds, minutes) is usually better than a longer one (weeks, months) if you're writing a bot, and vice versa if you're not.
- Being successful at options trading is harder than it looks. I've lost money almost every time I've done it, and I don't personally recommend it unless you can devote at least a year full-time just to options. This also applies to any unusual derivative whose theoretical valuation model looks at all similar to Black-Scholes. (Binary options are one example; there are mobile apps that try to get you addicted to them.)

### If you are a software developer

Proof: I have personal experience as a software dev.

Every development in how information transmits - be it the printing press, semaphore, radio, television, telephone, and now smartphones and the internet - has brought significant changes to how war and politics are practised. Consider studying the political implications of your own work. I only recommend studying this if you genuinely care or are genuinely curious. You might have some influence on where your work finally ends up being used.
- No matter which country or company you're in, the knowledge you produce as a developer will likely eventually transfer to the Big Tech companies. This includes both the technical knowledge and the business knowledge (how you onboarded and retained your users).
  - This could be via a chain of acquisitions, via people leaving your company to start another, or via blogs and github repos.
  - The fraction of Earth's population using Big Tech services has been going strictly upward for 20+ years, and Big Tech getting more ways to onboard and keep users is a major reason for this.

### If you want to learn software dev

Why?
- As of 2024, software development is one of the highest-paying professions that you can enter just by self-studying for 1-2 years on the internet. This is because a software business can serve millions of customers while investing very little money. Most other industries either can't serve millions of customers, or require significant initial investment to do so.
`Net profit = number of customers * (revenue per customer - cost per customer) - initial investment`

How?
- Self-studying software often means climbing a learning curve that goes vertically upward. You might spend a lot of time and effort feeling like you are not learning much, only to suddenly realise at the end that you have learned a lot. Digging deep into one concept might require you to learn two more concepts, which might require you to learn two more, and only then does even the first concept start making sense. Being psychologically capable of putting in effort even when you don't see immediate results is an important skill for learning software development.
- There are broadly 4 pieces of knowledge that are useful to you: basics of programming; linux, compiler and IDE knowledge; specific tech stacks for specific types of problems; and data structures and algorithms.
- w3schools has good tutorials for learning software development.
  - Start with their tutorial on Python. Python is a good language to start with. After learning some python, you can learn C. Don't make javascript your first language.
  - Ask a friend who knows software to install the tools for you, such as compilers/interpreters, libraries, etc. Installing things is not beginner-friendly and you should not spend time on it as a beginner. Having a friend who knows software dev is also useful to ensure you're not on the wrong path.
  - Initial concepts worth learning are: how to do standard input/output, data types, arithmetic, loops, functions. At first, ignore concepts that are not in this list, and become experienced with these first.
  - Write code in a simple text editor like Notepad, Notepad++, Sublime Text, etc. No need for a fancy IDE. An online code editor is okay, but also learn how to do things on your own machine.
  - Also learn some linux commands such as cp, mkdir, rm, mv, etc. Sign up to a platform like aws ec2 or gcp to get access to linux machines.
You might get free credits.
- Once you have done all of this, try learning a language for a specific task. For example, ReactJS for frontend web development or Java for mobile app development or whatever. It is possible to jump directly into building a website without knowing the fundamentals, but you will eventually end up going back to the fundamentals anyway.
- You might also want to do a course on data structures and algorithms at some point. If you enjoy data structures and algorithms, you can also do this before you learn a specific tech stack such as website or app development.
- Do whatever keeps you motivated. It's better to learn things the wrong way than to give up and not learn.

### If you want to build a social media following

Proof: I have not run a social media channel with significant attention. Go get advice from someone who has.

This section is especially relevant to professions such as journalists or online therapists, as they are more financially dependent on building a large following among the general public.

A useful mental model: there are 3 types of leverage - capital, attention, and products that can be copied for free on the internet. You are either competing for people's attention, or you are competing to build products that are copied for free on the internet (i.e. blogs, videos) by many people.

You should decide whether the goal of your channel is to:
1. acquire lots of people's attention (which you can later use towards some goal), OR
2. tell people a useful way of thinking/listening/speaking/acting/etc.

Both of these are interrelated; usually doing 2 also gets you 1 as a byproduct. But never lose track of the fact that these two are different.

A direct parallel to this is starting a business. A business can produce capital for you (which you can later use towards some goal) and provide useful products to other people. Doing 2 can get you 1, but you should always be aware of the difference between the two.
If you want to build a social media following, there are IMO three routes:
1. Enter the competition to be a top 10 / top 100 YouTuber on earth. If you win, you can basically form the government of your country yourself.
2. Pick a niche and become a top 3 YouTuber on earth in that niche. This is profitable enough to survive on.
3. Go work for someone doing 1 or 2. This way you won't be the public face that earns trust or attention, but you'll get income.

Attention on the internet is power-law distributed. Each person has no more than ~10 channels they are paying attention to at any given moment. (They might follow >100, but they are not paying enough attention to the rest for those channels to affect their life.) If you are not in anyone's top 10 follows, you effectively don't exist. "Better to be loved by few than liked by many" is a general rule of thumb for acquiring attention on the internet.

I recommend youtube over other channels because you can earn people's trust by showing your face on video.

In order to become one of the top 3 social media channels on Earth in any niche, you'll have to run multiple iteration cycles where you see zero reward. You'll have to manage your psychology as you get zero reward for a long time. Make sure to get direct feedback by asking people (via video call or in-person meeting, not just through the comment section) what they would or would not like to see in your videos.

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/advice_for_you/advice_for_you_india_knowledge.md 2025-03-03

## Trading and investment advice

Read the section on the other advice page.

Real estate underperforms S&P500 by a lot unless you have insider knowledge. Buying real estate as an investment (not for living in it) is the number one financial mistake I see people make in India. NIFTY is also a decent option if you can't or don't want to open a foreign account to buy S&P500; the difference between NIFTY and S&P500 performance is not large, if I remember correctly.
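To make "underperforms by a lot" concrete, here's a minimal sketch of how much a few percentage points of annual return matter once they compound over decades. The rates and amounts below are illustrative assumptions, not predictions of any asset's returns:

```python
# Final value of a lump sum compounding at a fixed annual rate.
def compound(principal, annual_rate, years):
    """principal grown at annual_rate (e.g. 0.08 for 8%) for the given years."""
    return principal * (1 + annual_rate) ** years

# Illustrative only: a low-yield asset at 3%/year vs an index fund at 8%/year.
low_yield = compound(10_000, 0.03, 30)
index_fund = compound(10_000, 0.08, 30)
```

At these assumed rates, the same 10,000 grows to roughly 24,000 in the low-yield asset versus roughly 100,000 in the index fund over 30 years. That gap, compounded over decades, is most of the argument for understanding index investing before picking other assets.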
# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/advice_for_you/controversial_india.md 2025-06-19

To do for self
- Shorten this document. Many of the core ideas here are repetitive and can be expressed more concisely.
- Consider using more "show not tell" and "exposure". This document seems at least partly written to my future self, not to persuade others.

#### Update (2025-06-19)

I no longer endorse significant portions of this document. I'm unsure about either free markets or representative democracy being compatible with a technologically advanced future. I will update this document once I have more clarity.

# Controversial topics (India)

If you are not from India, this post is likely less valuable to you than some of my other posts.

Please consider first reading the page on [Disclaimer on value differences](../value_differences_disclaimer.md).

## Summary

- free speech
  - As per my values, the following world is better than the current world: all places on Earth support free speech.
  - Some topics are considered sensitive in India and my socioeconomic class. All my opinions on these topics are in this post.
  - I am not trying to start a social movement on any of these topics. I am pragmatic about how much risk I would like to expose myself to with respect to the listed topics, as using my website to spread my values is not my top priority in life.
- politics
  - As per my values, the following world is better than the current world: India has a strong rule of law. I morally support people who protect the safety of people whose values/beliefs they disagree with.
  - As per my values, the following world is better than the current world: India passes a law protecting the right to bear arms. I morally support people who arm themselves even in the absence of such a law.
  - I have low trust in Indian political authorities across all branches of govt. I avoid publicly supporting any political party in India.
- individual
  - I am highly individualist.
I do not let anyone else control important decisions in my life, and am willing to go to extreme lengths to protect this freedom.
  - I am deliberate about what I am learning from whom and why. I don't copy-paste the values/beliefs of my friends and family.
  - I am highly cosmopolitan.
- religion
  - I am atheist. I wish more people were atheist like me, but I can peacefully coexist with religious people.
- economic
  - I am weakly optimistic on Indian economic growth.
  - I think Indian culture has historically been too sympathetic to economically left-leaning ideology, and this is directly responsible for the vast majority of suffering in India today.

A world that is "better" is still not optimal. I don't see democracy or free markets or nationalism or liberalism or any other existing ideology or political institution as the optimal ideology or institution for all of humanity until the end of time. There may exist even better worlds that require very different political systems from the ones suggested here.

## Main

## Freedom of speech in India

I live in India, upper-middle socioeconomic class (college-educated, net worth $20k-$100k). This might place certain restrictions on what I can say and how I say it.

Some topics that are considered sensitive:
- Major reason: I've made some promises of privacy to some people (either explicitly or implicitly)
  - criticism of people in my immediate social circle
- Major reason: I'd like to stay safe physically.
  - criticism of any religion or religious leaders, including related issues like dating, caste, etc.
  - criticism of any political party or politicians in India
  - criticism of India
  - ???

All my opinions on these topics are within this one post, and I will avoid mentioning them in other posts. I generally support freedom of speech as a cultural norm and a legally protected right. (I might use pseudonyms to talk about these topics anonymously. I understand that someone putting in enough effort can probably doxx them anyway.)
## No movement pls

Social change happens not when 10% of people want change, but when 10% of people know that everyone in that 10% wants change. Establishing "common knowledge" is a major part of it. See also: [Blue Eyes Puzzle](https://xkcd.com/blue_eyes.html), [Scott Alexander's political posts](https://slatestarcodex.com/about/), [Srdja Popovic's book](https://www.amazon.com/Blueprint-Revolution-Nonviolent-Techniques-Communities/dp/0812995309#customerReviews)

**I am NOT interested in acting as a leader for social change around any topic mentioned on this page.**

Goals of this website:
- Find high quality research collaborators, receive high quality feedback
- Spread knowledge

NOT goals of this website:
- Establish "common knowledge", start a political movement

## Rule of law in India

Protecting your safety, belongings and relationships requires living in a culture that allows you to protect these things. Law enforcers are usually members of your culture and ultimately accountable to members of your culture. It is not practical for law enforcement to protect you if most members of your culture don't feel you deserve to be protected.

Physical safety is the second level of Maslow's hierarchy, after having enough food and water. In today's world (and particularly in India), it is rare to find someone dying of thirst or starvation. But it is not rare to find someone who feels unsafe.

Fear begets fear. If you are afraid of people with a different value system from yours, you're less likely to advocate for a rule of law that protects them. They will then use extra-legal means to protect themselves. Now both you and they are less safe.

People who feel less safe can be less empathetic, less honest, less trusting and more isolationist. If many people collectively feel unsafe, this has impacts on nearly every aspect of culture. These damaging effects can persist for multiple generations, as they are transmitted from parents to children.
In particular, if you are benefitting in any way from making others feel less safe, please consider whether your benefits are worth inflicting this sort of damage on another person. The damage can persist long after you die; the benefits may not.

Please consider protecting people in your daily life from harm irrespective of their value system. Please consider advocating for a rule of law that protects everyone. This way everyone can peacefully coexist despite their differences.

## Right to bear arms in India

I support passing a law that gives all citizens the right to own guns, with no exceptions. In the absence of such a law, I am generally sympathetic to individuals and groups that illegally arm themselves for defensive purposes. I support both majority and minority groups that arm themselves.

I believe the truth about any matter is more likely to come out when opinions besides the consensus opinion can also be expressed, and this requires protecting the safety of those who express such opinions. I think guns inherently favour defence, as they increase the cost (both financial and moral) a majority group needs to pay to suppress the opinion of a minority group.

## Trust in authority in India

I generally have low trust in authority, at least within a representative democratic setup such as India's. I think many Indians are not surprised when leaders of the executive branch or any political party prioritise their self-interest above what's good for citizens. However, Indian military chiefs-of-staff and leaders of Indian intelligence (RAW, IB) enjoy a huge amount of trust from the Indian public, which might be undeserved.
- Political groups in India often use their own private militias to perpetrate communal riots. It is common for the IB to avoid actively entering such conflicts, allowing existing groups to succeed at their goals. RAW and IB are run by IPS officers.
Indian bureaucrats in general face an incentive structure that trains them over their lifetime to be risk-averse with their career.

Similarly, I think the Indian public on average has a fairly high level of trust in Supreme Court judges and heads of the Election Commission, and I have a lower level of trust in them.

## Indian politics

I mostly avoid publicly supporting any political party in India.
- I am supportive of the many long-term structural and cultural changes to Indian politics proposed in this document. I think implementing them will lead to bigger changes in outcomes than overturning the result of any one election from BJP to Congress or vice versa.
- There may be safety risks to discussing Indian politics online.
- Discussing or influencing Indian politics is not my number one life priority, so I am more pragmatic about what risks I am willing or unwilling to undertake. If it were my number one priority, I'd probably be finding a way to bypass the safety risks.
- I don't think India on its current trajectory is likely to be an important player globally. (See the section on economic growth for more.) This partly informs my reasoning for why involving myself in Indian politics is not my top priority.

## Individualism

I am quite individualist. I take my life decisions myself, and I want to ensure I'm the primary person responsible for the outcomes of my life.

IMO some people, but not necessarily all people, could benefit from being more individualist. How you want to live your life is a personal decision. I am not recommending everyone copy me.

I am aware that belonging to the upper-middle class is a significant causal factor in this choice of mine. I am not significantly financially dependent on anyone, no one is significantly financially dependent on me, and bad choices taken by any one person in my social circle (including me) don't significantly financially affect other people in the circle.

Why? Some important ideas that I believe in are as follows.

1.
Almost nobody likes being controlled by another person.
   - This could be a parent or sibling or spouse or employer or political leader or anyone else. Voluntarily following someone almost always feels better than following them under threat of violence or starvation.
   - Ideally you can negotiate boundaries and agreements with the people in your current social circle in a mutually beneficial way, be it when deciding your actions or when expanding or narrowing your circle.
     - I'm strongly in favour of consensual social connections over non-consensual ones.
     - I'm also generally in favour of making long-term commitments, not only short-term ones. Consensual agreements don't necessarily mean no commitments or responsibilities.
   - However, sometimes people in your existing social circle might restrict you and seek control over you. If you are currently under someone else's control, and you would prefer to break out for any reason, this might be possible. See below for how.
2. There are 8 billion people on this planet. The probability that your family and friends are among the wisest / most knowledgeable / happiest / kindest / etc. people out of these 8 billion is approximately zero.
   - You can still selectively learn good things from your family and friends, but it'll help to also consider learning from others.
   - You can increase the odds of having an awesome life by deliberately choosing what you learn and from whom. Your source of inspiration and learning could be a podcaster or religious leader or politician or filmstar or anyone really.
   - Copying other people's decisions tends to get you life outcomes similar to those of the people you're copying. Make sure to track what outcomes your role models got in their own lives.
   - If you are not satisfied with your role models' life outcomes and want something better (or even just different) for your life, you must learn to think independently and act independently.
- Becoming an independent thinker starts with exposing yourself to books, people and situations that others in your social circle are not being exposed to. Your brain is a computer; its output depends on its input.
- I have gained a tremendous amount of value in my life doing this, arguably more than from my entire formal education (although it's kinda hard to compare, as they both support each other).

3. Approval / disapproval of you by your social circle is one of the largest predictors of your behaviour.

- Social incentives (what your friends and family approve of and disapprove of) are often a stronger predictor of your behaviour than financial incentives (what makes you more money) or culture (what other people around you usually do).
- If you choose your social circle deliberately, you can alter your own thinking and behaviour.
- The pain of social disapproval is often greater than the reward of social approval, at least in terms of how it feels inside your head. You should be especially careful about what your peers disapprove of, as this will affect you more strongly.
- See also: Asch's conformity experiments and other research that confirms these hypotheses in a specific lab setting.

#### See also: Dealing with abusive situations

DISCLAIMER: I am not an expert on dealing with abusive situations. If you are in such a situation, try to get advice and support from people who have more experience with it than me.

How to break out of another person's control?

- Two things you need:
  - a source of income or sufficient savings not under this person's control. If you do not have this, obtaining it should be your life priority.
  - moral conviction that you are not anyone's slave or property.
- Conflicts of this type that I've seen are often more psychological warfare than actual war.
- You will find a lot of resources online on how this mind control works. I like this article on [Frame Control](https://aella.substack.com/p/frame-control).
- If you believe you are not under their control, this reduces their control over you.
- Non-violent resistance is powerful and ensures you retain the moral high ground. Keeping this internal sense of morality and self-respect is important in order to break out.
- If they threaten violence, it is often a bluff; they lose their power if you call their bluff.
- Even if they are violent, spending one month in a hospital may be better than spending decades as their slave.
- Some situations are extreme enough that your abuser would literally rather murder you than let you go free. In this case, you may have to be more strategic: move slowly, and find allies before making any move. Your opponent is not willing to play fair, and neither are you obliged to. You may end up scarred whether you stay in this situation or attempt to break free. Remember that one day you are going to die, and your choices will affect your life for many years to come.

## Cosmopolitanism

I am extremely cosmopolitan. If this at all interests you, I recommend making friends (or at least acquaintances) with people from countries other than your native one. I would also recommend making friends with people of different cultures in your country, of different socioeconomic classes, and of different value systems.

How?

- This could be online, or by travelling to other places, or by finding people from those places visiting or living in your city.

Why?

- Friends can tell you things you won't learn as easily from social media or elite-funded news channels (i.e. almost all news channels).
- Also, you'll be able to empathise with actual people rather than figments of your imagination. I've personally had good experiences doing this.
- The really optimistic version of this is: maybe wars would become less likely if everyone had friends from other countries, maybe class boundaries would reduce, and so on. But I think the benefits to you as an individual alone are worth it, ignoring the societal benefits.

## Religion

I am atheist.
I strive to be tolerant of people of other faiths, although there are some topics we may have to avoid in casual conversations in order to have a pleasant interaction. I wish more people in this world were atheist like I am, but converting people to atheism is not the primary goal of my life. I can probably peacefully coexist with you if you are religious.

Which religion you choose to belong to is one of the most important decisions of your life. You don't want to blindly pick whichever religion you happened to be born into. It is better to think deliberately about it.

How?

- If you are interested in exploring further on this topic, there are lots of freely available websites, books and videos online. Consider reading the writings of all the major world religions, and reading materials from multiple conflicting points of view.
- I'm not sure what the best intro resource on atheism is, maybe [this one?](https://www.suchanek.name/texts/atheism/ChapterUniverse.html)

**The rest of this section elaborates my personal views on religion. If you're not interested, you are free to ignore it.**

There are a bunch of separate threads of evidence in favour of atheism that I consider worth studying. Which ones you study more deeply may depend on your interests. The most important thing is that you actually study these topics yourself from the ground up, so you aren't blindly trusting a bunch of scientists or scientific institutions.

Stuff I've studied reasonably well:

- biochemistry, genomics, computational genomics ([example](https://asia.ensembl.org/info/genome/compara/mlss.html?mlss=1098))
- Newton's laws, quantum mechanics, Newton's laws implying a deterministic universe, philosophical implications of human brains also being deterministic
- Occam's razor, Solomonoff induction
- Chalmers' hard problem of consciousness

Stuff I've studied at surface level:

- radioactive carbon dating of fossils
- particle physics, esp.
particle physics around the Big Bang, star formation
- evolutionary explanations for mammalian morality
- redshift of stars due to universe expansion
- psychology, evopsych, sociology of groupthink

Stuff I'd like to study sometime:

- reading holy texts of all major religions (I'm still halfway on this)
- CMB radiation
- neuroscience and psychology research on meditation, prayer, community rituals

Some potential advantages of being atheist:

- In the very long run, you might help humanity find the truth. Truth tends to survive the rise and fall of civilisations, as information is easy to preserve and hard to destroy. A record of your beliefs and actions may be preserved and later read by others even if your nation gets nuked to dust, for example.
- People around you might benefit in terms of material wellbeing. Post-Enlightenment, there's a direct correlation between countries' GDP growth and their cultural tolerance of atheist scientists and engineers. (I also suspect there's a direct correlation between this tolerance and the rate of new inventions by country, but this is harder to prove.)
- You might avoid some traps, like using medicines that haven't undergone double-blinded randomised controlled trials and may hence be toxic.
- It might improve your critical thinking skills. Many cognitive biases are due to social pressure, and learning to resist social pressure will make you a better thinker. Your beliefs may be less compartmentalised, as knowing the truth in one subject also aids with knowing the truth in other subjects (and vice versa, knowing falsehoods deteriorates it).

Some potential disadvantages of being atheist:

- You may end up distanced from people who no longer share your values. Coordination is easier between people with shared values and beliefs. You may find it harder to coordinate with others.
- You may end up distrusting of other people's beliefs.
You'll realise truth-seeking is really hard, and that a lifetime of effort is not enough to discover all the important truths about the world.
- You may have more anxiety about the future. You'll realise many events in life are random and beyond your control. (For example, this could be who you date, or how good your health is, or how a particular exam result went.) You'll realise people often imagine fictitious narratives to make sense of random events, because they don't like random events.
- You'll realise most problems in life don't solve themselves; someone needs to actually put in the work to solve them.
- You'll have to take life decisions while facing more uncertainty about potential outcomes. You'll realise many facts about our past, present and future are currently unknown. Many of these facts will remain unknown even after you die.
- You may become more confused about your morality. You'll realise that different cultures produced people with different moral values.
- You'll realise the world is not a fair place, and karma doesn't always work. History is full of mass murderers who received more love and popular support than you ever will.
- You may end up less satisfied with your present circumstances. Most religions teach some degree of contentment with one's present circumstances.

If you have recently become atheist, especially if you live in India, I encourage being kind to yourself while you figure out its implications for your life. This took me many years to figure out, and it could take you many years as well.

## Economic growth of India

I currently have no strong reason to favour the economic growth of one country over another country, or to value the happiness or lives of people of one country over another country. I support the economic growth of humanity in general, and this includes people in India.

That being said, if you were interested in increasing the economic growth rate of India, here are my guesses of how you'd do it. Economic growth is typically of two types.
- Zero-to-one scientific innovation
  - If you have a lead time of even one month over the rest of the world on some frontier technology, you get revenue and geopolitical power from the whole world. Every additional month or year of lead time makes all the difference.
  - Achieving this in India is really hard IMO.
    - Will require cultural tolerance of atheist scientists and engineers in broader society, and a well-funded political faction backing them.
    - For a given research field, there are typically only 1-2 places on Earth where top scientists go. Will require building culture behind such institutions in some fields.
    - Will require funding too, although my guess is funding is not the sole factor, and culture is typically a stronger factor than funding behind the outcomes of research labs. Indian govt annual revenue is ~$500 billion. Becoming world leader in any field typically requires $100 million to $10 billion in non-profit/govt funding.
    - Will require sufficient people to become wealthy enough to join the upper / upper-middle class where independent thinking is possible and valued.
    - Indian elites choosing to detonate the Pokhran nuclear test in May 1974 has likely directly led to US, Russian and Chinese elites and all their allied countries' elites refusing to share tech and infra with the Indian elites. (It has also been a causal factor for why no country is building nuclear power plants and global energy prices are permanently stuck at $0.10/kWh, but that's a separate discussion.)
- Globalisation, i.e. copy-pasting innovations that have already been successful in other countries
  - This is already ongoing in India.
  - Stronger rule of law and respect for individual property rights will help.
  - A freer market will help IMO, with less selection of winners by the govt. IMO the Indian economy should have taken a more capitalist direction in the 1950s itself instead of waiting till the 1990s, if economic growth was the priority.
  - This will also need regulatory bodies who are educated about the fields they are regulating.
  - Inviting more foreign companies to undertake major infra projects will help. As long as knowledge ends up diffused to people locally, it matters less who invests in the infra or takes governance decisions IMO. Allowing foreign investors to own significant infra projects can be net good for Indian economic growth, depending on the details of the deal. For instance, it is important that projects are profitable and financed via equity investors, not via increasing national debt.

## Indian political history

I think many leaders in Congress in the period 1950-90 were too sympathetic to economically left-leaning ideals, and this has directly led to many social ills in today's society, including generational trauma for people whose parents dealt with poverty and poor rule of law, poor rule of law in some places even as of 2025, widespread unemployment as of 2025, and so on.

I am aware it is easy for me to say in hindsight that, let's say, Stalinist communism was doomed to fail, and that this was a harder prediction to make back in the 1950s. Nehru was clearly sympathetic to the Stalinist economic system and its industry and manufacturing, for example. I see this sympathy for economically left-leaning ideology as a broader problem in Indian culture, not just something that its elites supported.

I am aware that many people today are not actually interested in studying history in pursuit of the truth. Instead it is common to suppress or selectively ignore historical facts that risk delegitimising whichever political ideology you happen to support at present.

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/advice_for_you/advice_for_you_elite.md 2025-06-11

# Advice for you (elite class)

Update (2025-06-10)

- Specific projects I would recommend:
  - Invest $1B in solar R&D to reduce the cost of energy.
  - Donate $1B to a non-profit team of cyberhackers that attacks the highest-value targets on earth (typically govt or big tech) and posts all their unredacted databases to the public internet.
    - At least $1M long-term vested per employee (performance-based incentives), fewer than 1000 employees for secrecy reasons. Many security researchers today are underpaid relative to the market or military value of their work, so you can exploit that.
    - LLMs are rapidly changing many parts of security research; prefer hiring people who can adapt to that.

2025-04-18

#### Summary

- If you want an amazing life and not a mediocre one, one of the biggest mistakes you can make is blindly copy-pasting the values and beliefs of your family and friends because you want their approval. If you want an amazing life, be more deliberate about who you are learning from, whose decisions you are copying, and why you are doing this.
- Because of the principal-agent problem, you can't trust other people to do your work for you, no matter how much money you are willing to pay them. You can't pay someone to actually care about a goal; you can only pay them to pretend to care. Someone who cares about achieving a goal will generally strongly outcompete someone who does not care.
  - Therefore, you have to become knowledgeable yourself.
  - Study physics and engineering. Most power in society is built through technology.
  - Study politics.
- You can probably have more impact than you think, if you want to have it. One elite can single-handedly alter the trajectory of human history. Don't let your family and friends tell you otherwise. People like me are counting on you to make good decisions.

## If you have above $1M net worth and want influence over society

Proof: I don't have personal experience with this, I just read a lot.

Consider studying politics. Many rich people (even billionaires) don't understand politics, and this limits how much influence they can have.
Money does not directly translate to political power; the relationship between the two is more complicated. If you only focus on making money, you might be disappointed at the end.

Consider studying maths and physics, if you don't know them. Since the Enlightenment, most power in society has been built via technology. No amount of hired experts will substitute for the fact that you don't understand calculus. You won't know how to hire the right people or verify the truth of what they're saying. If you understand technology, you will have much better options available to you in life.

Consider sharing more about your life experience and opinions in the public domain. Your money buys you a degree of safety most people don't have, and many people can benefit from someone who can safely talk about controversial topics. If people verbally attack you it's fine, you are still safe.

## If you have above $100M net worth or could get at least 10M votes in a hypothetical election

#### My pitch on raising your ambition

A handful of actions taken by you can literally mean the difference between human extinction and world peace and prosperity beyond our current imagination. The fate of 10,000 years of human history, and 1,000s more in the future, sits on your individual shoulders.

If you play your cards right, you could found a nation state. You could bridge differences between leaders of existing nations, and help form a world government. You could figure out sane policies and institutions for technological progress, and also teach other elites to have sane, well-informed decision-making on it. You could even start a new religion that lasts centuries after you die, and provides people with safety and meaning. Please don't limit your own imagination by going for a small goal like "I will buy a seat in XYZ political party and that's it, the sum total of my legacy".
You don't need to wait for the support of a lot of other elites; you can just start making moves by yourself and wait for others to follow. You may be able to derive a lot of personal meaning and happiness from this. If you do, you might come to the conclusion that one person's suffering or risk of humiliation is a price worth paying, when considering the stakes. (Or you might not, I am not claiming I'm good at predicting what gives people in your situation meaning or happiness.)

A lot of people, including me, are counting on you to take good decisions. You might or might not have a healthy way of handling this burden psychologically, but it is a true fact about reality IMO and I think you should know. I'm generally not a fan of psychological strategies that involve hiding yourself from the truth.

#### Other things you can try

- Maybe try explaining your moral code and your general life situation to people outside your socioeconomic class, without dumbing it down or pretending to be a common person. I would like to understand your perspective better here. I think this could help increase empathy and understanding across the class difference. (No, many people will still not get it, but a handful of people might, if you explain it well enough.)
  - Examples of this: people who do podcasts, people who post on twitter
  - "Born Rich" by Jamie Johnson (an heir to Johnson & Johnson) seems like one example, although it mostly covers the (relatively) unambitious elite, not the ambitious ones. Also I think there's a way to shoot that exact same documentary in a more positive light.
  - Naval Ravikant's writings are also a good example for people not of the elite class to learn more about elite class norms.
- Consider purchasing a bunker for yourself that is resistant to bioweapon attacks.
# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/advice_for_you/advice_for_you_india_labour.md 2025-05-20

To do for myself:

- Not a priority for me right now
- Rewrite page so it's simpler and more people can understand
- Make it easier to directly navigate to the relevant section

# Advice for you (India, labour class)

You can also read the other post on the knowledge class. If you are sufficiently ambitious, you can climb classes by following the advice in that post.

#### Summary

- If you want to make more money, talk to people who know how to make money. To do this, you may need to learn English and shift to a tier-1 city. To learn English, spend many months watching shows and YouTube videos in English.
- If you want an amazing life, don't blindly copy-paste what your family and friends do. Don't try to get them to like you. Think about what you believe and what you do. Think about why you make these decisions.
- If you want to improve your life in any way, spend at least 1 hour every day in front of a computer. Use Google a lot, don't just scroll Instagram/TikTok/YouTube/etc. Don't just use your mobile phone.

## If you don't know English (and live in India) - career advice

Proof: I have learned many skills from the internet, and this has helped me a lot. Most of my college education did not directly help me do this, but being good at English and math did.

There are a lot of benefits to knowing English. Think about it. Maybe studying English should be the number one priority of your life. (Or maybe not, but I want you to think about it.)

(If you are very ambitious, you can also read the section after this section.)

Why?

- Most of the good job opportunities require you to study English.
- A lot of good educational resources are available only in English.
- A lot of people have a lot of useful knowledge but they don't know your local language. If you both learn English, you can both learn something useful from each other.
- If you can't afford to go to college, that is fine, there are other ways of making money.
  - For example, you can learn some skills by asking friends with those skills. If you don't have such friends, you can make new friends.
  - If you are ambitious, you can study how to do the job from the internet.
- Your current family and friend circle probably does not know which job opportunities are best for you, and you will have to make new friends to guide you.

Proof: I learned English and Hindi as a kid and don't have much personal experience learning a language as an adult. I am just copy-pasting language learning advice from the internet.

How?

- Do not waste time reading English textbooks. Do not study grammar.
- Spend at least a few months listening to movies or TV shows or podcasts etc. in English. Start watching something that uses simple words. Do not use Hindi subtitles. This can be boring at first, but doing this for a few months will change your life.
- Once you have spent enough time listening, you can practice speaking with friends who know some English. (If you don't have any such friends, make friends in person or online.)

## If you have less than 50,000 INR in savings and live in a village or town (in India) - career advice

Proof: I have seen more than one person shift to a city and the struggles they face in the first couple of months. I have not experienced this myself.

Think about shifting to a big city.

(If you are very ambitious, you can also read the section after this section.)

Why?

- This decision is not permanent. You can go back later if you try it out and don't like it.
- If you live in a city, you can meet people who know a lot of skills and know how to make money with those skills. You can befriend them and learn these skills, and make more money yourself. However, learning new skills is a slow process. When you first shift, you will have to take a job you know how to do. After that, you can try learning a new skill.
- If you live in a city, you can meet people who think and behave differently than you. This is not just about career. This includes what their personality is like, how they approach relationships, what beliefs they think are important, what their hobbies and interests are, etc. You might want to learn some good things from them, or you might not. Until you try, you probably won't know. It is difficult for someone to explain this to you if you haven't experienced meeting lots of people yourself.
- If life in your house or neighbourhood is not good right now, shifting away from it can improve all the aspects of your life. In cities, your friends might not know each other, so they might gossip less about you. And you might feel less pressure to behave one way in front of others.

How?

- Very important things you must carry with you:
  - ID cards - Aadhaar card, driving license, etc
  - Debit card - make sure you have money saved in your account and you are not carrying all your money as cash
  - Smartphone, SIM card
  - Laptop, if you own one
- Important things you should also carry:
  - Induction stove, cooking utensils, food storage containers - it could be expensive for you to buy these again
  - Clothes
- Important things you should practice before coming:
  - Learn to use Google Maps and practice using it
  - Learn to use WhatsApp, SMS, phone calls, UPI, if you don't already know how to
  - Learn to use the IRCTC website to book train tickets, and the MakeMyTrip app to book bus tickets
- Important things you should do soon after coming:
  - Find a PG (paying guest accommodation) to stay in on the first day itself. Do not waste a single day of your time finding a PG. Every day you waste means loss of money.
  - Pick your PG roommate carefully; if they are depressed or an alcoholic, for example, it will affect your mood a lot.
  - Find a job soon after you have a place to stay. Do not waste more than 3-4 days of your time finding a job.
  - Do not relax once you have a job.
Make sure you have some backup job in case the first job you pick later turns out to be a bad choice. Ask your coworkers to make sure you will get your salary on time.
  - Within the first month, try to make some acquaintances of the same gender. This is especially important if you are a woman. It is fine if you don't trust them a lot at first.
  - Nobody is guaranteed to help you when you run out of money. Try your best to keep some savings at all times.
- Important things you should do eventually:
  - Talk to lots of people of different backgrounds. Put in active effort to do this. You will learn a lot if you talk to the right people and ask the right questions.

# File: /Users/samuel/Documents/Samuel/Website/scripts/../raw/text_english/about_me_summary.md 2025-07-11

# About me

![Samuel Headshot Casual](../non_text_non_video/self_solo_pics/samuel_photo_headshot_casual.png "Samuel Headshot Casual")

History

- CV: DOB 2001-01-03, Male. Completed schooling in Delhi 2018. Graduated IIT Delhi biochem engg BTech+MTech 2023. Managed risk for $1B AUM at cryptocurrency startup Rari Capital. Completed ML Safety Scholars under Dan Hendrycks, UC Berkeley. Have intermediate-level skills in software, finance, biotech and probably some other topics too.
- Notable influences: cypherpunks mailing list and its successors such as Tor/Signal/blockchain, extropians mailing list and its successors such as EA/lesswrong.

Immediate goal

- Enable whistleblowers of companies building superintelligent AI to leak information to the general public.
- What?
  - I am working on a guide for whistleblowers.
  - I can provide high-quality technical education for journalists or youtubers on a) how to safely accept and handle leaked documents, b) AI risk as a political cause.
- Work done so far
  - [Whistleblower Database](./my_research/us_govt_whistleblower_database.md), [Whistleblower Guide](./my_research/us_govt_whistleblower_guide.md)
- Similar projects by others
  - Similar projects supporting whistleblowers: [Wikileaks](https://wikileaks.org), [SecureDrop](https://securedrop.org)
  - Some popular people's views on AI risk: [Geoffrey Hinton's views](https://www.youtube.com/watch?v=66WiF8fXL0k), [Yoshua Bengio's views](https://yoshuabengio.org/2023/05/22/how-rogue-ais-may-arise/), [Dan Hendrycks' letter](https://safe.ai/work/statement-on-ai-risk), [Eliezer Yudkowsky's views](https://time.com/6266923/ai-eliezer-yudkowsky-open-letter-not-enough/)
- Support me
  - Fund me: [Request for $20,000 to support AI whistleblowers](./connect_with_me/donate_to_me_ai_whistleblowers.md)
  - Work with me: Provide me feedback or do fact-checking for me. Especially interested in people with legal expertise in US or international law. Part-time or full-time possible.
- Long-term plan
  - I support a complete ban on building superintelligent AI. I think humanity might go extinct by 2030 if not for a ban. Leaked information could help with coordinating a ban.
  - I can accelerate towards a world with fewer secrets. The resulting world may be radically different from ours. I think individual freedom can be protected despite lack of privacy, if everyone trying to suppress your freedom also lacks privacy when taking such actions.
- Previous work
  - Research: [SecureDrop review](./my_research/securedrop_review.md), [Open source weaponry](./my_research/open_source_weaponry.md), [Long-term view on information, tech and society](./my_research/longterm_view_on_info_tech_society.md), [Intelligence explosion](./my_research/ai_forecasts/intelligence_explosion.md), [AI forecasts](./my_research/ai_forecasts/)
  - Software: [Books Search for Researchers](./my_projects/my_projects.md), [Open source search](./my_research/open_source_search_summary.md)

[Contact me](./connect_with_me/contact_me.md)

Current location: Bangalore, India. Willing to shift if funded.