If I were making yet another version of this app (I'm not), I'd do the following:
start with an initial whitelist of urls that the user has personally gone through and liked (don't use a generic url list from substack or searchmysite)
extract the entire commoncrawl content of those urls, using AWS Athena and ranged GETs
write a very short script using the aws cli and bash that takes a list of urls as input, retrieves their byte locations from the commoncrawl columnar index, then uses those to fetch the full content and store it in a new s3 bucket. suggest multiple approaches, all using aws
old method: use my ccfilter script (cheaper, not trivial to parallelise, no error handling)
load the entire content into a single openai vector store. an openai vector store has a 100 GB plaintext limit
old method: for datasets bigger than 100 GB, you have to manage the batch API polling and qdrant MCP hosting yourself (same cost, works on bigger datasets, not trivial to parallelise further, no error handling)
let gpt-5.2 search this database
provide a hard-coded prompt that includes people's tips on how to write well (like paul graham's tips, scott alexander's tips). ask gpt-5.2 to use these tips to decide which search results should be shown to the end user.
also implement some user-based search thing, to identify other users who liked similar content as you.
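The ranged-GET step above can be sketched roughly as follows. This is a minimal Python sketch, assuming you already have (warc_filename, offset, length) rows from an Athena query against the commoncrawl columnar index (which exposes warc_filename, warc_record_offset and warc_record_length columns), and using commoncrawl's public HTTPS endpoint rather than the aws cli:

```python
import gzip
import urllib.request

# Public HTTPS mirror of the commoncrawl s3 bucket; WARC paths from the
# columnar index get appended to this. (You could equally do ranged GETs
# via the aws cli or boto3 against s3://commoncrawl/.)
CC_BASE = "https://data.commoncrawl.org/"

def byte_range_header(offset: int, length: int) -> str:
    """Inclusive HTTP Range header covering exactly one WARC record."""
    return f"bytes={offset}-{offset + length - 1}"

def fetch_warc_record(warc_filename: str, offset: int, length: int) -> bytes:
    """Ranged GET of one gzipped WARC record. Each record in a commoncrawl
    WARC file is an independent gzip member, so it decompresses standalone."""
    req = urllib.request.Request(
        CC_BASE + warc_filename,
        headers={"Range": byte_range_header(offset, length)},
    )
    with urllib.request.urlopen(req) as resp:
        return gzip.decompress(resp.read())
```

Parallelising this is just a matter of fanning the (filename, offset, length) rows out over a thread pool, since each record fetch is independent.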
2026-01-19
Update
I will no longer be working on search, or discovery, or curation of resources, or anything in a similar cluster. I will not be re-evaluating this decision again.
This is a difficult choice for me to make. But I have to cut my losses here and move on from this. I have invested at least 9 months of my life, and at least $3000, on this whole cluster of projects. (I haven't estimated exact numbers, this is just a guess.)
If I want my own blog posts to become popular, I have to learn to write them in a way that is actually persuasive towards some target audience. If I want someone else's blog posts to become popular, I have to rewrite them in a way that is actually persuasive towards some target audience. The same applies to video content, or games, or any other type of content.
I can't slap a better discovery or search algorithm on top of an insufficiently persuasive piece of content and hope that will make up for its shortcomings and get enough readers.
I can use AI to help me write better posts. I can use AI to help me model other people's minds better. But ultimately, I have to take in all these insights and actually create the damn persuasive content myself.
If I had sufficiently persuasive posts, I could just cold-email lots of junior people in my target audience. If the posts are persuasive enough, they will automatically forward them to the senior people whose networks I actually want access to.
2026-01-18
Update
To summarise this really bluntly: search and discovery are two related but ultimately different problems.
Multiple reasons people search
They already know what exact piece of content by what exact person they want, but didn't save its URL the first time (or saved it but lost it). Now they need to just retrieve this content.
This is a solved problem, assuming you are willing to rely on google, which is closed source and therefore censors people. It is an unsolved problem if you want open source search.
They have some topic they want to read about. They already have a manually curated set of people whose content they like. They want all possible content from this manually curated set of people, on the queried topic.
Sub-problem: if there is too much content from the people in this manually curated list, they want some intelligent system (either human or AI or human + AI) to predict which of these pieces of content is most useful/interesting/etc to them.
This is discovery and is only partially solved, be it closed or open source. Improvements might be possible.
They have some topic they want to read about and a manually curated set of people whose content they like, but they don't want everything: they want an intelligent system (either human or AI or human + AI) to predict which pieces are most useful/interesting/etc to them.
This is discovery and is only partially solved, be it closed or open source. Improvements might be possible.
They have some topic they want to read about. They intentionally want to get outside their social media filter bubble, and read content from people that they normally don't like reading. This takes more effort for the reader, but sometimes they might be fine with putting in this effort.
(I think this is where my libgen book search actually excels.)
Some potential solutions
Embedding search is good at finding content about a topic. Embedding search (at least if used the naive way) is not good at predicting whether you will find a given piece of content on this topic useful/interesting/persuasive/etc.
Therefore the best current solution for discovery might be the following: find readers whose past reading history is similar to yours, but who have read/liked/shared a lot more content than you have, and recommend you the content they already liked.
These terminally online users are basically doing unpaid curation for the whole platform. It may be useful for the system to identify which users are more willing to do this unpaid labour and deliberately give them more of it, and which users are less willing and give them less.
Maybe embedding search fits as part of the pipeline then? Rather than working in isolation.
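To make the "find similar readers" idea concrete, here is a minimal sketch of that user-similarity recommendation, using Jaccard overlap on liked-post sets. The data, user names and post names are all made up; a real system would combine this with embedding search rather than use raw set overlap:

```python
def jaccard(a: set, b: set) -> float:
    """Overlap between two users' liked-post sets (0.0 to 1.0)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def recommend(user: str, likes: dict, k: int = 1) -> list:
    """Find the k users most similar to `user`, then recommend the posts
    they liked that `user` has not seen yet."""
    mine = likes[user]
    neighbours = sorted(
        (u for u in likes if u != user),
        key=lambda u: jaccard(mine, likes[u]),
        reverse=True,
    )[:k]
    pool = set().union(*(likes[u] for u in neighbours)) - mine
    return sorted(pool)

# Toy data: "heavy" is the terminally-online curator from the text above.
likes = {
    "you":   {"post_a", "post_b"},
    "heavy": {"post_a", "post_b", "post_c", "post_d"},
    "other": {"post_x"},
}
```

Here `recommend("you", likes)` surfaces the posts the heavier reader liked that you haven't seen. Embedding search could slot in afterwards, to re-rank this pool by the queried topic.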
Maybe reasoning models (gpt-5.2 high?) can actually predict which pieces of content will be useful/interesting/persuasive/etc to you, if given the right heuristics.
For example, Paul Graham says a blog post is useful if it is true, important, surprising and strong
Some of these criteria are audience-dependent.
"Important" and "Surprising" can depend heavily on the target audience. What background knowledge can you assume for this audience, and what does this audience consider important?
"True" and "Strong" are more objective in theory. In practice, for many problems, we don't have access to ground truth, and everyone has their own priors on what is true. So you might want to read content that assumes similar priors to yours.
Maybe a reasoning model can evaluate these criteria and come up with a good guess of whether a blog post is likely to be useful to you or not?
If reasoning models can do this, they can also give feedback to the blog post's writer on how to improve it. But maybe the writer doesn't want to improve their post; in that case, their post shouldn't be recommended to readers.
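A hedged sketch of what that rubric could look like in code. The four criteria come from the Paul Graham heuristic above; the prompt wording, the 1-5 scale, and the threshold are my own assumptions, and the actual model call (and whether gpt-5.2 can score reliably) is left out:

```python
# Criteria from the Paul Graham heuristic quoted above; the prompt
# wording and the threshold below are made-up assumptions.
CRITERIA = ["true", "important", "surprising", "strong"]

def build_rubric_prompt(post: str, audience: str) -> str:
    """Prompt asking a reasoning model to score a post 1-5 per criterion,
    relative to a stated target audience. The model call itself is omitted."""
    lines = [
        f"You are ranking blog posts for this target audience: {audience}",
        "Score the post 1-5 on each criterion below, then justify each score.",
    ]
    lines += [f"- {c}" for c in CRITERIA]
    lines += ["", "POST:", post]
    return "\n".join(lines)

def verdict(scores: dict, threshold: int = 14) -> str:
    """Turn per-criterion scores into either a recommendation or a signal
    to send feedback back to the writer instead."""
    total = sum(scores[c] for c in CRITERIA)
    return "recommend" if total >= threshold else "feedback-to-writer"
```

The two outcomes of `verdict` map onto the two branches above: recommend the post to readers, or route the scored criteria back to the writer as feedback.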
Why do I even care about this problem anymore?
I no longer intrinsically care that much about fixing search and discovery for the internet, as compared to how much I care about fixing ASI risks.
I care about resolving disagreements among people who are not convinced about ASI risks. If I knew of some app that could resolve disagreements among a large audience, I would want to build that.
Resolving disagreements at scale is hard but worth trying anyway. (For example, if you knew how to resolve disagreements about core life questions at scale, you could immediately go into politics or found a religion.) AI alone is probably not intelligent enough to write its own content that resolves disagreements at scale, as of 2026-01. But AI could maybe connect users to writers whose content is at least somewhat persuasive to them.
I care about getting a large audience by any means possible, so that I can later push ads about ASI risk on them (regardless of whether they want these ads or not). Being high status in society is useful for persuasion. If I knew of some app that could get me any large audience, I would want to build that. If this isn't quick, though, I might switch to some other method. I don't want to spend 5 years building a tech startup, one reason being that ASI might be here in 5 years.
2026-01-13
I still haven't found PMF for a wide audience for this app, so I might still make major changes or take it down.
Main
This app initially started out with the idea of finding literally every piece of text on the internet that is commenting about ASI. I wanted to find every possible opinion people could have on the topic, summarise it, and so on. Over time, I got some insights and the idea evolved.
Insights about the target audience
I assumed myself to be the target user of this app when I first made it (PG-style), and still assume this today.
I am maybe going to annoy someone by saying this, but:
I think some users think quite deeply, and for them this app is maybe not giving them enough useful information. For example, gpt-5.2 is not currently that good at correctly guessing what the double cruxes are. Making accurate guesses of double cruxes, with incomplete information, is a hard problem. (There is incomplete information because many influential people use public writing as PR in order to raise funds, get a job, get employees, etc. They don't share their unfiltered opinions publicly.)
I think some other users don't think that deeply to begin with, and for them this app is giving them too much information. They want emojis and one-line summaries and so on, to persuade them to look at it further. I think this can be built with today's tech but I'm not that motivated to go build it.
Insights about use case
A lot of people's writing is not that interesting to me. I can spend a few hours reading what anti-capitalists are saying about ASI; I don't want to spend a month reading about this. I am most interested in the views of people I at least share something in common with. I made a prototype that scraped for anti-AI blogs, and this was the failure mode I ran into.
I don't just want summaries of other people's work; I want to know where exactly they disagree with me, or what new and important information they have that I don't already know. I made a prototype that surfaced surprising differences in worldview, as opposed to just summaries, and found it more useful. Many blog writers have given advice online on what makes a blog good. A key part of all their advice is to figure out a target audience, respect their time, and only write about what is surprising to them.
I deliberately avoided putting anything in the app that involves persuasion or resolving disagreements or finding double cruxes. That is quite hard for humans, and it is also hard for AI. Maybe someone can pull this off; I haven't cracked it yet.
(Credit to a friend for this.) A lot of blogs are PR, either for the company the author works for or the company they want to be hired by. I realised I have to filter most of these blogs out, because they have nothing useful to teach me. Many people don't write their actual views on most things online, because they would like to get funding from some company or investor. For example, many people in Silicon Valley seem pessimistic about the tractability of pausing AI research. This is a disagreement with me. But because these people would still like to raise funding, they haven't gone into detail and defended this belief publicly.
I currently seem to disagree with many of my supposed "role models", either in thought or action. Knowing where these disagreements are happening seems useful to me. I might now make longer writeups based on this.
Another use case might be saving your time by scheduling the right meetings at conferences, and progressing the discussion more quickly because you don't spend the first 15 minutes of each meeting hunting for the disagreements. I have focussed less on this use case as of 2026-01-13.
Insights about the tech
I initially imagined this app to be a search engine for blogs about AI risk. I figured I should just filter out entire commoncrawl for the parts that talk about AI risk.
The easiest way to do that is to have gpt-5.2 read the entire commoncrawl dump, but that's absurdly expensive.
You end up with a multi-stage pipeline: first a hard filter based on some url whitelist, then text-embedding-3-small or gpt-5-mini to filter, then gpt-5.2 high to filter further. Even then, the cost ends up in the ballpark of $10k (which I don't have right now, for this).
Open source models still have bad accuracy (or precision or recall or whatever), and writing code for OpenAI's Batch API still sucks (because of all sorts of rate limits). It is doable to write code against OpenAI's models; it is just a lot of work.
I also tried just getting attendee lists from conferences, as opposed to starting with the entire commoncrawl or entire searchmysite.net lists. But most of those people have their bios optimised for PR, to increase their own employability or fundraising ability. People with high signal-to-noise and honest opinions online are rare.
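The multi-stage pipeline described above, as a toy sketch. The whitelist stage is the real first stage; the "cheap filter" stage stands in for text-embedding-3-small / gpt-5-mini, and the final expensive stage (gpt-5.2 high) is omitted. All urls and documents here are made up:

```python
from urllib.parse import urlparse

def whitelist_filter(docs: list, allowed_hosts: set) -> list:
    """Stage 1: hard filter by url host against the personal whitelist."""
    return [d for d in docs if urlparse(d["url"]).hostname in allowed_hosts]

def cheap_filter(docs: list, keyword: str) -> list:
    """Stage 2 stand-in: a real pipeline would use text-embedding-3-small
    or gpt-5-mini here; a keyword match keeps the sketch runnable."""
    return [d for d in docs if keyword in d["text"].lower()]

def pipeline(docs: list, allowed_hosts: set, keyword: str) -> list:
    """Each stage shrinks the candidate set, so the expensive final model
    (gpt-5.2 high, omitted here) only ever sees a small remainder."""
    return cheap_filter(whitelist_filter(docs, allowed_hosts), keyword)

# Toy corpus; urls and text are invented for illustration.
docs = [
    {"url": "https://blog.example.com/a", "text": "Thoughts on AI risk"},
    {"url": "https://spam.example.net/b", "text": "AI risk clickbait"},
    {"url": "https://blog.example.com/c", "text": "Cooking tips"},
]
```

The point of the staging is purely economic: each stage is an order of magnitude cheaper per document than the next, so the $10k ballpark is dominated by however many documents survive to the last stage.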