Method

How the ranking works. No black box.

Candidate Pool

Posts enter the candidate pool from two paths:

  • Seed graph — authors followed by or mutual with the trust source account. Follows, mutuals, and followers each carry different weight.
  • Domain exceptions — posts linking to recognized primary-source domains (government, legal, academic, technical) can enter even if the author is outside the graph.

Everything else is ignored at the firehose level. The feed does not attempt to process the entire network.
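
The two admission paths can be sketched as a single predicate. This is a minimal illustration, not the feed's actual code: the Post shape, the SEED_GRAPH and PRIMARY_SOURCE_DOMAINS contents, and the function names are all hypothetical.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse

# Hypothetical sample data; the real graph and domain list are not shown here.
SEED_GRAPH = {"did:example:alice", "did:example:bob"}
PRIMARY_SOURCE_DOMAINS = {"congress.gov", "arxiv.org"}


@dataclass
class Post:
    author_did: str
    links: list = field(default_factory=list)


def host_matches(host: str, domain: str) -> bool:
    """True if host is the domain itself or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


def is_candidate(post: Post) -> bool:
    # Path 1: the author is part of the seed graph.
    if post.author_did in SEED_GRAPH:
        return True
    # Path 2: domain exception for links to recognized primary sources.
    for url in post.links:
        host = (urlparse(url).hostname or "").lower()
        if any(host_matches(host, d) for d in PRIMARY_SOURCE_DOMAINS):
            return True
    # Everything else is dropped before any scoring happens.
    return False
```

Matching on the parsed hostname rather than on a substring of the URL keeps a lookalike host such as notcongress.gov from slipping through the exception.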

Scoring Components

Each candidate is scored by summing five components:

  • Graph affinity — mutual (+5), followed (+3), trusted list (+4), follower (+1.5). This is the primary anti-garbage prior.
  • Originality — original top-level post (+3), repost (-6), quote-post (-1), substantive reply (+0.5), short reply (-1).
  • Evidence — external link (+2), primary-source domain (+2 to +3.5), reporting domain (+1 to +2).
  • Substance — text in a reasonable length band, multiple links or facets, structured content. Capped, not linear.
  • Freshness — exponential decay with a ~12 hour half-life. Recent posts score higher, but not aggressively.
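
As a rough sketch of how the five components might combine: the graph and originality point values below are quoted from the list, but several details are assumptions. FRESHNESS_MAX is a made-up ceiling (the page gives only the half-life), the domain bonus is assumed to stack on top of the external-link bonus, and the capped substance value is assumed to be computed elsewhere and passed in.

```python
# Point values quoted from the component list.
GRAPH_AFFINITY = {"mutual": 5.0, "trusted_list": 4.0, "followed": 3.0, "follower": 1.5}
ORIGINALITY = {
    "original": 3.0,
    "repost": -6.0,
    "quote": -1.0,
    "substantive_reply": 0.5,
    "short_reply": -1.0,
}
EXTERNAL_LINK_BONUS = 2.0
HALF_LIFE_HOURS = 12.0

# Assumption: the maximum freshness contribution is not stated anywhere.
FRESHNESS_MAX = 3.0


def freshness(age_hours: float) -> float:
    """Exponential decay: the value halves every HALF_LIFE_HOURS."""
    return FRESHNESS_MAX * 0.5 ** (age_hours / HALF_LIFE_HOURS)


def base_score(relation: str, kind: str, has_link: bool,
               domain_bonus: float, substance: float, age_hours: float) -> float:
    """Sum the five components; unknown relations contribute nothing."""
    graph = GRAPH_AFFINITY.get(relation, 0.0)
    originality = ORIGINALITY[kind]
    # Assumption: link and domain bonuses are additive.
    evidence = (EXTERNAL_LINK_BONUS if has_link else 0.0) + domain_bonus
    return graph + originality + evidence + substance + freshness(age_hours)
```

With a 12 hour half-life, a post keeps half of its freshness value after 12 hours and a quarter after 24, which is what keeps the recency preference gentle.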

Penalties and Dampening

  • Flood control — more than 10 posts/day from the same author triggers a ramping penalty, capped at -5.
  • Volume dampener — separate from flood control, prolific authors face diminishing returns. At 20 posts/day, their score is multiplied by 0.7. At 40 posts/day, 0.5. Their posts need to be proportionally better to compete.
  • Repost — hard -6. Reposts are noise in this context.
  • Relay penalty — posts with an external link but minimal non-URL commentary are penalized. This catches syndication exhaust: accounts that post link + title with no editorial value. Measured by stripping URLs from the text and checking what remains.
  • Account stink score — a rolling 7-day behavior score measuring how relay-shaped an account acts: link-post ratio, reply ratio, average commentary length, domain concentration. Accounts scoring above 0.6 get a penalty. This is behavior-based, not identity-based.
  • Outsider relay penalty — accounts outside the seed graph that enter via domain bonuses are penalized if they've posted 10+ times to the same domain. Catches prolific structured bots that are verbose enough to dodge post-level checks.
  • Author weights — a small number of specific authors have manual score multipliers applied. These are editorial overrides for known edge cases where the general rules aren't enough. This is taste, not algorithm, and it's acknowledged rather than hidden.
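
Three of these rules are mechanical enough to sketch. The thresholds, cap, and multipliers come from the list above. FLOOD_STEP (how fast the penalty ramps) and the step-function behavior between the 20 and 40 posts/day marks are assumptions, since only the endpoints are stated.

```python
import re

URL_RE = re.compile(r"https?://\S+")

FLOOD_THRESHOLD = 10   # posts/day before flood control kicks in
FLOOD_CAP = -5.0       # the penalty never goes below this
FLOOD_STEP = 0.5       # assumption: penalty per post over the threshold


def commentary_length(text: str) -> int:
    """Characters left after stripping URLs and collapsing whitespace."""
    stripped = URL_RE.sub(" ", text)
    return len(" ".join(stripped.split()))


def flood_penalty(posts_today: int) -> float:
    """Ramping penalty for authors over the daily threshold, capped."""
    excess = max(0, posts_today - FLOOD_THRESHOLD)
    return max(FLOOD_CAP, -FLOOD_STEP * excess)


def volume_multiplier(posts_today: int) -> float:
    """Diminishing returns for prolific authors (assumed step function)."""
    if posts_today >= 40:
        return 0.5
    if posts_today >= 20:
        return 0.7
    return 1.0
```

A bare link leaves a commentary length of 0, while a link with a short sentence attached leaves only the sentence to be judged, which is the signal the relay penalty keys on.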

Clustering and Representative Selection

Posts linking to the same URL or in the same reply thread are grouped into story clusters. The site homepage shows one lead post per cluster.

Within a cluster, the lead representative is chosen by a multi-tier preference: graph membership first, commentary quality second, raw score as tiebreaker. A mutual with 80 characters of real commentary beats an outsider relay with 180 characters of reformatted bill text, even if the relay scored higher on raw metrics. This is the key distinction: what counts is the value of the commentary, not the length of the publication exhaust.
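
One way to express a multi-tier preference like this is a tuple sort key, where later fields only matter when earlier ones tie. The Member fields are hypothetical, and commentary_quality stands in for whatever quality measure the ranker actually uses.

```python
from dataclasses import dataclass


@dataclass
class Member:
    uri: str
    in_graph: bool             # tier 1: graph membership
    commentary_quality: float  # tier 2: hypothetical quality measure
    raw_score: float           # tier 3: tiebreaker only


def pick_lead(cluster: list) -> Member:
    """Pick the cluster lead by tiered preference, not by raw score alone."""
    return max(
        cluster,
        key=lambda m: (m.in_graph, m.commentary_quality, m.raw_score),
    )
```

Because tuples compare left to right, an in-graph post always outranks an outsider here, no matter how large the outsider's raw score is.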

Document floods (congress, research, filings) are further compacted into docket cards: the best item stays as a full story, the rest are bundled into a summary card.
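
A docket card can be pictured as a simple split of a flood into one lead and a bundle. This sketch assumes "best" means highest score and uses a hypothetical dict shape for items and for the resulting card.

```python
def compact_docket(items: list) -> dict:
    """Keep the top item as a full story; bundle the rest into a summary card."""
    if not items:
        raise ValueError("a docket needs at least one item")
    # Assumption: "best" is the highest-scoring item.
    ranked = sorted(items, key=lambda item: item["score"], reverse=True)
    lead, rest = ranked[0], ranked[1:]
    return {
        "lead": lead,
        "summary_card": {
            "count": len(rest),
            "uris": [item["uri"] for item in rest],
        },
    }
```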

Composition Rules

After scoring, the ranked list is filtered through composition rules to prevent monotony:

  • Maximum 2 posts per author per page
  • Maximum 3 posts per root thread
  • Duplicate links collapsed to the top 2

Without these, the feed tends to become one person having a day, or one argument from nineteen angles.
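
The three caps can be applied in one pass over the ranked list. This is a sketch under assumptions: the Ranked shape is hypothetical, the input is assumed to be sorted best-first already, and pagination is ignored so the whole list is treated as one page.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

MAX_PER_AUTHOR = 2
MAX_PER_THREAD = 3
MAX_PER_LINK = 2


@dataclass
class Ranked:
    uri: str
    author: str
    root_thread: str
    link: Optional[str] = None


def compose(ranked: list) -> list:
    """Walk the ranked list in order, skipping posts that exceed a cap."""
    authors, threads, links = Counter(), Counter(), Counter()
    page = []
    for post in ranked:
        if authors[post.author] >= MAX_PER_AUTHOR:
            continue
        if threads[post.root_thread] >= MAX_PER_THREAD:
            continue
        if post.link is not None and links[post.link] >= MAX_PER_LINK:
            continue
        # Only posts that make the page count against the caps.
        authors[post.author] += 1
        threads[post.root_thread] += 1
        if post.link is not None:
            links[post.link] += 1
        page.append(post)
    return page
```

Since the walk is best-first, the posts that survive each cap are always the highest-ranked ones for that author, thread, or link.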

Domain Bonuses

A curated set of domains receives scoring bonuses. The list is deliberately transparent:

  • Primary sources (+2.0 to +3.5): courtlistener.com, supremecourt.gov, congress.gov, sec.gov, arxiv.org, pubmed.ncbi.nlm.nih.gov, github.com, and similar.
  • Reporting outlets (+1.0 to +2.0): reuters.com, apnews.com, propublica.org, 404media.co, and similar.

This list has a point of view. It favors primary sources over commentary. It is editable and will evolve.
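
A lookup over such a list might look like the following. The per-domain values are hypothetical picks inside the published ranges, since only the ranges are stated, and the table holds just a few of the named domains.

```python
from urllib.parse import urlparse

# Hypothetical values chosen within the ranges above; not the real table.
DOMAIN_BONUS = {
    "supremecourt.gov": 3.5,
    "congress.gov": 3.0,
    "arxiv.org": 2.5,
    "github.com": 2.0,
    "reuters.com": 2.0,
    "apnews.com": 1.5,
}


def domain_bonus(url: str) -> float:
    """Return the bonus for a URL's host, checking parent domains too."""
    host = (urlparse(url).hostname or "").lower()
    parts = host.split(".")
    # Try the full host first, then each parent: www.reuters.com, reuters.com, com.
    for i in range(len(parts)):
        candidate = ".".join(parts[i:])
        if candidate in DOMAIN_BONUS:
            return DOMAIN_BONUS[candidate]
    return 0.0
```

Keeping the table as plain data is what makes a list like this easy to publish and to edit as it evolves.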

Visibility Policy

The public site only shows posts that are visible to logged-out viewers. Authors who have opted out of unauthenticated visibility via Bluesky's !no-unauthenticated label are silently excluded from the public edition. The feed itself (inside Bluesky) may include posts that only signed-in users can see.

This means the site edition and the feed are not always identical. That's intentional: one is a public page, the other operates inside a signed-in context.
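
The split between the two editions amounts to one extra filter on the public side. In this sketch the label string is the one named above, while the post dict shape and the author_labels key are hypothetical.

```python
NO_UNAUTH_LABEL = "!no-unauthenticated"


def visible_on_public_site(author_labels) -> bool:
    """True unless the author opted out of logged-out visibility."""
    return NO_UNAUTH_LABEL not in set(author_labels)


def public_edition(posts: list) -> list:
    """Filter the ranked feed down to what the public page may show."""
    return [
        post for post in posts
        if visible_on_public_site(post.get("author_labels", []))
    ]
```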

Known Biases

  • English-language bias: the trust graph and domain list are English-centric
  • Graph bias: mutuals and follows get a structural advantage. This is intentional but not invisible
  • Domain list bias: the bonus domains reflect one person's judgment about what constitutes a primary source
  • Editorial overrides: a small number of authors have manual weight adjustments. This is acknowledged, not hidden
  • Recency bias: the feed only considers the last 24 hours

Exclusions

Authors can request project-level exclusion by DMing @instantinternet.news. Excluded DIDs are dropped from ranking candidates entirely — they do not appear in the feed, the site, or any future archives. This is separate from Bluesky's !no-unauthenticated label, which only affects logged-out visibility.

Update Cadence

  • Posts ingested continuously via Bluesky Jetstream
  • Ranking rebuilt every 2 minutes
  • Site editions frozen every 15 minutes
  • Stale posts purged after 48 hours