I spent a weekend digging through the open-sourced X (formerly Twitter) algorithm. What I found was both fascinating and practical: a Grok-based transformer model that predicts your behavior with surprising sophistication.

This isn’t speculation. This is what the code actually does.

X Algorithm Pipeline


The Architecture: Phoenix, Thunder, and Home Mixer

The “For You” feed is powered by three main systems:

1. Phoenix: The Brain

Phoenix is a Grok-based transformer model (yes, the same transformer architecture family as xAI’s Grok). It handles two critical functions:

  • Retrieval: Finding relevant posts from millions of candidates using a two-tower model
  • Ranking: Scoring posts by predicting engagement probabilities

The ranking model uses a clever technique called candidate isolation: posts can’t “see” each other during scoring, only the user’s context. This ensures consistent, cacheable scores.

User Context (history, preferences)
         │
         ▼
    ┌─────────────────────────┐
    │  Phoenix Transformer    │
    │                         │
    │  [User] → [Candidates]  │
    │     ↓          ↓        │
    │   Full      Self-only   │
    │ Attention   Attention   │
    └─────────────────────────┘
         │
         ▼
    P(like), P(reply), P(repost), P(block)...

2. Thunder: The In-Network Source

Thunder is an in-memory post store that tracks recent posts from accounts you follow. It’s optimized for sub-millisecond lookups.

Key insight: In-network posts get preference. When you follow someone, their posts are more likely to appear in your feed than posts from strangers with similar engagement predictions.
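To make the idea concrete, here is a toy Python sketch of an in-memory, recency-bounded post store. The class name, fields, and the 48-hour window are assumptions for illustration, not Thunder's actual API:

```python
import time
from collections import defaultdict

class RecentPostStore:
    """Toy in-memory store of recent posts, keyed by author.
    Illustrative only; Thunder's real data model may differ."""

    def __init__(self, max_age_secs=48 * 3600):
        self.max_age_secs = max_age_secs
        self.posts_by_author = defaultdict(list)  # author_id -> [(ts, post_id)]

    def add(self, author_id, post_id, ts=None):
        self.posts_by_author[author_id].append((ts or time.time(), post_id))

    def recent_for(self, followed_author_ids, now=None):
        """Post IDs from followed authors within the window, newest first."""
        now = now or time.time()
        hits = []
        for author in followed_author_ids:
            for ts, post_id in self.posts_by_author.get(author, []):
                if now - ts <= self.max_age_secs:
                    hits.append((ts, post_id))
        return [pid for _, pid in sorted(hits, reverse=True)]
```

A production version adds eviction and sub-millisecond indexing, but the core contract is the same: given who you follow, return their recent posts fast.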

3. Home Mixer: The Orchestrator

This is the glue. It:

  1. Fetches your engagement history
  2. Retrieves candidates from both Thunder (in-network) and Phoenix (out-of-network)
  3. Hydrates posts with metadata
  4. Filters ineligible content
  5. Runs the scoring pipeline
  6. Selects top candidates
  7. Returns your ranked feed
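The seven steps above can be sketched as a single function. Every callable here is an illustrative stand-in, not the real home-mixer API:

```python
def home_mixer_pipeline(user, fetch_history, retrieve, hydrate, filters, score, k=30):
    """Toy sketch of the orchestration steps; all parameter names are
    hypothetical stand-ins for the real Rust components."""
    history = fetch_history(user)                 # 1. engagement history
    candidates = retrieve(user)                   # 2. Thunder + Phoenix candidates
    candidates = [hydrate(c) for c in candidates] # 3. attach metadata
    for keep in filters:                          # 4. drop ineligible posts
        candidates = [c for c in candidates if keep(user, c)]
    ranked = sorted(candidates,                   # 5-6. score and select top-k
                    key=lambda c: score(history, c), reverse=True)
    return ranked[:k]                             # 7. the ranked feed
```

The useful takeaway is the ordering: filtering happens before scoring, so an ineligible post never even reaches the model.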

The Scoring Formula

Here’s the actual scoring logic from the codebase:

// From weighted_scorer.rs
fn compute_weighted_score(candidate: &PostCandidate) -> f64 {
    let s: &PhoenixScores = &candidate.phoenix_scores;

    // Note: vqv_weight is lowercase because it is a per-candidate value,
    // set by the video-duration eligibility check shown later.
    Self::apply(s.favorite_score, FAVORITE_WEIGHT)
        + Self::apply(s.reply_score, REPLY_WEIGHT)
        + Self::apply(s.retweet_score, RETWEET_WEIGHT)
        + Self::apply(s.photo_expand_score, PHOTO_EXPAND_WEIGHT)
        + Self::apply(s.click_score, CLICK_WEIGHT)
        + Self::apply(s.profile_click_score, PROFILE_CLICK_WEIGHT)
        + Self::apply(s.vqv_score, vqv_weight)
        + Self::apply(s.share_score, SHARE_WEIGHT)
        + Self::apply(s.share_via_dm_score, SHARE_VIA_DM_WEIGHT)
        + Self::apply(s.share_via_copy_link_score, SHARE_VIA_COPY_LINK_WEIGHT)
        + Self::apply(s.dwell_score, DWELL_WEIGHT)
        + Self::apply(s.quote_score, QUOTE_WEIGHT)
        + Self::apply(s.quoted_click_score, QUOTED_CLICK_WEIGHT)
        + Self::apply(s.dwell_time, CONT_DWELL_TIME_WEIGHT)
        + Self::apply(s.follow_author_score, FOLLOW_AUTHOR_WEIGHT)
        + Self::apply(s.not_interested_score, NOT_INTERESTED_WEIGHT)
        + Self::apply(s.block_author_score, BLOCK_AUTHOR_WEIGHT)
        + Self::apply(s.mute_author_score, MUTE_AUTHOR_WEIGHT)
        + Self::apply(s.report_score, REPORT_WEIGHT)
}

The model predicts the probability of each action, then multiplies by a weight. Positive actions add to your score. Negative actions subtract.

This is crucial: one block can hurt more than ten likes help.
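A minimal Python sketch makes the asymmetry concrete. The weight values here are invented for illustration; the real ones are excluded from the open-source release:

```python
# Illustrative weights only -- the real values are excluded from the
# open-source release. Chosen so a block penalty dwarfs a like bonus.
WEIGHTS = {
    "favorite_score": 0.5,
    "reply_score": 13.5,
    "block_author_score": -100.0,
}

def weighted_score(phoenix_scores: dict) -> float:
    """Sum of P(action) * weight, mirroring the shape of weighted_scorer.rs."""
    return sum(phoenix_scores.get(name, 0.0) * w for name, w in WEIGHTS.items())
```

Under these assumed weights, even a 10% predicted block probability wipes out a guaranteed like: 1.0 × 0.5 + 0.1 × (−100) = −9.5.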


The 19 Signals That Determine Your Reach

The Phoenix model predicts probabilities for 19 distinct user actions:

X Algorithm Signals

Positive Signals

| Signal | What It Means | Code Reference |
| --- | --- | --- |
| favorite_score | P(user will like) | ServerTweetFav |
| reply_score | P(user will reply) | ServerTweetReply |
| retweet_score | P(user will repost) | ServerTweetRetweet |
| quote_score | P(user will quote tweet) | ServerTweetQuote |
| click_score | P(user will click post) | ClientTweetClick |
| profile_click_score | P(user will click author profile) | ClientTweetClickProfile |
| photo_expand_score | P(user will expand image) | ClientTweetPhotoExpand |
| vqv_score | P(video quality view) | ClientTweetVideoQualityView |
| share_score | P(user will share) | ClientTweetShare |
| share_via_dm_score | P(user will DM post) | ClientTweetClickSendViaDirectMessage |
| share_via_copy_link_score | P(user will copy link) | ClientTweetShareViaCopyLink |
| dwell_score | P(user will dwell on post) | ClientTweetRecapDwelled |
| quoted_click_score | P(user will click quoted tweet) | ClientQuotedTweetClick |
| follow_author_score | P(user will follow author) | ClientTweetFollowAuthor |
| dwell_time | Expected dwell duration (continuous) | ContinuousActionName::DwellTime |

Negative Signals

| Signal | What It Means | Code Reference |
| --- | --- | --- |
| not_interested_score | P(user clicks “not interested”) | ClientTweetNotInterestedIn |
| block_author_score | P(user will block author) | ClientTweetBlockAuthor |
| mute_author_score | P(user will mute author) | ClientTweetMuteAuthor |
| report_score | P(user will report post) | ClientTweetReport |

The Author Diversity Penalty

Here’s something most people don’t know: if you post multiple times, your posts compete against each other.


From the code:

// From author_diversity_scorer.rs
fn multiplier(&self, position: usize) -> f64 {
    (1.0 - self.floor) * self.decay_factor.powf(position as f64) + self.floor
}

The algorithm sorts your posts by score. The first gets full weight. Each subsequent post gets penalized:

Position 0 (1st post):  multiplier ≈ 1.00  (100%)
Position 1 (2nd post):  multiplier ≈ 0.55  (55%)
Position 2 (3rd post):  multiplier ≈ 0.33  (33%)
Position 3 (4th post):  multiplier ≈ 0.21  (21%)
Position 4 (5th post):  multiplier ≈ 0.16  (16%)

The exact decay factor and floor are excluded from the open source release, but the exponential decay pattern is clear.

Implication: Quality beats quantity. One excellent post outperforms five mediocre ones.
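A direct Python port of the Rust multiplier reproduces the table above if you assume decay_factor = 0.5 and floor = 0.1. Those two values are guesses that happen to fit the published curve; the real constants are excluded from the release:

```python
def author_diversity_multiplier(position: int,
                                decay_factor: float = 0.5,
                                floor: float = 0.1) -> float:
    """Python port of multiplier() from author_diversity_scorer.rs.
    decay_factor and floor are assumed values, not the real constants."""
    return (1.0 - floor) * decay_factor ** position + floor
```

Position 0 gives exactly 1.0; position 1 gives 0.9 × 0.5 + 0.1 = 0.55; position 4 gives 0.9 × 0.0625 + 0.1 ≈ 0.16, matching the table.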


The Filtering Pipeline

Before scoring, posts go through a gauntlet of filters:

| Filter | What It Does |
| --- | --- |
| DropDuplicatesFilter | Removes duplicate post IDs |
| AgeFilter | Removes posts older than a threshold (~48h) |
| SelfTweetFilter | Removes your own posts from your feed |
| RetweetDeduplicationFilter | Dedupes reposts of the same content |
| PreviouslySeenPostsFilter | Removes posts you’ve already seen |
| PreviouslyServedPostsFilter | Removes posts served in the current session |
| MutedKeywordFilter | Removes posts with your muted keywords |
| AuthorSocialgraphFilter | Removes posts from blocked/muted accounts |
| IneligibleSubscriptionFilter | Removes paywalled content you can’t access |
| VFFilter | Post-selection visibility filter (spam, violence, etc.) |

The MutedKeywordFilter is worth noting. If your post contains keywords that many users have muted, it’ll be filtered out for those users regardless of its score.
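The whole pipeline boils down to a chain of predicates: a post must pass every one to reach scoring. Here is a toy Python version of three of the filters above; the field names and the 48-hour threshold are assumptions for illustration:

```python
from datetime import datetime, timedelta, timezone

# Toy predicate versions of three filters. Field names and the 48h
# threshold are illustrative assumptions, not the real Rust structs.
def age_filter(user, post):
    return datetime.now(timezone.utc) - post["created_at"] <= timedelta(hours=48)

def self_tweet_filter(user, post):
    return post["author_id"] != user["id"]

def muted_keyword_filter(user, post):
    text = post["text"].lower()
    return not any(kw in text for kw in user["muted_keywords"])

def run_filters(user, posts, filters):
    """A post survives only if every filter predicate returns True."""
    return [p for p in posts if all(f(user, p) for f in filters)]
```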


Video Quality Views: The Duration Threshold

Videos only contribute to VQV (Video Quality View) scoring if they meet a minimum duration:

// From weighted_scorer.rs
fn vqv_weight_eligibility(candidate: &PostCandidate) -> f64 {
    if candidate
        .video_duration_ms
        .is_some_and(|ms| ms > MIN_VIDEO_DURATION_MS)
    {
        VQV_WEIGHT
    } else {
        0.0
    }
}

The exact MIN_VIDEO_DURATION_MS is excluded, but industry standards suggest 2-3 seconds minimum. Videos shorter than this get zero weight for the video view signal.

Optimal Video Lengths

| Duration | Best For | Notes |
| --- | --- | --- |
| 2-15 seconds | Loops, memes, quick hits | Fast engagement, easy shares |
| 30-60 seconds | Tips, insights, reactions | Sweet spot for dwell time |
| 1-2 minutes | Tutorials, stories, threads | Maximum dwell if engaging |
| 2+ minutes | Educational deep dives | Only if content is compelling |

Video Best Practices

โœ… DoโŒ Don’t
Hook in first 1-2 secondsSlow intros
Add captionsSound-only content
Native uploadYouTube/TikTok links
Strong thumbnailBoring first frame

Dwell Time: The Underrated Signal

The algorithm tracks two distinct dwell signals:

  1. dwell_score: binary (did they stop scrolling?)
  2. dwell_time: continuous (how long did they spend?)

The dwell time is treated as a continuous value, not a probability:

// From phoenix_scorer.rs
dwell_time: p.get_continuous(ContinuousActionName::DwellTime),

This means longer content that holds attention is genuinely rewarded, not just content that stops the scroll.
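One way to see why this matters: sketch how the two dwell signals might combine in the weighted sum. The weight values here are assumptions; the real ones are excluded from the release:

```python
# Assumed weights for illustration -- the real values are not published.
DWELL_WEIGHT = 1.0            # applied to P(dwell), a probability in [0, 1]
CONT_DWELL_TIME_WEIGHT = 0.1  # applied to expected dwell, a continuous value

def dwell_contribution(p_dwell: float, expected_dwell_secs: float) -> float:
    """Hypothetical combined dwell contribution to the weighted score."""
    return p_dwell * DWELL_WEIGHT + expected_dwell_secs * CONT_DWELL_TIME_WEIGHT
```

Because the continuous term is unbounded while the binary term saturates at 1.0, a post that is read for 12 seconds can contribute far more than one that merely pauses the scroll, under any positive weight.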

Dwell Time by Duration

| Duration | Signal Strength | What It Means |
| --- | --- | --- |
| < 500ms | None | Scrolled past |
| 500ms - 2s | Weak | Brief pause |
| 2-5 seconds | Good | Genuine interest |
| 5+ seconds | Great | Strong engagement |
| 10+ seconds | Best | Deep engagement |

How to Maximize Dwell Time

  • Write longer, multi-paragraph posts that take time to read
  • Use storytelling that keeps people engaged
  • Create threads with valuable content across multiple posts
  • Add multiple images to swipe through
  • Include detailed visuals people will examine

The Out-of-Network Penalty

In-network posts (from accounts you follow) get preference over out-of-network posts:

// From oon_scorer.rs
let updated_score = c.score.map(|base_score| match c.in_network {
    Some(false) => base_score * OON_WEIGHT_FACTOR,
    _ => base_score,
});

Out-of-network content is multiplied by OON_WEIGHT_FACTOR (a value less than 1). This means even a viral post from a stranger has to overcome a built-in handicap compared to a post from someone you follow.

Implication: Growing your follower count has compounding benefits beyond vanity. Your posts get an algorithmic boost with your followers.


The ML Model: Grok-Based Transformer

The ranking model is a transformer architecture ported from xAI’s Grok-1 release. Here’s what makes it interesting:

Hash-Based Embeddings

Instead of looking up users and posts in giant embedding tables, the model uses multiple hash functions:

# From recsys_model.py
@dataclass
class HashConfig:
    num_user_hashes: int = 2
    num_item_hashes: int = 2
    num_author_hashes: int = 2

This allows the model to handle any user or post ID without maintaining massive lookup tables.
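The trick can be sketched in a few lines: hash the raw ID with several salted hash functions, index a fixed-size table, and combine the rows. This is an illustration of the general technique, not the model's actual hashing scheme:

```python
import hashlib

import numpy as np

def hashed_embedding(entity_id: str, table: np.ndarray,
                     num_hashes: int = 2) -> np.ndarray:
    """Multi-hash embedding lookup: each salted hash picks a row of a
    fixed-size table, and the rows are summed. Illustrative only."""
    num_buckets = table.shape[0]
    vec = np.zeros(table.shape[1])
    for salt in range(num_hashes):
        digest = hashlib.sha256(f"{salt}:{entity_id}".encode()).hexdigest()
        vec += table[int(digest, 16) % num_buckets]
    return vec
```

Any ID, even one never seen in training, deterministically maps to a vector, which is why no per-user lookup table is needed.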

The Input Structure

The model takes three components:

  1. User embedding: Who is viewing
  2. History embeddings: What they’ve engaged with recently
  3. Candidate embeddings: Posts to be scored
[User] + [History (128 items)] + [Candidates (32 items)]

Candidate Isolation Attention Mask

Here’s the clever part. During attention, candidates can see the user and history, but cannot see each other:

         User  History  Candidates
User     [✓]    [✓]       [✗]
History  [✓]    [✓]       [✗]
Cand_1   [✓]    [✓]       [✓ self only]
Cand_2   [✓]    [✓]       [✓ self only]

This ensures that a post’s score doesn’t depend on which other posts happen to be in the same batch. Scores are consistent and can be cached.
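The mask above is easy to construct explicitly. Here is a sketch of the masking idea in Python (an illustration, not the production implementation):

```python
import numpy as np

def candidate_isolation_mask(num_user: int, num_history: int,
                             num_candidates: int) -> np.ndarray:
    """Boolean attention mask: user + history tokens attend to each other
    fully; each candidate attends to user, history, and itself only."""
    n_ctx = num_user + num_history
    n = n_ctx + num_candidates
    mask = np.zeros((n, n), dtype=bool)
    mask[:, :n_ctx] = True            # every token sees user + history
    for i in range(n_ctx, n):
        mask[i, i] = True             # each candidate also sees itself
    return mask
```

Because no candidate row attends to another candidate column, swapping candidates in and out of a batch cannot change any individual score, which is what makes caching safe.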


Practical Takeaways

Based on the code, here’s what actually moves the needle:

What to Optimize For

  1. Replies: High-weight positive signal. Ask questions. Invite discussion.

  2. Shares via DM: Surprisingly high weight. Create “send this to someone” content.

  3. Profile clicks → Follows: The follow_author_score directly contributes to ranking.

  4. Dwell time: Write substantive content that takes time to read.

  5. Video quality views: Make videos 2+ seconds minimum, hook immediately.

What to Avoid

  1. Blocks: Severe negative weight. Don’t be hostile.

  2. Reports: Severe negative weight. Stay within guidelines.

  3. Mutes: High negative weight. Don’t be spammy.

  4. Excessive posting: The author diversity scorer will penalize your 3rd, 4th, 5th posts.

Content That Triggers Negative Signals

โŒ Avoid ThisWhy It Hurts
Engagement bait (“Like if you agree!”)Triggers “not interested”
Rage bait (intentionally provocative)Triggers blocks and mutes
Spam patterns (same content repeatedly)Triggers mutes
Excessive self-promotionTriggers mutes and unfollows
Misleading headlines (clickbait)Triggers “not interested”
Hostile replies (aggressive arguing)Triggers blocks
Posting 5+ times per dayDiversity penalty + mutes

These behaviors lead to mutes and blocks, which damage your reach severely, often outweighing whatever positive engagement they generate.

Optimal Posting Cadence

Given the ~48-hour post retention window and the author diversity penalty:

| Posts/Day | Recommendation |
| --- | --- |
| 1 | Best reach per post |
| 2 | Good; space 12+ hours apart |
| 3 | Acceptable; space 8+ hours apart |
| 4+ | Diminishing returns |

The Algorithm’s Core Question

All of this complexity reduces to one question the model is trying to answer:

“Will this specific user engage positively with this content?”

It’s not asking what’s objectively “good.” It’s predicting your personal behavior based on your history.

This is why generic growth hacks have diminishing returns. The algorithm is personalized. What works for one audience may not work for another.

The sustainable strategy is straightforward: create content that genuinely resonates with your specific audience, and avoid behaviors that trigger negative signals.


TL;DR Cheat Sheet

✅ Do This

| Action | Impact |
| --- | --- |
| Ask questions (triggers replies) | 🔥🔥🔥 |
| Create shareable content | 🔥🔥🔥 |
| Write longer posts (dwell time) | 🔥🔥 |
| 1-2 quality posts per day | 🔥🔥 |
| Space posts 10-12 hours apart | 🔥🔥 |
| Respond to replies quickly | 🔥🔥 |
| Use multiple images | 🔥 |
| Videos 30s-2min with strong hooks | 🔥 |

โŒ Avoid This

ActionImpact
Getting blocked๐Ÿ’€๐Ÿ’€๐Ÿ’€
Getting reported๐Ÿ’€๐Ÿ’€๐Ÿ’€
Posting 5+ times in 24 hours๐Ÿ’€๐Ÿ’€
Spammy/repetitive content๐Ÿ’€๐Ÿ’€
Engagement bait phrases๐Ÿ’€
Hostile replies๐Ÿ’€

Technical Notes

For those who want to dig deeper:

  • Model architecture: Transformer with special attention masking for candidate isolation
  • Inference: Predictions made per-user, not globally
  • Serving stack: Rust-based candidate pipeline (home-mixer/) calling Python ML models (phoenix/)
  • In-network source: thunder/; Redis-like in-memory post store with Kafka ingestion
  • Framework: JAX + Haiku for the ML components

The exact weight values for each signal are excluded from the open source release (noted as “Excluded from open source release for security reasons” in the code), but the relative importance is clear from the architecture.



The irony isn’t lost on me: I’m using insights about the algorithm to write content optimized for the algorithm. But at least now you know how the game works.


Changelog

  • 2026-01-20: Initial comprehensive analysis of open-sourced X algorithm
  • 2026-01-29: Added frontmatter metadata, minor formatting improvements