Rollout · phases & playbook · June 2026

Ship the Fable 5 page first. Earn the leaderboard. Then own launch days.

v0 is one entity end-to-end on real infrastructure, because we already have the data and a hand-built prototype to beat. Everything after that is widening the registry and compounding the archive. The forcing function is simple: the next big model launch should happen on our site.

v0 in ~2 weeks · leaderboard at ~20 entities · each phase independently shippable

Phases

v0 · wk 1-2

One entity, end to end

Replace the hand-run fable-5 pipeline with the real one: cron Workers, R2 layout, enrichment, schema v2, and a public entity page + frozen launch read. No leaderboard yet; the homepage IS the Fable 5 page.

Ships
  • Repo scaffold (private), wrangler envs, S3-API storage layer
  • Collector: counts hourly + content pulls, registry-driven
  • Enrichment: sentiment, themes, capabilities + tags, author class + affiliation (owned vs organic in the model from day one), caching in D1
  • summary.json v2 public/private split, atomic publish
  • Entity page on the placeholder domain, X embeds + fallbacks, incl. the what-people-say module (consensus summary + aspect chips + drill-in)
  • Fable 5 launch fingerprint: T0-aligned curve + provisional Launch Score, first row of launches.json
  • Fable 5 launch read regenerated from pipeline data, frozen
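
The public/private split and atomic publish items above can be sketched roughly as follows. This is an illustration under stated assumptions, not the pipeline's actual code: the field names (`rawPosts`, `authorIds`), the `ObjectStore` interface, and the key layout are all hypothetical, and the real shapes belong to the data contracts.

```typescript
// Hypothetical private summary shape (assumption; the real one is in the data contracts).
interface PrivateSummary {
  entity: string;
  window: string;
  mentions: number;
  sentiment: number;
  rawPosts: string[];   // assumed private-only field
  authorIds: string[];  // assumed private-only field
}

type PublicSummary = Pick<PrivateSummary, "entity" | "window" | "mentions" | "sentiment">;

// Build the public object from an allow-list, so a newly added private field
// cannot reach the public artifact by default.
function toPublic(s: PrivateSummary): PublicSummary {
  return { entity: s.entity, window: s.window, mentions: s.mentions, sentiment: s.sentiment };
}

// Minimal storage interface (assumption); an S3-style bucket could sit behind it.
interface ObjectStore {
  put(key: string, body: string): Promise<void>;
  get(key: string): Promise<string | null>;
}

// In-memory stand-in so the sketch runs without any cloud resources.
class MemoryStore implements ObjectStore {
  private data = new Map<string, string>();
  async put(key: string, body: string): Promise<void> {
    this.data.set(key, body);
  }
  async get(key: string): Promise<string | null> {
    const v = this.data.get(key);
    return v === undefined ? null : v;
  }
}

// One way to get an atomic publish: write the full versioned object first,
// then flip a small pointer last. Readers resolve the pointer, so they never
// observe a half-written summary.
async function publish(store: ObjectStore, s: PrivateSummary, version: string): Promise<string> {
  const key = `public/${s.entity}/summary.${version}.json`;
  await store.put(key, JSON.stringify(toPublic(s)));
  await store.put(`public/${s.entity}/latest`, key);
  return key;
}
```

The allow-list direction matters more than the storage details: stripping known-private fields would fail open when a new field is added, whereas copying known-public fields fails closed.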

v1 · wk 3-5

The leaderboard and the brand

Widen to ~20 entities across all four vectors, turn on cross-entity mindshare, ship the homepage, and commit to the name. This is the public launch of the property itself.

Ships
  • Registry at ~20 entities: frontier, open-weight, labs, products; full-archive bootstrap each
  • index.json + leaderboard with filter tabs and window switcher
  • Launch-mode state machine (manual flag + z-score trigger)
  • Theme intelligence + theme pages: facets, trajectories, voiced-by, field benchmark (needs the multi-entity registry), verdict lines, /themes/<topic> URLs with OG cards
  • Crowd scorecard: the fixed 8-dimension rubric (coding, art, depth, speed, price…) on every entity page
  • Tag index + search: tags.json, typeahead, /good-at/<tag> pages ("this model is good at openclaw")
  • Deep threads: conversation grouping, thread scores, community-root default surface with owned rail + relationship split stat
  • Community voices: voice score (standing, not followers), community-default voices tabs, rising badges
  • Vibe Score v1: standing 0-100 favorability (crowd-only formula), leaderboard Score column + entity hero
  • Launch Score v1 + /launches archive: formula published, every tracked launch scored from here on
  • Change feed: events.json, homepage ticker, entity timeline
  • Name + domain live, "by Sift" attribution, OG cards per entity
  • Methodology page with public query strings, score formula + downloadable JSON (promises outputs, never transport)
  • Public registry repo: entities, aliases, builder panel, open for PRs
  • "How we're open" page: open data/methodology/registry + score-formula packages, closed machine, and why
  • Collector shadow test: cheap collector vs official on 2-3 entities, parity report
  • Reads feed + the editorial draft prompt in-repo

v2 · wk 6-10

Official vs community, compare, second source

The differentiating analytics: split every series by author class, let readers compare entities and launches, and add a non-X source so narratives get a second opinion.

Ships
  • Owned-vs-organic toggle on every entity-page panel (the data model lands in v0; this is the full display split)
  • Builder sentiment series: curated ~300-account panel, own line on every chart; Vibe Score v2 adds the builder double-weighting
  • The Gap: standing benchmarks-vs-vibes scatter on the homepage
  • Cross-model theme pages: hot-theme chips open "who gets roasted on pricing" across the field
  • Compare view: 2 entities side by side; launch-week overlay (day 0 aligned)
  • News ingestion (RSS/licensed sources only, no Reddit) as a second series
  • Amplification events surfaced (RT chains as their own metric)
  • Weekly digest read auto-drafted (the recurring editorial beat)
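
The day-0-aligned launch overlay in the compare view amounts to re-expressing each launch's series as time since its own T0. A minimal sketch, assuming hourly buckets and a 72-hour window; the `Point` and `AlignedPoint` types and the function name are hypothetical:

```typescript
// One observation of a series; `t` is epoch milliseconds.
interface Point {
  t: number;
  value: number;
}

// The same observation expressed as whole hours since that launch's T0.
interface AlignedPoint {
  hoursFromT0: number;
  value: number;
}

const HOUR_MS = 60 * 60 * 1000;

// Shift a series so T0 becomes hour 0, and keep only the launch window.
// Points before T0 or past the window are dropped.
function alignToT0(series: Point[], t0: number, windowHours: number = 72): AlignedPoint[] {
  return series
    .map((p) => ({ hoursFromT0: Math.floor((p.t - t0) / HOUR_MS), value: p.value }))
    .filter((p) => p.hoursFromT0 >= 0 && p.hoursFromT0 < windowHours);
}
```

Once two launches are aligned this way they share an x-axis, so an overlay is a matter of plotting both arrays against `hoursFromT0`.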

v3 · later

The moat: archive + distribution

Compound what nobody can backfill: the historical record of every launch, and distribution surfaces that feed on the data.

Candidates, not commitments
  • Launch archive: every frozen read + its data, browsable timeline
  • X account auto-drafting chart posts for human approval
  • Embeddable widgets ("mindshare badge") for blogs/press
  • Registry repo matured: contribution guide, community-proposed entities graduating to tracked
  • More platforms if licensing allows; never Reddit until it does

Launch-day playbook

The property's whole reason to exist is being great for 72 hours at a time. This is the operating procedure per major launch; most of it is automated, and the human steps are bolded.

When | What happens
T−1 day | Add/verify the entity in the registry (aliases incl. rumored names, official handles, guard terms). Flip the launch flag with a 72h window. Pipeline warms: full-archive bootstrap if new, baseline captured for the z-score.
T0 | Launch mode kicks to 30-min cadence automatically. The Launch tab goes live on the entity page: day-0 curves, provisional Launch Score, past-launch overlay. All of this is guaranteed and human-free. Editorial decides whether the story warrants a read; if yes, the auto-draft is open within 4 hours.
T0 → T+72h | Score subscores update each refresh; change feed narrates the arc (day 1 is novelty, day 2 is pricing, day 3 is verdicts). If a read is live, its data blocks re-hydrate and editorial does one pass per day. Charts posted to the X account (v3: auto-drafted).
T+72h | Launch mode decays to elevated, then baseline. The Launch Score locks (durability subscore lands) and the fingerprint freezes into /launches. If a read exists, editorial freezes it with the final verdict. The entity's Now tab keeps rolling on the leaderboard either way.

Surprise launches: the z-score trigger catches them within ~2 hours and pages the editorial channel (Slack webhook). The cost of being 2 hours late to a surprise is low; the cost of polling everything at launch cadence forever is not.
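
The launch-mode lifecycle described in this playbook (baseline, launch, elevated, back to baseline, entered by a manual flag or a z-score trigger) can be sketched as a small pure function. The trigger threshold and the length of the elevated phase below are assumptions for illustration; only the 72-hour launch window comes from the playbook.

```typescript
type Mode = "baseline" | "launch" | "elevated";

// Baseline statistics for an entity's hourly counts, captured before launch.
interface Baseline {
  mean: number;
  std: number;
}

// Standard z-score of the latest count against the baseline.
// A zero-variance baseline yields 0 rather than dividing by zero.
function zScore(count: number, b: Baseline): number {
  return b.std > 0 ? (count - b.mean) / b.std : 0;
}

const Z_TRIGGER = 3;       // assumed threshold, not from the spec
const LAUNCH_HOURS = 72;   // launch window from the playbook
const ELEVATED_HOURS = 48; // assumed length of the decay phase

// Decide the next mode from the current mode, how long it has been held,
// the manual launch flag, and the latest z-score.
function nextMode(mode: Mode, hoursInMode: number, manualFlag: boolean, z: number): Mode {
  switch (mode) {
    case "baseline":
      return manualFlag || z >= Z_TRIGGER ? "launch" : "baseline";
    case "launch":
      return hoursInMode >= LAUNCH_HOURS ? "elevated" : "launch";
    case "elevated":
      // No re-trigger from here in this sketch: counts are still high
      // relative to the pre-launch baseline, so z would fire immediately.
      return hoursInMode >= ELEVATED_HOURS ? "baseline" : "elevated";
  }
}
```

Keeping the transition a pure function of its inputs makes the state machine easy to replay against recorded counts, which is one way to tune the threshold before trusting it to page anyone.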

Distribution: how it gets seen

Launch-day virality

Be the screenshot

On launch day, everyone wants the chart that says "this is going well/badly." OG images per entity render the live mindshare + sentiment card, so sharing the URL shares the chart.

Citations

Be the source

Journalists need a number with a name attached. Downloadable JSON + a stable methodology page make "according to VibeBench (a Sift property)" frictionless.

The funnel

Be the demo

"Powered by Sift" links to a landing page making the obvious pitch: this is what Sift does for your brand, every day, on every platform. DevRel and comms teams watching their own launch are the warmest possible leads.

Success metrics

Horizon | Metric | Target
v0 | Pipeline uptime through one full week + parity with the hand-built fable-5 numbers | 7 days unattended, numbers reconcile
v1 launch | Unique visitors in launch week; at least one unsolicited citation | 10k visitors; 1 citation
First tracked launch | Traffic during a 72h launch window; screenshots of our charts circulating on X | 50k visitors; visible organic sharing
Quarter | Press/newsletter citations; Sift-attributed inbound demos | 5 citations/mo; measurable demo source tag
Two quarters | The coined metric escapes the site: "Launch Score" or "the Gap" quoted without us prompting it | 1 unprompted use in coverage or a lab's own comms

Cost to operate

Line | Monthly | Notes
X API | $0 (Enterprise) · ~$2.4k equivalent pay-per-use | Full math in Technical; per-entity caps enforced
LLM classification | ~$50-150 | ~500k posts/mo through a flash-tier model, batched, cached
Cloudflare (Workers, R2, D1, Pages) | ~$20-50 | Static-first design keeps this near the floor
Domain + misc | ~$10 |
Total | < $250/mo on our contracts | The expensive ingredient is editorial attention on launch days, by design
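
As a quick check on the total, the line items can be summed as low/high ranges. The figures are the ones in the table; the X API line is taken at $0 on the Enterprise contract, and the ~$2.4k pay-per-use equivalent is deliberately left out. The type and function names are illustrative only.

```typescript
// A monthly cost line expressed as a low/high range in USD.
interface CostLine {
  name: string;
  low: number;
  high: number;
}

const costLines: CostLine[] = [
  { name: "X API (Enterprise contract)", low: 0, high: 0 },
  { name: "LLM classification", low: 50, high: 150 },
  { name: "Cloudflare (Workers, R2, D1, Pages)", low: 20, high: 50 },
  { name: "Domain + misc", low: 10, high: 10 },
];

// Sum the low ends and the high ends independently.
function totalRange(lines: CostLine[]): { low: number; high: number } {
  return lines.reduce(
    (acc, l) => ({ low: acc.low + l.low, high: acc.high + l.high }),
    { low: 0, high: 0 }
  );
}

const monthly = totalRange(costLines);
console.log(`$${monthly.low}-${monthly.high}/mo`); // prints "$80-210/mo"
```

Even the high end of every line together stays under the $250/mo ceiling, with about $40 of headroom.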

Open questions

1 · Name and domain

Leaning: VibeBench, with ModelPulse as the safe fallback. Candidates and rationale in Product. Needs a call + domain purchase before v1.

2 · Open source or not · decided: private machine, public registry

The pipeline and site stay private: anti-gaming rules don't survive publication, and the collector must be swappable to cheaper transports without the swap being visible. The entity registry ships as a small public repo for community alias/entity PRs, and the methodology page carries the full audit surface. Details in Technical.

3 · Collection transport over time

Leaning: start on Enterprise (paid for, official), shadow-test the cheap collector on 2-3 entities during v1, swap content pulls when parity clears ~95%. Counts can stay official as the reference series. The Collector contract makes the swap a config change, invisible externally.
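
One way to make the ~95% parity gate concrete is sketched below. The plan does not define parity, so this assumes it means the share of post IDs from the official collector that the cheap collector also captured; the function names and the per-entity gating are likewise assumptions.

```typescript
const PARITY_THRESHOLD = 0.95; // the ~95% bar from the plan

// Parity as recall against the official series: of the distinct post IDs the
// official collector saw, what fraction did the candidate collector also see?
function parity(officialIds: string[], candidateIds: string[]): number {
  const official = Array.from(new Set(officialIds));
  if (official.length === 0) return 1; // nothing to miss
  const candidate = new Set(candidateIds);
  const hits = official.filter((id) => candidate.has(id)).length;
  return hits / official.length;
}

// Gate the swap on every shadow-tested entity clearing the bar,
// rather than on an average that could hide one bad entity.
function canSwap(parities: number[]): boolean {
  return parities.length > 0 && parities.every((p) => p >= PARITY_THRESHOLD);
}
```

A recall-style measure ignores extra posts the cheap collector finds that the official one does not; if those matter, a parity report would need a second number for them.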

4 · Who edits the reads

Leaning: reads ship only when the story warrants one, so the question shrinks to "who makes that call and does the daily pass during a launch." The Launch tab + score are guaranteed without anyone; a named owner per launch covers the rest.

5 · v1 entity list

~20 slots across the four vectors. Draft list exists in the registry section of Product; needs a final pass for "will anyone care" and alias/guard quality before bootstrap pulls.
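
The alias/guard pass could be partly mechanized before bootstrap pulls. The sketch below is hypothetical throughout: the `RegistryEntry` shape is inferred from the fields the playbook mentions (aliases, official handles, guard terms), and the lint rules, including the five-character cutoff for a "short" alias, are assumptions rather than the registry's actual inclusion rules.

```typescript
type Vector = "frontier" | "open-weight" | "labs" | "products";

// Hypothetical registry entry; the real contract lives in the build pack.
interface RegistryEntry {
  id: string;
  vector: Vector;
  aliases: string[];         // including rumored names
  officialHandles: string[];
  guardTerms: string[];      // assumed: extra terms used to disambiguate matches
}

// Return human-readable problems; an empty array means the registry passed.
function lintRegistry(entries: RegistryEntry[]): string[] {
  const problems: string[] = [];
  const owner = new Map<string, string>(); // lowercased alias -> entity id
  for (const e of entries) {
    if (e.aliases.length === 0) problems.push(`${e.id}: no aliases`);
    for (const raw of e.aliases) {
      const alias = raw.toLowerCase();
      const prev = owner.get(alias);
      if (prev !== undefined && prev !== e.id) {
        problems.push(`${e.id}: alias "${raw}" is also claimed by ${prev}`);
      }
      owner.set(alias, e.id);
      // Assumed rule: a short single-word alias is likely an ordinary word,
      // so it should carry at least one guard term.
      const singleWord = alias.indexOf(" ") === -1;
      if (singleWord && alias.length <= 5 && e.guardTerms.length === 0) {
        problems.push(`${e.id}: short alias "${raw}" has no guard terms`);
      }
    }
  }
  return problems;
}
```

A check like this only catches structural problems; the "will anyone care" judgment still needs a human pass.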

6 · The Gap's capability axis

Leaning: artificialanalysis index with attribution (reach out; the chart is free marketing for them too). LMArena Elo as fallback; sentiment-only ranking if neither is usable. Also: builder-panel curation needs an initial 300-account pass and a public inclusion rule.

Execute: the agent build pack

Everything a coding agent needs to build v0 → v1 without re-deriving a single decision lives in build/: the master brief with build order and gates, then one spec per subsystem, plus real-data fixtures and the registry seed.

File | Contents
README.md | Agent brief: constraints (privacy boundary, leak rule, budget guards), 9-step build order with gates, v0/v1 scope, definition of done
01-repo-scaffold.md | Workspace layout, Cloudflare resources (R2/Queues/D1/crons), D1 schema, conventions
02-data-contracts.md | Registry, NormalizedPost, Classification, and all five public artifacts as exact types; the leak rule as a test
03-collector.md | Collector interface, X API endpoints/params/limits, pull recipe, hard budget guards, shadow-swap harness
04-enrichment.md | Theme vocabulary with definitions, the actual classification/author/summary prompts, validator rules, spam heuristics, amplification detection
05-rollups-scores.md | Window rollups, vibe-score.v1 and launch-score.v1 as exact math, change-event triggers, state machine, atomic publish
06-site.md | Astro pages/components mapped to the mockups, hydration + polling model, embed strategy + fallback, OG cards, quality bar
07-acceptance.md | CI gates, fixture reconciliation tolerances, mockup-parity check, the 7-day live soak
fixtures/ · registry-seed.json | Real June 9 Fable 5 capture (trimmed, private-shape) + 21-entity registry seed (placeholder names flagged for verification)