v0 is one entity end-to-end on real infrastructure, because we already have the data and a hand-built prototype to beat. Everything after that is widening the registry and compounding the archive. The forcing function is simple: the next big model launch should happen on our site.
Replace the hand-run fable-5 pipeline with the real one: cron Workers, R2 layout, enrichment, schema v2, and a public entity page + frozen launch read. No leaderboard yet; the homepage IS the Fable 5 page.
Widen to ~20 entities across all four vectors, turn on cross-entity mindshare, ship the homepage, and commit to the name. This is the public launch of the property itself.
The differentiating analytics: split every series by author class, let readers compare entities and launches, and add a non-X source so narratives get a second opinion.
Compound what nobody can backfill: the historical record of every launch, and distribution surfaces that feed on the data.
The property's whole reason to exist is being great for 72 hours at a time. This is the operating procedure for each major launch. Most of it is automated; the human steps are bolded.
| When | What happens |
|---|---|
| T−1 day | **Add/verify the entity in the registry** (aliases incl. rumored names, official handles, guard terms). **Flip the launch flag** with a 72h window. Pipeline warms: full-archive bootstrap if new, baseline captured for the z-score. |
| T0 | Launch mode kicks to 30-min cadence automatically. The Launch tab goes live on the entity page: day-0 curves, provisional Launch Score, past-launch overlay. All of this is guaranteed and human-free. **Editorial decides whether the story warrants a read**; if yes, the auto-draft is open within 4 hours. |
| T0 → T+72h | Score subscores update each refresh; change feed narrates the arc (day 1 is novelty, day 2 is pricing, day 3 is verdicts). If a read is live, its data blocks re-hydrate and **editorial does one pass per day**. **Charts posted to the X account** (v3: auto-drafted). |
| T+72h | Launch mode decays to elevated, then baseline. The Launch Score locks (durability subscore lands) and the fingerprint freezes into /launches. If a read exists, **editorial freezes it with the final verdict**. The entity's Now tab keeps rolling on the leaderboard either way. |
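The cadence transitions in the table can be sketched as a pure function of time since T0. This is a minimal illustration, not the state machine specified in 05-rollups-scores.md: the type and function names, the length of the elevated phase, and the elevated and baseline cadences are all assumptions. Only the 72h window and the 30-minute launch cadence come from the procedure above.

```typescript
// Hypothetical sketch. Names, elevatedHours, and the non-launch cadences
// are assumptions; only the 72h window and 30-min cadence are from the table.
type Mode = "baseline" | "launch" | "elevated";

interface LaunchWindow {
  t0: number;            // launch time, ms since epoch
  windowHours: number;   // 72 in the procedure above
  elevatedHours: number; // assumed length of the "elevated" phase after the window
}

const HOUR_MS = 60 * 60 * 1000;

// Resolve the pipeline mode for a given instant.
function modeAt(now: number, w: LaunchWindow): Mode {
  const elapsed = now - w.t0;
  if (elapsed < 0) return "baseline"; // T-1 day: warm-up only, no cadence change
  if (elapsed < w.windowHours * HOUR_MS) return "launch";
  if (elapsed < (w.windowHours + w.elevatedHours) * HOUR_MS) return "elevated";
  return "baseline";
}

// Refresh interval per mode, in minutes. The 120 and 360 values are placeholders.
const CADENCE_MINUTES: Record<Mode, number> = {
  launch: 30,
  elevated: 120,
  baseline: 360,
};

function refreshIntervalMinutes(now: number, w: LaunchWindow): number {
  return CADENCE_MINUTES[modeAt(now, w)];
}
```

Keeping the mode a function of the clock and the registry's launch flag, rather than stored state, means a missed cron tick cannot leave an entity stuck in launch mode.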
On launch day, everyone wants the chart that says "this is going well/badly." OG images per entity render the live mindshare + sentiment card, so sharing the URL shares the chart.
Journalists need a number with a name attached. Downloadable JSON + a stable methodology page make "according to VibeBench (a Sift property)" frictionless.
"Powered by Sift" links to a landing page making the obvious pitch: this is what Sift does for your brand, every day, on every platform. DevRel and comms teams watching their own launch are the warmest possible leads.
| Horizon | Metric | Target |
|---|---|---|
| v0 | Pipeline uptime through one full week + parity with the hand-built fable-5 numbers | 7 days unattended, numbers reconcile |
| v1 launch | Unique visitors in launch week; at least one unsolicited citation | 10k visitors; 1 citation |
| First tracked launch | Traffic during a 72h launch window; screenshots of our charts circulating on X | 50k visitors; visible organic sharing |
| Quarter | Press/newsletter citations; Sift-attributed inbound demos | 5 citations/mo; measurable demo source tag |
| Two quarters | The coined metric escapes the site: "Launch Score" or "the Gap" quoted without us prompting it | 1 unprompted use in coverage or a lab's own comms |
| Line | Monthly | Notes |
|---|---|---|
| X API | $0 (Enterprise) · ~$2.4k equivalent pay-per-use | Full math in Technical; per-entity caps enforced |
| LLM classification | ~$50-150 | ~500k posts/mo through a flash-tier model, batched, cached |
| Cloudflare (Workers, R2, D1, Pages) | ~$20-50 | Static-first design keeps this near the floor |
| Domain + misc | ~$10 | |
| Total | < $250/mo on our contracts | The expensive ingredient is editorial attention on launch days, by design |
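As a quick check on the total row, the per-line ranges can be summed directly. The figures below are copied from the table, with the X API at $0 under the Enterprise contract; the type and variable names are illustrative only.

```typescript
// Monthly cost ranges in USD, copied from the table above.
interface CostLine {
  name: string;
  low: number;
  high: number;
}

const lines: CostLine[] = [
  { name: "X API (Enterprise)", low: 0, high: 0 },
  { name: "LLM classification", low: 50, high: 150 },
  { name: "Cloudflare", low: 20, high: 50 },
  { name: "Domain + misc", low: 10, high: 10 },
];

// Sum the low and high ends of every line independently.
function totalRange(items: CostLine[]): { low: number; high: number } {
  return items.reduce(
    (acc, l) => ({ low: acc.low + l.low, high: acc.high + l.high }),
    { low: 0, high: 0 },
  );
}

const total = totalRange(lines); // { low: 80, high: 210 }
const BUDGET_CEILING = 250;      // the "< $250/mo" figure from the total row
const withinBudget = total.high < BUDGET_CEILING; // true
```

The range comes out at $80 to $210 per month, which is where the "< $250/mo" ceiling comes from. Substituting the ~$2.4k pay-per-use equivalent for the X API line would move the range to roughly $2,480 to $2,610, so the total depends almost entirely on the contract.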
Leaning: VibeBench, with ModelPulse as the safe fallback. Candidates and rationale are in Product. Needs a call and a domain purchase before v1.
The pipeline and site stay private: anti-gaming rules don't survive publication, and the collector must be swappable to cheaper transports without the swap being visible. The entity registry ships as a small public repo for community alias/entity PRs, and the methodology page carries the full audit surface. Details in Technical.
Leaning: start on Enterprise (paid for, official), shadow-test the cheap collector on 2-3 entities during v1, and swap content pulls once parity clears ~95%. Counts can stay on the official API as the reference series. The Collector contract makes the swap a config change, invisible externally.
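One way the shadow test could be framed, as a sketch only: the real Collector interface lives in 03-collector.md, so the `Collector` shape, the `pullPostIds` method, and the parity definition here (the share of reference post IDs that the candidate also returns) are assumptions made for illustration. Only the ~95% threshold comes from the text.

```typescript
// Hypothetical Collector contract; the real one is specified in 03-collector.md.
interface Collector {
  name: string;
  pullPostIds(entityId: string, sinceMs: number, untilMs: number): Promise<string[]>;
}

// Assumed parity definition: fraction of unique reference post IDs
// that the candidate collector also returned.
function parity(reference: string[], candidate: string[]): number {
  const refUnique = Array.from(new Set(reference));
  if (refUnique.length === 0) return 1; // nothing to miss
  const seen = new Set(candidate);
  const hits = refUnique.filter((id) => seen.has(id)).length;
  return hits / refUnique.length;
}

const PARITY_THRESHOLD = 0.95; // the ~95% bar from the text

// Run both collectors over the same window for each shadow entity and
// report whether every entity clears the threshold.
async function shadowTest(
  reference: Collector,
  candidate: Collector,
  entityIds: string[],
  sinceMs: number,
  untilMs: number,
): Promise<boolean> {
  for (const entityId of entityIds) {
    const [ref, cand] = await Promise.all([
      reference.pullPostIds(entityId, sinceMs, untilMs),
      candidate.pullPostIds(entityId, sinceMs, untilMs),
    ]);
    if (parity(ref, cand) < PARITY_THRESHOLD) return false;
  }
  return true;
}
```

Because both collectors sit behind the same interface, promoting the candidate is a matter of changing which implementation the config names, which is what keeps the swap invisible externally.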
Leaning: reads ship only when the story warrants one, so the question shrinks to "who makes that call, and who does the daily pass during a launch." The Launch tab + score are guaranteed without anyone; a named owner per launch covers the rest.
~20 slots across the four vectors. Draft list exists in the registry section of Product; needs a final pass for "will anyone care" and alias/guard quality before bootstrap pulls.
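Part of the alias quality pass could be automated before bootstrap pulls. The sketch below assumes a registry entry shape based on the fields named in the launch procedure (aliases, official handles, guard terms, launch flag); the field names, the `findAliasCollisions` helper, and the idea of flagging aliases shared by more than one entity are illustrative assumptions, not the schema in 02-data-contracts.md.

```typescript
// Hypothetical registry entry; the exact type is defined in 02-data-contracts.md.
interface RegistryEntry {
  id: string;
  vector: string;            // one of the four vectors; names not assumed here
  aliases: string[];         // includes rumored names
  officialHandles: string[];
  guardTerms: string[];      // carried through, not interpreted by this sketch
  launch?: { t0: string; windowHours: number }; // launch flag, e.g. a 72h window
}

// Return every alias (case-insensitive, trimmed) claimed by more than one
// entity, mapped to the ids of the entities that claim it.
function findAliasCollisions(registry: RegistryEntry[]): Map<string, string[]> {
  const owners = new Map<string, string[]>();
  for (const entry of registry) {
    const seen = new Set<string>(); // ignore repeats within one entry
    for (const alias of entry.aliases) {
      const key = alias.trim().toLowerCase();
      if (key === "" || seen.has(key)) continue;
      seen.add(key);
      const list = owners.get(key) ?? [];
      list.push(entry.id);
      owners.set(key, list);
    }
  }
  const collisions = new Map<string, string[]>();
  owners.forEach((ids, alias) => {
    if (ids.length > 1) collisions.set(alias, ids);
  });
  return collisions;
}
```

A non-empty result would mean two entities compete for the same posts, which is worth resolving by hand before any archive pull is paid for.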
Leaning: artificialanalysis index with attribution (reach out; the chart is free marketing for them too). LMArena Elo as the fallback; sentiment-only ranking if neither is usable. Also: builder-panel curation needs an initial 300-account pass and a public inclusion rule.
Everything a coding agent needs to build v0 → v1 without re-deriving a single decision lives in build/: the master brief with build order and gates, then one spec per subsystem, plus real-data fixtures and the registry seed.
| File | Contents |
|---|---|
| README.md | Agent brief: constraints (privacy boundary, leak rule, budget guards), 9-step build order with gates, v0/v1 scope, definition of done |
| 01-repo-scaffold.md | Workspace layout, Cloudflare resources (R2/Queues/D1/crons), D1 schema, conventions |
| 02-data-contracts.md | Registry, NormalizedPost, Classification, and all five public artifacts as exact types; the leak rule as a test |
| 03-collector.md | Collector interface, X API endpoints/params/limits, pull recipe, hard budget guards, shadow-swap harness |
| 04-enrichment.md | Theme vocabulary with definitions, the actual classification/author/summary prompts, validator rules, spam heuristics, amplification detection |
| 05-rollups-scores.md | Window rollups, vibe-score.v1 and launch-score.v1 as exact math, change-event triggers, state machine, atomic publish |
| 06-site.md | Astro pages/components mapped to the mockups, hydration + polling model, embed strategy + fallback, OG cards, quality bar |
| 07-acceptance.md | CI gates, fixture reconciliation tolerances, mockup-parity check, the 7-day live soak |
| fixtures/ · registry-seed.json | Real June 9 Fable 5 capture (trimmed, private-shape) + 21-entity registry seed (placeholder names flagged for verification) |