# Lighthouse Watch — Product Spec v1.0

**Status.** Locked May 3, 2026. Replaces and supersedes the prior one-line "Observatory Retainer · $28k/mo" mention in studio.html / lighthouse.html.

**One-line.** The continuous post-Sprint service that keeps the agent system you shipped stable, current, and improving — month after month, while the foundation layer underneath it churns.

---

## Why Watch exists

Two converging market realities forced the move from "optional retainer" to "default next step":

1. **Foundation-layer churn.** New models drop every 8–12 weeks. Each release forces enterprise teams to decide: stay, swap, run side-by-side, or delete. Without a partner running that decision tree, the system Studio shipped becomes brittle within a quarter.
2. **Pattern library drift.** Every Sprint extracts patterns into the `@umbra/patterns` library. Watch is how those patterns flow back into the client's running system. Without Watch, the client gets the v1 of every pattern and never benefits from the improvements that subsequent engagements produce.

The strategic claim (delivered as the Sprint→Watch handoff): **"We don't sell you an agent. We sell you a system that gets better while it runs."**

---

## What Watch covers (Standard tier · default)

Every month, Watch delivers four guaranteed work streams plus one improvement:

### Stream A · Foundation maintenance
- Model swap testing across all agents in scope. New release of Claude / GPT / Gemini → side-by-side eval run within 7 business days.
- Eval suite re-run on the full agent fleet. Drift threshold breach → ticket opened automatically.
- Prompt-drift detection — outputs scored monthly against the original baseline. Regressions investigated before the client sees them.
- Provider deprecation tracking. We give you a clean migration window before deprecation date, not after.

### Stream B · Governance + observability
- Governance Runbook revision. New incident, new policy, new regulatory ask → the runbook updates and your reviewers re-read.
- Acceptance-rate monitoring on every gated agent (target ≥80%). Below threshold → root-cause review with the reviewers.
- Reviewer UI feedback loop. We watch what your humans flag and feed it back into the agent's eval set.
- Quarterly architecture review (every 3rd month).

### Stream C · The +1 improvement (the differentiator)
**One delivered improvement per month.** Either:
- A new pattern from the Studio library deployed to your system (e.g., a stronger gating primitive), OR
- A workflow optimization that reduces a measured friction (e.g., cutting one approval step, automating one exception path).

Improvement is scoped jointly in the previous month's review. Delivered against an eval threshold. If the +1 fails its eval, it doesn't ship and we owe you the improvement next month + a call.

### Stream D · Monthly health report (the deliverable that lives in the inbox)
A 3–4 page document delivered on the 1st of every month. Signed by the lead engineer. Format spec in `04-monthly-report-template.md`.

Includes:
- Model status (versions in production, last evaluation, recommended action)
- Acceptance-rate trend (per agent, last 90 days)
- Eval pass-rate trend
- Incidents handled this month (with link to runbook update if applicable)
- The +1 improvement delivered (one paragraph + before/after metric)
- Pattern library updates available + recommendation
- Next month preview (the +1 on deck, any model migrations expected)

### Always available (not metered)
- Slack channel with response SLA: same-day for critical, next-business-day for standard.
- Direct line to the lead engineer who shipped the Sprint (not a support tier).

---

## Pricing — three tiers

All tiers are monthly, billed in advance, with the term and renewal terms specified in the service agreement. Pricing applies to a single agent system / Sprint output. Multi-system clients quote bespoke.

| Tier | Monthly | Min term | Includes |
|---|---|---|---|
| **Watch · Essentials** | $14k | 6 months | Streams A + B + D. No +1. For clients who want maintenance only. |
| **Watch · Standard** ★ | $24k | 12 months | A + B + C + D. The default offer. Includes the +1 improvement. |
| **Watch · Pro** | $38k | 12 months | Standard + 4 hrs/mo named-engineer advisory + priority response (4-hour SLA on critical) + bring-your-own-emergency hours bank (8 hrs/yr). |

**Default proposal at Sprint closeout: Watch · Standard, 12-month term.** The Essentials tier is offered only if the client pushes back on price or hasn't yet adopted enough of the system to justify the +1 cadence.

★ The previous "Observatory Retainer · $28k/mo" mention sat conceptually between Standard and Pro. Watch · Standard at $24k is **lower** than the prior implied price. The reason: productizing the offer reduces our delivery cost (templates, automation, fixed cadence) and lets us trade price for term commitment (12-month minimum). LTV per client improves even at the lower headline number.

### Bundle option (commercially aggressive)

For clients who prefer one purchase order:
- **Sprint + Watch · Standard 12-mo bundle** — $290k flat (vs. $356k separately at the lower Sprint price band of $68k + 12 × $24k).
- Discount is paid for by the term commitment. Bundle is non-cancellable for the first 6 months.

---

## What Watch is *not*

Important to spell this out, because the absence of these scopes is what keeps the price honest:

- **Not new agent development.** Building a brand-new agent or migrating a new workflow is a fresh Sprint, not a Watch deliverable. The +1 improvement extends what already exists; it doesn't add net-new scope.
- **Not unlimited advisory.** The Pro tier caps advisory at 4 hrs/mo for a reason. Beyond that we propose a Beacon Workshop or a follow-on Sprint.
- **Not breakdown insurance.** If the client's data infrastructure breaks (CDP, warehouse, unified ID layer), that's outside the Watch perimeter. We diagnose and refer; we don't fix data plumbing as a Watch deliverable.
- **Not a managed service.** We don't operate the agent for the client. The client's reviewers operate it; we keep the agent and the governance current.

The cleanest way to say it to a prospect: *"Watch keeps the system you shipped stable, current, and improving. It doesn't replace your operators, and it doesn't add new systems."*

---

## Cadence — what a typical month looks like

| Week | Activity |
|---|---|
| **Week 1** | Health report delivered. Standing review call (45 min, optional but encouraged). Last month's +1 confirmed delivered. This month's +1 confirmed in scope. |
| **Week 2** | +1 improvement build. Eval suite expanded to cover the +1. |
| **Week 3** | +1 improvement ships behind gate. Reviewer feedback loop opens. Foundation-layer scan (any new model releases since last cycle?). |
| **Week 4** | +1 improvement promoted (if eval passes). Governance runbook updated if applicable. Health report draft prepared for next month-1. |

Quarterly addition (every 3rd month): full architecture review session — 90-min call with the client's CMO/CTO sponsor + Studio's lead engineer. Output: written recommendation if any structural change is warranted.

Annual addition (every 12th month): renewal conversation. Either roll into a new 12-month term (default), step down to Essentials, step up to Pro, or scope a new Sprint.

---

## SLA matrix

| Severity | Definition | Response | Resolution target |
|---|---|---|---|
| **P0 · Critical** | System down, evals failing across the board, governance breach | 4 hrs (Pro) / 1 business day (Standard) / 1 business day (Essentials) | 24 hrs |
| **P1 · High** | Single agent failing, acceptance rate dropping, model deprecation imminent | 1 business day | 5 business days |
| **P2 · Standard** | Drift detected, runbook update needed, +1 scoping | Next monthly review | Within current cycle |
| **P3 · Backlog** | Nice-to-haves, future tier upgrades | Quarterly review | Quarterly review |

**Escalation path.** Slack channel → lead engineer → Studio principal. Pro tier adds named engineer with direct phone.

---

## Conversion mechanics — the Sprint→Watch handoff

Watch is positioned as the **default next step** at every Sprint closeout. The Sprint contract includes the proposal date for Watch (Week 9, the second-to-last Sprint week).

Three artifacts make the conversion mechanical, not awkward:

1. **The Watch proposal email** — sent in Week 8 of the Sprint. (`05-upsell-email.md`)
2. **The Watch sales sheet** — attached to the email. Single page, the value prop, the three tiers, the bundle option. (`/lighthouse-kit/lighthouse-watch-sales-sheet.html`)
3. **The closeout meeting talk track** — used in the Sprint Week 10 final review. The lead engineer walks through what's already true and what Watch keeps true. (`05-upsell-email.md` final section)

**Conversion target: 70% of Sprint clients sign Watch · Standard within 30 days of Sprint closeout.** Tracked from engagement #1 forward.

If a client declines: Essentials offered with no commitment beyond 6 months. If still no, the Sprint contract closes clean — but we mark the account in the CRM for re-pitch at the 90-day check-in (which is contractual anyway).

---

## How Watch feeds the Studio flywheel

Every Watch engagement is a research instrument as well as a service:

- **Pattern library inputs.** Every +1 improvement delivered to a client surfaces a candidate pattern for the library. Three +1's that look like the same pattern → that pattern moves from candidate to canonical.
- **Eval set inputs.** The reviewer feedback loop produces real-world failure modes that flow back into the standard eval suite for that agent type. Future Sprints inherit the harder eval set.
- **Sales evidence.** "Our clients average 24 months on Watch" is a pitch line that compounds over time. The longer a client stays, the better the proof.

This is also why the +1 improvement has to be eval-gated — we don't pollute the library with patterns that didn't actually work in production.

---

## Internal economics (sanity check)

Per Watch · Standard engagement, monthly:

- Lead engineer: 8–10 hours/mo (review + +1 + escalations)
- Eval/ops automation: ~$200/mo (model API calls for scheduled runs + observability)
- Total marginal cost: ~$3,500/mo at fully-loaded engineer cost
- Gross margin: ~85% at $24k, ~88% at $38k (Pro)

The Pro tier's added cost is the named engineer commitment — not pure margin. Capacity check: a single engineer can carry 3–4 Standard or 2 Pro Watch clients before quality slips. This bounds Studio's Watch revenue ceiling per engineer at roughly $96k/mo (4 × $24k) or $76k/mo (2 × $38k).

---

## Open questions for v1.1

- Should Pro tier include any pattern-library *contribution* royalty? (i.e., when a Pro client's +1 graduates into the canonical library, the client gets a referenceable case study + a pricing credit)
- Annual subscription option? (12 × monthly bundled into a single PO, prepay discount of ~5%)
- Should Watch have a public landing page (`/watch`) or live as a sub-section of `/studio` and `/lighthouse`? Default decision: sub-section for now, dedicated page after engagement #3.
- Multi-system pricing — does a client with 2 Sprint outputs get Watch at 1.5×, 2×, or a custom number? Default: 1.7× (acknowledges shared overhead, doesn't double-charge).

---

— v1.0 · 2026-05-03 · Locked. Update only via the phase-close protocol on the umbra-group skill.