Studio Operating Framework.
The machine that runs the Studio. This is the master document for how Umbra Studio itself operates — the tools we use to build agent systems, the infrastructure that runs them, the hardware we run it on, the security posture around client data, and the internal discipline that makes the Lighthouse Sprint reproducible.
Paired with five operational sub-documents — Agent Runtime, Collaboration Stack, Studio Internals, Hardware & Workspace, Security & Secrets. Together they describe everything it takes to deliver a Lighthouse Sprint without reinventing the wheel. Read before the first sprint; reviewed quarterly.
How to use this kit.
The Studio Ops Kit is the mirror image of the Lighthouse Kit. The Lighthouse Kit is what we sell — the playbooks a Sprint Lead executes inside a client engagement. The Studio Ops Kit is what makes selling that possible — the tools, infrastructure, hardware, and internal discipline that let the Studio deliver a Lighthouse Sprint reliably, profitably, and without the operating model being a bottleneck.
This master document is the map. Read it first. It establishes the operating principles, the full stack map, the budget envelope, the governance we apply to ourselves, and which of the five sub-documents to reach for when a specific question comes up.
If a single person is running the Studio, this kit is the operating manual for that person. If the Studio scales past one, this kit is what onboards the second, third, and fourth person without everything that one person knows having to be re-learned by tribal osmosis.
Who reads what
| Role | Must read | Should read | Reference when needed |
|---|---|---|---|
| Founder / Principal | All six | All six | — |
| Sprint Lead | Master, Collaboration, Internals | Agent Runtime, Security | Hardware |
| Agent Engineer | Master, Agent Runtime, Security | Hardware | Collaboration, Internals |
| Governance Architect | Master, Agent Runtime, Security | Internals | Hardware, Collaboration |
| New hire (day 1) | Master | Their role’s “Must read” | Everything else |
Relationship to the Lighthouse Kit
Every client-facing choice in the Lighthouse Kit is underwritten by a studio-side capability documented here. When the Discovery Playbook says “the Agent Engineer instruments monitoring”, it is the Agent Runtime doc that defines what “instrumented” actually means in our stack. When the Handoff Playbook says “transfer credentials via a secure vault”, it is the Security doc that specifies the vault, the rotation cadence, and the revocation procedure.
If you ever find a gap — the Lighthouse Kit asks a Sprint Lead to do something that the Studio Ops Kit does not enable — that is a P0 operational bug. Fix it in the Ops Kit before the next sprint starts.
Operating principles.
Tooling decisions made in isolation drift into a zoo. These seven principles are the constitution: every sub-document in this kit should defer to them. When a specific recommendation in a sub-document conflicts with a principle here, the principle wins and the sub-doc is wrong.
Own the primitives, rent the commodities.
We own prompt logic, pattern library, agent specs, governance runbooks, and the state-machine shape of every workflow. We rent everything else — compute, storage, messaging, identity, billing. If a vendor raises prices 3× overnight, the switching cost on rented commodities should be measured in days, not quarters.
Spreadsheet-first, database when pressure demands.
USP-009 applies to the Studio, not just to client systems. Every new workflow starts as a Google Sheet with Apps Script glue. Graduate to a real database (Supabase / Airtable / Postgres) only when a named pressure forces the move: concurrent writes, row count past ~5k, sub-second query needs, or a compliance requirement. Never migrate on aesthetics.
One model, three tiers, measured per run.
Default to Claude. Use Opus 4.6 for planning (ICPs, spec writing, post-mortems), Sonnet 4.6 for execution (most production agent runs), Haiku 4.5 for high-volume tight-loop tasks (classifiers, routers, format-check guards). Every agent run logs token count and dollar cost to the state sheet — no exceptions.
Governance is not a tool — it is a column.
No monitoring tool survives contact with a sprint if it lives in a separate UI the client will never check. Every governance signal must land on the state sheet (severity, last-alert, last-rollback, last-audit-review, readiness). The state sheet is the dashboard. External observability platforms are nice-to-have, not core.
Reversible by default.
Every tool choice, every deployment, every credential must have a documented “how to back out” runbook. If you cannot describe the reversal in under 15 minutes of work, the choice is too expensive. This principle is why we avoid bespoke infrastructure for anything that has not yet proven itself across three sprints.
Client data never lives on a laptop.
Client credentials are stored in the vault. Client data is read from the client’s systems in place. No client data is persisted to a personal machine. Local caches are ephemeral and wiped at end-of-sprint. This principle dictates many hardware and workspace choices.
Every Friday, the Studio ships a demo to itself.
USP-011 applies internally. We run a Friday demo of any Studio-internal tooling change, pattern addition, or process revision — in a 30-minute calendar block, solo or with collaborators. No internal improvement lands on a Monday without a Friday demo. This keeps the Studio’s own operating system from rotting.
The stack map.
We describe the Studio stack in five layers, top to bottom. Every tool mentioned in any sub-document of this kit maps to exactly one layer. If it does not map cleanly, it is probably redundant with something that already does.
What lives where, at a glance
| Capability | Layer | Primary tool | Alt under trade-off |
|---|---|---|---|
| Agent reasoning | L3 | Claude API (Opus/Sonnet/Haiku) | Local model (Ollama) for cost-sensitive routing |
| Agent orchestration | L3 | Claude Agent SDK | Raw API + custom harness |
| External data / tools | L3 | MCP servers | Direct API clients in code |
| Workflow state | L3 | Google Sheets + Apps Script | Airtable · Supabase (when USP-009 pressure fires) |
| Code hosting | L3 | GitHub (private repos) | GitLab (if EU residency requirement) |
| Agent deployment | L3 | Vercel Functions · Google Cloud Run | Fly.io · Render |
| Web hosting | L5 | Vercel | Cloudflare Pages |
| Project tracking | L4 | Linear | Notion · GitHub Projects |
| Client comms | L4 | Slack Connect · Gmail | Microsoft Teams (client-led) |
| Design | L4 | Figma | Penpot (open-source alt) |
| Async demos | L4 | Loom | Tella · Descript |
| Knowledge base / docs | L2 | Notion workspace | Google Docs folder tree |
| CRM / pipeline | L2 | Google Sheet (CRM tab) | Attio · Folk (when past ~40 deals) |
| Invoicing | L2 | Stripe Invoicing | Wave · QuickBooks |
| Contracts / NDAs | L2 | DocuSign | HelloSign · PDF + email |
| Secrets vault | L1 | 1Password Teams | Bitwarden · Doppler (for infra secrets) |
| Identity / SSO | L1 | Google Workspace | — |
| Workstation | L1 | MacBook Pro M4 Max (primary); Mac mini M4 (desk) | Framework laptop (Linux, long-term optional) |
| Backup | L1 | Time Machine to external SSD + Backblaze | Arq to B2 |
Budget & cost model.
The Studio runs on a two-line cost model: a monthly fixed envelope that holds whether or not a sprint is live, and a per-sprint variable line that tracks with engagement intensity. Numbers here are working targets at v1.0 — they will move, and the shape of the model matters more than the specific dollar figures.
Monthly fixed envelope — what it costs to keep the Studio running with zero sprints active
| Category | Tool / line item | Monthly (USD) | Notes |
|---|---|---|---|
| L3 Runtime baseline | Claude API (standby/dev) | ~$80 | Dev iteration, internal agents, small invocations. |
| Vercel Pro | $20 | All web properties — studio, lighthouse, case studies. | |
| L4 Collaboration | Linear (Standard) | $10 | Single-seat baseline. |
| Slack (Pro) | $9 | For Slack Connect with clients. | |
| Figma (Professional) | $18 | Files persist; minimal editor seats. | |
| L2 Studio Core | Notion (Plus) | $12 | Knowledge base, pattern library, post-mortems. |
| Google Workspace (Business Starter) | $7 | Identity, mail, Sheets = state-machine substrate. | |
| DocuSign (Personal) | $15 | Contracts, SOWs, NDAs. | |
| L1 Foundation | 1Password (Teams Starter) | $20 | Secrets vault for all categories. |
| Backblaze (unlimited) | $9 | Continuous backup. | |
| Domain registrations | ~$4 | Amortized annually. | |
| Reserve | Buffer / replacements | $40 | Software trials, ad-hoc services. |
| Envelope total | ~$244 | Zero-sprint steady state. |
Per-sprint variable line — what an active Lighthouse Sprint adds
| Phase | Variable cost driver | Expected (USD) | Notes |
|---|---|---|---|
| Discovery (W1–2) | Claude API (planning-heavy, Opus-biased) | $120–$200 | ICP writing, spec drafting, stakeholder synthesis. |
| Redesign (W3–4) | Claude API + light MCP calls | $180–$300 | Iteration on agent specs and wrapper design. |
| Build (W5–9) | Claude API (Sonnet heavy) + compute | $900–$1,600 | Production agent runs, ramp to 300+ invocations/wk. |
| Handoff (W10–11) | Claude API + training rehearsals | $200–$350 | Full-day rehearsal, supervised operation. |
| Support (30d) | Claude API (spot checks) | $60–$120 | Light ad-hoc invocations during support window. |
| Expected total | — | $1,460–$2,570 | COGS for a single sprint at v1.0 scale. |
Capex cadence — hardware and one-time investments
| Item | Expected cadence | Target spend | Driver |
|---|---|---|---|
| Primary workstation refresh | Every 3–4 yrs | $3,500–$4,500 | Dev speed & multi-core agent test harnesses. |
| Secondary workstation (desk Mac mini) | Every 4 yrs | ~$1,500 | Always-on backup runtime & build host. |
| Monitor (4K, 27”+) | Every 5 yrs | ~$700 | Screen real estate for state sheets + code + demo. |
| External SSD (2TB+) | Every 3 yrs | ~$200 | Time Machine + sprint scratch. |
| Mobile test devices | Opportunistic | varies | Phones for mobile web QA; keep one low-spec Android. |
| Network (router, mesh) | Every 5 yrs | ~$400 | Stable uplink during demo calls. |
Studio-level governance.
We apply the Six Governance Components from the Lighthouse framework (USO-LH-07) to the Studio itself. Same vocabulary, different scope. This section maps each component to its Studio-internal expression so that if a Studio-internal tool, pipeline, or process fails, we already know where to look.
| Component | Client scope (Lighthouse) | Studio scope (this kit) | Where it lives |
|---|---|---|---|
| Monitoring | Agent throughput, error rate, latency per run | Sprint cadence, pipeline health, renewal risk, tool cost drift | Studio State Sheet · Linear cycle view |
| Alerting | SEV-1/2/3 incident alerts on client agents | Cost over budget, credential expiring, backup missed, renewal due | Calendar reminders · email rules · budget sheet triggers |
| Audit trail | Per-invocation log of every agent run | Decision log of every tool swap, pattern promotion, sprint variation | studio-decision-log sheet |
| Escalation | 3-tier ladder to client sponsor | Collaborator or peer review for anything touching contracts, pricing, IP | Peer review checklist · founder final sign-off |
| Override | Pause, resume, force-failure, skip-next, redirect | Freeze tooling during an active sprint (no mid-sprint migrations) | P-05 reversibility principle · deploy freeze window |
| Rollback | Snapshot, restore, replay, partial | Tool choice reversal runbook; every subscription has a 15-min cancellation path | Security doc §06 revocation · Internals doc §08 tool reversal |
Four studio-level SEV definitions
| Severity | Condition | ACK window | Response |
|---|---|---|---|
| Studio-SEV-1 | Client data exposure risk, credential leak, active production incident on a client system traced to Studio tooling | < 15 min | Pause affected agents, invoke Security §09 incident response, write post-mortem within 48 h. |
| Studio-SEV-2 | Sprint at risk (gate blocker, stakeholder frustration, missed weekly demo), or Studio tooling outage blocking work > 4 h | < 2 h | Triage and escalate to Sprint Lead; document in Linear. Workaround must exist by EOD. |
| Studio-SEV-3 | Tooling degradation, cost drift, backup miss, vendor notice, renewal-due warning | < 24 h | Log to decision log; resolve in next work session or queue for next quarterly review. |
| Studio-SEV-0 | Observation, informational, “nice to look at next quarter” | — | Drop into quarterly review backlog. |
How the six docs interlock.
The Ops Kit is six documents. This master plus five sub-documents. Each sub-doc has a narrow, non-overlapping scope; cross-references are explicit. If you cannot find a topic where you expected it, check the owner matrix in §08 first — it indexes every topic to its home doc.
Studio Ops Master.
Stack map, operating principles, budget envelope, studio-level governance, decision log, owner matrix. The map of the whole kit.
Agent Runtime.
Claude API & model tiers, Agent SDK, MCP servers, prompt versioning, state-machine infra, observability, deployment targets. Everything at L3.
Collaboration Stack.
Client comms (Slack Connect / email), project tracking (Linear), design (Figma), async demos (Loom), calendaring, shared doc handoff. Everything at L4.
Studio Internals.
Sales pipeline/CRM, invoicing, contracts, knowledge base, pattern library versioning, post-mortem repository, playbook storage. Everything at L2.
Hardware & Workspace.
Workstations, monitors, peripherals, backup, network, mobile test devices, workspace setup, travel kit. Half of L1.
Security & Secrets.
Secrets vault, credential isolation, access revocation, data retention, NDAs/contracts/IP, studio-side incident response. The other half of L1.
Cross-reference index — where to find common topics
| Looking for | Primary doc | Section |
|---|---|---|
| Which model tier to use for a new agent | Agent Runtime | §03 Model tiers |
| How to publish an MCP server for a client integration | Agent Runtime | §04 MCP patterns |
| When to graduate from Sheets to a database | Agent Runtime | §05 State graduation |
| Slack Connect setup with a new client | Collaboration | §03 Client channels |
| Linear project template for a sprint | Collaboration | §04 Tracking |
| Loom recording & retention policy | Collaboration | §05 Async demos |
| Where the pattern library lives | Internals | §05 Pattern library |
| Invoice schedule & milestone payments | Internals | §03 Billing |
| Post-mortem repository structure | Internals | §07 Post-mortems |
| Workstation spec for a new hire | Hardware | §03 Workstation profiles |
| Backup & recovery procedure | Hardware | §05 Backup |
| Credential rotation cadence | Security | §04 Rotation |
| Client credential revocation at Gate 4 | Security | §05 Revocation |
| Data retention & end-of-engagement wipe | Security | §07 Retention |
| IP ownership on patterns extracted from client work | Security | §08 Contracts & IP |
Graduation path & decision log.
Tools are added, replaced, and promoted over time. P-05 (reversible by default) means we move with humility — but we still move. This section defines the named pressures that legitimately force a tool graduation, and the discipline for recording the decision.
Five legitimate pressures to graduate a tool
- Concurrency pressure. The current tool cannot tolerate the number of simultaneous writers, readers, or runs the workflow now requires. Example: a state sheet with more than ~3 Apps Scripts writing to it at once will race.
- Volume pressure. Row count, invocation count, or file size passes a threshold where the current tool becomes slow or unusable. Example: a Sheet past ~5,000 rows in the active tab starts to feel sluggish for interactive use.
- Latency pressure. An operation that must complete in under a specific budget can no longer hit it. Example: a governance readiness check that needs to return in under 300 ms.
- Compliance pressure. A client or regulatory requirement makes the current tool legally or contractually unusable. Example: PHI, EU residency, SOC 2 auditor ask.
- Vendor pressure. The current tool changes pricing, terms, or availability in a way that makes it untenable. Example: an MCP server’s paid tier spikes 5× or shutters.
Decision log — seven-column format
| Column | Purpose |
|---|---|
| date | YYYY-MM-DD when the decision was logged. |
| from | Tool / service / setup being replaced or evolved. |
| to | Tool / service / setup being adopted. |
| pressure | Which of the five legitimate pressures applies. Cite one. |
| cost_delta | Monthly (or one-time) cost difference in USD, signed. |
| rollback | One-sentence description of how to back out, and target duration. |
| owner | Who made and owns the decision. |
The decision log lives in the studio-ops Google Sheet, tab decision-log. Every Friday demo that changes tooling appends a row. The quarterly review (§09) reads the last quarter’s rows first — that is where pattern drift becomes visible.
Owner matrix.
Every tool and process in this kit has exactly one primary owner. When the Studio is one person, the primary owner is the founder across the board — but we still record it, because the matrix is what makes the second hire onboard in a week instead of a month.
| Area | Primary owner | Backup | Review cadence |
|---|---|---|---|
| Operating principles & stack map | Founder | Sprint Lead | Quarterly |
| Budget envelope | Founder | — | Monthly |
| Claude API usage & cost caps | Agent Engineer | Founder | Weekly (during active sprint) |
| Agent SDK & prompt versioning | Agent Engineer | Founder | Per-sprint |
| MCP server inventory | Agent Engineer | Founder | Quarterly |
| State-sheet architecture | Agent Engineer | Governance Architect | Per-sprint |
| Deployment targets (Vercel / Cloud Run) | Agent Engineer | Founder | Quarterly |
| Observability & dashboards | Governance Architect | Agent Engineer | Per-sprint |
| Client Slack Connect channels | Sprint Lead | Founder | Per-sprint |
| Linear workspace & templates | Sprint Lead | Founder | Quarterly |
| Figma library | Sprint Lead | Founder | Quarterly |
| Sales pipeline / CRM | Founder | Sprint Lead | Weekly |
| Invoicing & billing | Founder | — | Monthly |
| Contracts, SOWs, NDAs | Founder | — | Per-sprint |
| Pattern library (USP-001…014) | Founder | Agent Engineer | Per-sprint + quarterly |
| Post-mortem repository | Governance Architect | Founder | After each incident |
| Workstations & hardware | Founder | — | Semi-annual |
| Backup & restore test | Founder | — | Monthly restore-test |
| Secrets vault | Founder | — | Monthly rotation audit |
| Client credential lifecycle | Founder | Sprint Lead | Per-sprint + at Gate 4 |
| Incident response (studio-side) | Founder | Governance Architect | Drill semi-annually |
Quarterly review cadence.
Once a quarter, the Studio stops client work for a full half-day and reviews its own operating system. This is non-negotiable — it is how we resist the drift that eventually poisons any Studio’s tooling stack. Same day each quarter; same agenda; same deliverables.
Standing four-hour agenda
| Block | Duration | What happens |
|---|---|---|
| B1 · Prior-quarter read-out | 30 min | Read every row of the decision log since last review. Circle anything that violated P-01–P-07 without an explicit exception. |
| B2 · Cost audit | 45 min | Reconcile actual vs. envelope & per-sprint variable. Flag any line that moved > 15%. Kill one tool if possible. |
| B3 · Stack map sweep | 45 min | Walk the §03 stack map row by row. For each row, does the primary tool still fit? Has the named alt become more attractive? Is anything orphaned? |
| B4 · Pattern library review | 30 min | Review USP-001…N. Any new candidates from the last quarter? Any pattern that failed in reuse? Retire or demote as needed. |
| B5 · Post-mortem sweep | 30 min | Read every studio-side post-mortem written this quarter. Extract one remediation into the next quarter’s work queue. |
| B6 · Backup & restore drill | 30 min | Live restore of one arbitrary file from last month’s backup. If restore fails or takes > 10 min, Studio-SEV-2 until resolved. |
| B7 · Next-quarter plan | 30 min | Pick exactly 2–3 things to change next quarter. Never more. |
Deliverables out the other side
- Updated MEMORY index in Notion workspace with the quarter’s changes and rationale.
- New decision-log rows for every change adopted in B3/B4/B7.
- Budget sheet reconciled — variance rows resolved or moved to the next quarter.
- Two or three targeted changes scheduled for the next quarter — no more, each with an owner and due date.
- One retired tool — ideally. Subtract before adding.
Studio Ops checklist.
Weekly (every Friday, 20 min)
- Budget-vs-actual scan — any category drifting past 15% of envelope?
- Sprint state sheet review — governance columns green for every active sprint?
- Backup verification — last Time Machine & Backblaze timestamps healthy?
- Inbox zero on client comms — no thread older than 48 h without a response.
- Friday demo — any internal tooling change since Monday? Record one even if solo.
Monthly (first working day of the month, 60 min)
- Reconcile Claude API invoice against expected per-sprint variable lines.
- 1Password monthly rotation audit — any item older than its rotation target?
- Restore-test: pick a random file backed up in the last month — restore it. Must work in < 10 min.
- Pattern library sweep — any new candidate patterns from the month? Any reuse failure?
- Pipeline review — any deal stuck in the same stage for more than 30 days? Either advance or close-lost.
- Invoicing run — generate, send, and record expected receipt dates.
Quarterly (§09 cadence)
- Execute the seven-block agenda in §09 end to end.
- Retire at least one tool.
- Post the updated stack map to Notion.
- Schedule the next quarterly review on the calendar before closing the session.
Per-sprint (at the boundaries)
- Sprint start: verify every tool for each layer is reachable; credentials healthy; budget envelope adjusted for this sprint’s variable line.
- Gate 4: client credentials revoked per Security §05; post-mortem filed if any SEV-2/3 occurred; patterns nominated for the library.
- Support window close (30 d post Gate 4): retention policy executed per Security §07; final variable-cost reconciliation.